Recently I’ve done a few tests querying data (POIs, Points of Interest) from Open Street Map. I imported the data from OSM into different databases and formats and I wanted to query it.
I’ll show 3 queries: from MongoDB, from PostgreSQL (tables using Osmosis format), from PostgreSQL (tables using osm2psql format).
POIs from MongoDB
I found this one the easiest query to write. I imported the data using the steps described on Derick Rethan’s blogpost (see his Github repository, I used the command:
php5 ./3angle/import-data.php localhost spain-latest.osm
to import it.
To query all the POIs for a given bounding box I use:
query = {"l": {"$geoWithin": { "$box": [ [ southwest_lng, southwest_lat], [northeast_lng, northeast_lat] ] } } }
pois = list(pois.find(query))
(replace southwest_lng and the other variables by their positions such as -0.09546518325805664,51.50291256180409,-0.08447885513305664,51.5070930676199 for a small part of London).
This was easy and I liked MongoDB and their GIS (geographical queries) support. It also was my first contact with MongoDB.
POIs from PostgreSQL, Osmosis format
I used osmosis to import data from OSM into PostgreSQL.
The query to get all the POIs for a given bounding box, in my case, is:
SELECT id, tags -> 'amenity' AS amenity, ST_Y(geom) as latitude, ST_X(geom) AS longitude FROM nodes WHERE ST_Intersects(geom, ST_MakeEnvelope(%(southwest_lng)s,%(southwest_lat)s,%(northeast_lng)s,%(northeast_lat)s,4326))
(replace %(southwest_lng)s and the variables by their values, see MongoDB for an example)
Notice that a “\d nodes” (describe nodes) table returns:
tags | hstore |
geom | geometry(Point,4326) |
“tags” column type is hstore (it’s a key-value column). The “geom” type is geometry(Point,4326). 4326 is standard coordinate system for the Earth (see World Geodetic System Wikipedia’s entry).
In this case, the query returns all nodes which geometry (just the position, in this case they are always points) intersects with the given box created using 4326 standard. It’s important that the ST_Intersects projection (4326) matches the projection defined in the column. Forgetting the standard (4326) in the ST_Intersects causes the query to run but not to get the expected results (having some sort of visual validation is important: I thought that the query was working because it returned results… but it wasn’t the case).
POIs from PostgreSQL, osm2psql format
You could import OSM data into PostgreSQL using osm2psql. This format is usually used by Mapnik to render the maps.
In this case the query is:
SELECT osm_id AS id, amenity, ST_Y(ST_Transform(way,4326)) AS latitude, ST_X(ST_Transform(way,4326)) as longitude FROM planet_osm_point WHERE ST_Intersects(way, ST_Transform(ST_MakeEnvelope(%(southwest_lng)s,%(southwest_lat)s,%(northeast_lng)s,%(northeast_lat)s,4326), 900913))
The planet_osm_point_index table contains the position (way column) as:
way | geometry(Point,900913) |
It’s not using the projection 4326 but the projection 900913 (also known as Web Mercator. Wikipedia recognizes that it’s a confusing denomination, and I would like to add that the fact that in PostgreSQL the projections are just numbers instead of names is even more confusing.
In this case the query creates a ST_MakeEnvelope (a bounding box) using the location (using the projection 4326 because my client passes it using this format). After that the function ST_Transform transforms this into the projection 900913 because it’s the one saved in the database (otherwise we might not get the correct results). And, like I did before, the query checks that the POIs are in the bounding box (envelope).
To return the data: it gets the position (way) which converts to the desired projection (ST_Transform) and then returns the latitude or longitude (using ST_X, ST_Y).
I found the PostgreSQL quite confusing because the naming of the functions and a new concept for me was the different projections: if one is not careful they can be easily mixed.
I was surprised that I couldn’t find on the Internet any example queries like these ones: it seems that getting POIs from a bounding box should be the “Hello World!” for OSM and PostgreSQL. So, in this blog post, I hope to fix The Internet adding the missing information.
Leave a Reply