[pgrouting-users] pgRouting Memory Issues

Richard_Marsden · January 14, 2011, 3:55am

Well, I’m back to trying to get pgRouting working with my planet.osm file, which has been loaded into Postgres with osm2po.

I’m not sure if I’ve got the right parameters, but pretty much every variation I try runs for a long time (tens of minutes) before it crashes with an out-of-memory error (typically around 39017600).

An example query:

SELECT * FROM shortest_path(
    'SELECT id, source, target, cost, reverse_cost FROM osm_topo',
    44070476, 44070478, true, true);

I’ve also tried a few different functions other than shortest_path(). The node numbers were picked from a query of the first few rows. I also checked indexes: osm2po produced indexes on source, target, etc.

It seems that the inner query which is passed as a parameter to shortest_path() (or any of the other functions) might be querying the entire dataset, which of course is huge as I’ve got the entire OSM planet file loaded into Postgres. So I tried adding a WHERE clause to the inner SELECT that only queries x1,y1 coordinates within a range (x1,y1 for node 44070476 is about 55,51, so I chose 50.0->60.0 and 45.0->56.0 as my ranges).

No better.

So I think I’m really pushing the limits. planet.osm is of course huge, and despite the significant pruning by osm2po the resulting routing network is still very large. I guess pgRouting just doesn’t scale to such admittedly large datasets.
(I note that the online OpenLayers-based demos are limited to individual cities)

I wish to eventually do my bulk routing on a country-by-country basis, but it would not be practical to load each country individually. I’ll look to see if there are any other approaches, perhaps with a different offline routing engine (using a web service is not deemed practical), but it looks like this project is probably going to be dead in the water.

Richard Marsden

dkastl · January 14, 2011, 4:05am

Hi Richard,

When you make a query, you select your whole network table into memory, which is huge if it’s all planet.osm road data.
To make this faster, you should write a query that only selects the road data you need. Why, for example, should you load North American roads when you route in Europe?

The smaller the amount of selected data, the faster you will get the result. That’s the trick. And if you route long distances, you might want to consider creating a layered network and running multiple queries.
Look at the part with the bounding box wrapper. It applies to all algorithms; just look at the Dijkstra chapter in the workshop: http://workshop.pgrouting.org/chapters/shortest_path.html#dijkstra

Daniel
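
The restriction Daniel describes can be sketched in a few lines. This is only a minimal illustration, not the workshop's bounding box wrapper: it builds the inner query from the coordinates of the two end nodes plus a margin, reusing the osm_topo table and the x1/y1 columns from Richard's post. The function names, the margin parameter, and its default value are hypothetical.

```python
def bbox_inner_query(x_a, y_a, x_b, y_b, margin=0.1, table="osm_topo"):
    """Build the inner SQL for shortest_path(), limited to a box around both end nodes.

    Assumes the edge table has x1/y1 columns holding each edge's start coordinates,
    as in Richard's osm2po import. `table` is interpolated as-is, so it must come
    from trusted code, never from user input.
    """
    xmin, xmax = min(x_a, x_b) - margin, max(x_a, x_b) + margin
    ymin, ymax = min(y_a, y_b) - margin, max(y_a, y_b) + margin
    return (
        f"SELECT id, source, target, cost, reverse_cost FROM {table} "
        f"WHERE x1 BETWEEN {xmin:.6f} AND {xmax:.6f} "
        f"AND y1 BETWEEN {ymin:.6f} AND {ymax:.6f}"
    )


def routing_query(inner_sql, source, target):
    """Wrap the inner SQL in the shortest_path() call used in this thread."""
    # Double any single quotes so the inner SQL survives as a string literal.
    escaped = inner_sql.replace("'", "''")
    return (
        f"SELECT * FROM shortest_path('{escaped}', "
        f"{int(source)}, {int(target)}, true, true)"
    )


if __name__ == "__main__":
    inner = bbox_inner_query(55.0, 51.0, 55.5, 51.5, margin=0.5)
    print(routing_query(inner, 44070476, 44070478))
```

A box that hugs the two end nodes keeps the number of edges loaded per route small. A margin that is too tight can exclude the roads the route needs, so it probably has to be tuned for the data at hand.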

Richard_Marsden · January 14, 2011, 4:09pm

Daniel,

Thanks for your quick reply. Yes, that was what I was trying to do with my ‘WHERE’ clause: I thought I was limiting the data pgRouting was looking at.

Going back to your earlier reply (about creating a VIEW for osm2po-produced data), and combining it with the info at the URL you provided, I’ve actually got a result back! I need to study things more, check what I’m getting back, and work out how to get what I actually want, but I’m definitely on the right track. Runtime was 1655 ms, which is acceptable: although the bounding box was small, 1-2 secs/route is my target. There’s some leeway, and it looks like app-level multi-threading will be practical, but “minutes/route” would not be: I have a lot of routes to calculate!

Thanks,

Richard (M)
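
The app-level multi-threading Richard mentions could look roughly like the sketch below. It is an assumption about how such a batch might be organised, not something described in the thread: `route_all` and `route_one` are hypothetical names, and `route_one` stands in for whatever function opens its own database connection and runs the bounded shortest_path() query for one pair of nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def route_all(pairs, route_one, workers=4):
    """Run route_one(source, target) for every pair on a thread pool.

    Results are returned in the same order as `pairs`. `route_one` is a
    hypothetical callable supplied by the application; each call should use
    its own database connection, since connections are generally not safe
    to share between threads.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda pair: route_one(*pair), pairs))


if __name__ == "__main__":
    # Stand-in for a real routing call, so the sketch runs without a database.
    def fake_route(source, target):
        return (source, target, "route goes here")

    print(route_all([(44070476, 44070478), (1, 2)], fake_route, workers=2))
```

Threads suit this workload because each worker spends most of its time waiting on the database rather than computing in Python.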
