[pgrouting-users] What is the architecture of pgRouting and how can I make it work faster?

[Small note: The author of this question is familiar with PostGIS, but is not familiar with C language]

I have a ‘ways’ table in PostGIS consisting of about 90,000 ways. When I do a pgRouting shortest path query it takes about 2,5 seconds to get the result.

I was pretty unsatisfied with it (I would like it to be about 1 second for my imagined user) so I tried to do a ‘select * from ways’ query in psql with \timing enabled. It took 3 seconds to respond. When I did ‘select name, length, x1, y1, x2, y2 from ways’ (that is all needed to build a graph in my opinion) it took even more - 4 seconds to respond.

Based on that, I assume that pgRouting’s SQL queries are somehow optimized.

The questions:

  1. Is it possible to improve the speed of pgRouting queries?
  2. How is the actual shortest path calculation done? Is it anything like the following:
  • Load the whole ways/edges table into memory with one ‘select * from ways’;
  • Build a graph from that;
  • Apply a C-based shortest path algorithm to the graph;
  • Return a result into psql server?

Can any of these steps be improved anyhow?

Thank you!

On Wed, Aug 3, 2011 at 9:03 PM, Martin Lee <martinchlee@gmail.com> wrote:

[Small note: The author of this question is familiar with PostGIS, but is not familiar with C language]

I have a ‘ways’ table in PostGIS consisting of about 90,000 ways. When I do a pgRouting shortest path query it takes about 2,5 seconds to get the result.

This is really not many ways. It should be a lot faster.

  • Did you check your tables have indices?
  • Do you use bounding boxes not to load all the data (but in your case it’s such a small number of ways that even loading all shouldn’t take long.)

I was pretty unsatisfied with it (I would like it to be about 1 second for my imagined user) so I tried to do a ‘select * from ways’ query in psql with \timing enabled. It took 3 seconds to respond. When I did ‘select name, length, x1, y1, x2, y2 from ways’ (that is all needed to build a graph in my opinion) it took even more - 4 seconds to respond.

Based on that, I assume that pgRouting’s SQL queries are somehow optimized.

The questions:

  1. Is it possible to improve the speed of pgRouting queries?

The bottleneck is the number of ways you select from the table.
You can improve your speed by selecting less road links.
Anything else (like routing algorithms for example) might affect calculation time, but this is only very little compared to the data volume to load.

One thing you could check is how your way ID’s look like. Lower ID’s are better, so you may want to renumber them if there are large ID’s and large gaps between them. You can create a new id attribute for that.

Daniel

  1. How is the actual shortest path calculation done? Is it anything like the following:
  • Load the whole ways/edges table into memory with one ‘select * from ways’;
  • Build a graph from that;
  • Apply a C-based shortest path algorithm to the graph;
  • Return a result into psql server?

Can any of these steps be improved anyhow?

Thank you!


Pgrouting-users mailing list
Pgrouting-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-users


Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de

On Wed, Aug 3, 2011 at 2:39 PM, Daniel Kastl <daniel@georepublic.de> wrote:

On Wed, Aug 3, 2011 at 9:03 PM, Martin Lee <martinchlee@gmail.com> wrote:

[Small note: The author of this question is familiar with PostGIS, but is not familiar with C language]

I have a ‘ways’ table in PostGIS consisting of about 90,000 ways. When I do a pgRouting shortest path query it takes about 2,5 seconds to get the result.

This is really not many ways. It should be a lot faster.

  • Did you check your tables have indices?

All the tables have indices. Are there any special requirements to particular tables to have indices?

  • Do you use bounding boxes not to load all the data (but in your case it’s such a small number of ways that even loading all shouldn’t take long.)

I was pretty unsatisfied with it (I would like it to be about 1 second for my imagined user) so I tried to do a ‘select * from ways’ query in psql with \timing enabled. It took 3 seconds to respond. When I did ‘select name, length, x1, y1, x2, y2 from ways’ (that is all needed to build a graph in my opinion) it took even more - 4 seconds to respond.

Based on that, I assume that pgRouting’s SQL queries are somehow optimized.

The questions:

  1. Is it possible to improve the speed of pgRouting queries?

The bottleneck is the number of ways you select from the table.
You can improve your speed by selecting less road links.

I am sorry, but I don’t really understand what ‘road links’ mean in pgRouting parlance. There are vertices which are built from OSM nodes, there are ways which are extracted from OSM multi-line ways. How road links are represented and in which table?

Anything else (like routing algorithms for example) might affect calculation time, but this is only very little compared to the data volume to load.

One thing you could check is how your way ID’s look like. Lower ID’s are better, so you may want to renumber them if there are large ID’s and large gaps between them. You can create a new id attribute for that.

Is the column name ‘gid’? By the way, why is it called ‘gid’ and not ‘id’?

Daniel

  1. How is the actual shortest path calculation done? Is it anything like the following:
  • Load the whole ways/edges table into memory with one ‘select * from ways’;
  • Build a graph from that;
  • Apply a C-based shortest path algorithm to the graph;
  • Return a result into psql server?

Can any of these steps be improved anyhow?

Thank you!


Pgrouting-users mailing list
Pgrouting-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-users


Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de


Pgrouting-users mailing list
Pgrouting-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-users

On 8/3/2011 3:43 PM, Martin Lee wrote:

On Wed, Aug 3, 2011 at 2:39 PM, Daniel Kastl <daniel@georepublic.de
<mailto:daniel@georepublic.de>> wrote:

    On Wed, Aug 3, 2011 at 9:03 PM, Martin Lee <martinchlee@gmail.com
    <mailto:martinchlee@gmail.com>> wrote:

        [Small note: The author of this question is familiar with
        PostGIS, but is not familiar with C language]

        I have a 'ways' table in PostGIS consisting of about 90,000
        ways. When I do a pgRouting shortest path query it takes about
        2,5 seconds to get the result.

    This is really not many ways. It should be a lot faster.

      * Did you check your tables have indices?

All the tables have indices. Are there any special requirements to
particular tables to have indices?

      * Do you use bounding boxes not to load all the data (but in your
        case it's such a small number of ways that even loading all
        shouldn't take long.)

        I was pretty unsatisfied with it (I would like it to be about 1
        second for my imagined user) so I tried to do a 'select * from
        ways' query in psql with \timing enabled. It took 3 seconds to
        respond. When I did 'select name, length, x1, y1, x2, y2 from
        ways' (that is all needed to build a graph in my opinion) it
        took even more - 4 seconds to respond.

        Based on that, I assume that pgRouting's SQL queries are somehow
        optimized.

        The questions:
        1. Is it possible to improve the speed of pgRouting queries?

    The bottleneck is the number of ways you select from the table.
    You can improve your speed by selecting less road links.

I am sorry, but I don't really understand what 'road links' mean in
pgRouting parlance. There are vertices which are built from OSM nodes,
there are ways which are extracted from OSM multi-line ways. How road
links are represented and in which table?

Routing problems are abstracted into graph theory concepts:

NODES or VERTICES
Roadways need to be "noded" which means that roadways the cross one another and intersect need to have a node or vertex at the intersection point and the roadsways need to be split at that intersection. This is pretty normal for most data but it is common for data to be not noded. I suspect that your data is fine in this regard.

EDGES or ROAD SEGMENTS or ROAD LINKS
these are the polylines that connect two NODES.

    Anything else (like routing algorithms for example) might affect
    calculation time, but this is only very little compared to the data
    volume to load.

    One thing you could check is how your way ID's look like. Lower ID's
    are better, so you may want to renumber them if there are large ID's
    and large gaps between them. You can create a new id attribute for that.

Is the column name 'gid'? By the way, why is it called 'gid' and not 'id'?

This is a postgis-ism, gid means geometry id, it is added when shapefiles are imported and it is typically an integer of bigint serial column and is the table primary key. id often conflicts with an existing column name. gid is dynamically added on data import and stripped off on data export.

-Steve W

    Daniel

        2. How is the actual shortest path calculation done? Is it
        anything like the following:
        - Load the whole ways/edges table into memory with one 'select *
        from ways';
        - Build a graph from that;
        - Apply a C-based shortest path algorithm to the graph;
        - Return a result into psql server?

        Can any of these steps be improved anyhow?

        Thank you!

        _______________________________________________
        Pgrouting-users mailing list
        Pgrouting-users@lists.osgeo.org
        <mailto:Pgrouting-users@lists.osgeo.org>
        http://lists.osgeo.org/mailman/listinfo/pgrouting-users

    --
    Georepublic UG & Georepublic Japan
    eMail: daniel.kastl@georepublic.de <mailto:daniel.kastl@georepublic.de>
    Web: http://georepublic.de/&gt;

    _______________________________________________
    Pgrouting-users mailing list
    Pgrouting-users@lists.osgeo.org <mailto:Pgrouting-users@lists.osgeo.org>
    http://lists.osgeo.org/mailman/listinfo/pgrouting-users

_______________________________________________
Pgrouting-users mailing list
Pgrouting-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-users

This is really not many ways. It should be a lot faster.

  • Did you check your tables have indices?

All the tables have indices. Are there any special requirements to particular tables to have indices?

It’s good to have indices for attributes you use in your select. For example, if you select by bounding box, you should have an index on your geometry column.

The bottleneck is the number of ways you select from the table.
You can improve your speed by selecting less road links.

I am sorry, but I don’t really understand what ‘road links’ mean in pgRouting parlance. There are vertices which are built from OSM nodes, there are ways which are extracted from OSM multi-line ways. How road links are represented and in which table?

The vertices table is only a temporary table which is not used for pgRouting. It’s used for building the topology.
The relevant table is “ways” of geometry type LINESTRING (or MULTILINESTRING). These are road segments, which some people call “edge”, some “link”.

Daniel


Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de