[pgrouting-dev] Time dependent data and input format

Hi,

Let's use this thread for discussion on the following points:

  1. What should be the input format for time dependent data?

  2. What are the options to obtain real world time dependent data? If it is not freely available, then what are the alternatives to test our code? (randomly generated time dependent data would be rather inadequate)

Input format for time dependent data:

There has already been a brief discussion on this topic before. A quick recap:

According to Stephen:

  • It is better to assume the data to be present in a simpler model that is easy to work with.
  • It is better to have 5 simple-to-work-with tables than 1-2 complex ones.
  • We can then provide data loaders that will convert the data from different formats into our designed model.

Daniel had listed many possible scenarios in which time-dependent shortest path can be applicable.

Some Ideas:

The edge cost will eventually be a function of following (please feel free to add any new points):

  • Time
  • Vehicle type

For time, each discrete time interval can have one entry per edge. If we assume t time intervals (which can be as small as 5 minutes, as available in commercial Navteq time dependent data) there would be |E| * t entries.

For vehicle type, again, if we consider v vehicle types, there will be a total of |E| * t * v entries.
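As a rough sanity check on those table sizes, here is the arithmetic with made-up illustrative numbers (the edge count and vehicle-class count are assumptions, not from any real dataset):

```python
# Illustrative size estimate for the time-dependent cost table.
# |E|, the slot length, and v are assumed numbers, not real data.
num_edges = 10_000                 # |E|
slots_per_day = 24 * 60 // 5       # 5-minute intervals -> 288 per day
vehicle_types = 3                  # v

rows_time_only = num_edges * slots_per_day
rows_with_vehicles = rows_time_only * vehicle_types

print(rows_time_only)      # 2880000 rows for |E| * t
print(rows_with_vehicles)  # 8640000 rows for |E| * t * v
```

Even for a modest network this grows quickly, which is one argument for letting the user restrict the queried time window.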

So, besides the edges table that has entries for every edge and other edge related information (like ways in Barcelona dataset), we can have another table with columns:

  • EdgeID
  • vehicleType
  • TimeInstant
  • Cost

Once / if we are able to acquire any real world time dependent data, we can then write data loaders that would convert the data into our format.

Comments / improvements welcome.


Regards,
-Jay Mahadeokar

Jay,

On 4/29/2011 5:34 PM, Jay Mahadeokar wrote:

Hi,

Lets use this thread for discussion on following points:

1. What should be the input format for time dependent data?

2. What are the options to obtain real world time dependent data? If it
is not freely available, then what are the alternatives to test our
code? (randomly generated time dependent data would be rather inadequate)

There are probably fewer free time dependent data sources available than other kinds, but one that we could potentially use is MassGIS:

http://www.mass.gov/mgis/laylist.htm

Look at the infrastructure section and specifically:

# MassDOT Roads - Updated October 2009
# Highway Exits and Routemarker Locations
# Census 2000 TIGER Roads
# Trains
# MBTA Rapid Transit
# MBTA Bus Routes and Stops
# Ferry Routes
# Boston Harbor Water Taxi Stops

This should give us a variety of infrastructure networks that we would need to augment with schedule information.

We can get the MBTA bus and train schedules online also.

http://www.mbta.com/

We looked at these while doing the multimodal routing last year.

We should be able to find similar data for some other large cities. I have not looked for data for, say, Mumbai or other cities in India, but you might have some luck if this is of interest.

If you are interested in working with Navteq data, you can register here:

http://www.nn4d.com/site/global/home/p_home.jsp

and then download a city's worth of their data, which is primarily oriented to navigable road networks and traffic feeds, but is probably less interesting than MassGIS above. The thing that would be interesting with the Navteq data is that it has the time dependent turn restrictions.

We currently only support turn restrictions in the shooting star algorithm, but I would love to see turn restrictions, time dependent or not, get implemented in Dijkstra or A-Star with a clean, simple mechanism for loading the turn restrictions.

Input format for time dependent data:

I would think that you need:

   start_time_and_date,
   start_location,
   end_location,
   options

If you wanted to get fancy, it might be nice to be able to do the reverse and say:

   start_location,
   end_location,
   arrival_time,
   options

In this version you would have to start at the end location and work backwards to the start location to figure out the route and the required start time.

There has already been a brief discussion on this topic before. A quick
recap:

According to Stephen:

    * It is better to assume the data to be present in a simpler model
      that is easy to work with.
    * It is better to have 5 simple-to-work-with tables than 1-2 complex
      ones.
    * We can then provide data loaders that will convert the data from
      different formats into our designed model.

Daniel had listed many possible scenarios in which time-dependent
shortest path can be applicable.

Some Ideas:

The edge cost will eventually be a function of the following (please feel
free to add any new points):

    * Time
    * Vehicle type

For time, each discrete time interval can have one entry per edge. If we
assume t time intervals (which can be as small as 5 minutes as available
in commercial Navteq time dependent data) there would be |E| * t entries.

For vehicle type again if we consider v vehicle types, there will be
total |E| * t * v entries.

So, besides the edges table that has entries for every edge and other
edge related information (like ways in Barcelona dataset), we can have
another table with columns:

    * EdgeID
    * vehicleType
    * TimeInstant
    * Cost

Once / if we are able to acquire any real world time dependent data, we
can then write data loaders that would convert the data into our format.

Comments / improvements welcome.

Ok, I want to go back to the purpose of my original statements about tables. I'm working under the following assumptions about how things will work.

User - application developer, the person building an application using our finished pgRouting product.

Client - the person using the application built by "User"

1. the User will take data, like that from MassGIS, and load the network data and prep it for pgRouting.
2. the User will take the MBTA bus, train and subway schedules and load them into tables that closely reflect the underlying schedule data.
3. the User will develop a stored procedure to read the schedule data and morph it into some predefined notion, whatever that might be. This could be to load it into a more normalized table, or it might be to service a specific query and return a normalized record.
4. the Client makes a route request
5. a wrapper script queries the networks and the schedules, and this data gets built into a temporary Boost graph structure to service this request.
6. results are returned to the Client and stuff is cleaned up.

I think this is basically the same flow we use today in pgRouting with the exception of the schedules or whatever time dependent data we have.

And if we are not using schedules and are only looking at time dependent turn restrictions and oneway streets, then I think using the same approach is still valid, but instead of loading schedule information we would load the turn restrictions.

Thoughts?

Regards,
   -Steve

--
Regards,
-Jay Mahadeokar

_______________________________________________
pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev

Hi Jay and Steve,

We always get long email chains here :wink:
So I will also take Jay’s initial email to answer.

  1. What should be the input format for time dependent data?

As Steve said, the additional information is “time”. The rest should be the same as with other shortest path algorithms.

For the DARP algorithm, which also cares about schedules, Anton took “timestamp” as the format, which can be “with timezone” or “without timezone”. Anton, maybe you can share some obstacles you had with the timestamp format.

  2. What are the options to obtain real world time dependent data? If it is not freely available, then what are the alternatives to test our code? (randomly generated time dependent data would be rather inadequate)

I know that in Japan they collect so-called “VICS” data with time dependent information. There might be a possibility to get sample data, but it's probably better to use the resources Steve mentioned, because the documentation would be in English then.
Of course if you collect such kind of data in India and know how to get some, that would be great.

I think we should prefer data that is free to use. For pgRouting users it usually helps a lot to be able to download some sample application with data, and I guess we couldn’t offer this if we use Navteq data.

According to Stephen:

  • It is better to assume the data to be present in a simpler model that is easy to work with.
  • It is better to have 5 simple-to-work-with tables than 1-2 complex ones.
  • We can then provide data loaders that will convert the data from different formats into our designed model.

I agree with Steve. You can make complex tables with table joins and views later.
Also, we can define the structure and require users to load the data. Someone may then want to write a Navteq data loader, but we don't need to design our model according to Navteq.

Daniel




Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de

Hi,

My thoughts on points made by Stephen:

Firstly, the main goal of the time dependent shortest path project, as per my GSoC proposal, is to extend the static Dijkstra algorithm to work with time dependent graphs. I made a wiki page for that and we can keep modifying it as the ideas become concrete: https://github.com/pgRouting/pgrouting/wiki/TDSP-Details

Input format for time dependent data:

I would think that you need:

start_time_and_date,
start_location,
end_location,
options

If you wanted to get fancy, it might be nice to be able to do the reverse and say:

start_location,
end_location,
arrival_time,
options

This looks good.
A tuple will denote the arrival_time, which will be the time required if we start from the start_location at the time denoted by start_time_and_date and travel to the end_location.

So, basically we can assume that the edge weights are the time required to traverse the edge. The core time-dependent Dijkstra will assume it can call a function like:

getCurrentEdgeCost(edge_id, start_time_and_date, arrival_time) - the edge can be denoted by start_location and end_location. The arrival_time will be returned by the function.

While relaxing an edge in the Dijkstra algorithm, instead of a static cost, the above function will be called each time to find the arrival_time for that instant.
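The relaxation described above can be sketched roughly as follows. This is a minimal Python sketch, not the real implementation (which would use Boost in C++); the graph, the edge_cost callback standing in for getCurrentEdgeCost(), and all names are assumptions for illustration:

```python
import heapq

def td_dijkstra(graph, edge_cost, source, start_time):
    """Time-dependent Dijkstra: edge_cost(u, v, t) returns the travel
    time of edge (u, v) when it is entered at time t, playing the role
    of getCurrentEdgeCost() described in the text."""
    arrival = {source: start_time}
    pq = [(start_time, source)]
    while pq:
        t, u = heapq.heappop(pq)
        if t > arrival.get(u, float("inf")):
            continue                      # stale queue entry
        for v in graph.get(u, []):
            # relax with the time-dependent cost instead of a static one
            t_v = t + edge_cost(u, v, t)
            if t_v < arrival.get(v, float("inf")):
                arrival[v] = t_v
                heapq.heappush(pq, (t_v, v))
    return arrival

# toy network: edge (A, B) is faster before t = 10 than after
graph = {"A": ["B"], "B": ["C"], "C": []}
cost = lambda u, v, t: 5 if (u, v) == ("A", "B") and t < 10 else 8
print(td_dijkstra(graph, cost, "A", 0))   # {'A': 0, 'B': 5, 'C': 13}
```

One caveat worth noting: label-setting Dijkstra stays optimal here only if the cost function has the FIFO (non-overtaking) property, i.e. leaving later never gets you there earlier.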

In this version you would have to start at the end location and work backwards to the start location to figure out the route and the required start time.

There has already been a brief discussion on this topic before. A quick

User - application developer, the person building an application using our finished pgRouting product.

Client - the person using the application built by “User”

  1. the User will take data, like that from MassGIS, and load the network data and prep it for pgRouting.
  2. the User will take the MBTA bus, train and subway schedules and load them into tables that closely reflect the underlying schedule data.
  3. the User will develop a stored procedure to read the schedule data and morph it into some predefined notion, whatever that might be. This could be to load it into a more normalized table, or it might be to service a specific query and return a normalized record.
  4. the Client makes a route request
  5. a wrapper script queries the networks and the schedules, and this data gets built into a temporary Boost graph structure to service this request.
  6. results are returned to the Client and stuff is cleaned up.

I think this is basically the same flow we use today in pgRouting with the exception of the schedules or whatever time dependent data we have.

And if we are not using schedules and are only looking at time dependent turn restrictions and oneway streets, then I think using the same approach is still valid, but instead of loading schedule information we would load the turn restrictions.

I did not understand how we will be using bus and train schedules in this algorithm. I guess the basic aim at this moment is just to extend Dijkstra for the time-dependent scenario, right? Dealing with schedules will need provision for multimodal routing, I guess. Can you please clarify this point?

I was not exactly aware of the concept of turn restrictions, so I referred here: http://wiki.openstreetmap.org/wiki/Relation:restriction

I guess providing support for time dependent turn restrictions will just need a slight modification of the original approach as per the GSoC proposal. We will just have to make sure we are allowed to turn onto a particular road at that particular time, before we relax it in the Dijkstra algorithm.

Thoughts?


Regards,
-Jay Mahadeokar

On 4/30/2011 1:27 PM, Jay Mahadeokar wrote:

Hi,

My thoughts on points made by Stephen:

Firstly, the main goal of the time dependent shortest path project as
per my GSoc proposal is to extend the static dijkstra algorithm, to work
with time dependent graphs. I made a wiki page for that and we can keep
modifying it as the ideas become concrete:
https://github.com/pgRouting/pgrouting/wiki/TDSP-Details

        *Input format for time dependent data:

    I would think that you need:

      start_time_and_date,
      start_location,
      end_location,
      options

    It you wanted to get fancy, it might be nice to be able to do the
    reverse and say:

      start_location,
      end_location,
      arrival_time,
      options

This looks good.
A tuple will denote the arrival_time which will be time required if we
start from source_location at time denoted by start_time and travel to
end_location.

So, basically we can assume that the edge weights are the time required
to traverse the edge. The core time-dependent dijkstra will assume it
can call a function like:

getCurrentEdgeCost( edge_id, start_time_and_date, arrival_time) - edge
can be denoted by start_location and end_location. The arrival_time will
be returned by function.

While relaxing the edge in dijkstra algorithm, instead of static cost,
the above function will be called each time to know the arrival_time for
that instant.

    In this version you would have to figure on the route starting at
    the end and working backwards to the start location to figure on the
    route and the required start time.

        *There has already been a brief discussion on this topic before.
        A quick

    User - application developer, the person build an application using
    our finished pgRouting product.

    Client - the person using the application built by "User"

    1. the User will take data like from MassGIS and load the network
    data and prep it for pgRouting.
    2. the User will take the MBTA bus and train and subway schedules
    and load them into tables the closely reflect the underlying
    schedule data.
    3. the User will develop a stored procedure to read the schedule
    data and morph it into some predefined notion whatever that might
    be. This could be to load it into a more normalized table or it
    might be to service a specific query and return a normalized record.
    4. the Client make a route request
    5. a wrapper script queries the networks and the schedules and this
    data gets built into a temporary boost graph structures to service
    this request.
    6. results are returned to the Client and stuff is cleaned up.

    I think this is basically the same flow we use today in pgRouting
    with the exception of the schedules or whatever time dependent data
    we have.

    And with we are not using schedules and only looking at time
    dependent turn restrictions and oneway streets then I think using
    the same approach is still valid but instead of loading schedule
    information, we would load the turn restriction.

I did not understand how we will be using bus and train schedules in
this algorithm. I guess, the basic aim at this moment is just to extend
dijkstra for time-dependent scenario, right? Dealing with schedules
will need provision for multimodal routing I guess. Can you please
clarify this point?

The multi-modal aspects are probably more important to Kishore's project, but in general this is also just a time dependent problem. Both these problems have the same issue of tracking time along the solution wave-front, and at nodes you have to evaluate what out-bound links are valid given some rules. The rules are based on your arrival time at that node.

So in your case, given the arrival time, your out-bound links might be modified by time dependent turn-restrictions, and your average speed on the out-bound links might be affected by traffic feeds or historical average speeds based on time of day, etc.

In Kishore's case, the valid out-bound links will be determined by modal schedules, and average link speeds will be determined by the mode of travel, like walk, bike, taxi, train etc. At nodes that are transfer points, there may need to be rules about minimum transfer times, which equate to wait time between arrival and potential departures.

So if we think about these as abstractions, then at nodes we can have rules like:

1. restrictions
    - time dependent
    - vehicle class dependent restrictions
      o mopeds not allowed on divided highway
      o Bus only lanes, or HOV lanes
      o Emergency vehicle only lanes
      o pedestrian ways
      o bike lanes and paths
2. schedules
    - time dependent (this is very similar to restrictions)
    - vehicle class dependent
3. transfer costs
    - additional cost of making a left hand turn in a right side country
    - cost to transfer modes in multi-modal network
    - toll cost

I was not exactly aware of concept of turn restrictions, so I referred
here: http://wiki.openstreetmap.org/wiki/Relation:restriction

Yes, this is a good document. I believe it is missing an important case:

                   G
                   |
A---------<-------B-----------<---------C
                   |
F--------->-------E----------->---------D
                   |
                   H

So this depicts a street (G-B-E-H) crossing a divided roadway: (A-B-C) west bound and (F-E-D) east bound. Now we don't need to worry about turning the wrong way on the oneway segments, as that is handled.

If there are No U-turns at this intersection, then we need a restriction that says you may not turn onto AB via BE from FE, but you are allowed to turn onto AB via BE from EH; from the opposite direction, ED is restricted from CB via BE, but EH is ok.

I guess to provide support for time dependent turn restriction will just
need a slight modification in original approach as per the GSoc
proposal. We will just have to make sure if we are allowed to turn to
particular road at that particular time, before we relax it in dijkstra
algorithm.

Right. In Dijkstra, you typically have a pointer to your parent node, ie: the node you came from. And you have a list of nodes you can go to. In this case it is easy to add a reference to the restrictions on a given node. If the reference is null, then there are no restrictions. If you have a reference, then the restrictions could be stored in an array or list like:

has_time, time_from, time_to, node_to, ancestors

has_time, time_from, time_to - controls time dependency, or always valid

In my above example, we would have two restrictions for the No U-turns like:

at node B:
0, 0, 0, node_to=A, ancestors=[E,F]

at node E:
0, 0, 0, node_to=D, ancestors=[B,C]

Maybe there is also a time dependent restriction on the divided road that restricts traffic from turning onto the side road in the morning rush-hour; then we would add additional restrictions to the above:

at node B:
1, 7AM, 9AM, node_to=G, ancestors=[C]
1, 7AM, 9AM, node_to=E, ancestors=[C]

at node E:
1, 7AM, 9AM, node_to=B, ancestors=[F]
1, 7AM, 9AM, node_to=H, ancestors=[F]

So if we translate that into a table design we have something like:

node, time_from, time_to, node_to, ancestors

which would be very easy to create and maintain.
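The restriction records above could be checked during relaxation roughly like this. This is a hedged sketch, not the real implementation: the record layout follows the (has_time, time_from, time_to, node_to, ancestors) examples above, and the hour-of-day times and helper name are assumptions:

```python
# Restriction records keyed by node, following the examples in the text:
# (has_time, time_from, time_to, node_to, ancestors). Times are plain
# hours-of-day here purely for illustration.
restrictions = {
    "B": [
        (0, 0, 0, "A", ["E", "F"]),   # no U-turn: coming via E/F, can't go to A
        (1, 7, 9, "G", ["C"]),        # 7-9 AM: traffic from C may not turn to G
        (1, 7, 9, "E", ["C"]),
    ],
    "E": [
        (0, 0, 0, "D", ["B", "C"]),   # no U-turn: coming via B/C, can't go to D
        (1, 7, 9, "B", ["F"]),
        (1, 7, 9, "H", ["F"]),
    ],
}

def turn_allowed(node, node_to, ancestor, arrival_time):
    """Hypothetical check called while relaxing edge (node, node_to),
    given the node we came from and the arrival time at `node`."""
    for has_time, t_from, t_to, restricted_to, ancestors in restrictions.get(node, []):
        if restricted_to == node_to and ancestor in ancestors:
            if not has_time:                   # always in force
                return False
            if t_from <= arrival_time < t_to:  # in force only in the window
                return False
    return True

print(turn_allowed("B", "A", "F", 12))  # False: U-turn, always restricted
print(turn_allowed("B", "G", "C", 8))   # False: rush-hour restriction
print(turn_allowed("B", "G", "C", 10))  # True: outside the 7-9 AM window
```

A null/missing entry for a node means no restrictions, matching the "reference is null" case described above.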

OK, it is late here and I have flooded the list with a lot of mail tonight. I hope this all makes some sense and that the other threads might be useful.

Best regards,
   -Steve

Thoughts?

--
Regards,
-Jay Mahadeokar


On Sun, May 1, 2011 at 10:39 AM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

On 4/30/2011 1:27 PM, Jay Mahadeokar wrote:

Hi,

My thoughts on points made by Stephen:

Firstly, the main goal of the time dependent shortest path project as
per my GSoc proposal is to extend the static dijkstra algorithm, to work
with time dependent graphs. I made a wiki page for that and we can keep
modifying it as the ideas become concrete:
https://github.com/pgRouting/pgrouting/wiki/TDSP-Details

*Input format for time dependent data:

I would think that you need:

start_time_and_date,
start_location,
end_location,
options

It you wanted to get fancy, it might be nice to be able to do the
reverse and say:

start_location,
end_location,
arrival_time,
options

This looks good.
A tuple will denote the arrival_time which will be time required if we
start from source_location at time denoted by start_time and travel to
end_location.

So, basically we can assume that the edge weights are the time required
to traverse the edge. The core time-dependent dijkstra will assume it
can call a function like:

getCurrentEdgeCost( edge_id, start_time_and_date, arrival_time) - edge
can be denoted by start_location and end_location. The arrival_time will
be returned by function.

While relaxing the edge in dijkstra algorithm, instead of static cost,
the above function will be called each time to know the arrival_time for
that instant.

In this version you would have to figure on the route starting at
the end and working backwards to the start location to figure on the
route and the required start time.

*There has already been a brief discussion on this topic before.
A quick

User - application developer, the person build an application using
our finished pgRouting product.

Client - the person using the application built by “User”

  1. the User will take data like from MassGIS and load the network
    data and prep it for pgRouting.
  2. the User will take the MBTA bus and train and subway schedules
    and load them into tables the closely reflect the underlying
    schedule data.
  3. the User will develop a stored procedure to read the schedule
    data and morph it into some predefined notion whatever that might
    be. This could be to load it into a more normalized table or it
    might be to service a specific query and return a normalized record.
  4. the Client make a route request
  5. a wrapper script queries the networks and the schedules and this
    data gets built into a temporary boost graph structures to service
    this request.
  6. results are returned to the Client and stuff is cleaned up.

I think this is basically the same flow we use today in pgRouting
with the exception of the schedules or whatever time dependent data
we have.

And with we are not using schedules and only looking at time
dependent turn restrictions and oneway streets then I think using
the same approach is still valid but instead of loading schedule
information, we would load the turn restriction.

I did not understand how we will be using bus and train schedules in
this algorithm. I guess, the basic aim at this moment is just to extend
dijkstra for time-dependent scenario, right? Dealing with schedules
will need provision for multimodal routing I guess. Can you please
clarify this point?

The multi-modal aspects are probably more important to Kishore’s project, but in general this is also just a time dependent problem. Both these problems have the same issue of tracking time along the solution wave-front and at nodes you have to evaluate what out-bound links are value given some rules. The rules are based on your arrival time at that node.

So in your case given the the arrival time, your out-bound links might be modified by time dependent turn-restrictions and your average speed on the out-bound links might be effected by traffic feeds or historical average speed based on time of day, etc.

In Kishore’s case, the valid out-bound links will be determined by modal schedules, and average link speeds will be determined by the mode of travel, like walk, bike, taxi, train etc. At nodes that are transfer points, then there may need to be rules about minimum transfer times, which equate to wait time between arrival and potential departures.

So if we think about these as abstractions, then at nodes we can have rules like:

  1. restrictions
     - time dependent
     - vehicle class dependent restrictions
       o mopeds not allowed on divided highway
       o Bus only lanes, or HOV lanes
       o Emergency vehicle only lanes
       o pedestrian ways
       o bike lanes and paths
  2. schedules
     - time dependent (this is very similar to restrictions)
     - vehicle class dependent
  3. transfer costs
     - additional cost of making a left hand turn in a right side country
     - cost to transfer modes in multi-modal network
     - toll cost

I was not exactly aware of concept of turn restrictions, so I referred
here: http://wiki.openstreetmap.org/wiki/Relation:restriction

Yes, this is a good document. I believe it is missing an important case:

                   G
                   |
A---------<-------B-----------<---------C
                   |
F--------->-------E----------->---------D
                   |
                   H

So this depicts a street (G-B-E-H) crossing a divided roadway (A-B-C) west bound and (F-E-D) east bound. Now we don’t need to worry about turning the wrong way on the oneway segments as that is handled.

If there are No U-turns at this intersection, then we need a restriction that says you may not turn onto AB via BE from FE, but you are allowed to turn onto AB via BE from EH; from the opposite direction, ED is restricted from CB via BE, but EH is ok.

I guess to provide support for time dependent turn restriction will just
need a slight modification in original approach as per the GSoc
proposal. We will just have to make sure if we are allowed to turn to
particular road at that particular time, before we relax it in dijkstra
algorithm.

Right. In Dijkstra, you typically have a pointer to your parent node, ie: the node you came from. And you have a list of node you can go to. In this case it is easy to add a reference to a restrictions on a given node. If the reference is null, then there are not restrictions. If you have a reference then restrictions could be stored in an array or list like:

has_time, time_from, time_to, node_to, ancestors

has_time, time_from, time_to - controls time dependancy or always valid

In my above example, we would have two restrictions for the No U-turns like:

at node B:
0, 0, 0, node_to=A, ancestors[E,F]

at node E:
0, 0, 0, node_to=D, ancestors=[B,C]

Maybe there is also a time dependent restriction on the divided road that restrictions traffic from turning onto the side road in the morning rush-hour, then we would add an additional restrictions to the above:

at node B:
1, 7AM, 9AM, node_to=G, ancestors=[C]
1, 7AM, 9AM, node_to=E, ancestors=[C]

at node E:
1, 7AM, 9AM, node_to=B, ancestors=[F]
1, 7AM, 9AM, node_to=H, ancestors=[F]

So if we translate that into a table design we have something like:

node, time_from, time_to, node_to, ancestors

which would be very easy to create and maintain.

Thanks Steve, for the detailed explanation.

I agree. So, can we now assume we have the following database tables, as per the things discussed till now:

  1. Ways: (On lines of the one in pgrouting-workshop)
  • gid

  • source

  • target

  • name

  • the_geom

  • static_length (In case time dependent routing is not used)

  • any other attributes…

  2. TimeVaryingCosts (Any other suitable name?)
  • gid

  • start_time_and_date (May divide this into two columns: day_of_week and time_of_day ? )

  • travel_time / arrival_time

  3. TurnRestrictions (Time dependent)
  • node/source

  • time_from

  • time_to

  • node_to

  • ancestors

If only a static shortest path query is made, then only table 1 is required.
If a time-dependent shortest path query is made, tables 1 and 2 will be used.
If a time-dependent shortest path query with turn restrictions is made, all 3 tables will be used.
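For illustration, the three proposed tables could be sketched as DDL. Column names follow the lists above, but the types are assumptions; the_geom would really be a PostGIS geometry and ancestors an integer array in PostgreSQL, both stubbed as TEXT here so the sketch runs on SQLite:

```python
import sqlite3

# Sketch of the three proposed tables. Types are assumptions; PostGIS /
# array columns are stubbed as TEXT for the sake of a runnable example.
ddl = """
CREATE TABLE ways (
    gid           INTEGER PRIMARY KEY,
    source        INTEGER,
    target        INTEGER,
    name          TEXT,
    the_geom      TEXT,     -- PostGIS geometry in the real schema
    static_length REAL      -- used when time dependent routing is off
);
CREATE TABLE time_varying_costs (
    gid                 INTEGER REFERENCES ways (gid),
    start_time_and_date TEXT,  -- or split: day_of_week + time_of_day
    travel_time         REAL
);
CREATE TABLE turn_restrictions (
    node      INTEGER,
    time_from TEXT,
    time_to   TEXT,
    node_to   INTEGER,
    ancestors TEXT             -- integer array in the real schema
);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)   # ['time_varying_costs', 'turn_restrictions', 'ways']
```

A static query touches only ways; a time-dependent query joins in time_varying_costs; turn restrictions add the third table, matching the three cases above.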

Thoughts?

OK, it is late here and I have flooded the list with a lot of mail tonight. I hope this all makes some sense and that the other threads might be useful.

Best regards,
-Steve

Thoughts?


Regards,
-Jay Mahadeokar




Regards,
-Jay Mahadeokar

Hi,

Just a recap of our planned TDSP query model:

I agree. So, now can we assume we have following database tables, as per things discussed till now:

  1. Ways: (On lines of the one in pgrouting-workshop)
  • gid

  • source

  • target

  • name

  • the_geom

  • static_length (In case time dependent routing is not used)

  • any other attributes…

  2. TimeVaryingCosts (Any other suitable name?)
  • gid

  • start_time_and_date (May divide this into two columns: day_of_week and time_of_day ? )

  • travel_time / arrival_time

  3. TurnRestrictions (Time dependent)
  • node/source

  • time_from

  • time_to

  • node_to

  • ancestors

If only a static shortest path query is made, then only table 1 is required.
If a time-dependent shortest path query is made, tables 1 and 2 will be used.
If a time-dependent shortest path query with turn restrictions is made, all 3 tables will be used.

So, for a general time dependent query, we will fetch edge information from the ways table and the time dependent cost information from the TimeDepCosts table. (See previous msg above.)

The query on the ways table can be supplied as usual, as in a normal dijkstra shortest_path() query.

How do we form the query on the TimeDepCosts table? I see two options:

  1. We generate the query according to the gids obtained from the query on the ways table. We will have to query all the time slots available, or form a query so that the time-window is appropriate and we have sufficient travel_time info to reach the destination.

Here we can form a query like:

Select * from TimeDepCosts where gid in (list of gids queried from ways table);

We can allow the user to add more filtering parameters according to start_time and date?
But this will make our query specific to the database design. As already discussed, we can't predict the time units / requirements of applications, and so it would be better if we keep the query independent of the database design.

That's the reason our weight_map assumes the start time from the source as 0, with all travel times as offsets from the start time at the source. A function that will generate such a weight map is to be written.
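Such a weight-map function might look roughly like this. This is only a sketch under assumed formats: the row shape (gid, slot start, travel_time), the timestamp handling, and the name build_weight_map are all hypothetical:

```python
from datetime import datetime

def build_weight_map(rows, query_start):
    """Hypothetical helper: convert absolute time slots from the
    TimeDepCosts table into offsets (seconds) from the query's start
    time, since the weight map assumes time 0 at the source."""
    weight_map = {}
    for gid, slot_start, travel_time in rows:
        offset = (slot_start - query_start).total_seconds()
        if offset >= 0:                  # ignore slots before departure
            weight_map[(gid, offset)] = travel_time
    return weight_map

# assumed sample rows for edge gid 101, in 5-minute slots
rows = [
    (101, datetime(2011, 5, 2, 8, 0), 120.0),   # 8:00 slot: 120 s
    (101, datetime(2011, 5, 2, 8, 5), 150.0),   # 8:05 slot: 150 s
]
wm = build_weight_map(rows, datetime(2011, 5, 2, 8, 0))
print(wm)   # {(101, 0.0): 120.0, (101, 300.0): 150.0}
```

Keeping the offsets relative to the query start is what keeps the core algorithm independent of whatever time units or calendar conventions the application's tables use.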

  2. We allow the user to pass the query for the TimeDepCosts table too. So, we will have our pgsql function as:

CREATE OR REPLACE FUNCTION shortest_path(
        ways_sql text,
        time_dep_cost_sql text,
        source_id integer,
        target_id integer,
        directed boolean,
        has_reverse_cost boolean)
    RETURNS SETOF path_result
This way, the user has more control over the amount of data being queried. We will only consider the time dependent windows that fit the query.

An example of the time_dep_cost_sql can be:

SELECT * FROM TimeDepCosts WHERE <USER_SUPPLIED_CONSTRAINTS>;

The user can then supply constraints according to the time window needed, start times, holidays, etc., matching the database table design.

The problem with this is that we cannot filter edges by bounding box, since there is no the_geom field in this table (that information would be redundant there).

Thoughts? Any idea that combines the merits of both approaches?


Regards,
-Jay Mahadeokar

On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar <jai.mahadeokar@gmail.com> wrote:

Hi,

Just a recap to our planned tdsp query model:

I agree. So, can we now assume we have the following database tables, as per things discussed till now:

  1. Ways: (On lines of the one in pgrouting-workshop)
  • gid

  • source

  • target

  • name

  • the_geom

  • static_length (In case time dependent routing is not used)

  • any other attributes…

  2. TimeVaryingCosts (Any other suitable name?)
  • gid

  • start_time_and_date (May divide this into two columns: day_of_week and time_of_day ? )

  • travel_time / arrival_time

  3. TurnRestrictions (Time dependent)
  • node/source

  • time_from

  • time_to

  • node_to

  • ancestors

If only static shortest path query is made, then only table 1 is required.
If time-dependent shortest path query is made, table 1 and 2 will be used.
If a time-dependent shortest path query with turn restrictions is made, all 3 tables will be used.


Hi all,

I am writing some background code that will be needed for the above functionality.

Any input on the points above? I suppose I should not start coding for any specific approach until we agree upon a viable query strategy.


Regards,
-Jay Mahadeokar



On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

Hi Jay,

I always find it easiest to answer these questions by thinking about the problem in terms of use cases, so:

A web app, where someone enters a start and destination and some constraints like departure time or arrival time and maybe restrictions on modes of travel.

Are there other ones we are trying to support? These should probably be documented as part of the project description.

Anyway, assuming the above, the application, which presumably has some basic knowledge of the time constraints, could do something like:

Get the straight-line distance between start and destination, double it, and apply some average travel speed to approximate the time window. The numbers can be tuned by the application, as long as the duration of travel is over-estimated. It can then specify the CONSTRAINTS in terms of a start and end time. We do something similar with the spatial bounding box used to collect the edges needed to build the graph.
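A back-of-the-envelope sketch of that window estimate (illustrative only; the function name, units, and default values are assumptions, not part of any pgRouting API):

```python
import math

def estimate_time_window(start_xy, dest_xy, avg_speed_kmh=30.0, slack=2.0):
    """Over-estimate travel duration in hours: straight-line distance,
    doubled (slack factor), divided by an assumed average speed.
    Coordinates are in km for simplicity; a real application would use
    geodesic distance on geographic coordinates."""
    dx = dest_xy[0] - start_xy[0]
    dy = dest_xy[1] - start_xy[1]
    distance_km = math.hypot(dx, dy)
    return slack * distance_km / avg_speed_kmh

# A 30 km straight-line trip at 30 km/h, doubled: 2 hours.
window_h = estimate_time_window((0.0, 0.0), (30.0, 0.0))
```

The application would then request time-dependent costs only for [departure, departure + window_h].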

Now, as far as collecting the constraints and making them accessible to the Boost dijkstra, that is another problem. It boils down to the following:

1. Is there an acceptable abstraction that we can use internally to represent the various data sets we might want to load? Or can we define a schema that will work for some set of constraints we wish to implement initially?
2. Do we want to move this data into static memory structures for the analysis, like we do for the graph, and then dispose of them after the analysis?
3. Or do we want to construct a method that can dynamically request constraint information from the database when a visitor needs it? (2 is probably much more efficient and the better way to go from a complexity and stability point of view.)

Don't wait on me for additional input, as I will be gone through June 21st. Keep it simple and modular and we can change or add to it in the future; I'm sure whatever you come up with will be fine. After all, it is code and it can be changed if we need to. :slight_smile:

Project seems to be moving along fine.

-Steve

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:


Hi Jay,

I always find it easiest to answer these questions by thinking about the problem in terms of use cases, so:

A web app, where someone enters a start and destination and some constraints like departure time or arrival time and maybe restrictions on modes of travel.

Are there other ones we are trying to support? These should probably be documented as part of the project description.

I agree, we need to document the use-cases. A separate thread for this discussion?

Anyway, assuming the above, then the application which presumably has some basic knowledge of the time constraints could do something like:

Get the straight line distance between start and destination double it and apply some average transportation speed to approximate the time window. The numbers can be changed to be acceptable by the application as long as it over estimates the duration of travel. It can then, specify the CONSTRAINTS in terms of a start and end time. We do something similar with our spatial bounding box used to collect the edges needed to build the graph.

Now as far as collecting the constraints and making then accessible for the Boost_dijkstra, that is another problem. This problem boils down to the following:

Just a correction: in this case we will not be using boost_dijkstra, but the core_tdsp() function which we have built ourselves.

  1. Is there an acceptable abstraction that we can use internally to represent various data sets that we might want to load? Or can we define a schema that will work for some set of constraints we wish to implement initially?

Well, we had agreed upon the following schema for testing purpose:

  1. Ways: (On lines of the one in pgrouting-workshop)

    • gid
    • source
    • target
    • name
    • the_geom
    • static_length (In case time dependent routing is not used)
    • any other attributes
  2. TimeDepCosts

    • gid
    • day_of_week
    • time_of_day
    • travel_time / arrival_time

The time_dep_costs table schema can vary according to the application, its needs, and the nature of the data available.

  2. Do we want to move this data into some static memory structures for the analysis, like we do for the graph, and then dispose of them after the analysis?

We are doing exactly that! The weight_map data structure supplied to core_tdsp() is designed specifically to fulfil this. It assumes the start time is 0, with all other times offset from the start time. So there will be schema-specific wrapper functions that help generate the weight_map data structure.

Which brings me back to my original question (a couple of messages back): what is the best way to generate the weight_map, i.e. to retrieve data from the time-dependent cost table?

Well, I think for now I am going to go with approach #2, i.e. the user supplies his own query. This might retrieve unwanted data, since we can't filter by the_geom in table 2. But I think we can refine this by adding constraints like: WHERE gid IN (list_of_edgeids_retrieved_from_ways)

So I guess this approach combines the merits of both approaches discussed before. We can give the application flexibility as well as constrain the number of rows retrieved.
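A trivial sketch of that refinement (function and column names are assumptions; real code should parameterize the gid list rather than string-build it, to avoid SQL injection):

```python
def refine_time_dep_sql(user_constraints, gids):
    """Append a gid IN (...) filter to the user-supplied constraint so
    that only rows for edges already selected from the ways table are
    fetched. Table/column names follow the schema in this thread."""
    gid_list = ", ".join(str(g) for g in sorted(gids))
    return ("SELECT * FROM TimeDepCosts WHERE (%s) AND gid IN (%s)"
            % (user_constraints, gid_list))

sql = refine_time_dep_sql("day_of_week = 1", {3, 1, 2})
# "SELECT * FROM TimeDepCosts WHERE (day_of_week = 1) AND gid IN (1, 2, 3)"
```

This keeps the user's schema-specific constraints intact while bounding the row count by the edge set of the graph being built.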

Thoughts?

  3. Or do we want to construct a method that can dynamically request constraint information from the database when it is needed by a visitor? (2 is probably much more efficient and the better way to go from a complexity and stability point of view.)

Don't wait on me for additional input, as I will be gone through June 21st. Keep it simple and modular and we can change or add to it in the future; I'm sure whatever you come up with will be fine. After all, it is code and it can be changed if we need to. :slight_smile:

Project seems to be moving along fine.

-Steve


pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev


Regards,
-Jay Mahadeokar

On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
<woodbri@swoodbridge.com <mailto:woodbri@swoodbridge.com>> wrote:


    Hi Jay,

    I always find it easiest to answer these questions by thinking about
    the problem in terms of use cases, so:

    A web app, where someone enters a start and destination and some
    constraints like departure time or arrival time and maybe
    restrictions on modes of travel.

    Are there other ones we are trying to support? These should probably
    be documented as part of the project description.

I agree, we need to document the use-cases. A separate thread for this
discussion?

    Anyway, assuming the above, then the application which presumably
    has some basic knowledge of the time constraints could do something
    like:

    Get the straight line distance between start and destination double
    it and apply some average transportation speed to approximate the
    time window. The numbers can be changed to be acceptable by the
    application as long as it over estimates the duration of travel. It
    can then, specify the CONSTRAINTS in terms of a start and end time.
    We do something similar with our spatial bounding box used to
    collect the edges needed to build the graph.

    Now as far as collecting the constraints and making then accessible
    for the Boost_dijkstra, that is another problem. This problem boils
    down to the following:

Just a correction, in this case we will not be using boost_dijkstra, but
instead core_tdsp() function which we have build ourselves.

    1. is there an acceptable abstraction that we can use internally to
    represent various data sets that we might want to load. Or can we
    define a schema that will work for some set of constraints we wish
    to implement initially.

Well, we had agreed upon the following schema for testing purpose:

Yes, you are correct, sorry I was not paying attention when I wrote that.

   1. Ways: (On lines of the one in pgrouting-workshop)

          * gid
          * source
          * target
          * name
          * the_geom
          * static_length (In case time dependent routing is not used)
          * any other attributes

   2. TimeDepCosts

          * gid
          * day_of_week
          * time_of_day
          * travel_time / arrival_time

The time_dep_costs table schema can vary according to the application, its needs, and the nature of the data available.

    2. do want to move this data into some static memory structures for
    the analysis like we do for the graph and then dispose of them after
    the analysis.

We are doing exactly that! The Weight_Map data structure that is
supplied to the core_tdsp() is designed specifically to fulfil the
above functionality. It assumes the start time as 0, and all other
times offset from start time. So, there will be schema specific wrapper
functions that will help generate the weight_map data structure.

Which brings me back, to my original question (couple of messages back),
what is the best way to generate weight_map, ie retrieve data from time
dependent cost table.

Well, I think for now I am going to go with approach #2, i.e. the user will
supply his query. This might retrieve unwanted data, since we can't

Yes, I think this is the way to go. It is up to the user, who presumably knows his data, to formulate an efficient query to retrieve the needed data.

filter by the_geom in table#2. But, I think we can refine this by adding
constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
   from table2 a, table1 b
  where b.the_geom && <BBOX_OF_REGION>
    and b.gid=a.gid
    and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.

So, I guess this approach combines the merits of both approaches
discussed before. We can provide flexibility to application as well as
add constraint on the number of rows retrieved.

Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not have time-dependent data associated with them. If you look at what Navteq and the traffic people (whoever they are) are doing, you will notice that they mostly only track major highways in/out of major metropolitan areas. They do not track that information on minor roads or in rural areas. So I think the number of segments likely to have time-dependent data for any given request is going to be on the order of 5% of the total segments. Even if someone were to collect, say, GPS cellphone average speeds over a much wider range of segments, I still think it is manageable. Anyway, we just have to solve the problem; they have to represent their data in a way that we can handle quickly, or this will not be a solution for them.

-Steve

    3. or do we want to construct a method that can dynamically request
    constraint information from the database when it is needed by a
    visitor. (2. is probably much more efficient and the better way to
    go from a complexity and stability point of view).

    Don't wait on me for additional input as I will be gone through June
    21st. Keep it simple and modular and we can change or add to it in
    the future and I'm sure whatever you come up with will be fine.
    After all it is code and it can be changed if we need to. :slight_smile:

    Project seems to be moving along fine.

    -Steve


--
Regards,
-Jay Mahadeokar


Hi all,

I am working on creating wrapper functions for tdsp, corresponding to the wrappers with and without bounding box as in dijkstra_shortest_paths.

As discussed many times, the TDSP algorithm expects the start_time to be 0 and all times offset from the start time. For the data I created, I took time from 0 to 24, corresponding to 24 hours, and assigned different travel_time values to different time periods as per my intuition.

If the start time is, say, 23:30, our query for retrieving entries from time_dep_costs should do the calculation so that travel times wrap around cyclically.

Now, this query depends on the format of the time-dependent data and its units. I think it is best to leave this task with the user, so that he writes a proper SQL query that retrieves the values as required by the function.
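For what it's worth, the wrap-around arithmetic itself is a one-liner; a sketch (illustrative only, hours on a 24-hour cycle, names are assumptions):

```python
def cyclic_offset(slot_hour, start_hour, period=24):
    """Offset of a time slot from the query start time, wrapping past
    midnight; e.g. a 23:30 departure reaches the 1:00 slot 1.5 h later."""
    return (slot_hour - start_hour) % period

off = cyclic_offset(1.0, 23.5)  # 1.5 hours after a 23:30 departure
```

What cannot be generalized is the schema: whether slots are keyed by day_of_week, absolute timestamps, or something else, which is why the query itself is best left to the user.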

Do you see any specific cases or data formats for which we could provide wrapper functions for the above task?

On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge

<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)> wrote:

On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
<jai.mahadeokar@gmail.com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)
<mailto:jai.mahadeokar@gmail.com
mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>> wrote:

Hi,

Just a recap to our planned tdsp query model:

I agree. So, now can we assume we have following database
tables, as per things discussed till now:

  1. Ways: (On lines of the one in pgrouting-workshop)
  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (In case time dependent routing
    is not used)
  • any other attributes…
  1. TimeVaryingCosts (Any other suitable name?)
  • gid
  • start_time_and_date (May divide this into two
    columns:
    day_of_week and time_of_day ? )
  • travel_time / arrival_time
  1. TurnRestrictions (Time dependent)
  • node/source
  • time_from
  • time_to
  • node_to
  • ancestors

If only static shortest path query is made, then only
table 1 is
required.
If time-dependent shortest path query is made, table 1 and 2
will be used.
If in time-dependent shortest path query with turn
restrictions
is made , all 3 tables will be used.

So, for a general time dependent query, we will fetch edge
information from ways table and the time dependent costs
information
from the
TimeDepCosts table. (See previous msg above.)

The query on ways table can be supplied by the usual as on
normal
dijkstra shortest_path() query.

How do we form the query on the TimeDepCosts table? I see
two options:

  1. We generate the query according to the gids obtained from the
    query on the ways table. We will have to query all the time slots
    available, or form a query so that the time window is appropriate
    and we have sufficient travel_time info to reach the destination.

Here we can form a query like:

Select * from TimeDepCosts where gid in (list of gids queried from
ways table);

We could allow the user to add more filtering parameters according to
start_time, date?
But this will make our query specific to the database design. As
already discussed, we can't predict the time units / requirements of
applications, and so it would be better if we keep the query
independent of the database design.

That's the reason our weight_map assumes the start time from the
source as 0, and all travel times offset from the start time from the
source. A function that will generate such a weight map is yet to be
written.
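To make this concrete, here is a minimal sketch (Python, hypothetical helper name; the real wrapper would sit between the SQL fetch and core_tdsp()) of what such a weight-map builder might do: re-base absolute interval start times so the route's own start time becomes 0. Times are in abstract units, since as argued above we cannot fix the units for all applications.

```python
def build_weight_map(rows, route_start):
    """rows: (edge_gid, interval_start, travel_time) tuples fetched from
    the time-dependent cost table, in whatever units the application uses.
    Returns {gid: [(offset_from_route_start, travel_time), ...]} with
    offsets re-based so the route's start time is 0."""
    weight_map = {}
    for gid, interval_start, travel_time in rows:
        offset = interval_start - route_start
        weight_map.setdefault(gid, []).append((offset, travel_time))
    for slots in weight_map.values():
        slots.sort()  # core_tdsp() would scan these in increasing offset
    return weight_map

rows = [(101, 9.0, 0.50), (101, 8.0, 0.25), (102, 8.0, 0.10)]
wm = build_weight_map(rows, route_start=8.0)
# edge 101 now carries intervals at offsets 0.0 and 1.0 from the start
```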

  2. We allow the user to pass the query for the TimeDepCosts table
    too. So, we will have our pgsql function as:

CREATE OR REPLACE FUNCTION shortest_path(
        ways_sql text,
        time_dep_cost_sql text,
        source_id integer,
        target_id integer,
        directed boolean,
        has_reverse_cost boolean)
RETURNS SETOF path_result

This way, the user has more control over the amount of data being
queried. We will only consider the time-dependent windows that fit
the query.

Example of the time_dep_cost_sql can be:

Select * from TimeDepCosts where USER_SUPPLIED_CONSTRAINTS;

The user can now supply his constraints according to the time window
he needs, start times, holidays etc., per the database table design.

The problem with this is that we cannot filter out edges by bounding
box etc., since the the_geom field is not in this table. That is
because the information would be redundant there.

Thoughts? Any idea that combines the merits of both approaches?

Hi all,

I am writing some background code that will be needed for the above
functionality.

Any input on the above mentioned point? I suppose I should not start
coding for any specific approach until we agree upon a viable query
strategy.

Hi Jay,

I always find it easiest to answer these questions by thinking about
the problem in terms of use cases, so:

A web app, where someone enters a start and destination and some
constraints like departure time or arrival time and maybe
restrictions on modes of travel.

Are there other ones we are trying to support? These should probably
be documented as part of the project description.

I agree, we need to document the use-cases. A separate thread for this
discussion?

Anyway, assuming the above, then the application which presumably
has some basic knowledge of the time constraints could do something
like:

Get the straight line distance between start and destination, double
it, and apply some average transportation speed to approximate the
time window. The numbers can be changed to be acceptable by the
application as long as it over-estimates the duration of travel. It
can then specify the CONSTRAINTS in terms of a start and end time.
We do something similar with our spatial bounding box used to
collect the edges needed to build the graph.

Now as far as collecting the constraints and making them accessible
for the Boost_dijkstra, that is another problem. This problem boils
down to the following:

Just a correction: in this case we will not be using boost_dijkstra, but
instead the core_tdsp() function which we have built ourselves.

  1. Is there an acceptable abstraction that we can use internally to
    represent various data sets that we might want to load? Or can we
    define a schema that will work for some set of constraints we wish
    to implement initially?

Well, we had agreed upon the following schema for testing purposes:

Yes, you are correct, sorry I was not paying attention when I wrote that.

Ways: (On lines of the one in pgrouting-workshop)

  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (In case time dependent routing is not used)
  • any other attributes

TimeDepCosts

  • gid
  • day_of_week
  • time_of_day
  • travel_time / arrival_time

The time_dep_costs table schema can vary according to the application
and its needs and the nature of the data available.

  2. Do we want to move this data into some static memory structures
    for the analysis, like we do for the graph, and then dispose of
    them after the analysis?

We are doing exactly that! The Weight_Map data structure that is
supplied to core_tdsp() is designed specifically to fulfil the
above functionality. It assumes the start time as 0, and all other
times offset from the start time. So, there will be schema-specific
wrapper functions that will help generate the weight_map data
structure.

Which brings me back to my original question (a couple of messages
back): what is the best way to generate the weight_map, i.e. retrieve
data from the time-dependent cost table?

Well, I think for now I am going to go with approach #1, i.e. the user
will supply his query. This might retrieve unwanted data, since we can't

Yes, I think this is the way to go. It is up to the user, who presumably knows his data, to formulate an efficient query to retrieve the needed data.

filter by the_geom in table#2. But, I think we can refine this by adding
constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
from table2 a, table1 b
where b.the_geom && <BBOX_OF_REGION>
and b.gid=a.gid
and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.

So, I guess this approach combines the merits of both approaches
discussed before. We can provide flexibility to application as well as
add constraint on the number of rows retrieved.
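As a rough illustration of that combined approach (a sketch only: SQLite stands in for PostgreSQL, so the PostGIS `b.the_geom && <BBOX>` join is replaced here by the plain `gid IN (...)` variant, and the table/column names just follow the schema sketched earlier in the thread):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ways (gid INTEGER, source INTEGER, target INTEGER)")
cur.execute("CREATE TABLE time_dep_costs "
            "(gid INTEGER, time_of_day REAL, travel_time REAL)")
cur.executemany("INSERT INTO ways VALUES (?, ?, ?)",
                [(1, 10, 11), (2, 11, 12), (3, 12, 13)])
cur.executemany("INSERT INTO time_dep_costs VALUES (?, ?, ?)",
                [(1, 8.0, 0.4), (1, 9.0, 0.6), (2, 8.0, 0.2),
                 (9, 8.0, 9.9)])  # gid 9 lies outside the region of interest

# gids that came back from the ways query; with PostGIS this filter
# would be the bounding-box join on b.the_geom instead
gids = [1, 2, 3]
marks = ",".join("?" * len(gids))
cur.execute("SELECT gid, time_of_day, travel_time FROM time_dep_costs"
            " WHERE gid IN (%s) AND time_of_day BETWEEN ? AND ?" % marks,
            gids + [8.0, 9.0])
rows = cur.fetchall()
# rows now holds only in-region edges inside the 8:00-9:00 window
```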

Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not have time dependent data associated with them. If you look at what Navteq and the Traffic people (whoever they are) are doing, you will notice that mostly they only track major highways in/out of major metropolitan areas. They do not track that information on minor roads or in rural areas. So I think the number of segments likely to have time dependent data for any given request is going to be on the order of 5% of the total segments. Even if someone were to collect, say, GPS cellphone average speeds over a much wider range of segments, I still think it is manageable. Anyway, we just have to solve the problem; they have to represent that data in a way that we can handle quickly, or this will not be a solution for them.

-Steve

  3. Or do we want to construct a method that can dynamically request
    constraint information from the database when it is needed by a
    visitor? (2. is probably much more efficient and the better way to
    go from a complexity and stability point of view.)

Don't wait on me for additional input as I will be gone through June
21st. Keep it simple and modular and we can change or add to it in
the future, and I'm sure whatever you come up with will be fine.
After all, it is code and it can be changed if we need to. :-)

Project seems to be moving along fine.

-Steve


pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev


Regards,
-Jay Mahadeokar




Regards,
-Jay Mahadeokar

Hi Jay,

I think there are a few important points to this problem:

1. the time based adjustments need to be stored in real human terms, ie: similar to a crontab time specification or similar to the Travel Time Analysis you found.

2. the query to pass the appropriate time dependent info can easily be normalized to the route start time by subtracting the route start time from the feature start and end times, except when it gets complicated, like rolling over from a weekday schedule to a weekend schedule on a Friday-to-Saturday transition. Likewise for holiday and event schedules.

This seems to be the hard part if there is a complicated schedule. So what we need is to define what the inputs and outputs of this query are, something like:

INPUTS:
    Start date:
    Start time:
    Max duration time:
    Start location of route:
    Ending location of route:

OUTPUTS:
    Segment id:
    time offset from:
    time offset to:
    avg speed:
    transit time:
    restriction:
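Read as code, that contract might look like the following (a sketch only; every name here is a placeholder for whatever the wrappers finally settle on, and the time fields stay in application-defined units, per the earlier discussion):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TdspRequest:
    """The INPUTS above: everything needed to window the cost data."""
    start_date: str        # e.g. "2011-07-18"
    start_time: str        # e.g. "08:00"
    max_duration: float    # over-estimate of travel time, same units as costs
    source: int            # start location of route (node id)
    target: int            # ending location of route (node id)

@dataclass
class SegmentWindow:
    """One OUTPUTS row: a segment's cost over one time window, offsets
    already re-based so the route start is 0."""
    segment_id: int
    offset_from: float
    offset_to: float
    avg_speed: float
    transit_time: float
    restriction: Optional[str] = None

req = TdspRequest("2011-07-18", "08:00", 48.0, 10, 99)
row = SegmentWindow(27, 1.0, 2.0, 25.0, 0.04)
```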

3. I assume, because your code only looks at an interval from the start, that it is not limited to a restricted time window like 12 hr or 24 hr, etc.

more below ...

On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

Hi all,

I am working on creating wrapper functions for tdsp, corresponding
to the wrappers with and without bounding box as in dijkstra_shortest_paths.

As discussed many times, the TDSP algorithm expects the start_time as 0
and all times offset from start time. For the data I had created, I had
taken time from 0 to 24, corresponding to 24 hours and assigned
different travel_time for different time periods as per my intuition.

If the start time is say 23:30, our query for retrieving entries from
time_dep_costs should do the calculation properly so that travel times
are taken cyclically.

What do you mean by cyclically? I think the times should be linear, and repeating if they are cyclical. Is this what you mean? For example, if a route starts Monday at 8AM, a segment has a 9AM-10AM restriction, and the route max duration is 48 hrs, then we should see two records:

ID, From, To, MPH, ...
27, 0900, 1000, 25, ...
27, 3300, 3400, 25, ...
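A sketch of the unrolling Steve describes (hypothetical helper; times here are hours since midnight of the start day, so 33 means 9AM the next day, matching the 3300 record above):

```python
def unroll_daily_window(win_from, win_to, route_start, max_duration,
                        period=24):
    """Expand one repeating daily window into explicit linear intervals
    that overlap [route_start, route_start + max_duration]."""
    horizon = route_start + max_duration
    length = win_to - win_from
    intervals = []
    start = win_from
    while start < horizon:
        if start + length > route_start:  # window overlaps the trip
            intervals.append((start, start + length))
        start += period                   # same window, next day
    return intervals

# Monday 8AM start, 9-10AM daily restriction, 48 hr max duration:
windows = unroll_daily_window(9, 10, route_start=8, max_duration=48)
# -> [(9, 10), (33, 34)], i.e. the 0900-1000 and 3300-3400 records
```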

I'm not even sure how I would compute this; I think it puts a lot of burden on the user to figure this out and get it correct. The obvious flip side would be for us to define a rigid format for entering schedule information, where we target support for something like 80+% of the common schedule definitions and then figure this out once for everyone. I'm not sure if you feel this will fit in your schedule or not?

Thoughts?

-Steve

Now, this query should depend on the format of the time dependent data
and its units. I think it is best that we leave this task to the user,
so that he writes a proper SQL query such that the values are properly
retrieved as required by the function.

Do you see any specific cases or data formats for which we can provide
wrapper functions for the above task?


On Mon, Jul 18, 2011 at 11:21 PM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

Hi Jay,

I think there are a few important points to this problem:

  1. the time based adjustments need to be stored in real human terms, ie: similar to a crontab time specification or similar to the Travel Time Analysis you found.

  2. the query to pass the appropriate time dependent info can easily be normalized to the route start time by subtracting the route start time from the feature start and end times, except when it gets complicated, like rolling over from a weekday schedule to a weekend schedule on a Friday-to-Saturday transition. Likewise for holiday and event schedules.

This seems to be the hard part if there is a complicated schedule. So what we need is to define what the inputs and outputs of this query are, something like:

INPUTS:
Start date:
Start time:
Max duration time:
Start location of route:
Ending location of route:

OUTPUTS:
Segment id:
time offset from:
time offset to:
avg speed:
transit time:
restriction:

Hi Steve,

Well, this was our initial plan, but we had not kept the query like this since we wanted to keep it general. I think you or Daniel had pointed out that we should not restrict our query to time units like dates, hours etc., since in some applications the time interval could be milliseconds and in others it could be years. So, I think if we leave retrieval from the time_dep_costs table as an SQL query parameter, that would be more flexible, as we had concluded earlier.

  3. I assume, because your code only looks at an interval from the start, that it is not limited to a restricted time window like 12 hr or 24 hr, etc.

more below …

On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

Hi all,

I am working on creating wrapper functions for tdsp, corresponding
to the wrappers with and without bounding box as in dijkstra_shortest_paths.

As discussed many times, the TDSP algorithm expects the start_time as 0
and all times offset from start time. For the data I had created, I had
taken time from 0 to 24, corresponding to 24 hours and assigned
different travel_time for different time periods as per my intuition.

If the start time is say 23:30, our query for retrieving entries from
time_dep_costs should do the calculation properly so that travel times
are taken cyclically.

What do you mean by cyclically? I think the times should be linear, and repeating if they are cyclical. Is this what you mean?

Yes.

For example, if a route starts Monday at 8AM, a segment has a 9AM-10AM restriction, and the route max duration is 48 hrs, then we should see two records:

ID, From, To, MPH, …
27, 0900, 1000, 25, …
27, 3300, 3400, 25, …

Instead of keeping this as two records, we can add a parameter called is_cyclic to the query. If that is true, then once we run out of the range, we wrap back to zero.

That would need many modifications in the current code, but I guess it is manageable.

Would that be ok?
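A sketch of what is_cyclic would mean for the lookup (hypothetical helper; the real change would sit inside core_tdsp()'s weight-map lookup): instead of storing the unrolled 33:00 record, the departure time is reduced modulo the schedule period before the slot search.

```python
def lookup_travel_time(schedule, depart, period=24.0):
    """schedule: (slot_start, travel_time) pairs sorted by slot_start,
    covering one period.  With the cyclic behaviour, a departure past the
    end of the range wraps back to 0 instead of falling off the table."""
    t = depart % period            # e.g. 23:30 start + 9 h -> 8:30 next day
    travel = schedule[0][1]
    for slot_start, slot_travel in schedule:
        if slot_start <= t:
            travel = slot_travel   # last slot whose start we have passed
        else:
            break
    return travel

day = [(0.0, 0.3), (7.0, 0.8), (10.0, 0.4), (17.0, 0.9), (20.0, 0.3)]
lookup_travel_time(day, 23.5)        # late-night slot -> 0.3
lookup_travel_time(day, 23.5 + 9.0)  # wraps to 8.5, morning rush -> 0.8
```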

I'm not even sure how I would compute this; I think it puts a lot of burden on the user to figure this out and get it correct. The obvious flip side would be for us to define a rigid format for entering schedule information, where we target support for something like 80+% of the common schedule definitions and then figure this out once for everyone. I'm not sure if you feel this will fit in your schedule or not?

We can surely provide functionality for some common formats. I will try to do that within the GSoC timeline if it works out, or we can work on it later too. What I'm not clear about is how we package this within the overall pgRouting project.

Thoughts?

-Steve

Now, this query should depend on the format of the time dependent data
and its units. I think it is best that we leave this task to the user,
so that he writes a proper SQL query such that the values are properly
retrieved as required by the function.

Do you see any specific cases or data formats for which we can provide
wrapper functions for the above task?

On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge

<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)> wrote:

On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)

<mailto:woodbri@swoodbridge. com

mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)>> wrote:

On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
<jai.mahadeokar@gmail.com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)
<mailto:jai.mahadeokar@gmail. com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>
<mailto:jai.mahadeokar@gmail. com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)
<mailto:jai.mahadeokar@gmail. com
mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>>> wrote:

Hi,

Just a recap to our planned tdsp query model:

I agree. So, now can we assume we have following
database
tables, as per things discussed till now:

  1. Ways: (On lines of the one in
    pgrouting-workshop)
  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (In case time dependent
    routing
    is not used)
  • any other attributes…
  1. TimeVaryingCosts (Any other suitable name?)
  • gid
  • start_time_and_date (May divide this into two
    columns:
    day_of_week and time_of_day ? )
  • travel_time / arrival_time
  1. TurnRestrictions (Time dependent)
  • node/source
  • time_from
  • time_to
  • node_to
  • ancestors

If only static shortest path query is made, then
only
table 1 is
required.
If time-dependent shortest path query is made,
table 1 and 2
will be used.
If in time-dependent shortest path query with turn
restrictions
is made , all 3 tables will be used.

So, for a general time dependent query, we will
fetch edge
information from ways table and the time dependent costs
information
from the
TimeDepCosts table. (See previous msg above.)

The query on ways table can be supplied by the usual
as on
normal
dijkstra shortest_path() query.

How do we form the query on the TimeDepCosts table?
I see
two options:

  1. We generate the query according to the gids
    obtained from the
    query on ways table. We will have to query all the
    time slots
    available, or form a query so that the time-window is
    appropriate
    and we have sufficient travel_time info so that we
    can reach
    destination.

Here we can form query like

Select * from TimeDepCosts where gid in (list of gids
queried from
ways table).

We can allow user to more filtering parameters
according to
start_time , date ?
But this will make our query specific to the
database design. As
already discussed, we cant predict the time units /
requirements of
applications and so it would be better if we keep
the query
independent of the database design.

Thats the reason our weight_map assumes start time from
source as 0,
and all travel times offset from the start time from
source. A
function that will generate such weight map is to be
written.

  1. We allow user to pass the query for TimeDepCosts
    table
    too. So,
    we will have our pgsql function as:

CREATE OR REPLACE FUNCTION shortest_path(

ways_sql text,

time_dep_cost_sql text,

source_id
integer,

target_id
integer,

directed
boolean,

has_reverse_cost boolean)
RETURNS SETOF path_result

This way, user can have more control on the amount
of data being
queried. We will only consider the time dependent
windows
that fit
the query.

Example of the time_dep_cost_sql can be:

Select * from TimeDepCosts where "
USER_SUPPLIED_CONSTRAINTS".

The user can now supply his constraints according to
time_window he
needs, start times, holidays etc according to the
database
table design.

The problem with this is that we cannot filter out edges according to a bounding box etc., since the_geom field is not in this table. This is because that information would be redundant.

Thoughts? Any idea that combines the merits of both approaches?

Hi all,

I am writing some background code that will be needed for the above functionality.

Any input on the above mentioned point? I suppose I should not start coding for any specific approach until we agree upon a viable query strategy.

Hi Jay,

I always find it easiest to answer these questions by thinking about the problem in terms of use cases, so:

A web app, where someone enters a start and destination and some constraints like departure time or arrival time and maybe restrictions on modes of travel.

Are there other ones we are trying to support? These should probably be documented as part of the project description.

I agree, we need to document the use-cases. A separate thread for this discussion?

Anyway, assuming the above, the application, which presumably has some basic knowledge of the time constraints, could do something like:

Get the straight line distance between start and destination, double it, and apply some average transportation speed to approximate the time window. The numbers can be changed to be acceptable to the application as long as it overestimates the duration of travel. It can then specify the CONSTRAINTS in terms of a start and end time. We do something similar with our spatial bounding box used to collect the edges needed to build the graph.

Now as far as collecting the constraints and making them accessible for the Boost_dijkstra, that is another problem. This problem boils down to the following:

Just a correction: in this case we will not be using boost_dijkstra, but instead the core_tdsp() function which we have built ourselves.

  1. Is there an acceptable abstraction that we can use internally to represent various data sets that we might want to load? Or can we define a schema that will work for some set of constraints we wish to implement initially.

Well, we had agreed upon the following schema for testing purposes:

Yes, you are correct; sorry, I was not paying attention when I wrote that.

Ways: (on the lines of the one in pgrouting-workshop)

  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (in case time dependent routing is not used)
  • any other attributes

TimeDepCosts

  • gid
  • day_of_week
  • time_of_day
  • travel_time / arrival_time

The time_dep_costs table schema can vary according to the application, its needs, and the nature of the data available.
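For reference, the agreed test schema could be created along these lines. This is only a sketch: the column types (and any extra columns) are assumptions for illustration, not a fixed design:

```sql
-- Minimal test schema for the two tables discussed above.
-- Types are illustrative; units for day_of_week / time_of_day are
-- left to the application, as discussed in this thread.
CREATE TABLE ways (
    gid           integer PRIMARY KEY,
    source        integer,
    target        integer,
    name          text,
    the_geom      geometry,         -- PostGIS geometry column
    static_length double precision  -- used when time dependent routing is off
);

CREATE TABLE TimeDepCosts (
    gid         integer REFERENCES ways (gid),
    day_of_week integer,            -- e.g. 0 = Sunday .. 6 = Saturday
    time_of_day integer,            -- e.g. minutes from midnight
    travel_time double precision    -- or arrival_time, per application
);
```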

  2. Do we want to move this data into some static memory structures for the analysis, like we do for the graph, and then dispose of them after the analysis?

We are doing exactly that! The weight_map data structure that is supplied to core_tdsp() is designed specifically to fulfil the above functionality. It assumes the start time as 0, and all other times as offsets from the start time. So, there will be schema-specific wrapper functions that will help generate the weight_map data structure.

Which brings me back to my original question (a couple of messages back): what is the best way to generate the weight_map, i.e. retrieve data from the time dependent cost table?

Well, I think for now I am going to go with approach #1, i.e. the user will supply his query. This might retrieve unwanted data, since we can't

Yes, I think this is the way to go. It is up to the user, who presumably knows his data, to formulate an efficient query to retrieve the needed data.

filter by the_geom in table #2. But I think we can refine this by adding constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
from table2 a, table1 b
where b.the_geom && <BBOX_OF_REGION>
and b.gid=a.gid
and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.

So, I guess this approach combines the merits of both approaches discussed before. We can provide flexibility to the application as well as add a constraint on the number of rows retrieved.
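For the combined query above to actually let the planner pick a good plan, indexes along these lines would help. This is a sketch assuming PostGIS and the example schema from this thread; index names are made up:

```sql
-- Spatial index so the bounding-box filter (b.the_geom && <BBOX>) on
-- ways can use an index scan, plus a btree index on TimeDepCosts so the
-- gid join and any time constraints can be satisfied from an index.
CREATE INDEX ways_the_geom_idx ON ways USING gist (the_geom);
CREATE INDEX timedepcosts_gid_time_idx
    ON TimeDepCosts (gid, day_of_week, time_of_day);
```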

Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not have time dependent data associated with them. If you look at what Navteq and the traffic people (whoever they are) are doing, you will notice that they mostly track only major highways in and out of major metropolitan areas. They do not track that information on minor roads or in rural areas. So I think that the number of segments that are likely to have time dependent data for any given request is going to be on the order of 5% of the total segments. Even if someone were to collect, say, GPS cellphone average speeds over a much wider range of segments, I still think it is manageable. Anyway, we just have to solve the problem; they have to represent that data in a way that we can handle quickly, or this will not be a solution for them.

-Steve

  3. Or do we want to construct a method that can dynamically request constraint information from the database when it is needed by a visitor? (2 is probably much more efficient and the better way to go from a complexity and stability point of view.)

Don't wait on me for additional input as I will be gone through June 21st. Keep it simple and modular and we can change or add to it in the future, and I'm sure whatever you come up with will be fine. After all, it is code and it can be changed if we need to. :)

Project seems to be moving along fine.

-Steve


pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev


Regards,
-Jay Mahadeokar








Regards,
-Jay Mahadeokar

Jay, Daniel,

I think we need to get our heads around this or it will be very hard for Jay to complete this as a useful tool.

Jay says:

What im not clear is, how do we package this in the overall
pgRouting project?

This is exactly the problem. So here is my 2 rupees :) on this:

I think the core algorithm needs to be generic so it works with any time frames that someone might want to use. Then we layer over that application implementations using wrappers or additional code. Since most people today are using pgRouting for vehicle routing applications, I think we need to implement an easy to use wrapper that deals with vehicle applications of date and time of day dependencies. This does not preclude someone else coming in later and writing a new wrapper to deal with their application working in years or in microsecond intervals.

For this project we need a concrete implementation that allows someone to create a table of time based restrictions of some kind on edges and, reasonably easily, say: generate a route starting at X and use these time dependent restrictions.

I do not think it needs to deal with every single possible case, but it should cover a reasonable set of cases. We can always expand on this after GSoC. I think it would be good if, in the documentation, we add some notes about things we thought of but did not support, and a short paragraph of our thoughts on how to support them in the future.

You need to size this to the time you have available so you can be successful. I will let Daniel be the final arbiter on this as I will be out of town again later this week and my time here seems to be on overload lately.

The key idea here is layering of tasks, and structure. We need to bridge the gap between the low-level generic code and the higher level application specific presentation, of which there may be many depending on the application in question.

Best regards,
   -Steve

On 7/18/2011 11:03 PM, Jay Mahadeokar wrote:

On Mon, Jul 18, 2011 at 11:21 PM, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

    Hi Jay,

    I think there are a few important points to this problem:

    1. The time based adjustments need to be stored in real human
    terms, i.e. similar to a crontab time specification or similar to the
    Travel Time Analysis you found.

    2. The query to pass the appropriate time dependent info can easily
    be normalized to the route start time by subtracting the route
    start time from the feature start and end times, except when it gets
    complicated, like rolling over from a weekday schedule to a weekend
    schedule on a Friday to Saturday transition. Likewise for holiday
    and event schedules.

    This seems to be the hard part if there is a complicated schedule.
    So what we need is to define what the inputs and outputs of this
    query are, something like:

    INPUTS:
       Start date:
       Start time:
       Max duration time:
       Start location of route:
       Ending location of route:

    OUTPUTS:
       Segment id:
       time offset from:
       time offset to:
       avg speed:
       transit time:
       restriction:

Hi Steve,

Well, this was our initial plan, but we had not kept the query like this since we wanted to keep the query general. I think you or Daniel had pointed out that we should not restrict our query to time units like dates, hours, etc., since in some applications the time interval could be milliseconds and in others it could be years. So, I think if we leave retrieval from the time_dep_costs table as an sql query parameter, that would be more flexible, as we had concluded earlier.
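Steve's normalization idea (point 2 above) could still be expressed by the user inside that sql query parameter. Purely as an illustration, assuming schedule rows store an absolute day_of_week and minutes-from-midnight window (window_start / window_end are hypothetical column names, and :start_dow / :start_minutes stand for the route start supplied by the application):

```sql
-- Express each restriction window as offsets (in minutes) from the
-- route start instead of absolute times.  A week is 10080 minutes, so
-- the modulo handles the weekday-to-weekend rollover Steve mentions.
SELECT gid,
       ((day_of_week - :start_dow) * 1440
          + window_start - :start_minutes + 10080) % 10080 AS offset_from,
       ((day_of_week - :start_dow) * 1440
          + window_end   - :start_minutes + 10080) % 10080 AS offset_to,
       travel_time
  FROM TimeDepCosts;
```

The point of the sketch is only that the offset convention of the weight_map (start = 0) can be produced in SQL without the core code knowing about days or weeks.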

    3. I assume, because your code only looks at an interval from the
    start, that it is not limited to a restricted time window like 12
    hr or 24 hr, etc.

    more below ...

    On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

        Hi all,

        Hi, I am working on creating wrapper functions for tdsp,
        corresponding to the wrappers with and without bounding box as in
        dijkstra_shortest_paths.

        As discussed many times, the TDSP algorithm expects the
        start_time as 0 and all times as offsets from the start time.
        For the data I had created, I had taken time from 0 to 24,
        corresponding to 24 hours, and assigned different travel_times
        for different time periods as per my intuition.

        If the start time is, say, 23:30, our query for retrieving
        entries from time_dep_costs should properly do the calculation
        such that travel times are taken cyclically.

    What do you mean by cyclically? I think the times should be linear
    and repeating if they are cyclical. Is this what you mean?

Yes.

    For example, if a route starts Monday at 8AM, a segment has a
    9AM-10AM restriction, and the route max duration is 48 hrs, then we
    should see two records:

    ID, From, To, MPH, ...
    27, 0900, 1000, 25, ...
    27, 3300, 3400, 25, ...

Instead of keeping this as two records, we can add a parameter called is_cyclic to the query. If that is true, then once we run out of the range, we can go back to zero.

That would need many modifications in the current code, but I guess it is manageable.

Would that be ok?
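The wrap-around itself is just modular arithmetic, and could even be pushed into the user-supplied SQL rather than the core code. A minimal sketch, assuming offsets are stored in minutes over a 1440-minute daily cycle and :start_minutes is the route start time of day (23:30 would be 1410); the names are illustrative:

```sql
-- Normalize each schedule entry to an offset from the route start,
-- wrapping around the daily cycle instead of storing a second shifted
-- record per window.
SELECT gid,
       (time_of_day - :start_minutes + 1440) % 1440 AS offset_from_start,
       travel_time
  FROM TimeDepCosts
 ORDER BY gid, offset_from_start;
```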

    I'm not even sure how I would compute this; I think this puts a lot
    of burden on the user to figure this out and get it correct. The
    obvious flip side of this would be for us to define a rigid format
    for entering schedule information, where we target support for
    something like 80+% of the common schedule definitions and then
    figure this out once for everyone. I'm not sure if you feel this
    will fit in your schedule or not?

We can surely provide functionality for some common formats. I will try to do that within the GSoC timeline if it works out, or we can work on it later too. What I'm not clear on is: how do we package this in the overall pgRouting project?

    Thoughts?

    -Steve

        Now, this query should depend on the format of the time
        dependent data and its units. I think it is best that we keep
        this task with the user, so that he writes a proper sql query
        such that the values are properly retrieved as required by the
        function.

        Do you see any specific cases or data formats where we can
        provide wrapper functions for the above task?

        On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge
        <woodbri@swoodbridge.com> wrote:

            On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

                On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
                <woodbri@swoodbridge.com> wrote:

                    On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

                        On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
                        <jai.mahadeokar@gmail.com> wrote:

                            Hi,

                            Just a recap to our planned tdsp query model:

                                I agree. So, now can we assume we have
                                the following database tables, as per
                                things discussed till now:

                                1. Ways: (on the lines of the one in
                                   pgrouting-workshop)
                                    * gid
                                    * source
                                    * target
                                    * name
                                    * the_geom
                                    * static_length (in case time
                                      dependent routing is not used)
                                    * any other attributes

                                2. TimeVaryingCosts (any other suitable
                                   name?)
                                    * gid
                                    * start_time_and_date (may divide
                                      this into two columns: day_of_week
                                      and time_of_day?)
                                    * travel_time / arrival_time

                                3. TurnRestrictions (time dependent)
                                    * node/source
                                    * time_from
                                    * time_to
                                    * node_to
                                    * ancestors

                                If only a static shortest path query is
                                made, then only table 1 is required.
                                If a time-dependent shortest path query
                                is made, tables 1 and 2 will be used.
                                If a time-dependent shortest path query
                                with turn restrictions is made, all 3
                                tables will be used.


Hi Steve, Jay,

What you wrote I fully agree with; I can't write it better.
Generic is fine for the basic functions, but to make it usable, with road data for example, some custom wrappers are best.
Such a wrapper can also be written for OSM data only, and for just a certain use case.
The current wrapper functions are far from perfect, but they have existed for many years now.

So just pick a few use cases. Those wrappers can evolve over time. They should be seen as examples that can be modified by users. And they can be used for documentation and demo queries.

Daniel

On Tue, Jul 19, 2011 at 12:35 PM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

Jay, Daniel,

I think we need to get our heads around this or it will be very hard for Jay to complete this as a useful tool.

Jay says:

What I'm not clear about is: how do we package this in the overall
pgRouting project?

This is exactly the problem. So here is my 2 rupees :slight_smile: on this:

I think the core algorithm needs to be generic so it works with any time frames that someone might want to use. Then we layer application implementations over that using wrappers or additional code. Since most people today are using pgRouting for vehicle routing applications, I think we need to implement an easy-to-use wrapper that deals with vehicle applications of date and time-of-day dependencies. This does not preclude someone else coming in later and writing a new wrapper to deal with their application working in years or microsecond intervals.

For this project we need a concrete implementation that allows someone to create a table of time-based restrictions of some kind on edges and, reasonably easily, say: generate a route starting at X and use these time-dependent restrictions.

I do not think it needs to deal with every single possible case, but it should be a reasonable set of cases. We can always expand on this after GSoC. I think it would be good if, in the documentation, we add some notes about things we thought of but did not support, and a short paragraph of our thoughts on how to support them in the future.

You need to size this to the time you have available so you can be successful. I will let Daniel be the final arbiter on this, as I will be out of town again later this week and my time here seems to be on overload lately.

The key idea here is layering of tasks, and structure. We need to bridge the gap between low-level generic code and higher-level application-specific presentation, of which there may be many variants depending on the application in question.

Best regards,
-Steve

On 7/18/2011 11:03 PM, Jay Mahadeokar wrote:

On Mon, Jul 18, 2011 at 11:21 PM, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

Hi Jay,

I think there are a few important points to this problem:

  1. the time based adjustments need to be stored in real human
    terms, i.e.: similar to a crontab time specification or to the
    Travel Time Analysis you found.

  2. the query to pass the appropriate time dependent info can easily
    be normalized to the route start time by subtracting the route
    start time from the feature start and end times, except when it gets
    complicated, like rolling over from a weekday schedule to a weekend
    schedule on a Friday-to-Saturday transition. Likewise for holiday
    and event schedules.

This seems to be the hard part if there is a complicated schedule.
So what we need is to define what the inputs and outputs of this
query are, something like:

INPUTS:
Start date:
Start time:
Max duration time:
Start location of route:
Ending location of route:

OUTPUTS:
Segment id:
time offset from:
time offset to:
avg speed:
transit time:
restriction:

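As a sketch only, that contract could be expressed as a Postgres
composite type for the query's output rows (all names here are
hypothetical, not an agreed API):

```sql
-- Illustrative sketch: the OUTPUTS above as a composite type that
-- the time dependent query could return, normalized to offsets
-- from the route start time.
CREATE TYPE tdsp_restriction AS (
    segment_id       integer,
    time_offset_from float8,  -- offset from route start
    time_offset_to   float8,
    avg_speed        float8,
    transit_time     float8,
    restriction      text
);
```
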
Hi Steve,

Well, this was our initial plan, but we had not kept the query like this
since we wanted to keep the query general. I think you or Daniel had
pointed out that we should not restrict our query to time units like
date, hours etc since in some application, time interval could be
milliseconds or in some other application it could be years etc. So, I
think if we leave retrieval from time_dep_costs table as sql query
parameter, that would be more flexible as we had concluded earlier.

  3. I assume, because your code only looks at an interval from the
    start, that it is not limited to a restricted time window like 12
    hr, or 24 hr, etc.

more below …

On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

Hi all,

I am working on creating wrapper functions for tdsp,
corresponding to the wrappers with and without bounding box as in
dijkstra_shortest_paths.

As discussed many times, the TDSP algorithm expects the
start_time as 0 and all times offset from the start time. For the
data I had created, I had taken time from 0 to 24, corresponding to
24 hours, and assigned different travel_time values for different
time periods as per my intuition.

If the start time is, say, 23:30, our query for retrieving entries
from time_dep_costs should do the calculation such that travel
times are taken cyclically.

What do you mean by cyclically? I think the times should be linear,
and repeating if they are cyclical. Is this what you mean?

Yes.

For example, a route starting Monday at 8AM and a segment has
9AM-10AM restriction and the route max duration is 48hrs then we
should see two records:

ID, From, To, MPH, …
27, 0900, 1000, 25, …
27, 3300, 3400, 25, …

Instead of keeping this as two records, we can add a parameter
called is_cyclic to the query. If that is true, then once we run out
of the range, we can go back to zero.

That would need many modifications in the current code, but I guess
it is manageable.

Would that be ok?
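
A minimal sketch of what that expansion could look like in SQL,
using Steve's example of edge 27 with a daily 9AM-10AM restriction,
a Monday 8AM start, and a 48-hour max duration (table-free, purely
illustrative, with offsets in the same HHMM-style units as the
records above):

```sql
-- Hypothetical sketch: derive the repeated records on the fly
-- instead of storing them twice.  One day adds 2400 to the offset.
SELECT 27 AS id,
       900  + d * 2400 AS offset_from,  -- yields 0900, then 3300
       1000 + d * 2400 AS offset_to     -- yields 1000, then 3400
  FROM generate_series(0, 1) AS d;      -- one row per repeated day
```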

I’m not even sure how I would compute this; I think this puts a lot
of burden on the user to figure out and get correct. The obvious
flip side would be for us to define a rigid format for entering
schedule information, where we target support for something like
80+% of the common schedule definitions and figure this out once for
everyone. I’m not sure if you feel this will fit in your schedule
or not?

We can surely provide functionality for some common formats. I will
try to do that within the GSoC timeline if it works out, or we can
work on it later too. What I'm not clear about is: how do we package
this in the overall pgRouting project?

Thoughts?

-Steve

Now, this query should depend on the format of the time dependent
data and its units. I think it is best that we keep this task with
the user, so that he writes a proper SQL query such that the values
are properly retrieved as required by the function.

Do you see any specific cases or data formats for which we can
provide the wrapper functions for the above task?

On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
<jai.mahadeokar@gmail.com> wrote:

Hi,

Just a recap to our planned tdsp query model:

I agree. So, can we now assume we have the following database
tables, as per what has been discussed till now:

  1. Ways: (On lines of the one in pgrouting-workshop)
  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (in case time dependent routing is not used)
  • any other attributes…

  2. TimeVaryingCosts (Any other suitable name?)
  • gid
  • start_time_and_date (May divide this into two columns:
    day_of_week and time_of_day?)
  • travel_time / arrival_time

  3. TurnRestrictions (Time dependent)
  • node/source
  • time_from
  • time_to
  • node_to
  • ancestors

If only a static shortest path query is made, then only table 1 is
required.
If a time-dependent shortest path query is made, tables 1 and 2
will be used.
If a time-dependent shortest path query with turn restrictions is
made, all 3 tables will be used.

So, for a general time dependent query, we will fetch edge
information from the ways table and the time dependent cost
information from the TimeDepCosts table. (See previous msg above.)

The query on the ways table can be supplied as usual, as in a
normal dijkstra shortest_path() query.

How do we form the query on the TimeDepCosts table? I see two
options:

  1. We generate the query according to the gids obtained from the
    query on the ways table. We will have to query all the time
    slots available, or form a query so that the time window is
    appropriate and we have sufficient travel_time info to reach the
    destination.

Here we can form a query like:

Select * from TimeDepCosts where gid in (list of gids queried from
ways table);

We can allow the user to add more filtering parameters according to
start_time, date?
But this will make our query specific to the database design. As
already discussed, we can't predict the time units / requirements
of applications, and so it would be better if we keep the query
independent of the database design.

That's the reason our weight_map assumes the start time from the
source as 0, and all travel times offset from that start time. A
function that will generate such a weight map is to be written.

  2. We allow the user to pass the query for the TimeDepCosts table
    too. So, we will have our pgsql function as:

CREATE OR REPLACE FUNCTION shortest_path(
        ways_sql text,
        time_dep_cost_sql text,
        source_id integer,
        target_id integer,
        directed boolean,
        has_reverse_cost boolean)
RETURNS SETOF path_result

This way, the user can have more control over the amount of data
being queried. We will only consider the time dependent windows
that fit the query.

An example of the time_dep_cost_sql can be:

Select * from TimeDepCosts where "USER_SUPPLIED_CONSTRAINTS";
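
Concretely, a call under this option might look something like the
following (just a sketch; the column names and ids are assumptions,
not an agreed contract):

```sql
-- Hypothetical invocation of the proposed signature.  The second
-- argument is the user-supplied time_dep_cost_sql with his own
-- constraints baked in.
SELECT * FROM shortest_path(
    'SELECT gid, source, target, static_length AS cost FROM ways',
    'SELECT gid, day_of_week, time_of_day, travel_time
       FROM TimeDepCosts
      WHERE day_of_week = 1
        AND time_of_day BETWEEN 8 AND 20',
    7,      -- source_id
    2305,   -- target_id
    true,   -- directed
    false); -- has_reverse_cost
```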

The user can now supply his constraints according to the time
window he needs, start times, holidays etc., according to the
database table design.

The problem with this is, we cannot filter out edges by bounding
box etc., since the the_geom field is not in this table. This is
because that information would be redundant.

Thoughts? Any idea that combines the merits of both approaches?

Hi all,

I am writing some background code that will be needed for the above
functionality.

Any input on the above mentioned point? I suppose I should not
start coding for any specific approach until we agree upon a viable
query strategy.

Hi Jay,

I always find it easiest to answer these questions by thinking
about the problem in terms of use cases, so:

A web app, where someone enters a start and destination and some
constraints like departure time or arrival time, and maybe
restrictions on modes of travel.

Are there other ones we are trying to support? These should
probably be documented as part of the project description.

I agree, we need to document the use-cases. A separate thread for
this discussion?

Anyway, assuming the above, then the application, which presumably
has some basic knowledge of the time constraints, could do
something like:

Get the straight-line distance between start and destination,
double it, and apply some average transportation speed to
approximate the time window. The numbers can be changed to whatever
is acceptable to the application, as long as it overestimates the
duration of travel. It can then specify the CONSTRAINTS in terms of
a start and end time. We do something similar with our spatial
bounding box used to collect the edges needed to build the graph.

Now as far as collecting the constraints and making them accessible
for the Boost_dijkstra, that is another problem. This problem boils
down to the following:

Just a correction: in this case we will not be using
boost_dijkstra, but instead the core_tdsp() function which we have
built ourselves.

  1. is there an acceptable abstraction that we can use internally
    to represent various data sets that we might want to load? Or
    can we define a schema that will work for some set of
    constraints we wish to implement initially?

Well, we had agreed upon the following schema for testing purposes:

Yes, you are correct, sorry I was not paying attention when I wrote
that.

Ways: (On lines of the one in pgrouting-workshop)

  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (In case time dependent routing is not used)
  • any other attributes

TimeDepCosts

  • gid
  • day_of_week
  • time_of_day
  • travel_time / arrival_time

The time_dep_costs table schema can vary according to the
application and its needs and the nature of the data available.
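
For testing, those two tables could be created along these lines (a
sketch only; the column types are my assumptions and would depend on
the actual data):

```sql
-- Sketch of the test schema discussed above.  Types are guesses,
-- e.g. time_of_day could just as well be an interval or integer.
CREATE TABLE ways (
    gid           integer PRIMARY KEY,
    source        integer NOT NULL,
    target        integer NOT NULL,
    name          text,
    the_geom      geometry,   -- PostGIS geometry column
    static_length float8      -- cost when time dependence is unused
);

CREATE TABLE time_dep_costs (
    gid         integer REFERENCES ways (gid),
    day_of_week integer,      -- e.g. 0 = Sunday .. 6 = Saturday
    time_of_day float8,       -- start of the time slot
    travel_time float8        -- cost of the edge in this slot
);
```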

  2. do we want to move this data into some static memory structures
    for the analysis, like we do for the graph, and then dispose of
    them after the analysis?

We are doing exactly that! The Weight_Map data structure that is
supplied to core_tdsp() is designed specifically to fulfil the
above functionality. It assumes the start time is 0, and all other
times are offsets from the start time. So, there will be
schema-specific wrapper functions that will help generate the
weight_map data structure.

Which brings me back to my original question (a couple of messages
back): what is the best way to generate the weight_map, i.e.
retrieve data from the time dependent cost table?

Well, I think for now I am going to go with approach #1, i.e. the
user will supply his query. This might retrieve unwanted data,
since we can't

Yes, I think this is the way to go. It is up to the user, who
presumably knows his data, to formulate an efficient query to
retrieve the needed data.

filter by the_geom in table#2. But I think we can refine this by
adding constraints like:
where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
from table2 a, table1 b
where b.the_geom && <BBOX_OF_REGION>
and b.gid=a.gid
and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.
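
For the planner to actually have good indexes to pick from, the two
tables would need something along these lines (index names are
illustrative, and table1/table2 stand in for ways/time_dep_costs as
in the query above):

```sql
-- A GiST index to serve the bounding-box filter (&&) and a btree
-- index to serve the join on gid.
CREATE INDEX ways_geom_idx ON table1 USING gist (the_geom);
CREATE INDEX tdc_gid_idx   ON table2 (gid);
```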

So, I guess this approach combines the merits of both approaches
discussed before. We can provide flexibility to the application as
well as constrain the number of rows retrieved.

Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not
have time dependent data associated with them. If you look at what
Navteq and the Traffic people (whoever they are) are doing, you
will notice that mostly they only track major highways in/out of
major metropolitan areas. They do not track that information on
minor roads or in rural areas. So I think that the number of
segments likely to have time dependent data for any given request
is going to be on the order of 5% of the total segments. Even if
someone were to collect, say, GPS cellphone average speeds over a
much wider range of segments, I still think it is manageable.
Anyway, we just have to solve the problem; they have to represent
that data in a way that we can handle quickly, or this will not be
a solution for them.

-Steve

  3. or do we want to construct a method that can dynamically
    request constraint information from the database when it is
    needed by a visitor? (2 is probably much more efficient and the
    better way to go from a complexity and stability point of view.)

Don’t wait on me for additional input, as I will be gone through
June 21st. Keep it simple and modular and we can change or add to
it in the future; I’m sure whatever you come up with will be fine.
After all, it is code and it can be changed if we need to. :slight_smile:

Project seems to be moving along fine.

-Steve


Regards,
-Jay Mahadeokar


Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de

On Tue, Jul 19, 2011 at 9:05 AM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

Jay, Daniel,

I think we need to get our heads around this or it will be very hard for Jay to complete this as a useful tool.

Jay says:

What I'm not clear about is: how do we package this in the overall
pgRouting project?

This is exactly the problem. So here is my 2 rupees :slight_smile: on this:

Wow, that was innovative! :slight_smile:

I think the core algorithm needs to be generic so it works with any time frames that someone might want to use.

As far as I can tell, this part is ready.

Then we layer application implementations over that using wrappers or additional code. So since most people today are using pgRouting for vehicle routing applications, I think we need to implement an easy-to-use wrapper that deals with vehicle applications of date and time-of-day dependencies. This does not preclude someone else coming in later and writing a new wrapper to deal with their application working in years or microsecond intervals.

Now, a general pgRouting query works somewhat like this:

  1. Pass parameters to the pgRouting query
  2. Extract data from the database according to the sql parameter passed
  3. Generate the graph and other parameters using the data extracted
  4. Pass the data and other parameters to the internal function
  5. The function calculates the result
  6. Convert the result into a pgRouting result and return it.

So, we have steps 3, 4, 5 and 6 ready.

For tdsp, in step 2 we need to query 2 tables separately. The query
on the ways table will be generic. The query on the time_dep_costs
table should take care of the cyclic nature and other things.

We can have two approaches:

  1. Take some specific case, like day and time, and write code that
    will take care of the above. When we bundle this with pgRouting,
    the query will become specific to that format.
    Whenever a new data format appears, someone will need to re-code
    step 2. Since this is internal to pgRouting, it cannot be done by
    an application developer; we would need to recompile the whole
    thing as a separate query before using it.

  2. We could implement specific functions that can be used as the
    query on time_dep_costs. Those can be installed as separate
    pgRouting functions themselves. For simple applications where
    the data is in intervals of 24 hours, and cyclic, we can just
    provide a PL/pgSQL function and it should serve the purpose. I
    am not good at PL/pgSQL, but I believe it is a very flexible
    tool and can be used for complex data formats as well (thoughts?)

I think approach 2 is much better, and it keeps the main tdsp query
independent of the data format. We just need to write PL/pgSQL
functions for different data formats and supply those as the sql
argument to the tdsp query to retrieve data from the time_dep_costs
table.

If you think this is a good approach, I can start learning PL/pgSQL
in detail and work on writing retrieval functions for specific data
formats.

Is this model right? I can go ahead and start work on that.
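
As a first cut, such a retrieval function for the simple cyclic
24-hour case might look something like this (a sketch under assumed
column names, not tested code):

```sql
-- Hypothetical PL/pgSQL sketch: return time_dep_costs rows with
-- time_of_day re-expressed as an offset (in hours) from the route
-- start time, wrapping around midnight for the cyclic 24-hour case.
CREATE OR REPLACE FUNCTION tdsp_costs_cyclic(start_tod float8)
RETURNS TABLE (gid integer, time_offset float8, travel_time float8)
AS $$
BEGIN
    RETURN QUERY
    SELECT t.gid,
           -- hours elapsed since route start, wrapped into [0, 24)
           t.time_of_day - start_tod
             + CASE WHEN t.time_of_day >= start_tod
                    THEN 0 ELSE 24.0 END,
           t.travel_time
      FROM time_dep_costs t;
END;
$$ LANGUAGE plpgsql;
```

The output of such a function could then be passed as the
time_dep_cost_sql argument, keeping the core tdsp query format
independent.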

For this project we need a concrete implementation that allow someone to create a table of time based restrictions of some kind on edges and reasonable easy say generate a route start at X and use these time dependent restrictions.

I do not think it needs to deal with every single possible case, but it should be a reason set of cases. We can always expand on this after GSoC. I think it would be good in the documentation, we add some notes about things we thought of be did not support and a short paragraph of our thoughts on how to support it in the future.

You need to size this to the time you have available so you can be successful. I will let Daniel be the final arbiter on this as I will be out of town again later this week and my time here seems to on overload lately.

The key idea here is layer of tasks, and structure. We need to bridge the gap between low-level generic code and higher level application specific presentation of which there my be many depending on the application in discussion.

Best regards,
-Steve

On 7/18/2011 11:03 PM, Jay Mahadeokar wrote:

On Mon, Jul 18, 2011 at 11:21 PM, Stephen Woodbridge

<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)> wrote:

Hi Jay,

I there are a few important points to this problem:

  1. the time based adjustments need to be stored in in real human
    terms, ie: similar to crontab time specification or similar to the
    Travel Time Analysis you found.

  2. the query to pass the appropriate time dependent info can easily
    be normalized to the the route start time by subtracting the route
    start time from the feature start and end times, except when it get
    complicated like rolling over from a week day schedule to a weekend
    schedule on a friday to saturday transition. Likewise for holiday
    and event schedules.

This seems to be the hard part if there is a complicated schedule.
So what we need is to define that are the inputs and outputs of this
query, something like:

INPUTS:
Start date:
Start time:
Max duration time:
Start location of route:
Ending location of route:

OUTPUTS:
Segment id:
time offset from:
time offset to:
avg speed:
transit time:
restriction:

Hi Steve,

Well, this was our initial plan, but we had not kept the query like this
since we wanted to keep the query general. I think you or Daniel had
pointed out that we should not restrict our query to time units like
date, hours etc since in some application, time interval could be
milliseconds or in some other application it could be years etc. So, I
think if we leave retrieval from time_dep_costs table as sql query
parameter, that would be more flexible as we had concluded earlier.

  1. I assume because your code only looks at an interval from the
    start, that it is not limited to an restricted time window like 12
    hr, or 24 hr, etc.

more below …

On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

Hi all,

Hi, I am working on creating wrapper functions for tdsp,
corresponding
to the wrapper with and without bounding box as in
dijkstra_shortest_paths.

As discussed many times, the TDSP algorithm expects the
start_time as 0
and all times offset from start time. For the data I had
created, I had
taken time from 0 to 24, corresponding to 24 hours and assigned
different travel_time for different time periods as per my
intuition.

If start time is say 23:30, our query for retrieving entries from
time_dep_costs should properly do calculation such that travel
times are
taken cyclically.

What to you mean by cyclically? I think the times should be linear
and repeating if they are cyclical. Is this what you mean?

Yes.

For example, a route starting Monday at 8AM and a segment has
9AM-10AM restriction and the route max duration is 48hrs then we
should see two records:

ID, From, To, MPH, …
27, 0900, 1000, 25, …
27, 3300, 3400, 25, …

Instead of seeing keeping this as two records, we can add a parameter
called is_cyclic in query. If that is true, then once we run out of the
range, we can go back to zero.

That would need many modifications in current code, but I guess it is
manageable.

Would that be ok?

I’m not even sure how I would compute this, I think this puts a lot
of burden on the user to figure this out and get it correct. The
obvious flip side of this would be for us to define a rigid format
for entering schedule information where we target support for
something like 80+% of the common schedule definitions and then
figure this out once for everyone. I’m not sure if you feel this
will fit in your schedule or not?

We can sure provide functionality for some common formats, I will try
and do that within GSoc timeline if it works out, or we can work on that
later too. What im not clear is, how do we package this in the overall
pgRouting project?

Thoughts?

-Steve

Now, this query should depend on the format of the time
dependent data
and its units. I think it is best that we keep this task with
the user
so that he writes proper sql query such that the values are properly
retrieved as required by the function.

Do you see any specific cases, data formats where we can
provide the
wrapper functions for above task?

On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge
<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)

<mailto:woodbri@swoodbridge. com
mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)>> wrote:

On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
<woodbri@swoodbridge.com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)
<mailto:woodbri@swoodbridge. com mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)>

<mailto:woodbri@swoodbridge mailto:[woodbri@swoodbridge](mailto:woodbri@swoodbridge). com

<mailto:woodbri@swoodbridge. com
mailto:[woodbri@swoodbridge.com](mailto:woodbri@swoodbridge.com)>>> wrote:

On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
<jai.mahadeokar@gmail.com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)
<mailto:jai.mahadeokar@gmail. com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>

<mailto:jai.mahadeokar@gmail mailto:[jai.mahadeokar@gmail](mailto:jai.mahadeokar@gmail). com

<mailto:jai.mahadeokar@gmail. com
mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>>

<mailto:jai.mahadeokar@gmail mailto:[jai.mahadeokar@gmail](mailto:jai.mahadeokar@gmail). com

<mailto:jai.mahadeokar@gmail. com mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>

<mailto:jai.mahadeokar@gmail mailto:[jai.mahadeokar@gmail](mailto:jai.mahadeokar@gmail). com

<mailto:jai.mahadeokar@gmail. com
mailto:[jai.mahadeokar@gmail.com](mailto:jai.mahadeokar@gmail.com)>>>> wrote:

Hi,

Just a recap to our planned tdsp query model:

I agree. So, now can we assume we have
following
database
tables, as per things discussed till now:

  1. Ways: (On lines of the one in
    pgrouting-workshop)
  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (In case time
    dependent
    routing
    is not used)
  • any other attributes…
  1. TimeVaryingCosts (Any other
    suitable name?)
  • gid
  • start_time_and_date (May divide
    this into two
    columns:
    day_of_week and time_of_day ? )
  • travel_time / arrival_time
  1. TurnRestrictions (Time dependent)
  • node/source
  • time_from
  • time_to
  • node_to
  • ancestors

If only static shortest path query is
made, then
only
table 1 is
required.
If time-dependent shortest path query is
made,
table 1 and 2
will be used.
If in time-dependent shortest path query
with turn
restrictions
is made , all 3 tables will be used.

So, for a general time dependent query, we will
fetch edge
information from ways table and the time
dependent costs
information
from the
TimeDepCosts table. (See previous msg above.)

The query on ways table can be supplied by
the usual
as on
normal
dijkstra shortest_path() query.

How do we form the query on the TimeDepCosts
table?
I see
two options:

  1. We generate the query according to the gids
    obtained from the
    query on ways table. We will have to query
    all the
    time slots
    available, or form a query so that the
    time-window is
    appropriate
    and we have sufficient travel_time info so
    that we
    can reach
    destination.

Here we can form query like

Select * from TimeDepCosts where gid in
(list of gids
queried from
ways table).

We can allow user to more filtering parameters
according to
start_time , date ?
But this will make our query specific to the
database design. As
already discussed, we cant predict the time
units /
requirements of
applications and so it would be better if we
keep
the query
independent of the database design.

Thats the reason our weight_map assumes
start time from
source as 0,
and all travel times offset from the start
time from
source. A
function that will generate such weight map
is to be
written.

  1. We allow user to pass the query for
    TimeDepCosts
    table
    too. So,
    we will have our pgsql function as:

CREATE OR REPLACE FUNCTION shortest_path(

ways_sql text,

time_dep_cost_sql text,

source_id
integer,

target_id
integer,

directed
boolean,

has_reverse_cost boolean)
RETURNS SETOF path_result

This way, user can have more control on the
amount
of data being
queried. We will only consider the time
dependent
windows
that fit
the query.

Example of the time_dep_cost_sql can be:

Select * from TimeDepCosts where "
USER_SUPPLIED_CONSTRAINTS".

The user can now supply his constraints
according to
time_window he
needs, start times, holidays etc according
to the
database
table design.

The problem with this is, we cannot filter
out edges
according to
bounding box etc since the_geom field is not
there
in this
table.
This is because that information would be
redundant.

Thoughts? Any idea that combines the merits
of both
approaches?

Hi all,

I am writing some background code will be needed
for the
above
functionality.

Any input on the above mentioned point? I suppose I
should not start
coding for any specific approach until we agree
upon a
viable query
strategy.

Hi Jay,

I always find it easiest to answer these questions by
thinking about
the problem in terms of use cases, so:

A web app, where someone enters a start and
destination and some
constraints like departure time or arrival time and
maybe
restrictions on modes of travel.

Are there other ones we are trying to support? These
should
probably
be documented as part of the project description.

I agree, we need to document the use cases. A separate thread for this discussion?

Anyway, assuming the above, then the application, which presumably has some basic knowledge of the time constraints, could do something like:

Get the straight line distance between start and destination, double it, and apply some average transportation speed to approximate the time window. The numbers can be changed to whatever is acceptable to the application, as long as it overestimates the duration of travel. It can then specify the CONSTRAINTS in terms of a start and end time. We do something similar with our spatial bounding box used to collect the edges needed to build the graph.
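The estimate described above can be sketched in a few lines; this is a minimal illustration only, with hypothetical names, assuming distance in km and an assumed average speed:

```python
def estimate_time_window(straight_line_km, avg_speed_kmh=50.0, slack_factor=2.0):
    """Overestimate travel duration (in hours) by doubling the straight-line
    distance and applying an average transportation speed."""
    return (straight_line_km * slack_factor) / avg_speed_kmh

# A 25 km straight-line trip, doubled to 50 km at 50 km/h -> a 1.0 hour window
window_h = estimate_time_window(25.0)
```

The slack factor and speed are tunable; the only requirement is that the result overestimates the true duration so no needed time window is excluded.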

Now as far as collecting the constraints and making them accessible for the Boost_dijkstra, that is another problem. This problem boils down to the following:

Just a correction: in this case we will not be using boost_dijkstra, but instead the core_tdsp() function which we have built ourselves.

  1. Is there an acceptable abstraction that we can use internally to represent the various data sets that we might want to load? Or can we define a schema that will work for some set of constraints we wish to implement initially?

Well, we had agreed upon the following schema for testing purposes:

Yes, you are correct, sorry I was not paying attention when I wrote that.

Ways: (On lines of the one in pgrouting-workshop)

  • gid
  • source
  • target
  • name
  • the_geom
  • static_length (in case time dependent routing is not used)
  • any other attributes

TimeDepCosts

  • gid
  • day_of_week
  • time_of_day
  • travel_time / arrival_time

The time_dep_costs table schema can vary according to the application and its needs, and the nature of the data available.

  2. Do we want to move this data into some static memory structures for the analysis, like we do for the graph, and then dispose of them after the analysis?

We are doing exactly that! The weight_map data structure that is supplied to core_tdsp() is designed specifically to fulfil the above functionality. It assumes the start time as 0, and all other times offset from the start time. So, there will be schema-specific wrapper functions that will help generate the weight_map data structure.

Which brings me back to my original question (a couple of messages back): what is the best way to generate the weight_map, i.e. retrieve data from the time dependent cost table?

Well, I think for now, I am going to go with approach #1, i.e. the user will supply his query. This might retrieve unwanted data, since we can't

Yes, I think this is the way to go. It is up to the user, who presumably knows his data, to formulate an efficient query to retrieve the needed data.

filter by the_geom in table #2. But I think we can refine this by adding constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
from table2 a, table1 b
where b.the_geom && <BBOX_OF_REGION>
and b.gid=a.gid
and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.
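Assembling that combined filter programmatically could look like the following sketch; the helper name is hypothetical and not part of pgRouting, and ST_GeomFromText is used as one possible way to supply the bounding geometry:

```python
def build_time_dep_sql(bbox_wkt, time_constraint):
    """Assemble a time_dep_cost_sql string that joins table2 (time dependent
    costs) to table1 (ways), so the planner can filter by the_geom even
    though table2 itself has no geometry column."""
    return (
        "select a.* "
        "from table2 a, table1 b "
        f"where b.the_geom && ST_GeomFromText('{bbox_wkt}') "
        "and b.gid = a.gid "
        f"and {time_constraint};"
    )

sql = build_time_dep_sql(
    "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))",
    "a.time_of_day between 8 and 10",
)
```

The application remains free to substitute any TIME_CONSTRAINTS expression it likes; only the gid join and the bounding-box test are fixed by this sketch.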

So, I guess this approach combines the merits of both approaches discussed before. We can provide flexibility to the application as well as add a constraint on the number of rows retrieved.

Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not have time dependent data associated with them. If you look at what Navteq and the Traffic people (whoever they are) are doing, you will notice that mostly they only track major highways in/out of major metropolitan areas. They do not track that information on minor roads or in rural areas. So I think that the number of segments likely to have time dependent data for any given request is going to be on the order of 5% of the total segments. Even if someone were to collect, say, GPS cellphone average speeds over a much wider range of segments, I still think it is manageable. Anyway, we just have to solve the problem; they have to represent that data in a way that we can handle quickly, or this will not be a solution for them.

-Steve

  3. Or do we want to construct a method that can dynamically request constraint information from the database when it is needed by a visitor? (2 is probably much more efficient and the better way to go from a complexity and stability point of view.)

Don’t wait on me for additional input as I will be gone through June 21st. Keep it simple and modular and we can change or add to it in the future, and I'm sure whatever you come up with will be fine. After all, it is code and it can be changed if we need to. :slight_smile:

Project seems to be moving along fine.

-Steve


pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev


Regards,
-Jay Mahadeokar




Regards,
-Jay Mahadeokar




Regards,
-Jay Mahadeokar




Regards,
-Jay Mahadeokar

We can have two approaches:

  1. Take some specific case like day and time, and write down code that will take care of the above thing. When we bundle this with pgRouting, the query will become specific to that format.
    Whenever a new data format appears, someone will need to re-code step 2. Since this is internal to pgRouting, this cannot be done by the application developer. Then we would need to recompile the whole thing as a separate query before using it.

  2. We could implement specific functions that can be used as the query to time_dep_costs. Those can be installed as separate pgRouting functions themselves. For simple applications where the data is in intervals of 24 hours, and cyclic, we can just provide a postgres plsql function and it should serve the purpose. I am not good at plsql, but I believe it is a very flexible tool and can be used for complex data formats as well (thoughts?)

I think approach 2 is much better and it keeps the main tdsp query independent of the data format. We just need to write plsql functions for different data formats and supply those as sql argument to tdsp query to retrieve data from time_dep_costs table.

If you think this is a good approach, I can start learning plsql in detail and work on writing retrieval functions for specific data formats.

Is this model right? I can go ahead and start work on that.

Hi Jay,

Yes, I think (2) is better. When you look at the DARP function, it does something similar, taking several queries as parameters. The last parameter for example is a distance matrix. The DARP function doesn’t care how this distance matrix is generated. It just needs to have the right format.
So if someone stores the matrix as a table in the database, it’s fine. But it’s also OK to calculate the distance on the fly with some other function.
So TDSP could do it the same way, right?

I’m not good at pl/pgsql … I always pass the task to Anton :wink:
But I don’t think it’s too difficult to learn. It’s maybe lacking extensive documentation and a good editor.

When writing functions it’s good to keep namespace support in mind. Just as a side note.

Daniel


Georepublic UG & Georepublic Japan
eMail: daniel.kastl@georepublic.de
Web: http://georepublic.de

On 7/19/2011 12:24 AM, Jay Mahadeokar wrote:

On Tue, Jul 19, 2011 at 9:05 AM, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

    Jay, Daniel,

    I think we need to get our heads around this or it will be very hard
    for Jay to complete this as a useful tool.

    Jay says:

        What im not clear is, how do we package this in the overall
        pgRouting project?

    This is exactly the problem. So here is my 2 rupees :slight_smile: on this:

Wow, that was innovative! :slight_smile:

Well probably not, sorry that is what happens when I'm too tired and try to be cute. <sigh>

    I think the core algorithm needs to be generic so it works with any
    time frames that someone might want to use.

As far as I think, this part is ready.

    Then we layer over that application implementations using wrappers
    or additional code. So, since most people today are using pgRouting
    for vehicle routing applications, I think we need to implement an
    easy to use wrapper that deals with vehicle applications of date and
    time of day dependencies. This does not preclude someone else coming
    in later and writing a new wrapper to deal with their application
    working in years or microsecond intervals.

Now, general pgRouting query is somewhat like this:

1. Pass parameters to pgRouting query
2. Extract data from database according to sql parameter passed
3. Generate the graph and other parameters using data extracted
4. Pass data and other parameters to internal function
5. Function calculates the result
6. Convert the result into pgRouting result and return it.

  So, we have steps 3,4,5 and 6 ready.

For tdsp, in step 2 we need to query 2 tables separately. The query to
the ways table will be generic. The query to the time_dep_costs table
should take care of the cyclic nature and other things.

We can have two approaches:

1. Take some specific case like day and time, and write down code that
will take care of the above thing. When we bundle this with pgRouting,
the query will become specific to that format.
Whenever a new data format appears, someone will need to re-code step
2. Since this is internal to pgRouting, this cannot be done by the
application developer. Then we would need to recompile the whole thing
as a separate query before using it.

2. We could implement specific functions that can be used as the query
to time_dep_costs. Those can be installed as separate pgRouting
functions themselves. For simple applications where the data is in
intervals of 24 hours, and cyclic, we can just provide a postgres
plsql function and it should serve the purpose. I am not good at
plsql, but I believe it is a very flexible tool and can be used for
complex data formats as well (thoughts?)

I think approach 2 is much better and it keeps the main tdsp query
independent of the data format. We just need to write plsql functions
for different data formats and supply those as sql argument to tdsp
query to retrieve data from time_dep_costs table.

If you think this is a good approach, I can start learning plsql in
detail and work on writing retrieval functions for specific data formats.

Is this model right? I can go ahead and start work on that.

Yes, I think approach 2 is the better model because application developer should not be required to recompile anything.

plsql is very straightforward, and I think we can all help with the learning curve or even code some of it if we have time. I think if you assume a set returning function, then you have inputs and a record structure for your results. The function takes the inputs and returns a set of records. So for example:

-- define an output record structure as a type
-- using pgr_ as namespace for function, change as appropriate

create type pgr_time_dep_costs as (
   edge_id int,
   time_start float,
   time_stop float,
   avg_speed float,
   transit_time float
);

CREATE OR REPLACE FUNCTION pgr_get_time_dep_costs(schedule_tab text, start_date_time timestamp, max_time_duration_sec int, ...)
   RETURNS SETOF pgr_time_dep_costs AS
$BODY$
DECLARE
   out pgr_time_dep_costs;

BEGIN

   LOOP
     -- compute values of out from the input

     -- load the values into out
     out.edge_id := ...;
     out.time_start := ...;
     out.time_stop := ...;
     out.avg_speed := ...;
     out.transit_time := ...;

     RETURN NEXT out; -- return an out record

     -- EXIT WHEN <no more records to return>;
   END LOOP;

   RETURN; -- exit the procedure
END
$BODY$
   LANGUAGE plpgsql VOLATILE
   COST 100
   ROWS 1000;

This is pretty basic, but it should help get you started.

-Steve

    For this project we need a concrete implementation that allows
    someone to create a table of time based restrictions of some kind on
    edges, and reasonably easily, say, generate a route starting at X
    using these time dependent restrictions.

    I do not think it needs to deal with every single possible case, but
    it should be a reasonable set of cases. We can always expand on this
    after GSoC. I think it would be good if, in the documentation, we add
    some notes about things we thought of but did not support, and a
    short paragraph of our thoughts on how to support them in the future.

    You need to size this to the time you have available so you can be
    successful. I will let Daniel be the final arbiter on this as I will
    be out of town again later this week and my time here seems to be on
    overload lately.

    The key idea here is layering of tasks, and structure. We need to
    bridge the gap between low-level generic code and higher level
    application specific presentation, of which there may be many
    depending on the application in discussion.

    Best regards,
      -Steve

    On 7/18/2011 11:03 PM, Jay Mahadeokar wrote:

        On Mon, Jul 18, 2011 at 11:21 PM, Stephen Woodbridge
        <woodbri@swoodbridge.com> wrote:

            Hi Jay,

            There are a few important points to this problem:

            1. the time based adjustments need to be stored in real human
            terms, ie: similar to a crontab time specification or similar
            to the Travel Time Analysis you found.

            2. the query to pass the appropriate time dependent info can
            easily be normalized to the route start time by subtracting
            the route start time from the feature start and end times,
            except when it gets complicated, like rolling over from a
            weekday schedule to a weekend schedule on a Friday to Saturday
            transition. Likewise for holiday and event schedules.

            This seems to be the hard part if there is a complicated
            schedule. So what we need is to define what the inputs and
            outputs of this query are, something like:

            INPUTS:
               Start date:
               Start time:
               Max duration time:
               Start location of route:
               Ending location of route:

            OUTPUTS:
               Segment id:
               time offset from:
               time offset to:
               avg speed:
               transit time:
               restriction:
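The interface sketched above can be pinned down as plain records; the field names below are hypothetical, chosen only to mirror the inputs and outputs listed:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TdspQueryInputs:
    start_date: str        # e.g. "2011-07-18"
    start_time: str        # e.g. "08:00"
    max_duration_sec: int  # overestimated route duration
    start_location: int    # source node id
    end_location: int      # target node id

@dataclass
class TdspCostRecord:
    segment_id: int
    time_offset_from: float  # seconds from route start
    time_offset_to: float
    avg_speed: float
    transit_time: float
    restriction: Optional[str] = None
```

One record per segment and time window would be returned, with offsets already normalized to the route start time.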

        Hi Steve,

        Well, this was our initial plan, but we had not kept the query
        like this since we wanted to keep the query general. I think you
        or Daniel had pointed out that we should not restrict our query
        to time units like date, hours etc, since in some applications
        the time interval could be milliseconds and in some other
        application it could be years etc. So, I think if we leave
        retrieval from the time_dep_costs table as an sql query
        parameter, that would be more flexible, as we had concluded
        earlier.

            3. I assume, because your code only looks at an interval from
            the start, that it is not limited to a restricted time window
            like 12 hr, or 24 hr, etc.

            more below ...

            On 7/18/2011 7:49 AM, Jay Mahadeokar wrote:

                Hi all,

                Hi, I am working on creating wrapper functions for tdsp,
                corresponding to the wrappers with and without bounding
                box as in dijkstra_shortest_paths.

                As discussed many times, the TDSP algorithm expects the
                start_time as 0 and all times offset from the start time.
                For the data I had created, I had taken time from 0 to 24,
                corresponding to 24 hours, and assigned different
                travel_time for different time periods as per my
                intuition.

                If the start time is, say, 23:30, our query for retrieving
                entries from time_dep_costs should properly do the
                calculation such that travel times are taken cyclically.

            What do you mean by cyclically? I think the times should be
            linear and repeating if they are cyclical. Is this what you
            mean?

        Yes.

            For example, a route starting Monday at 8AM where a segment
            has a 9AM-10AM restriction and the route max duration is
            48hrs, then we should see two records:

            ID, From, To, MPH, ...
            27, 0900, 1000, 25, ...
            27, 3300, 3400, 25, ...

        Instead of keeping this as two records, we can add a parameter
        called is_cyclic in the query. If that is true, then once we run
        out of the range, we can go back to zero.

        That would need many modifications in the current code, but I
        guess it is manageable.

        Would that be ok?
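The is_cyclic idea amounts to wrapping lookup times modulo the schedule period; a minimal sketch, with a hypothetical helper, assuming times in hours and a 24-hour period:

```python
def cyclic_offset(start_time, offset, period=24.0):
    """Map an absolute travel-time offset from the route start onto the
    cyclic schedule range [0, period)."""
    return (start_time + offset) % period

# A route starting at 23:30 (23.5 h): one hour into the route falls at 00:30
t = cyclic_offset(23.5, 1.0)  # -> 0.5
```

With this wrapping, the single 9AM-10AM record suffices; the 48-hour route simply hits it again on the second pass instead of needing a duplicated 3300-3400 row.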

            I'm not even sure how I would compute this; I think this puts
            a lot of burden on the user to figure this out and get it
            correct. The obvious flip side of this would be for us to
            define a rigid format for entering schedule information,
            where we target support for something like 80+% of the common
            schedule definitions, and then figure this out once for
            everyone. I'm not sure if you feel this will fit in your
            schedule or not?

        We can sure provide functionality for some common formats. I will
        try and do that within the GSoC timeline if it works out, or we
        can work on that later too. What I'm not clear about is, how do
        we package this in the overall pgRouting project?

            Thoughts?

            -Steve

                Now, this query should depend on the format of the time
                dependent data and its units. I think it is best that we
                keep this task with the user, so that he writes a proper
                sql query such that the values are properly retrieved as
                required by the function.

                Do you see any specific cases or data formats where we
                can provide the wrapper functions for the above task?

                On Sun, Jun 12, 2011 at 8:26 PM, Stephen Woodbridge
                <woodbri@swoodbridge.com> wrote:

                    On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:

                        On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
                        <woodbri@swoodbridge.com> wrote:

                            On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:

                                On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
                                <jai.mahadeokar@gmail.com> wrote:

                                    Hi,

                                    Just a recap of our planned tdsp query model:

                                        I agree. So, now can we assume we have the
                                        following database tables, as per things
                                        discussed till now:

                                        1. Ways: (on the lines of the one in
                                        pgrouting-workshop)

                                            * gid
                                            * source
                                            * target
                                            * name
                                            * the_geom
                                            * static_length (in case time dependent
                                              routing is not used)
                                            * any other attributes..

                                        2. TimeVaryingCosts (any other suitable name?)

                                            * gid
                                            * start_time_and_date (may divide this into
                                              two columns: day_of_week and time_of_day?)
                                            * travel_time / arrival_time

                                        3. TurnRestrictions (time dependent)

                                            * node/source
                                            * time_from
                                            * time_to
                                            * node_to
                                            * ancestors

                                        If only a static shortest path query is made,
                                        then only table 1 is required.
                                        If a time-dependent shortest path query is
                                        made, tables 1 and 2 will be used.
                                        If a time-dependent shortest path query with
                                        turn restrictions is made, all 3 tables will
                                        be used.

                                    So, for a general time dependent query, we will
                                    fetch edge information from the ways table and the
                                    time dependent costs information from the
                                    TimeDepCosts table. (See previous msg above.)

                                    The query on the ways table can be supplied as
                                    usual, as on a normal dijkstra shortest_path()
                                    query.

                                    How do we form the query on the TimeDepCosts
                                    table? I see two options:

                                    1. We generate the query according to the gids
                                    obtained from the query on the ways table. We will
                                    have to query all the time slots available, or
                                    form a query so that the time-window is
                                    appropriate and we have sufficient travel_time
                                    info so that we can reach the destination.

                                    Here we can form a query like:

                                    Select * from TimeDepCosts where gid in (list of
                                    gids queried from ways table).

                                    We can allow the user more filtering parameters
                                    according to start_time, date?
                                    But this will make our query
        specific to the
                        database design. As
                                    already discussed, we cant predict
        the time
                units /
                                requirements of
                                    applications and so it would be
        better if we
                keep
                        the query
                                    independent of the database design.

                                    Thats the reason our weight_map assumes
                start time from
                                source as 0,
                                    and all travel times offset from the
        start
                time from
                        source. A
                                    function that will generate such
        weight map
                is to be
                        written.

                                    2. We allow user to pass the query for
                TimeDepCosts
                        table
                                too. So,
                                    we will have our pgsql function as:

                                    CREATE OR REPLACE FUNCTION
          shortest_path(

                        ways_sql text,

                                time_dep_cost_sql text,

                        source_id
                                  integer,

                        target_id
                                  integer,

                        directed
                                  boolean,

                                has_reverse_cost boolean)
                                             RETURNS SETOF path_result

                                    This way, user can have more control
        on the
                amount
                        of data being
                                    queried. We will only consider the time
                dependent
                        windows
                                that fit
                                    the query.

                                    Example of the time_dep_cost_sql can be:

                                    Select * from TimeDepCosts where "
                        USER_SUPPLIED_CONSTRAINTS".

                                    The user can now supply his constraints
                according to
                                time_window he
                                    needs, start times, holidays etc
        according
                to the
                        database
                                table design.

                                    The problem with this is, we cannot
        filter
                out edges
                                according to
                                    bounding box etc since the_geom
        field is not
                there
                        in this
                                table.
                                    This is because that information
        would be
                redundant.

                                    Thoughts? Any idea that combines the
        merits
                of both
                        approaches?

                                Hi all,

                                I am writing some background code will
        be needed
                for the
                        above
                                functionality.

                                Any input on the above mentioned point?
        I suppose I
                        should not start
                                coding for any specific approach until
        we agree
                upon a
                        viable query
                                strategy.

                            Hi Jay,

                            I always find it easiest to answer these
        questions by
                        thinking about
                            the problem in terms of use cases, so:

                            A web app, where someone enters a start and
                destination and some
                            constraints like departure time or arrival
        time and
                maybe
                            restrictions on modes of travel.

                            Are there other ones we are trying to
        support? These
                should
                        probably
                            be documented as part of the project
        description.

                        I agree, we need to document the use-cases. A
        separate
                thread
                        for this
                        discussion?

                            Anyway, assuming the above, the application, which presumably has some basic knowledge of the time constraints, could do something like:

                            Get the straight line distance between start and destination, double it, and apply some average transportation speed to approximate the time window. The numbers can be changed to whatever is acceptable to the application, as long as it overestimates the duration of travel. It can then specify the CONSTRAINTS in terms of a start and end time. We do something similar with our spatial bounding box used to collect the edges needed to build the graph.

                            Now as far as collecting the constraints and making them accessible for the Boost_dijkstra, that is another problem. This problem boils down to the following:

                        Just a correction: in this case we will not be using boost_dijkstra, but instead the core_tdsp() function which we have built ourselves.

                            1. Is there an acceptable abstraction that we can use internally to represent the various data sets that we might want to load? Or can we define a schema that will work for some set of constraints we wish to implement initially?

                        Well, we had agreed upon the following schema for testing purposes:

                    Yes, you are correct; sorry, I was not paying attention when I wrote that.

                           1. Ways: (on the lines of the one in pgrouting-workshop)

                                  * gid
                                  * source
                                  * target
                                  * name
                                  * the_geom
                                  * static_length (in case time dependent routing is not used)
                                  * any other attributes

                           2. TimeDepCosts

                                  * gid
                                  * day_of_week
                                  * time_of_day
                                  * travel_time / arrival_time

                        The time_dep_costs table schema can vary according to the application, its needs, and the nature of the data available.
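                        As a concrete sketch, that test schema could be created like this (the column types and constraints here are only my guesses, not something we agreed on):

                        ```sql
                        -- Sketch only: types and constraints are assumptions, adjust as needed
                        CREATE TABLE ways (
                            gid           integer PRIMARY KEY,
                            source        integer,
                            target        integer,
                            name          text,
                            the_geom      geometry,          -- PostGIS geometry column
                            static_length double precision   -- cost when time dependent routing is not used
                            -- any other attributes
                        );

                        CREATE TABLE time_dep_costs (
                            gid         integer REFERENCES ways (gid),
                            day_of_week integer,             -- e.g. 0 = Sunday .. 6 = Saturday
                            time_of_day double precision,    -- e.g. seconds from midnight
                            travel_time double precision     -- or arrival_time, per application
                        );
                        ```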

                            2. Do we want to move this data into some static memory structures for the analysis, like we do for the graph, and then dispose of them after the analysis?

                        We are doing exactly that! The Weight_Map data structure that is supplied to core_tdsp() is designed specifically to fulfil the above functionality. It assumes the start time to be 0, with all other times offset from the start time. So, there will be schema-specific wrapper functions that will help generate the weight_map data structure.
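                        To illustrate the offset convention, a wrapper could shift absolute times like this (a hypothetical sketch; 28800 seconds, i.e. an 08:00 query start, is just an example value):

                        ```sql
                        -- Hypothetical sketch: express times as offsets from the query
                        -- start time, since core_tdsp() assumes the start time is 0
                        SELECT gid AS edge_id,
                               time_of_day - 28800 AS start_offset,  -- 28800 s = 08:00 query start
                               travel_time
                          FROM time_dep_costs
                         WHERE time_of_day >= 28800;  -- entries before the start need cyclic wrap-around handling
                        ```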

                        Which brings me back to my original question (a couple of messages back): what is the best way to generate the weight_map, i.e. retrieve data from the time dependent cost table?

                        Well, I think for now I am going to go with approach #1, i.e. the user will supply his query. This might retrieve unwanted data, since we can't

                    Yes, I think this is the way to go. It is up to the user, who presumably knows his data, to formulate an efficient query to retrieve the needed data.

                        filter by the_geom in table#2. But I think we can refine this by adding constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

                    A query something like:

                    select a.*
                      from table2 a, table1 b
                     where b.the_geom && <BBOX_OF_REGION>
                       and b.gid = a.gid
                       and <TIME_CONSTRAINTS>;
                    This would allow the planner to pick the best index.

                        So, I guess this approach combines the merits of both approaches discussed before. We can provide flexibility to the application as well as add a constraint on the number of rows retrieved.

                        Thoughts?

                    Yes, I think this is a good plan.

                    I think that the bulk of the roads in the network will probably not have time dependent data associated with them. If you look at what Navteq and the traffic people (whoever they are) are doing, you will notice that mostly they only track major highways in/out of major metropolitan areas. They do not track that information on minor roads or in rural areas. So I think that the number of segments likely to have time dependent data for any given request is going to be around 5% of the total segments. Even if someone were to collect, say, GPS cellphone average speeds over a much wider range of segments, I still think it is manageable. Anyway, we just have to solve the problem; they have to represent their data in a way that we can handle quickly, or this will not be a solution for them.

                    -Steve

                            3. Or do we want to construct a method that can dynamically request constraint information from the database when it is needed by a visitor? (2 is probably much more efficient and the better way to go from a complexity and stability point of view.)

                            Don't wait on me for additional input, as I will be gone through June 21st. Keep it simple and modular and we can change or add to it in the future, and I'm sure whatever you come up with will be fine. After all, it is code and it can be changed if we need to. :slight_smile:

                            Project seems to be moving along fine.

                            -Steve

                            _______________________________________________
                            pgrouting-dev mailing list
                            pgrouting-dev@lists.osgeo.org
                            http://lists.osgeo.org/mailman/listinfo/pgrouting-dev

                        --
                        Regards,
                        -Jay Mahadeokar



                --
                Regards,
                -Jay Mahadeokar



        --
        Regards,
        -Jay Mahadeokar



--
Regards,
-Jay Mahadeokar

_______________________________________________
pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev

Hi all,

Steve, Daniel, thanks for the input.

Well, I have written the plsql function along the lines of what you suggested, for the data format that I had created earlier, and it seems to be working for now, with the cyclic issue also taken into account and all times offset from query_start_time.

I have updated the code in the gsoc-tdsp repository [1].

I also changed the existing tdsp.h, tdsp.c and weight_map.hpp files so that end_time is also taken into account (I had written more complex code earlier and thought end_time was redundant and wasn't necessary; corrected that now).

So, now the tdsp query basically expects data from the time_dep_costs table as a set of rows: edge_id, start_time, end_time, travel_time.

Example:

SELECT * FROM time_dependent_shortest_path('
    SELECT gid AS id,
           source::integer,
           target::integer,
           length::double precision AS cost
      FROM ways',
    81, 359, true, false,
    'SELECT edge_id, start_time, end_time, travel_time
       FROM pgr_get_time_dep_costs(''time_dep_costs'', 14, 12)',
    0);

Here, the 6th parameter is the nested pgr function.

I think there is no need to pass query_start_time to the main time_dependent_shortest_path function, since we are passing it to the nested function, and it will always be 0 in the main query (parameter 7).

Also, the parameters is_directed and has_reverse_cost seem redundant, since in this case the graph will always be directed, I think. So maybe these can be removed.

The next step would be to write similar plsql functions for the most common data formats. If you can suggest the most obvious cases, I can start working on that.

[1] https://github.com/pgRouting/pgrouting/blob/gsoc-tdsp/extra/tdsp/sql/tdsp_data_wrappers.sql

On Tue, Jul 19, 2011 at 10:29 AM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

On 7/19/2011 12:24 AM, Jay Mahadeokar wrote:

On Tue, Jul 19, 2011 at 9:05 AM, Stephen Woodbridge <woodbri@swoodbridge.com> wrote:

Jay, Daniel,

I think we need to get our heads around this or it will be very hard
for Jay to complete this as a useful tool.

Jay says:

What I'm not clear on is: how do we package this in the overall
pgRouting project?

This is exactly the problem. So here is my 2 rupees :slight_smile: on this:

Wow, that was innovative! :slight_smile:

Well, probably not; sorry, that is what happens when I'm too tired and try to be cute.

I think the core algorithm needs to be generic so it works with any
time frames that someone might want to use.

As far as I can tell, this part is ready.

Then we layer over that application implementations using wrappers
or additional code. So since most people today are using pgRouting
for vehicle routing applications I think we need to implement and
easy to use wrapper that deals with vehicle applications of date and
time of day dependencies. This does not preclude someone else coming
in later and writing a new wrapper to deal with their application
working in years or microsecond intervals.

Now, a general pgRouting query works somewhat like this:

  1. Pass parameters to the pgRouting query
  2. Extract data from the database according to the sql parameter passed
  3. Generate the graph and other parameters using the extracted data
  4. Pass the data and other parameters to the internal function
  5. The function calculates the result
  6. Convert the result into a pgRouting result and return it

So, we have steps 3, 4, 5 and 6 ready.

For tdsp, in step 2 we need to query 2 tables separately. The query to the
ways table will be generic. The query to the time_dep_costs table should take
care of the cyclic nature and other things.

We can have two approaches:

  1. Take some specific case like day and time, and write code that
    will take care of the above. When we bundle this with pgRouting,
    the query will become specific to that format. Whenever a new data
    format comes along, someone will need to re-code step 2. Since this
    is internal to pgRouting, it cannot be done by the application
    developer; the whole thing would need to be recompiled as a
    separate query before using it.

  2. We could implement specific functions that can be used as the query to
    time_dep_costs. Those can be installed as separate pgRouting functions
    themselves. For simple applications where the data is in intervals of 24
    hours and cyclic, we can just provide a postgres plsql function and it
    should serve the purpose. I am not good at plsql, but I believe it is a
    very flexible tool and can be used for complex data formats as well
    (thoughts?)

I think approach 2 is much better, as it keeps the main tdsp query
independent of the data format. We just need to write plsql functions
for different data formats and supply those as the sql argument to the
tdsp query to retrieve data from the time_dep_costs table.

If you think this is a good approach, I can start learning plsql in
detail and work on writing retrieval functions for specific data formats.

Is this model right? I can go ahead and start work on that.

Yes, I think approach 2 is the better model, because the application developer should not be required to recompile anything.

plsql is very straightforward, and I think we can all help with the learning curve or even code some of it if we have time. I think if you assume a set returning function, then you have inputs and a record structure for your results. The function takes the inputs and returns a set of records. So for example:

-- define an output record structure as a type
-- using pgr_ as namespace for function, change as appropriate

CREATE TYPE pgr_time_dep_costs AS (
    edge_id int,
    time_start float,
    time_stop float,
    avg_speed float,
    transit_time float
);

CREATE OR REPLACE FUNCTION pgr_get_time_dep_costs(schedule_tab text, start_date_time timestamp, max_time_duration_sec int, ...)
  RETURNS SETOF pgr_time_dep_costs AS
$BODY$
DECLARE
    out_rec pgr_time_dep_costs;  -- "out" is a reserved word, so use another name

BEGIN

    LOOP
        -- compute values of out_rec from the input

        -- load the values into out_rec
        out_rec.edge_id      := ...;
        out_rec.time_start   := ...;
        out_rec.time_stop    := ...;
        out_rec.avg_speed    := ...;
        out_rec.transit_time := ...;

        RETURN NEXT out_rec;  -- return one record
    END LOOP;

    RETURN;  -- exit the procedure
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;

This is pretty basic, but it should help get you started.
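For example, once such a function is installed, it could be called on its own or nested inside the tdsp query. The argument values below are placeholders of my own choosing, just to show the shape of the call:

```sql
-- Hypothetical usage of the set returning function sketched above
SELECT edge_id, time_start, time_stop, transit_time
  FROM pgr_get_time_dep_costs('time_dep_costs',
                              '2011-07-19 08:00'::timestamp,
                              3600);
```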

-Steve

For this project we need a concrete implementation that allows
someone to create a table of time based restrictions of some kind on
edges and, reasonably easily, generate a route starting at X using
these time dependent restrictions.

I do not think it needs to deal with every single possible case, but
it should cover a reasonable set of cases. We can always expand on this
after GSoC. I think it would be good if, in the documentation, we add
some notes about things we thought of but did not support, and a short
paragraph of our thoughts on how to support them in the future.

You need to size this to the time you have available so you can be
successful. I will let Daniel be the final arbiter on this, as I will
be out of town again later this week and my time here seems to be on
overload lately.

The key idea here is layering of tasks, and structure. We need to
bridge the gap between low-level generic code and higher level
application specific presentations, of which there may be many
depending on the application in discussion.

Best regards,
-Steve


Regards,
-Jay Mahadeokar