Graph in memory

Hello,

I followed PostGIS day 2024 and there was this interesting remark made by Paul Ramsey in this video:

“I’m going to ask the pgRouting developers why there isn’t some sort of caching mode that keeps the whole graph in memory.”

And there was also this issue on github

As I haven’t seen a question on this platform, I’ll take the opportunity to ask it.

@kikislater
There are a couple of things we changed in recent versions to reduce the impact of graph building and to store graphs in memory for more than one route use.

  1. In prior versions of pgrouting before 3.7, the graph was built essentially twice, first in boost and then copied over to C. @cvvergara can discuss this a bit more. She revised to build once without having to copy.

  2. Functions have been added I forget the version I think from pgrouting 3.4 on, you will find many permutations of like dijkstra that does multi start , multi stop
    So it builds the graph once and uses to compute several route requests.

I discuss this a bit in my video on PostGIS Day 2024

and Vicky gives many examples of this in her state of pgRouting 2023 FOSS4G talk

So those are all going kind of in that direction.

I don’t think it’s realistic to keep a graph in RAM for multiple function calls because first of all, if you have a large dataset, you are probably using different parts of the graph depending on queries you are doing, so trying to load the whole graph in memory will easily eat up your RAM.

Secondly the way PostgreSQL is built, you can’t just create a graph in memory once and some how allow other connections to use it. In theory you could maybe trick shared buffers to do it.

I think the best bet might be to create some kind of custom type that can save the boost graph in a table in a format more usable, but I think even then you run into issues with toast / detoasting that may just make being able to pull that graph from PostgreSQL storage not worthwhile.

@pramsey has more knowledge of inner workings of PostgreSQL so might have some ideas on how he invisions this would work.

1 Like

Secondly the way PostgreSQL is built, you can’t just create a graph in memory once and some how allow other connections to use it. In theory you could maybe trick shared buffers to do it.

It would be quite a bit of a research project. In order for backends to share the graph the graph would have to be in shared memory, which implies every allocation done in building the graph also allocating within shmem, which could be tricksy, to say the least. Another approach would be to to have the graph allocated by a bgworker in its own space and then pass the queries from the backend to the bgworker via shmem. This might be a better approach. Could start with a bgworker per connection, so each system is isolated, but multiple routes on the same graph don’t have a reload/rebuild phase.

P.

I think the best bet might be to create some kind of custom type that can save the boost graph in a table in a format more usable, but I think even then you run into issues with toast / detoasting that may just make being able to pull that graph from PostgreSQL storage not worthwhile.

@pramsey has more knowledge of inner workings of PostgreSQL so might have some ideas on how he invisions this would work.


Visit Topic or reply to this email to respond.

To unsubscribe from these emails, click here.

Hadn’t thought about using background workers. We can maybe add that on as a Google Summer of Code (GSOC) project which is coming up soon…

I also want to explore the performance impact of the change in Release of pgRouting 3.7.0
of getting rid of the redundancy of graph both in C and C++ on bigger graphs discussed in

I expect not much impact with smaller graphs, but with much bigger graphs we should see lower memory utilization as well as faster build.

Thank you for taking the time to respond via this platform. I don’t know whether to discuss here or on github. It’s interesting to see that even though pgrouting performs well from my point of view, there’s an interest in improving performance.
Even if this can be a disadvantage on certain large networks that need to be managed, for other uses and I don’t know if it’s possible, but apart from the optimisations suggested by Paul which seem completely legitimate, would it be possible to load the graph in memory using an option and not at system level?