Graph in memory

robe · December 9, 2024, 9:47pm

I hadn’t thought about using background workers. Maybe we can add that as a Google Summer of Code (GSoC) project, which is coming up soon…

I also want to explore the performance impact, on bigger graphs, of the change released in pgRouting 3.7.0 that gets rid of the redundant copies of the graph data held in both C and C++, discussed in

github.com/pgRouting/pgrouting

Read postgresql data on C++

pgRouting:develop ← cvvergara:read-data-on-cpp-try3

opened 10:25PM - 24 Jan 24 UTC

cvvergara

+4770 -6656

Moving the input queries from the C code into the C++ code.

## Before this change

**On the C file:**

* Includes what we call a driver [header](https://github.com/pgRouting/pgrouting/blob/v3.6.1/include/drivers/dijkstra/dijkstra_driver.h), which is linked as C
  * The driver headers are very similar to each other
* The static `process` function within each C file ([example](https://github.com/pgRouting/pgrouting/blob/v3.6.1/src/dijkstra/dijkstra.c#L54)) is very similar across files
  * opens the connection with PostgreSQL
  * reads the data (edges sql, restrictions_sql, etc., and any array that is on the query)
    * all of this reading is done in the C code, which interacts with PostgreSQL (which is written in C)
    * Suppose, to simplify this description, that the data takes `x` MB in memory
  * calls the `pgr_do_function` defined on the driver
  * gets the results
  * closes the connection with PostgreSQL

**On the driver C++ file:**

* [code](https://github.com/pgRouting/pgrouting/blob/v3.6.1/src/dijkstra/dijkstra_driver.cpp) is written in C++
* Drivers are very similar to each other.
* Convert C arrays to C++ containers
  * Now memory size is 2x + container overhead
* Builds the boost graph
  * Now memory size is 3x + container overhead
* **None of this code interacts with PostgreSQL**

The ideal situation:

* Build the boost graph as the data is read.
* Have C++ templates on the drivers
  * The drivers are so similar that this can be done

General steps to reach the ideal situation:

1) Be able to read the PostgreSQL data on the C++ code
2) Create the templates
3) Build the boost graphs based on the templates' needs

## This PR is step 1

In rough terms, moving the reading of the data to C++.

The sketch of the C & C++ driver files for the first step:

**On the C file:**

* Includes what we call a driver `header` linked as C
  * The driver headers are very similar to each other
* The static `process` function within each C file will still be very similar across files
  * opens the connection with PostgreSQL
  * calls the `pgr_do_function` defined on the driver
  * gets the results
  * closes the connection with PostgreSQL

**On the driver C++ file:**

* [code](https://github.com/pgRouting/pgrouting/blob/v3.6.1/src/dijkstra/dijkstra_driver.cpp) is written in C++
* Drivers will still be very similar to each other.
* **C++ code will interact with PostgreSQL**
* Read the data directly into C++ containers
  * Two kinds of data
    * Data that comes from the inner SQL queries
    * Data that comes from arrays
  * All of this reading will be done in C++ code
  * Now the reading into C++ containers takes `x` MB in memory
* Builds the boost graph
  * Now memory size is 2x + container overhead

## tasks

- [x] Copy the files `pgdata_getters` and `pgdata_fetchers` to `trsp_pgget` & `trsp_pgfetch`
  - [x] Verify that the only function that includes `trsp_pgget` is the deprecated `pgr_trsp` function
  - [x] Remove the unused code on `trsp_pgget` & `trsp_pgfetch`
- [x] Create a workaround for the postgres defines of functions that exist in the standard
- [x] Adjust `pgdata_getters` and `pgdata_fetchers` to be used directly from C++
  - [x] parameter is a string containing the SQL query
  - [x] return a structure instead of a pointer
  - [x] return a C++ container instead of a pointer
- [x] Create new template get_data that works with the changes on `pgdata_getters` and `pgdata_fetchers`
- [x] Do general adjustments to several files
- [x] Per sub directory basis
  - [x] Delete from the C file the code that reads the data
  - [x] Add into the C++ driver file(s) the code that reads the data
  - [x] If necessary, adjust the code that uses a C array to use the C++ container
  - [x] standardize the driver names to start with `pgr_do_`
- [x] Update release_notes & NEWS
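
The step-1 flow described in the PR can be illustrated with a small C++ sketch. Everything in it is hypothetical: `Edge`, `RowSource`, `read_edges`, `build_graph`, and `make_sample_source` are stand-ins invented for illustration, not pgRouting or PostgreSQL APIs, and a plain adjacency list takes the place of the boost graph. It only shows the shape of the idea: rows go straight into a C++ container and the graph is built from that, so the edge data exists roughly twice in memory instead of three times.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical edge row; the field names are assumptions for this sketch.
struct Edge {
    std::int64_t id;
    std::int64_t source;
    std::int64_t target;
    double cost;
};

// Stand-in for a database cursor: fills `row` and returns true while rows remain.
using RowSource = std::function<bool(Edge&)>;

// Read rows directly into a C++ container (the `x` MB in the PR description),
// with no intermediate C array to copy from.
std::vector<Edge> read_edges(const RowSource& next_row) {
    std::vector<Edge> edges;
    Edge row{};
    while (next_row(row)) edges.push_back(row);
    return edges;
}

// Plain adjacency list standing in for the boost graph:
// g[v] holds (neighbour, cost) pairs. Vertices are assumed to be 0..n-1.
using Graph = std::vector<std::vector<std::pair<std::int64_t, double>>>;

Graph build_graph(const std::vector<Edge>& edges, std::size_t vertex_count) {
    Graph g(vertex_count);
    for (const Edge& e : edges) {
        assert(e.source >= 0 && static_cast<std::size_t>(e.source) < vertex_count);
        assert(e.target >= 0 && static_cast<std::size_t>(e.target) < vertex_count);
        g[static_cast<std::size_t>(e.source)].emplace_back(e.target, e.cost);
    }
    return g;  // edge data now lives in `edges` and in `g`: roughly 2x
}

// In-memory rows used in place of a real query result.
RowSource make_sample_source() {
    auto rows = std::vector<Edge>{{1, 0, 1, 2.5}, {2, 1, 2, 1.0}, {3, 0, 2, 4.0}};
    std::size_t i = 0;
    return [rows, i](Edge& out) mutable {
        if (i >= rows.size()) return false;
        out = rows[i++];
        return true;
    };
}
```

Calling `build_graph(read_edges(make_sample_source()), 3)` yields a three-vertex graph. The "ideal situation" from the PR would go one step further and insert each row into the graph as it is read, which this sketch does not attempt.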

I don't expect much impact on smaller graphs, but on much bigger graphs we should see lower memory use (about 2x rather than 3x the size of the input data, going by the PR description) as well as a faster graph build.
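
To put rough numbers on that expectation, here is a back-of-envelope sketch in C++. It assumes a hypothetical edge record of three 64-bit ids and two doubles (`EdgeRecord`, typically 40 bytes on a 64-bit platform) and, like the PR description, counts the boost graph as about one more copy of the edge data. Container overhead is ignored, so the results are lower bounds, not measurements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical edge record used only for this estimate.
struct EdgeRecord {
    std::int64_t id, source, target;
    double cost, reverse_cost;
};

// Bytes needed to hold `copies` in-memory copies of the edge data,
// ignoring container and graph overhead.
std::uint64_t estimated_bytes(std::uint64_t edge_count, unsigned copies) {
    assert(copies > 0);
    return edge_count * sizeof(EdgeRecord) * copies;
}

// Before the change: C array + C++ container + graph (3x).
std::uint64_t before_bytes(std::uint64_t edge_count) {
    return estimated_bytes(edge_count, 3);
}

// After the change: C++ container + graph (2x).
std::uint64_t after_bytes(std::uint64_t edge_count) {
    return estimated_bytes(edge_count, 2);
}
```

Under these assumptions, 10 million edges give an `x` of about 400 MB, so roughly 1.2 GB before the change versus 0.8 GB after it. Real numbers will depend on the container overhead and on how the boost graph stores its edges.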