[pgrouting-dev] SSL SYSCALL error: EOF detected

Luis_de_Sousa · February 20, 2015, 10:58am

Dear all,

I am presently implementing the Optimal Meeting Point (OMP) [0]
algorithm with PyWPS. I have a draft brute force algorithm that in
smaller networks computes the OMP correctly by testing all nodes of
the network with pgr_dijkstra (I am not yet using heuristics since
costs are not only distance dependent).

I am now testing this algorithm in a larger network, of some 50 000
nodes. I previously reported that for about 7%-8% of the nodes
pgr_dijkstra fails, returning an empty tuple [1]. After patching these
exceptions, I am stumbling into another issue: when somewhere between a
half and two thirds of the network has been processed, Postgres
crashes and I get the error: "SSL SYSCALL error: EOF detected" [2, 3].
This crash seems to happen randomly, with pgr_dijkstra invoked on
different nodes.

This issue should also be easy to circumvent, but it seems to be a
symptom of something a bit more serious with pgrouting.

Regards,

Luís

[0] https://github.com/pgRouting/pgrouting/issues/289

[1] http://pgrouting-users.974093.n3.nabble.com/pgrouting-users-Empty-tuple-from-pgr-dijkstra-tp4025706.html

[2] http://stackoverflow.com/questions/20201711/pg-internalerror-ssl-syscall-error-eof-detected

[3] http://stackoverflow.com/questions/20217571/psycopg2-interfaceerror-connection-already-closed-pgr-astar

Stephen_Woodbridge1 · February 20, 2015, 2:16pm

Hi Luís,

The error: "SSL SYSCALL error: EOF detected" occurs because the postgresql backend has crashed and your connection in python has been terminated.

There are a number of things that can crash the database, like running out of memory, or running into a bug in the code.

google: gdb postgres stack trace

There are a couple of ways to get a stack trace of a crash:
1. enable core files
2. run the database under gdb

HTH,
-Steve

_______________________________________________
pgrouting-dev mailing list
pgrouting-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/pgrouting-dev

Luis_de_Sousa · February 23, 2015, 12:52pm

Dear Stephen et al.

This seems to be a memory leak in pgr_dijkstra; RAM slowly swells up
until malloc fails. It happens after some 180 000 paths have been
calculated, meaning that each call is leaving behind about 10 KB of
memory.
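
As a rough check of that estimate, the arithmetic can be sketched in Python. The two figures are the ones quoted above; the helper name is hypothetical, and taking 1 KB as 1024 bytes is an assumption.

```python
# Back-of-the-envelope check: memory left behind per call times number of calls.
KIB = 1024  # assumption: "KB" in the message means 1024 bytes


def total_leak_gib(bytes_per_call, n_calls):
    """Total memory retained after n_calls, expressed in GiB."""
    return bytes_per_call * n_calls / float(KIB ** 3)


# About 10 KB per call over some 180 000 calls.
estimate = total_leak_gib(10 * KIB, 180000)
print(round(estimate, 2))
```

Under these assumptions the total comes to roughly 1.7 GiB, which would be consistent with malloc failing on a server with a few gigabytes of RAM.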

A stack trace is attached. I do not have the debug symbols for libc,
but for postgres it seems to be all there.

I'll keep you posted if I find something more.

Regards,

Luís

gdb-postgres.txt (13.7 KB)

Stephen_Woodbridge1 · February 23, 2015, 7:01pm

Hi Luis,

This is the easiest way to find a memory leak:

http://blog.cleverelephant.ca/2008/08/valgrinding-postgis.html

You only need to run one query that you think is leaking, and this will report what leaked, how much, and where in the code.

-Steve

Luis_de_Sousa · February 24, 2015, 2:22pm

Hi Steve,

Paul's post is a bit old and things don't work exactly the same way
anymore, but I eventually succeeded in running it (the log is attached).
I used this query:

SELECT seq, id1 AS node, id2 AS edge, cost, ST_AsGeoJSON(b.the_geom)
FROM pgr_dijkstra('
          SELECT gid AS id,
               source::integer,
               target::integer,
               to_cost::double precision AS cost,
               reverse_cost::double precision
          FROM lux_2po.ways',
          10000, 20000, true, true) a
LEFT JOIN lux_2po.ways b ON (a.id2 = b.gid)

I had to feed it to valgrind from a file (without line breaks):

$ cat query.sql | valgrind --leak-check=yes --log-file=valgrindlog \
    /usr/lib/postgresql/9.1/bin/postgres --single \
    -D /etc/postgresql/9.1/main lamilo_routing

If I am reading it correctly, this query left behind some 3 KB of
memory. Not as much as my back-of-the-envelope calculation suggested,
but in the same order of magnitude. The leak may well be larger
between other pairs of nodes.

Regards,

Luís

valgrindlog (3.81 KB)

Stephen_Woodbridge1 · February 24, 2015, 4:52pm

So basically this report is very clean. I do not see any leaks from pgrouting. All the leaks listed are one-time leaks from the postgresql server, so they do not accumulate on a per-query basis.

One thing you might try is to run your command 2-3 times from the input file. If you can only run one command, then try "select ... union all select ... union all select ..." or "select ...; select ...; select ...;"

When you read the log file, all the leaks that come directly from main are just things that postgresql does when it starts up; they all go away when the server shuts down, so they are not a problem. pgrouting leaks will have a much longer call stack associated with them.

You could be running out of memory if lux_2po.ways is a huge table, because you are loading the whole thing. We typically load ways based on a bounding box around the start and end points, expanded a little. You have to be careful when the bbox is a horizontal or vertical line: you need to expand that more than you would for a diagonal line between the points.
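
The bounding-box idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names, the margin and minimum-size defaults, and SRID 2169 (guessed from the schema name lux_2po_2169 mentioned in this thread) are all hypothetical; the table and column names are the ones from the query above. ST_MakeEnvelope and the && overlap operator are standard PostGIS.

```python
def expanded_bbox(x1, y1, x2, y2, margin=0.1, min_size=1000.0):
    """Bounding box around two points, grown on every side.

    margin is a fraction of the longer side; min_size is the smallest
    allowed width and height, in map units (assumed values).
    """
    xmin, xmax = min(x1, x2), max(x1, x2)
    ymin, ymax = min(y1, y2), max(y1, y2)
    # Pad both axes by a fraction of the LONGER side, so a horizontal or
    # vertical start-end line still gains room in the other direction.
    pad = margin * max(xmax - xmin, ymax - ymin)
    xmin, ymin, xmax, ymax = xmin - pad, ymin - pad, xmax + pad, ymax + pad
    # Enforce a minimum width and height for very short or degenerate routes.
    if xmax - xmin < min_size:
        grow = (min_size - (xmax - xmin)) / 2.0
        xmin, xmax = xmin - grow, xmax + grow
    if ymax - ymin < min_size:
        grow = (min_size - (ymax - ymin)) / 2.0
        ymin, ymax = ymin - grow, ymax + grow
    return (xmin, ymin, xmax, ymax)


def edges_sql(bbox, srid=2169):
    """Edge query for the routing function, restricted to the bounding box."""
    # Coerce to float/int so only numbers are ever interpolated into the SQL.
    values = tuple(float(v) for v in bbox) + (int(srid),)
    return (
        "SELECT gid AS id, source::integer, target::integer, "
        "to_cost::double precision AS cost, "
        "reverse_cost::double precision "
        "FROM lux_2po.ways "
        "WHERE the_geom && ST_MakeEnvelope(%r, %r, %r, %r, %d)" % values
    )


# A purely horizontal start-end line still yields a box with real height.
print(expanded_bbox(0, 0, 1000, 0, margin=0.5, min_size=100.0))
```

The string returned by edges_sql would be passed as the first argument to pgr_dijkstra in place of the unrestricted SELECT, so only the edges near the route are loaded.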

-Steve

Luis_de_Sousa · February 25, 2015, 9:06am

Hi again Steve,

I have run valgrind with 4 and 10 queries, and as you expected, the
report is essentially the same (logs attached).

I have postgres crashing twice a day. The way psycopg2 opens and
maintains the connection is the only alternative suspect, but the
memory swell only takes place with the pgr_dijkstra function. Would
you have other suggestions? Or would you advise a different forum to
pose this issue?

Thank you,

Luís

valgrindlog.4 (3.67 KB)

valgrindlog.10 (3.67 KB)

Luis_de_Sousa · February 25, 2015, 9:28am

Just as an addendum, the network tables in this schema sum up to less
than 34 MB:

# SELECT sum(c.relpages) * 8
  FROM pg_class c,
       pg_namespace n
 WHERE c.relnamespace = n.oid
   AND n.nspname LIKE 'lux_2po_2169'
   AND c.relname LIKE 'ways%';
 ?column?
----------
    33952
(1 row)

Thank you,

Luís

Stephen_Woodbridge1 · February 25, 2015, 2:56pm

Luis,

Have you tried running this query using pgr_trsp() instead of pgr_dijkstra()?

SELECT seq, id1 AS node, id2 AS edge, cost, ST_AsGeoJSON(b.the_geom)
FROM pgr_trsp('
            SELECT gid AS id,
                 source::integer,
                 target::integer,
                 to_cost::double precision AS cost,
                 reverse_cost::double precision
            FROM lux_2po.ways',
            10000, 20000, true, true) a
LEFT JOIN lux_2po.ways b ON (a.id2 = b.gid)

If this leaks the same way, then I'm guessing the issue is with psycopg2 or how you are calling it. I have not used the Python interface to pg, but in other languages the pattern looks like:

connect to database
prepare a query
execute the query
loop through the results
free the results <======= **** this is a leak if not done ****
loop back to prepare or execute
disconnect from database
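
In Python the same pattern might look like the sketch below. It is written against the generic DB-API 2.0 interface, which psycopg2 implements; the function name is hypothetical, and sqlite3 from the standard library stands in for the server so that the example runs without a database. With psycopg2 the connection would come from psycopg2.connect(...) and the placeholders would be %s rather than ?.

```python
import sqlite3
from contextlib import closing


def run_queries(conn, queries):
    """Execute (sql, params) pairs one at a time, closing each cursor.

    conn is any DB-API 2.0 connection. Only the row count of each query is
    kept, so result sets do not pile up on the client.
    """
    counts = []
    for sql, params in queries:
        # closing() calls cursor.close() even if execute() raises.
        with closing(conn.cursor()) as cur:
            cur.execute(sql, params)          # execute the query
            rows = cur.fetchall()             # loop through the results
            counts.append(len(rows))
        conn.commit()                         # end the implicit transaction
    return counts


# Stand-in connection; disconnects when the with-block ends.
with closing(sqlite3.connect(":memory:")) as demo_conn:
    demo_counts = run_queries(demo_conn, [("SELECT ? + ?", (1, 2))] * 3)
    print(demo_counts)
```

Closing the cursor is the Python counterpart of the "free the results" step; whether skipping it matters for server memory is exactly what is in question here.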

Q: Have you watched the processes in top? Is the postgres process growing in size, or is it the python, apache or some other process that is consuming all the memory?

If some other process consumes all the memory then postgres will fail when it needs memory because there is none available.

-Steve

Luis_de_Sousa · February 25, 2015, 3:18pm

Hi Steve,

I am sorry if I did not make this clear before: PyWPS is running on a
different server. Postgres and pgRouting run on a dedicated server; it
is always a postgres process taking up the RAM.

Thank you,

Luís

Stephen_Woodbridge1 · February 25, 2015, 4:01pm

On 2/25/2015 10:18 AM, Luís de Sousa wrote:

Hi Steve,

I am sorry if I did not make this clear before: PyWPS is running on a
different server. Postgres and pgRouting run on a dedicated server; it
is always a postgres process taking up the RAM.

You may have, and I may have forgotten.

Anyway, try using pgr_trsp() and see if it happens with that.

Tell me about the postgresql server:

What OS/Distribution?
Are you using pgrouting installed from a package or compiled from source?

select * from pgr_version();
"2.0.0";"pgrouting-2.0.0";"78";"abde224";"develop";"1.46.1"

My version is built from source on the "develop" branch, 78 commits ahead of master, using Boost v1.46.1.

Since pgr_trsp uses different code from pgr_dijkstra I would not expect to see the same behavior.

-Steve

Luis_de_Sousa · February 26, 2015, 9:14am

Hi again Steve, a few more points:

1. I am using pgRouting 2.0.0 on Ubuntu 12.04 server, with all the
software installed from the repositories. I am attaching a log file
with details.

2. Valgrind produces exactly the same output for pgr_trsp(). I am
attaching a log from five successive calls to this function.

3. Also attached is a small python script that replicates the issue.
In rough numbers, RAM usage in the Postgres server swells by about 160 MB
per 10 000 queries. Just give it a go and let it run for a while.
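
One way such a figure might be measured on a Linux server is sketched below. It assumes the backend PID is known (for instance from SELECT pg_backend_pid()) and that the text of /proc/<pid>/status is read before and after a batch of queries. The helper names and the sample values are hypothetical, chosen to match the 160 MB per 10 000 queries quoted above. Note that the resident size also counts shared buffers the backend has touched, so part of any growth may not be a leak.

```python
import re


def parse_vm_rss_kb(status_text):
    """Extract VmRSS (resident set size, in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def growth_per_query_kb(rss_before_kb, rss_after_kb, n_queries):
    """Average resident-memory growth per query, in kB."""
    return (rss_after_kb - rss_before_kb) / float(n_queries)


# Hypothetical samples: 160 MB of growth across 10 000 queries.
sample_before = "Name:\tpostgres\nVmRSS:\t   20480 kB\n"
sample_after = "Name:\tpostgres\nVmRSS:\t  184320 kB\n"

per_query = growth_per_query_kb(
    parse_vm_rss_kb(sample_before), parse_vm_rss_kb(sample_after), 10000
)
print(per_query)
```

With these sample values the growth works out to roughly 16 kB per query, the same order of magnitude as the earlier 10 KB estimate.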

Thank you and regards,

Luís

memoryLeak.py (1.24 KB)

valgrindlog.trsp.5 (6.07 KB)

versions.log (733 Bytes)

Stephen_Woodbridge1 · February 26, 2015, 3:41pm

Hi Luis,

I looked at the valgrind log and it is basically the same as a single query and I do not see any leaks coming from pgrouting.

I think you should start looking at how you are using python.
https://www.google.com/?gws_rd=ssl#newwindow=1&q=psycopg2+memory+leak

Also, looking at your code, you are computing random start and end node_ids for the query:

1) Are you sure you have all nodes from 1 to max+1?
2) It is possible that we leak memory when a query errors out. If this is the case, you should attach a simple test case and valgrind log to a bug report.

Also, in your python you are using fetchall() but you are never freeing the results.

Anyway, at this point I do not see a problem with pgrouting and I'm not seeing this issue on any of my 12.04 systems.

Thanks,
-Steve

Luis_de_Sousa · February 27, 2015, 8:23am

Hi Steve, my comments go below.

On 26 February 2015 at 16:41, Stephen Woodbridge
<woodbri@swoodbridge.com> wrote:

Hi Luis,

I looked at the valgrind log and it is basically the same as a single query
and I do not see any leaks coming from pgrouting.

I think you should start looking at how you are using python.
https://www.google.com/?gws_rd=ssl#newwindow=1&q=psycopg2+memory+leak

I have no memory issues client side, even if the python code is minimal.

Also, looking at your code, you are computing random start and end node_ids
for the query:

1) are you sure you have all nodes from 1 to max+1?

Yes.

2) it is possible that if a query errors out that we might leak memory and
if this is the case you should attach a simple test case and valgrind log to
a bug report.

This is a possibility (e.g. network islands). I will follow this lead.

Also, in your python you are using fetchall() but you are never freeing the
results.

Again, this might not be conventional code, but the memory issue is on
the server.

Anyway, at this point I do not see a problem with pgrouting and I'm not
seeing this issue on any of my 12.04 systems.

At this stage I am also inclined to consider this an issue external to
pgRouting. However, the fact that you are able to run the script
without problems clearly points to the server side.

I will keep investigating this problem. If something further comes up
relevant to pgRouting I will report back.

Thank you and regards,

Luís

Stephen_Woodbridge1 · February 27, 2015, 3:01pm

Luis,

All these are good points.

It might also be possible for the client to cause things to leak on the server. For example, some things are created on the server for the client and they persist for the duration of the connection.

Another possibility is that python might be running all the queries in the connection within a transaction, and this allocates memory on the server to handle a potential rollback.

You could try closing and opening the connection between requests to see if it is related to these.
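
A minimal sketch of that experiment, assuming a DB-API 2.0 driver: the function name and the reconnect interval are hypothetical, and sqlite3 again stands in for the server so that the example runs as written. With psycopg2 the factory would be something like lambda: psycopg2.connect(dsn), where dsn is a placeholder for the real connection string.

```python
import sqlite3
from contextlib import closing


def run_with_reconnect(connect, queries, reconnect_every=1000):
    """Run (sql, params) pairs, opening a fresh connection every N queries.

    connect is a zero-argument callable returning a DB-API 2.0 connection.
    Returns how many connections were opened.
    """
    opened = 0
    conn = None
    try:
        for i, (sql, params) in enumerate(queries):
            if i % reconnect_every == 0:
                if conn is not None:
                    conn.close()              # drop the old session
                conn = connect()
                opened += 1
            with closing(conn.cursor()) as cur:
                cur.execute(sql, params)
                cur.fetchall()
            conn.commit()                     # do not leave a transaction open
    finally:
        if conn is not None:
            conn.close()
    return opened


demo_opened = run_with_reconnect(
    lambda: sqlite3.connect(":memory:"), [("SELECT 1", ())] * 5, reconnect_every=2
)
print(demo_opened)
```

Since PostgreSQL serves each connection from its own backend process, memory held by a session is returned to the operating system when that connection closes. If the swelling stops with periodic reconnects, that would point at per-session state rather than at a specific query.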

If you know php, you could try rewriting your script in php (which I'm more familiar with) and see if you get the same behavior.

Or you could write all your queries to a text file, then run the text file as input to psql and see if the memory grows. If it does not, then this indicates something weird is happening with the python.

./mypython > myqueries.sql
psql mydatabase -f myqueries.sql &> myqueries.log
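
The script that writes the queries could be as simple as the sketch below. The query text follows the one used earlier in this thread, without the join to the geometry; the function name, the seed, the number of queries and the upper node id of 50 000 (taken from the network size mentioned at the start) are assumptions.

```python
import random
import sys

# Routing query from this thread, with placeholders for the two node ids.
QUERY = (
    "SELECT seq, id1 AS node, id2 AS edge, cost "
    "FROM pgr_dijkstra('"
    "SELECT gid AS id, source::integer, target::integer, "
    "to_cost::double precision AS cost, reverse_cost::double precision "
    "FROM lux_2po.ways', %d, %d, true, true);"
)


def make_queries(n, max_node, seed=0):
    """Return n routing queries between random node ids in 1..max_node."""
    rng = random.Random(seed)  # seeded, so a failing run can be repeated
    return [
        QUERY % (rng.randint(1, max_node), rng.randint(1, max_node))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # One statement per line on stdout, ready to redirect into a .sql file.
    sys.stdout.write("\n".join(make_queries(1000, 50000)) + "\n")
```

Seeding the generator means the same sequence of node pairs can be replayed through both psql and the python client, which makes the two runs directly comparable.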

I'm not saying that the problem is not server side, only that the valgrind logs do not indicate a problem specifically with pgrouting.

At this point we need to divide the problem into smaller pieces, like the above suggestion. If we can reproduce it without python, then we can eliminate that as the issue. If we can reproduce it with psql and a text file of sql commands, then it indicates something is likely leaking in pgrouting and we did not hit the right query in valgrind to reproduce it.

pgRouting does little, if any, cleanup on error conditions, so this is possibly something to look at.

-Steve
