[GRASS-user] Import large table from PostGIS - Connection lost

Hello there,

this is my first post to this mailing list, so first I’d like to thank the community for supporting this marvelous piece of software. I do most of my research and data processing in PostGIS, since most of my projects involve vector data for web visualization, but I rely on GRASS GIS for powerful operations such as topology cleaning and building.

In that context, I’m trying to import a large table from PostGIS. It is 2 GB in size. I use this command:

grass $GRASS_DB/PERMANENT --exec \
  v.in.ogr --verbose \
  input="PG:host=$HOST dbname=cell_raw_data user=$USER port=$PORT password=$PASS" \
  layer=cat2020.buildingpart output=buildingpart

Everything goes well until it reaches the “Finding centroids for OGR layer” step:

Finding centroids for OGR layer <cat2020.buildingpart>…
ERROR 1: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

ERROR 1: no connection to the server

ERROR 1: no connection to the server

ERROR 1: no connection to the server

ERROR 1: no connection to the server

100%

The connection to PostGIS seems to be closed. I’ve gone through the PostgreSQL
configuration and tried setting generous values for the tcp_keepalives_*
parameters, but to no avail. It seems to me that the PostgreSQL server drops
the connection because it has been idle for too long. Note also that both
PostgreSQL and GRASS run in Docker containers.

GRASS version is 7.8.5.

Has anybody had issues importing large PostGIS tables? Any advice or ideas I could
look into? As said, I’ve already tried to solve it by reconfiguring PostgreSQL,
and I have also researched how Docker handles TCP, but to no avail. Any
idea or hint will be most welcome.

Best regards,

Juan Pedro Pérez Alcántara

jp.perez.alcantara@gmail.com

Hi again, I found the solution myself; I hope it can be of use to others.

It turns out there was a PostgreSQL configuration parameter I was missing. With these entries in postgresql.conf, the connection lasts long enough to handle all the data imports, without compromising the long-term integrity of the server’s connections through a connection leak:

tcp_keepalives_idle=60
tcp_keepalives_count=20000
tcp_keepalives_interval=30
idle_in_transaction_session_timeout=86400000

The TCP entries try to keep the connection alive, while the “idle_in…” parameter, which is the one I was missing and whose value is given in milliseconds, waits for 24 h before dropping a session that sits idle inside an open transaction. However, I think the TCP ones can keep the connection alive even longer.
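
For anyone who wants to check the numbers, here is a small shell sketch of the arithmetic behind those values. It assumes the PostgreSQL default units when none is given (seconds for tcp_keepalives_idle and tcp_keepalives_interval, milliseconds for idle_in_transaction_session_timeout), and that an unresponsive peer is only declared dead after the idle period plus all unanswered probes. The variable names are purely illustrative.

```shell
# Values copied from the postgresql.conf entries above.
idle=60              # tcp_keepalives_idle, assumed seconds
count=20000          # tcp_keepalives_count, number of probes
interval=30          # tcp_keepalives_interval, assumed seconds
idle_tx_ms=86400000  # idle_in_transaction_session_timeout, assumed milliseconds

# Time until an unresponsive peer is given up on:
# the idle period, then `count` probes sent `interval` seconds apart.
detect_s=$(( idle + count * interval ))
detect_h=$(( detect_s / 3600 ))

# Transaction idle timeout converted from milliseconds to hours.
idle_tx_h=$(( idle_tx_ms / 1000 / 3600 ))

echo "keepalive dead-peer detection: ${detect_s} s (~${detect_h} h)"
echo "idle-in-transaction timeout: ${idle_tx_h} h"
```

Under those assumptions the keepalive settings alone would tolerate an unresponsive peer for roughly a week, so the 24 h transaction timeout is the tighter of the two limits.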

Hope this helps if anyone comes across such a case. I’m glad it’s not a Docker problem.

Best regards,


Juan Pedro Pérez Alcántara

jp.perez.alcantara@gmail.com

On Tue, 19 Oct 2021 at 09:55, Juan Pedro Pérez Alcántara <jp.perez.alcantara@gmail.com> wrote:
