[GRASS-user] v.select disjoint operation. Grass 7.8.2

Hi, I am running GRASS 7.8.2 on an HPC (Linux) to utilise memory not available on my local machine.

It is running with PROJ 6.3, GDAL 3.0.4, GEOS 3.8 and Python 3.

I am trying to perform the v.select disjoint operation, which I have used successfully before on a similar setup and similar data using the same version of GRASS (I suspect only the supporting modules and input data have changed). My code is:

g.region n=5.4715700149536133 e=31.4501590728759766 s=-13.5662746429443359 w=12.1540479660034180 -p

## Input and clean topology
v.in.ogr -e -o input=file.json output=filev2 snap=-1 --verbose --overwrite
v.in.ogr -e -o input=file2.gpkg output=file2 snap=-1 --verbose --overwrite encoding=UTF-8

## Extract all categories for each dataset so as to maintain polygon (building) contiguity where polygons overlap other polygons within each dataset
v.extract -d input=filev2 layer=1 type=centroid,area output=buildingsdiss --verbose --overwrite
v.extract -d input=file2 layer=1 type=centroid,area output=file2diss1 where="cat < 6000001" --verbose --overwrite

## Perform disjoint operation to find buildings that do not intersect buildings in primary dataset
v.select ainput=file2diss1 alayer=file2diss1 binput=buildingsdiss blayer=buildingsdiss output=DISJOINT1 operator=disjoint --verbose --overwrite

## Output disjoint building layer and cleaned primary dataset
v.out.ogr -m input=DISJOINT1 output=DISJOINT1.gpkg --verbose --overwrite
v.out.ogr -m input=filev2 output=filev2.gpkg --verbose --overwrite

Having input the topologies and extracted the categories I persistently get the following error using various data when trying to run the disjoint operation:

projection: 3 (Latitude-Longitude)
zone: 0
datum: wgs84
ellipsoid: wgs84
north: 5:28:17.649311N
south: 13:33:58.588333S
west: 8:09:27.521013W
east: 51:45:42.670764E
nsres: 1:00:07.170402
ewres: 1:00:56.104945
rows: 19
cols: 59
cells: 1121
NS and EW resolutions are different
Processing features...
   0%..........100%
Processing areas...
   0% ERROR: Unable to seek: Invalid argument
NS and EW resolutions are different

Any advice appreciated on getting v.select to run. Thanks

Best wishes, Chris

Hi,

Is there a chance to receive the dataset for testing (or, ideally, a
reproducible example with the North Carolina sample dataset from
https://grass.osgeo.org/download/data/)?

thanks,
Markus

On Wed, May 12, 2021 at 1:14 PM Christopher Lloyd via grass-user
<grass-user@lists.osgeo.org> wrote:

_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

--
Markus Neteler, PhD
https://www.mundialis.de - free data with free software
https://grass.osgeo.org
https://courses.neteler.org/blog

Hi Markus, Thanks for your response. I note that another user recently had a similar error with a similar (but not the same) module.

I am currently splitting my input data into smaller sized files, each for input separately to the v.select module, to see if this resolves the issue. I will let you know on progress. If the failure continues then I will supply you with a reproducible example using the NC data as you indicate. Many thanks.

Best wishes, Chris
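
The splitting described above was done on the input files. As a rough sketch of an alternative, the same idea could be expressed inside GRASS with category ranges, in the spirit of the where="cat < 6000001" clause from the original post. The helper name cat_chunks, the upper bound of 12000000, the chunk size and the file2diss_part output names are all assumptions made for illustration. The block only prints commands and does not call GRASS.

```shell
# cat_chunks FIRST LAST SIZE
# Print one SQL 'where' clause per category chunk (hypothetical helper).
cat_chunks() (
    first=$1 last=$2 size=$3
    start=$first
    while [ "$start" -le "$last" ]; do
        end=$((start + size - 1))
        # Clamp the final chunk to the last category
        [ "$end" -gt "$last" ] && end=$last
        echo "cat >= $start AND cat <= $end"
        start=$((end + 1))
    done
)

# Dry run: print (not execute) one v.extract call per chunk of file2.
# The category range and the part<i> output names are made up.
i=1
cat_chunks 1 12000000 6000000 | while IFS= read -r clause; do
    echo "v.extract -d input=file2 layer=1 type=centroid,area output=file2diss_part$i where=\"$clause\""
    i=$((i + 1))
done
```

In principle, chunking only the ainput map should leave the disjoint result unchanged, because each ainput feature is tested against the whole binput map, and the per-chunk outputs could then be merged, for example with v.patch. Chunking binput instead would need a different combination step, since a feature would have to be disjoint from every binput chunk.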

On Tuesday, 18 May 2021, 22:01:58 BST, Markus Neteler neteler@osgeo.org wrote:


Hi Markus, Having split my input data into smaller shp files, these now process fine using v.select. So clearly the problem that I had lies with some file size limitation in the v.select 'disjoint' algorithm; one input file had also failed earlier, at the v.extract step. The original (large) input files were imported into GRASS without problems using v.in.ogr, but then failed in v.select. Might you be able to indicate the file size limitation for the v.select algorithm and the reasons for it? Thanks.

Best wishes, Chris

On Wednesday, 19 May 2021, 14:29:56 BST, Christopher Lloyd via grass-user grass-user@lists.osgeo.org wrote:


Hi Chris,

On Thu, May 20, 2021 at 2:00 PM Christopher Lloyd
<chrislloyd2@yahoo.co.uk> wrote:

Hi Markus, Having split my input data into smaller sized shp files, these now process fine using v.select. So clearly the problem that I had lies with some file size limitation with the v.select 'disjoint' algorithm - also failing with one input file prior using the v.extract algorithm. The original (large) input files input to grass ok using v.in.ogr,

The vector limit is as follows:

https://grasswiki.osgeo.org/wiki/GRASS_GIS_Performance#Large_vector_data_processing
--> In all GRASS versions, the limit with topology is currently 2^31 - 1
(about 2 billion) features per vector map.

but then failed when using v.select. Might you be able to indicate the file size limitation for the v.select algorithm and the reasons for this? Thanks.

This said, perhaps v.select lacks a proper declaration?

Could you tell us how many vector features per map you are dealing
with (say: those which work and those which fail)?

Best wishes, Markus
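
One way to collect the per-map feature counts asked for above is the shell-style topology output of v.info -t, which prints key=value lines. The sketch below only shows how a single count could be pulled out of such text. The helper name topo_value and the sample numbers are made up for illustration, and the v.info call itself is left as a comment because it needs a running GRASS session.

```shell
# topo_value KEY
# Read key=value lines on stdin and print the value stored under KEY.
topo_value() {
    awk -F= -v key="$1" '$1 == key { print $2 }'
}

# In a GRASS session (not executed here) this might be used as:
#   v.info -t map=file2diss1 | topo_value areas

# Made-up sample text standing in for that output:
sample='nodes=30
boundaries=25
centroids=10
areas=10
primitives=35'

printf '%s\n' "$sample" | topo_value areas
```

Running the same call for a map that works and a map that fails would give directly comparable numbers.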

Hi Markus, Thanks for your reply - sure. In current and previous successful runs of the disjoint operation in v.select I have input shapefile subsets of up to 2 GB in file size (the size of the .shp file alone, not the .dbf etc.). Shapefiles of 2.5 GB failed and had to be split. These correspond to 10,659,000 features and 12,159,000 features respectively. I don't believe that the error is linked to any intrinsic limitation of the shapefile format, as I receive the same error when I input the (unsplit) polygon datasets to GRASS in GeoJSON or GeoPackage format.

Best wishes, Chris
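
A quick arithmetic check of the numbers quoted in this thread against the 2^31 - 1 limit, offered only as a sketch. Both feature counts are far below the limit, so the documented per-map feature limit does not seem to be what is hit. The byte values assume decimal gigabytes (1 GB = 10^9 bytes) and describe the .shp files, not GRASS's internal files. That the working and failing sizes fall on either side of 2^31 is an observation only; nobody in the thread confirms that a 32-bit value is involved. The helper names fits_int32 and report are made up.

```shell
# Largest value of a signed 32-bit integer: 2^31 - 1
max_int32=$(( (1 << 31) - 1 ))

# fits_int32 N: succeed if N <= 2^31 - 1
fits_int32() {
    [ "$1" -le "$max_int32" ]
}

# report LABEL VALUE: print whether VALUE is within or above the limit
report() {
    if fits_int32 "$2"; then
        echo "$1: $2 <= $max_int32"
    else
        echo "$1: $2 > $max_int32"
    fi
}

# Feature counts as given in the thread
report "features, working subset" 10659000
report "features, failing file" 12159000

# Approximate byte counts, assuming 1 GB = 10^9 bytes
report "bytes, 2 GB shapefile" 2000000000
report "bytes, 2.5 GB shapefile" 2500000000
```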

On Thursday, 20 May 2021, 16:48:14 BST, Markus Neteler neteler@osgeo.org wrote:


Hi Chris,

This "ERROR: Unable to seek: Invalid argument" is a strange error, because v.select uses exactly the same functions to seek and read data as the modules you used to produce the input vectors for v.select.

You have already listed the commands leading to the error; could you also provide the input files file.json and file2.gpkg for debugging?

Best,

Markus M

On Thu, May 20, 2021 at 6:39 PM Christopher Lloyd via grass-user <grass-user@lists.osgeo.org> wrote:


Hi Markus, Thanks for your response. The error does seem strange given the facts you outline. I can provide you with one of the input files (OSM building polygons), as this is openly licensed. Unfortunately, the other file contains Maxar building polygons that are strictly licensed to my organisation, so I cannot share it.

However, given my past experience with GRASS and file size limitations when using the v.select 'disjoint' operation, I do not anticipate you having trouble reproducing the problem. I suspect that if I share the OSM building polygon file with you, and you add a few polygons to a copy of the file and then perform the disjoint operation on the original OSM file versus the copy, you will most likely be able to reproduce the error. Is this sufficient? If so, how and where do I upload the file? The OSM GeoJSON file is just over 4 GB in size.

Best wishes, Chris

On Thursday, 20 May 2021, 21:20:11 BST, Markus Metz markus.metz.giswork@gmail.com wrote:
