[GRASS-dev] [GRASS GIS] #3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
-------------------------+-------------------------
Reporter: rsbivand | Owner: grass-dev@…
     Type: defect | Status: new
Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Keywords: | CPU: Unspecified
Platform: Unspecified |
-------------------------+-------------------------
In 7.2.2, there was no check for GDAL version in v.in.ogr -f. Now for
v.in.ogr -f in 7.3 and GDAL >= 2.0, long format names are returned.

Choice of best drivers in R rgrass7::readVECT() depends on predictable
text output from v.in.ogr -f.

Please add a new flag to preserve backward compatibility, so that the test
at vector/v.in.ogr/main.c line 387 can use the older branch even if GDAL
>= 2.0. This is a serious regression for users of rgrass7. If the flag is
added, readVECT will again use SQLite, not ESRI_Shapefile, to which it has
fallen back.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mmetz):

Replying to [ticket:3425 rsbivand]:
> In 7.2.2, there was no check for GDAL version in v.in.ogr -f. Now for
v.in.ogr -f in 7.3 and GDAL >= 2.0, long format names are returned.

With `v.in.ogr -f`, format names are returned in the form `" %s (%s):
%s\n"`. The short name is returned first.

>
> Choice of best drivers in R rgrass7::readVECT() depends on predictable
text output from v.in.ogr -f.

As before, the driver's short name is printed first in each line.
`v.in.ogr -f` follows `ogrinfo --formats`.
>
> Please add a new flag to preserve backward compatibility, so that the
test at vector/v.in.ogr/main.c line 387 can use the older branch even if
GDAL >= 2.0. This is a serious regression for users of rgrass7. If the
flag is added, readVECT will again use SQLite, not ESRI_Shapefile, to
which it has fallen back.

Is rgrass7 not using the first item printed by `v.in.ogr -f`?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:1&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by rsbivand):

{{{rgdal::ogrDrivers[,1]}}} is:

{{{
> ogrDrivers()[,1]
  [1] "AeronavFAA" "AmigoCloud" "ARCGEN" "AVCBin"
  [5] "AVCE00" "BNA" "CAD" "Carto"
  [9] "Cloudant" "CouchDB" "CSV" "CSW"
[13] "DGN" "DXF" "EDIGEO" "ElasticSearch"
[17] "ESRI Shapefile" "Geoconcept" "GeoJSON" "Geomedia"
[21] "GeoRSS" "GFT" "GML" "GMLAS"
[25] "GPKG" "GPSBabel" "GPSTrackMaker" "GPX"
[29] "HTF" "HTTP" "Idrisi" "Interlis 1"
[33] "Interlis 2" "JML" "KML" "MapInfo File"
[37] "Memory" "MSSQLSpatial" "NAS" "netCDF"
[41] "ODBC" "ODS" "OGR_GMT" "OGR_PDS"
[45] "OGR_SDTS" "OGR_VRT" "OpenAir" "OpenFileGDB"
[49] "OSM" "PCIDSK" "PDF" "PGDUMP"
[53] "PGeo" "PLSCENES" "PostgreSQL" "REC"
[57] "S57" "SEGUKOOA" "SEGY" "Selafin"
[61] "SQLite" "SUA" "SVG" "SXF"
[65] "TIGER" "UK .NTF" "VDV" "VFK"
[69] "Walk" "WAsP" "WFS" "XLS"
[73] "XLSX" "XPlane"
}}}

from (rgdal/src/ogrdrivers.cpp) line 71:

{{{
       SET_STRING_ELT(VECTOR_ELT(ans, 0), i,
         COPY_TO_USER_STRING(poDriver->GetName()));
}}}

In 7.2.2, the matches work in rgrass7:::.read_vect_non_plugin, line 80 in
rgrass7/R/vect_link.R:

{{{
     ogrD <- rgdal::ogrDrivers()
     ogrDw <- gsub(" ", "_", ogrD$name[ogrD$write])
# guess GRASS v.out.ogr capability from rgdal
     ogrDGRASS <- execGRASS("v.in.ogr", flags=ifelse(ignore.stderr, c("f",
                            "quiet"), "f"), intern=TRUE,
                            ignore.stderr=ignore.stderr)
     ogrDGRASSs <- gsub(" ", "_", sapply(strsplit(ogrDGRASS, ": "), "[",
2))
     candDrivers <- gsub(" ", "_", sort(intersect(ogrDGRASSs, ogrDw)))
}}}

In 7.3, ogrDGRASSs is:

{{{
Browse[2]> sort(ogrDGRASSs)
  [1] "Aeronav_FAA"
  [2] "AmigoCloud"
  [3] "Arc/Info_Binary_Coverage"
  [4] "Arc/Info_E00_(ASCII)_Coverage"
  [5] "Arc/Info_Generate"
  [6] "Atlas_BNA"
  [7] "AutoCAD_Driver"
  [8] "AutoCAD_DXF"
  [9] "Carto"
[10] "Cloudant_/_CouchDB"
[11] "Comma_Separated_Value_(.csv)"
[12] "CouchDB_/_GeoCouch"
[13] "Czech_Cadastral_Exchange_Data_Format"
[14] "Elastic_Search"
[15] "EPIInfo_.REC_"
[16] "ESRI_FileGDB"
[17] "ESRI_Personal_GeoDatabase"
[18] "ESRI_Shapefile"
[19] "French_EDIGEO_exchange_format"
[20] "Geoconcept"
[21] "Geography_Markup_Language_(GML)"
[22] "Geography_Markup_Language_(GML)_driven_by_application_schemas"
...
[63] "SEG-Y"
[64] "Selafin"
[65] "SQLite_/_Spatialite"
[66] "Storage_and_eXchange_Format"
[67] "Tim_Newport-Peace's_Special_Use_Airspace_Format"
[68] "U.S._Census_TIGER/Line"
}}}

from vector/v.in.ogr/main.c line 404

{{{
             fprintf(stdout, " %s (%s): %s\n",
                     GDALGetDriverShortName(hDriver),
                     pszRWFlag, GDALGetDriverLongName(hDriver));
}}}

where {{{GDALGetDriverShortName(hDriver)}}} calls {{{GDALDriver
*>(hDriver)->GetDescription()}}}, which looks like my
{{{poDriver->GetDescription()}}}. I can't see why the string values differ
for the same GDAL.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:2&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mmetz):

Replying to [comment:2 rsbivand]:
> {{{rgdal::ogrDrivers[,1]}}} is:
>
> {{{
> > ogrDrivers()[,1]
> [1] "AeronavFAA" "AmigoCloud" "ARCGEN" "AVCBin"
> [...]
> }}}
>
[...]
>
> In 7.3, ogrDGRASSs is:
>
> {{{
> Browse[2]> sort(ogrDGRASSs)
> [1] "Aeronav_FAA"
> [2] "AmigoCloud"
> [3] "Arc/Info_Binary_Coverage"
> [4] "Arc/Info_E00_(ASCII)_Coverage"
> [...]
> }}}

in 7.3, v.in.ogr -f is

{{{
  AeronavFAA (ro): Aeronav FAA
  AmigoCloud (rw+): AmigoCloud
  ARCGEN (ro): Arc/Info Generate
  AVCBin (ro): Arc/Info Binary Coverage
  [...]
}}}

>
> from vector/v.in.ogr/main.c line 404
>
> {{{
> fprintf(stdout, " %s (%s): %s\n",
> GDALGetDriverShortName(hDriver),
> pszRWFlag, GDALGetDriverLongName(hDriver));
> }}}
>
>
> where {{{GDALGetDriverShortName(hDriver)}}} calls {{{GDALDriver
*>(hDriver)->GetDescription()}}}, which looks like my
{{{poDriver->GetDescription()}}}. I can't see why the string values differ
for the same GDAL.

The string values do not differ for the same GDAL. You need to use the
short name. With `gdalinfo --formats`, `ogrinfo --formats`, `v.in.ogr -f`,
the first item is (was always) the short name.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:3&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by rsbivand):

The specific problems are "GML" and "SQLite", which are shown as different
above: "GML" and "SQLite" in rgdal, but "Geography_Markup_Language_(GML)"
and "SQLite_/_Spatialite" in GDAL >= 2.0 in v.in.ogr -f. The rgdal
ogrDrivers() output is structured so as not to vary by GDAL version for
the same driver, even though the implementation varies. I'm in the C++
API, you are in the C API, but that shouldn't matter.

This is the root of the problem, see thread starting at
https://lists.osgeo.org/pipermail/grass-stats/2017-October/001729.html

I can try to condition on GRASS version to work around the string
regression, but would prefer not to have to.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:4&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: new
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by rsbivand):

Regression resolved with revision 61:

{{{
     gv <- unname(read.dcf(textConnection(gsub("=", ":",
       execGRASS("g.version", flags="g", intern=TRUE))))[1,1])
     if (gv < "7.3") {
       ogrDGRASSs <- gsub(" ", "_", sapply(strsplit(ogrDGRASS, ": "), "[",
2))
     } else {
       strs <- sapply(strsplit(ogrDGRASS, ": "), "[", 1)
       ogrDGRASSs <- gsub(" ", "_", gsub(" $", "",
         substring(strs, 2, nchar(strs)-5)))
     }
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:5&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: fixed | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------
Changes (by rsbivand):

* status: new => closed
* resolution: => fixed

Comment:

I've made the changes in rgrass7, but would have prefered the format of
v.in.ogr -f (and others taking -f) to be predictable - these changes are
just aethetic, so not needed.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:6&gt;
GRASS GIS <https://grass.osgeo.org>

#3425: regression in v.in.ogr -f 7.3 7 Oct (possibly earlier)
--------------------------+-------------------------
  Reporter: rsbivand | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: major | Milestone:
Component: Vector | Version: svn-trunk
Resolution: fixed | Keywords:
       CPU: Unspecified | Platform: Unspecified
--------------------------+-------------------------

Comment (by mmetz):

Replying to [comment:5 rsbivand]:
> Regression resolved with revision 61:
>
> {{{
> gv <- unname(read.dcf(textConnection(gsub("=", ":",
> execGRASS("g.version", flags="g", intern=TRUE))))[1,1])
> if (gv < "7.3") {
> ogrDGRASSs <- gsub(" ", "_", sapply(strsplit(ogrDGRASS, ": "),
"[", 2))
> } else {
> strs <- sapply(strsplit(ogrDGRASS, ": "), "[", 1)
> ogrDGRASSs <- gsub(" ", "_", gsub(" $", "",
> substring(strs, 2, nchar(strs)-5)))
> }
> }}}

Why not
{{{
       ogrDGRASSs <- gsub(" ", "_", trimws(sapply(strsplit(ogrDGRASS, "
[(]"), "[", 1)))
}}}

which works with all versions?

This would also work with `r.in.gdal -f`. The outputs of `r.in.gdal -f`
and `v.in.ogr -f` already have the same structure (also in GRASS 7.2),
independent of the GDAL version used.

Starting with GDAL 2.0, the outputs of `gdalinfo --formats` and `ogrinfo
--formats` also have the same structure.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3425#comment:7&gt;
GRASS GIS <https://grass.osgeo.org>