[GRASS5] v.in.shape/mif - the sequel

Hi

I've started work on re-arranging the vector import facilities for so-called geometric formats, i.e. those that are based on whole polygon/multiline features and tend to be stored and processed from the top with techniques of computational geometry, like shapefile, MapInfo, DXF (?).

As you may know, the technique used to date to `split' lines at nodes is to identify vertices that coincide by storing them in some kind of spatially keyed database. The database doesn't need to be spatially aware, so the fastest kind to use is a hash table. I want to use a hash table to store the co-ordinates in the new version. Most applications use a dbm type feature that comes with the system for this purpose, and this moreover has a sort of standard. I've done a

grep -rE '[nsog]?dbm\.h' .

on the source tree, but I can't find any reference to common dbm types. Does anyone see any problem with #include'ing dbm.h on any particular platform? Especially useful would be information about non-GNU platforms. GNU systems like Linux and Cygwin should pick up gdbm, which is good.
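The idea of keying a hash table on position to find coincident vertices can be sketched in plain C as follows. This is only an illustration, not the v.in.shape code: it uses a small in-memory chained hash table instead of dbm, and the names `Vertex`, `VertexTable`, `hash_xy`, `vt_add` and `vt_free` are all hypothetical. It also assumes that coincident vertices have bit-for-bit equal co-ordinates (apart from the sign of zero); real data may need snapping to a tolerance first, and NaN values are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024u

/* One stored position; count is how many times it has been seen. */
typedef struct Vertex {
    double x, y;
    int count;
    struct Vertex *next;            /* chain for hash collisions */
} Vertex;

/* Must be zero-initialised before use, e.g. VertexTable t = {{0}}; */
typedef struct {
    Vertex *bucket[NBUCKETS];
} VertexTable;

/* FNV-1a hash over the raw bytes of the co-ordinate pair. */
unsigned hash_xy(double x, double y)
{
    double key[2];
    unsigned char bytes[sizeof key];
    uint32_t h = 2166136261u;
    size_t i;

    /* -0.0 compares equal to 0.0, so give both the same bit pattern. */
    if (x == 0.0) x = 0.0;
    if (y == 0.0) y = 0.0;
    key[0] = x;
    key[1] = y;
    memcpy(bytes, key, sizeof key);
    for (i = 0; i < sizeof bytes; i++) {
        h ^= bytes[i];
        h *= 16777619u;
    }
    return (unsigned)(h % NBUCKETS);
}

/* Add a vertex. Returns how many times this position has now been
   seen (1 means it is new), or 0 if memory could not be allocated. */
int vt_add(VertexTable *t, double x, double y)
{
    unsigned b;
    Vertex *v;

    assert(t != NULL);
    b = hash_xy(x, y);
    for (v = t->bucket[b]; v != NULL; v = v->next)
        if (v->x == x && v->y == y)
            return ++v->count;      /* coincident vertex found */

    v = malloc(sizeof *v);
    if (v == NULL)
        return 0;
    v->x = x;
    v->y = y;
    v->count = 1;
    v->next = t->bucket[b];
    t->bucket[b] = v;
    return 1;
}

/* Release every stored vertex and leave the table empty. */
void vt_free(VertexTable *t)
{
    unsigned b;

    for (b = 0; b < NBUCKETS; b++) {
        Vertex *v = t->bucket[b];
        while (v != NULL) {
            Vertex *next = v->next;
            free(v);
            v = next;
        }
        t->bucket[b] = NULL;
    }
}
```

Calling `vt_add` once per imported vertex and checking for a return value greater than 1 flags positions shared between features. No spatial search is involved; the position is used purely as a lookup key.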

I might also mention here that I intend to continue developing the old (or current) v.in.shape. I had planned to retire this and not maintain it beyond bugfixes, but it's good at what it does for the current state of the vector library, so it seems worth taking the effort to get it working properly. That really requires a bit more work than just bugfixing, as the numerous bug reports show. This would include the long-awaited projection support (optionally the source projection is given, and input is re-projected to the current projection) and a better set of procedures for cleaning linework and detecting and removing artefacts. This is for 5.0 or a maintenance release; the earlier changes mentioned are for 5.1.

David

On Sun, 06 Jan 2002 01:40:39 +0000, David D Gray <ddgray@armadce.demon.co.uk> wrote:

> Hi

Welcome back...

> I've started work on re-arranging the vector import facilities for
> so-called geometric formats, i.e. those that are based on whole
> polygon/multiline features and tend to be stored and processed from the
> top with techniques of computational geometry, like shapefile, MapInfo,
> DXF (?).

I prefer "spaghetti format", although I realize "spaghetti" technically
can't define areal features...

> As you may know, the technique used to date to `split' lines at nodes is
> to identify vertices that coincide by storing them in some kind of
> spatially keyed database. The database doesn't need to be spatially
> aware, so the fastest kind to use is a hash table. I want to use a hash
> table to store the co-ordinates in the new version. Most applications
> use a dbm type feature that comes with the system for this purpose, and
> this moreover has a sort of standard. I've done a
> grep -rE '[nsog]?dbm\.h' .

I haven't looked at the code, but I'm confused by your description of
splitting lines, or rather finding nodes? It doesn't seem sufficient.

> on the source tree, but I can't find any reference to common dbm types.
> Does anyone see any problem with #include'ing dbm.h on any particular
> platform? Especially useful would be information about non-GNU
> platforms. GNU systems like Linux and Cygwin should pick up gdbm, which
> is good.

Anyway, I'm not aware of any addition of dbm since you last worked on
GRASS. I don't see any problem adding it, but maybe help out with
configure rules?

> I might also mention here that I intend to continue developing the old
> (or current) v.in.shape. I had planned to retire this and not maintain
> it beyond bugfixes, but it's good at what it does for the current state
> of the vector library, so it seems worth taking the effort to get it
> working properly. That really requires a bit more work than just
> bugfixing, as the numerous bug reports show. This would include the
> long-awaited projection support (optionally the source projection is
> given, and input is re-projected to the current projection) and a better
> set of procedures for cleaning linework and detecting and removing
> artefacts. This is for 5.0 or a maintenance release; the earlier changes
> mentioned are for 5.1.

I'm sure users will rejoice if v.in.shape can be made more robust. I
think it can never fully work for polygons without some concept of
complex features in GRASS (i.e. one face mapping to one or more areas).

Perhaps, if you get bored, you'll want to look at the d.area polygon
hole problem? I've been working on it in fits and starts, but haven't
managed to figure out how to implement an algorithm that should work
nicely (the algorithm, with modification, could extend to a general
intersector).

--
Eric G. Miller <egm2@jps.net>

Eric G. Miller wrote:

> > on the source tree, but I can't find any reference to common dbm types.
> > Does anyone see any problem with #include'ing dbm.h on any particular
> > platform? Especially useful would be information about non-GNU
> > platforms. GNU systems like Linux and Cygwin should pick up gdbm, which
> > is good.
>
> Anyway, I'm not aware of any addition of dbm since you last worked on
> GRASS. I don't see any problem adding it, but maybe help out with
> configure rules?

It would at least need to check both -ldbm and -lgdbm (I don't know
about ndbm). Alternate library checks are already done for FFTW
(-lfftw, -ldfftw) and Tcl/Tk (-ltcl, -ltcl<version>).

I can add the configure.in rules if desired.

--
Glynn Clements <glynn.clements@virgin.net>

Glynn Clements wrote:

> > on the source tree, but I can't find any reference to common dbm types.
> > Does anyone see any problem with #include'ing dbm.h on any particular
> > platform? Especially useful would be information about non-GNU
> > platforms. GNU systems like Linux and Cygwin should pick up gdbm, which
> > is good.
>
> Anyway, I'm not aware of any addition of dbm since you last worked on
> GRASS. I don't see any problem adding it, but maybe help out with
> configure rules?

> It would at least need to check both -ldbm and -lgdbm (I don't know
> about ndbm). Alternate library checks are already done for FFTW
> (-lfftw, -ldfftw) and Tcl/Tk (-ltcl, -ltcl<version>).
>
> I can add the configure.in rules if desired.

I've added the checks for DBM.

config.h should define HAVE_DBM_H, and the head file will define
DBMINCPATH (any -I switches), DBMLIBPATH (any -L switches) and DBMLIB
(-ldbm, -lgdbm or -lndbm) for use in Gmakefiles. I put libndbm last
as, on GNU systems, this may actually be a link to libdb (which
includes a DBM interface to Berkeley DB ("NEWDB") files).

Note: if dbm.h isn't in one of the standard include directories, you
need to use "--with-dbm-includes=..." (RH6.2 puts it in
/usr/include/gdbm). If you don't have it, you can use --without-dbm to
disable the checks.
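As a rough illustration of how module code might consume the HAVE_DBM_H define, the following C fragment guards the include and reports which storage backend was compiled in. It is a sketch only: `VERTEX_STORE_BACKEND` and `vertex_store_backend` are hypothetical names, the fallback is an assumption rather than anything described in this thread, and a real module would include config.h before the guard so that HAVE_DBM_H is actually visible.

```c
/* Assumption: config.h has been included above this point in a real
   module; without it HAVE_DBM_H is undefined and the fallback is used. */
#ifdef HAVE_DBM_H
#include <dbm.h>
#define VERTEX_STORE_BACKEND "dbm"
#else
/* Built with --without-dbm, or no dbm.h was found by configure. */
#define VERTEX_STORE_BACKEND "in-memory"
#endif

/* Report which vertex store this build would use (hypothetical helper). */
const char *vertex_store_backend(void)
{
    return VERTEX_STORE_BACKEND;
}
```

Keeping the choice behind a single macro means the rest of the import code does not need its own #ifdef blocks.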

--
Glynn Clements <glynn.clements@virgin.net>

Glynn Clements wrote:

[...]

Glynn,

Thanks. So now I can port the dbm usage into the stable v.in.shape as well.

David

Eric G. Miller wrote:

> On Sun, 06 Jan 2002 01:40:39 +0000, David D Gray <ddgray@armadce.demon.co.uk> wrote:
>
> [...]
>
> > As you may know, the technique used to date to `split' lines at nodes is to identify vertices that coincide by storing them in some kind of spatially keyed database. The database doesn't need to be spatially aware, so the fastest kind to use is a hash table. I want to use a hash table to store the co-ordinates in the new version. Most applications use a dbm type feature that comes with the system for this purpose, and this moreover has a sort of standard. I've done a
> > grep -rE '[nsog]?dbm\.h' .
>
> I haven't looked at the code, but I'm confused by your description of
> splitting lines, or rather finding nodes? It doesn't seem sufficient.

Hi Eric

Lines are not really split. What happens is that as lines are extracted, their vertices are added to the database, keyed on position (but this is just an identifier; we don't need storage that can do spatial search). Links are stored as part of the structure that holds a vertex. On re-extraction, nodes are identified (not always correctly) as vertices with 3 or more links, and we track the links from one node to another as a separate `line', whose vertices are marked blank so as not to be traversed again.
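The node test described above (a vertex with 3 or more links) can be sketched in C like this. It is an illustration under stated assumptions, not the actual import code: vertices are assumed to have already been given integer ids by a position lookup, each vertex keeps at most MAX_LINKS distinct neighbours, and the names `VertexLinks`, `add_link`, `add_feature` and `is_node` are hypothetical. Tracing the lines between nodes and marking vertices as visited is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 8     /* assumption: small fixed cap per vertex */

/* Links held with each vertex: the ids of its distinct neighbours. */
typedef struct {
    int link[MAX_LINKS];
    int nlinks;
} VertexLinks;

/* Record "from is linked to to" once. Returns 0 if the cap is hit. */
int add_link_one_way(VertexLinks *v, int from, int to)
{
    VertexLinks *f;
    int i;

    assert(v != NULL);
    f = &v[from];
    for (i = 0; i < f->nlinks; i++)
        if (f->link[i] == to)
            return 1;               /* shared edge, already recorded */
    if (f->nlinks == MAX_LINKS)
        return 0;
    f->link[f->nlinks++] = to;
    return 1;
}

/* Record an undirected link between vertices a and b. */
int add_link(VertexLinks *v, int a, int b)
{
    if (a == b)
        return 1;                   /* zero-length segment: no link */
    return add_link_one_way(v, a, b) && add_link_one_way(v, b, a);
}

/* Feed one imported feature, given as a sequence of n vertex ids. */
int add_feature(VertexLinks *v, const int *ids, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (!add_link(v, ids[i - 1], ids[i]))
            return 0;
    return 1;
}

/* A vertex counts as a node if it has 3 or more links. */
int is_node(const VertexLinks *v, int id)
{
    return v[id].nlinks >= 3;
}
```

For two squares that share one edge, fed in as the rings 0-1-2-3-0 and 1-4-5-2-1, only vertices 1 and 2 (the ends of the shared edge) come out as nodes. An isolated ring has no vertex with 3 links at all, which is one way the test can miss nodes.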

[...]

> Perhaps, if you get bored, you'll want to look at the d.area polygon
> hole problem? I've been working on it in fits and starts, but haven't
> managed to figure out how to implement an algorithm that should work
> nicely (the algorithm, with modification, could extend to a general
> intersector).

Yes, I had followed some of the discussion about this, but didn't have time to do much about it. The import stuff and vector5.1 have priority for me, but it's worth having a look at, as it may give some indication of how we manage the thorny problem of intersections.

David