[GRASS-dev] g.xlist/g.xremove addons (C version of g.mlist/g.mremove)

Hi,

I've just uploaded g.xlist and g.xremove (C implementations of g.mlist
and g.mremove, no dependency on g.list/g.remove) to
grass-addons/general. To compile these addons, you need POSIX regex(3)
functions. They are super fast (native speed of g.list/g.remove)!
Please test these modules.

I needed to extend G__ls to filter out unwanted file names.
G_set_ls_filter is used to define a regular expression filter. If you
have better ideas to do this, please let me know. If these changes to
libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
g.m?(list|remove) might be substituted with these modules (?).

Huidae

Huidae Cho wrote:

I've just uploaded g.xlist and g.xremove (C implementations of g.mlist
and g.mremove, no dependency on g.list/g.remove) to
grass-addons/general. To compile these addons, you need POSIX regex(3)
functions. They are super fast (native speed of g.list/g.remove)!
Please test these modules.

I needed to extend G__ls to filter out unwanted file names.
G_set_ls_filter is used to define a regular expression filter. If you
have better ideas to do this, please let me know.

I would prefer to avoid putting the regex stuff in G__ls().

It would be more flexible (and eliminate any portability issues) to
make the filter a callback function, e.g.:

typedef int ls_filter_func(const char * /*filename*/, void * /*closure*/);
static ls_filter_func *ls_filter;
static void *ls_closure;

void G_set_ls_filter(ls_filter_func *func, void *closure)
{
  *ls_filter = func;
  *ls_closure = closure;
}

const char **G__ls(const char *dir, int *num_files)
{
...
    while ((dp = readdir(dfd)) != NULL) {
...
  if (ls_filter && !(*ls_filter)(dp->d_name, ls_closure))
      continue;
...
}

g.mlist etc would have the callback perform the regexec(), and pass a
pointer to the compiled regex_t as the closure argument.

This would keep the portability issues out of libgis, and also allow
the use of other filters, e.g. fnmatch() or name[0] != '.' .

If these changes to
libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
g.m?(list|remove) might be substituted with these modules (?).

The existing scripts cannot be removed unless the C versions can be
made to work on all platforms (I daresay there's a regex library for
Windows, but we still need configure tests, etc).

Also, a stylistic issue regarding main.c:

  #define TYPES opt.type->answers

Avoid using preprocessor macros unless it's *really* necessary. While
macros may make the code look "neater", it's also makes it harder to
understand what's actually going on.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements <glynn@gclements.plus.com> writes:

>> I've just uploaded g.xlist and g.xremove (C implementations of
>> g.mlist and g.mremove, no dependency on g.list/g.remove) to
>> grass-addons/general. To compile these addons, you need POSIX
>> regex(3) functions. They are super fast (native speed of
>> g.list/g.remove)! Please test these modules.

[...]

>> If these changes to libgis (ls.c, join.c, gisdefs.h, POSIX regex)
>> are acceptable, g.m?(list|remove) might be substituted with these
>> modules (?).

> The existing scripts cannot be removed unless the C versions can be
> made to work on all platforms (I daresay there's a regex library for
> Windows, but we still need configure tests, etc).

  BTW, Autoconf allows for the parts of the package to be
  configured by independent `configure' scripts
  (AC_CONFIG_SUBDIRS.) Could the splitting of the top-level
  `configure' script be considered for GRASS?

  Doing the things this way has the following benefits:

  * the move to Autoconf 2.50 could then be made gradually, and
    not at once;

  * when the part of code depends on a set of specific libraries,
    there won't be any need to collect the specific compiler and
    linker flags into specific variables -- they could be
    collected into CPPFLAGS, CFLAGS and LDFLAGS (thus simplifying
    the script), while they'll affect only the part of the package
    the configure script belongs to.

[...]

Ivan Shmakov wrote:

>> I've just uploaded g.xlist and g.xremove (C implementations of
>> g.mlist and g.mremove, no dependency on g.list/g.remove) to
>> grass-addons/general. To compile these addons, you need POSIX
>> regex(3) functions. They are super fast (native speed of
>> g.list/g.remove)! Please test these modules.

[...]

>> If these changes to libgis (ls.c, join.c, gisdefs.h, POSIX regex)
>> are acceptable, g.m?(list|remove) might be substituted with these
>> modules (?).

> The existing scripts cannot be removed unless the C versions can be
> made to work on all platforms (I daresay there's a regex library for
> Windows, but we still need configure tests, etc).

  BTW, Autoconf allows for the parts of the package to be
  configured by independent `configure' scripts
  (AC_CONFIG_SUBDIRS.) Could the splitting of the top-level
  `configure' script be considered for GRASS?

Personally, I don't see that there is much to be gained by doing this.

It's not as if different parts of GRASS really are separate packages.
Everything depends upon libgis, and much of it depends upon e.g. GDAL.
So it's likely to be more reliable if a single configure script
considers everything as a whole.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements <glynn@gclements.plus.com> writes:

>>>> I've just uploaded g.xlist and g.xremove (C implementations of
>>>> g.mlist and g.mremove, no dependency on g.list/g.remove) to
>>>> grass-addons/general. To compile these addons, you need POSIX
>>>> regex(3) functions. They are super fast (native speed of
>>>> g.list/g.remove)! Please test these modules.

>> [...]

>>>> If these changes to libgis (ls.c, join.c, gisdefs.h, POSIX regex)
>>>> are acceptable, g.m?(list|remove) might be substituted with these
>>>> modules (?).

>>> The existing scripts cannot be removed unless the C versions can be
>>> made to work on all platforms (I daresay there's a regex library
>>> for Windows, but we still need configure tests, etc).

>> BTW, Autoconf allows for the parts of the package to be configured
>> by independent `configure' scripts (AC_CONFIG_SUBDIRS.) Could the
>> splitting of the top-level `configure' script be considered for
>> GRASS?

> Personally, I don't see that there is much to be gained by doing
> this.

> It's not as if different parts of GRASS really are separate packages.
> Everything depends upon libgis,

  Which, fortunately, hasn't to be tested by configure.

> and much of it depends upon e.g. GDAL. So it's likely to be more
> reliable if a single configure script considers everything as a
> whole.

  I see your point. However, it's not like that all the GRASS
  code depends on the same set of libraries. (Does libgis depend
  on Python, or OpenGL, for instance?)

  Anyway, I'm primarily concerned with porting configure.in
  (aclocal.m4) to Autoconf 2.50+, and, to a somewhat lesser
  extent, with the readability of these. And AC_CONFIG_SUBDIRS
  seemingly offers some help with respect to these tasks.

On Sun, Jun 29, 2008 at 4:41 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Huidae Cho wrote:

I've just uploaded g.xlist and g.xremove (C implementations of g.mlist
and g.mremove, no dependency on g.list/g.remove) to
grass-addons/general. To compile these addons, you need POSIX regex(3)
functions. They are super fast (native speed of g.list/g.remove)!
Please test these modules.

They work well for me and they are natuallry extremely fast
compared to the shell scripts.

...

If these changes to
libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
g.m?(list|remove) might be substituted with these modules (?).

The existing scripts cannot be removed unless the C versions can be
made to work on all platforms (I daresay there's a regex library for
Windows, but we still need configure tests, etc).

What is the current state? Since I have to deal with thousands of
maps in a mapset frequently, I am interested to get the g.mlist
and g.mremove scripts replaced with a C version.

Thanks
Markus

Markus Neteler wrote:

> > If these changes to
> > libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
> > g.m?(list|remove) might be substituted with these modules (?).
>
> The existing scripts cannot be removed unless the C versions can be
> made to work on all platforms (I daresay there's a regex library for
> Windows, but we still need configure tests, etc).

What is the current state? Since I have to deal with thousands of
maps in a mapset frequently, I am interested to get the g.mlist
and g.mremove scripts replaced with a C version.

Well, first someone needs to find a regex library for Windows. If it's
needed for g.mlist etc, it will effectively become a mandatory
dependency, so if only source is available, we'll need to accept
responsibility for providing a binary package.

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, Sep 4, 2008 at 8:35 PM, Glynn Clements <glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

> > If these changes to
> > libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
> > g.m?(list|remove) might be substituted with these modules (?).
>
> The existing scripts cannot be removed unless the C versions can be
> made to work on all platforms (I daresay there's a regex library for
> Windows, but we still need configure tests, etc).

What is the current state? Since I have to deal with thousands of
maps in a mapset frequently, I am interested to get the g.mlist
and g.mremove scripts replaced with a C version.

Well, first someone needs to find a regex library for Windows. If it's
needed for g.mlist etc, it will effectively become a mandatory
dependency, so if only source is available, we'll need to accept
responsibility for providing a binary package.

Voila:
http://sourceforge.net/project/showfiles.php?group_id=2435&package_id=73286
"Release Name: Current Release: mingw-regex-2.5.1

Notes:
This is a port of the GNU regex components from glibc,
ported for use in native Win32 applications by Tor Lillqvist.

This release comprises three files, all as tar.gz archives:

mingw-regex-2.5.1-bin.tar.gz provides the regex functions
in the form of a precompiled DLL.

mingw-regex-2.5.1-dev.tar.gz provides the import libraries
and header files, required for linking user applications
to the above DLL.

mingw-regex-2.5.1-src.tar.gz provides the full source code,
from which the above two packages have been compiled.
"

Markus

Markus Neteler wrote:

>> > > If these changes to
>> > > libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
>> > > g.m?(list|remove) might be substituted with these modules (?).
>> >
>> > The existing scripts cannot be removed unless the C versions can be
>> > made to work on all platforms (I daresay there's a regex library for
>> > Windows, but we still need configure tests, etc).
>>
>> What is the current state? Since I have to deal with thousands of
>> maps in a mapset frequently, I am interested to get the g.mlist
>> and g.mremove scripts replaced with a C version.
>
> Well, first someone needs to find a regex library for Windows. If it's
> needed for g.mlist etc, it will effectively become a mandatory
> dependency, so if only source is available, we'll need to accept
> responsibility for providing a binary package.

Voila:
http://sourceforge.net/project/showfiles.php?group_id=2435&package_id=73286
"Release Name: Current Release: mingw-regex-2.5.1

Notes:
This is a port of the GNU regex components from glibc,
ported for use in native Win32 applications by Tor Lillqvist.

I have added g.mlist and g.mremove to SVN trunk, along with
G_set_ls_filter() and the configure checks for the regex functions.

--
Glynn Clements <glynn@gclements.plus.com>

On Fri, Sep 05, 2008 at 12:18:10AM +0100, Glynn Clements wrote:

Markus Neteler wrote:

> >> > > If these changes to
> >> > > libgis (ls.c, join.c, gisdefs.h, POSIX regex) are acceptable,
> >> > > g.m?(list|remove) might be substituted with these modules (?).
> >> >
> >> > The existing scripts cannot be removed unless the C versions can be
> >> > made to work on all platforms (I daresay there's a regex library for
> >> > Windows, but we still need configure tests, etc).
> >>
> >> What is the current state? Since I have to deal with thousands of
> >> maps in a mapset frequently, I am interested to get the g.mlist
> >> and g.mremove scripts replaced with a C version.
> >
> > Well, first someone needs to find a regex library for Windows. If it's
> > needed for g.mlist etc, it will effectively become a mandatory
> > dependency, so if only source is available, we'll need to accept
> > responsibility for providing a binary package.
>
> Voila:
> http://sourceforge.net/project/showfiles.php?group_id=2435&package_id=73286
> "Release Name: Current Release: mingw-regex-2.5.1
>
> Notes:
> This is a port of the GNU regex components from glibc,
> ported for use in native Win32 applications by Tor Lillqvist.

I have added g.mlist and g.mremove to SVN trunk, along with
G_set_ls_filter() and the configure checks for the regex functions.

Do we really need to keep the old names (g.mlist/g.mremove) when we may
break backward compatibility in grass7? The "m" used to stand for
"modified", but "extended" (g.xlist/g.xremove) would be more
appropriate. Just my paranoia :-).

Huidae

Huidae Cho wrote:

> I have added g.mlist and g.mremove to SVN trunk, along with
> G_set_ls_filter() and the configure checks for the regex functions.
>

Do we really need to keep the old names (g.mlist/g.mremove) when we may
break backward compatibility in grass7? The "m" used to stand for
"modified", but "extended" (g.xlist/g.xremove) would be more
appropriate. Just my paranoia :-).

I thought that these were supposed to be replacements for g.mlist and
g.mremove. AFIACT, they have the same options as the script versions,
other than the use of extended REs versus basic REs for -r.

If we decide to keep the script versions as a fallback, those will be
replaced with Python versions in 7.x, so they can be changed to use
extended REs (the shell script uses sed, which only supports basic
REs). Or we can make the C versions use basic REs for -r and add e.g.
-e for extended REs.

--
Glynn Clements <glynn@gclements.plus.com>

On Sat, Sep 06, 2008 at 06:05:46PM +0100, Glynn Clements wrote:

Huidae Cho wrote:

> > I have added g.mlist and g.mremove to SVN trunk, along with
> > G_set_ls_filter() and the configure checks for the regex functions.
> >
>
> Do we really need to keep the old names (g.mlist/g.mremove) when we may
> break backward compatibility in grass7? The "m" used to stand for
> "modified", but "extended" (g.xlist/g.xremove) would be more
> appropriate. Just my paranoia :-).

I thought that these were supposed to be replacements for g.mlist and
g.mremove. AFIACT, they have the same options as the script versions,
other than the use of extended REs versus basic REs for -r.

Yes, g.xlist/g.xremove were supposed to be replacements/improvements
for g.mlist/g.mremove, but with different names.

If we decide to keep the script versions as a fallback, those will be
replaced with Python versions in 7.x, so they can be changed to use
extended REs (the shell script uses sed, which only supports basic
REs). Or we can make the C versions use basic REs for -r and add e.g.
-e for extended REs.

I didn't mean whether or not we need to keep the script versions; I
doubt the need for fallback versions. What I suggest is to remove the
script versions (already done) and "rename" the C version of
g.mlist/g.mremove to g.xlist/g.xremove as its current name g."m"list
(modified g.list) is somewhat awkward compared to g."x"list (extended
g.list).

I don't think it's a good idea to have two flags for basic and extended
REs unless grass7 should keep backward compatibility.

Huidae

Huidae Cho wrote:

> If we decide to keep the script versions as a fallback, those will be
> replaced with Python versions in 7.x, so they can be changed to use
> extended REs (the shell script uses sed, which only supports basic
> REs). Or we can make the C versions use basic REs for -r and add e.g.
> -e for extended REs.

FWIW, I've done this; -r uses basic REs, -e uses extended REs.

I didn't mean whether or not we need to keep the script versions; I
doubt the need for fallback versions. What I suggest is to remove the
script versions (already done)

The script versions are still there, and should be kept until we're
sure that there aren't any problems with the C versions (e.g. whether
the configure checks for the regex functions are sufficient).

and "rename" the C version of
g.mlist/g.mremove to g.xlist/g.xremove as its current name g."m"list
(modified g.list) is somewhat awkward compared to g."x"list (extended
g.list).

Renaming them means that any scripts which use them will need to be
changed, documentation needs to be changed, etc, which seems like
gratuitous incompatibility.

I don't think it's a good idea to have two flags for basic and extended
REs unless grass7 should keep backward compatibility.

We don't have to keep compatibility, but that doesn't mean that we
break things without reason.

AFAICT, the new modules provide exactly the same functionality (and
with the same names) as the existing scripts. Anything that worked
before with the scripts should continue to work with either the
scripts or the C modules.

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

--
Glynn Clements <glynn@gclements.plus.com>

Hi,

2008/9/6 Glynn Clements <glynn@gclements.plus.com>:

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

I don't see any reason why to support 3dview type in GRASS7.

Martin

--
Martin Landa <landa.martin gmail.com> * http://gama.fsv.cvut.cz/~landa *

On Sat, 6 Sep 2008, Martin Landa wrote:

Hi,

2008/9/6 Glynn Clements <glynn@gclements.plus.com>:

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

I don't see any reason why to support 3dview type in GRASS7.

Why - is there a replacement for this? (Really sorry I haven't been keeping up to date with improvements to OGSF and 3-D visualisation functionality.)

Paul

Hi,

2008/9/6 Paul Kelly <paul-grass@stjohnspoint.co.uk>:

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

I don't see any reason why to support 3dview type in GRASS7.

Why - is there a replacement for this? (Really sorry I haven't been keeping
up to date with improvements to OGSF and 3-D visualisation functionality.)

Tcl/Tk Nviz will be replaced by wxGUI Nviz extension. All settings
should be stored in the workspace file (gxw).

Martin

--
Martin Landa <landa.martin gmail.com> * http://gama.fsv.cvut.cz/~landa *

Glynn:

Or we can make the C versions use basic REs for -r and add e.g.
-e for extended REs.

I think it would be better to offer just one regex flag.

2c,
Hamish

On Sat, 6 Sep 2008, Martin Landa wrote:

Hi,

2008/9/6 Paul Kelly <paul-grass@stjohnspoint.co.uk>:

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

I don't see any reason why to support 3dview type in GRASS7.

Why - is there a replacement for this? (Really sorry I haven't been keeping
up to date with improvements to OGSF and 3-D visualisation functionality.)

Tcl/Tk Nviz will be replaced by wxGUI Nviz extension. All settings
should be stored in the workspace file (gxw).

Nviz never really worked properly with 3dview files. (Bob Covill added some support at one stage after some nagging by me, but it didn't fully work so wasn't much use for faithfully recreating a 3d-view.) The really useful thing about 3dview files IMHO is specifying an exact observer location, field of view, yaw, pitch and roll angles and recreating the view as would be seen by an observer at that point. This worked great in the old SG3d (and in d.3d) and I used it in some research a few years ago. But the Silicon Graphics machines in my old lab (needed to run SG3d) are long since scrapped and the ability to faithfully recreate a 3-D view has been sorely missing from Nviz. Last time I looked the functions to load a 3-D view file were still there in the OGSF library so I was kind of hopeful that this could be got working again at some stage.

Paul

Huidae Cho wrote:

its current name g."m"list (modified g.list) is somewhat awkward compared
to g."x"list (extended g.list).

I had always though that it stood for many-list. And in that light it works
and no need to change it- unless you want a way to make it obvious that
compatibility is broken and scripts using it may silently fail or
unexpectedly give a different result.

But I'll respect and take your word as for the author's original intentions.
:wink:

Hamish

On Sat, Sep 06, 2008 at 10:05:08PM +0100, Glynn Clements wrote:

Huidae Cho wrote:

> > If we decide to keep the script versions as a fallback, those will be
> > replaced with Python versions in 7.x, so they can be changed to use
> > extended REs (the shell script uses sed, which only supports basic
> > REs). Or we can make the C versions use basic REs for -r and add e.g.
> > -e for extended REs.

FWIW, I've done this; -r uses basic REs, -e uses extended REs.

> I didn't mean whether or not we need to keep the script versions; I
> doubt the need for fallback versions. What I suggest is to remove the
> script versions (already done)

The script versions are still there, and should be kept until we're
sure that there aren't any problems with the C versions (e.g. whether
the configure checks for the regex functions are sufficient).

> and "rename" the C version of
> g.mlist/g.mremove to g.xlist/g.xremove as its current name g."m"list
> (modified g.list) is somewhat awkward compared to g."x"list (extended
> g.list).

Renaming them means that any scripts which use them will need to be
changed, documentation needs to be changed, etc, which seems like
gratuitous incompatibility.

> I don't think it's a good idea to have two flags for basic and extended
> REs unless grass7 should keep backward compatibility.

We don't have to keep compatibility, but that doesn't mean that we
break things without reason.

Original g.xlist/g.xremove are not compatible with their script versions
because these modules support only extended REs with -r flag and I
didn't see the need for two regex flags. Having one flag for just
extended REs would be simple and straightforward, and could be a good
reason for incompatibility. Just my two cents.

AFAICT, the new modules provide exactly the same functionality (and
with the same names) as the existing scripts. Anything that worked
before with the scripts should continue to work with either the
scripts or the C modules.

If we decide to have fallback scripts, yes, the both versions should
provide the same functionality. If we decide to drop the scripts, I'm
not sure why we cannot have just one flag for extended REs and break
things for the above reason.

Except for one thing: the g.mremove script has dview= rather than
3dview=. The C modules has 3dview=, and both versions of g.mlist use
type=3dview. I'm not sure if there's a reason for this, or if it was
just a typo.

The very first version had 3dview=, but I don't know when it changed.
Probably, when we added g.parser's GIS_OPT_ feature?

Huidae