[GRASS-dev] split GRASS (lib / cli / modules / wx / qt / web / etc.)

Dear devs,

Stimulated by Ondřej's GSoC idea, I would like to revive this topic.
I know that this has already been discussed in the past; I found this thread
(CLI1=GUI [0]), but it focuses on the users' and packagers' perspective.
Here I would like to approach it from a developer's point of view.
As pointed out in previous threads, GRASS is already modular, but
(imho) there is duplicated code/functionality, and things and
levels are often mixed.

Let's start with a simple example: most GRASS modules mix
logic and CLI, and several of them have a single main function with
everything inside. I think it would be useful to have a clearer
distinction between logic/algorithms and their public interface
(cli/gui). If we clearly split these two things, the GRASS modules
become just an interface to functions inside the GRASS libraries.

Another example is the GRASS GUI, which internally has a lot of
functionality that (imho) should be moved to a dedicated Python library
or integrated into an existing one, because it is independent of
the toolkit (wx|qt|javascript+html5) used to render the final GUI; here
again it seems to me that a lot of things are mixed.

Splitting these main functionalities into different repositories can help
developers, because they can focus on a smaller code base.

So how should we split GRASS? It would be nice to open a dedicated repository
(git?) for each of these projects:

- grass-lib: provides only the C and Python API. This component should be
a Python citizen, meaning that it should be available on PyPI, the
Python Package Index [1], and of course installable as a Python package
through pip;
- grass-cli: provides a shell (with no modules!), also available as a
pure Python package;
- grass-modules: provides all the GRASS core modules (this could
also be a pure Python interface calling functions in the C/Python
libraries), and could be split into further subcategories (e.g. imagery,
temporal, terrain, etc.);
- grass-wx: provides a wxPython/Phoenix interface for GRASS;
- (grass-qt: provides a PyQt/PySide interface for GRASS)
- (grass-jupyter: provides a Jupyter interface to GRASS)
- (grass-rest: provides a RESTful API for GRASS)
- (add your idea here... :-D)
- etc.

Each item is characterized by a different use case, and these things are
generally developed by different people with different backgrounds and
needs, so to me it makes sense to split them.

We could have greater granularity and a clear focus for each
repository, which could help to attract new developers because it opens
new development possibilities for GRASS and
enlarges its range of use cases. Separating things into dedicated
repositories forces developers to respect the distinction, and forces them
to think about where the code should be put/published.
Such a subdivision could help us to reduce the total amount of code by
making things more general and abstract. It should also help in writing
independent and well-isolated tests.

It should also help the development cycle, since we could release things
independently; it only requires specifying in the
requirements.txt file a working, tested combination of Python package
versions.

{{{
numpy>=1.10
grass-lib>=8
grass-cli==8.1
grass-modules>=8
grass-wx==8.1.3
}}}

I think this idea could mainly help developers, by making things clear and
well organized in different sub-projects, and by
opening the possibility of integrating GRASS functionality into other
open-source projects.

This solution could also make things easier for
packagers and users; for example, users could install GRASS on all
systems (Win/Mac/*nix) by running a single command:

{{{
$ pip install --user grass-lib grass-cli grass-modules grass-wx
}}}

What do you think?

All the best.

Pietro

[0] https://lists.osgeo.org/pipermail/grass-dev/2010-November/052661.html
[1] https://pypi.python.org/pypi

Pietro <peter.zamb@gmail.com> writes:

[...]

What do you think?

I am not a developer of GRASS, but in my experience it is very
advantageous to split one large package into smaller ones, and I think
this is definitely a step in the right direction.

Just for clarification: GRASS will still be available as a deb package
for Debian and derivatives, dmg, ... I hope? (pip always makes me a
little bit nervous, no idea why. Possibly because it is another package manager
in addition to deb, rpm, dmg, Homebrew, MacPorts, ...)

Also: I would see it as very important that GRASS can be installed on
all systems (as you mention: win, mac, *nix, ...?).

Cheers,

Rainer

--
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982

On 18/03/16 12:58, Pietro wrote:

Dear devs,

stimulated by the GSOC idea of Ondřej I would like to revive again this topic.
I know that this has been already discussed in the past, I found this
(CLI1=GUI [0]) but it is focused on users and packagers prospective.
Here I would like to face this point from a developer point of view.
As point out in previous threads GRASS it is already modular, but
(imho) there is duplicate code/functionalities and often things and
levels are mixed.

In your opinion is this true at the module level, or mostly for the wxGUI ?

Let's start with a simple example: most of the GRASS modules, mix
nicely logic and cli, several of them have a single main function with
everything inside. I think could be useful to have a more clear
distinction between logic/algorithms and their public interface
(cli/gui).

Why ? I really like the fact that each module works kind of like a high-level function with a defined public interface.

If we clearly split these two things the GRASS modules
became just an interface to some functions inside the GRASS libraries.

Many modules (if not most) are already that: a combination of GRASS library function calls in order to achieve the specific goal the module is set out for.

Another example is GRASS GUI that have internally a lot of
functionalities that (imho) should be moved|integrated to a
dedicated|existing python library, because their are independent by
the library (wx|qt|javascript+html5) used to render the final GUI, and
again to me it seems that a lot of things are mixed.

I agree that the GUI part is probably more subject to possible refactoring into a larger library part.

So how to split GRASS. It would be nice to open a dedicate repository
(git?) for each of this projects:

- grass-lib: provides only C and Python API. This component should be
a python citizen, I mean that should be available at the PyPI - the
Python Package Index [1] and of course install-able as python package
through pip;
- grass-cli: provides a shell (with no modules!), also available as a
pure python package;

I'm not sure I like this extremely pythonic approach to GRASS. I love to use Python for scripting GRASS, but in my perception GRASS is far from being a Python project.

- grass-modules: provides all the GRASS core modules (this could be
also a pure python interface calling functions in the C/Python
libraries), and could be split in other sub categories (e.g. imagery,
temporal, terrain, etc).

What is the difference between these modules and the existing ones ? Except for your idea to make all modules Python modules.

I think this idea could help mainly developers making things clear and
well organized in different sub-projects.

I think I need more explanation in order to really understand what the advantage would be.

Currently everything resides in the same repository, but there is a distinction between modules (raster, vector, db, etc directories of the source tree), and libraries (lib).

I'm not sure what the great advantage of separating them would be, and honestly I'm afraid that if we start to have modules in versions that only run with specific version of a separate library package, we will have a flood of new user mails complaining about things not working. The current state of affairs at least guarantees that things are in sync.

Opening the possibility to integrate GRASS functionalities to other
open-source projects.

I don't know why other open-source projects cannot use GRASS functionality currently. There's pygrass if people want direct library access, you can always use system calls to modules, even more easily so with the new --exec parameter to the startup script.
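To illustrate the "system call" route, here is a minimal Python sketch of how another project might wrap a single module run via the startup script. The binary name `grass`, the example mapset path, and the helper names are assumptions made for illustration; the only thing taken from the discussion is the `--exec` parameter, assumed here to follow the mapset path on the command line.

```python
import subprocess


def build_exec_command(mapset_path, module, *args, grass_binary="grass"):
    """Build the argv for running one GRASS module non-interactively.

    `grass_binary` is an assumption: the startup script may be installed
    under a versioned name on a given system.
    """
    return [grass_binary, mapset_path, "--exec", module] + list(args)


def run_module(mapset_path, module, *args, grass_binary="grass"):
    """Run the module and return its standard output as text."""
    command = build_exec_command(mapset_path, module, *args,
                                 grass_binary=grass_binary)
    return subprocess.check_output(command, universal_newlines=True)
```

For example, `run_module("/data/grassdata/nc/user1", "g.region", "-p")` would start GRASS in that (hypothetical) mapset, run `g.region -p`, and hand back whatever the module printed.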

This solution could help also in making things easier also for
packager and users, for example users could install GRASS on all the
system (win/Mac/*nix) running a single command:

{{{
$ pip install --user grass-lib grass-cli grass-modules grass-wx
}}}

I can already do that today in Debian:

apt-get install grass-core grass-gui grass-dev grass-doc

But that's a packaging issue.

What do you think?

As I said, I find the whole approach a bit too Pythonic. Unless we decide that GRASS is going to become a Python project with a bit of C-libraries thrown in for performance, I need more convincing to buy this.

Moritz

On Fri, Mar 18, 2016 at 2:16 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

On 18/03/16 12:58, Pietro wrote:
In your opinion is this true at the module level, or mostly for the wxGUI ?

No, in my opinion things are also quite mixed in the C/Python modules.

Let's start with a simple example: most of the GRASS modules, mix
nicely logic and cli, several of them have a single main function with
everything inside. I think could be useful to have a more clear
distinction between logic/algorithms and their public interface
(cli/gui).

Why ? I really like the fact that each module works kind of like a
high-level function with a defined public interface.

Yes, and I like it too! Indeed, I don't want to change this. What I would
like to change is to better distinguish this high-level
functionality from the low-level parts. So, for instance, just opening
a GRASS core module at random: r.resamp.stats

https://trac.osgeo.org/grass/browser/grass/trunk/raster/r.resamp.stats/main.c

The following are defined there:
- static const struct menu, which could probably be useful for
other modules too, so it should go into grass-lib;
- static char *build_method_list(void), which could be generalized to
be used by other modules too and should go into grass-lib;
- static int find_method(const char *name), which could also be generalized
to be used by other modules and should go into grass-lib;
- static void resamp_unweighted(void), which could also be
changed to be more general and moved to grass-lib too;
- static void resamp_weighted(void), same as before.

For each of the above functions we can build tests to improve
reliability, detect performance regressions, and so on.

So far you can access these functions only through the CLI;
if we clearly separate these two levels, then we can access them using the
CLI, but also using C/Python/etc. So, for instance, if I need to select an
option from a menu list in a new module, I have to reinvent the wheel and
write my own buggy code, and as GRASS developers we end up having
duplicated buggy code in each module.

The main function could stay or be rewritten in Python; this is not
really relevant, because it just defines the CLI,
calls the functions, and finally adds the history metadata and sets
the color table.
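To make the split concrete, here is a minimal Python sketch of the idea, not actual GRASS code: a small "library" layer with a method table and lookup helpers (the names build_method_list and find_method are borrowed from r.resamp.stats, everything else is hypothetical), and a thin main that only parses options and calls the library.

```python
import argparse

# Hypothetical "library" layer: a method table plus lookup helpers.
# The reduced set of methods is illustrative only.
METHODS = {
    "average": lambda values: sum(values) / len(values),
    "minimum": min,
    "maximum": max,
    "sum": sum,
}


def build_method_list():
    """Return the available method names as a comma-separated string."""
    return ",".join(METHODS)


def find_method(name):
    """Return the aggregation function registered under `name`."""
    try:
        return METHODS[name]
    except KeyError:
        raise ValueError(
            "unknown method %r (choose from %s)" % (name, build_method_list())
        )


def aggregate(values, method="average"):
    """Library entry point: usable from Python without any CLI."""
    return find_method(method)(values)


def main(argv=None):
    """Thin CLI layer: parse options, call the library, print the result."""
    parser = argparse.ArgumentParser(description="Aggregate numbers")
    parser.add_argument("--method", default="average",
                        choices=sorted(METHODS))
    parser.add_argument("values", nargs="+", type=float)
    args = parser.parse_args(argv)
    print(aggregate(args.values, args.method))
    return 0
```

With this layout the helpers can be unit-tested and reused directly (e.g. `aggregate([1, 5, 3], "maximum")`), while `main(["--method", "sum", "1", "2"])` exercises the same code through the command-line interface.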

If we clearly split these two things the GRASS modules
became just an interface to some functions inside the GRASS libraries.

Many modules (if not most) are already that: a combination of GRASS library
function calls in order to achieve the specific goal the module is set out
for.

Yes, they are calling GRASS library functions, but they are also adding
functionality that (imho) should be included in the GRASS library,
because it could be useful not only for these specific modules but
also for others.

So how to split GRASS. It would be nice to open a dedicate repository
(git?) for each of this projects:

- grass-lib: provides only C and Python API. This component should be
a python citizen, I mean that should be available at the PyPI - the
Python Package Index [1] and of course install-able as python package
through pip;
- grass-cli: provides a shell (with no modules!), also available as a
pure python package;

I'm not sure I like this extremely pythonic approach to GRASS. I love to use
Python for scripting GRASS, but in my perception GRASS is far from being a
Python project.

Mmh, ok, so let's add one more layer: grass-lib, grass-py, and then the others.

So grass-lib will contain only C (C++?) code and will not be available
on PyPI.
grass-py adds the Python wrapper around grass-lib, adds the API, and goes on PyPI.

This is the same approach as GDAL [0], PROJ4 [1], and mapnik [2], which are all
available as Python packages.

I do think that adding GRASS to PyPI can only open new perspectives and
use cases, reaching a broader group of users and developers.

- grass-modules: provides all the GRASS core modules (this could be
also a pure python interface calling functions in the C/Python
libraries), and could be split in other sub categories (e.g. imagery,
temporal, terrain, etc).

What is the difference between these modules and the existing ones ? Except
for your idea to make all module Python modules.

Because things are complicated and the current status quo is not
flexible and/or is very limited.
Basically, you could create Python packages that act as both a library and
GRASS addons.
So, for instance, the r.green modules require scipy and numexpr to run,
and the only way available in 2016 is that users have to take care of
installing the missing libraries themselves; even worse, if some of our modules
depend on other addons, there is no way to handle this.

Another thing that does not work with the current setup is that
there is no way to state for which versions of GRASS an addon is
available.
So, for instance, r.green was developed and tested with grass7 stable;
Sören has improved the pygrass vector API in trunk (thank you Sören), but
now in the grass-addons repository I have to choose whether I want to
support grass-stable or grass-trunk, specify this somewhere
in the manual, and hope users read it. Instead I could create a Python
package, so that installing r.green (v0.3) would use grass-lib (v7.0.x) and
installing r.green (v0.4) would use grass-lib (v7.1.x). Moreover, I could
also write the module documentation using Sphinx, instead of writing
HTML code.

Perhaps in the future I would like to add a dedicated GUI or a web
interface to this set of modules, and then I will have more dependencies,
making it almost impossible for an average user to use these new
features.

So basically we can either remove g.extension and rely on pip, or we have to
reinvent the wheel to get this functionality into g.extension.

I think this idea could help mainly developers making things clear and
well organized in different sub-projects.

I think I need more explanation in order to really understand what the
advantage would be.

Sorry if I was not clear; hopefully it is now a bit clearer what kind
of advantages we (as developers) could have.

Currently everything resides in the same repository, but there is a
distinction between modules (raster, vector, db, etc directories of the
source tree), and libraries (lib).

I'm not sure what the great advantage of separating them would be

To me, a better separation of the code functionalities could help us to
(just brainstorming):
1) reduce the amount of code by removing duplication and redundancy;
2) have fewer functions that are better documented and tested;
3) make it easier for new developers to contribute to the project, because
they can contribute to a smaller repository with very clear aims and
objectives;
4) reach a higher level of abstraction;
5) simplify the use of GRASS as a library for other tools (e.g.
postgis, spatialite, qgis, gdal, etc.).

honestly I'm afraid that if we start to have modules in versions that only
run with specific version of a separate library package, we will have a
flood of new user mails complaining about things not working. The current
state of affairs at least guarantees that things are in sync.

I've never had problems with Python on this, and I see only advantages
in being able to specify which versions are supported for what.
Users could not mix versions, because pip checks for this kind of issue, and
developers could have a faster release cycle.
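As a rough illustration of the kind of check involved, here is a deliberately simplified Python sketch. It is not pip's resolver: it only understands plain numeric versions and the `==` and `>=` operators, and the function names are hypothetical.

```python
import re


def parse_version(text):
    """Turn '7.0.4' into (7, 0, 4); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, requirement):
    """Check `installed` (e.g. '7.0.4') against 'name>=7.0' or 'name==7.0.4'.

    Simplified on purpose: only '==' and '>=' are supported, and versions
    are padded with zeros before comparing.
    """
    match = re.match(
        r"^\s*([A-Za-z0-9_.-]+?)\s*(==|>=)\s*([0-9.]+)\s*$", requirement
    )
    if match is None:
        raise ValueError("unsupported requirement: %r" % requirement)
    _name, operator, wanted = match.groups()
    have, want = parse_version(installed), parse_version(wanted)
    # Pad the shorter tuple with zeros so that 8.1 compares equal to 8.1.0.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have == want if operator == "==" else have >= want
```

So `satisfies("7.0.4", "grass-lib>=7.1")` is false, which is the situation where an installer would refuse the combination instead of letting the user run mismatched versions. Note that a line with a single `=` is rejected by this sketch as an unsupported requirement.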

Opening the possibility to integrate GRASS functionalities to other
open-source projects.

I don't know why other open-source projects cannot use GRASS functionality
currently. There's pygrass if people want direct library access, you can
always use system calls to modules, even more easily so with the new --exec
parameter to the startup script.

Yes, indeed we are going in this direction; I think we can do it in a
more effective way, making things easier for both users and developers.

This solution could help also in making things easier also for
packager and users, for example users could install GRASS on all the
system (win/Mac/*nix) running a single command:

{{{
$ pip install --user grass-lib grass-cli grass-modules grass-wx
}}}

I can already do that today in Debian:

apt-get install grass-core grass-gui grass-dev grass-doc

But that's a packaging issue.

No, it isn't only a packaging issue; it is a more general approach to
how we face problems and organize the code.

What do you think?

As I said, I find the whole approach a bit too Pythonic. Unless we decide
that GRASS is going to become a Python project with a bit of C-libraries
thrown in for performance, I need more convincing to buy this.

Mhh.. The C functionality should stay, but I do think the Python
part should become more Pythonic or, better, well integrated into the
Python ecosystem, and I think we could improve on this side.
Why are you worried about being too Pythonic?

Have a nice weekend.

Pietro

[0] https://pypi.python.org/pypi/GDAL
[1] https://pypi.python.org/pypi/pyproj
[2] https://pypi.python.org/pypi/mapnik2

On 18/03/16 18:38, Pietro wrote:

[...]

yes they are calling GRASS library functions, but they are also adding
functionalities that (imho) should be included to the GRASS library.
Because they could be useful not only for this specific modules but
also for others.

The above points are a call for refactoring of the code, not necessarily reorganizing it into different packages in different source trees.

IMHO, this will always be an issue because of the structure of GRASS: Someone develops a new module. Code is specific to this module. Then someone else develops a second module and reproduces part of the code of the former, because it does what they want and so they just copy it.
At one point several modules share the same code and it becomes clear that there is a need for this code at library level.

Unless we work with a much more centralized development system, where a limited number of developers review each proposed module and then check whether parts of the code should go into the library instead of the module, I don't really see a different way of doing things as the way that has grown organically throughout the development history of GRASS...

We could decide to organize concerted code review moments aiming at identifying relevant parts of code that should go into libraries, but I have the feeling that current ad-hoc management is more efficient: whenever the need is felt, we do it.

BTW, I don't see how separating the source trees solves this issue. Many people will still continue to code things in modules first and only after the same code is used in several modules will it become apparent that it should go into the libs.

mmh, ok, so let's add a more layer: grass-lib, grass-py, and then the others..

So grass-lib will contain only C (C++?) code and will be not available
at the PyPi website.
grass-py add the python wrap to grass-lib and add API and go to PyPi.

This is the same approach of GDAL[0], PROJ4[1], mapnik[2] that are all
available as python packages.

I do think that add grass to PyPi can only open new prospective and
use cases reaching a broader group of users and developers.

I have the feeling that the question of possibly extracting the Python libs into an installable PyPI package is a different issue. But then again, what use would these libraries be without the C libraries, and actually even without the modules ? Yes, pygrass allows you to code directly with low-level routines, but for me one of the big strengths of the GRASS modular structure and our Python APIs is that they allow working with the modules as "functions", without forcing anyone to use low-level access.

- grass-modules: provides all the GRASS core modules (this could be
also a pure python interface calling functions in the C/Python
libraries), and could be split in other sub categories (e.g. imagery,
temporal, terrain, etc).

What is the difference between these modules and the existing ones ? Except
for your idea to make all module Python modules.

Because things are complicated and the current status quo is not
flexible and/or very limited.
Basically you can create python packages that act as both library and
grass-addons.

That is yet another issue: how to handle addons that consist of more than just a single-file script ? I'm not sure I would agree that every time someone codes a more complex addon, the lib part should immediately go into the core GRASS libs. So I'd rather see an easier way to handle such libs in addons. Currently, it is not easy for people (see [1] for a recent example), but, even though I'm speaking without enough knowledge here, I don't think it should be too difficult for addon modules to contain files that get installed in .grass7/addons/lib instead of the current mix of different solutions.

So for instance the r.green modules require scipy and numexpr to run,
and the only way available in 2016 is that user have to care about it
installing the missing libraries and even worst if some of our modules
depend on another addons there is no way to handle this.

I agree that dependency management between addons would be nice, but I don't see it as that much of an issue. You can always include in your module a check for the existence of another and stop with a fatal error encouraging the user to install that module.
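A minimal sketch of such a check, using only the Python standard library. It assumes the required module is an executable reachable on PATH inside the running GRASS session; the function names are hypothetical, and a real GRASS script would more likely use the helpers in the GRASS Python scripting library.

```python
import shutil
import sys


def missing_programs(required, which=shutil.which):
    """Return the names in `required` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in required if which(name) is None]


def require_programs(required, which=shutil.which):
    """Exit with an explanatory message if any required program is missing."""
    missing = missing_programs(required, which)
    if missing:
        sys.exit(
            "ERROR: missing required module(s): %s. "
            "Please install them (for addons, e.g. with g.extension) and retry."
            % ", ".join(missing)
        )
```

Calling `require_programs(["r.green"])` at the top of an addon script would then stop early with a readable message rather than failing halfway through the computation.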

AFAIU, there is also the option to use toolboxes, but personally, I haven't looked at this in detail, yet.

Other things that are not working with the current set up is that
there is not way to said for each version of grass the addon is
available.

So for instance r.green was developed and tested with grass7 stable,
Sören has improved pygrass vector API in trunk (thank you Sören), but
now in the grass-addons repository I have to choose if I want to
support grass-stable or grass-trunk, I have to specify this somewhere
in the manual and hope user read it. Instead I could create a python
package and installing r.green (v0.3) will use grass-lib(v7.0.x) or
installing r.green(v0.4) will use grass-lib(v7.1.x).

Another option might be to distinguish in the addons repository between grass_stable and grass_trunk, possibly with some easy option to create a sort of "symbolic" link between the two for modules that run with both.

Moreover I could
also have the module documentation using sphinx, instead of writing
html code.

That's a totally different issue again.

Perhaps in the future I would like to add a dedicated GUI to this set
of modules or a web interface and then I will have more dependencies,
and make almost impossible to use this new features for an average
user.

I think that if you go that far away from the KISS principle in GRASS module elaboration, then you are probably better off packaging the whole thing...

So basically we can remove g.extension and rely on pip or we have to
reinvent the wheel to get these functionalities in g.extension.

I'm not familiar enough with pip to judge what this would entail for GRASS.

Sorry If I was not clear, hopefully now it is a bit clearer what kind
of advantages we (as developers) could have.

Well, I mainly find that you mix many different issues, and I'm not convinced that the proposed solution of breaking up the source tree really solves all of these.

Maybe it might be better to break up your suggestions into separate parts to discuss them individually. If at the end we see that they all point to the same solution, your argument will be even more convincing :-)

Just my 2¢,

Moritz

[1] https://lists.osgeo.org/pipermail/grass-dev/2016-March/079481.html

On 21-03-16 10:48, Moritz Lennert wrote:

On 18/03/16 18:38, Pietro wrote:

On Fri, Mar 18, 2016 at 2:16 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:

On 18/03/16 12:58, Pietro wrote:
In your opinion is this true at the module level, or mostly for the wxGUI ?

No in my opinion things are quite mixed also in C/python modules.

Let's start with a simple example: most of the GRASS modules, mix
nicely logic and cli, several of them have a single main function with
everything inside. I think could be useful to have a more clear
distinction between logic/algorithms and their public interface
(cli/gui).

Why ? I really like the fact that each module works kind of like a
high-level function with a defined public interface.

Yes, and I like too! Indeed I don't want to change this. What I would
like to change is to better distinguish this high-level
functionalities from the low level parts. So for instance just opening
randomly a GRASS core module: r.resamp.stats

https://trac.osgeo.org/grass/browser/grass/trunk/raster/r.resamp.stats/main.c

The following are defined there:
- static const struct menu, which would probably be useful for other
modules too, so it should go to the grass-lib
- static char *build_method_list(void), which could be generalized for
use by other modules and should go to the grass-lib
- static int find_method(const char *name), which could also be
generalized for use by other modules and should go to the grass-lib
- static void resamp_unweighted(void), which could also be made more
general and moved to the grass-lib
- static void resamp_weighted(void), same as before.

For each of the above functions we can build tests to improve
reliability, catch performance regressions, and so on.
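As a rough illustration of the kind of shared helper discussed above, here is a minimal Python sketch of a method menu that several modules could reuse. The names METHODS, build_method_list and find_method mirror the static helpers listed for r.resamp.stats, but the table contents and signatures are assumptions made for illustration, not the actual GRASS library API.

```python
from statistics import mean, median

# Hypothetical shared "menu" of aggregation methods, loosely modelled on the
# static struct menu in r.resamp.stats/main.c. The entries are illustrative
# and are not the actual GRASS table.
METHODS = {
    "average": (mean, "average (mean) value"),
    "median": (median, "median value"),
    "minimum": (min, "lowest value"),
    "maximum": (max, "highest value"),
    "sum": (sum, "sum of values"),
}


def build_method_list(menu=METHODS):
    """Return the comma-separated method names, usable as option choices."""
    return ",".join(menu)


def find_method(name, menu=METHODS):
    """Return the function registered under `name`, or raise ValueError."""
    try:
        return menu[name][0]
    except KeyError:
        raise ValueError(
            "unknown method %r, expected one of: %s"
            % (name, build_method_list(menu))
        )
```

Once the lookup lives in one place it can be tested on its own, and a new module would only need to call find_method() rather than carry a private copy of the table.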

So far you can access these functions only from the CLI. If we
clearly separate these two levels, then we can access them using the
CLI, but also from C/Python/etc. So, for instance, if I need to select
an option from a menu list in a new module, I have to reinvent the
wheel and write my own buggy code, and as GRASS developers we end up
having duplicate buggy code in each module.

The main function could stay or be rewritten in Python; this is not
really relevant, because it just defines the CLI, calls the functions,
and finally adds the history metadata and sets the color table.

If we clearly split these two things the GRASS modules
become just an interface to some functions inside the GRASS libraries.

Many modules (if not most) are already that: a combination of GRASS library
function calls in order to achieve the specific goal the module is set out
for.

Yes, they are calling GRASS library functions, but they are also adding
functionality that (imho) should be included in the GRASS library,
because it could be useful not only for these specific modules but
also for others.

The above points are a call for refactoring of the code, not necessarily reorganizing it into different packages in different source trees.

IMHO, this will always be an issue because of the structure of GRASS: Someone develops a new module. Code is specific to this module. Then someone else develops a second module and reproduces part of the code of the former, because it does what they want and so they just copy it.
At one point several modules share the same code and it becomes clear that there is a need for this code at library level.

Unless we work with a much more centralized development system, where a limited number of developers review each proposed module and then check whether parts of the code should go into the library instead of the module, I don't really see a different way of doing things than the way that has grown organically throughout the development history of GRASS...

We could decide to organize concerted code review moments aiming at identifying relevant parts of code that should go into libraries, but I have the feeling that current ad-hoc management is more efficient: whenever the need is felt, we do it.

BTW, I don't see how separating the source trees solves this issue. Many people will still continue to code things in modules first and only after the same code is used in several modules will it become apparent that it should go into the libs.

Mmh, ok, so let's add one more layer: grass-lib, grass-py, and then the others.

So grass-lib will contain only C (C++?) code and will not be available
on PyPI.
grass-py adds the Python wrapper around grass-lib, adds the API, and goes to PyPI.

This is the same approach as GDAL [0], PROJ4 [1], and mapnik [2], which
are all available as Python packages.

I do think that adding GRASS to PyPI can only open new perspectives and
use cases, reaching a broader group of users and developers.

I have the feeling that the question of possibly extracting the Python libs into an installable PyPI package is a different issue. But then again, what use would these libraries be without the C libraries, and actually even without the modules? Yes, PyGRASS allows you to code directly with low-level routines, but for me one of the big strengths of the GRASS modular structure and our Python APIs is that they allow you to work with the modules as "functions", without forcing you to use any low-level access.

+1, this is indeed a great strength. It makes it relatively easy to string together a number of functions or to combine original code with existing functions. It makes it possible for people like me, with very limited programming experience, to write my own scripts/add-ons, which would cost considerably more effort on other platforms. I would guess that this modular structure is also one of the factors that make GRASS GIS so incredibly stable (but I have no clue whether that would change under the proposed changes).
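To make the "modules as functions" point concrete, the sketch below shows how keyword arguments can be mapped onto a module command line. It is a simplified stand-in written for this thread, not the code behind grass.script.run_command(), and the map names used in the example call are made up.

```python
def make_command(module, flags="", **options):
    """Build the argument list for a GRASS-style module call.

    Simplified illustration: flags become a single "-xyz" argument and
    every keyword option becomes "key=value".
    """
    args = [module]
    if flags:
        args.append("-" + flags)
    for key, value in options.items():
        # Lists and tuples are joined with commas, as in "input=a,b,c".
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        args.append("%s=%s" % (key, value))
    return args
```

A call such as make_command("r.resamp.stats", input="elevation", output="elevation_avg", method="average") returns a list that could be handed to subprocess, which is roughly why chaining modules from a script feels like calling ordinary functions.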

- grass-modules: provides all the GRASS core modules (this could be
also a pure python interface calling functions in the C/Python
libraries), and could be split in other sub categories (e.g. imagery,
temporal, terrain, etc).

What is the difference between these modules and the existing ones? Except
for your idea to make all modules Python modules.

Because things are complicated and the current status quo is
inflexible and/or very limited.
Basically, you can create Python packages that act as both library and
GRASS addon.

That is again another issue: how to handle addons that consist of more than just a single-file script? I'm not sure that I would agree that every time someone codes a more complex addon, the lib part should immediately go into the core GRASS libs. So, I'd rather see an easier way to handle such libs in addons. Currently, it is not easy for people (see [1] for a recent example), but even though I'm talking without enough knowledge here, I don't think it should be too difficult for addon modules to contain files that get installed in .grass7/addons/lib instead of the current mix of different solutions.

So, for instance, the r.green modules require scipy and numexpr to run,
and the only way available in 2016 is that the user has to take care of
installing the missing libraries. Even worse, if some of our modules
depend on other addons, there is no way to handle this.

I agree that dependency management between addons would be nice, but I don't see it as that much of an issue. You can always include in your module a check for the existence of another and stop with a fatal error encouraging the user to install that module.

+1, it would be nice to have some good examples of how to do that on the wiki.
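One possible shape for such a check, sketched with the Python standard library only: look for the required Python packages and executables and stop with a clear message if any are missing. The function names are hypothetical, and the sketch assumes that an installed addon is visible as an executable on PATH inside the session. In a real module the exit would presumably go through the GRASS message functions such as grass.script.fatal() rather than sys.exit().

```python
import importlib.util
import shutil
import sys


def missing_dependencies(python_packages=(), programs=()):
    """Return the names of packages and executables that cannot be found."""
    # find_spec() returns None for a top-level package that is not installed.
    missing = [p for p in python_packages if importlib.util.find_spec(p) is None]
    # shutil.which() returns None when no such executable is on PATH.
    missing += [p for p in programs if shutil.which(p) is None]
    return missing


def require(python_packages=(), programs=()):
    """Exit with an error message if any dependency is missing."""
    missing = missing_dependencies(python_packages, programs)
    if missing:
        sys.exit(
            "ERROR: missing dependencies: %s. Please install them and run "
            "the module again." % ", ".join(missing)
        )
```

An addon could call something like require(python_packages=["scipy", "numexpr"], programs=["r.some.addon"]) at the top of its main function, where r.some.addon is a placeholder name.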

AFAIU, there is also the option to use toolboxes, but personally, I haven't looked at this in detail, yet.

Another thing that does not work with the current setup is that
there is no way to say for which version of GRASS an addon is
available.

So, for instance, r.green was developed and tested with grass7 stable.
Sören has improved the pygrass vector API in trunk (thank you Sören), but
now in the grass-addons repository I have to choose whether I want to
support grass-stable or grass-trunk; I have to specify this somewhere
in the manual and hope users read it. Instead I could create a Python
package, so that installing r.green (v0.3) will use grass-lib (v7.0.x)
and installing r.green (v0.4) will use grass-lib (v7.1.x).
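The version pairing described above can be written down as data. The following is a minimal sketch, assuming a hypothetical compatibility table kept alongside the addon; with a real Python package the same information would more likely live in the package's declared requirements and be resolved by pip.

```python
# Hypothetical compatibility table: r.green release -> required grass-lib
# series (major, minor). The pairs follow the example in the text above.
REQUIRES = {
    "0.3": (7, 0),
    "0.4": (7, 1),
}


def parse_series(version):
    """Turn a version string such as '7.0.3' into its (major, minor) series."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def compatible_releases(grass_version, requires=REQUIRES):
    """List the addon releases declared for the given GRASS version."""
    series = parse_series(grass_version)
    return sorted(rel for rel, needed in requires.items() if needed == series)
```

With such a table, an installer could pick the matching addon release for the running GRASS version instead of relying on a note in the manual.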

Another option might be to distinguish in the addons repository between grass_stable and grass_trunk, possibly with some easy option to create a sort of "symbolic" link between the two for modules that run with both.

Moreover, I could
also write the module documentation using Sphinx instead of writing
HTML code.

That's a totally different issue again.

Perhaps in the future I would like to add a dedicated GUI or a web
interface to this set of modules; then I will have more dependencies,
which makes it almost impossible for an average user to use these new
features.

I think that if you go that far away from the KISS principle in GRASS module elaboration, then you are probably better off packaging the whole thing...

So basically we can either remove g.extension and rely on pip, or we
have to reinvent the wheel to get this functionality into g.extension.

I'm not familiar enough with pip to judge what this would entail for GRASS.

Sorry if I was not clear; hopefully it is now a bit clearer what kind
of advantages we (as developers) could have.

Well, I mainly find that you mix many different issues, and I'm not convinced that the proposed solution of breaking up the source tree really solves all of these.

Maybe it might be better to break up your suggestions into separate parts to discuss them individually. If at the end we see that they all point to the same solution, your argument will be even more convincing :-)

Just my 2¢,

Moritz

[1] https://lists.osgeo.org/pipermail/grass-dev/2016-March/079481.html
