On Fri, Mar 18, 2016 at 2:16 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:
On 18/03/16 12:58, Pietro wrote:
In your opinion is this true at the module level, or mostly for the wxGUI ?
No, in my opinion things are quite mixed in the C/Python modules too.
Let's start with a simple example: most of the GRASS modules mix
logic and CLI together, and several of them have a single main function with
everything inside. I think it could be useful to have a clearer
distinction between logic/algorithms and their public interface
(CLI/GUI).
Why? I really like the fact that each module works kind of like a
high-level function with a defined public interface.
Yes, and I like it too! Indeed, I don't want to change this. What I would
like to change is to better distinguish these high-level
functionalities from the low-level parts. So, for instance, just opening
a GRASS core module at random: r.resamp.stats
https://trac.osgeo.org/grass/browser/grass/trunk/raster/r.resamp.stats/main.c
Here the following are defined:
- static const struct menu, which would probably be useful for
other modules too, so it should go into the grass-lib
- static char *build_method_list(void), which could be generalized to
be used by other modules and should go into the grass-lib
- static int find_method(const char *name), which could also be generalized
to be used by other modules and should go into the grass-lib
- static void resamp_unweighted(void), which could also be
changed to be more general and moved to the grass-lib too
- static void resamp_weighted(void), same as before.
For each of the above functions we can build tests to improve
reliability, detect performance regressions and so on.
So far you can access these functions only through the CLI;
if we clearly separate these two levels, then we can access them through the
CLI, but also from C/Python/etc. As it stands, if I need to select an
option from a menu list in a new module I have to reinvent the wheel and
write my own buggy code, and as GRASS developers we end up having
duplicated buggy code in each module.
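To make the idea concrete, here is a rough Python sketch of what a shared "method menu" could look like once it lives in a library instead of in one module's main.c. The function names mirror the C ones listed above, but the Python signatures, the METHODS table and the choice of methods are only my assumptions for illustration, not existing GRASS API.

```python
# Hypothetical sketch: none of these names are existing GRASS library API.
from statistics import mean, median

# Shared "menu": method name -> aggregation function.
METHODS = {
    "average": mean,
    "median": median,
    "minimum": min,
    "maximum": max,
}


def build_method_list(menu=METHODS):
    """Return the comma-separated option string a CLI parser could advertise."""
    return ",".join(menu)


def find_method(name, menu=METHODS):
    """Look up an aggregation function by name; raise on unknown names."""
    try:
        return menu[name]
    except KeyError:
        raise ValueError(
            f"unknown method {name!r}; expected one of: {build_method_list(menu)}"
        ) from None
```

Any module that needs a method option could then import these two functions, and a single test suite would cover the lookup for all of them.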
The main function could stay or be rewritten in Python; this is not
really relevant, because it just defines the CLI,
calls the functions, and finally adds the history metadata and sets
the color table.
If we clearly split these two things, the GRASS modules
become just an interface to some functions inside the GRASS libraries.
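A minimal sketch of this split, assuming a made-up library function resample() and using argparse only as a stand-in for the real GRASS parser. The point is just that main() contains no algorithm, so the same logic is reachable both from the command line and from Python.

```python
import argparse
from statistics import mean, median


# Stand-in for a library function; hypothetical, not part of GRASS.
def resample(values, factor, method="average"):
    """Aggregate consecutive groups of `factor` values with the chosen method."""
    aggregate = {"average": mean, "median": median}[method]
    return [aggregate(values[i:i + factor]) for i in range(0, len(values), factor)]


def main(argv=None):
    """CLI layer only: parse options, call the library, print the result."""
    parser = argparse.ArgumentParser(description="Resample a list of numbers")
    parser.add_argument("values", nargs="+", type=float)
    parser.add_argument("--factor", type=int, default=2)
    parser.add_argument("--method", choices=["average", "median"], default="average")
    args = parser.parse_args(argv)
    result = resample(args.values, args.factor, args.method)
    print(" ".join(str(v) for v in result))
    return result
```

A real module's main() would also write the history metadata and set the color table after the call, as described above.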
Many modules (if not most) are already that: a combination of GRASS library
function calls in order to achieve the specific goal the module is set out
for.
Yes, they are calling GRASS library functions, but they are also adding
functionality that (imho) should be included in the GRASS library,
because it could be useful not only for these specific modules but
also for others.
So how to split GRASS? It would be nice to open a dedicated repository
(git?) for each of these projects:
- grass-lib: provides only the C and Python API. This component should be
a Python citizen, I mean that it should be available on PyPI - the
Python Package Index [1] and of course installable as a Python package
through pip;
- grass-cli: provides a shell (with no modules!), also available as a
pure Python package;
I'm not sure I like this extremely pythonic approach to GRASS. I love to use
Python for scripting GRASS, but in my perception GRASS is far from being a
Python project.
Mmh, ok, so let's add one more layer: grass-lib, grass-py, and then the others.
So grass-lib will contain only C (C++?) code and will not be available
on PyPI.
grass-py adds the Python wrapper around grass-lib, adds the API, and goes to PyPI.
This is the same approach as GDAL[0], PROJ4[1], mapnik[2], which are all
available as Python packages.
I do think that adding GRASS to PyPI can only open new perspectives and
use cases, reaching a broader group of users and developers.
- grass-modules: provides all the GRASS core modules (this could
also be a pure Python interface calling functions in the C/Python
libraries), and could be split into other subcategories (e.g. imagery,
temporal, terrain, etc).
What is the difference between these modules and the existing ones? Except
for your idea to make all modules Python modules.
Because things are complicated, and the current status quo is not
flexible and/or is very limited.
Basically you could create Python packages that act as both libraries and
grass-addons.
So for instance the r.green modules require scipy and numexpr to run,
and the only way available in 2016 is that users have to take care of
installing the missing libraries themselves; even worse, if some of our modules
depend on other add-ons there is no way to handle this.
Another thing that does not work with the current setup is that
there is no way to say for which version of GRASS an add-on is
available.
So for instance r.green was developed and tested with grass7 stable;
Sören has improved the pygrass vector API in trunk (thank you Sören), but
now in the grass-addons repository I have to choose whether I want to
support grass-stable or grass-trunk, specify this somewhere
in the manual, and hope users read it. Instead I could create a Python
package so that installing r.green (v0.3) will use grass-lib (v7.0.x) and
installing r.green (v0.4) will use grass-lib (v7.1.x). Moreover I could
also write the module documentation using Sphinx, instead of writing
HTML code.
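Just to illustrate the kind of check I mean, here is a small Python sketch of the version matching. In a real package this would not be hand-written: it would be declared as a dependency specifier (something like `grass-lib>=7.0,<7.1` in the package metadata) and pip would enforce it. The grass-lib package and the SUPPORTED table below are hypothetical; only the r.green version numbers come from the example above.

```python
# Hypothetical compatibility table: add-on release -> supported grass-lib series.
# A pip-installable "grass-lib" is the proposal here, not something that exists.
SUPPORTED = {
    "0.3": (7, 0),
    "0.4": (7, 1),
}


def parse_version(text):
    """Turn a dotted version string such as '7.0.3' into (7, 0, 3)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(addon_version, lib_version):
    """True if the library's major.minor series is the one the add-on supports."""
    return parse_version(lib_version)[:2] == SUPPORTED[addon_version]


def pick_release(lib_version):
    """Return the newest add-on release supporting the installed library, or None."""
    matches = [v for v in SUPPORTED if is_compatible(v, lib_version)]
    return max(matches, key=parse_version) if matches else None
```

With this, a user on grass-lib 7.0.x would get r.green 0.3 and a user on 7.1.x would get 0.4, without having to read the manual first.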
Perhaps in the future I would like to add a dedicated GUI to this set
of modules, or a web interface, and then I will have more dependencies,
making it almost impossible for an average user to use these new
features.
So basically we can remove g.extension and rely on pip, or we have to
reinvent the wheel to get this functionality in g.extension.
I think this idea could help mainly developers, making things clear and
well organized in different sub-projects.
I think I need more explanation in order to really understand what the
advantage would be.
Sorry if I was not clear; hopefully it is now a bit clearer what kind
of advantages we (as developers) could have.
Currently everything resides in the same repository, but there is a
distinction between modules (raster, vector, db, etc directories of the
source tree), and libraries (lib).
I'm not sure what the great advantage of separating them would be.
To me a better separation of the code functionality could help us to
(just brainstorming):
1) reduce the amount of code by removing duplication and redundancy;
2) end up with fewer functions that are better documented and tested;
3) make it easier for new developers to contribute to the project, because
they can contribute to a smaller repository with very clear aims and
objectives;
4) reach a higher level of abstraction;
5) simplify the use of GRASS as a library for other tools (e.g.
postgis, spatialite, qgis, gdal, etc).
Honestly, I'm afraid that if we start to have modules in versions that only
run with specific versions of a separate library package, we will have a
flood of new user mails complaining about things not working. The current
state of affairs at least guarantees that things are in sync.
I've never had problems with Python on this, and I see only advantages
in being able to specify which versions are supported for what. So
users could not mix versions, because pip checks for this kind of issue. And
developers can have a faster release cycle.
Opening the possibility to integrate GRASS functionality into other
open-source projects.
I don't know why other open-source projects cannot use GRASS functionality
currently. There's pygrass if people want direct library access, you can
always use system calls to modules, even more easily so with the new --exec
parameter to the startup script.
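As a rough illustration of the system-call route, a Python sketch could look like the following. It assumes the startup script is invoked as the executable, then the mapset path, then `--exec` and the module command; the executable name, the path and the module options passed in are placeholders, not tested values.

```python
import subprocess


def build_exec_command(grass_bin, mapset_path, module, **options):
    """Build the argv list for running one module through the startup script."""
    args = [grass_bin, mapset_path, "--exec", module]
    args += [f"{key}={value}" for key, value in options.items()]
    return args


def run_module(grass_bin, mapset_path, module, **options):
    """Run the module and return its standard output as text."""
    cmd = build_exec_command(grass_bin, mapset_path, module, **options)
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

For example, build_exec_command("grass", "/path/to/location/mapset", "r.univar", map="elevation") only assembles the argument list, while run_module() needs a working GRASS installation.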
Yes, indeed we are going in this direction; I think we can do it in a
more effective way.
Making things easier for both users and developers.
This solution could also help make things easier for
packagers and users; for example, users could install GRASS on all
systems (win/Mac/*nix) by running a single command:
{{{
$ pip install --user grass-lib grass-cli grass-modules grass-wx
}}}
I can already do that today in Debian:
apt-get install grass-core grass-gui grass-dev grass-doc
But that's a packaging issue.
No, it isn't only a packaging issue; it is a more general question of
how to approach problems and organize the code.
What do you think?
As I said, I find the whole approach a bit too Pythonic. Unless we decide
that GRASS is going to become a Python project with a bit of C-libraries
thrown in for performance, I need more convincing to buy this.
Mhh.. The C functionality should stay, but I do think the Python
part should become more Pythonic or, better, well integrated in the
Python ecosystem, and I think we could improve on this side.
Why are you worried about being too Pythonic?
Have a nice weekend.
Pietro
[0] https://pypi.python.org/pypi/GDAL
[1] https://pypi.python.org/pypi/pyproj
[2] https://pypi.python.org/pypi/mapnik2