[GRASS-dev] GAL Framework and GRASS rasters

Hello GRASS-dev mailing list.

Some time has passed, and I've been working on the design and implementation
of the GAL Framework. The core parts of the component architecture, with
remote execution of interface methods via the D-Bus library, are now working,
and the Python bindings are almost finished, so it's time to start thinking
about the next steps.

I heard from Martin Landa that there is a need to rewrite GRASS's raster
library, so there may be an intersection between the GRASS and GAL Framework
development efforts. If you'd be interested, we could cooperate on designing
and implementing the internal representation (file storage, DB storage) and
the presentation (API) of raster data in pure C, and I'll then just wrap this
code for GAL purposes.

I can spend the next few months on this, so is there a will to do something
with rasters in GRASS? Is anyone working on this currently? Are there any
intentions, ideas, constraints, or anything else I should know about?

Thanks for your responses. If you are interested in GAL itself, you may visit
its homepage at http://gal-framework.no-ip.org, see an article written for
the Geoinformatics FCE CTU 2007 seminar at
http://geoinformatics.fsv.cvut.cz/wiki/index.php/GAL_Framework, or contact me
for further information.

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Hi Radek.
I think the proposal is quite interesting, and it could be useful
especially if:
- it is integrated with the idea of using a variety of raster formats
  (e.g. through GDAL, via an r.external command);
- it could really act as glue between GRASS and the (apparently powerful)
  TerraLib.
pc

--
Paolo Cavallini, see: http://www.faunalia.it/pc

On Wednesday 17 October 2007 15:46:34, Paolo Cavallini wrote:

> Hi Radek.

Hi Paolo.

> I think the proposal is quite interesting, and it could be useful
> especially if:
> - it is integrated with the idea of using a variety of raster formats
>   (e.g. through GDAL, via an r.external command);
> - it could really act as glue between GRASS and the (apparently
>   powerful) TerraLib.
> pc

Yes, that was my intention and the rationale for creating the GAL Framework.

Now I would like to know what intentions GRASS developers have for rasters.
What would you like to keep, and what would you like to change? A low-level
C API for both projects could then be made.

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Radek Bartoň wrote:

> Now I would like to know what intentions GRASS developers have for
> rasters. What would you like to keep, and what would you like to
> change? A low-level C API for both projects could then be made.

We are collecting ideas for GRASS 7 at
http://grass.gdf-hannover.de/wiki/GRASS_7_ideas_collection#Raster

Markus

On Thursday 18 October 2007 21:01:47, Markus Neteler wrote:

> We are collecting ideas for GRASS 7 at
> http://grass.gdf-hannover.de/wiki/GRASS_7_ideas_collection#Raster
>
> Markus

Thanks, I read that, and I also found the page
http://grass.gdf-hannover.de/wiki/Replacement_raster_format. I have some
first thoughts. Should I discuss them here first, or may I just edit that
page?

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Hi Radek,

2007/10/22, Radek Bartoň <xbarto33@stud.fit.vutbr.cz>:

> On Thursday 18 October 2007 21:01:47, Markus Neteler wrote:
> >
> > We are collecting ideas for GRASS 7 at
> > http://grass.gdf-hannover.de/wiki/GRASS_7_ideas_collection#Raster
> >
> > Markus
>
> Thanks, I read that, and I also found the page
> http://grass.gdf-hannover.de/wiki/Replacement_raster_format. I have
> some first thoughts. Should I discuss them here first, or may I just
> edit that page?

I guess it's better to edit the wiki page directly... All ideas should be
collected there...

Martin

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

Markus:

> > We are collecting ideas for GRASS 7 at
> > http://grass.gdf-hannover.de/wiki/GRASS_7_ideas_collection#Raster

Radek:

> Thanks, I read that, and I also found the page
> http://grass.gdf-hannover.de/wiki/Replacement_raster_format. I have
> some first thoughts. Should I discuss them here first, or may I just
> edit that page?

Martin:

> I guess it's better to edit the wiki page directly... All ideas
> should be collected there...

I would say discuss ideas on the mailing list, then summarize the discussion
on the wiki. Otherwise there is no discussion; the last person to edit the
wiki page just sets the (supposed) direction. Also, ideas get much wider
exposure on the mailing list in the short term. The key, of course, is
getting an objective and unbiased expert to post the mailing-list summary on
the wiki, like KernelTrap does. :-) And also to realize that I expect the
wiki page will probably only ever be a list of ideas from the community, not
a firm specification which the folks doing the actual coding will stringently
follow (unless those coders use the wiki as their central planning tool).
Of course, ideas from all corners are important, so don't stop editing the
wiki!

2c,
Hamish

On Tuesday 23 October 2007 21:53:27, Hamish wrote:

> I would say discuss ideas on the mailing list, then summarize the
> discussion on the wiki. Otherwise there is no discussion; the last
> person to edit the wiki page just sets the (supposed) direction.
> Also, ideas get much wider exposure on the mailing list in the short
> term.

OK, here are some ideas then:

1. In reply to the discussion about metadata:

What about storing all the metadata of a raster layer in a single database
instead of in XML or other files? It doesn't have to be a full-featured DBMS
like PostgreSQL; a single file holding an SQLite database would be enough
for this purpose and, I think, the most suitable option. A minimal sketch
follows below, after the pros and cons.

Pros and cons:

+ Better performance for a larger number of layers and metadata.
- Worse performance for a smaller number of layers and metadata.
+ An easy way to implement metadata searching and similar operations.
- Metadata are separated from the data if you want to transfer layers from
  one environment to another directly.
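
Here is a minimal sketch of what this could look like with the SQLite C
API; the table layout and key names are my own invention for illustration,
not a proposed schema:

    /* Minimal sketch: per-layer metadata in a single SQLite file.
     * The "metadata" table and its columns are illustrative only. */
    #include <stdio.h>
    #include <sqlite3.h>

    int store_metadata(const char *db_path, const char *layer,
                       const char *key, const char *value)
    {
        sqlite3 *db;
        char *err = NULL;
        char sql[512];

        if (sqlite3_open(db_path, &db) != SQLITE_OK)
            return 1;

        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS metadata ("
            "layer TEXT, key TEXT, value TEXT)", NULL, NULL, NULL);

        /* real code should use sqlite3_prepare_v2() with bound
         * parameters instead of building the statement as a string */
        snprintf(sql, sizeof(sql),
                 "INSERT INTO metadata VALUES ('%s', '%s', '%s')",
                 layer, key, value);

        if (sqlite3_exec(db, sql, NULL, NULL, &err) != SQLITE_OK) {
            fprintf(stderr, "SQLite error: %s\n", err);
            sqlite3_free(err);
            sqlite3_close(db);
            return 1;
        }
        sqlite3_close(db);
        return 0;
    }

Searching then becomes a plain SQL query, e.g.
SELECT layer FROM metadata WHERE key = 'title' AND value LIKE '%elevation%'.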

2. In reply to the discussion about 3D rasters and time series:

What about approaching rasters as 1-to-N-dimensional "Rubik's" cubes? There
would be specified methods for swapping dimensions and for warping using
functions such as summarization, averaging, and clipping, and it would be up
to the library user to decide what each dimension means.
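
A rough sketch of such a dimension-agnostic descriptor (all names are
invented for illustration, not a proposed GRASS API):

    /* Sketch of a dimension-agnostic raster descriptor; names are
     * illustrative only. */
    #include <stddef.h>

    #define MAX_DIMS 8

    typedef struct {
        int    ndims;             /* 1..MAX_DIMS (x, y, z, time, ...)  */
        size_t size[MAX_DIMS];    /* number of cells along each axis   */
        double origin[MAX_DIMS];  /* coordinate of cell 0 on each axis */
        double res[MAX_DIMS];     /* cell size along each axis         */
    } RasterCube;

    /* Swap the meaning of two axes, e.g. view an (x, y, t) series as
     * (t, y, x), without touching the stored data. */
    static void cube_swap_dims(RasterCube *c, int a, int b)
    {
        size_t ts = c->size[a];
        double to = c->origin[a], tr = c->res[a];

        c->size[a] = c->size[b];     c->size[b] = ts;
        c->origin[a] = c->origin[b]; c->origin[b] = to;
        c->res[a] = c->res[b];       c->res[b] = tr;
    }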

3. In reply to the pyramids vs. tiles discussion:

I think we should specify the raster library API first, so that we define
what we want to achieve, and think about the particular aspects of the
implementation later. Of course it won't be as simple as it sounds, and it
will take a certain amount of programming experience to handle, but I think
that in this case it should be transparent, because the library user will
want to get some area of raster data and won't care how it is organized
internally.

On the other hand, different approaches are suitable for different tasks. In
this case, for example, a tile-based structure is more suitable for computing
on rasters, while pyramids are more suitable for visualising data on the Web
or in 3D. I think the core API should be implementation independent, but some
sort of extension mechanism should be defined too (as in, e.g., OpenGL). Such
an extension would then treat rasters as tiles, and another extension would
build a quad-tree structure over the tiles to support pyramids.
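
A sketch of what an OpenGL-style extension mechanism could look like here;
all the names are hypothetical:

    /* The core API only promises a query function; anything tile- or
     * pyramid-specific lives behind an extension. Hypothetical names. */
    typedef struct {
        const char *name;                  /* e.g. "RASTER_ext_tiles" */
        void *(*get_proc)(const char *func_name);
    } RasterExtension;

    /* Returns NULL if the extension is unavailable, so callers must
     * always be prepared to fall back to the core API. */
    const RasterExtension *raster_query_extension(const char *name);

    static void build_pyramids_if_supported(int map_fd)
    {
        const RasterExtension *ext =
            raster_query_extension("RASTER_ext_pyramids");

        if (ext) {
            /* look up the extension entry point, dlsym-style */
            void (*build)(int, int) =
                (void (*)(int, int))ext->get_proc("pyramid_build");

            if (build)
                build(map_fd, 4);   /* e.g. four pyramid levels */
        }
    }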

4. About the function signature format:

If we stick to a function signature format like:

int G_function_name(<structure> *context, <argument>, <argument>, ...,
                    <out_argument>, <out_argument>, ...);

where the return value is an error code and the first argument is a pointer
to a structure carrying information about the context (mapset, layer, etc.),
it would be easier to implement reentrancy for such functions and to bind
the API to object-oriented languages like C++ or Python, although SWIG would
need a special note about the output arguments in its interface files.
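
To make the convention concrete, a hedged illustration (GContext and the
function below are hypothetical, not existing GRASS API):

    /* Hypothetical example of the proposed convention. */
    typedef struct {
        const char *mapset;
        const char *layer;
        /* ... per-call state instead of library-wide statics ... */
    } GContext;

    /* Returns an error code (0 = success); inputs come first and
     * outputs last, which maps cleanly onto SWIG's OUTPUT typemaps. */
    int G_raster_get_cell(GContext *ctx, int row, int col,
                          double *out_value);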

Please comment on these ideas, but please don't eat me alive. :-)

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Radek Bartoň wrote:

> > I would say discuss ideas on the mailing list, then summarize the
> > discussion on the wiki. Otherwise there is no discussion; the last
> > person to edit the wiki page just sets the (supposed) direction.
> > Also, ideas get much wider exposure on the mailing list in the
> > short term.
>
> OK, here are some ideas then:
>
> 1. In reply to the discussion about metadata:
>
> What about storing all the metadata of a raster layer in a single
> database instead of in XML or other files? It doesn't have to be a
> full-featured DBMS like PostgreSQL; a single file holding an SQLite
> database would be enough for this purpose and, I think, the most
> suitable option.
>
> Pros and cons:
>
> + Better performance for a larger number of layers and metadata.
> - Worse performance for a smaller number of layers and metadata.
> + An easy way to implement metadata searching and similar operations.
> - Metadata are separated from the data if you want to transfer layers
>   from one environment to another directly.

It might be useful for category attributes (i.e. the way it's used for
vectors), but I don't think that it makes sense for per-map metadata.

> 2. In reply to the discussion about 3D rasters and time series:
>
> What about approaching rasters as 1-to-N-dimensional "Rubik's" cubes?
> There would be specified methods for swapping dimensions and for
> warping using functions such as summarization, averaging, and
> clipping, and it would be up to the library user to decide what each
> dimension means.

One of the main issues here is efficiency. The existing mechanism
means that you don't have to read and decode the entire map if you're
working at a reduced resolution; any rows which aren't used in the
scaled-down version are simply skipped.

> 3. In reply to the pyramids vs. tiles discussion:
>
> I think we should specify the raster library API first, so that we
> define what we want to achieve, and think about the particular
> aspects of the implementation later. Of course it won't be as simple
> as it sounds, and it will take a certain amount of programming
> experience to handle, but I think that in this case it should be
> transparent, because the library user will want to get some area of
> raster data and won't care how it is organized internally.

Unless you're planning on re-writing much of GRASS from scratch, you
need to preserve the G_get_raster_row() etc interface. You can have
other interfaces as well, but the legacy interface must remain for the
foreseeable future.

> 4. About the function signature format:
>
> If we stick to a function signature format like:
>
> int G_function_name(<structure> *context, <argument>, <argument>, ...,
>                     <out_argument>, <out_argument>, ...);
>
> where the return value is an error code and the first argument is a
> pointer to a structure carrying information about the context (mapset,
> layer, etc.), it would be easier to implement reentrancy for such
> functions and to bind the API to object-oriented languages like C++ or
> Python, although SWIG would need a special note about the output
> arguments in its interface files.

This amounts to a complete re-write. Apart from the vast number of
static variables (libgis alone has ~180), each library has its own
context, so e.g. most vector functions would need both a vector
context and a libgis context; higher-level libraries would need even
more.

--
Glynn Clements <glynn@gclements.plus.com>

On Wednesday 24 October 2007 02:41:41, Glynn Clements wrote:

> It might be useful for category attributes (i.e. the way it's used
> for vectors), but I don't think that it makes sense for per-map
> metadata.

Can you explain in more detail why it doesn't make sense, please? I think
that using an SQLite .db file is not much more complex than any
configuration file format, and it offers much better possibilities for
implementing any kind of operation over such metadata, although another
disadvantage is that the format is not directly human-readable. I refer to
Martin Landa's presentation about the lack of metadata support in GRASS.

> > What about approaching rasters as 1-to-N-dimensional "Rubik's"
> > cubes? There would be specified methods for swapping dimensions and
> > for warping using functions such as summarization, averaging, and
> > clipping, and it would be up to the library user to decide what
> > each dimension means.
>
> One of the main issues here is efficiency. The existing mechanism
> means that you don't have to read and decode the entire map if you're
> working at a reduced resolution; any rows which aren't used in the
> scaled-down version are simply skipped.

Of course, this kind of efficiency would have to be considered in the
hypercube approach too. That is what I meant by "specified methods for
certain warping": the user would specify which dimensions, at what
resolution, with what boundaries, and in what projection he/she wants, and
how "scaling" should be handled (low-pass filter, nearest value, reclass,
etc.) via the function's context. An advanced cache system and file-based or
virtual-file-system disk storage would also need to be implemented to
provide that.

Can anyone please explain to me (quickly and in principle) how
G_get_raster_row() manages to return a raster row at the resolution defined
by the current window efficiently, without precomputed pyramids or a similar
mechanism? I searched for that kind of information in the GRASS Programmer's
Manual with no luck, and I'm too scared to dig it out of the source code. :-)

> Unless you're planning on re-writing much of GRASS from scratch, you
> need to preserve the G_get_raster_row() etc. interface. You can have
> other interfaces as well, but the legacy interface must remain for
> the foreseeable future.

A wrapper exposing a less general interface on top of a more general
interface can be written at any time; the problem would be efficiency. But
there is a question you need to answer for yourselves: do you want top
performance with a low-level approach, or do you want scalability and
extensibility, catching up on performance through distributed computation?
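
For instance, a hedged sketch of such a wrapper (new_raster_read_region()
is an invented name, not an existing GRASS function; the G_get_raster_row()
signature is the real one from grass/gis.h):

    /* Legacy row-based call implemented on top of a hypothetical more
     * general interface. */
    int G_get_raster_row(int fd, void *buf, int row,
                         RASTER_MAP_TYPE data_type)
    {
        /* read exactly one region row through the general interface */
        return new_raster_read_region(fd, buf, /* start_row */ row,
                                      /* n_rows */ 1, data_type);
    }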

> This amounts to a complete re-write. Apart from the vast number of
> static variables (libgis alone has ~180), each library has its own
> context, so e.g. most vector functions would need both a vector
> context and a libgis context; higher-level libraries would need even
> more.

I see, you are not willing to make radical changes, and that is OK for
proper release engineering. Can anyone then collect at least a list of the
functions that must necessarily be preserved and of those that could be
deprecated, to give a picture of what can be done with rasters while causing
the least pain?

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Radek Bartoň wrote:

> > It might be useful for category attributes (i.e. the way it's used
> > for vectors), but I don't think that it makes sense for per-map
> > metadata.
>
> Can you explain in more detail why it doesn't make sense, please? I
> think that using an SQLite .db file is not much more complex than any
> configuration file format, and it offers much better possibilities
> for implementing any kind of operation over such metadata, although
> another disadvantage is that the format is not directly
> human-readable. I refer to Martin Landa's presentation about the lack
> of metadata support in GRASS.

In retrospect, I can see the point if you have one database for the
whole mapset. The downside is that it complicates sharing mapsets,
although that issue already applies to the attribute database.

> > > What about approaching rasters as 1-to-N-dimensional "Rubik's"
> > > cubes? There would be specified methods for swapping dimensions
> > > and for warping using functions such as summarization, averaging,
> > > and clipping, and it would be up to the library user to decide
> > > what each dimension means.
> >
> > One of the main issues here is efficiency. The existing mechanism
> > means that you don't have to read and decode the entire map if
> > you're working at a reduced resolution; any rows which aren't used
> > in the scaled-down version are simply skipped.
>
> Of course, this kind of efficiency would have to be considered in the
> hypercube approach too. That is what I meant by "specified methods
> for certain warping": the user would specify which dimensions, at
> what resolution, with what boundaries, and in what projection he/she
> wants, and how "scaling" should be handled (low-pass filter, nearest
> value, reclass, etc.) via the function's context. An advanced cache
> system and file-based or virtual-file-system disk storage would also
> need to be implemented to provide that.

> Can anyone please explain to me (quickly and in principle) how
> G_get_raster_row() manages to return a raster row at the resolution
> defined by the current window efficiently, without precomputed
> pyramids or a similar mechanism? I searched for that kind of
> information in the GRASS Programmer's Manual with no luck, and I'm
> too scared to dig it out of the source code. :-)

The row argument passed to G_get_raster_row() is converted from the
region's grid to the raster's grid (by compute_window_row in
get_row.c) to determine which row to read from the file.

If the region resolution is coarser than the map's resolution, a
contiguous sequence of region rows will result in a discontiguous
sequence of raster rows.
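
In outline (a simplified sketch of the arithmetic, not the actual
get_row.c code):

    /* Map a region row index to the raster's native row index:
     * compute the northing of the region row's centre, then locate
     * it in the raster's grid (nearest neighbour). */
    static int region_row_to_raster_row(int region_row,
                                        double region_north,
                                        double region_ns_res,
                                        double raster_north,
                                        double raster_ns_res)
    {
        double northing =
            region_north - (region_row + 0.5) * region_ns_res;

        return (int)((raster_north - northing) / raster_ns_res);
    }

Raster rows which no region row maps onto are never read at all.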

> > Unless you're planning on re-writing much of GRASS from scratch,
> > you need to preserve the G_get_raster_row() etc. interface. You can
> > have other interfaces as well, but the legacy interface must remain
> > for the foreseeable future.
>
> A wrapper exposing a less general interface on top of a more general
> interface can be written at any time; the problem would be
> efficiency. But there is a question you need to answer for
> yourselves: do you want top performance with a low-level approach, or
> do you want scalability and extensibility, catching up on performance
> through distributed computation?

The ability to perform "draft" calculations on huge maps in a
reasonable timeframe is quite important.

> > This amounts to a complete re-write. Apart from the vast number of
> > static variables (libgis alone has ~180), each library has its own
> > context, so e.g. most vector functions would need both a vector
> > context and a libgis context; higher-level libraries would need
> > even more.
>
> I see, you are not willing to make radical changes, and that is OK
> for proper release engineering. Can anyone then collect at least a
> list of the functions that must necessarily be preserved and of those
> that could be deprecated, to give a picture of what can be done with
> rasters while causing the least pain?

You can obtain details of function usage with the tools/sql.sh script.

--
Glynn Clements <glynn@gclements.plus.com>

Hi Glynn.

First of all, thanks for your answers.

On Sunday 28 October 2007 22:46:05, Glynn Clements wrote:

> Radek Bartoň wrote:

> The row argument passed to G_get_raster_row() is converted from the
> region's grid to the raster's grid (by compute_window_row in
> get_row.c) to determine which row to read from the file.
>
> If the region resolution is coarser than the map's resolution, a
> contiguous sequence of region rows will result in a discontiguous
> sequence of raster rows.

So some further computation over the high-resolution data is needed.

Is the number of columns in the returned row that of the finest resolution
or of the coarse one?

Are rows from the finer resolution that are not in the coarse resolution
simply skipped, or is there a possibility of applying, for example, a
low-pass filter or a reclassification during the rescale?

--
Bc. Radek Bartoň

Faculty of Information Technology
Brno University of Technology

E-mail: xbarto33@stud.fit.vutbr.cz
Web: http://blackhex.no-ip.org
Jabber: blackhex@jabber.cz

Radek Bartoň wrote:

> > The row argument passed to G_get_raster_row() is converted from
> > the region's grid to the raster's grid (by compute_window_row in
> > get_row.c) to determine which row to read from the file.
> >
> > If the region resolution is coarser than the map's resolution, a
> > contiguous sequence of region rows will result in a discontiguous
> > sequence of raster rows.
>
> So some further computation over the high-resolution data is needed.
>
> Is the number of columns in the returned row that of the finest
> resolution or of the coarse one?

Coarse. All raster data provided to modules conforms to the region
used by the module. This ensures consistency when reading multiple
raster maps with different grids. Unless explicitly set by the module,
the region is set automatically from $GRASS_REGION, the file specified
by $WIND_OVERRIDE, or the WIND file.

Most modules neither know nor care about the original grid of a raster
map. The main exceptions are the various r.resamp.* modules, which
specifically read the raster at its native resolution.

> Are rows from the finer resolution that are not in the coarse
> resolution simply skipped, or is there a possibility of applying, for
> example, a low-pass filter or a reclassification during the rescale?

No, they're just skipped.

The automatic resampling is limited to nearest-neighbour resampling.
If you want anything else, you need to explicitly resample the map
with one of the r.resamp.* modules.

--
Glynn Clements <glynn@gclements.plus.com>