Comments on scripts, modeling and Grass

I comment here on the several reactions to my message
"Re: Forestry and Grass + scripts for models"

Issue 1: Are "classic" programming languages (i.e., FORTRAN, c, c++...)
adequate for GIS modeling?

Rich Shepard <rshepard@appl-ecosys.com>

" It was only when I moved to microcomputers and C (in the mid-1980's) that
I learned the value of commenting code. Scientists aren't taught this;
programmers are.

My code now is at least 50% comments; often much more. I explain why I did
something a certain way or what I was trying to accomplish. The effort is
repaid manyfold when I look at old code years later. :-)"

Well, this demonstrates that C is not a good language for modeling, as you
have to repeat each line in a different language. Would English be an
adequate language for this message if I had to repeat every paragraph in a
different language because virtually nobody (not even myself!) could
understand what I'm saying? (OK, sometimes my English can be so obscure
that this problem could actually be real!).

Note that I'm talking about modeling. I understand that, in other
applications, FORTRAN, C, C++,..., can be the best alternative. But, in
modeling, often you cannot evaluate the program just by the result. If you
write a program to display an image, you can easily detect an error in the
code because the display would be wrong. Or if you write a program to
calculate an FFT you can use a simple input and compare the result to your
own computation or to the result produced by another FFT program: the
result is predictable. Instead, in modeling, the result is not predictable
(yet): that's why you make the model. Also, you can have a CORRECT result
out of a WRONG model: the result must be produced by processes that are
physically, biologically, ecologically... consistent, and this should be
assessed by other scientists being able to read the code. Which implies a
simple syntax with, ideally, one action per line.

Bernhard Reiter <bernhard@uwm.edu> has a different opinion:

This is pretty much a question of style. Good C programs are very readable.
(C++ is more difficult.)

I think that this could be true for very skilled C "writers" and "readers"
conforming to the same style (and not only to the same language!). I would
argue that the time devoted to the programming itself would be of the same
order of magnitude as the time devoted to thinking about the processes to be
modeled. Take the example of writing this message: I can concentrate
on the idea that I want to communicate and spend a minimum time on the
language itself (I know, most of you would prefer that I spend more
time on improving my language..., but you get the idea).

Issue 2: Are the scripting languages around convenient alternatives?

Bernhard Reiter <bernhard@uwm.edu>:

Any scripting language will do that. The shell does it.
I would recommend python. (www.python.org)
But you also mean GRASS functions.

It is clear that almost ANY scripting language (even the c-shell)
can be used for a simple flow of grass commands: most of us use this method
every day. The problem is when, within the model, you must
ANALYSE the result of a grass command and the subsequent
actions depend upon the result of this analysis.
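
To make the point concrete, here is a minimal sketch in Python of a step
that analyses the output of a command and then decides what to do next.
It assumes the command prints lines of the form key=value; the function
names, the "mean" key and the threshold rule are hypothetical, chosen only
for illustration, and are not taken from any grass module.

```python
import subprocess


def command_output(argv):
    """Run a command (for example a grass module) and return its stdout."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


def parse_key_value(text):
    """Turn lines of the form key=value into a dict of floats.

    Lines without "=" or with a non-numeric value are skipped.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            continue
    return stats


def choose_next_step(stats, threshold):
    """Hypothetical model rule: iterate again while the mean is too high."""
    return "iterate" if stats.get("mean", 0.0) > threshold else "stop"
```

In a real script the text would come from command_output() applied to
whichever command reports the statistics, and the returned string would
select the next grass command to run.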

Leonard Coop <coopl@ava.bcc.orst.edu>:

Here is a snippet from a perl script (I didn't write it but I use it)
running GRASS showing how you can use perl files as standard i/o to
write to GRASS functions, in this case r.mapcalc:

   open RMAPCALC, "|r.mapcalc" or die "Can't launch r.mapcalc";
   # STEP 5: ADD CORRECTIONS
   print RMAPCALC "$corrname = ($basename * $rationame)/1000\n";
   # STEPS 6: FORCE NEGATIVES TO ZERO
   print RMAPCALC "$outname = max($corrname,0)\n";
   close RMAPCALC;

Well, here we have 5 lines for what should be just one,
r.mapcalc "out = max( ($basename * $rationame)/1000, 0)"

...not to mention the annoying and redundant "$": if we write
a program, the default should be that the names of the variables represent
the variables. Or should I sign this message as $AgustinLobo? If I sign
as Agustin Lobo, would anybody understand that my name has written the
message?
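
For comparison, the same two steps can be sketched in Python. This is only
an illustration under the same assumption the perl snippet makes, namely
that r.mapcalc reads its expression from standard input; the function
names and the "command" parameter are hypothetical.

```python
import subprocess


def mapcalc_expression(out_map, base_map, ratio_map):
    """Build one expression: apply the correction, force negatives to zero."""
    return f"{out_map} = max(({base_map} * {ratio_map}) / 1000, 0)"


def run_mapcalc(expression, command=("r.mapcalc",)):
    """Send the expression to the command on its standard input.

    Assumes, as in the perl snippet, that r.mapcalc accepts expressions
    on stdin. Raises CalledProcessError if the command fails.
    """
    subprocess.run(list(command), input=expression + "\n",
                   text=True, check=True)
```

A call such as run_mapcalc(mapcalc_expression("out", "base", "ratio"))
would then do the work of the perl lines above inside a grass session.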

Bernhard Reiter <bernhard@uwm.edu>:

Perl generally is a greater mess than "C". It is much harder to write
good structured programs in perl.

But,

"Roderick A. Anderson" <raanders@altoplanos.net>:

Despite what others may think or say perl is an excellent language for
simple to very complex projects. It is only as obscure as you want it to
be.

Maybe we could do the following: use a short model as an example and code
it in different scripting languages.

ISSUE 3. Are "high level" languages (i.e., Splus (R as free alternative),
Matlab (Octave as free alternative), IDL ...) adequate for GIS modeling?

Rich Shepard <rshepard@appl-ecosys.com>:

I've known of S-Plus and I just discovered R. I think that it
would be ideal to link GRASS and R for analytical purposes.

"Pete St. Onge" <pete@seul.org>

Alternatively, R has a batch mode that can be called from the shell. You
could actually build an R script programmatically from another app once
the data files have been created, then launch R to do batch processing
by running the script.

Yes, as with Splus and, probably, Matlab. The problem, as I said, is
that the communication between R and Grass would be through writing
and reading files: this is slow (remember a model can have thousands
of iterations) and uses more disk space (which is becoming a minor
problem, though).
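
The file-based exchange described above could be sketched as follows, here
in Python. The script name, the file paths and the "value" column are
hypothetical; the sketch only writes the R script and builds the batch
command line ("R CMD BATCH" in current R releases), it does not run R.

```python
from pathlib import Path


def write_r_script(csv_path, out_path, directory):
    """Write a small R script that reads one file and writes one result."""
    script = Path(directory) / "model_step.R"  # hypothetical script name
    lines = [
        f'cells <- read.csv("{csv_path}")',
        "result <- data.frame(mean = mean(cells$value))",
        f'write.csv(result, "{out_path}", row.names = FALSE)',
    ]
    script.write_text("\n".join(lines) + "\n")
    return script


def batch_command(script):
    """Command line that runs the script in R's batch mode."""
    return ["R", "CMD", "BATCH", "--vanilla", str(script)]
```

Every iteration of a model would repeat this cycle of writing a file,
starting R and reading a file back, which is where the cost in time and
disk space comes from.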

Another problem with Splus (I think that R is better but don't know
how much better) is that it is very inefficient with FOR loops, as it
allocates all the memory at once. This is a big problem for simulations,
which typically have long loops.

ISSUE 4. The use of Grass in modeling

Bruce Byars <Bruce_Byars@baylor.edu>

What we have been very successful in is using grass
as a kind of "front-end" processor to develop/analyze/display data
for use in complex stand-alone models. In our experience this is
the best of both worlds. It is not really time-consuming to write
either scripts or C programs for this type of data development.
In this way, a user can use whatever model they are comfortable
with, using grass for data development and display.

For example, we use grass to develop input data into the SWAT
watershed hydrology model. SWAT is a very large and complex
model, but using grass we significantly cut down the time to develop
inputs and analyze results spatially.

Yes, this is an obvious application. But I think that Grass could
(should) be used for the actual modeling as well.

Final comment: using C for writing is like making a toy out of raw
plastic material: you can do whatever you want but it is difficult and,
probably, the parts will not be used by others. Using grass commands
within a script (either a shell script or Splus, Matlab etc) is more like
using a LEGO: you have more constraints, but the parts are standard,
which implies that many others use them and can detect errors.

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo@ija.csic.es
http://pangea.ija.csic.es/alobo

Hello All

I don't know whether this is relevant to this discussion, but as I recall,
when the first PC version of Grass was launched by LAS about 5 years ago,
it contained some sort of Grass 'Visual Toolbox', allowing you to link
various Grass routines together directly through the GUI. The publicity
gave the impression that this would be similar to linking modelling
components together in a program such as ModelMaker. I've not tried
Grassland, so I can't say how good this approach is, but has anyone tried
to use this feature regularly, and would an extension of that approach
provide both power and simplicity for the non-programmer that many comments
suggest? Unfortunately, I suspect that the code is not in the public
domain however.

Roy

On Wed, 27 Oct 1999, Agustin Lobo wrote:

> Well, this demonstrates that C is not a good language for modeling, as you
> have to repeat each line in a different language. Would English be an
> adequate language for this message if I had to repeat every paragraph in a
> different language because virtually nobody (even not myself!) could
> understand what I'm saying? (OK, sometimes my English can be so obscure
> that this problem could actually be real!).

Augustin,

  I did not make myself clear. C code is very easy to read and understand,
however, regardless of what language one uses (C, FORTRAN, English or
Spanish), _why_ you did something a certain way or in a certain order is not
always intuitively obvious.

  Let me try an analogy or two: Have you ever tried baking breads? Do you
realize that the sequence in which the ingredients are added can make a big
difference in the texture and taste of the final product? That's why recipes
are written down. And, if we try to do it from memory (and we haven't made
that recipe in a while), we may be disappointed with the results if we
change the sequence.

  Another situation: major construction work in the city's downtown, core
area. We read about it in the newspaper, but we forget just how we got to
our destination. The next time we need to go to the same place, we get stuck
in traffic and wonder how we managed to avoid it the last time.

  In both analogies (probably pretty poor for 6 am), the material (bread
making technique or vehicle type) is not the issue. The issue is the
process. Same with coding models. It's not the complexity of the language
which needs commenting, it's the algorithm.

  If you want to use something written in English (for example), try writing
scientific models in COBOL. Plain English words. But, my son told me when he
wrote insurance company software as his first job out of college, it can be
as obtuse to understand as the most obfuscated C.

> Note that I'm talking about modeling. I understand that, in other
> applications, FORTRAN, C, C++,..., can be the best alternative. But, in
> modeling, often you cannot evaluate the program just by the result. If you
> write a program to display an image, you can easily detect an error in the
> code because the display would be wrong. Or if you write a program to
> calculate an FFT you can use a simple input and compare the result to your
> own computation or to the result produced by another FFT program: the
> result is predictable. Instead, in modeling, the result is not predictable
> (yet): that's why you make the model. Also, you can have a CORRECT result
> out of a WRONG model: the result must be produced by processes that are
> physically, biologically, ecologically... consistent, and this should be
> assessed by other scientists being able to read the code. Which implies a
> simple syntax with, ideally, one action per line.

  My point exactly. The coding language is immaterial. You want a way to
understand the program's logic and algorithm. I suggest that comments in the
source code are the only way to do that. Well, ... you could always write an
accompanying "users manual" which explained the logic and algorithms, but
then that would probably get misplaced.

  Trying to identify the one, "right" language or tool will lead you in
circles. We've all suffered through discussions (arguments, flame wars,
whatever) over which programming editor is best (emacs vs. vi in the world
of unices), which linux distribution is best, which programming language is
best, and so on. In my opinion, two of the reasons there are so many
scripting languages available in the world of linux are 1) someone wanted to
do something which is not well or easily done with existing tools and 2)
there's ego gratification to produce a tool used and admired by others.

  So much depends on what you want to do. I prefer to grab the closest tool
-- the one I know the best -- to solve the problem and move on with the
answer. Others prefer to focus on selecting exactly the right tool for the
job, even if that takes precedence over getting the job done. We all have
different priorities, needs and preferences. One of the things I like best
about the linux environment is the abundant choice of tools and solutions.
Everything will work, and we can each pick our solution of choice.

Rich

Dr. Richard B. Shepard, President

                       Applied Ecosystem Services, Inc. (TM)
              Making environmentally-responsible mining happen. (SM)
                       --------------------------------
            2404 SW 22nd Street | Troutdale, OR 97060-1247 | U.S.A.
+ 1 503-667-4517 (voice) | + 1 503-667-8863 (fax) | rshepard@appl-ecosys.com

Roy Sanderson wrote:

> Hello All
>
> I don't know whether this is relevant to this discussion, but as I recall,
> when the first PC version of Grass was launched by LAS about 5 years ago,
> it contained some sort of Grass 'Visual Toolbox', allowing you to link
> various Grass routines together directly through the GUI. The publicity
> gave the impression that this would be similar to linking modelling
> components together in a program such as ModelMaker. I've not tried
> Grassland, so I can't say how good this approach is, but has anyone tried
> to use this feature regularly, and would an extension of that approach
> provide both power and simplicity for the non-programmer that many comments
> suggest? Unfortunately, I suspect that the code is not in the public
> domain however.
>
> Roy

I was never very impressed with the implementation of this feature in
Grassland, as the various options and values were not graphically
presented. In other words, the visual representation of the model did
not tell you all the details that you'd want to know to understand the
relationship between components. ESRI is working on something along this
line for ArcView Spatial Analyst; it'll be interesting to see if they do
a better job.
--
Best regards,
  -Malcolm

Malcolm D. Williamson malcolm@cast.uark.edu
Center for Advanced Spatial Technologies Voice: 501-575-2734
12 Ozark Hall Fax: 501-575-5218
University of Arkansas, Fayetteville AR 72701
http://www.cast.uark.edu/

This is a very interesting discussion,
I only want to add two things:

  From my perspective, we should have more structure in this
discussion, because Rich is right about the difference between
algorithms and their implementation. The problem of how to select
the programming language you want to use is a general one and quite
common in computer science. In our discussion we mixed a few issues
that are better treated separately, such as modularisation, programming
style and modelling.

  Secondly, I want to add to the discussion about programming
languages in general. Yes, there are better and worse programming languages,
as one would expect. And the style matters a lot.
A good programmer can use almost any language to write good readable
programs. But it is easier with some languages to do what you want to do.

For that matter, a set of modern all-purpose "glue" languages
is on the rise. You can use them for everything, but their main
advantage is that they can call libraries and other programs and
functions very easily.

Better known languages are:
  perl
  python
  java
  guile (/scheme)

On Wed, Oct 27, 1999 at 11:55:26AM +0100, Agustin Lobo wrote:

> Or if you write a program to calculate an FFT you can use a simple
> input and compare the result to your own computation or to the result
> produced by another FFT program: the result is predictable. Instead,
> in modeling, the result is not predictable (yet): that's why you make
> the model. Also, you can have a CORRECT result out of a WRONG model:
> the result must be produced by processes that are physically,
> biologically, ecologically... consistent, and this should be assessed
> by other scientists being able to read the code. Which implies a
> simple syntax with, ideally, one action per line.

We have another problem here: how to model.
And yes, the model implementation has to be accessible to other
scientists. I did quite a bit of work in implementing models
(e.g. CemoS (nonspatial exposure modelling)
http://www.usf.uni-osnabrueck.de/projects/CemoS/CemoS.en.html
and http://www.usf.uni-osnabrueck.de/projects/GREAT-ER/GREAT-ER.html )
and I can tell you that most models are not designed to be that way,
because they start out as experiments of scientists.
We tried to do better, though.

> Bernhard Reiter <bernhard@uwm.edu> has a different opinion:
>
> >This is pretty much a question of style. Good C programs are very readable.
> >(C++ is more difficult.)
>
> I think that this could be true for very skilled C "writers" and "readers"
> conforming to the same style (and not only to the same language!). I would
> argue that the time devoted to the programming itself would be of the same
> order of magnitude as the time devoted to thinking about the processes to be
> modeled. Take the example of writing this message: I can concentrate
> on the idea that I want to communicate and spend a minimum time on the
> language itself (I know, most of you would prefer that I spend more
> time on improving my language..., but you get the idea)

I am talking about the ability to express yourself in a programming
language. But you are right: the resulting program is, for the most part,
the content. That means that a good programming language has to
let you express your mental model quickly and accurately.

Issue 2: Are the scripting languages around convenient alternatives?

Whether you have a scripting language or not is not the question, as the
borders are blurring more and more.

[I have not referred to the programming example, as it is very
language specific and there are a lot of details to that discussion. ]

Bernhard Reiter <bernhard@uwm.edu>:
>Perl generally is a greater mess than "C". It is much harder to write
>good structured programs in perl.

"Roderick A. Anderson" <raanders@altoplanos.net>:
>Despite what others may think or say perl is an excellent language for
>simple to very complex projects. It is only as obscure as you want it to
>be.

True.
But from the programming language theory and usability studies,
python is a lot better in that respect.
  http://www.python.org/doc/Comparisons.html
Warning: This is a point where religious wars start.
I am not saying that perl is bad. It is nice, if you can handle it.

> Maybe we could do the following: use a short model as an example and code
> it in different scripting languages.

It would depend on how they integrate with the task.
Usually you build object-oriented packages to aid such a task.

ISSUE 3. Are "high level" languages (i.e., Splus (R as free alternative),
Matlab (Octave as free alternative), IDL ...) adequate for GIS modeling?

High-level in this respect means specialised for one topic.
This is not what you would call a high-level programming language
among programming languages. :-)
The problem with specialised languages is that they require you to
learn more.

You could use a "glue" language to drive both.

  Bernhard

--
Research Assistant, Geog Dept UM-Milwaukee, USA. (www.uwm.edu/~bernhard)
Free Software Projects and Consulting (intevation.de)
Association for a Free Informational Infrastructure (ffii.org)

I agree with many of the latest comments, but I also think that
we should try to address a more concrete question now.

We should not get into the discussion of what computing language
is best, in general. This language just does not exist: information on the
different programming languages is reasonably well distributed
and most people programming are not stupid and try to use at least a
good one. Therefore, under these circumstances, if such an optimal
language existed it would have been selected and the
rest would have gone extinct. But, as in Nature, such an optimality
just does not exist and there are multiple solutions that survive
because they can do reasonably well somewhere, sometime.

The question here is to decide whether we should select a
language to become the standard for "gluing" grass commands
into more complex processes. Officially, the favored
language is just the bsh since Grass3.x. I'm guilty of the
crime of not having ever followed such advice and used
the csh. The truth is that, for simple flows of grass commands,
including simple interactions with the user, even the bsh or
the csh can do the job. The point here is to decide whether we
could select a different language to deal with
more complicated processes. The alternatives are:

1. Use a classic programming language, such as C.
2. Use a classic script language, such as bsh, csh, tcl.
3. Use a "modern" script language, such as perl, python...
(sorry here: the distinction between "classic" and "modern"
scripting languages is initially based on "the-ones-I-know"
and "the-ones-that-I-don't-know", but some inputs in this
discussion seem to suggest that this distinction is correct).
4. Use a high level language, such as R, Octave (S, Matlab
and IDL are not public domain).

I think that we now have points for and against all of these
options and perhaps should pause the discussion.
I will try to prepare an example that could let us
try the different options.

Having a common decision would be of benefit for everyone, as we
could share a sort of meta-Grass.

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo@ija.csic.es
http://pangea.ija.csic.es/alobo

I evaluated a student version of Grassland and we even
converted some of our csh scripts to tcl (a pain).
The Visual toolbox was probably there because
writing tcl scripts is more difficult than writing
csh scripts. Anyway, these toolboxes (originally
developed in Khoros, then in Erdas Imagine and others)
can only write simple flows, and I think that they are
more nice tools for salespeople to attract customers
than really useful instruments.

In other respects, Grassland had many bugs but was impressive
in many ways. For example, the display was independent of
the projection of the location, so you could display your
data in a different projection. Also, you could call Excel
files linked to your sites maps.

On Wed, 27 Oct 1999, Roy Sanderson wrote:

> Hello All
>
> I don't know whether this is relevant to this discussion, but as I recall,
> when the first PC version of Grass was launched by LAS about 5 years ago,
> it contained some sort of Grass 'Visual Toolbox', allowing you to link
> various Grass routines together directly through the GUI. The publicity
> gave the impression that this would be similar to linking modelling
> components together in a program such as ModelMaker. I've not tried
> Grassland, so I can't say how good this approach is, but has anyone tried
> to use this feature regularly, and would an extension of that approach
> provide both power and simplicity for the non-programmer that many comments
> suggest? Unfortunately, I suspect that the code is not in the public
> domain however.
>
> Roy

On Thu, 28 Oct 1999, Agustin Lobo wrote:

> The question here is to decide whether we should select a language to
> become the standard for "gluing" grass commands into more complex
> processes. Officially, the favored language is just the bsh since
> Grass3.x. I'm guilty of the crime of not having ever followed such advice
> and used the csh. The truth is that, for simple flows of grass commands,
> including simple interactions with the user, even the bsh or the csh can
> do the job. The point here is to decide whether we could select a
> different language to deal with more complicated processes. The
> alternatives are:

Agus,

  Allow me to suggest that this is not the best use of everyone's time and
effort. As you point out in the paragraph following the one I quote above,
you know shell scripting, but not more generic scripting languages. We're
all in a similar position.

  My opinion on establishing a "standard" meta-language for GRASS is to not
do it.

  We all have different needs and different skills. Whatever works best for
each of us is good enough. Even if we want to package a particular solution
for distribution to others, we can do so using the language of our choice.
The end user could not care less which language we choose.

  However, there is value in providing a set of common hooks (which I
believe already exists within GRASS) to pipe input and output not only
among GRASS modules but also with external programs. As long as this API is
well documented we can use whatever "glue" language we prefer.

  Personally, I like C, and I always marvel at the proliferation of
scripting languages. New ones show up quite frequently on freshmeat and I
have to keep at least a half-dozen on my systems because different
application programs are dependent on different scripting languages. Puzzles
me no end why we have this situation. But, I'm content to remain puzzled.
:-)

  I'm curious, and not challenging you with any negative intent, to know
what prompted you to raise this issue? Specifically, what is the problem for
which you are seeking a solution? Again, this is just idle curiosity as I
work my way through the first mug of coffee this morning.

Rich

Dr. Richard B. Shepard, President

                       Applied Ecosystem Services, Inc. (TM)
              Making environmentally-responsible mining happen. (SM)
                       --------------------------------
            2404 SW 22nd Street | Troutdale, OR 97060-1247 | U.S.A.
+ 1 503-667-4517 (voice) | + 1 503-667-8863 (fax) | rshepard@appl-ecosys.com

On Thu, Oct 28, 1999 at 06:15:34AM -0700, Rich Shepard wrote:

On Thu, 28 Oct 1999, Agustin Lobo wrote:

> The question here is to decide whether we should select a language to
> become the standard for "gluing" grass commands into more complex
> processes.

>   My opinion on establishing a "standard" meta-language for GRASS is to not
> do it.

I second that.
There would be advantages in doing so, but staying independent is even
better in the long run.

>   However, there is value in providing a set of common hooks (which I
> believe already exists within GRASS) to pipe input and output not only
> among GRASS modules but also with external programs. As long as this API is
> well documented we can use whatever "glue" language we prefer.

Correct.
If the hooks were some distributed object protocol, that would make
life easier for "glueing". Standards for that are CORBA and XML-RPC.

Did you look at www.swig.org? It is actually a technology for providing
such hooks to higher level programming languages.
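
As a rough illustration of such a hook over XML-RPC, here is a minimal
sketch using the XML-RPC modules of Python's standard library. The hook
name run_module and the helper functions are hypothetical, nothing like
this is part of GRASS, and the server is bound to the local machine only.
A real hook would also restrict which commands may be run.

```python
import subprocess
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def run_module(argv):
    """Hypothetical hook: run one command-line module, return its stdout."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


def build_server(host="127.0.0.1", port=0):
    """Create an XML-RPC server exposing run_module (port 0: any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(run_module, "run_module")
    return server


def call_once(server, argv):
    """Serve in a background thread, make one remote call, then shut down."""
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    host, port = server.server_address
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            return proxy.run_module(argv)
    finally:
        server.shutdown()
        server.server_close()
```

Any "glue" language with an XML-RPC client could then call run_module
without caring how the module itself is implemented.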

>   Personally, I like C, and I always marvel at the proliferation of
> scripting languages.

Well, there is a reason for that.
But let me add that it is not a scripting/interpreted versus compiled
languages issue. It is about modern programming paradigms, like
object-orientated and dynamic. And you can program better in them for
most, but not all tasks.

> New ones show up quite frequently on freshmeat

perl, python and scheme have been around for a long, long time. :-)

  Bernhard
--
Research Assistant, Geog Dept UM-Milwaukee, USA. (www.uwm.edu/~bernhard)
Free Software Projects and Consulting (intevation.de)
Association for a Free Informational Infrastructure (ffii.org)