[GRASS-dev] r.mapcalc addition

Hi all,

It'd be great if someone could commit the attached addition to r.mapcalc

It basically allows one to specify the seed for random number
generation by adding a new function "erand(a,b,s)" which takes a and b
as the lower and upper limit, and s is a seed that is used to initiate
the pseudo random number generator.

This is useful in modelling and simulation so that particular
replicates can be rerun exactly.

If it needs any extra work let me know and I'll make the changes.

patched files (r.mapcalc.diff):
raster/r.mapcalc/function.c
raster/r.mapcalc/Makefile
raster/r.mapcalc/func_proto.h
raster/r.mapcalc/r.mapcalc.html

additional file with erand function code:
raster/r.mapcalc/xerand.c

Cheers,

--
-Joel

"Wish not to seem, but to be, the best."
                -- Aeschylus

(attachments)

rmapcalc.tar.gz (1.51 KB)

Joel Pitt wrote:

> It'd be great if someone could commit the attached addition to r.mapcalc
>
> It basically allows one to specify the seed for random number
> generation by adding a new function "erand(a,b,s)" which takes a and b
> as the lower and upper limit, and s is a seed that is used to initiate
> the pseudo random number generator.
>
> This is useful in modelling and simulation so that particular
> replicates can be rerun exactly.
>
> If it needs any extra work let me know and I'll make the changes.

I consider this to be the wrong approach. At a minimum, setting the
seed should be decoupled from the generation of random numbers;
there's no reason to have what is essentially another copy of
f_rand().

Also, it's debatable whether an r.mapcalc function is the correct
mechanism for setting an operating parameter. An environment variable
might be more suitable.

Finally, even if f_erand() was the correct interface, having it call
f_rand() once the seed has been set is preferable to including a clone
of f_rand() in the body of f_erand().
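
A minimal C sketch of the decoupling described above, not the actual r.mapcalc code: set_seed(), rand_range() and erand_range() are hypothetical names, and the standard srand()/rand() pair stands in for whatever PRNG r.mapcalc really uses. Seeding is its own step, and a combined interface, if still wanted, becomes a thin wrapper rather than a second copy of the generator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: seeding is its own step, independent of any draw. */
void set_seed(unsigned int seed)
{
    srand(seed);
}

/* Hypothetical stand-in for rand(a,b): an integer in [lo, hi).
 * The modulo introduces a slight bias, which this sketch ignores. */
int rand_range(int lo, int hi)
{
    assert(lo < hi);
    return lo + rand() % (hi - lo);
}

/* If a combined erand(a,b,s) interface were still wanted, it could be a
 * thin wrapper instead of a clone of the generator code. */
int erand_range(int lo, int hi, unsigned int seed)
{
    set_seed(seed);
    return rand_range(lo, hi);
}
```

Calling set_seed() with the same value before a run reproduces the same sequence of draws, which is the replicate-rerun use case from the original request.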

--
Glynn Clements <glynn@gclements.plus.com>

On 10/24/06, Glynn Clements <glynn@gclements.plus.com> wrote:

> I consider this to be the wrong approach. At a minimum, setting the
> seed should be decoupled from the generation of random numbers;
> there's no reason to have what is essentially another copy of
> f_rand().
>
> Also, it's debatable whether an r.mapcalc function is the correct
> mechanism for setting an operating parameter. An environment variable
> might be more suitable.
>
> Finally, even if f_erand() was the correct interface, having it call
> f_rand() once the seed has been set is preferable to including a clone
> of f_rand() in the body of f_erand().

Okay, that seems reasonable. I went the way I did because, as far as I'm
aware, there was no existing system in mapcalc for options, e.g. -seed=2342,
and at the time I altered it I was still getting to grips with GRASS.
It is probably for the same reason that I didn't consider environment
variables either.

If I alter f_rand() to check an environment variable (GRASS_SEED or
something similar), would that be suitable for inclusion?

--
-Joel

"Wish not to seem, but to be, the best."
                -- Aeschylus

Glynn Clements wrote:

> Joel Pitt wrote:
>
> > It'd be great if someone could commit the attached addition to
> > r.mapcalc
> >
> > It basically allows one to specify the seed for random number
> > generation by adding a new function "erand(a,b,s)" which takes a and
> > b as the lower and upper limit, and s is a seed that is used to
> > initiate the pseudo random number generator.
> >
> > This is useful in modelling and simulation so that particular
> > replicates can be rerun exactly.
> >
> > If it needs any extra work let me know and I'll make the changes.
>
> I consider this to be the wrong approach. At a minimum, setting the
> seed should be decoupled from the generation of random numbers;
> there's no reason to have what is essentially another copy of
> f_rand().
>
> Also, it's debatable whether an r.mapcalc function is the correct
> mechanism for setting an operating parameter. An environment variable
> might be more suitable.
>
> Finally, even if f_erand() was the correct interface, having it call
> f_rand() once the seed has been set is preferable to including a clone
> of f_rand() in the body of f_erand().

what if r.mapcalc's rand(a,b) fn had an optional third argument? ie the
seed. rand(a,b) would work as it currently does, but rand(a,b,seed)
would be available if you wanted it.

??
Hamish

Joel Pitt wrote:

> > I consider this to be the wrong approach. At a minimum, setting the
> > seed should be decoupled from the generation of random numbers;
> > there's no reason to have what is essentially another copy of
> > f_rand().
> >
> > Also, it's debatable whether an r.mapcalc function is the correct
> > mechanism for setting an operating parameter. An environment variable
> > might be more suitable.
> >
> > Finally, even if f_erand() was the correct interface, having it call
> > f_rand() once the seed has been set is preferable to including a clone
> > of f_rand() in the body of f_erand().
>
> Okay, that seems reasonable. I went the way I did because, as far as I'm
> aware, there was no existing system in mapcalc for options, e.g. -seed=2342,
> and at the time I altered it I was still getting to grips with GRASS.
> It is probably for the same reason that I didn't consider environment
> variables either.
>
> If I alter f_rand() to check an environment variable (GRASS_SEED or
> something similar), would that be suitable for inclusion?

That's one option, although putting it at the top-level (at the
beginning of execute() in evaluate.c) is more robust. If another
function which uses the PRNG is added later, you probably want the
seed to affect everything which uses the PRNG regardless of the
evaluation order.
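
A rough C sketch of that top-level approach, under stated assumptions: GRASS_SEED is only the placeholder name floated earlier in the thread, parse_seed(), seed_from_env() and execute_sketch() are hypothetical, and srand() stands in for the real PRNG. The seed is read and validated once, before any expression is evaluated, so every function that later draws from the PRNG is affected regardless of evaluation order.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal seed; returns 1 on success and stores it in *seed. */
int parse_seed(const char *text, unsigned int *seed)
{
    char *end;
    unsigned long value;

    assert(seed != NULL);
    if (text == NULL || *text == '\0')
        return 0;
    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0' || value > UINT_MAX)
        return 0;
    *seed = (unsigned int) value;
    return 1;
}

/* Seed the PRNG from the named environment variable, if it is set and
 * valid. Returns 1 if the PRNG was seeded, 0 if nothing was done. */
int seed_from_env(const char *name)
{
    unsigned int seed;

    if (!parse_seed(getenv(name), &seed))
        return 0;
    srand(seed);
    return 1;
}

/* Hypothetical stand-in for the top-level entry point: seed once,
 * before any expression is evaluated. */
void execute_sketch(void)
{
    seed_from_env("GRASS_SEED");    /* placeholder variable name */
    /* ... evaluate the expression row by row ... */
}
```

When the variable is unset or malformed, the sketch leaves the PRNG alone, so existing behaviour is unchanged unless a seed is explicitly supplied.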

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:

> > > It'd be great if someone could commit the attached addition to
> > > r.mapcalc
> > >
> > > It basically allows one to specify the seed for random number
> > > generation by adding a new function "erand(a,b,s)" which takes a and
> > > b as the lower and upper limit, and s is a seed that is used to
> > > initiate the pseudo random number generator.
> > >
> > > This is useful in modelling and simulation so that particular
> > > replicates can be rerun exactly.
> > >
> > > If it needs any extra work let me know and I'll make the changes.
> >
> > I consider this to be the wrong approach. At a minimum, setting the
> > seed should be decoupled from the generation of random numbers;
> > there's no reason to have what is essentially another copy of
> > f_rand().
> >
> > Also, it's debatable whether an r.mapcalc function is the correct
> > mechanism for setting an operating parameter. An environment variable
> > might be more suitable.
> >
> > Finally, even if f_erand() was the correct interface, having it call
> > f_rand() once the seed has been set is preferable to including a clone
> > of f_rand() in the body of f_erand().
>
> what if r.mapcalc's rand(a,b) fn had an optional third argument? ie the
> seed. rand(a,b) would work as it currently does, but rand(a,b,seed)
> would be available if you wanted it.

That could be problematic if you have more than one call to rand(). We
don't actually specify the evaluation order (GRASS functions don't
have side effects).

If r.mapcalc used G_parser(), a seed=... option would be the obvious
choice, but it doesn't, and that can't easily be changed. An
environment variable is probably the next best thing. For scripts,
setting an environment variable is typically easier than substituting
into an r.mapcalc expression (particularly if you are reading the
expression from a file).

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

> > what if r.mapcalc's rand(a,b) fn had an optional third argument? ie
> > the seed. rand(a,b) would work as it currently does, but
> > rand(a,b,seed) would be available if you wanted it.
>
> That could be problematic if you have more than one call to rand(). We
> don't actually specify the evaluation order (GRASS functions don't
> have side effects).
>
> If r.mapcalc used G_parser(), a seed=... option would be the obvious
> choice, but it doesn't, and that can't easily be changed. An
> environment variable is probably the next best thing. For scripts,
> setting an environment variable is typically easier than substituting
> into an r.mapcalc expression (particularly if you are reading the
> expression from a file).

if you have more than one call to rand() and both are seeded with the
same value (GRASS_RND_SEED or whatever), won't you get the same "random"
number both times?

should the seed be read on startup in main.c, not xrand.c ..?

Hamish

On 10/24/06, Hamish <hamish_nospam@yahoo.com> wrote:

> Glynn Clements wrote:
> > > what if r.mapcalc's rand(a,b) fn had an optional third argument? ie
> > > the seed. rand(a,b) would work as it currently does, but
> > > rand(a,b,seed) would be available if you wanted it.
> >
> > That could be problematic if you have more than one call to rand(). We
> > don't actually specify the evaluation order (GRASS functions don't
> > have side effects).
> >
> > If r.mapcalc used G_parser(), a seed=... option would be the obvious
> > choice, but it doesn't, and that can't easily be changed. An
> > environment variable is probably the next best thing. For scripts,
> > setting an environment variable is typically easier than substituting
> > into an r.mapcalc expression (particularly if you are reading the
> > expression from a file).
>
> if you have more than one call to rand() and both are seeded with the
> same value (GRASS_RND_SEED or whatever), won't you get the same "random"
> number both times?
>
> should the seed be read on startup in main.c, not xrand.c ..?

I implemented a static variable in rand function, so that it will only
be read the first time rand() is called - independent of how many
rand() occurrences are in the expression.

However, I'll follow Glynn's suggestion and read the seed in the
evaluate() function. This will be more robust if other functions are
implemented that call the RNG.

On this topic, would there be any call for implementing more
complicated probability distribution functions? Or is the philosophy
of mapcalc to have the simplest elements necessary?

--
-Joel

"Wish not to seem, but to be, the best."
                -- Aeschylus

Joel Pitt wrote:

> On this topic, would there be any call for implementing more
> complicated probability distribution functions? Or is the philosophy
> of mapcalc to have the simplest elements necessary?

anything from the GRASS <-> R-stats interface that you think should be
in GRASS itself?

http://cran.au.r-project.org/src/contrib/Descriptions/grasper.html
http://cran.au.r-project.org/src/contrib/Descriptions/gstat.html

(grasp is developed by Landcare Research)

Hamish

Joel Pitt wrote:

> > > > what if r.mapcalc's rand(a,b) fn had an optional third argument? ie
> > > > the seed. rand(a,b) would work as it currently does, but
> > > > rand(a,b,seed) would be available if you wanted it.
> > >
> > > That could be problematic if you have more than one call to rand(). We
> > > don't actually specify the evaluation order (GRASS functions don't
> > > have side effects).
> > >
> > > If r.mapcalc used G_parser(), a seed=... option would be the obvious
> > > choice, but it doesn't, and that can't easily be changed. An
> > > environment variable is probably the next best thing. For scripts,
> > > setting an environment variable is typically easier than substituting
> > > into an r.mapcalc expression (particularly if you are reading the
> > > expression from a file).
> >
> > if you have more than one call to rand() and both are seeded with the
> > same value (GRASS_RND_SEED or whatever), won't you get the same "random"
> > number both times?
> >
> > should the seed be read on startup in main.c, not xrand.c ..?
>
> I implemented a static variable in rand function, so that it will only
> be read the first time rand() is called - independent of how many
> rand() occurrences are in the expression.

Note that this is necessary even if rand() only occurs once in the
expression, as the function (f_rand()) gets called once per row. You
don't want the PRNG to get re-seeded at the beginning of each row ;)
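
To illustrate the per-row point, here is a small C sketch with a hypothetical fill_row() standing in for a function that, like f_rand(), is called once per row; srand()/rand() are stand-ins for the real PRNG. The static flag ensures the seed is applied on the first call only, so later rows continue the sequence instead of restarting it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-row function such as f_rand(): fills one
 * row with integers in [lo, hi), seeding the PRNG on the first call only. */
void fill_row(int *row, int ncols, int lo, int hi, unsigned int seed)
{
    static int seeded = 0;      /* persists across calls */
    int i;

    assert(row != NULL && lo < hi);
    if (!seeded) {
        srand(seed);
        seeded = 1;
    }
    for (i = 0; i < ncols; i++)
        row[i] = lo + rand() % (hi - lo);
}
```

Without the flag, every row would be seeded afresh and would contain identical values.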

> However, I'll follow Glynn's suggestion and read the seed in the
> evaluate() function. This will be more robust if other functions are
> implemented that call the RNG.
>
> On this topic, would there be any call for implementing more
> complicated probability distribution functions? Or is the philosophy
> of mapcalc to have the simplest elements necessary?

A Gaussian distribution might be useful, as you can't readily
implement that using existing r.mapcalc functions (the usual
implementation uses iteration, which r.mapcalc doesn't support).

For enhancements which might be useful to other users, I'd suggest
posting an announcement (or the code itself) to the list. If enough
people express an interest in having the functionality built in, we
can add it.

Adding new functions is quite straightforward; OTOH, we don't want to
bloat r.mapcalc with code which is unlikely to be used by anyone but
its author.

For more complex tasks, there's always R/GRASS. r.mapcalc will always
be bound by its structural limitations (row-by-row processing,
inability to define new functions, etc).

--
Glynn Clements <glynn@gclements.plus.com>

On 10/25/06, Glynn Clements <glynn@gclements.plus.com> wrote:

> > However, I'll follow Glynn's suggestion and read the seed in the
> > evaluate() function. This will be more robust if other functions are
> > implemented that call the RNG.
> >
> > On this topic, would there be any call for implementing more
> > complicated probability distribution functions? Or is the philosophy
> > of mapcalc to have the simplest elements necessary?
>
> A Gaussian distribution might be useful, as you can't readily
> implement that using existing r.mapcalc functions (the usual
> implementation uses iteration, which r.mapcalc doesn't support).
>
> For enhancements which might be useful to other users, I'd suggest
> posting an announcement (or the code itself) to the list. If enough
> people express an interest in having the functionality built in, we
> can add it.
>
> Adding new functions is quite straightforward; OTOH, we don't want to
> bloat r.mapcalc with code which is unlikely to be used by anyone but
> its author.
>
> For more complex tasks, there's always R/GRASS. r.mapcalc will always
> be bound by its structural limitations (row-by-row processing,
> inability to define new functions, etc).

I've attached a diff of evaluate.c and the related html files that I'm aware of.

Turns out it was a lot simpler to add the functionality as an environment
variable :)

--
-Joel

"Wish not to seem, but to be, the best."
                -- Aeschylus

(attachments)

r.mapcalc.diff (2.36 KB)

Hi there, I was just checking the latest CVS and saw that my r.mapcalc
addition hadn't been added yet.

I was hoping someone with appropriate permissions could do so.

My dispersal simulation tool (which I hope to release properly in a
few months) requires the addition to function correctly.

Much thanks,
Joel

---------- Forwarded message ----------
From: Joel Pitt <joel.pitt@gmail.com>
Date: Oct 26, 2006 9:39 PM
Subject: Re: [GRASS-dev] r.mapcalc addition
To: Glynn Clements <glynn@gclements.plus.com>
Cc: Hamish <hamish_nospam@yahoo.com>, grass-dev@grass.itc.it

On 10/25/06, Glynn Clements <glynn@gclements.plus.com> wrote:

> > However, I'll follow Glynn's suggestion and read the seed in the
> > evaluate() function. This will be more robust if other functions are
> > implemented that call the RNG.
> >
> > On this topic, would there be any call for implementing more
> > complicated probability distribution functions? Or is the philosophy
> > of mapcalc to have the simplest elements necessary?
>
> A Gaussian distribution might be useful, as you can't readily
> implement that using existing r.mapcalc functions (the usual
> implementation uses iteration, which r.mapcalc doesn't support).
>
> For enhancements which might be useful to other users, I'd suggest
> posting an announcement (or the code itself) to the list. If enough
> people express an interest in having the functionality built in, we
> can add it.
>
> Adding new functions is quite straightforward; OTOH, we don't want to
> bloat r.mapcalc with code which is unlikely to be used by anyone but
> its author.
>
> For more complex tasks, there's always R/GRASS. r.mapcalc will always
> be bound by its structural limitations (row-by-row processing,
> inability to define new functions, etc).

I've attached a diff of evaluate.c and the related html files that I'm aware of.

Turns out it was a lot simpler to add the functionality as an environment
variable :)

--
-Joel

"Wish not to seem, but to be, the best."
                -- Aeschylus

--
-Joel

"Unless you try to do something beyond what you have mastered, you
will never grow." -C.R. Lawton

(attachments)

r.mapcalc.diff (2.36 KB)

Joel Pitt wrote:

> Hi there, I was just checking the latest CVS and saw that my r.mapcalc
> addition hadn't been added yet.
>
> I was hoping someone with appropriate permissions could do so.

Done.

--
Glynn Clements <glynn@gclements.plus.com>