Vaclav Petras wrote:
> > > Shouldn't the seed be generated from, e.g., the OS time,
> > > which would ensure that each run gives a different result?
> >
> > No. The reason is to provide reproducibility. Anyone running the same
> > command with the same data should obtain the same result.
> Does the reproducibility go beyond one operating system, compiler or
> library?
If drand48() is used, yes. If rand() is used, no.
> I don't think that the first random number is specified by the C
> language standard.
The C standard doesn't specify any particular implementation for
rand() (it does give an example implementation, but it only produces
15-bit values). It does specify that if the PRNG isn't explicitly
seeded, the behaviour is as if srand(1) was called beforehand.
[§7.20.2.2p2]
IOW, the sequence of results is implementation-dependent, but it may
not change from one run to the next unless the program explicitly
seeds the PRNG with a non-deterministic value such as the current
time.
> If the results were really reproducible, it would be good for the
> testing framework, but I'm afraid that they are not (with my limited
> knowledge about the topic).
In ticket #2272, I attached a portable implementation of lrand48(). If
desired, we could add this to libgis and use that in preference to any
implementation-specific PRNG.
> > If you want a different result each time, set GRASS_RND_SEED to a
> > different value each time, e.g.
> >
> > GRASS_RND_SEED=`date +%N` r.mapcalc "a = rand(0,100)"
> >
> > [%N is the nanoseconds portion of the current time; this is a GNU
> > extension.]
> I've heard that this is not enough on powerful computers/clusters, and
> that you also have to use the PID because the nanoseconds might be the
> same (I think I remember that it was nanoseconds, not seconds).
The main issue is on systems where the reported time only changes in
increments of a scheduler "tick" (e.g. 10ms on old versions of Linux).
> > > On a related note, it would be nice to be able to set the seed (I think
> > > there has been such a request before, but not sure about the answer at
> > > that time).
> >
> > GRASS_RND_SEED was the answer.
> I think there should be some possibility of randomization (auto-setting of
> the seed) built into the modules providing random(ized) results. Perhaps a
> flag which would turn it on. It could also be an option which would behave
> like GRASS_RND_SEED but would have one special value for auto-generating
> the seed. (GRASS_RND_SEED, if present, would override this option.) In
> choosing the default value of the option, we should ask what the expected
> behavior of a module giving random results actually is.
That's certainly reasonable. The main thing is that I believe that
reproducibility should be the default. If people have to take explicit
action to introduce randomness, they're more likely to consider the
issues involved. If randomised seeds are the default, the lack of
reproducibility may not be considered until it is too late.
--
Glynn Clements <glynn@gclements.plus.com>