On Wed, Oct 14, 2015 at 12:55 PM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:
On Wed, Oct 14, 2015 at 10:50 AM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:
Some additional clues:
The original stack was 365 maps with 3105 x 7025 cells.
1. zooming into a smaller region (30 x 40 cells) and running
t.rast.series 100x resulted in 100 "correct" maps, no errors.
2. returning to the full extent and running t.rast.series 30x on the
first 31 maps resulted in 30 "correct" maps, no errors.
3. returning to the full extent and running t.rast.series 30x on the
last 31 maps resulted in 30 "correct" maps, no errors.
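For context, each "run t.rast.series Nx" test above was essentially a
loop like the following sketch; the STRDS name "gdd", the aggregation
method, and the failure logging are my shorthand here, not the exact
script. For test 1, the computational region was first shrunk with
g.region before running the same loop.

# repeat the aggregation and log any failures
for i in $(seq 1 100); do
    t.rast.series --q --o input=gdd output=gdd_test_${i} method=average \
        || echo "run ${i} failed" >> t.rast.series_errors.log
done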
So, it seems that t.rast.series (r.series) throws an error, or
generates wrong output, when a large set of maps is supplied as input
and the region has a moderately large number of total cells.
Yeah, I know, that isn't very specific. I will try re-compiling with
debugging and no optimization next.
Dylan
More data,
1. re-compiled with CFLAGS="-g -Wall":
* Multiple runs of t.rast.series with the full stack (365 maps with
3105 x 7025 cells), no errors.
* each run required about 8.5 minutes to complete
2. re-compiled with CFLAGS="-O2 -mtune=native -march=native" LDFLAGS="-s":
* 10x tests with full stack, no errors
* each run required about 3.5 minutes
3. re-ran original script (see listing below)
* random errors from t.rast.series
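(As an aside, the two builds above were produced roughly like this; a
sketch only, with configure options omitted and the job count assumed:

# debug build
make distclean
CFLAGS="-g -Wall" ./configure
make -j4

# optimized build
make distclean
CFLAGS="-O2 -mtune=native -march=native" LDFLAGS="-s" ./configure
make -j4
)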
This doesn't make much sense to me. The only difference between my
latest "tests" and the original code is that the input to
t.rast.series was static over the course of my "tests", vs. dynamic
within the original code (see below). For my tests, I purposely
selected a stack that had previously caused t.rast.series to throw an
error.
OK, this does make sense: t.rast.series (r.series) was not the source
of the problems. I was able to verify this by running t.univar on the
output from the previous step, the GDD calculation shown here:
# NOTE: 4 CPUs so that external disk isn't thrashed
gdd_max_C=30
gdd_min_C=10
gdd_base_C=10
t.rast.mapcalc --q --o nprocs=4 input=tmin_subset,tmax_subset \
    output=gdd basename=gdd \
    expr="max(((min(tmax_subset, $gdd_max_C) + max(tmin_subset, $gdd_min_C)) / 2.0) - $gdd_base_C, 0)"
... which means that t.rast.mapcalc was generating one (or more)
outputs with some kind of problem, which was then causing t.univar and
t.rast.series to fail.
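The t.univar check itself was nothing fancy; something like this
sketch, where a damaged map shows up as an "Error reading raster data
for row ..." message while scanning the series (the output file name
is my own):

# scan every map registered in the STRDS; stats go to a file,
# read errors go to the terminal
t.univar input=gdd > gdd_univar_stats.txt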
The inputs to t.rast.mapcalc are files that were registered with
r.external. I suspect that the multiple concurrent r.mapcalc instances
may be to blame. I don't have an explanation, only some evidence from
the last time I encountered this type of issue. The workflow then was:
1. spawn 8 concurrent processes via backgrounding: r.sun -> r.mapcalc
2. when finished with daily solar models, sum maps with r.series
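In rough outline, that workflow looked like the sketch below; the map
names, the unit conversion factor, and the 8-way job throttling are
illustrative assumptions, not the original script:

# 1. daily solar model + r.mapcalc post-processing, 8 jobs at a time
for day in $(seq 1 365); do
    (
      r.sun elevation=elev day=${day} glob_rad=rad_${day}
      r.mapcalc expression="rad_MJ_${day} = rad_${day} * 0.0036"
    ) &
    # crude throttle: pause for the batch to finish every 8 jobs
    [ $(( day % 8 )) -eq 0 ] && wait
done
wait
# 2. sum the daily maps
r.series input=$(seq -s, -f 'rad_MJ_%g' 1 365) output=rad_annual method=sum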
I would occasionally encounter the "Error reading raster data for row
xxx" error from r.series in that case, and assumed that r.series had
somehow broken the map in question.
It would seem that concurrent use of r.mapcalc may be worth
investigating... however, it is strange that the error only occurs
sometimes.
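One way to test that hypothesis: re-run the GDD step with nprocs=1 and
check whether the output is always clean. A sketch, using the same
names and variables as above:

# serial re-run; if concurrent r.mapcalc is the culprit, this should
# never produce damaged maps
t.rast.mapcalc --q --o nprocs=1 input=tmin_subset,tmax_subset \
    output=gdd_serial basename=gdd_serial \
    expr="max(((min(tmax_subset, $gdd_max_C) + max(tmin_subset, $gdd_min_C)) / 2.0) - $gdd_base_C, 0)"
t.univar input=gdd_serial > /dev/null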
Oddly enough, I didn't have problems with maps generated with the
following (similar) code:
# spring frost
# if tmin never drops below 0 before the start of summer, then the
# last "spring frost" is on day 0
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=spring_frost \
    basename=spring_frost \
    expr="if(start_doy() < 182, if(tmin < 0, start_doy(), 0), null())"
# fall frost
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=fall_frost \
    basename=fall_frost \
    expr="if(start_doy() > 213, if(tmin < 0, start_doy(), 365), null())"
Dylan