[GRASS-dev] Error reading raster data for row xxx (only when using r.series and t.rast.series)

Hi,

Over the last couple of years I have noticed very strange raster
corruption (?) issues when using r.series, and now more recently,
t.rast.series. Typically, I'll generate a large number of maps with
r.sun or t.rast.mapcalc and then aggregate the series with r.series or
t.rast.series. About 50% of the time the command runs as expected, the
other half of the time r.series or t.rast.series gives me an error
like this:

Error reading raster data for row xxx (testmap)

After this error, I can no longer perform any kind of operation on map
"testmap" without the dreaded Error reading raster data for row xxx...

The situation was worse when using a MASK map, possibly related to a
similar (fixed?) issue discussed in this thread:

http://lists.osgeo.org/pipermail/grass-dev/2015-July/075627.html

Within that thread, Glynn mentioned that this type of error was
probably related to pthreads and concurrent processes. The temporary
fix entailed:

export WORKERS=0

I have tried this on my machine but the results are the same,
non-deterministic corruption (?) of one input to r.series or
t.rast.series.

I have encountered this error on several disks, mirrored HDDs, single
HDD, and now on an SSD. I don't think that this is a disk problem,
rather, something that r.series or t.rast.series is "doing" to the
files it operates on.

Is there some possibility that one of these commands is leaving a file
"open" or in some kind of intermediate state that prevents subsequent
commands from accessing the file?

I'll try to create a sample dataset to send over. In the meantime is
there any kind of diagnostic information that I can report back with?

Thanks,
Dylan

On 12/10/15 23:35, Dylan Beaudette wrote:

The situation was worse when using a MASK map, possibly related to a
similar (fixed?) issue discussed in this thread:

http://lists.osgeo.org/pipermail/grass-dev/2015-July/075627.html

Are you using a mask, as was the case in the thread you cite?

Moritz

Hi Moritz,

Not using a MASK in this case. Fairly basic work-flow, using "tmin"
and "tmax" (Space Time Raster Dataset) loaded via r.external.

# make a where clause for finding the current year
  wc="strftime('%Y', start_time) = '"$year"'"

  # extract current dataset: "copies" by reference
  t.rast.extract --q --o input=tmin output=tmin_subset \
    basename=tmin_subset where="$wc"
  t.rast.extract --q --o input=tmax output=tmax_subset \
    basename=tmax_subset where="$wc"

  # GDD for current year: slow from external rasters
  # NOTE: 2 CPUs so that disk isn't thrashed
  t.rast.mapcalc --o nprocs=2 method=equal \
    input=tmin_subset,tmax_subset output=gdd basename=gdd \
    expr="(((min((tmax_subset * 1.8 + 32.0), 86.0) + max((tmin_subset * 1.8 + 32.0), 50)) / 2) - 50)"

... the error occurs at this step in a non-deterministic pattern.
Strangely enough, the errors are more frequent during the morning
hours vs. afternoon hours!

  # sum GDD for this year: fast from SSD, but only single CPU thread
  t.rast.series --o --q in=gdd out=gdd_$year method=sum

Thanks,
Dylan

Hi Markus,

GRASS version information:

./configure --without-odbc --without-mysql --with-readline
--with-cxx --enable-largefile --with-freetype
--with-freetype-includes=/usr/include/freetype2 --with-sqlite
--with-python --with-pthread --with-geos=/usr/local/bin/geos-config
--without-opencl --with-opencl-includes=/usr/include/CL/
--with-postgres --with-postgres-includes=/usr/include/postgresql/
--with-postgres-libs=/usr/lib/
--with-proj-share=/usr/local/share/proj/

libgis Revision: 64732
libgis Date: 2015-02-24 16:54:05 -0800 (Tue, 24 Feb 2015)

PROJ.4: 4.8.0
GDAL/OGR: 2.0.0
GEOS: 3.4.2
SQLite: 3.7.9

I find it strange that I have encountered this mysterious error
(occasionally) over the last couple of years while tracking
grass_trunk.

One other piece of data; I have never encountered this error with any
other GRASS modules... just r.series and t.rast.series.

Thanks,
Dylan

On Tue, Oct 13, 2015 at 3:09 AM, Markus Neteler <neteler@osgeo.org> wrote:

Hi Dylan,

please post to the list which GRASS GIS version you are using
(g.version or wxGUI HELP) and the OS.

Best
Markus

One more piece of data:

This error only occurs when using "method=sum". I have not encountered
this error when using any of the other methods available to r.series
or t.rast.series.

Dylan

On Oct 13, 2015 7:01 PM, “Dylan Beaudette” <dylan.beaudette@gmail.com> wrote:

Hi Markus,

GRASS version information:

./configure --without-odbc --without-mysql --with-readline
--with-cxx --enable-largefile --with-freetype
--with-freetype-includes=/usr/include/freetype2 --with-sqlite
--with-python --with-pthread

I would not use pthread.

--with-geos=/usr/local/bin/geos-config
--without-opencl --with-opencl-includes=/usr/include/CL/
--with-postgres --with-postgres-includes=/usr/include/postgresql/
--with-postgres-libs=/usr/lib/
--with-proj-share=/usr/local/share/proj/

libgis Revision: 64732

This is fairly old. Trunk is r66487.

Can you update?

libgis Date: 2015-02-24 16:54:05 -0800 (Tue, 24 Feb 2015)

PROJ.4: 4.8.0
GDAL/OGR: 2.0.0
GEOS: 3.4.2
SQLite: 3.7.9

Best
Markus

Dangit... This is strange, just did an 'svn up', make distclean, make,
make install, and now this:

Welcome to GRASS GIS 7.1.svn (r66487)
GRASS GIS homepage: http://grass.osgeo.org
This version running through: Bash Shell (/bin/bash)
Help is available with the command: g.manual -i
See the licence terms with: g.version -c
Start the GUI with: g.gui wxpython
When ready to quit enter: exit

GRASS 7.1.svn (prism):~/src/grass_trunk > g.version -r
GRASS 7.1.svn (2015)
libgis Revision: 64732
libgis Date: 2015-02-24 16:54:05 -0800 (Tue, 24 Feb 2015)

Has it been so long since I have compiled GRASS that I have missed
something? I see that the libgis is still "old".

?

Dylan

On Tue, Oct 13, 2015 at 8:43 PM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:

Dangit... This is strange, just did an 'svn up', make distclean, make,
make install, and now this:

Welcome to GRASS GIS 7.1.svn (r66487)
GRASS GIS homepage: http://grass.osgeo.org
This version running through: Bash Shell (/bin/bash)
Help is available with the command: g.manual -i
See the licence terms with: g.version -c
Start the GUI with: g.gui wxpython
When ready to quit enter: exit

GRASS 7.1.svn (prism):~/src/grass_trunk > g.version -r
GRASS 7.1.svn (2015)
libgis Revision: 64732
libgis Date: 2015-02-24 16:54:05 -0800 (Tue, 24 Feb 2015)

Has it been so long since I have compiled GRASS that I have missed
something?

Yep:
https://trac.osgeo.org/grass/changeset/65591
"Prevent concurrent raster reads when a mask is present"

(which also got backported subsequently)

I see that the libgis is still "old".

... this is unrelated since the issue was in r.mapcalc (hence
affecting all modules using it).

The true revision number of interest is the one next to GRASS GIS
7.1.svn (r66487).

I guess your issue is now solved.

Markus

On Tue, Oct 13, 2015 at 11:06 AM, Markus Neteler <neteler@osgeo.org> wrote:

On Oct 13, 2015 7:01 PM, "Dylan Beaudette" <dylan.beaudette@gmail.com>
wrote:

Hi Markus,

GRASS version information:

./configure --without-odbc --without-mysql --with-readline
--with-cxx --enable-largefile --with-freetype
--with-freetype-includes=/usr/include/freetype2 --with-sqlite
--with-python --with-pthread

I would not use pthread.

OK, good to know. I will re-compile without it and report back.

--with-geos=/usr/local/bin/geos-config
--without-opencl --with-opencl-includes=/usr/include/CL/
--with-postgres --with-postgres-includes=/usr/include/postgresql/
--with-postgres-libs=/usr/lib/
--with-proj-share=/usr/local/share/proj/

libgis Revision: 64732

This is fairly old. Trunk is r66487.

Can you update?

How does the "libgis" revision number relate to the version reported
when starting GRASS (e.g. SVN revision)?

libgis Date: 2015-02-24 16:54:05 -0800 (Tue, 24 Feb 2015)

PROJ.4: 4.8.0
GDAL/OGR: 2.0.0
GEOS: 3.4.2
SQLite: 3.7.9

Best
Markus

On Tue, Oct 13, 2015 at 11:55 AM, Markus Neteler <neteler@osgeo.org> wrote:

I guess your issue is now solved.

Markus

Maybe. I updated my local copy of grass_trunk last Monday, compiled,
and experienced the issues with r.series and t.rast.series... in the
absence of a MASK map.

Are the *.series modules a convenient front-end to r.mapcalc?

Thanks!
Dylan

Hi Dylan,
r.series is a module implemented in C with no relation to r.mapcalc.
t.rast.series is a Python module that makes use of r.series internally.

Best regards
Soeren

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

On Tue, Oct 13, 2015 at 8:59 PM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:
...

libgis Revision: 64732

Ah, I read on mobile, not realizing that this was the libgis rev number.

...

How does the "libgis" revision number relate to the version reported
when starting GRASS (e.g. SVN revision)?

Not at all...

OK, I find this libgis rev number continually confusing and useless, as
I already posted some years ago.
In the end only the starting-GRASS-SVN-revision gives me true information.

@devs: I suggest advertising the lib*gis* rev number less prominently
in g.version.

Markus

Thank you for the clarification Sören.

Any ideas on how r.series could be leaving maps open or otherwise
corrupting inputs in those cases where:

1. there are a lot of maps (>100)
2. method=sum

Dylan

On Tue, Oct 13, 2015 at 11:06 AM, Markus Neteler <neteler@osgeo.org> wrote:

On Oct 13, 2015 7:01 PM, "Dylan Beaudette" <dylan.beaudette@gmail.com>
wrote:

Hi Markus,

GRASS version information:

./configure --without-odbc --without-mysql --with-readline
--with-cxx --enable-largefile --with-freetype
--with-freetype-includes=/usr/include/freetype2 --with-sqlite
--with-python --with-pthread

I would not use pthread.

Hi Markus,

I have done some tests after re-compiling _without_ pthreads. So far,
no errors from r.series. Also I seem to be getting better performance
from r.mapcalc and t.rast.series, especially when working with files
loaded via r.external. I don't know enough about pthreads to speculate
further. Any ideas?

I'll report back when the script finishes, probably 20 hours or so.

Cheers,
Dylan

On Tue, Oct 13, 2015 at 4:25 PM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:

I'll report back when the script finishes, probably 20 hours or so.

Reporting back, same error occurring in a seemingly random way.

Dylan

On 13/10/15 20:43, Dylan Beaudette wrote:

Has it been so long since I have compiled GRASS that I have missed
something? I see that the libgis is still "old".

I've sometimes had to completely erase my source tree and do a fresh svn checkout to get a clean new compile. Not sure why, though.

Another thing: you might want to do the make distclean before the svn up.

Moritz

Thanks for the tips Moritz,

I have tried your suggestion and still get the same errors.

One more clue in regards to the original "Error reading raster data
for row xxx" issue:

* about half of the time the results from t.rast.series are correct
* the other half of the time I get "Error reading raster data for row xxx"
* about 1 time in 30 runs the resulting map will be created, but will
have values at the extreme edges of FCELL precision... suggesting some
kind of overflow.

Dylan

Some additional clues:

The original stack was 365 maps with 3105 x 7025 cells.

1. zooming into a smaller region (30 x 40 cells) and running
t.rast.series 100x resulted in 100 "correct" maps, no errors.

2. returning to the full extent and running t.rast.series 30x on the
first 31 maps resulted in 30 "correct" maps, no errors.

3. returning to the full extent and running t.rast.series 30x on the
last 31 maps resulted in 30 "correct" maps, no errors

So, it seems that t.rast.series (r.series) is throwing an error, or
generating wrong output, when:

a large set of maps is supplied as input, and the region has a
moderate number of total cells.

Yeah, I know, that isn't very specific. I will try re-compiling with
debugging and no optimization next.
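
The repeated-run tests above can be scripted in a few lines. This is only a minimal sketch: `count_failures` is a hypothetical helper name (not a GRASS command), and it assumes that the command being tested exits with a non-zero status when it hits the read error.

```shell
# count_failures N CMD [ARGS...]
# Run CMD N times and print how many runs exited with a non-zero status.
# Output of CMD itself is discarded; only the exit status is inspected.
count_failures() {
    runs=$1; shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage inside a GRASS session (not run here; gdd_test is a made-up output name):
#   count_failures 30 t.rast.series --q --o in=gdd out=gdd_test method=sum
```

A count of 0 over many runs on a fixed input, as in tests 1 to 3, would point away from the aggregation step itself.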

Dylan

More data,

1. re-compiled with CFLAGS="-g -Wall":
* Multiple runs of t.rast.series with the full stack (365 maps with
3105 x 7025 cells), no errors.
* each run required about 8.5 minutes to complete

2. re-compiled with CFLAGS="-O2 -mtune=native -march=native" LDFLAGS="-s":
* 10x tests with full stack, no errors
* each run required about 3.5 minutes

3. re-run original script (see listing below)
* random errors from t.rast.series

This doesn't make much sense to me. The only difference between my
latest "tests" and the original code is that the input to
t.rast.series was static over the course of my "tests", vs. dynamic
within the original code (see below). I purposely selected a stack
that caused t.rast.series to throw an error for my tests.

Arg!

Dylan

For the record, here is the script that I have been using:

years=`seq 1981 2010`
for year in $years
  do
  echo $year

  # make a where clause for finding the current year
  wc="strftime('%Y', start_time) = '"$year"'"

  # extract current dataset: "copies" by reference
  t.rast.extract --q --o input=tmin output=tmin_subset \
    basename=tmin_subset where="$wc"
  t.rast.extract --q --o input=tmax output=tmax_subset \
    basename=tmax_subset where="$wc"

  # compute GDD on each day (about 36 minutes)
  # values less than 0 are set to 0
  # NOTE: 4 CPUs so that external disk isn't thrashed
  gdd_max_C=30
  gdd_min_C=10
  gdd_base_C=10
  t.rast.mapcalc --q --o nprocs=4 input=tmin_subset,tmax_subset \
    output=gdd basename=gdd \
    expr="max(((min(tmax_subset, $gdd_max_C) + max(tmin_subset, $gdd_min_C)) / 2.0) - $gdd_base_C, 0)"

  # save Sonora time-series for later
  # -k flag ensures that the output is in same order as input
  t.rast.list -s gdd columns="id" | \
    parallel -k --gnu r.what map="{}" points=sonora \
    > qa_qc/sonora-daily-gdd-$year.dat

  # sum GDD for this year: fast from SSD, but only single CPU thread (3.5 minutes)
  # BUG: crashes randomly
  t.rast.series --q --o in=gdd out=gdd_$year method=sum

  done

On Wed, Oct 14, 2015 at 12:55 PM, Dylan Beaudette
<dylan.beaudette@gmail.com> wrote:

This doesn't make much sense to me. The only difference between my
latest "tests" and the original code is that the input to
t.rast.series was static over the course of my "tests", vs. dynamic
within the original code (see below). I purposely selected a stack
that caused t.rast.series to throw an error for my tests.

OK, this does make sense--t.rast.series (r.series) was not the source
of the problems. I was able to verify this by running t.univar on the
output from the previous step:

  # NOTE: 4 CPUs so that external disk isn't thrashed
  gdd_max_C=30
  gdd_min_C=10
  gdd_base_C=10
  t.rast.mapcalc --q --o nprocs=4 input=tmin_subset,tmax_subset \
    output=gdd basename=gdd \
    expr="max(((min(tmax_subset, $gdd_max_C) + max(tmin_subset, $gdd_min_C)) / 2.0) - $gdd_base_C, 0)"

... which means that t.rast.mapcalc was generating one (or more)
outputs with some kind of problem, which was then causing t.univar and
t.rast.series to fail.
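
To find out which of the daily maps is the damaged one, each map can be read in full on its own right after t.rast.mapcalc finishes. A rough sketch only: `report_failures` and `check_raster` are hypothetical helper names, and the check assumes that r.univar exits with a non-zero status when it cannot read a row. The t.rast.list call in the usage comment is copied from the script earlier in this thread.

```shell
# report_failures CMD [ARGS...]
# Read names from stdin, one per line, run "CMD ARGS name" for each,
# and print every name for which the command exits non-zero.
report_failures() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null keeps the check from consuming the list on stdin
        if ! "$@" "$name" >/dev/null 2>&1 </dev/null; then
            printf '%s\n' "$name"
        fi
    done
}

# Hypothetical check: force a full read of one raster map.
# ASSUMPTION: r.univar returns non-zero on "Error reading raster data".
check_raster() {
    r.univar -g map="$1"
}

# Usage inside a GRASS session (not run here):
#   t.rast.list -s gdd columns="id" | report_failures check_raster
```

An empty result directly after the t.rast.mapcalc step, followed by a failure in t.rast.series, would put the blame back on the aggregation; a non-empty result names the maps to inspect.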

The inputs to t.rast.mapcalc are files that have been registered with
r.external. I suspect that the multiple concurrent r.mapcalc instances
may be to blame. I don't have an explanation other than some evidence
from the last time I encountered this type of issue. The workflow then
was:

1. spawn 8 concurrent processes via backgrounding: r.sun -> r.mapcalc

2. when finished with daily solar models, sum maps with r.series

I would occasionally encounter the "Error reading raster data for row
xxx" error from r.series in this case and assume that r.series had
somehow broken the map in question.

It would seem that concurrent use of r.mapcalc may be worth
investigating... however, it is strange that it only occurs sometimes.
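
One thing that is easy to lose with backgrounded jobs is their exit status. A minimal sketch of how a workflow like step 1 could record it, so that a failing process is noticed before the maps are summed; `run_checked` and the worker name are hypothetical, and this does not explain or fix the corruption itself.

```shell
# run_checked WORKER ITEM...
# Start "WORKER ITEM" in the background for every ITEM, wait for each
# job by PID, and print how many of them exited with a non-zero status.
run_checked() {
    worker=$1; shift
    pids=""
    for item in "$@"; do
        "$worker" "$item" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

# Usage (not run here): daily_model is a made-up shell function that
# would wrap the r.sun -> r.mapcalc steps for one day of the year.
#   n_failed=$(run_checked daily_model 1 2 3 4 5 6 7 8)
```

Anything the worker prints to stdout ends up next to the count, so the worker should send its own output elsewhere if the count is captured as shown.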

Oddly enough, I didn't have problems with maps generated with the
following (similar) code:

# spring frost
# if tmin never drops below 0 before the start of summer, then the
# last "spring frost" is on day 0
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=spring_frost \
  basename=spring_frost \
  expr="if(start_doy() < 182, if(tmin < 0, start_doy(), 0), null())"

# fall frost
# NOTE: 2 CPUs so that disk isn't thrashed
t.rast.mapcalc --o -n nprocs=2 input=tmin output=fall_frost \
  basename=fall_frost \
  expr="if(start_doy() > 213, if(tmin < 0, start_doy(), 365), null())"

Dylan