[GRASS-dev] r.viewshed fails on large raster: temp file already exists

Hi folks

I was running r.viewshed in GRASS 7.0 under Win7 on a large raster with 11000 rows by 11000 columns. The algorithm ran in external memory mode with temporary files being written to a local directory (on C:). On starting sweeping, two temporary files were created, “STREAM_a06524” and “STREAM_b06524”, of which the first never increased in size beyond 0 KB, and the second grew to about 11 GB during sweeping. When sorting events, additional temp files were created, each of about 255 MB size, which were all named in the same logic, i.e. “STREAM_c06524”, “STREAM_d06524” etc, with only the letter after the underscore in the file name changing to the next letter in the alphabet. This continued until the file “STREAM_z06524” had been created, then the algorithm crashed with “File exists” (see sh output below). Obviously it had tried to create a file that already existed, probably named “STREAM_a06524”. It seems that either the programmer never counted on more files being necessary than the alphabet has letters, or maybe something gets mixed up in the loop and the code assumes that “STREAM_a06524” does not exist because it never wrote any data to it.

Heres’ the sh output:

GRASS 7.0.svn> r.viewshed input=dom_js1_2m@bn output=test_js1_2m_all coordinate

=639965,245032 obs_elev=4 tgt_elev=1.5 stream_dir=C:/tmp/grass_temp_visibility_

files --verbose

Nodata value set to nan

rows=11000, cols=11000, total = 121000000

In-memory memory usage is 12100088000 B (11539 MB), max mem

allowed=524288000 B(500MB)

***** EXTERNAL_MEMORY MODE *****

Intermediate files will not be deleted in case of abnormal termination.

Intermediate location: C:/tmp/grass_temp_visibility_files

To save space delete these files manually!

Memory manager registering memory in MM_IGNORE_MEMORY_EXCEEDED mode.

Estimated size active structure:

(key=64, ptr=4, total node=80 B)

Total= 880000 B

Start sweeping.

Computing events…

100%

Sorting events…

ami_single_temp_name: mktemp failed: : File exists

Assertion failed: 0, file ami_stream.cpp, line 97

This application has requested the Runtime to terminate it in an unusual way.

Please contact the application’s support team for more information.

A screenshot of the temp directory with the created files is attached.

Best regards

Benedikt


Dr. Benedikt Notter

INFRAS
Forschung und Beratung
Mühlemattstrasse 45

3007 Bern
Schweiz


Tel +41 31 370 19 14
Fax +41 31 370 19 10
benedikt.notter@infras.ch
http://www.infras.ch

(attachments)

r.viewshed_error_02.PNG

Hi Benedikt,

thanks for your detailed problem report:

On Mon, Dec 30, 2013 at 9:27 AM, Benedikt Notter
<benedikt.notter@infras.ch> wrote:

Hi folks

I was running r.viewshed in GRASS 7.0 under Win7 on a large raster with
11000 rows by 11000 columns. The algorithm ran in external memory mode with
temporary files being written to a local directory (on C:). On starting
sweeping, two temporary files were created, “STREAM_a06524” and
“STREAM_b06524”, of which the first never increased in size beyond 0 KB, and
the second grew to about 11 GB during sweeping. When sorting events,
additional temp files were created, each of about 255 MB size, which were
all named in the same logic, i.e. “STREAM_c06524”, “STREAM_d06524” etc, with
only the letter after the underscore in the file name changing to the next
letter in the alphabet. This continued until the file “STREAM_z06524” had
been created, then the algorithm crashed with “File exists” (see sh output
below). Obviously it had tried to create a file that already existed,
probably named “STREAM_a06524”. It seems that either the programmer never
counted on more files being necessary than the alphabet has letters, or
maybe something gets mixed up in the loop and the code assumes that
“STREAM_a06524” does not exist because it never wrote any data to it.

I suspect moreover that the tmp file name creation under Windows has
some issue. The names generated on Linux are random as they should be:

To check, I have tried the "same" command using the new EU DEM (just
as a test, I used the entire tile):

# set big region:
GRASS 7.0.svn (eu_laea):~ > g.region rast=eu_dem_25m_northern_italy -p
projection: 99 (Lambert Azimuthal Equal Area)
zone: 0
datum: etrs89
ellipsoid: grs80
north: 3000000
south: 2000000
west: 4000000
east: 5000000
nsres: 25
ewres: 25
rows: 40000
cols: 40000
cells: 1600000000

# get some coordinates for viewer:
g.region -c
center easting: 4500000.000000
center northing: 2500000.000000

# check available space:
[neteler@blade16 ~]$ df -h /storage/local/rviewshed_tmp/
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 228G 27G 201G 12% /storage/local

# create temp dir:
GRASS 7.0.svn (eu_laea):~ > mkdir /storage/local/rviewshed_tmp

# run the job, measure job duration while we are at it:
GRASS 7.0.svn (eu_laea):~ > time -p r.viewshed
input=eu_dem_25m_northern_italy output=test_js1_2m_all
coordinate=4500000.0,2500000.0 obs_elev=4 tgt_elev=1.5
stream_dir=/storage/local/rviewshed_tmp/ --verbose
Nodata value set to -nan
rows=40000, cols=40000, total = 1600000000
In-memory memory usage is 160000320000 B (152588 MB), max mem
allowed=524288000 B(500MB)
***** EXTERNAL_MEMORY MODE *****
Intermediate files will not be deleted in case of abnormal termination.
Intermediate location: /storage/local/rviewshed_tmp/
To save space delete these files manually!
Memory manager registering memory in MM_IGNORE_MEMORY_EXCEEDED mode.
Estimated size active structure:
(key=64, ptr=8, total node=96 B)
Total= 3840000 B
Start sweeping.
Computing events...
ERROR 1: libgrass_I.so: cannot open shared object file: No such file
or directory
ERROR 1: libgrass_I.so: cannot open shared object file: No such file
or directory
   3%
[detached]

# ... using screen to be able to detach/reconnect to the remote terminal

# See what happens in the tmp dir (after 12min, 78% of "Computing events"):
[neteler@blade16 ~]$ l /storage/local/rviewshed_tmp/
total 117439744
-rw------- 1 neteler gis 0 Dec 31 15:40 STREAM_ERIIEd
-rw------- 1 neteler gis 115229065216 Dec 31 15:52 STREAM_K3zGLE

When reaching "Sorting events..."

[neteler@blade16 ~]$ l /storage/local/rviewshed_tmp/
total 157911308
-rw------- 1 neteler gis 429391872 Dec 31 15:56 STREAM_ERIIEd
-rw------- 1 neteler gis 148445779584 Dec 31 15:56 STREAM_K3zGLE
-rw------- 1 neteler gis 261861248 Dec 31 15:56 STREAM_GeVOJ9
-rw------- 1 neteler gis 261861248 Dec 31 15:56 STREAM_OOwDs9
-rw------- 1 neteler gis 261861248 Dec 31 15:56 STREAM_61D4FA
-rw------- 1 neteler gis 261861248 Dec 31 15:56 STREAM_Dmlzau
-rw------- 1 neteler gis 261861248 Dec 31 15:56 STREAM_S9f90O
-rw------- 1 neteler gis 261861248 Dec 31 15:57 STREAM_CRITSv
[...]

So the names on Linux are completely randomized unlike on your Windows box.

The problem appears to be in mktemp() on Windows, i.e. in the file
iostream/ami_stream.cpp" line 85

or two lines above where you get STREAM_a06524, STREAM_b06524,
STREAM_c06524, STREAM_d06524 etc. (process ID used?).

Searching around, I found a similar report:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47439

--> http://msdn.microsoft.com/en-us/library/34wc6k1f(v=VS.80).aspx
"_mktemp can create a maximum of 26 unique file names for any given
combination of base and template values. Therefore, FNZ12345 is the
last unique file name _mktemp can create for the base and template
values used in this example."

Ops.

Potential patch to be re-used:
http://gcc.gnu.org/ml/fortran/2011-03/msg00102/mktemp.diff

Anyone?
Markus

On Tue, Dec 31, 2013 at 4:14 PM, Markus Neteler <neteler@osgeo.org> wrote:
...

To check, I have tried the "same" command using the new EU DEM (just
as a test, I used the entire tile):

# set big region:
GRASS 7.0.svn (eu_laea):~ > g.region rast=eu_dem_25m_northern_italy -p
projection: 99 (Lambert Azimuthal Equal Area)

...

rows: 40000
cols: 40000
cells: 1600000000

Just for the record:
r.viewshed run 36 hours on a 2 years old machine to calculate the
viewshed on this region of 1.6Gigapixels.

Peak swap file size: 216GB, the stream_dir directory for swapping was
mounted over NFS, hence a bit slower than a local device.

So we only need to solve the number of files' limit in Windows' mktemp().

Markus

Hi Benedikt and Markus,

I’ve created a ticket for this problem:

http://trac.osgeo.org/grass/ticket/2153

Just to be sure that it will not be forgotten, moreover, the ticket forced me to not use terms such as defective by design :wink:

Markus, what about the errors in your log? I see it works but the .so part looks serious:

Computing events…
ERROR 1: libgrass_I.so: cannot open shared object file: No such file
or directory
ERROR 1: libgrass_I.so: cannot open shared object file: No such file
or directory
3%

I haven’t mention that in the ticket but I’ve tried to understand what is the right approach to create temporary files but without any success.

Vaclav

http://www.cplusplus.com/reference/cstdio/tmpfile/
http://www.cplusplus.com/reference/cstdio/tmpnam/
http://www.cplusplus.com/doc/tutorial/files/
http://stackoverflow.com/questions/18716658/whats-the-difference-between-tempfile-and-mktemp
http://stackoverflow.com/questions/1022487/how-to-create-a-temporary-text-file-in-c

Hi,

could anyone please implement G_mktemp() following the suggestions in
the ticket (there is a link to a gfortran patch included which might
be just "perfect":

On Thu, Jan 2, 2014 at 5:38 PM, Vaclav Petras <wenzeslaus@gmail.com> wrote:

Hi Benedikt and Markus,

I've created a ticket for this problem:

http://trac.osgeo.org/grass/ticket/2153

this would help to overcome the related Windows _mktemp() limitation.

thanks
Markus

Markus Neteler wrote:

could anyone please implement G_mktemp() following the suggestions in
the ticket (there is a link to a gfortran patch included which might
be just "perfect":

I've added lib/gis/mkstemp.c, which needs testing and feedback.

G_mktemp() just generates the filename; as such it is prone to race
conditions (some other process may create that file after G_mktemp()
returns).

G_mkstemp() opens the file and returns a descriptor. G_mkstemp_fp()
opens the file and returns a FILE*.

The last two take the arguments "flags" and "mode". "flags" should be
O_WRONLY or O_RDWR, plus any other desired flags (e.g. O_APPEND).
"mode" is the file mode (0666 would be typical).

None of the functions use the PID, although the caller can do so.

In theory, up to 26^5 (= ~12 million) filenames will be attempted
until it finds one which doesn't exist.

In retrospect, it should probably use the last 5 Xs rather than the
first, as you may not be able to control the names of the leading
directories. Or the first 5 Xs after the last / or \. Or something.

But I'll wait to see what other comments are made before changing
anything.

--
Glynn Clements <glynn@gclements.plus.com>

On 1/17/14, Glynn Clements <glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

could anyone please implement G_mktemp() following the suggestions in
the ticket (there is a link to a gfortran patch included which might
be just "perfect":

I've added lib/gis/mkstemp.c, which needs testing and feedback.

I have used your explanations to add doxygen style documentation to
the file, see r58834.

One compiler issue remains:

mkstemp.c: In function 'G_mkstemp_fp':
mkstemp.c:164:5: warning: suggest parentheses around comparison in
operand of '&' [-Wparentheses]
     const char *fmode = (flags & O_ACCMODE == O_RDWR)
     ^

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

Markus

On Sat, Feb 1, 2014 at 11:17 PM, Markus Neteler <neteler@osgeo.org> wrote:

On 1/17/14, Glynn Clements <glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

could anyone please implement G_mktemp() following the suggestions in
the ticket (there is a link to a gfortran patch included which might
be just "perfect":

I've added lib/gis/mkstemp.c, which needs testing and feedback.

I have used your explanations to add doxygen style documentation to
the file, see r58834.

Now also online:
http://grass.osgeo.org/programming7/mkstemp_8c.html

One compiler issue remains:

mkstemp.c: In function 'G_mkstemp_fp':
mkstemp.c:164:5: warning: suggest parentheses around comparison in
operand of '&' [-Wparentheses]
     const char *fmode = (flags & O_ACCMODE == O_RDWR)
     ^

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

Markus

Markus Neteler wrote:

One compiler issue remains:

mkstemp.c: In function 'G_mkstemp_fp':
mkstemp.c:164:5: warning: suggest parentheses around comparison in
operand of '&' [-Wparentheses]
     const char *fmode = (flags & O_ACCMODE == O_RDWR)

Fixed in r58877.

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

  fd = G_mkstemp(tmp_path, O_RDWR, 0600);

--
Glynn Clements <glynn@gclements.plus.com>

On Tue, Feb 4, 2014 at 11:14 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

  fd = G_mkstemp(tmp_path, O_RDWR, 0600);

I have attached a patch for this, just to know if I got it right.
IMHO the change needs to be submitted to get it into the overnight
winGRASS binaries.

Markus

(attachments)

ami_stream_mktemp.diff (1.65 KB)

On Tue, Feb 4, 2014 at 12:43 PM, Markus Neteler <neteler@osgeo.org> wrote:

On Tue, Feb 4, 2014 at 11:14 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

  fd = G_mkstemp(tmp_path, O_RDWR, 0600);

I have attached a patch for this, just to know if I got it right.
IMHO the change needs to be submitted to get it into the overnight
winGRASS binaries.

It seems to run nicely on Linux (local patch):

...
-rw------- 1 neteler gis 429391872 Feb 4 17:40 STREAM_baaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:40 STREAM_daaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_eaaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_faaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_gaaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_haaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_iaaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:41 STREAM_jaaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:42 STREAM_kaaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:42 STREAM_laaaaX
-rw------- 1 neteler gis 261873568 Feb 4 17:42 STREAM_maaaaX
...
-rw------- 1 neteler gis 261873568 Feb 4 19:10 STREAM_nvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:10 STREAM_ovaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:10 STREAM_pvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:10 STREAM_qvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_rvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_svaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_tvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_uvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_vvaaaX
-rw------- 1 neteler gis 261873568 Feb 4 19:11 STREAM_wvaaaX
-rw------- 1 neteler gis 225340096 Feb 4 19:12 STREAM_xvaaaX
-rw------- 1 neteler gis 11409293312 Feb 4 19:18 STREAM_caaaaX
...

The job is running on a Desktop PC and saving the STREAM_* files over NFS.

I cannot test the patch on Windows as mentioned.

Markus

On Tue, Feb 4, 2014 at 7:22 PM, Markus Neteler <neteler@osgeo.org> wrote:

On Tue, Feb 4, 2014 at 12:43 PM, Markus Neteler <neteler@osgeo.org> wrote:

On Tue, Feb 4, 2014 at 11:14 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

For testing, I am not sure how to change
lib/iostream/ami_stream.cpp
in order to use the new function(s). Suggestions?

  fd = G_mkstemp(tmp_path, O_RDWR, 0600);

I have attached a patch for this, just to know if I got it right.
IMHO the change needs to be submitted to get it into the overnight
winGRASS binaries.

It seems to run nicely on Linux (local patch):

...

The viewshed was successfully calculated on the new EU DEM, selecting
a rather square map with 16 billion pixels.

If there are no objections, I'll submit the patch to SVN in order to
get it into the winGRASS binary snapshots for further testing.

Markus

Markus Neteler wrote:

I have attached a patch for this, just to know if I got it right.
IMHO the change needs to be submitted to get it into the overnight
winGRASS binaries.

It looks basically okay.

However:

  +LIBES = $(GISLIB)
  +DEPENDENCIES = $(GISDEP)

This needs to go into include/Make/Grass.make as:

  IOSTREAMDEPS = $(GISLIB)

At present, the iostream library is always built as a static library
(probably due to C++ issues).

Also, iostream is already using libgis for G_fseek().

But I still need to fix the basic design. At a minimum, it needs to
only consider Xs in the last component of the path. Also, there's no
point in making 26^5 (~12 million) attempts. It would probably be
better to use 2 or 3 Xs for a unique identifier (26^2 = 676, 26^3 =
17576) and have the rest based upon the PID.

--
Glynn Clements <glynn@gclements.plus.com>