[GRASS-dev] Improving the G_calloc: unable to allocate xx bytes of memory message?

(related to a question on grass-user)

On Fri, Oct 24, 2014 at 3:27 PM, Andrea Timmermann wrote:
...

When running:

r.basin map=map@Elevation prefix=o coordinates=-71.10394196,43.9865230801
threshold=19005 dir=C:\\Users\\Andrea\\Basins

I get:
" ....
v.to.rast complete.
All in RAM calculation...
Reading raster map <o_map_drainage_e>...
Calculating basins using vector point map...
Current region rows: 21612, cols: 21612
ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at io.c:41

...

Question: is there a way to also communicate in which *command* the
memory allocation occurred, i.e. improve the error message?

We have 9 files called "io.c" in GRASS 7.
Ideal would be something like:

ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at r.thin/io.c:41

or
ERROR: r.thin: G_calloc: unable to allocate 21612 * 4 bytes of memory
at r.thin/io.c:41

Not sure how to implement that in lib/gis/alloc.c

thanks
Markus

Markus Neteler wrote:

> r.basin map=map@Elevation prefix=o coordinates=-71.10394196,43.9865230801
> threshold=19005 dir=C:\\Users\\Andrea\\Basins
>
> I get:
> " ....
> v.to.rast complete.
> All in RAM calculation...
> Reading raster map <o_map_drainage_e>...
> Calculating basins using vector point map...
> Current region rows: 21612, cols: 21612
> ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at io.c:41
...

Question: is there a way to also communicate in which *command* the
memory allocation occurred, i.e. improve the error message?

Why should allocation errors be different to any other error?

For the case of scripts, it may suffice to fix run_command() etc so
that they terminate (or raise an exception) if the module returns a
non-zero exit status. If the error message is immedately followed by
"error executing <command>", it should remove most of the ambiguity.

We have 9 files called "io.c" in GRASS 7.
Ideal would be something like:

ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at r.thin/io.c:41

or
ERROR: r.thin: G_calloc: unable to allocate 21612 * 4 bytes of memory
at r.thin/io.c:41

Not sure how to implement that in lib/gis/alloc.c

alloc.c is the wrong place. If you want errors to be traceable, the
module name should be included for *all* errors by G_fatal_error (and
possibly others).

Also, I wouldn't suggest using module/file.c, because that will be
inaccurate (and thus misleading) if the caller is part of a library.

If we just want the complete filename in allocation errors, we can
modify the compiler switches to include $(RELDIR) as per the attached
patch. Similarly for use of __FILE__ in the iostream headers.

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

alloc-file-path.diff (1.33 KB)

On Sat, Oct 25, 2014 at 12:13 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

> r.basin map=map@Elevation prefix=o coordinates=-71.10394196,43.9865230801
> threshold=19005 dir=C:\\Users\\Andrea\\Basins
>
> I get:
> " ....
> v.to.rast complete.
> All in RAM calculation...
> Reading raster map <o_map_drainage_e>...
> Calculating basins using vector point map...
> Current region rows: 21612, cols: 21612
> ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at io.c:41
...

Question: is there a way to also communicate in which *command* the
memory allocation occurred, i.e. improve the error message?

Why should allocation errors be different to any other error?

That's true.

For the case of scripts, it may suffice to fix run_command() etc so
that they terminate (or raise an exception) if the module returns a
non-zero exit status. If the error message is immedately followed by
"error executing <command>", it should remove most of the ambiguity.

Yes, that would be a perfect solution to be implemented.

We have 9 files called "io.c" in GRASS 7.
Ideal would be something like:

ERROR: G_calloc: unable to allocate 21612 * 4 bytes of memory at r.thin/io.c:41

or
ERROR: r.thin: G_calloc: unable to allocate 21612 * 4 bytes of memory
at r.thin/io.c:41

Not sure how to implement that in lib/gis/alloc.c

alloc.c is the wrong place. If you want errors to be traceable, the
module name should be included for *all* errors by G_fatal_error (and
possibly others).

Also, I wouldn't suggest using module/file.c, because that will be
inaccurate (and thus misleading) if the caller is part of a library.

If we just want the complete filename in allocation errors, we can
modify the compiler switches to include $(RELDIR) as per the attached
patch. Similarly for use of __FILE__ in the iostream headers.

Ok, I tried that patch but I don't manage to trigger it.

GRASS 7.1.svn (nc_spm_08_grass7):~ > g.region rast=elevation -p res=0.01
projection: 99 (Lambert Conformal Conic)
zone: 0
datum: nad83
ellipsoid: a=6378137 es=0.006694380022900787
north: 228500
south: 215000
west: 630000
east: 645000
nsres: 0.01
ewres: 0.01
rows: 1350000
cols: 1500000
cells: 2025000000000

GRASS 7.1.svn (nc_spm_08_grass7):~ > r.cost roadsmajor output=bla
start_coordinates=637500.000000,221750.000000
WARNING: segment zero_fill(): Unable to write (No such file or directory)
WARNING: Could not write segment file
ERROR: Can not create temporary file

Yet a bit unhelpful :slight_smile: It comes from lib/segment/format.c but I
didn't manage to improve that.

Markus

Markus Neteler wrote:

Ok, I tried that patch but I don't manage to trigger it.

GRASS 7.1.svn (nc_spm_08_grass7):~ > g.region rast=elevation -p res=0.01

rows: 1350000
cols: 1500000
cells: 2025000000000

That's 1.8 * 2^40 cells; at 24 bytes/cell, it would be 44 TiB.

GRASS 7.1.svn (nc_spm_08_grass7):~ > r.cost roadsmajor output=bla
start_coordinates=637500.000000,221750.000000
WARNING: segment zero_fill(): Unable to write (No such file or directory)

So it's not trying to allocate 44 TiB of RAM, it's trying to create a
44 TiB segment file, which fails for other reasons.

You'll need to find something which actually allocates RAM rather than
a temporary file. Try "r.proj memory=..."; this allocates the cache
memory before creating any temporary file.

WARNING: Could not write segment file
ERROR: Can not create temporary file

Yet a bit unhelpful :slight_smile: It comes from lib/segment/format.c but I
didn't manage to improve that.

Specifically:

  if (write(fd, buf, n) != n) {
      G_warning("segment zero_fill(): Unable to write (%s)", strerror(errno));

ENOENT ("No such file or directory") isn't a valid error code for
write(). But errno is only set if write() returns -1, so it's likely
that write() returned a short count rather than failing and the errno
value is left over from a previous system call (sucessful calls don't
reset errno to zero). In all probability, the next call to write()
would have failed due to ENOSPC, EDQUOT, or EFBIG (or raised SIGXFSZ
if RLIMIT_FSIZE is being exceeded).

--
Glynn Clements <glynn@gclements.plus.com>

On Mon, Oct 27, 2014 at 4:33 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

Ok, I tried that patch but I don't manage to trigger it.

GRASS 7.1.svn (nc_spm_08_grass7):~ > g.region rast=elevation -p res=0.01

rows: 1350000
cols: 1500000
cells: 2025000000000

That's 1.8 * 2^40 cells; at 24 bytes/cell, it would be 44 TiB.

Yes, I tried to do some extreme computing. But GRASS/Linux resisted
easily as you point out below.

GRASS 7.1.svn (nc_spm_08_grass7):~ > r.cost roadsmajor output=bla
start_coordinates=637500.000000,221750.000000
WARNING: segment zero_fill(): Unable to write (No such file or directory)

So it's not trying to allocate 44 TiB of RAM, it's trying to create a
44 TiB segment file, which fails for other reasons.

You'll need to find something which actually allocates RAM rather than
a temporary file. Try "r.proj memory=..."; this allocates the cache
memory before creating any temporary file.

WARNING: Could not write segment file
ERROR: Can not create temporary file

OK, no success with r.proj (to crash it) but this one triggered the new message:

GRASS 7.1.svn (nc_spm_08_grass7):~ > g.region rast=roadsmajor res=0.1 -p
projection: 99 (Lambert Conformal Conic)
zone: 0
datum: nad83
ellipsoid: a=6378137 es=0.006694380022900787
north: 228500
south: 215000
west: 630000
east: 645000
nsres: 0.1
ewres: 0.1
rows: 135000
cols: 150000
cells: 20250000000

GRASS 7.1.svn (nc_spm_08_grass7):~ > v.to.rast roadsmajor out=test
use=attr attrcol=PROPYEAR memory=3000000 --o
WARNING: No areas selected from vector map <roadsmajor>
Current region rows: 135000, cols: 150000
ERROR: G_calloc: unable to allocate 18446744072484715136 * 4 bytes of
       memory at vector/v.to.rast/raster.c:84

... only, it does not return to command line (have to use CTRL-C).

Yet a bit unhelpful :slight_smile: It comes from lib/segment/format.c but I
didn't manage to improve that.

Specifically:

        if (write(fd, buf, n) != n) {
            G_warning("segment zero_fill(): Unable to write (%s)", strerror(errno));

ENOENT ("No such file or directory") isn't a valid error code for
write(). But errno is only set if write() returns -1, so it's likely
that write() returned a short count rather than failing and the errno
value is left over from a previous system call (sucessful calls don't
reset errno to zero). In all probability, the next call to write()
would have failed due to ENOSPC, EDQUOT, or EFBIG (or raised SIGXFSZ
if RLIMIT_FSIZE is being exceeded).

I see and now darkly remember something about discussion in this list
about clearing the previous error state.

Markus

Markus Neteler wrote:

GRASS 7.1.svn (nc_spm_08_grass7):~ > v.to.rast roadsmajor out=test
use=attr attrcol=PROPYEAR memory=3000000 --o
WARNING: No areas selected from vector map <roadsmajor>
Current region rows: 135000, cols: 150000
ERROR: G_calloc: unable to allocate 18446744072484715136 * 4 bytes of
       memory at vector/v.to.rast/raster.c:84

$ printf '%x' 18446744072484715136
ffffffffb6fe7a80

Those leading "f"s are a giveway. It's almost certain that a
calculation was done with 32-bit arithmetic, which overflowed to
produce a negative result (0xb6fe7a80 = -1224836480), which was then
converted to 64 bits with sign extension, then cast to unsigned.

... only, it does not return to command line (have to use CTRL-C).

It's possible that memory corruption has already occurred, resulting
in e.g. an infinite loop.

A great deal of GRASS code still uses "int" where it should be using
"long", "long long", "size_t" or "ssize_t". Fixing this is far from
straightforward as:

1. It's likely to be a lot of work.

2. I don't know of any easy way to detect such cases.

3. We still haven't decided upon the appropriate type. All of the
options have drawbacks; to recap:

  long - only 32 bits on 64-bit versions of Windows
  long long - not specified by C89
  size_t - unsigned
  ssize_t - POSIX; not specified by any version of ISO C

Also: Windows' write() function uses "int" for the count argument and
the return value, so it can't handle more than 2 GiB at a time.
Likewise for read().

>> Yet a bit unhelpful :slight_smile: It comes from lib/segment/format.c but I
>> didn't manage to improve that.
>
> Specifically:
>
> if (write(fd, buf, n) != n) {
> G_warning("segment zero_fill(): Unable to write (%s)", strerror(errno));
>
> ENOENT ("No such file or directory") isn't a valid error code for
> write(). But errno is only set if write() returns -1, so it's likely
> that write() returned a short count rather than failing and the errno
> value is left over from a previous system call (sucessful calls don't
> reset errno to zero). In all probability, the next call to write()
> would have failed due to ENOSPC, EDQUOT, or EFBIG (or raised SIGXFSZ
> if RLIMIT_FSIZE is being exceeded).

I see and now darkly remember something about discussion in this list
about clearing the previous error state.

The main issue here is that errno is being examined when write() has
reported a short count (a non-negative return value less than the
value of the third argument) rather than an error (returning -1).

The code should either distinguish these cases, or use something like:

ssize_t writen(int fd, const void *buf, size_t count)
{
    size_t left = count;
    const char *ptr = buf;
    while (left > 0) {
        ssize_t n = write(fd, ptr, left);
        if (n <= 0)
            return -1;
        ptr += n;
        left -= n;
    }
    return count;
}

This never returns a short count; it either writes (and returns) the
exact number of bytes requested or returns -1 (errno won't be changed
if write() returns 0, but that should only occur when the third
argument is 0; if write() can't write *anything*, that's an error
condition).

--
Glynn Clements <glynn@gclements.plus.com>

On Mon, Oct 27, 2014 at 6:42 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:

GRASS 7.1.svn (nc_spm_08_grass7):~ > v.to.rast roadsmajor out=test
use=attr attrcol=PROPYEAR memory=3000000 --o
WARNING: No areas selected from vector map <roadsmajor>
Current region rows: 135000, cols: 150000
ERROR: G_calloc: unable to allocate 18446744072484715136 * 4 bytes of
       memory at vector/v.to.rast/raster.c:84

$ printf '%x' 18446744072484715136
ffffffffb6fe7a80

Those leading "f"s are a giveway. It's almost certain that a
calculation was done with 32-bit arithmetic, which overflowed to
produce a negative result (0xb6fe7a80 = -1224836480), which was then
converted to 64 bits with sign extension, then cast to unsigned.

... only, it does not return to command line (have to use CTRL-C).

It's possible that memory corruption has already occurred, resulting
in e.g. an infinite loop.

[...]

In any case I would suggest to submit this patch to SVN since it helps
the user to understand where the issue happens.

thanks
Markus

On Sat, Oct 25, 2014 at 12:13 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

+++ include/Make/Compile.make (working copy)

+ALL_CFLAGS = $(LFS_CFLAGS) $(EXTRA_CFLAGS) $(NLS_CFLAGS) $(DEFS) $(EXTRA_INC) $(INC) -DRELDIR=\"$(RELDIR)\"

+#define G_malloc(n) G__malloc(RELDIR "/" __FILE__, __LINE__, (n))
+#define G_calloc(m, n) G__calloc(RELDIR "/" __FILE__, __LINE__, (m), (n))
+#define G_realloc(p, n) G__realloc(RELDIR "/" __FILE__, __LINE__, (p), (n))

With RELDIR defined in Grass.make it is impossible to compile anything
with GRASS libs without GRASS build system. Could you please add the
RELDIR definition to some header file?

Thanks
Radim

Radim Blazek wrote:

> +++ include/Make/Compile.make (working copy)

> +ALL_CFLAGS = $(LFS_CFLAGS) $(EXTRA_CFLAGS) $(NLS_CFLAGS) $(DEFS) $(EXTRA_INC) $(INC) -DRELDIR=\"$(RELDIR)\"

> +#define G_malloc(n) G__malloc(RELDIR "/" __FILE__, __LINE__, (n))
> +#define G_calloc(m, n) G__calloc(RELDIR "/" __FILE__, __LINE__, (m), (n))
> +#define G_realloc(p, n) G__realloc(RELDIR "/" __FILE__, __LINE__, (p), (n))

With RELDIR defined in Grass.make it is impossible to compile anything
with GRASS libs without GRASS build system. Could you please add the
RELDIR definition to some header file?

$(RELDIR) is just $(CURDIR) minus $(GRASS_HOME):

  RELDIR := $(subst $(GRASS_HOME)/,$(CURDIR))

Like $(CURDIR), it's different for each subdirectory, so it's not
something which can be put into a header file.

However, it's also not used for anything other than error messages, so
compiling with e.g. -DRELDIR= or -DRELDIR=\"$(CURDIR)\" is safe.

I've added a fallback in r62632.

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, Nov 6, 2014 at 8:59 AM, Glynn Clements <glynn@gclements.plus.com> wrote:

Radim Blazek wrote:

> +++ include/Make/Compile.make (working copy)

> +ALL_CFLAGS = $(LFS_CFLAGS) $(EXTRA_CFLAGS) $(NLS_CFLAGS) $(DEFS) $(EXTRA_INC) $(INC) -DRELDIR=\"$(RELDIR)\"

> +#define G_malloc(n) G__malloc(RELDIR "/" __FILE__, __LINE__, (n))
> +#define G_calloc(m, n) G__calloc(RELDIR "/" __FILE__, __LINE__, (m), (n))
> +#define G_realloc(p, n) G__realloc(RELDIR "/" __FILE__, __LINE__, (p), (n))

With RELDIR defined in Grass.make it is impossible to compile anything
with GRASS libs without GRASS build system. Could you please add the
RELDIR definition to some header file?

$(RELDIR) is just $(CURDIR) minus $(GRASS_HOME):

        RELDIR := $(subst $(GRASS_HOME)/,$(CURDIR))

Like $(CURDIR), it's different for each subdirectory, so it's not
something which can be put into a header file.

However, it's also not used for anything other than error messages, so
compiling with e.g. -DRELDIR= or -DRELDIR=\"$(CURDIR)\" is safe.

I've added a fallback in r62632.

Thanks
Radim

--
Glynn Clements <glynn@gclements.plus.com>

On Thu, Nov 6, 2014 at 8:59 AM, Glynn Clements <glynn@gclements.plus.com> wrote:
...

However, it's also not used for anything other than error messages, so
compiling with e.g. -DRELDIR= or -DRELDIR=\"$(CURDIR)\" is safe.

I've added a fallback in r62632.

Backported to relbranch7 in r62633.

Markus

Glynn,

while I am able to make use of your suggestion about ssize_t writen()
I found back the code snippet concerning errno:

http://trac.osgeo.org/grass/ticket/2278#comment:5

Markus

Markus Neteler wrote:

while I am able to make use of your suggestion about ssize_t writen()
I found back the code snippet concerning errno:

http://trac.osgeo.org/grass/ticket/2278#comment:5

The reason for copying errno is in case an error occurs within any of
the calls between the one for which the error is being reported and
the G_fatal_error() call.

E.g. in the second hunk of the patch at which that comment was aimed,
the code which handled the error called close(cell_fd) as part of the
clean-up prior to calling G_fatal_error(). If the close() call failed,
the error message would report the error for the close() call rather
than for the original error.

This is less of an issue if the G_fatal_error() call immediately
follows the failed call, but in theory it could arise in the context
of _(...) e.g. if an error occurs while loading the message catalogs
to translate the error message.

--
Glynn Clements <glynn@gclements.plus.com>