[GRASS-dev] [GRASS GIS] #837: Memory leaks in r.example

#837: Memory leaks in r.example
-------------------------+--------------------------------------------------
Reporter: sprice | Owner: grass-dev@lists.osgeo.org
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Keywords: | Platform: MacOSX
      Cpu: Unspecified |
-------------------------+--------------------------------------------------
In r.example, there are many small memory leaks in basic functions. Not
only is it bad to have this many leaks in the example module, the leaks
probably persist through many other modules. Here is the location of leaks
in a run of r.example, altered only to print "Done!" & pause before exit.

It seems to mostly be a single leak in find_file(), but there is another
leak in G_gisinit().

(in text GRASS)
{{{
GRASS 6.4.0svn (semisup):~ > r.example --o input=landcover
output=landcover.test
r.example(18500) malloc: protecting edges
r.example(18500) malloc: recording malloc stacks to disk using standard
recorder
r.example(18500) malloc: enabling scribbling to detect mods to free blocks
r.example(18500) malloc: stack logs being written into /tmp/stack-
logs.18500.r.example.TQfsBA.index
Done!
^C
sh(18567) malloc: protecting edges
sh(18567) malloc: recording malloc stacks to disk using standard recorder
sh(18567) malloc: enabling scribbling to detect mods to free blocks
sh(18567) malloc: process 18500 no longer exists, stack logs deleted from
/tmp/stack-logs.18500.r.example.TQfsBA.index
sh(18567) malloc: stack logs being written into /tmp/stack-
logs.18567.sh.JvnnBY.index
g.gisenv(18568) malloc: protecting edges
g.gisenv(18568) malloc: recording malloc stacks to disk using standard
recorder
g.gisenv(18568) malloc: enabling scribbling to detect mods to free blocks
g.gisenv(18568) malloc: stack logs being written into /tmp/stack-
logs.18568.g.gisenv.yqOvWQ.index
g.gisenv(18568) malloc: stack logs deleted from /tmp/stack-
logs.18568.g.gisenv.yqOvWQ.index
g.gisenv(18570) malloc: protecting edges
g.gisenv(18570) malloc: recording malloc stacks to disk using standard
recorder
g.gisenv(18570) malloc: enabling scribbling to detect mods to free blocks
g.gisenv(18570) malloc: stack logs being written into /tmp/stack-
logs.18570.g.gisenv.5s2NuI.index
g.gisenv(18570) malloc: stack logs deleted from /tmp/stack-
logs.18570.g.gisenv.5s2NuI.index
g.gisenv(18571) malloc: protecting edges
g.gisenv(18571) malloc: recording malloc stacks to disk using standard
recorder
g.gisenv(18571) malloc: enabling scribbling to detect mods to free blocks
g.gisenv(18571) malloc: stack logs being written into /tmp/stack-
logs.18571.g.gisenv.6FNlZi.index
g.gisenv(18571) malloc: stack logs deleted from /tmp/stack-
logs.18571.g.gisenv.6FNlZi.index
sh(18567) malloc: stack logs deleted from /tmp/stack-
logs.18567.sh.JvnnBY.index
GRASS 6.4.0svn (semisup):~ >
}}}

(outside of GRASS)
{{{
seth:r.example sprice$ leaks r.example
Process 18500: 1701 nodes malloced for 466 KB
Process 18500: 1622 leaks for 25968 total leaked bytes.
Leak: 0x100100040 size=32 zone: DefaultMallocZone_0x100083000 string
'/Users/sprice/grass/semisup'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G__gisinit | G_location_path | G__location_path | G__malloc | malloc |
malloc_zone_malloc
Leak: 0x100100380 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x1001003c0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G_raster_map_type | find_file | G_store | G__malloc | malloc |
malloc_zone_malloc
Leak: 0x1001003d0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G_open_cell_old | G__open_cell_old | find_file | G_store | G__malloc |
malloc | malloc_zone_malloc
Leak: 0x1001003e0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G_open_cell_old | G__open_cell_old | G_raster_map_type | find_file |
G_store | G__malloc | malloc | malloc_zone_malloc
Leak: 0x1001003f0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G_open_cell_old | G__open_cell_old | G_get_gdal_link | find_file | G_store
| G__malloc | malloc | malloc_zone_malloc
Leak: 0x100100400 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
G_open_cell_old | G__open_cell_old | G_get_gdal_link | G_raster_map_type |
find_file | G_store | G__malloc | malloc | malloc_zone_malloc
Leak: 0x100100480 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100100490 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100100510 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100100520 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
[...]
Leak: 0x100106ba0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100106bb0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100106bc0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
get_map_row | get_null_value_row | find_file1 | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x1001076e0 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
close_new | G_write_range | G_raster_map_type | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
Leak: 0x100107830 size=16 zone: DefaultMallocZone_0x100083000 string
'sprice'
         Call stack: [thread 0x7fff70224be0]: | 0x4 | start | main |
close_new | G__write_cats | G_raster_map_is_fp | find_file | G_store |
G__malloc | malloc | malloc_zone_malloc
seth:r.example sprice$
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Changes (by sprice):

  * cpu: Unspecified => OSX/Intel

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:1&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: closed
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: wontfix | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Changes (by glynn):

  * status: new => closed
  * resolution: => wontfix

Comment:

Replying to [ticket:837 sprice]:

> In r.example, there are many small memory leaks in basic functions.

We don't care about "fixed" memory consumption, i.e. the use of a certain
amount of memory for the process overall, or for each library, or for each
map. It's only considered a "leak" if the memory usage will grow during
normal processing, i.e. consuming memory (unnecessarily) per raster row,
or per vector element.

If you try to use the GRASS libraries from a persistent application (one
which performs an unlimited number of operations until it is explicitly
terminated), you lose. The libraries weren't designed for that purpose,
aren't suitable for that purpose, and will probably never be suitable for
that purpose. And memory usage is the least of the issues.

Closing as "wontfix". If you have details of any actual '''leaks''', feel
free to reopen.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:2&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Changes (by sprice):

  * status: closed => reopened
  * resolution: wontfix =>

Comment:

Judging by the 1616 leaked nodes in get_map_row() this is on a per row
basis. But I've never heard of a well written program containing leaks
anyway.

Here's your fix for the first leak:
Change line 58 in lib/gis/gisinit.c from "G_location_path();" to
"G_free(G_location_path());"

The second leak:
Add "G_free(dummy);" at line 1003 in file lib/gis/get_row.c

The third, fourth, & last leaks (in lib/gis/opencell.c) are a bit more
difficult to take care of all permutations:
{{{
--- grass-6.4.0RC5/lib/gis/opencell.c 2009-03-04 01:06:31.000000000
-0700
+++ grass-6.4.svn_src_snapshot_2009_12_12/lib/gis/opencell.c 2009-12-16
00:04:45.000000000 -0700
@@ -188,21 +188,27 @@
             G_warning(_("Unable to open raster map <%s@%s> since it is a
reclass "
                         "of raster map <%s@%s> which does not exist"),
                       name, mapset, r_name, r_mapset);
+ G_free(xmapset);
             return -1;
         }
         break;
      default: /* Error reading cellhd/reclass file */
+ G_free(xmapset);
         return -1;
      }

      /* read the cell header */
- if (G_get_cellhd(r_name, r_mapset, &cellhd) < 0)
- return -1;
+ if (G_get_cellhd(r_name, r_mapset, &cellhd) < 0){
+ G_free(xmapset);
+ return -1;
+ }

      /* now check the type */
      MAP_TYPE = G_raster_map_type(r_name, r_mapset);
- if (MAP_TYPE < 0)
- return -1;
+ if (MAP_TYPE < 0){
+ G_free(xmapset);
+ return -1;
+ }

      if (MAP_TYPE == CELL_TYPE)
         /* set the number of bytes for CELL map */
@@ -211,6 +217,7 @@
         if (CELL_nbytes < 1) {
             G_warning(_("Raster map <%s@%s>: format field in header file
invalid"),
                       r_name, r_mapset);
+ G_free(xmapset);
             return -1;
         }
      }
@@ -220,11 +227,13 @@
                     "Found raster map <%s@%s>, should be <%s>."),
                   name, mapset, name, G__projection_name(cellhd.proj),
                   G__projection_name(G__.window.proj));
+ G_free(xmapset);
         return -1;
      }
      if (cellhd.zone != G__.window.zone) {
         G_warning(_("Raster map <%s@%s> is in different zone (%d) than
current region (%d)"),
                   name, mapset, cellhd.zone, G__.window.zone);
+ G_free(xmapset);
         return -1;
      }

@@ -232,6 +241,7 @@
      if (MAP_TYPE == CELL_TYPE && (unsigned int) CELL_nbytes >
sizeof(CELL)) {
         G_warning(_("Raster map <%s@%s>: bytes per cell (%d) too large"),
                   name, mapset, CELL_nbytes);
+ G_free(xmapset);
         return -1;
      }

@@ -261,6 +271,7 @@
  #else
         G_warning(_("map <%s@%s> is a GDAL link but GRASS is compiled
without GDAL support"),
                   r_name, r_mapset);
+ G_free(xmapset);
         return -1;
  #endif
      }
@@ -268,8 +279,10 @@
         /* now actually open file for reading */
         fd = G_open_old(cell_dir, r_name, r_mapset);

- if (fd < 0)
- return -1;
+ if (fd < 0){
+ G_free(xmapset);
+ return -1;
+ }

      fcb = new_fileinfo(fd);

@@ -298,6 +311,7 @@
             fcb->name = G_store(name);
      }
      fcb->mapset = G_store(mapset);
+ G_free(xmapset);

      /* mark no data row in memory */
      fcb->cur_row = -1;
@@ -977,9 +991,12 @@
         return -1;
      }
      G__file_name(path, "fcell", name, xmapset);
- if (access(path, 0) == 0)
- return 1;
+ if (access(path, 0) == 0){
+ G_free(xmapset);
+ return 1;
+ }
      G__file_name(path, "g3dcell", name, xmapset);
+ G_free(xmapset);
      if (access(path, 0) == 0)
         return 1;

@@ -1002,7 +1019,8 @@
  {
      char path[GPATH_MAX];
      const char *xmapset;
-
+ RASTER_MAP_TYPE cktype;
+
      xmapset = G_find_cell2(name, mapset);
      if (!xmapset) {
         if (mapset && *mapset)
@@ -1013,11 +1031,15 @@
      }
      G__file_name(path, "fcell", name, xmapset);

- if (access(path, 0) == 0)
- return G__check_fp_type(name, xmapset);
+ if (access(path, 0) == 0) {
+ cktype = G__check_fp_type(name, xmapset);
+ G_free(xmapset);
+ return cktype;
+ }

      G__file_name(path, "g3dcell", name, xmapset);
-
+ G_free(xmapset);
+
      if (access(path, 0) == 0)
         return DCELL_TYPE;
  }}}

The next leak in lib/gis/gdal.c:
{{{
@@ -135,12 +139,14 @@
      RASTER_MAP_TYPE map_type;
      FILE *fp;
      struct Key_Value *key_val;
- const char *p;
+ const char *p, *fc;
      DCELL null_val;

- if (!G_find_cell2(name, mapset))
+ if (!(fc = G_find_cell2(name, mapset)))
         return NULL;
-
+
+ G_free(fc);
+
      map_type = G_raster_map_type(name, mapset);
      if (map_type < 0)
         return NULL;
}}}

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:3&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Comment (by glynn):

Replying to [comment:3 sprice]:
> Judging by the 1616 leaked nodes in get_map_row() this is on a per row
basis. But I've never heard of a well written program containing leaks
anyway.

No program leaks, insofar as all of its memory is returned to the system
upon termination. It's just a question of how much memory it uses while
it's running. Most programs could have their memory consumption reduced;
it's a question of whether it's worth the effort and (more importantly)
added complexity.

> Here's your fix for the first leak:
> Change line 58 in lib/gis/gisinit.c from "G_location_path();" to
"G_free(G_location_path());"

This is harmless (it's a fixed amount of memory for the lifetime of the
process), but it's also trivial to fix.

> The second leak:
> Add "G_free(dummy);" at line 1003 in file lib/gis/get_row.c

This is a '''real''' leak, but I'm not sure that it can be fixed this way,
as the return value from G_find_* may not be owned by the caller. The
return value may be a mapset from the current mapset list, in which case
freeing it will break stuff.

Historically, the view has been that memory consumed by G_find_* or
G_open_* isn't an issue; it just means that a process consumes a certain
amount of memory per map. But the null value handling was hacked on later;
the fact that it opens and closes the null value bitmap every few rows was
always known to be inefficient, but I don't think anyone spotted the leak
before.

Possible solutions:

1. Change the null bitmap handling to keep the file open.

The reason why this wasn't done originally is that it doubles the number
of open files, which can cause e.g. r.series to run into OS limits on the
number of open files. It also doubles the number of required fileinfo
slots, but this is less of a problem now that the fileinfo array is
dynamically sized.

2. Make G_find_* return unique allocations, so that they can be freed.
This may increase fixed memory consumption slightly for some modules, as
nothing currently frees the return value. But it will reduce it in the
long run (and in the short run for raster modules by fixing this leak),
and the worst case with it fixed will still be a lot better than the
current worst case.

3. Make open_null_read free the return value. I '''think''' that it will
always be unique in this case, as a mapset is always being passed in.
AFAICT, the return value is only shared if the name is unqualified and the
mapset is NULL or the empty string, indicating that the map should be
looked up using the mapset search path. A fully-qualified name or an
unqualified name with an explicit mapset should always return a unique
value. But if I've overlooked something, freeing the return value will
cause no end of problems.

If this was the only issue, my inclination would be for !#2, as we're
trying to get 6.4 released. In that context, !#1 is too disruptive and !#3
is a hack which, if it's wrong, may not get discovered until it's too
late.

But ...

> The third, fourth, & last leaks (in lib/gis/opencell.c) are a bit more
difficult to take care of all permutations:

[snip]

This is a large part of the reason why we just resign ourselves to each
map requiring a certain amount of memory. And I'd leave it at that, if it
wasn't for the null bitmap handling calling this function repeatedly.

This specific fix can't be applied without first making G_find_* always
return unique allocations.

The fix would be simpler if G_open_cell_old() called G_fatal_error() on
errors. This would make life simpler for modules, which currently need to
check the return value. Off the top of my head, I don't know of a single
case where the caller actually tries to recover from G_open_cell_old()
failing. Doing so would also be a benefit to a small number of modules
which forget to check for errors.

An intermediate solution would be to free the memory under normal
circumstances and not bother about the leak if G_open_cell_old() fails.
It's 99.99% likely that the program is going to terminate immediately in
that situation.

> The next leak in lib/gis/gdal.c:

This isn't a problem. Each map which is opened requires some memory. There
are all sorts of ways in which that memory consumption could be reduced;
some of them are worth the effort, some of them aren't. Again, this
specific fix can't be applied without first making G_find_* always return
unique allocations. If that is done, this would be worthwhile.

Right now, the only thing that I'm certain of is that the null bitmap
handling has to change for 7.0, but it's probably too disruptive for 6.x.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:4&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Comment (by sprice):

So far my biggest reason to get these fixed is the annoyance of finding
the ''real'' leaks among the thousands of trivial leaks reported. This
hasn't tapped out my 16 GB of RAM.

Looking at the code, I also think that #2 makes the most sense, but I see
what you mean about the unique allocation. I'm not sure how 7.0 handles
this, if it's fixed it'd be ok to ignore it.

If I could get 7.0 compiled on my machine, I would be willing to write my
modules for it; thus find leaks/bugs in it. As far as I can tell, it's not
ready for that, though.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:5&gt;
GRASS GIS <http://grass.osgeo.org>

GRASS GIS wrote:

#837: Memory leaks in r.example

Contrary to my original assessment, this is an important issue. There
is a significant memory leak caused by the null bitmap handling, and
we need to decide what to do about it for 6.4 (for 7.x, I intend to
overhaul the null bitmap handling, but that may be too disruptive for
6.4).

--
Glynn Clements <glynn@gclements.plus.com>

Glynn wrote:

GRASS GIS wrote:

> #837: Memory leaks in r.example

Contrary to my original assessment, this is an important
issue. There is a significant memory leak caused by the null
bitmap handling, and we need to decide what to do about it for
6.4 (for 7.x, I intend to overhaul the null bitmap handling,
but that may be too disruptive for 6.4).

for the 6.4.0 release we should hold any changes, it will have
gone this many years already without being a major problem, it
can wait a little longer.

Hamish

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Comment (by glynn):

Replying to [comment:4 glynn]:

> > The third, fourth, & last leaks (in lib/gis/opencell.c) are a bit more
difficult to take care of all permutations:
>
> [snip]
>
> This is a large part of the reason why we just resign ourselves to each
map requiring a certain amount of memory. And I'd leave it at that, if it
wasn't for the null bitmap handling calling this function repeatedly.

My mistake. The null bitmap code calls G_open_old_misc(). That also leaks
memory, but it wouldn't be nearly as much trouble to fix as
G_open_cell_old().

That still leaves us with the question of whether to change G_find_*.
Returning unique allocations would allow the caller to free the returned
mapset string. We could then safely fix the significant leak in the null
bitmap handling (which is leaking memory per-row), but would add a fixed
increase in memory consumption in the case where unqualified map names are
used, at least until we modify everything which calls it (which is around
350 files).

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:6&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.0
Component: default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Comment (by glynn):

Replying to [comment:4 glynn]:

> 3. Make open_null_read free the return value. I '''think''' that it will
always be unique in this case, as a mapset is always being passed in.

Incidentally, this is exactly what it used to do from at least 5.3 (I
don't have an older version) until r30300 (Feb 2008). I can't find any
indication that this was made in response to a bug report. So in spite of
this being a hack, I'm inclined to take this route.

However: that doesn't solve the issue with G_open_old_misc(). That also
used to free the returned mapset, in spite of this being unsafe; this was
fixed in r29711. In theory, it would be possible to fix that by deducing
whether G_find_file2() is returning a unique allocation or a shared
pointer, but that's unspeakably ugly.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:7&gt;
GRASS GIS <http://grass.osgeo.org>

Hamish wrote:

> > #837: Memory leaks in r.example
>
> Contrary to my original assessment, this is an important
> issue. There is a significant memory leak caused by the null
> bitmap handling, and we need to decide what to do about it for
> 6.4 (for 7.x, I intend to overhaul the null bitmap handling,
> but that may be too disruptive for 6.4).

for the 6.4.0 release we should hold any changes, it will have
gone this many years already without being a major problem, it
can wait a little longer.

Actually, it only dates back to Jan-Feb 2008. Prior to that, there was
a bug where G_open_* could free elements of the mapset path.

Essentially, you don't know whether it's safe to free the mapset
string returned from G_find_*. If it isn't and you free it, it can
cause a crash. If it is and you don't free it, you've leaked some
memory.

Most of the time this isn't a problem, but the null bitmap code calls
these functions every NULL_ROWS_INMEM (== 8) rows. For large maps (or
for large numbers of maps, e.g. r.series), this could add up (even if
mapset names are short, there's an overhead for each malloc()'d
block).

--
Glynn Clements <glynn@gclements.plus.com>

Glynn wrote:

Essentially, you don't know whether it's safe to free the mapset
string returned from G_find_*. If it isn't and you free it, it can
cause a crash. If it is and you don't free it, you've leaked some
memory.

Most of the time this isn't a problem, but the null bitmap code calls
these functions every NULL_ROWS_INMEM (== 8) rows. For large maps (or
for large numbers of maps, e.g. r.series), this could add up (even if
mapset names are short, there's an overhead for each malloc()'d
block).

for some perspective, what's the maximum damage? what's the typical damage?

as a "worst plausible case" say you have 50,000 x 50,000 cell raster
output from r.sun, and you have 365 of them you want to run them through
r.series.

as a "typical heavy load case" let's say a 4000x4000 cell raster with
a dozen of them to run through r.series.

if the leak is a *mapset string, from your above comments above and the
G_find_file() code that means GMAPSET_MAX (256) bytes+whatever system
overheads that incurs.

for these two cases that would be:

expected worst case:
  50000/8 * 256 * 365 = 570mb

normal heavy load:
  4000/8 * 256 * 12 = 1.5mb

how's my math?

compare the leak to r.series memory req for that sized map:
does it keep 8 rows of all maps in memory during processing(??)
(not sure, but going with that...)

8 rows * 50k columns * sizeof(DCELL) * 365 maps = 570mb

so total mem use at completion = 1.2gb

if all of the above is correct (& I'm not sure it is), then I would say
that anyone running the above worse case is probably running it on a
system which has >2gb ram, & so not a cause for panic.

Hamish

Hamish wrote:

> Essentially, you don't know whether it's safe to free the mapset
> string returned from G_find_*. If it isn't and you free it, it can
> cause a crash. If it is and you don't free it, you've leaked some
> memory.
>
> Most of the time this isn't a problem, but the null bitmap code calls
> these functions every NULL_ROWS_INMEM (== 8) rows. For large maps (or
> for large numbers of maps, e.g. r.series), this could add up (even if
> mapset names are short, there's an overhead for each malloc()'d
> block).

for some perspective, what's the maximum damage? what's the typical damage?

as a "worst plausible case" say you have 50,000 x 50,000 cell raster
output from r.sun, and you have 365 of them you want to run them through
r.series.

as a "typical heavy load case" let's say a 4000x4000 cell raster with
a dozen of them to run through r.series.

if the leak is a *mapset string, from your above comments above and the
G_find_file() code that means GMAPSET_MAX (256) bytes+whatever system
overheads that incurs.

That's a worst case; I doubt that anyone uses 256-byte mapset names in
practice.

for these two cases that would be:

expected worst case:
  50000/8 * 256 * 365 = 570mb

normal heavy load:
  4000/8 * 256 * 12 = 1.5mb

how's my math?

Seems okay.

compare the leak to r.series memory req for that sized map:
does it keep 8 rows of all maps in memory during processing(??)
(not sure, but going with that...)

It's 8 map rows of null data, one bit per region column (i.e.
resampled horizontally but not vertically).

8 rows * 50k columns * sizeof(DCELL) * 365 maps = 570mb

8 rows * 50k columns * 1/8 * 365 maps = 18.25Mb.

However, the library also stores the current row of decompressed,
unconverted, unresampled raster data (between 1 and 8 bytes per map
column):

  50k columns * 1 * 365 maps = 18.25Mb (1 byte/cell)
  50k columns * 8 * 365 maps = 146Mb (8 bytes/cell)

while r.series itself stores one row of resampled DCELL data per
region column (another 145Mb).

So the worst case would be 570Mb data from the leak versus 182Mb of
actual data usage, which would be a problem. In practice, a
256-character mapset name is a massive over-estimate; if you allow 32
bytes including malloc() overhead, you're down to 71Mb, which is less
of a problem.

The actual numbers change depending upon resampling (horizontal
resampling will affect the memory used by r.series itself, and the
size of the null bitmap; vertical resampling shouldn't affect
anything), the number of map/region rows (more rows leak more but
don't affec any of the buffer sizes) and the size of each G_find_*
allocation. For 50k rows x 1k columns, the numbers would come out much
worse (legitimate memory consumption reduced 50x, leak unchanged).

so total mem use at completion = 1.2gb

if all of the above is correct (& I'm not sure it is), then I would say
that anyone running the above worse case is probably running it on a
system which has >2gb ram, & so not a cause for panic.

Absolute memory figures aren't necessarily meaningful on a server or
multi-user system. It's more useful to consider the ratio of the
actual memory usage to the necessary memory usage. If each process
uses twice as much memory as it should, you can only have half as
many.

--
Glynn Clements <glynn@gclements.plus.com>

I wrote:

The fix would be simpler if G_open_cell_old() called G_fatal_error() on
errors. This would make life simpler for modules, which currently need to
check the return value. Off the top of my head, I don't know of a single
case where the caller actually tries to recover from G_open_cell_old()
failing. Doing so would also be a benefit to a small number of modules
which forget to check for errors.

Can anyone think of a reason *not* to implement this change in 7.0?

As it stands, just about every raster module does something like:

    infd = Rast_open_old(name, "");
    if (infd < 0)
  G_fatal_error(_("Unable to open input map <%s>"), name);

It would be simpler for just about everything if Rast_open_old() just
called G_fatal_error() itself on errors.

I can't think of a single case where the caller does something other
than call G_fatal_error() if opening an input map fails (or output
map, for that); although I did spot a couple of cases (e.g.
v.vol.rst/user1.c) where the caller neglects to check the return
value.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn:

> The fix would be simpler if G_open_cell_old() called
> G_fatal_error() on errors.

...

Can anyone think of a reason *not* to implement this change
in 7.0?

I would expect that anything which wanted to cope with missing
maps would run G_find_* before deciding to do G_open_*.

as this is a rasther big nail in the non-persistent use of
libgis, I'll ask how much of the existing code limiting us to
single use is by design and how much of it is by oversight?
Would we be fundamentally altering the design POV with this
change? (not in a practical sense, but in a design intentention
sense)

400 global variables are not an insurmountable number to slowly
chip away at IMO, just hit each libary one by one for some
years..

Hamish

Hamish wrote:

> > The fix would be simpler if G_open_cell_old() called
> > G_fatal_error() on errors.
...
> Can anyone think of a reason *not* to implement this change
> in 7.0?

I would expect that anything which wanted to cope with missing
maps would run G_find_* before deciding to do G_open_*.

as this is a rasther big nail in the non-persistent use of
libgis, I'll ask how much of the existing code limiting us to
single use is by design and how much of it is by oversight?
Would we be fundamentally altering the design POV with this
change? (not in a practical sense, but in a design intentention
sense)

The libraries were designed for use by modules. If they work for other
uses, that's coincidence.

400 global variables are not an insurmountable number to slowly
chip away at IMO, just hit each libary one by one for some
years..

The downside with trying to generalise the libraries is that it makes
development more complicated for the most common case.

Most of GRASS' "value" is in its modules, and IMHO we should do
whatever we can to reduce the effort involved in writing modules.

--
Glynn Clements <glynn@gclements.plus.com>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@…
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.4
Component: Default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------
Changes (by neteler):

  * milestone: 6.4.0 => 6.4.4

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:8&gt;
GRASS GIS <http://grass.osgeo.org>

#837: Memory leaks in r.example
----------------------+-----------------------------------------------------
  Reporter: sprice | Owner: grass-dev@…
      Type: defect | Status: reopened
  Priority: normal | Milestone: 6.4.4
Component: Default | Version: svn-releasebranch64
Resolution: | Keywords:
  Platform: MacOSX | Cpu: OSX/Intel
----------------------+-----------------------------------------------------

Comment(by sprice):

Hey there, just following up. I noticed that "Change line 58 in
lib/gis/gisinit.c from "G_location_path();" to
"G_free(G_location_path());"", although trivial, still exists in the GRASS
7 SVN:

http://grass.osgeo.org/programming7/gisinit_8c_source.html#l00041

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/837#comment:9&gt;
GRASS GIS <http://grass.osgeo.org>