[GRASS-dev] [GRASS GIS] #2350: G7: r.texture large file support problem

#2350: G7: r.texture large file support problem
----------------------------+-----------------------------------------------
Reporter: neteler | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 7.0.0
Component: Raster | Version: unspecified
Keywords: LFS, r.texture | Platform: Linux
      Cpu: x86-64 |
----------------------------+-----------------------------------------------
There seems to be a large file support issue under certain (?)
circumstances:

{{{
# Scientific Linux, 64bit:

# region size: 156044800 cells
r.texture input=pca_43140.1 prefix=pca_43140.1 \
   size=9 distance=1 method=asm,se,var
...
Reading raster map...
Calculating 3 texture measures

WARNING: Unable to rename null file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.1' to
'/grassdata/patUTM32/alba_classification/cell_misc/x43140_2006_pca.1_ASM/null'
WARNING: Unable to rename cell file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.0' to
'/grassdata/patUTM32/alba_classification/fcell/x43140_2006_pca.1_ASM'
WARNING: Unable to write quant rules: raster map <x43140_2006_pca.1_ASM> is
          integer
WARNING: Unable to rename null file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.3' to
'/grassdata/patUTM32/alba_classification/cell_misc/x43140_2006_pca.1_Var/null'
WARNING: Unable to rename cell file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.2' to
'/grassdata/patUTM32/alba_classification/fcell/x43140_2006_pca.1_Var'
WARNING: Unable to write quant rules: raster map <x43140_2006_pca.1_Var> is
          integer
WARNING: Unable to rename null file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.5' to
'/grassdata/patUTM32/alba_classification/cell_misc/x43140_2006_pca.1_SE/null'
WARNING: Unable to rename cell file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.4' to
'/grassdata/patUTM32/alba_classification/fcell/x43140_2006_pca.1_SE'
WARNING: Unable to write quant rules: raster map <x43140_2006_pca.1_SE> is
          integer
}}}

Result (failure):

{{{
r.info x43140_2006_pca.1_ASM
  +----------------------------------------------------------------
  | Layer: x43140_2006_pca.1_ASM Date: Tue Jun 24 17:17:35 2014
  | Mapset: alba_classification Login of Creator: lucadelu
  | Location: patUTM32
...
  |
  | Type of Map: raster Number of Categories: 0
  | Data Type: CELL
  | Rows: 11680
  | Columns: 13360
  | Total Cells: 156044800
  | Projection: UTM (zone 32)
  | N: 5124056.897747 S: 5118216.897747 Res: 0.5
  | E: 667131.354793 W: 660451.354793 Res: 0.5
  | Range of data: min = -2147483648 max = -2147483648
...
}}}

Using a different machine, it worked fine:

{{{
# Ubuntu 14 64 bit machine:

GRASS 7.0.0svn (Trento-landchange):~ > r.info pca_43140.1_ASM
  +--------------------------------------------------------------
  | Layer: pca_43140.1_ASM Date: Wed Jun 25 15:04:01 2014
  ...
  | Title: ( pca_43140.1_ASM )
  ...
  | Type of Map: raster Number of Categories: 0
  | Data Type: FCELL
  | Rows: 11680
  | Columns: 13360
  | Total Cells: 156044800
  | Projection: UTM (zone 32)
  | N: 5124056.897747 S: 5118216.897747 Res: 0.5
  | E: 667131.354793 W: 660451.354793 Res: 0.5
  | Range of data: min = 0.00737847 max = 1
...
}}}

I am not sure what to look for now. Any suggestions?

Is r.texture supporting LFS?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2350>
GRASS GIS <http://grass.osgeo.org>

Comment(by glynn):

Replying to [ticket:2350 neteler]:

> There seems to be a large file support issue under certain (?)
> circumstances:

Why do you believe that this is related to LFS?

>
{{{
WARNING: Unable to rename null file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.1' to
'/grassdata/patUTM32/alba_classification/cell_misc/x43140_2006_pca.1_ASM/null'
WARNING: Unable to rename cell file
          '/grassdata/patUTM32/alba_classification/.tmp/blade21/2161.0' to
'/grassdata/patUTM32/alba_classification/fcell/x43140_2006_pca.1_ASM'
}}}

These warnings are generated by Rast_close() or Rast_unopen(), and
correspond to a rename() system call failing.

Possible reasons for failure are listed in the rename(2) manual page, but
the most likely reason is filesystem permissions. Ensure that you own all
of the directories within the mapset, and that the owner has write
permission on them.

If you're using group-writeable mapsets (suppressing the mapset ownership
check), be aware that a user having bit 0020 (group-write) set in their
umask will cause this sort of issue (subdirectories created by commands
run by that user won't be writeable by anyone else, resulting in maps
which cannot be overwritten or removed). This is a large part of the
reason why the mapset ownership check exists.

Similar issues can arise from copying maps using standard (non-GRASS)
tools (if run by a privileged user, they may copy the ownership).

> Is r.texture supporting LFS?

Modules typically don't need to do anything regarding LFS; the support is
in the libraries. The main issue which affects modules is that they
shouldn't assume that cell counts will fit into an "int" or even a "long".
But even failing to do so won't have any effect upon I/O.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2350#comment:1>

On Wed, Jun 25, 2014 at 4:35 PM, GRASS GIS <trac@osgeo.org> wrote:

> > Is r.texture supporting LFS?
>
> Modules typically don't need to do anything regarding LFS; the support is
> in the libraries. The main issue which affects modules is that they
> shouldn't assume that cell counts will fit into an "int" or even a "long".
> But even failing to do so won't have any effect upon I/O.

Can you please explain what the modules are allowed to do? r.example does
not tell [1].

[1]
http://trac.osgeo.org/grass/browser/grass/trunk/doc/raster/r.example/main.c?rev=40771#L69

Comment(by neteler):

Replying to [comment:1 glynn]:
> Replying to [ticket:2350 neteler]:
>
> > There seems to be a large file support issue under certain (?)
> > circumstances:
>
> Why do you believe that this is related to LFS?

I got the idea from

{{{
Range of data: min = -2147483648 max = -2147483648
}}}

...
> These warnings are generated by Rast_close() or Rast_unopen(), and
> correspond to a rename() system call failing.

Would it be possible to make the errno more visible or "clear"?

> Possible reasons for failure are listed in the rename(2)
> manual page, but the most likely reason is filesystem
> permissions. Ensure that you own all of the directories within
> the mapset, and that the owner has write permission on them.

Yes, I checked and the mapset owner has write permissions. So, the
situation is this:

{{{
/grassdata/patUTM32/alba_classification
- /grassdata/ is mounted to the actual blade (cluster system) via NFS
- /grassdata/patUTM32/: I am owner of that, group write is set
- /grassdata/patUTM32/alba_classification/: owned by the user having
problems
}}}

We have used this setup for years without trouble.

> If you're using group-writeable mapsets (suppressing the mapset
> ownership check), be aware that a user having bit 0020 (group-write)
> set in their umask will cause this sort of issue

Here are the various permissions:

{{{
[neteler@blade21 ~]$ ls -la /grassdata
lrwxrwxrwx 1 root root 20 Mar 3 2012 /grassdata -> /storage/2/grassdata/

[neteler@blade21 ~]$ ls -la /grassdata/ | grep patUTM
drwxrwxr-x 97 neteler gis 4096 Jun 23 16:23 patUTM32/

[neteler@blade21 ~]$ ls -la /grassdata/patUTM32/ | grep alba
drwxr-xr-x 14 lucadelu gis 4096 Jun 25 12:24 alba_classification/

[neteler@blade21 ~]$ ls -la /grassdata/patUTM32/alba_classification/
total 4260
drwxr-xr-x 14 lucadelu gis 4096 Jun 25 12:24 ./
drwxrwxr-x 97 neteler gis 4096 Jun 23 16:23 ../
-rw------- 1 lucadelu gis 1817 Jun 24 16:05 .bash_history
-rw-r--r-- 1 lucadelu gis 886 Jun 24 15:41 .bashrc
drwx------ 2 lucadelu gis 4096 Jun 25 17:17 cats/
drwx------ 2 lucadelu gis 4096 Jun 25 17:17 cell/
drwx------ 2 lucadelu gis 4096 Jun 25 17:17 cellhd/
drwx------ 47 lucadelu gis 4096 Jun 25 17:17 cell_misc/
drwx------ 2 lucadelu gis 4096 Jun 25 17:11 colr/
-rw------- 1 lucadelu gis 12 Jun 23 16:15 CURGROUP
drwx------ 2 lucadelu gis 4096 Jun 25 17:17 fcell/
drwx------ 2 lucadelu gis 10 Jun 23 16:15 g3dcell/
drwx------ 10 lucadelu gis 4096 Jun 23 16:15 group/
drwx------ 2 lucadelu gis 4096 Jun 25 17:17 hist/
-rw------- 1 lucadelu gis 70 Jun 24 09:57 legend
-rw-r--r-- 1 lucadelu gis 4264511 Jun 25 17:17 logfile
-rw-r--r-- 1 lucadelu gis 9916 Jun 24 17:18 logfile_size9
-rw-r--r-- 1 lucadelu gis 6810 Jun 25 15:14 method.py
drwx------ 2 lucadelu gis 30 Jun 25 15:56 sqlite/
drwxr-xr-x 3 lucadelu gis 28 Jun 24 17:17 .tmp/
-rw------- 1 lucadelu gis 81 Jun 23 16:15 VAR
drwx------ 9 lucadelu gis 4096 Jun 25 15:53 vector/
-rw------- 1 lucadelu gis 355 Jun 25 17:11 WIND
}}}

> (subdirectories created by commands run by that user won't be writeable
> by anyone else, resulting in maps which cannot be overwritten or
> removed). This is a large part of the reason why the mapset ownership
> check exists.

I am not sure yet whether the permissions cited above cause the error.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2350#comment:2>

Vaclav Petras wrote:

> > Is r.texture supporting LFS?
>
> Modules typically don't need to do anything regarding LFS; the support is
> in the libraries. The main issue which affects modules is that they
> shouldn't assume that cell counts will fit into an "int" or even a "long".
> But even failing to do so won't have any effect upon I/O.

> Can you please explain what the modules are allowed to do? r.example does
> not tell [1].

The most common way for the issue to arise is multiplying the number
of rows by the number of columns to obtain the total number of cells.
Most modules have no need to do this, but it is occasionally done e.g.
when calculating statistics or storing the data in a temporary file.

The number of rows and columns can reasonably be assumed to fit into a
signed 32-bit integer, but their product cannot.

Even on a system with a 64-bit "long"[1], multiplying 2 "int"s will
produce an "int" result, and assigning the result to a "long" variable
doesn't change that. E.g.

  int nrows = Rast_window_rows();
  int ncols = Rast_window_cols();
  long ncells = nrows * ncols;

will truncate the result of the multiplication to an "int" (which is
32 bits on all mainstream platforms) then expand the truncated value
to 64 bits in the assignment. To perform the multiplication using
"long", one of the arguments must be converted, e.g.:

  long ncells = (long) nrows * ncols;

[1] This doesn't include 64-bit versions of Windows, where "long" is
only 32 bits for compatibility reasons.

The issue isn't strictly related to LFS; due to compression, it's
possible for a raster with more than 2^31 cells to take up less than
2 GiB on disk. LFS just makes the issue more likely to arise in practice.

--
Glynn Clements <glynn@gclements.plus.com>

Comment(by glynn):

Replying to [comment:2 neteler]:
> > Why do you believe that this is related to LFS?
>
> I got the idea from
>
{{{
Range of data: min = -2147483648 max = -2147483648
}}}

LFS issues wouldn't (directly) cause this. -2147483648 is -2^31^ = INT_MIN
= (int)0x80000000. I believe that it arises from casting a double to an
int where the value isn't representable as an int:
{{{
#include <stdio.h>

int main(void)
{
     double x = 1.0e12;
     double y = 0.0 / 0.0;
     printf("%g %d\n", x, (int) x);
     printf("%g %d\n", y, (int) y);
     return 0;
}
}}}
produces:
{{{
1e+12 -2147483648
-nan -2147483648
}}}

> > These warnings are generated by Rast_close() or Rast_unopen(), and
> > correspond to a rename() system call failing.
>
> Would it be possible to make the errno more visible or "clear"?

Try r61048.

> Here are the various permissions:

What are the permissions and ownership for
{{{
/grassdata/patUTM32/alba_classification/.tmp/blade21
/grassdata/patUTM32/alba_classification/cell_misc/*
}}}
?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2350#comment:3>