[GRASS-user] Large File Support (LFS)

Hi,

I compiled and installed GRASS 6.4 with large file support (Ubuntu 9.10 on a
32-bit machine). All went well with no errors. I ran my lidar script to test
it, and it failed straight away on the 3.4 GB ASCII file import with r.in.xyz
to get stats. I tried a 500 MB file and it was fine.

I found this: http://grass.osgeo.org/wiki/Large_File_support

and guess that is why I never had a problem on my 64-bit machine.

What is the current state of this issue? Is there an easy fix (considering I
am still relatively new to Linux), or do I have to wait for the completion of
the wish list on the wiki? What are the likely timescales? Is this an issue
with GRASS 7 also?

Cheers

John

On Saturday 20 March 2010 15:37:19 John Tate wrote:

> I found this: http://grass.osgeo.org/wiki/Large_File_support

sorry: http://grass.osgeo.org/wiki/Large_File_Support

Cheers

John

John Tate wrote:

> What is the current state of this issue? Is there an easy fix
> (considering I am still relatively new to Linux), or do I have to wait
> for the completion of the wish list on the wiki? What are the likely
> timescales? Is this an issue with GRASS 7 also?

In 7.0, --enable-largefile causes everything to be compiled with
-D_FILE_OFFSET_BITS=64. Most uses of fseek/ftell have been replaced
with G_fseek() and G_ftell(), which accept and return an off_t, and
which use fseeko/ftello where available.
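
Roughly, the wrappers look like this (a simplified sketch, not the
actual GRASS source; HAVE_FSEEKO/HAVE_FTELLO stand in for the
configure checks):

#include <stdio.h>
#include <errno.h>
#include <sys/types.h>

int G_fseek(FILE *fp, off_t offset, int whence)
{
#ifdef HAVE_FSEEKO
    return fseeko(fp, offset, whence);
#else
    long loff = (long) offset;

    if ((off_t) loff != offset) {
        /* offset doesn't fit in a long: fail rather than truncate */
        errno = EOVERFLOW;
        return -1;
    }
    return fseek(fp, loff, whence);
#endif
}

off_t G_ftell(FILE *fp)
{
#ifdef HAVE_FTELLO
    return ftello(fp);
#else
    return (off_t) ftell(fp);
#endif
}

With -D_FILE_OFFSET_BITS=64 on a 32-bit glibc system, off_t is 64 bits,
so offsets beyond 2 GiB survive the round trip.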

I have no idea whether widespread LFS will make it into 6.x.

--
Glynn Clements <glynn@gclements.plus.com>

On Saturday 20 March 2010 18:31:58 Glynn Clements wrote:

> In 7.0, --enable-largefile causes everything to be compiled with
> -D_FILE_OFFSET_BITS=64. Most uses of fseek/ftell have been replaced
> with G_fseek() and G_ftell(), which accept and return an off_t, and
> which use fseeko/ftello where available.

OK, I didn't get that from the web page either. It goes over my head :-)
A workaround, I guess?
So in 7.0 I can use files over 2 GB on a 32-bit machine?

> I have no idea whether widespread LFS will make it into 6.x.

Right, time to rewrite the bash scripts...

Cheers

John

John wrote:

> I compiled and installed GRASS 6.4 with large file support
> (Ubuntu 9.10 on a 32-bit machine). All went well with no errors.
>
> I ran my lidar script to test it, and it failed straight away on
> the 3.4 GB ASCII file import with r.in.xyz to get stats. I tried
> a 500 MB file and it was fine.

How did it fail? What was the exact error message and command
line used? What does "g.region -p" say?

> I found this: http://grass.osgeo.org/wiki/Large_File_support
>
> and guess that is why I never had a problem on my 64-bit machine.

No, LFS should be mostly irrelevant for r.in.xyz. The version in
6.4 without LFS should handle input files of hundreds of
gigabytes just fine. The only thing which might get messed up
is the % done, but that's just a harmless informational message.

Worst comes to worst, pipe from stdin instead of reading from a
file, but I'm skeptical that LFS is the cause.

I suspect you are running out of memory; for very large regions
with extended stats (median, percentile, skewness, trimmean) you
should make use of percent=25 or so, as needed.
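
e.g. something like (a made-up file name; r.in.xyz reads from stdin
when input is "-"):

cat /path/to/huge_lidar.asc | r.in.xyz input=- output=tmp \
    method=min fs=, percent=25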

Or if there's a bug, I'd like to know about it.

Hamish

Hamish wrote:

> How did it fail? What was the exact error message and command
> line used? What does "g.region -p" say?

The successful r.in.xyz results below vary slightly due to merging different swaths to get a progressively larger file:

g.region -p

projection: 0 (x,y)
zone:       0
north:      406000
south:      392000
west:       401000
east:       412000
nsres:      1
ewres:      1
rows:       14000
cols:       11000
cells:      154000000

1.3 GB file:
r.in.xyz -s -g input=/home/bob2/rawdata/clean/test.asc output=test method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=100
n=405857.470000 s=392290.270000 e=410776.540000 w=401171.890000 b=174.140000 t=631.580000

1.5 GB file:
r.in.xyz -s -g input=/home/bob2/rawdata/clean/test2.asc output=test2 method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=100
n=405857.470000 s=392290.270000 e=410776.540000 w=401171.890000 b=174.010000 t=726.180000

1.9 GB file:
r.in.xyz -s -g input=/home/bob2/rawdata/clean/test3.asc output=test3 method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=100
n=405857.470000 s=392290.270000 e=411861.910000 w=401171.890000 b=174.010000 t=726.180000

2.1 GB file:
r.in.xyz -s -g input=/home/bob2/rawdata/clean/test4.asc output=test4 method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=100
Unable to open input file </home/bob2/rawdata/clean/test4.asc>

r.in.xyz -s -g input=/home/bob2/rawdata/clean/test4.asc output=test4 method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=25
Unable to open input file </home/bob2/rawdata/clean/test4.asc>

3.4 GB file:
r.in.xyz -s -g input=/home/bob2/rawdata/clean/allmerged.asc output=intest method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=100
Unable to open input file </home/bob2/rawdata/clean/allmerged.asc>

r.in.xyz -s -g input=/home/bob2/rawdata/clean/allmerged.asc output=intest method=min type=FCELL fs=, x=6 y=7 z=8 zscale=1.0 percent=25
Unable to open input file </home/bob2/rawdata/clean/allmerged.asc>

> I suspect you are running out of memory; for very large regions
> with extended stats (median, percentile, skewness, trimmean) you
> should make use of percent=25 or so, as needed.

So anything above 2 GB fails, even when using percent=25, which for the 2.1 GB file should have had an effect if memory were the issue.

It does not really use the memory (or swap) when using the '-s' flag: memory hangs around 45% and swap stays at 0%, but the CPU hits 100%, varying between processors (watched in the system monitor).

If the '-s' flag is removed to create a raster, it uses 100% of memory and <2 GB files start to process fine (I didn't let them finish, to save time in testing), but >2 GB files give the same message: 'Unable to open input file...'

This 32-bit machine only has 1 GB of RAM (the 64-bit machine has 2 GB and never had a problem), but, as observed in the system monitor, I don't think memory is an issue for scanning the data with the '-s' flag.

> Or if there's a bug, I'd like to know about it.

What do you reckon?

I compiled and installed yesterday (20/03/10) following the specific Ubuntu page, but said yes to large file support (and ignored the slight typo over folder locations, 'grass_current / grass_trunk').
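
(i.e. ./configure --enable-largefile together with the other options
from the wiki page)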

John

Hamish wrote:

> > I found this: http://grass.osgeo.org/wiki/Large_File_support
> >
> > and guess that is why I never had a problem on my 64-bit machine.

> No, LFS should be mostly irrelevant for r.in.xyz. The version in
> 6.4 without LFS should handle input files of hundreds of
> gigabytes just fine. The only thing which might get messed up
> is the % done, but that's just a harmless informational message.

_FILE_OFFSET_BITS=64 causes fopen() to be redirected to fopen64().
Without it, fopen() will fail on a file >=2GiB.
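
A quick way to see it (a test sketch; on a 32-bit system, compile it
once without and once with -D_FILE_OFFSET_BITS=64, then point it at a
file larger than 2 GiB):

#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(int argc, char **argv)
{
    FILE *fp;

    if (argc < 2)
        return 2;

    fp = fopen(argv[1], "r");
    if (!fp) {
        /* without LFS this typically fails with EOVERFLOW:
           "Value too large for defined data type" */
        fprintf(stderr, "fopen: %s\n", strerror(errno));
        return 1;
    }
    fputs("opened OK\n", stderr);
    fclose(fp);
    return 0;
}

That "Unable to open input file" message is consistent with fopen()
failing this way.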

> Worst comes to worst, pipe from stdin instead of reading from a
> file, but I'm skeptical that LFS is the cause.

Reading from stdin shouldn't be a problem.

--
Glynn Clements <glynn@gclements.plus.com>