[GRASS-user] ERROR: Bytes do not match file size with r.in.bin (but file size is correct!!)

Hamish · May 7, 2013, 9:12pm

Ludovico wrote:

I get this error while running the following command:

...

r.in.bin -f input=inputfile.bin output=outputmap bytes=4 \
  n=51:05:20.4N s=41:21:50.4N w=5:08:31.2W e=9:33:36E r=19450 \
  c=29404 anull=-9999.0 --overwrite

...

WARNING: File Size -2007336096 ... Total Bytes 2287631200
ERROR: Bytes do not match file size 256

Note the minus sign in front of the first value given for the
file size in the warning message. Note also that when I run
ls -l on the binary input file, I get the correct size
(19450x29404x4):

-rw-rw-r-- 1 user group 2287631200 May 7 15:05 inputfile.bin

Any suggestion on the origin of this error?

What version of GRASS are you using? That overflow should have
been fixed just after 6.4.1 was released.

Is there any limit on the file size for importing binaries
into GRASS?

There shouldn't be, other than what the operating system is
limited by.

Hamish

Ludovico_Nicotina · May 8, 2013, 8:17am

Thank you for your answers. Some information about my system:

I'm running on a computational node under Linux with 64 GB of RAM. The machine architecture is x86_64 and the kernel is also 64-bit (getconf LONG_BIT outputs 64).

The version of GRASS I am running is 6.4.1.

Also, I have no problem allocating a 4-byte array of size 19450x29404x15 in Fortran on the same machine, so I would exclude system limitations; it is likely to be a GRASS issue.

Thanks,

Ludovico

Hamish · May 8, 2013, 9:27am

Ludovico wrote:

Thank you for your answers. Some information about my system:

I'm running on a computational node under Linux with 64 GB of
RAM. The machine architecture is x86_64 and the kernel is
also 64-bit (getconf LONG_BIT outputs 64).

The version of GRASS I am running is 6.4.1

You'll have to upgrade to a newer version. The fix for r.in.bin
was added just a few days after the release of 6.4.1, which was
two years ago.

I'd suggest 6.4.3rc3, get in early and help us test the upcoming
release.

Hamish

Kapo_Coulibaly · May 9, 2013, 1:48pm

Hi Ludovico,

You may also want to check how the binary was written in Fortran. Depending on the compiler and/or the access type, Fortran can write extra bytes before and after every record (these are called record markers). In that case the file would be bigger than the data it is supposed to contain. It is explained here: http://paulbourke.net/dataformats/reading/

I copied the content of the site below in case the link doesn’t make it.

Problem

Ever wanted to read binary files written by a FORTRAN program with a C/C++ program? Not such an unusual or unreasonable request but FORTRAN does some strange things … consider the following FORTRAN code, where “a” is a 3D array of 4 byte floating point values.

        open(60,file=filename,status='unknown',form='unformatted')
        write(60) nx,ny,nz
        do k = 1,nz
          do j = 1,ny
            write(60) (a(i,j,k),i=1,nx)
          enddo
        enddo
        close(60)

What you will end up with is not a file that is (4 * nx) * ny * nz + 12 bytes long as it would be for the equivalent in most (if not all) other languages! Instead it will be nz * ny * (4 * nx + 8) + 20 bytes long. Why?

Reason

Each time the FORTRAN write is issued, a “record” is written. The record consists of a 4 byte header, then the data, then a trailer that matches the header. The 4 byte header and trailer each hold the number of bytes written in the data section. So the following

        write(60) nx,ny,nz

gets written on the disk as follows, where nx, ny, nz are each 4 bytes and the other numbers are 2 byte integers written in decimal (each pair of them makes up one 4 byte marker holding the value 12)

        0 12 nx ny nz 0 12

The total length written is 20 bytes. Similarly, the line

        write(60) (a(i,j,k),i=1,nx)

gets written as follows, assuming nx is 1024 and “a” is real*4, so that each marker holds 4 * 1024 = 4096

        0 4096 a(1,j,k) a(2,j,k) .... a(1024,j,k) 0 4096

The total length is 4104 bytes. Fortunately, once this is understood, it is trivial to read the correct things in C/C++.
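
The record layout described above can be read back with a few lines of code. The following is only a minimal Python sketch, not code from the quoted page: it assumes 4 byte markers and little-endian byte order (both depend on the compiler and platform), and the helper names `read_fortran_record` and `write_fortran_record` are made up for this illustration. The writer exists only to build a small in-memory test file.

```python
import io
import struct


def write_fortran_record(stream, payload, endian="<"):
    """Write payload framed by two identical 4 byte length markers (assumed layout)."""
    marker = struct.pack(endian + "i", len(payload))
    stream.write(marker + payload + marker)


def read_fortran_record(stream, endian="<"):
    """Read one record: 4 byte length, payload, 4 byte length. Returns None at EOF."""
    header = stream.read(4)
    if not header:
        return None  # clean end of file
    if len(header) != 4:
        raise ValueError("truncated record header")
    (nbytes,) = struct.unpack(endian + "i", header)
    payload = stream.read(nbytes)
    if len(payload) != nbytes:
        raise ValueError("truncated record payload")
    trailer = stream.read(4)
    if len(trailer) != 4 or struct.unpack(endian + "i", trailer)[0] != nbytes:
        raise ValueError("record trailer does not match header")
    return payload


# Build a tiny file in memory: one header record (nx, ny, nz) and one data row.
buf = io.BytesIO()
write_fortran_record(buf, struct.pack("<3i", 4, 2, 1))
write_fortran_record(buf, struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))
demo_size = buf.tell()  # 20 bytes for the header record + 24 for the row

buf.seek(0)
dims = struct.unpack("<3i", read_fortran_record(buf))
row = struct.unpack("<4f", read_fortran_record(buf))
print(dims, row, demo_size)
```

Checking that the trailer equals the header is a cheap way to detect a wrong marker size or byte order early, since a mismatch usually shows up on the very first record.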

A consequence that is a bit shocking for many programmers is that the above code produces a file about 1/3 the size of one created with this code.

        open(60,file=filename,status='unknown',form='unformatted')
        write(60) nx,ny,nz
        do k = 1,nz
          do j = 1,ny
            do i = 1,nx
              write(60) a(i,j,k)
            enddo
          enddo
        enddo
        close(60)

In this case each element of a is written in one record and consumes 12 bytes for a total file size of nx * ny * nz * 12 + 20.
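
The three size formulas quoted above can be compared directly. This is a small Python sketch under the same assumptions as the text (4 byte elements, 4 byte record markers, three 4 byte integers in the first record); the function names are hypothetical and chosen only for this example.

```python
def plain_size(nx, ny, nz, elem=4):
    """Raw dump with no record markers: data plus the three 4 byte dimensions."""
    return elem * nx * ny * nz + 3 * 4


def row_records_size(nx, ny, nz, elem=4, marker=4):
    """One record per row of nx values, plus the (nx, ny, nz) header record."""
    header = 3 * 4 + 2 * marker
    return nz * ny * (elem * nx + 2 * marker) + header


def element_records_size(nx, ny, nz, elem=4, marker=4):
    """One record per element, plus the (nx, ny, nz) header record."""
    header = 3 * 4 + 2 * marker
    return nx * ny * nz * (elem + 2 * marker) + header


nx, ny, nz = 1024, 10, 1
print(plain_size(nx, ny, nz))            # 40972
print(row_records_size(nx, ny, nz))      # 41060
print(element_records_size(nx, ny, nz))  # 122900
```

For these dimensions the row-per-record file is about a third of the element-per-record file, which matches the ratio mentioned in the text.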

Hope it helps

Ludovico_Nicotina · May 9, 2013, 1:55pm

Hi and thank you both,

In my case the problem is not the Fortran record markers; the files are exactly the size they are supposed to be. Additionally, I have double-checked and found that I’m actually running GRASS 6.4.2, so I’m wondering whether large file support is built into that distribution or whether it has to be explicitly enabled at installation time. Do you know if there is a way to check that, and whether it can be added without repeating the entire installation?

Thanks again,

PS. Hamish, I will probably switch to 6.4.3rc3 anyway. Would this solve the problem automatically in your opinion?

Glynn_Clements1 · May 10, 2013, 1:39pm

kapo coulibaly wrote:

You also want to check how the binary was written in FORTRAN. Depending on
the compiler and/or the access type, Fortran can write extra bytes before
and after every record (they are called record markers).

That isn't his problem:

r.in.bin -f input=inputfile.bin output=outputmap bytes=4 \
  n=51:05:20.4N s=41:21:50.4N w=5:08:31.2W e=9:33:36E r=19450 \
  c=29404 anull=-9999.0 --overwrite

...

WARNING: File Size -2007336096 ... Total Bytes 2287631200
ERROR: Bytes do not match file size 256

19450 * 29404 * 4 = 2287631200 = 2^32 - 2007336096

So the file is exactly the size it's supposed to be; it's just
overflowing the range of a 32-bit signed integer.
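
The wrap-around can be reproduced with plain arithmetic. The Python sketch below only illustrates the two's-complement overflow described above; it is not taken from the r.in.bin source, and the helper `to_int32` is a hypothetical name used for this example.

```python
INT32_MAX = 2**31 - 1


def to_int32(value):
    """Interpret an integer as a wrapped 32-bit signed (two's complement) value."""
    return (value + 2**31) % 2**32 - 2**31


rows, cols, bytes_per_cell = 19450, 29404, 4
total = rows * cols * bytes_per_cell

print(total)              # 2287631200, the true file size
print(total > INT32_MAX)  # True, so it does not fit in a signed 32-bit integer
print(to_int32(total))    # -2007336096, the value shown in the warning
```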

Upgrading is recommended, but in the absence of that, it would be
possible to split the file into two halves with dd, e.g.:

dd if=inputfile.bin of=file1.bin bs=117616 count=9725
dd if=inputfile.bin of=file2.bin bs=117616 count=9725 skip=9725

then import them separately and join them with r.patch, e.g.:

  r.in.bin -f input=file1.bin output=map1 bytes=4 \
    n=51:05:20.4N s=46:13:35.4N w=5:08:31.2W e=9:33:36E \
    r=9725 c=29404 anull=-9999.0 --overwrite
  r.in.bin -f input=file2.bin output=map2 bytes=4 \
    n=46:13:35.4N s=41:21:50.4N w=5:08:31.2W e=9:33:36E \
    r=9725 c=29404 anull=-9999.0 --overwrite
  r.patch input=map1,map2 output=outputmap --overwrite
  g.remove rast=map1,map2
  rm file1.bin file2.bin
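
The numbers used in the dd and r.in.bin commands above can be derived from the original region. This is a rough Python sketch of that arithmetic, assuming an even number of rows and a file stored row by row from north to south; the helper names are invented for this example.

```python
from decimal import Decimal


def dms_to_seconds(deg, minutes, seconds):
    """Convert degrees:minutes:seconds to arc seconds, using exact decimals."""
    return Decimal(deg) * 3600 + Decimal(minutes) * 60 + Decimal(seconds)


def seconds_to_dms(total):
    """Convert arc seconds back to a (degrees, minutes, seconds) tuple."""
    deg, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return int(deg), int(minutes), seconds


rows, cols, bytes_per_cell = 19450, 29404, 4

block_size = cols * bytes_per_cell  # one raster row, used as bs= for dd
half_rows = rows // 2               # used as count= and skip= for dd

north = dms_to_seconds(51, 5, "20.4")
south = dms_to_seconds(41, 21, "50.4")
# Latitude of the boundary between the northern and southern halves.
middle = north - (north - south) * half_rows / rows

print(block_size, half_rows, seconds_to_dms(middle))
```

Each half then holds 9725 * 117616 = 1143815600 bytes, which is below the signed 32-bit limit of 2147483647.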

--
Glynn Clements <glynn@gclements.plus.com>