[GRASS5] new vector2

Hi again,

Sorry, my last mail was somehow mangled on the way; here is the full text:

I would like to discuss some problems related to the new vector format/library I am working on.

1) Little or big? Dig files (version 4) are written in the portable vector format, which
uses big endian byte order (note for David: I was wrong in my last mail).
On little endian machines (i386, for example) the byte order of each number must be
swapped. My test with d.vect shows that this conversion takes 8% of the run time.
That is a long time. The question is whether we should keep big endian or change to little
endian. Are there any statistics about the platforms used by GRASS users? Maybe the
FTP server log of binary downloads could help; is it somewhere on the Web, Markus?
Does anybody know of a web page with information about the number of computers
in use worldwide for each platform? Any idea is welcome.
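
To make the per-number conversion concrete, here is a minimal C sketch of decoding big endian values from a raw buffer. The function names are hypothetical and not part of the GRASS library. It assumes 64-bit IEEE 754 doubles whose in-memory byte order matches that of integers on the host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a 32-bit unsigned integer stored big endian.
 * Works on any host because it assembles the value with shifts. */
static uint32_t be32_decode(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Hypothetical helper: decode a double stored big endian.
 * Assumption: double is a 64-bit IEEE 754 value on this host. */
static double be_double_decode(const unsigned char *b)
{
    uint64_t bits = 0;
    double d;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | b[i];   /* most significant byte first */
    memcpy(&d, &bits, sizeof d);     /* reinterpret the bit pattern */
    return d;
}
```

Every coordinate read from a big endian file goes through work like this on a little endian host, which is where a measurable cost such as the 8% above can come from.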

2) It is possible to compile the version 4 vector library with
#define NO_PORTABLE
so that all vector files are written in native format instead of portable format.
Speed on little endian machines increases, but such non-portable vector files
cannot then be used on big endian computers.
Has anybody ever used this option? Should this feature be kept in the new vector
format/library? My suggestion is not to support that option, so that all vector
files in the world are always portable.

3) I want to move the header information (ORGANIZATION, DIGIT DATE, ....) from the dig file
into a separate dighd text file, separating coordinates from other information.
The idea is taken from documents/grass6vector_api/vector/head.html.
I expect some changes to the header fields in the future (projection, maybe units), and changing
the dighd text format will be easier than changing a binary file. This should also prepare
GRASS better for using OSVecDB, where coordinates will be saved in a DB table, though I am not sure about this.

4) I want to remove the bounding box from the dig file because it is not reliable. The bounding box should
be saved in dig_plus only.

Radim

----------------------------------------
If you want to unsubscribe from GRASS Development Team mailing list write to:
minordomo@geog.uni-hannover.de with
subject 'unsubscribe grass5'

On Thu, Dec 28, 2000 at 07:06:17PM +0100, Radim Blazek wrote:

1) Little or big? Dig files are written (version 4) in portable vector
format which is big endian byte order (note for David: I was wrong in
my last mail). On little endian machines (i386 for example) byte
order of each number must be changed. My test with d.vect says that
such conversion takes 8% of time. That is long time. The question is
if we should keep big endian or change to little endian. Exists any
statistic about platforms used by GRASS users? Maybe ftp server log
for binaries downloads could help, is it somewhere on the Web Markus?
Somebody knows any web page with informations about numbers of
computers for each platform in use in the world? Any idea is welcome.

As was mentioned, maybe it is possible to always write data in native
format, but perform a byte swap conversion on read if necessary. It
would require a bitfield or such to specify endianness. I don't know if
this would cause more headache than it's worth, since partial
updates/writes would also need a conversion check. Maybe it's just
better to choose one and live with the conversion slow down.
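
As a rough illustration of the flag idea, the C sketch below detects the host byte order at run time and compares it with a one-byte flag taken from the file. The enum values and function names are assumptions made for this example and are not an existing GRASS interface.

```c
/* Hypothetical flag values; a single byte has no byte order of its own,
 * so it can be read safely on any machine. */
enum byte_order { ORDER_LITTLE = 0, ORDER_BIG = 1 };

/* Detect the byte order of the machine we are running on. */
static enum byte_order native_byte_order(void)
{
    const unsigned int probe = 1;

    /* On a little endian host the least significant byte comes first. */
    return (*(const unsigned char *)&probe == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Return 1 if numbers from a file carrying this flag must be swapped. */
static int needs_swap(unsigned char file_flag)
{
    return (enum byte_order)file_flag != native_byte_order();
}
```

With such a check the swap is skipped whenever the file already matches the host, and the same test would have to guard partial updates and writes too.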

2) It is possible to compile version 4 vector library with #define
NO_PORTABLE and all vector files will be written in native format
instead of portable. Speed on little endian machine is increased but
such non portable vector files may be not be used on big endian
computer then. Somebody ever used such option? Should be this
feature keept in new vector format/library? My suggestion is do not
support that option and be sure that all vector files in the world are
always portable.

I agree with dropping it.

3) I want move header information (ORGANIZATION, DIGIT DATE, ....)
from dig file into separate dighd text file - separate coordinates
from other informations. Idea is taken from
documents/grass6vector_api/vector/head.html. I expect some changes in
header fields in future (projection, units maybe) and changing dighd
text format will be easier than changin binary file. This should also
prepare GRASS better for using OSVecDB where coordinates will be saved
in DB table - I am not sure here.

Sounds good to me. I'd like to get full tabular attribute data
supported in any new GRASS vector format. I think it is very important
to have such a thing. I guess it should key in on an ID pairing
point/line/area ID's to separate point/line/area tables. I think this
is most important (puts the "Information" in GIS!). I wouldn't worry
too much about OSVecDB at this point since it's not clear where that
might lead.

4) I want remove bounding box from dig file because it is not
reliable. Bounding box should be saved in dig_plus only.

Can't comment on this...

--
Eric G. Miller <egm2@jps.net>

----------------------------------------

Hi Radim,

On Thu, Dec 28, 2000 at 07:06:17PM +0100, Radim Blazek wrote:

Hi again,

sorry but my last mail was somehow destroyed on the way, here should be
full text:

i would like to discuss some problems related to new vector format/library i am working on.

1) Little or big? Dig files are written (version 4) in portable vector
format which is big endian byte order (note for David: I was wrong in my
last mail). On little endian machines (i386 for example) byte order of
each number must be changed. My test with d.vect says that such conversion
takes 8% of time. That is long time. The question is if we should keep big
endian or change to little endian. Exists any statistic about platforms
used by GRASS users? Maybe ftp server log for binaries downloads could
help, is it somewhere on the Web Markus? Somebody knows any web page with
informations about numbers of computers for each platform in use in the
world? Any idea is welcome.

The number of computers in use worldwide for each platform and the
platform statistics for GRASS users differ heavily (due to the
lack of a fully working winGRASS port). The majority of GRASS users
have PCs running Linux (that is, little endian).

2) It is possible to compile version 4 vector library with
#define NO_PORTABLE
and all vector files will be written in native format instead of portable.
Speed on little endian machine is increased but such non portable vector files
may be not be used on big endian computer then.
Somebody ever used such option? Should be this feature keept in new vector
format/library? My suggestion is do not support that option and be sure that all vector
files in the world are always portable.

My opinion is to remove that flag: since we don't even want to allow users to
generate non-portable data (in order to avoid later complaints), the vector
data should always be portable.
The other proposal, by "mi.ro@iol.it" <mi.ro@iol.it>, sounds quite good.
Do the software engineers agree?

3) I want move header information (ORGANIZATION, DIGIT DATE, ....) from dig file
into separate dighd text file - separate coordinates from other informations.
Idea is taken from documents/grass6vector_api/vector/head.html.
I expect some changes in header fields in future (projection, units maybe) and changing
dighd text format will be easier than changin binary file. This should also prepare
GRASS better for using OSVecDB where coordinates will be saved in DB table - I am not sure here.

This proposal sounds good as well. Keeping this information in another file
will increase flexibility.

4) I want remove bounding box from dig file because it is not reliable. Bounding box should
be saved in dig_plus only.

O.k. (but I don't know too much about the internal vector formats...)

Thanks for working on that!

Markus

----------------------------------------

Hi all

Markus Neteler wrote:

The other proposal to "mi.ro@iol.it"<mi.ro@iol.it> sounds quite good.
Do the software engineers agree?

From a software engineering point of view, the proposal you refer to
(included below for reference) seems to be the best choice. That way,
the conversion is handled automatically, and performance will improve on
someone's machine if they write a file that originally didn't have a
matching byte-order since the new file will have the proper byte-order
and the conversion step will be skipped the next time the file is read.

"mi.ro@iol.it" wrote:

In the past I've used a slightly different model, where you
check whether the file byte-order matches the current platform
byte-order, thus avoiding unnecessary conversions.
In this case one should be able to write files in both
big and little endian, flagged by a specific field, and/or
be able to convert them.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

----------------------------------------

"Eric G . Miller" wrote:

As was mentioned, maybe it is possible to always write data in native
format, but perform a byte swap conversion on read if necessary. It
would require a bitfield or such to specify endianness. I don't know if
this would cause more headache than it's worth, since partial
updates/writes would also need a conversion check. Maybe it's just
better to choose one and live with the conversion slow down.

> 2) It is possible to compile version 4 vector library with #define
> NO_PORTABLE and all vector files will be written in native format
> instead of portable. Speed on little endian machine is increased but
> such non portable vector files may be not be used on big endian
> computer then. Somebody ever used such option? Should be this
> feature keept in new vector format/library? My suggestion is do not
> support that option and be sure that all vector files in the world are
> always portable.

I agree with dropping it.

Another solution is to always write data in native format with a flag to
specify endianness and have a program for converting from one format
to another. This also means that we must detect whether a file doesn't fit
the machine endianness, in order to call the conversion program
inside the open function. The slowdown is then limited to the first
open...

--
Michel WURTZ - DIG - Maison de la télédétection
               500, rue J.F. Breton
               34093 MONTPELLIER Cedex 5

----------------------------------------

Michel Wurtz wrote:

"Eric G . Miller" wrote:
> As was mentioned, maybe it is possible to always write data in native
> format, but perform a byte swap conversion on read if necessary. It
> would require a bitfield or such to specify endianness. I don't know if
> this would cause more headache than it's worth, since partial
> updates/writes would also need a conversion check. Maybe it's just
> better to choose one and live with the conversion slow down.
>
> > 2) It is possible to compile version 4 vector library with #define
> > NO_PORTABLE and all vector files will be written in native format
> > instead of portable. Speed on little endian machine is increased but
> > such non portable vector files may be not be used on big endian
> > computer then. Somebody ever used such option? Should be this
> > feature keept in new vector format/library? My suggestion is do not
> > support that option and be sure that all vector files in the world are
> > always portable.
>
> I agree with dropping it.

Another solution is always write data in native format with a flag to
specify endianness and have a program for converting from one format
to another. This means also that we must detect if a file doesn't fit
the machine endianness, in order to to call the conversion program
inside the open function. The slow down is then limited to the first
open...

Thank you for your comments.

My conclusion is:
- a vector file may be written in big or little endian (no PDP or other orders),
  and the byte order is specified in the header
- by default a new file will be written in the machine's native byte order
  (later, add an optional byte order for new files via the environment
   variable GRASS_VECTOR_ENDIAN)
- vector modules (the library) will work with both little and big endian vector files
- write a new module for little <=> big conversion
- no automatic conversion into native format (read-only files, files shared by
  several platforms)
- a vector written in one byte order (big, for example) will be updated in that byte order
  (big) even on a machine with another byte order (little)
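
The rules above for choosing a byte order could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the accepted values "big" and "little" for GRASS_VECTOR_ENDIAN and all function names are hypothetical, since the list only fixes the name of the variable.

```c
#include <stdlib.h>
#include <string.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

static enum byte_order native_byte_order(void)
{
    const unsigned int probe = 1;

    return (*(const unsigned char *)&probe == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Interpret an override string; anything unknown falls back.
 * The spellings "big" and "little" are assumed for illustration. */
static enum byte_order parse_order(const char *value, enum byte_order fallback)
{
    if (value != NULL) {
        if (strcmp(value, "big") == 0)
            return ORDER_BIG;
        if (strcmp(value, "little") == 0)
            return ORDER_LITTLE;
    }
    return fallback;
}

/* New file: native order unless the environment variable overrides it. */
static enum byte_order order_for_new_file(void)
{
    return parse_order(getenv("GRASS_VECTOR_ENDIAN"), native_byte_order());
}

/* Existing file: keep whatever order its header records. */
static enum byte_order order_for_update(enum byte_order header_order)
{
    return header_order;
}
```

Keeping the parsing separate from the getenv() call makes the rule easy to test without touching the process environment.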

OK?

Radim

----------------------------------------

Hi Radim

Radim Blazek wrote:

My conclusion is:
- vector file may ben written in big or little endian (no pdp or
  other) and byte order is specified in header
- by default new file will be written in machine native byte order
  (later add optional byte order for new files by enviroment
   variable GRASS_VECTOR_ENDIAN)
- vector modules (library) will work with both little and big endian
  vector files
- write new module for conversion little <=> big
- no auto conversion into native format (read only files, files shared
  by more platforms)
- vector written in one byte order (big for example) will be updated
  in that byte order (big) even on machine with some other byte order
  (little)

Hmmm. I don't understand why you need to maintain the byte-order of a
file, or provide the users with a new module. It seems to me that it
would be easier (less work) to write a conversion function in the vector
library that checks the byte-order and converts it to the machine
byte-order if necessary. Write will always write in native format. Then
the conversion is only performed when necessary, the user never knows
the difference, and grass will always be able to read either format
making the files platform independent. It seems to me that in this case
the only task here is to write a single "check and convert" function for
reading vector files. Am I missing something?

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

----------------------------------------

Hi Justin and Radim,

Justin Hickey wrote:

Hi Radim

Radim Blazek wrote:
> My conclusion is:

[... snip...]

> - write new module for conversion little <=> big
> - no auto conversion into native format (read only files, files shared
> by more platforms)
> - vector written in one byte order (big for example) will be updated
> in that byte order (big) even on machine with some other byte order
> (little)

Hmmm. I don't understand why you need to maintain the byte-order of a
file, or provide the users with a new module. It seems to me that it
would be easier (less work) to write a conversion function in the vector
library that checks the byte-order and converts it to the machine
byte-order if necessary. Write will always write in native format. Then
the conversion is only performed when necessary, the user never knows
the difference, and grass will always be able to read either format
making the files platform independent. It seems to me that in this case
the only task here is to write a single "check and convert" function for
reading vector files. Am I missing something?

I agree with Radim when he says that autoconversion (the "check and
convert" function suggested by Justin) may be a bad idea for files
shared (NFS, etc.) by different platforms (e.g. Sun + Pentium),
but I think that this may only be a problem if both platforms use the
file at the same time, which is currently possible but extremely
dangerous, even with the current file format: there is no file-level
locking mechanism. There is also no multiuser support in Grass: you
can only run one instance of Grass on one machine, even when you work
with two different databases, mainly because the fifos used for the
monitors are in a fixed location, not in a "per instance" location.

Sharing files implies managing locks at least at the file level,
and this is a major change in the sources... so it is imho only useful
if you also add the possibility of multiple Grass instances running
on the same machine.

I am more sensitive to the second argument: read-only files cannot be
converted "automagically"... so I agree with Radim except for his last
proposal: autoconversion can be done, but only for a file opened
for update (or read/write).

--
Michel WURTZ - DIG - Maison de la télédétection
               500, rue J.F. Breton
               34093 MONTPELLIER Cedex 5

----------------------------------------

Justin Hickey wrote:

Hi Radim

Radim Blazek wrote:
> My conclusion is:
> - vector file may ben written in big or little endian (no pdp or
> other) and byte order is specified in header
> - by default new file will be written in machine native byte order
> (later add optional byte order for new files by enviroment
> variable GRASS_VECTOR_ENDIAN)
> - vector modules (library) will work with both little and big endian
> vector files
> - write new module for conversion little <=> big
> - no auto conversion into native format (read only files, files shared
> by more platforms)
> - vector written in one byte order (big for example) will be updated
> in that byte order (big) even on machine with some other byte order
> (little)

Hmmm. I don't understand why you need to maintain the byte-order of a
file, or provide the users with a new module. It seems to me that it
would be easier (less work) to write a conversion function in the vector
library that checks the byte-order and converts it to the machine
byte-order if necessary. Write will always write in native format. Then
the conversion is only performed when necessary, the user never knows
the difference, and grass will always be able to read either format
making the files platform independent. It seems to me that in this case
the only task here is to write a single "check and convert" function for
reading vector files. Am I missing something?

Yes. That is what I mean by: vector modules will work with both orders.

The conversion module may be used to increase speed if files are not in native
order.

Radim

----------------------------------------

Hi Radim

Hmmm. I think you missed my question. Maybe if I comment on each of your
points it will become clearer.

Radim Blazek wrote:

- vector file may ben written in big or little endian (no pdp or
  other) and byte order is specified in header

I don't understand the need for this. The vector file should be simply
written in native byte order, but maintaining this information in the
header could be useful for checking byte order if it is faster than a
system check (I don't know what the system check would be).

- by default new file will be written in machine native byte order
             ^^^
I think new should be all files. Please explain why only new files
should have the native byte order by default.

  (later add optional byte order for new files by enviroment
   variable GRASS_VECTOR_ENDIAN)

I don't understand the need for this.

- vector modules (library) will work with both little and big endian
  vector files

I agree with this but the conversion should be handled in the read
function. The modules should not need to be changed.

- write new module for conversion little <=> big
- no auto conversion into native format (read only files, files shared
  by more platforms)
- vector written in one byte order (big for example) will be updated
  in that byte order (big) even on machine with some other byte order
  (little)

I don't understand the need for these three.

Conversion module may be used to increase speed if files are not in
native order.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

----------------------------------------

Hi Michel and Radim

Michel Wurtz wrote:

I agree with Radim when he says that autoconversion (the "check and
convert" function suggested by Justin) may be a bad idea in case of
file shared (NFS, etc...) by different platform (Ex :Sun + Pentium),
but I think that this may only be a problem if both platforms use the
file at the same time, which is actually possible, but extremelly
dangerous, even with the actual file format : there is no file level
locking mecanism.

Since Grass does not have file level locking, it would be dangerous to
access the same file regardless of the byte order. Simultaneous write
access is much more dangerous than byte order, and file level locking
would eliminate the byte order problem, so I think byte order in this
case is not an issue. Thus, I don't understand why byte order would
cause problems with an NFS network. Simply convert on read if necessary
and write to a native byte order. If performance is an issue, then a
local file should be used since there would be a performance hit from
the network anyway.

Actually, perhaps we could pop up a message (or print for text mode)
saying that the file is being converted and once we read it in
correctly, we write it out in native byte order. Of course this would
depend if we could detect whether the file was a network file since this
would be intended for a one time conversion anyway, which wouldn't be
the case for a network file. This idea is similar to upgrading versions
of files. If the software detects an old file format, it should upgrade
the file to the new format. Just a thought.

I am more sensible to the second argument : read-only files cannot be
converted "automagically"...

This one puzzles me. If the file is read-only, then we can't write the
file, so writing in a different byte order is not an issue. We still
have to convert it to read it properly, so why not do it automatically?

In general byte order decisions need to be done transparent to the user.
Most users probably don't understand the byte order problem and probably
don't want to. Thus I see no reason to have a new module to perform a
conversion. I'm sorry, but I just don't see a reason to write a file in
a non-native byte order when Grass will be able to read the non-native
byte order anyway.

Please don't take this the wrong way. I am not saying that these
proposals are a bad idea. I just don't see why we need all the features
Radim proposed and I am only trying to save whoever will end up coding
this some time and effort. However, if the list wants to include these
features, or if I am missing an issue (please tell me if I am), then
that's OK with me.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

----------------------------------------

Justin Hickey (and Michel WURTZ) wrote:

Hi Radim

Hmmm. I think you missed my question. Maybe if I comment on each of your
points it will become clearer.

Sorry if my thoughts/English were not clear enough.
(I will use BO = byte order, LE = little endian, BE = big endian)

Radim Blazek wrote:
> - vector file may ben written in big or little endian (no pdp or
> other) and byte order is specified in header

I don't understand the need for this. The vector file should be simply
written in native byte order, but maintaining this information in the
header could be useful for checking byte order if it is faster than a
system check (I don't know what the system check would be).

Vector files may exist in LE or BE. This is about the format
of the vector file more than about how it is created. BO information
MUST be saved in the header because it is the only way to recognize the byte
order of the file. The file may be created on a BE machine and used on an LE machine,
and I don't see any other way to recognize the byte order.

> - by default new file will be written in machine native byte order
               ^^^
I think new should be all files. Please explain why only new files
should have the native byte order by default.

New files are files opened by Vect_open_new().

I think that the two rules above together say the same thing you want:
Format is BE or LE + Vect_open_new() opens in native BO = file
is written in native BO.

> (later add optional byte order for new files by enviroment
> variable GRASS_VECTOR_ENDIAN)

I don't understand the need for this.

We can simply forget this point. It was intended for flexibility only.
I can imagine the following situation in some institution:
1 BE machine producing grass files (maybe creating new files each day by
   reclassification according to attributes in a database)
25 LE computers used to view/query those files
It would be useful to write those files on the BE machine in LE.

> - vector modules (library) will work with both little and big endian
> vector files

I agree with this but the conversion should be handled in the read
function. The modules should not need to be changed.

Of course - conversion is done by library in memory and modules
don't know about that.

> - write new module for conversion little <=> big
> Conversion module may be used to increase speed if files are not in
> native order.

What is not clear here? A user receives files created on a BE computer but
is running an LE computer. Although he can use those files immediately
(conversion in memory by a library function), he wants to increase speed.
The conversion module may be used for that.

> - no auto conversion into native format (read only files, files shared
> by more platforms)

Conversion on disk i.e. rewrite file into native BO on the hard disk
will not be automatic. (conversion in memory by library is always done
if native BO is not file BO)
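
A minimal C sketch of such an in-memory conversion is shown here, applied to an array of doubles after it has been read from the file. The function names are hypothetical, and the file on disk is never touched.

```c
#include <stddef.h>

/* Reverse the n bytes starting at p, in place. */
static void reverse_bytes(void *p, size_t n)
{
    unsigned char *b = p;
    size_t i;

    for (i = 0; i < n / 2; i++) {
        unsigned char t = b[i];

        b[i] = b[n - 1 - i];
        b[n - 1 - i] = t;
    }
}

/* Hypothetical library step: after reading `count` doubles into `v`,
 * swap each one only when the file BO differs from the native BO. */
static void fix_byte_order(double *v, size_t count, int file_is_foreign)
{
    size_t i;

    if (!file_is_foreign)
        return;                      /* native file: nothing to do */
    for (i = 0; i < count; i++)
        reverse_bytes(&v[i], sizeof v[i]);
}
```

Modules calling the library would only ever see values in native order, which matches the point that they do not need to know about the conversion.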

> - vector written in one byte order (big for example) will be updated
> in that byte order (big) even on machine with some other byte order
> (little)
I don't understand the need for these three.

Michel WURTZ:... so I agree with Radim except for his last
proposition : autoconversion can be done, but only for a file open
for update (or read/write)

I am not sure about autoconversion for update. Michel, are there any reasons
not to update in non-native BO? (For example, you want to change the category
of one line in a 20 MB file.)

The file FORMAT is most important now, because other things like autoconversion
or optional writing in non-native BO may be changed in the future.

Radim

----------------------------------------

Hi Radim

Radim Blazek wrote:

Sorry if my thoughts/english was not clear enough.
(I will use BO = byte order, LE = little endian, BE = big endian)

It's not entirely your fault. My head seems to be in the clouds these
days. :-)

Vector files may exist in LE or BE. It is about format
of vector file more than about how it is created. BO information
MUST be saved in header because it is the only way how to recognize
byte order of the file. The file may be created on BE machine and used
on LE machine and I don't see any other oportunity how to recognize
byte order.

OK, the confusion was on my part. I don't know why I didn't see this the
first time.

New files are files opened by Vect_open_new().

I think that two rules above together say the same what you want:
Format is BE or LE + Vect_open_new() opens in native BO = file
is written in native BO.

Please see below for this one.

> > - write new module for conversion little <=> big
> > Conversion module may be used to increase speed if files are not
> > in native order.

What is not clear here? User received files created on BE computer but
he is running LE computer. However he can use that files immediately
(conversion in memory by library function) he want to increase speed.
Conversion module may be used for that.

Well, it doesn't really matter. It's just that I was trained to keep low
level functionality (like byte swapping) hidden from users. My idea was
to automatically write out the file in native byte order after reading
it. That way the user sees a one time conversion to native byte order (a
message would indicate the conversion) and gets the increased
performance from then on. The only consideration are network files. In
this case, if the user wants performance he/she will download the file
to the local machine anyway. And if they are already using a network,
then they should be willing to take a performance hit that would be
caused by the conversion, since they are willing to take a performance
hit for the network. Just a suggestion that might save some work.

Conversion on disk i.e. rewrite file into native BO on the hard disk
will not be automatic.

Why not? I don't understand why the byte order has to be maintained for
the file. In most cases, since Grass requires a user to own a file to
have write access, then I would expect that user to use a local file.
Also, many users don't even know about the byte order problem. Thus, I
would suggest that the default behaviour would be to always write in
native byte order. If you want to provide the option of writing in
non-native byte order to advanced users, then I would suggest passing an
argument to the pertinent functions. This may be more work than writing
a new module, I don't know, but it somewhat hides the low level
functionality from the user. Again, only a suggestion.

Now a couple of suggestions that you may already know, and I apologize
if that is the case. I saw that XDR maintains byte order regardless of
the machine byte order. Since we are already using XDR, maybe we should
look into using it for the vector format. After all, it is a data
description language. But you may have already considered this.

If we do decide to do our own byte swapping routines, then I suggest we
use a socket function like ntohs() that will do nothing if the byte
order matches or swap the bytes if they don't match. This seems to be
the most efficient method to me. Again, this may be what you were
planning. I just thought I'd mention it.
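
For illustration, a minimal C sketch of that approach using ntohl(), the 32-bit sibling of ntohs(), from the POSIX header <arpa/inet.h>. The wrapper name is hypothetical. Note that these functions only convert from big endian (network order) and only cover 16- and 32-bit integers, so doubles and little endian files would need separate handling.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: decode a 32-bit integer stored in network
 * (big endian) order. ntohl() is a no-op on big endian hosts and
 * swaps the bytes on little endian hosts. */
static int32_t int32_from_network(const unsigned char *buf)
{
    uint32_t raw;

    memcpy(&raw, buf, sizeof raw);   /* avoid unaligned access */
    return (int32_t)ntohl(raw);
}
```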

Please don't take this the wrong way. I'm only trying to understand the
reasons for some of the features you proposed, as well as making some
alternative suggestions that could save people some work.

Talk to you later.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand

People who think they know everything are very irritating to those
of us who do. ---Anonymous

Jazz and Trek Rule!!!

----------------------------------------
If you want to unsubscribe from GRASS Development Team mailing list write to:
minordomo@geog.uni-hannover.de with
subject 'unsubscribe grass5'

Justin Hickey wrote:

Radim Blazek wrote:
> Conversion on disk i.e. rewrite file into native BO on the hard disk
> will not be automatic.

Why not? I don't understand why the byte order has to be maintained for
the file. In most cases, since GRASS requires a user to own a file to
have write access, then I would expect that user to use a local file.
Also, many users don't even know about the byte order problem. Thus, I
would suggest that the default behaviour would be to always write in
native byte order. If you want to provide the option of writing in
non-native byte order to advanced users, then I would suggest passing an
argument to the pertinent functions. This may be more work than writing
a new module, I don't know, but it somewhat hides the low level
functionality from the user. Again, only a suggestion.

I see some problems with automatic conversion. For example:
- mapset PERMANENT - contains large (500MB) vectors in BE order
- mapset my_mapset - the user's current mapset
- the user works on an LE computer
Now the user wants to view some vectors from PERMANENT:
- he types: d.vect map=v1@PERMANENT
- the user cannot write into PERMANENT and can convert only into
   v1@my_mapset, but:
   - v1@my_mapset may already exist
   - the user may not have enough disk space (hw/sw limit) for that large file
   - $DATABASE may be completely read-only, on CD-ROM for example
       (I know that is impossible now and you need write access, but
         I think it should be changed)
   - if the conversion into v1@my_mapset succeeds, the command
      d.vect map=v1@PERMANENT cannot go on because the map is now
      v1@my_mapset
   - after running d.vect on all vectors in PERMANENT, your $DATABASE
       no longer occupies the original 500MB but 1GB

Now a couple of suggestions that you may already know, and I apologize
if that is the case. I saw that XDR maintains byte order regardless of
the machine byte order. Since we are already using XDR, maybe we should
look into using it for the vector format. After all, it is a data
description language. But you may have already considered this.

In grass5.0 there are already functions for conversion from native
into BE, which is now the BO for portable files. I only want to modify those
functions for the new format (very similar to the old one), and that is almost
done. I considered XDR briefly and found it easier to modify the old functions.
But you may be right that XDR is better. I'll look into XDR once more.

If we do decide to do our own byte swapping routines, then I suggest we
use a socket function like ntohs() that will do nothing if the byte
order matches or swap the bytes if they don't match. This seems to be
the most efficient method to me. Again, this may be what you were
planning. I just thought I'd mention it.

Yes, but I don't see any ntohd()/ntohf() for double and float, and because
htonl() is defined as:
unsigned long int htonl(unsigned long int hostlong);
I am worried that it doesn't convert between various type lengths.
long may be 4 or 8 bytes, and we need conversion to/from 4 bytes.
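
One way around both concerns is to assemble values byte by byte into fixed-width types, so the result does not depend on sizeof(long) or on any socket function. The sketch below is an illustration only: the helper names be_to_i32 and be_to_double are hypothetical, it assumes C99 <stdint.h> is available, and it assumes the host stores double as an 8-byte IEEE 754 value with the same byte order as its integers.

```c
#include <stdint.h>
#include <string.h>

/* Decode a 4-byte big-endian signed integer, independent of sizeof(long).
 * Hypothetical helper; the final cast assumes two's complement. */
int32_t be_to_i32(const unsigned char *p)
{
    uint32_t u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];

    return (int32_t)u;
}

/* Decode an 8-byte big-endian IEEE 754 double.
 * Assumes the host double is 8 bytes and shares the integer byte order. */
double be_to_double(const unsigned char *p)
{
    uint64_t u = 0;
    double d;
    int i;

    for (i = 0; i < 8; i++)
        u = (u << 8) | p[i];    /* most significant byte first */
    memcpy(&d, &u, sizeof d);   /* reinterpret the bit pattern */
    return d;
}
```

Because the shifts work on values rather than on memory layout, the same code runs unchanged on big endian and little endian machines. The bytes 3F F0 00 00 00 00 00 00 decode to 1.0 and FF FF FF FE decodes to -2.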

Please don't take this the wrong way. I'm only trying to understand the
reasons for some of the features you proposed, as well as making some
alternative suggestions that could save people some work.

All comments/suggestions/criticism are welcome because that is the only way
to keep quality in GPL SW.

Radim

----------------------------------------

Hi Radim

Radim Blazek wrote:

I see some problems with automatic conversion. For example:
- mapset PERMANENT - contains large (500MB) vectors in BE order
- mapset my_mapset - the user's current mapset
- the user works on an LE computer
Now the user wants to view some vectors from PERMANENT:
- he types: d.vect map=v1@PERMANENT
- the user cannot write into PERMANENT and can convert only into
   v1@my_mapset, but:
   - v1@my_mapset may already exist
   - the user may not have enough disk space (hw/sw limit) for that large
     file
   - $DATABASE may be completely read-only, on CD-ROM for example
       (I know that is impossible now and you need write access, but
         I think it should be changed)
   - if the conversion into v1@my_mapset succeeds, the command
      d.vect map=v1@PERMANENT cannot go on because the map is now
      v1@my_mapset
   - after running d.vect on all vectors in PERMANENT, your $DATABASE
       no longer occupies the original 500MB but 1GB

Ahh. OK I see what you are thinking now. I agree that this would be
dangerous. However, this is not the type of automatic conversion I was
thinking about. Here is an example:

- user downloads a BE file (v1) from the internet into own mapset (ie
   has write access)
- user is using a LE machine
- user tries d.vect v1
- read function in vector library detects wrong BO
- message appears saying something like "file will be converted ...
   please wait"
- read function reads in file, swaps bytes, and then overwrites file v1
   (BE file no longer exists)
- read function passes swapped data to d.vect

Thus, the automatic conversion will only overwrite a file. It will never
copy it. The only problem is a possible lack of disk space for a
temporary file to perform the conversion. However, I'm not sure if we
would need to have a copy on disk. Perhaps GRASS reads the whole file
into memory, I don't know. But if we do need to create a temporary copy,
then we provide a check for disk space and if there is not enough, we
skip the write out phase and only swap the bytes in memory.

In other words, the automatic conversion will only occur if the file is
writable. Thus, the read function will have to test if the file is
writable before it tries the automatic conversion.
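
The decision logic described above could be sketched roughly as follows. All names here (native_order, plan_read and the two enums) are hypothetical, and the sketch assumes the file header records its byte order, which this thread has not settled. The disk space check for a temporary copy is left out, and access() is the POSIX call.

```c
#include <unistd.h>   /* access, W_OK (POSIX) */

enum byte_order { ORDER_LITTLE, ORDER_BIG };
enum read_plan  { READ_AS_IS, SWAP_IN_MEMORY, SWAP_AND_REWRITE };

/* Detect the host byte order by looking at the first byte of the integer 1. */
enum byte_order native_order(void)
{
    const unsigned int one = 1;

    return (*(const unsigned char *)&one == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Decide what the read function should do with a file at `path` whose
 * header says it was written in `file_order` (hypothetical header field). */
enum read_plan plan_read(const char *path, enum byte_order file_order)
{
    if (file_order == native_order())
        return READ_AS_IS;            /* nothing to convert */
    if (access(path, W_OK) == 0)
        return SWAP_AND_REWRITE;      /* writable: convert once, overwrite */
    return SWAP_IN_MEMORY;            /* read-only: swap on every read */
}
```

With this split, a file in a read-only mapset is never copied or rewritten. It is only byte-swapped in memory each time it is read.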

For writing a file, we simply write out the data in native BO since
GRASS can read either BO. Thus, I still do not see a need to write files
in non-native BO.

Yes, but I don't see any ntohd()/ntohf() for double and float, and because
htonl() is defined as: unsigned long int htonl(unsigned long int hostlong);
I am worried that it doesn't convert between various type lengths.
long may be 4 or 8 bytes, and we need conversion to/from 4 bytes.

OK, I thought they were more standard than that. I just feel that using
system functions is usually more efficient than using our own.
Unfortunately, they must be standard to use them. :-(

All comments/suggestions/criticism are welcome because that is the
only way to keep quality in GPL SW.

Agreed. It's just sometimes the tone of my writing is mis-interpreted so
I tend to warn people about it. :-)

Talk to you later.

--
Sincerely,

Jazzman (a.k.a. Justin Hickey) e-mail: jhickey@hpcc.nectec.or.th

----------------------------------------

Justin Hickey wrote:

Ahh. OK I see what you are thinking now. I agree that this would be
dangerous. However, this is not the type of automatic conversion I was
thinking about. Here is an example:

- user downloads a BE file (v1) from the internet into own mapset (ie
   has write access)
- user is using a LE machine
- user tries d.vect v1
- read function in vector library detects wrong BO
- message appears saying something like "file will be converted ...
   please wait"
- read function reads in file, swaps bytes, and then overwrites file v1
   (BE file no longer exists)
- read function passes swapped data to d.vect

Thus, the automatic conversion will only overwrite a file. It will never
copy it. The only problem is a possible lack of disk space for a
temporary file to perform the conversion. However, I'm not sure if we
would need to have a copy on disk. Perhaps GRASS reads the whole file
into memory, I don't know. But if we do need to create a temporary copy,
then we provide a check for disk space and if there is not enough, we
skip the write out phase and only swap the bytes in memory.

In other words, the automatic conversion will only occur if the file is
writable. Thus, the read function will have to test if the file is
writable before it tries the automatic conversion.

That is possible.

(XDR uses big endian byte order and a fixed word length of n*4 bytes.
That is not useful for us.)
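
To illustrate the n*4 byte word length: XDR encodes every item in a multiple of four bytes, padding where needed, so a value that needs only 1 or 2 bytes still occupies 4 bytes in the file. A tiny sketch of that rounding, with a hypothetical helper name:

```c
#include <stddef.h>

/* Size an item of `nbytes` occupies once padded to XDR's 4-byte unit.
 * xdr_padded_size is a hypothetical name used for illustration. */
size_t xdr_padded_size(size_t nbytes)
{
    return (nbytes + 3u) & ~(size_t)3u;  /* round up to a multiple of 4 */
}
```

For example, 2 bytes become 4 and 5 bytes become 8, while 8 bytes stay 8.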

Radim

----------------------------------------