let's standardize sites

Clearly this attempt at standardization is wonderful!

I would propose, however, that elevation be explicitly included in the format
as in:

east|north|z|#n label

All places in the real world have three components of location. This leaves
label for truly ancillary data.

Consider, for example, well data. We need the location and elevation of the
Kelly Bushing and would like to reserve label for depths or other floating
point data.
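
As a rough illustration of the well-data case, here is a minimal C sketch that reads one line in the proposed east|north|z|#n label layout. The names Site3 and parse_site3 are made up for this note and are not part of GISLIB; the 127-character label limit is likewise an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one line of the form: east|north|z|#n label */
typedef struct {
    double east, north, z;   /* the three components of location */
    int cat;                 /* the number after '#' */
    char label[128];         /* ancillary data, e.g. a depth */
} Site3;

/* Returns 1 if the four leading fields were read, 0 otherwise.
 * The label is optional and is left empty when absent. */
int parse_site3(const char *line, Site3 *s)
{
    int n;

    assert(line != NULL && s != NULL);
    s->label[0] = '\0';
    n = sscanf(line, "%lf|%lf|%lf|#%d %127[^\n]",
               &s->east, &s->north, &s->z, &s->cat, s->label);
    return n >= 4;
}
```

For a line such as "512345.0|3267890.0|231.5|#7 2450.3", z would hold the Kelly Bushing elevation (231.5) and the label would hold the text "2450.3", which a program could then treat as a depth.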

Glenn C. Kroeger, Ph.D.
Associate Professor of Geology
Director of Environmental Studies
Trinity University, 715 Stadium Dr., San Antonio, TX 78212
(210) 736-7607
gkroeger@geology.trinity.edu

I'm glad to see there's interest in settling on a standard. Regarding
the multiple-attribute issue, I think that this is best left to an
external database program. However, Dr. Kroeger brings up a good
point:

Glenn C. Kroeger (gkroeger@physics.trinity.edu) writes on 22 Jan 94:

I would propose, however, that elevation be explicitly included in the format
as in:
east|north|z|#n label
All places in the real world have three components of location. This leaves
label for truly ancillary data.

Most of my sites processing assumes measurements taken on a "plane," so
I haven't really had the need for elevation information. However, I
seem to recall hearing that Helena & friends were doing some 3D
modeling, and I assume that this would make life easier in this
situation (having the elevation and attribute information together).
It would have to be made clear that the third field is part of the
location and NOT the typical measured value (e.g., not a chemical
concentration or bug count or ...)

If we included this extra field, it would have to remain
optional (i.e., G_get_site (fd, &e, &n, &z, &str) > 0 if
the third field is blank).
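
To make the "optional" behaviour concrete, here is a small C sketch of a reader that still succeeds when the third field is blank. It is only a stand-in: get_site_line is a hypothetical name, it works on a string instead of a file descriptor, and the has_z flag and rest pointer are assumptions added for illustration, not part of the G_get_site call quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the G_get_site() idea, operating on one line.
 * Expects: easting|northing|[elevation]|rest
 * Returns 1 on success, 0 if easting, northing or a separator is missing.
 * *has_z is 0 when the third field is blank (then *z is set to 0.0).
 * *rest points at whatever follows the third '|'. */
int get_site_line(const char *line, double *e, double *n, double *z,
                  int *has_z, const char **rest)
{
    char *end;

    assert(line && e && n && z && has_z && rest);

    *e = strtod(line, &end);
    if (end == line || *end != '|')
        return 0;
    line = end + 1;

    *n = strtod(line, &end);
    if (end == line || *end != '|')
        return 0;
    line = end + 1;

    /* A blank field converts nothing, so end stays equal to line. */
    *z = strtod(line, &end);
    *has_z = (end != line);
    if (*end != '|')
        return 0;

    *rest = end + 1;
    return 1;
}
```

With this sketch, "100.5|200.25||#3 oak" and "100.5|200.25|12.5|#3 oak" both return a positive value; only the second sets has_z.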

In terms of storage space, this only adds another character (one byte)
for each site to existing site lists to make them compatible, so this
is not a big negative (<=5% increase in storage requirements).
However, in terms of program overhead, this adds another double (8
bytes) for each fully-specified site. This increases the minimum size
of a site by <= 38%. Is this a concern? I can think of ways of
optimizing here if data structures and something like a G_read_sites()
were ever implemented in GISLIB.

Are there compelling reasons for keeping elevation separate?

Does anyone have a problem with the
easting|northing|elevation|#category attribute_label
format? It sounds okay to me.

--Darrell

mccauley@ecn.purdue.edu wrote:

Does anyone have a problem with the
easting|northing|elevation|#category attribute_label
format? It sounds okay to me.

- I like the suggestion to add the elevation field.
- The #category field should remain optional!
- The user should be able to interpolate on data which is
    in the <attribute_label> field or in the elevation field.
   - we might want to add a function that programs can use if desired
     to parse the attrlabel string for fields of numeric data and which
     allows the user to specify which field of data they are interested in.
     (of course ultimately this should be handled by the dbms but it is
     still very useful if you want to avoid a database)
- We should develop a set of routines to parse the fields for the programmer
    so that we get a standard interface. [G_get_site ()]
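
One possible shape for the label-parsing helper suggested in the list above, sketched in C. The name label_numeric_field is hypothetical, and the sketch assumes that fields in the label are separated by whitespace, which the thread does not specify.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: fetch the k-th (1-based) numeric token from a label.
 * Tokens are assumed to be whitespace-separated; tokens that are not
 * entirely numeric are skipped and do not count.
 * Returns 1 and stores the value in *out, or 0 if there is no such field. */
int label_numeric_field(const char *label, int k, double *out)
{
    int seen = 0;

    assert(label != NULL && out != NULL);

    while (*label != '\0') {
        char *end;
        double v;

        while (isspace((unsigned char)*label))
            label++;
        if (*label == '\0')
            break;

        v = strtod(label, &end);
        /* Numeric only if something was consumed and the token ends there. */
        if (end != label && (*end == '\0' || isspace((unsigned char)*end))) {
            if (++seen == k) {
                *out = v;
                return 1;
            }
            label = end;
        } else {
            /* Skip the remainder of a non-numeric token. */
            while (*label != '\0' && !isspace((unsigned char)*label))
                label++;
        }
    }
    return 0;
}
```

Given the label "pH 6.8 depth 12.5 3", asking for field 2 yields 12.5, and asking for field 4 fails because only three numeric tokens are present. A program that interpolates could take the field number from the user and fall back to the elevation field when none is given.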

You also said:

In terms of storage space, this only adds another character (one byte)
for each site to existing site lists to make them compatible, so this
is not a big negative (<=5% increase in storage requirements).

This makes me think that you want to change all existing site files.
There is no reason to. Because what you said:

If we included this extra field, it would have to remain
optional (i.e., G_get_site (fd, &e, &n, &z, &str) > 0 if
the third field is blank).

is true even for the current format. If there are only 2 pipes (|), then
G_get_site can be smart enough to correct for this (that is, unless
<attribute_label> has pipes in it).

Further ruminations produced:

How about a format like:
<easting>|<northing>|[z|[d4|]...][#category_int] [attr_text OR %flt[%flt]...]

where
    [ ] = optional
    ... = optionally repeated
    | = dimension suffix
    # = category prefix
    % = float data prefix

This allows for N-dimensional sites, an optional category number, and
either an optional text attribute or any number of floating-point
numbers,

OR we could provide for text AND floats to be allowed if desired.
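
A rough C sketch of how the extended layout might be read, taking the "text OR floats" reading of the proposal. Everything here is an assumption made for illustration: the NSite structure, the parse_nsite name, the limits of 8 dimensions and 8 floats, and the rule that a coordinate is any number immediately followed by '|'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DIMS 8   /* arbitrary limit for this sketch */
#define MAX_FLTS 8   /* arbitrary limit for this sketch */

/* Hypothetical in-memory form of one extended site line. */
typedef struct {
    double dim[MAX_DIMS];    /* easting, northing, z, d4, ... */
    int ndims;
    int has_cat;
    long cat;                /* value after '#', if present */
    double flt[MAX_FLTS];    /* values after each '%', if present */
    int nflts;
    char text[128];          /* attr_text, used when no '%' data is given */
} NSite;

/* Returns 1 on success, 0 if fewer than two coordinates are found
 * or a '#' or '%' prefix is not followed by a number. */
int parse_nsite(const char *p, NSite *s)
{
    char *end;

    assert(p != NULL && s != NULL);
    memset(s, 0, sizeof *s);

    /* Coordinates: each is a number followed by the '|' dimension suffix. */
    while (s->ndims < MAX_DIMS) {
        double v = strtod(p, &end);
        if (end == p || *end != '|')
            break;
        s->dim[s->ndims++] = v;
        p = end + 1;
    }
    if (s->ndims < 2)
        return 0;

    /* Optional category: '#' followed by an integer. */
    if (*p == '#') {
        s->cat = strtol(p + 1, &end, 10);
        if (end == p + 1)
            return 0;
        s->has_cat = 1;
        p = end;
    }
    while (*p == ' ')
        p++;

    if (*p == '%') {
        /* Float data: each value carries its own '%' prefix. */
        while (*p == '%' && s->nflts < MAX_FLTS) {
            double v = strtod(p + 1, &end);
            if (end == p + 1)
                return 0;
            s->flt[s->nflts++] = v;
            p = end;
        }
    } else {
        /* Otherwise the remainder is free text; drop a trailing newline. */
        strncpy(s->text, p, sizeof s->text - 1);
        s->text[strcspn(s->text, "\n")] = '\0';
    }
    return 1;
}
```

With this sketch, "10.5|20.5|3.0|4.0|#7 %1.5%2.5" gives four dimensions, category 7 and two floats, while "10.5|20.5|oak tree" gives two dimensions, no category and the text "oak tree". Allowing text AND floats together would need an extra rule for where the text ends, which is left open here.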

--
  David Gerdes
  US Army Construction Engineering Research Lab
  Spatial Analysis & Systems Team
  dpgerdes@zorro.cecer.army.mil

My purpose for calling for standardization was largely for
simplicity and consistency for the users, which would lead to
an easier job for programmers. Implied was a standard format
for data, not necessarily a standard set of parsing rules.

With the many (excellent) suggestions that have followed, it
seems like we're losing a lot of the simplicity.

That's not necessarily bad, but is it what is needed at this
point in GRASS' development? Has the time come to extend this
part of the database to handle data in multi-dimensions? I think
that we should take things one step at a time. After all, the
vector and raster formats don't even have floating-point support
(yet).

When asking for standardization, I was looking for something that
could possibly be implemented as early as 4.2.

In retrospect, shouldn't we keep the dimensionality the same for
all data formats and leave extensions to a dbms?

David Gerdes (dpgerdes@zorro.cecer.army.mil) writes on 24 Jan 94:

mccauley@ecn.purdue.edu wrote:

Does anyone have a problem with the
easting|northing|elevation|#category attribute_label
format? It sounds okay to me.

it's starting to not look so good to me now... the proverbial
can opener (worms everywhere! :-)

- I like the suggestion to add the elevation field.
- The #category field should remain optional!

then we lose consistency for all sites lists.

- We should develop a set of routines to parse the fields for the programmer
   so that we get a standard interface. [G_get_site ()]

Amen. If this were done (in whatever manner), it would alleviate
a lot of the concern, at least from the programming point of view.

You also said:

In terms of storage space, this only adds another character (one byte)
for each site to existing site lists to make them compatible, so this
is not a big negative (<=5% increase in storage requirements).

This makes me think that you want to change all existing site files.
There is no reason to. Because what you said:

I want things to be consistent and simple. If we lose this, then
we lose one of the merits of simple ascii files.

If we included this extra field, it would have to remain
optional (i.e., G_get_site (fd, &e, &n, &z, &str) > 0 if
the third field is blank).

is true even for the current format. If there are only 2 pipes (|), then
G_get_site can be smart enough to correct for this (that is, unless
<attribute_label> has pipes in it).

Further ruminations produced:

How about a format like:
<easting>|<northing>|[z|[d4|]...][#category_int] [attr_text OR %flt[%flt]...]

I could foresee confusion and increased complexity for the user
if multidimensions were added (not to mention that all programs
would need a flag or parameter specifying the dimension in which
calculations would be done). Multiple attributes ("%flt[[%flt]...]"),
IMO, would probably best be left to a database program. I didn't
anticipate creating one for this effort.

--Darrell

James Darrell McCauley, Purdue Univ, West Lafayette, IN 47907-1146, USA
mccauley@ecn.purdue.edu, mccauley%ecn@purccvm.bitnet, pur-ee!mccauley