[GRASS-user] Entering '<1' and 'BDL' in Database

   For a scientific database holding observed and measured values there are
two non-numeric (NAN) legitimate entries: <1 and 'below detection limits.'
Is anyone aware of a standard way of expressing these?

   The problem is that the column needs to be of a numeric type because the
numbers will be manipulated mathematically and summarized statistically. So
a NAN such as '<1' or 'BDL' doesn't cut it.

   Any thoughts?

Rich

On Monday 19 April 2010, Rich Shepard wrote:

   For a scientific database holding observed and measured values there are
two non-numeric (NAN) legitimate entries: <1 and 'below detection limits.'
Is anyone aware of a standard way of expressing these?

   The problem is that the column needs to be of a numeric type because the
numbers will be manipulated mathematically and summarized statistically. So
a NAN such as '<1' or 'BDL' doesn't cut it.

   Any thoughts?

how about NULL ?

Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

On Mon, 19 Apr 2010, Dylan Beaudette wrote:

how about NULL ?

Dylan,

   NULL means unknown, and 'less than 1' or 'below detection limits' means it
cannot be measured with the instruments used. There's a big difference. NULL
can result from the measurement not being taken or the analysis not being
conducted.

Thanks,

Rich

On Monday 19 April 2010, Rich Shepard wrote:

On Mon, 19 Apr 2010, Dylan Beaudette wrote:
> how about NULL ?

Dylan,

   NULL means unknown, and 'less than 1' or 'below detection limits' means
it cannot be measured with the instruments used. There's a big difference.
NULL can result from the measurement not being taken or the analysis not
being conducted.

Thanks,

Rich

Hi,

In that case, how about another column that contains quality control
information. There isn't really any mechanism in most relational database
models that would allow you to store 'mixed' datatypes AND interpret them as
a numeric datatype... to the best of my knowledge.

Cheers,
Dylan

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

On Mon, 19 Apr 2010, Adam Dershowitz, Ph.D., P.E. wrote:

I would suggest that NULL doesn't mean "unknown".

Adam,

   Actually, it does.

Instead it means, "can't be represented as a valid number".

   One can have NULL in a text column, too, when the value is not known.

Another option is to use your own codes. For example, if your sample data
is all positive, then -999, might represent "less than 1". These type of
codes are often used, but realize that they are somewhat of a kluge
because fundamentally a column that holds numbers should hold NUMBERS, and
putting in <1 is not really a number, but a flag that means something
else. You have to be careful if you do things like that because it is
easy to do things like average a column, and accidentally include
-999 in there.

   All database experts, including Joe Celko and Rick van der Lans, write
that such codes should never be used for exactly the reasons you present.

   However, thinking more about the situation, NULL would work. If the value
is below detection limits (which '<1' also means), then we really don't know
the value, so NULL is a reasonable entry.

Thanks to you and Dylan for suggestions,

Rich