[GRASS-dev] r.series threshold patch

Glynn,

To make it easier to work with incomplete time series from MODIS
(and other sources), we would like to propose the attached patch. It
adds a threshold that filters out incomplete pixel series before the
aggregation function is called. This saves us from making extra runs
to count valid pixels and from post-filtering the aggregated results.

Markus

(attachments)

r_series.patch (3.32 KB)

While I don't doubt that this is a useful optimisation for your
particular case, I'm generally opposed to adding such optimisations
for specific cases.

A more general optimisation would be to extend the method= and output=
options to accept multiple values, so that you can compute multiple
aggregates in a single run. You would still need to combine the two
outputs with e.g. r.mapcalc, but you would only need one run of
r.series.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote on 08/16/2007 08:59 PM:

> Markus Neteler wrote:
>
>> To make it easier to work with incomplete time series from MODIS
>> (and other sources), we would like to propose the attached patch. It
>> adds a threshold that filters out incomplete pixel series before the
>> aggregation function is called. This saves us from making extra runs
>> to count valid pixels and from post-filtering the aggregated results.
>
> While I don't doubt that this is a useful optimisation for your
> particular case, I'm generally opposed to adding such optimisations
> for specific cases.
>
> A more general optimisation would be to extend the method= and output=
> options to accept multiple values, so that you can compute multiple
> aggregates in a single run. You would still need to combine the two
> outputs with e.g. r.mapcalc, but you would only need one run of
> r.series.

The optimization you propose is of course far more general than what
we did, and could be extremely valuable.

Nonetheless, we think that introducing the threshold parameter is not
really a special-case hack: it is a straightforward generalization of
the current -n flag, turning it from an on/off switch into an integer
value.

The threshold parameter gives the minimum number of non-NULL inputs
required before the inputs are passed to the aggregation function.

It ranges over [1,num_inputs]: thresh=num_inputs is equivalent to -n
(return NULL unless all inputs are non-NULL), while thresh=1 is the
standard behaviour (compute the aggregate if there is at least one
non-NULL input).

Markus and Antonio

------------------
ITC -> since 1 March 2007 Fondazione Bruno Kessler
------------------

Actually, thresh=0 would give the existing behaviour. If -n isn't
used, the values are always passed to the aggregate function. If all
of the values are null, most aggregates will return null, but the
"count" aggregate will return 0.

--
Glynn Clements <glynn@gclements.plus.com>

You are right.
Are you still opposed to the patch in general, if it comes with better
documentation?

Markus

I probably won't be extending r.series to support multiple aggregates
and outputs in the near future, so I don't have any real objection to
the patch.

One minor nit: the test doesn't need any additional parentheses; using:

  if (null && flag.nulls->answer || num_inputs - null < thresh)

is sufficient.

C operator precedence (highest to lowest):

  application () [] -> .
  unary ! ~ ++ -- + - * & (type) sizeof
  multiplicative * / %
  additive + -
  shift << >>
  inequality < <= > >=
  equality == !=
  bitwise-and &
  bitwise-xor ^
  bitwise-or |
  logical-and &&
  logical-or ||
  conditional ?:
  assignment = += -= *= /= %= &= ^= |= <<= >>=
  sequencing ,

The unary, conditional and assignment operators are right-associative,
all others are left-associative.

Briefly: arithmetic operators are higher than relational operators
which are higher than logical operators, which is the most convenient
order for if/while/etc tests, and && is higher than ||, so
sum-of-products doesn't require parentheses.

--
Glynn Clements <glynn@gclements.plus.com>