Hi,
Another question concerning the details of i.segment:
IIUC, the threshold used in region-growing is normalized using a common denominator defined by:
divisor = globals->nrows + globals->ncols;
(BTW, why '+', not '*' ?)
Row and column numbers come from
globals->nrows = Rast_window_rows();
globals->ncols = Rast_window_cols();
The threshold is then adjusted to take into account object size, i.e. to favor merging of smaller regions compared to merging of larger regions:
adjthresh = pow(alpha2, 1. + (double) smaller / divisor);
It is this adjusted threshold that decides whether two regions are merged: they are merged only if their similarity is smaller than the adjusted threshold:
if (compare_double(Ri_similarity, adjthresh) == -1)
Ri_similarity is normalized by
val /= globals->max_diff;
where globals->max_diff is defined as the difference between the max and min values in the input file, as obtained by
Rast_get_fp_range_min_max(&(fp_range[n]), &min[n], &max[n]);
I hope that I've understood all of this correctly.
Now my question:
Are nrows and ncols region-dependent, i.e. will the divisor in the calculation of the adjusted threshold vary depending on the region I defined ?
And for max_diff: do I understand correctly that Rast_get_fp_range_min_max() is region-independent, i.e. that if I take different regions of the same image, I will always get the same max_diff ?
If this is correct, does this mean the region size might determine whether some objects (or pixels) are merged or not ?
This would call into question the practice of determining a good threshold by testing on small regions, as the same threshold might not have the same effect in larger regions, wouldn't it ?
Moritz
On 20/05/16 18:40, Markus Metz wrote:
Hi Moritz,
On Wed, May 18, 2016 at 6:36 PM, Moritz Lennert
<mlennert@club.worldonline.be> wrote:
Hi Markus,
I'm working on potentially improving the i.segment.uspo addon and am looking
at the possibility of including the goodness of fit output map somehow in
the evaluation of the quality of the segmentation.
For that, I need to exactly understand the goodness of fit measure.
As a starter: why is the threshold parameter (globals->alpha) squared before
being used in create_isegs.c (and in the calculation of the goodness of fit)
? Is it because i.segment works with the squared distance and not the actual
distance ?
Yes, i.segment works with the squared distance to avoid sqrt() which
is slow. All that matters is if the distance is larger or smaller than
threshold, and this relation is the same with squared values.
IIUC, the worst goodness of fit measure (i.e. 1 - difference) is equal to
the 1 - threshold parameter value. This thus means that if one would want to
compare segmentations done with different threshold values by comparing mean
goodness of fit, for example, this would have to be scaled taking into
account the respective parameter value. Would something like
( goodness of fit - (1 - threshold parameter value) ) / threshold parameter
value
make sense ?
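Written out as a C function, the proposed rescaling would look like the sketch below. The function name is hypothetical and nothing like it exists in i.segment; it only restates the expression above.

```c
/* Proposed rescaling: maps a goodness of fit of (1 - threshold) to 0
 * and a goodness of fit of 1 to 1. threshold must be > 0. */
double rescaled_goodness(double gof, double threshold)
{
    return (gof - (1.0 - threshold)) / threshold;
}
```

For a threshold of 0.1, a goodness of fit of 0.95 would be rescaled to 0.5. Any goodness of fit below 1 - threshold would come out negative.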
The goodness of fit is currently 1 - similarity by comparing the
current cell values to the object's mean values. Similarity is in the
range [0, 1], 0 means identical, 1 means maximum possible difference.
With the region growing algorithm, that difference can actually be
larger than the given threshold if a cell is included in an object and
subsequent growing of the object shifts the mean away.
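The mean-shift effect can be illustrated with a toy example in C. It is a deliberate simplification and not the i.segment algorithm: one band, cells added in a fixed order, absolute difference instead of squared distance, and hypothetical function names.

```c
#include <math.h>

/* Toy 1-D region growing: cells are added in order as long as their
 * absolute difference to the running object mean is below threshold.
 * Requires ncells >= 1. Returns the number of accepted cells and
 * stores the final object mean in *mean. */
int grow_object(const double *cells, int ncells, double threshold,
                double *mean)
{
    double sum = cells[0];
    int n = 1;

    for (int i = 1; i < ncells; i++) {
        if (fabs(cells[i] - sum / n) >= threshold)
            break;              /* stop at the first rejected cell */
        sum += cells[i];
        n++;
    }
    *mean = sum / n;
    return n;
}

/* Goodness of fit of one cell: 1 - difference to the object mean. */
double goodness_of_fit(double cell, double mean)
{
    return 1.0 - fabs(cell - mean);
}
```

With normalized cell values 0.00, 0.09, 0.14, 0.17, 0.19 and a threshold of 0.1, every cell is accepted because it is within 0.1 of the mean at the time it is added. The final mean is 0.118, so the first cell ends up with a goodness of fit of 0.882, which is below 1 - threshold = 0.9.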
BTW, in write_output.c, in the comments starting at line 82, there is
mention of a globals->threshold, but there is no threshold in the globals
structure... I guess this should read globals->alpha or threshold->answer,
or ?
The comments starting at line 82 in write_output.c are an idea for
goodness of fit, the actual goodness of fit is calculated in lines 168
and 182.
HTH,
Markus
--
Département Géosciences, Environnement et Société
Université Libre de Bruxelles
Bureau: S.DB.6.138
CP 130/03
Av. F.D. Roosevelt 50
1050 Bruxelles
Belgique
tél. + 32 2 650.68.12 / 68.11 (secr.)
fax + 32 2 650.68.30