[GRASS-dev] i.cca eigenvector computation strangeness

Hi,
while including the ccmath library into grass and
patching i.cca to use the new ccmath library eigen vector functions,
i noticed, that the eigen vector computation of the original i.cca version
differs from the expected values?

First, the i.cca code looks wired. Like Fortran code ported to C while keeping the
fortran style of counting (starting from 1). I reomved the Fortran coding style and replaced
the eigen vector computaion function from lib/gmath/jacobi.c with a ccmath version.
As a side effect: it is now possible to compute a cca of subgroups of size much larger than 8.

I verified the new eigen vector compuation of i.cca (the ccmath version) with octave, so
octave and i.cca computing now the same eigen vecotors for a given matrix.
But this eigen vecotrs differ from the original version of i.cca. The difference is in the
sign of the vectors. And therefore the results of i.cca are looking different.

Example:

The following steps are made with the original and the ccmath version of i.cca:

i.group group=LandSat input=LandsatTM_Blue,LandsatTM_Green,LandsatTM_IR1,LandsatTM_IR2,LandsatTM_IR3,LandsatTM_Red
i.group group=LandSat subgroup=visible input=LandsatTM_Blue,LandsatTM_Green,LandsatTM_Red
i.cluster group=LandSat subgroup=visible sigfile=sig.txt classes=4
i.cca --v group=LandSat subgroup=visible signature=sig.txt output=test_cca
r.info -r test_cca.1
r.info -r test_cca.2
r.info -r test_cca.3

Sorry, i have send accidentally an unfinished mail, so sorry for the typing errors.
I intended to push the save button … .
The rest of the mail is attached below.

2009/8/17 Soeren Gebbert <soerengebbert@googlemail.com>

Hi,
while including the ccmath library into grass and
patching i.cca to use the new ccmath library eigen vector functions,
i noticed, that the eigen vector computation of the original i.cca version
differs from the expected values?

First, the i.cca code looks wired. Like Fortran code ported to C while keeping the
fortran style of counting (starting from 1). I removed the Fortran coding style and replaced
the eigen vector computaion function from lib/gmath/jacobi.c with a ccmath version.
As a side effect: it is now possible to compute a cca of subgroups of size much larger than 8.

I verified the new eigen vector computation of i.cca (the ccmath version) with octave, so
octave and i.cca computing now the same eigen vectors for a given matrix.
But this eigen vectors differ from the original version of i.cca. The difference is in the
sign of the vectors. And therefore the results of i.cca are looking different.

Example:

The following steps are made with the original and the ccmath version of i.cca:

i.group group=LandSat input=LandsatTM_Blue,LandsatTM_Green,LandsatTM_IR1,LandsatTM_IR2,LandsatTM_IR3,LandsatTM_Red
i.group group=LandSat subgroup=visible input=LandsatTM_Blue,LandsatTM_Green,LandsatTM_Red
i.cluster group=LandSat subgroup=visible sigfile=sig.txt classes=4
i.cca --v group=LandSat subgroup=visible signature=sig.txt output=test_cca
r.info -r test_cca.1
r.info -r test_cca.2
r.info -r test_cca.3

Result of original i.cca:

  1. eigen value: 12265.2 – eigen vector: 0.159569 0.389926 -0.906915
  2. eigen value: 308.024 – eigen vector: -0.154224 -0.89756 -0.413039
  3. eigen value: -1.29363e+06 – eigen vector: -0.975065 0.205776 -0.0830864
    Transform completed.

WARNING: The output cell map <test_cca.1> has values outside the 0-255
range.
WARNING: The output cell map <test_cca.2> has values outside the 0-255
range.
WARNING: The output cell map <test_cca.3> has values outside the 0-255
range.
r.info -r test_cca.1
min=-16
max=8
r.info -r test_cca.2
min=-32
max=0
r.info -r test_cca.3
min=-23
max=-3

Results of the new ccmath version:

  1. eigen value: 12265.2 – eigen vector: -0.159569 -0.389926 0.906915
  2. eigen value: 308.024 – eigen vector: 0.154224 0.89756 0.413039
  3. eigen value: -1.29363e+06 – eigen vector: 0.975065 -0.205776 0.0830864

Transform completed.
WARNING: The output cell map <test_cca.1> has values outside the 0-255
range.
r.info -r test_cca.1
min=-7
max=17
r.info -r test_cca.2
min=0
max=33
r.info -r test_cca.3
min=4
max=24

The difference is in the sign of the eigen vectors which results in different raster map values.

I have verified every step between the old and the new version of i.cca. The difference is only in the eigen vector computation.

So the big question is … which result is correct???
The new one looks more reasonable to me, but well this is only one example … .
Do i have found a bug in the numerical recipes eigen vector code, or is the sign of eigen vectors not fixed and changes with the chosen algorithm?

Can i commit such changes to svn?

Any suggestions are welcome.

Best regards
Soeren

Soeren wrote:

I have verified every step between the old and the new
version of i.cca. The difference is only in the eigen vector
computation.

So the big question is ... which result is correct???

is it more a matter of popular convention than correctness?
i.e. does it depend on your reference frame / POV ?

The new one looks more reasonable to me, but well this is
only one example ... .

Do i have found a bug in the numerical recipes eigen vector
code, or is the sign of eigen vectors not fixed and changes
with the chosen algorithm?

perhaps the output of the i.pca module, the m.eigensystem module (addons),
and R-stats can give you some confidence in one flavour over another.

I'd guess it is highly likely that you are correct and this source was
adapted from Fortran beginnings.

Hamish

Hello Hamish,

2009/8/18 Hamish <hamish_b@yahoo.com>

Soeren wrote:

I have verified every step between the old and the new
version of i.cca. The difference is only in the eigen vector
computation.

So the big question is … which result is correct???

is it more a matter of popular convention than correctness?
i.e. does it depend on your reference frame / POV ?

I don’t know. That may be the point.
I faced similar problems while implementing the Karhunen-Loeve-Transformation for 3d geometric body alignment. Dependent on the sign of the eigen vectors, the bodies are rotated by 180° on the principal axes. So i had to rotate the bodies for correct alignment by 180°.
In case of a sphere or a cube, the rotation was irrelevant.
So maybe the sign is irrelevant and the result must be interpreted dependent on the POV?

The new one looks more reasonable to me, but well this is
only one example … .

Do i have found a bug in the numerical recipes eigen vector
code, or is the sign of eigen vectors not fixed and changes
with the chosen algorithm?

perhaps the output of the i.pca module, the m.eigensystem module (addons),
and R-stats can give you some confidence in one flavour over another.

I notice the same behavior with i.pca. The eigen vectors between old and new version have different signs.
I have tested this with the landsat satellite images from the north carolina data set.

For validation i have put the sources of my development version of grass6 and a patched grass6.4RC5 version (for eigen vector output of i.cca) on my web side:

http://www-pool.math.tu-berlin.de/~soeren/grass/files/grass6_devel_soeren.tar.gz
http://www-pool.math.tu-berlin.de/~soeren/grass/files/grass-6.4.0RC5_soeren.tar.gz

Anybody who is interested in this issue, can compile them and run the following script for each grass version with the nc data set:

Set the region to the landsat images

g.region rast=lsat5_1987_10
i.group group=LandSat input=lsat5_1987_10,lsat5_1987_20,lsat5_1987_30,lsat5_1987_40,lsat5_1987_50,lsat5_1987_60,lsat5_1987_70
i.group group=LandSat subgroup=visible input=lsat5_1987_10,lsat5_1987_20,lsat5_1987_30,lsat5_1987_40
i.cluster group=LandSat subgroup=visible sigfile=sig.txt classes=5
i.cca --v group=LandSat subgroup=visible signature=sig.txt output=test_cca_1

i.group group=LandSat subgroup=infra input=lsat5_1987_50,lsat5_1987_60,lsat5_1987_70
i.cluster group=LandSat subgroup=infra sigfile=sig.txt classes=4
i.cca --v group=LandSat subgroup=infra signature=sig.txt output=test_cca_2

Compute the principal components

i.pca --o input=lsat5_1987_10,lsat5_1987_20,lsat5_1987_30 output=test_pca_1
i.pca --o input=lsat5_1987_20,lsat5_1987_30,lsat5_1987_40 output=test_pca_2
i.pca --o input=lsat5_1987_30,lsat5_1987_40,lsat5_1987_50 output=test_pca_3

You will see the different eigen vector signs.

I’d guess it is highly likely that you are correct and this source was
adapted from Fortran beginnings.

I dont know. The code of i.pca and i.cca seems to be correct. I guess the cca developer was not very familiar with the C way of counting but the code seems to be well structured.

The only way to validate i.pca and i.cca are analytical results which can be
adopted to grass using r.mapcalc to create the input raster maps for i.cca and i.pca.
But i strongly guess … there are no analytical results for validation. :confused:

I guess, the sign of the eigen vectors are not of great important’s.
So the new versions of i.pca and .icca are working correctly … like the old ones. :slight_smile:

Best regards
Soeren

Hamish