[GRASS-dev] PCA question

cmbarton · June 18, 2012, 10:46pm

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
Principal Components Analysis - GRASS-Wiki

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

Markus_GRASS · June 25, 2012, 4:11pm

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
http://grass.osgeo.org/wiki/Principal_Components_Analysis

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

cmbarton · June 25, 2012, 4:47pm

Thanks Marcus,

Please see my post below, starting with "I think I've figured it out..."

Since I'm working in GRASS instead of R, I need to work back from the values given by i.pca. Also, I'm not up on matrix algebra notation. So it would be a help to me to see exactly what is meant by an inverted Eigenvector matrix. I tried to give how I interpreted this in GRASS map algebra terms below. Is this correct? Using my simple algebraic notation, the Eigenvector matrix produced by i.pca would be:

ev1-1 ev1-2 ev1-3
ev2-1 ev2-2 ev2-3
ev3-1 ev3-2 ev3-3

Where ev1-2 refers to the eigenvector in the first row and second column

Spot checking the factor scores against original values, with no scaling, seems to indicate that it is the mean of all original bands for each pixel that are subtracted from the factor scores in i.pca. This is somewhat different from what you suggest.

So my questions are:

1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?
2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.
3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?

Thanks much
Michael

On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
Principal Components Analysis - GRASS-Wiki

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
grass-dev Info Page

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

Markus_GRASS · June 25, 2012, 6:03pm

On Mon, Jun 25, 2012 at 6:47 PM, Michael Barton <Michael.Barton@asu.edu> wrote:

Thanks Marcus,

Please see my post below, starting with "I think I've figured it out..."

Yes, I saw, but the formula below "I think I've figured it out..."
looked strange to me.

Since I'm working in GRASS instead of R,

[ should be "and", not "instead of" ]

Here is an example based on on the landsat imagery in mapset landsat,
North Carolina sample dataset:

i.pca without rescaling of the output:
i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
output_prefix=lsat7_2000_pc rescale=0,0

The eigenvectors are
0.4494, 0.5128, 0.7315
0.6726, 0.3447,-0.6548
0.5879,-0.7863, 0.1901

this is a square 3x3 matrix. R gives the inverse matrix as

0.4493372, 0.6726549, 0.5879235
0.5128129, 0.3446144,-0.7862660
0.7315070,-0.6548316, 0.1899992

Test with band 10:
the general formula is
b1' = (ev1-1' * pc1 + ev1-2' * pc2 + ev1-3' * pc3) + mean(b1)

The mean of lsat7_2000_b10 is 79.925.
r.mapcalc "lsat7_2000_10.pc = (lsat7_2000_pc.1 * 0.4493372 +
lsat7_2000_pc.2 * 0.6726549 + lsat7_2000_pc.3 * 0.5879235) + 79.925"

Difference to original:
r.mapcalc "diff = lsat7_2000_10 - lsat7_2000_10.pc"

r.univar diff -g
n=250325
null_cells=0
cells=250325
min=-0.00140021304849824
max=0.00723340429107111
range=0.00863361733956935
mean=-1.8473599814612e-05

That's close enough, given that the eigenvectors are printed by i.pca
with limited precision (only 4 decimal places).

Without using R, the inverse of a 3x3 matrix can be manually
calculated as follows:

original matrix:
a b c
d e f
g h k

inverse matrix:
(ek - fh) (ch - bk) (bf - ce)
(fg - dk) (ak - cg) (cd - af)
(dh - eg) (gb - ah) (ae - bd)

each entry in this inverse matrix needs to be multiplied with

1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))

in order to get the transformation matrix to be applied to the pc scores.

The formula is more complex for matrices with more than 3 dimensions,
e.g. for i.pca with 6 input bands.

So my questions are:

1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?

I don't think so. Using your formula and testing for the difference gives me
min=-546.255208609669
max=174.771853180844
range=721.027061790513
mean=79.9249815357317

a bit too much IMHO.

2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.

The mean of each band is added back to each recalculated band. See
above r.mapcalc formula for band 10.

3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?

Normalization applies to the input of i.pca and is by default not
done. It needs to be done for heterogenous input data such as e.g.
rainfall, temperature, NDVI, etc. Rescaling is automatically applied
to the output of i.pca unless explicitly disabled with rescale=0,0.

Markus M

Thanks much
Michael

On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
http://grass.osgeo.org/wiki/Principal_Components_Analysis

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

cmbarton · June 25, 2012, 6:21pm

Markus,

This is very helpful. See below for a couple additional questions.

Thanks much
Michael

On Jun 25, 2012, at 12:03 PM, Markus Metz wrote:

On Mon, Jun 25, 2012 at 6:47 PM, Michael Barton <Michael.Barton@asu.edu> wrote:

Thanks Marcus,

Please see my post below, starting with "I think I've figured it out..."

Yes, I saw, but the formula below "I think I've figured it out..."
looked strange to me.

Since I'm working in GRASS instead of R,

[ should be "and", not "instead of" ]

Here is an example based on on the landsat imagery in mapset landsat,
North Carolina sample dataset:

i.pca without rescaling of the output:
i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
output_prefix=lsat7_2000_pc rescale=0,0

The eigenvectors are
0.4494, 0.5128, 0.7315
0.6726, 0.3447,-0.6548
0.5879,-0.7863, 0.1901

this is a square 3x3 matrix. R gives the inverse matrix as

0.4493372, 0.6726549, 0.5879235
0.5128129, 0.3446144,-0.7862660
0.7315070,-0.6548316, 0.1899992

Test with band 10:
the general formula is
b1' = (ev1-1' * pc1 + ev1-2' * pc2 + ev1-3' * pc3) + mean(b1)

The mean of lsat7_2000_b10 is 79.925.
r.mapcalc "lsat7_2000_10.pc = (lsat7_2000_pc.1 * 0.4493372 +
lsat7_2000_pc.2 * 0.6726549 + lsat7_2000_pc.3 * 0.5879235) + 79.925"

Difference to original:
r.mapcalc "diff = lsat7_2000_10 - lsat7_2000_10.pc"

r.univar diff -g
n=250325
null_cells=0
cells=250325
min=-0.00140021304849824
max=0.00723340429107111
range=0.00863361733956935
mean=-1.8473599814612e-05

That's close enough, given that the eigenvectors are printed by i.pca
with limited precision (only 4 decimal places).

Without using R, the inverse of a 3x3 matrix can be manually
calculated as follows:

original matrix:
a b c
d e f
g h k

inverse matrix:
(ek - fh) (ch - bk) (bf - ce)
(fg - dk) (ak - cg) (cd - af)
(dh - eg) (gb - ah) (ae - bd)

Your example from R above *looks* like the inverse of matrix...

a b c
d e f
g h k

is (given rounding errors) ...

a d g
b e h
c f k

That is, transpose rows and columns.
Is this just a coincidence or a reasonable approximation?

each entry in this inverse matrix needs to be multiplied with

1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))

by a(ek - fh)... , you mean a = factor score 1 of a cell, b = factor score 2 of a cell, c = factor score 3 of a cell (not eigenvectors a,b,c or the matrix), right?

in order to get the transformation matrix to be applied to the pc scores.

The formula is more complex for matrices with more than 3 dimensions,
e.g. for i.pca with 6 input bands.

So my questions are:

1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?

I don't think so. Using your formula and testing for the difference gives me
min=-546.255208609669
max=174.771853180844
range=721.027061790513
mean=79.9249815357317

a bit too much IMHO.

Right. Looks like I'll need to use r.univar to calculate the mean of each band to add back in. Thanks again. I will try this out manually and test before updating the pan sharpening routine.

2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.

The mean of each band is added back to each recalculated band. See
above r.mapcalc formula for band 10.

3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?

Normalization applies to the input of i.pca and is by default not
done. It needs to be done for heterogenous input data such as e.g.
rainfall, temperature, NDVI, etc. Rescaling is automatically applied
to the output of i.pca unless explicitly disabled with rescale=0,0.

Thanks. That is what I thought.

Michael

Markus M

Thanks much
Michael

On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
Principal Components Analysis - GRASS-Wiki

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
grass-dev Info Page

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

Markus_GRASS · June 25, 2012, 6:51pm

On Mon, Jun 25, 2012 at 8:21 PM, Michael Barton <Michael.Barton@asu.edu> wrote:

On Jun 25, 2012, at 12:03 PM, Markus Metz wrote:

Here is an example based on on the landsat imagery in mapset landsat,
North Carolina sample dataset:

i.pca without rescaling of the output:
i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
output_prefix=lsat7_2000_pc rescale=0,0

The eigenvectors are
0.4494, 0.5128, 0.7315
0.6726, 0.3447,-0.6548
0.5879,-0.7863, 0.1901

this is a square 3x3 matrix. R gives the inverse matrix as

0.4493372, 0.6726549, 0.5879235
0.5128129, 0.3446144,-0.7862660
0.7315070,-0.6548316, 0.1899992

[...]

Without using R, the inverse of a 3x3 matrix can be manually
calculated as follows:

original matrix:
a b c
d e f
g h k

inverse matrix:
(ek - fh) (ch - bk) (bf - ce)
(fg - dk) (ak - cg) (cd - af)
(dh - eg) (gb - ah) (ae - bd)

Your example from R above *looks* like the inverse of matrix...

a b c
d e f
g h k

is (given rounding errors) ...

a d g
b e h
c f k

That is, transpose rows and columns.
Is this just a coincidence or a reasonable approximation?

Oops, right. A PCA uses an orthogonal transformation, in which case
the inverse of the transformation matrix is identical to the transpose
of the transformation matrix. Easy solution.

each entry in this inverse matrix needs to be multiplied with

1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))

by a(ek - fh)... , you mean a = factor score 1 of a cell, b = factor score 2 of a cell, c = factor score 3 of a cell (not eigenvectors a,b,c or the matrix), right?

No, a,b,c are here Eigenvector values. But this formula is the general
case for the inverse of a 3x3 matrix and not needed here, because the
inverse and the transpose are the same.

Markus M

in order to get the transformation matrix to be applied to the pc scores.

The formula is more complex for matrices with more than 3 dimensions,
e.g. for i.pca with 6 input bands.

So my questions are:

1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?

I don't think so. Using your formula and testing for the difference gives me
min=-546.255208609669
max=174.771853180844
range=721.027061790513
mean=79.9249815357317

a bit too much IMHO.

Right. Looks like I'll need to use r.univar to calculate the mean of each band to add back in. Thanks again. I will try this out manually and test before updating the pan sharpening routine.

2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.

The mean of each band is added back to each recalculated band. See
above r.mapcalc formula for band 10.

3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?

Normalization applies to the input of i.pca and is by default not
done. It needs to be done for heterogenous input data such as e.g.
rainfall, temperature, NDVI, etc. Rescaling is automatically applied
to the output of i.pca unless explicitly disabled with rescale=0,0.

Thanks. That is what I thought.

Michael

Markus M

Thanks much
Michael

On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
http://grass.osgeo.org/wiki/Principal_Components_Analysis

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

cmbarton · June 25, 2012, 7:12pm

Great news. Thanks.

Michael

On Jun 25, 2012, at 12:51 PM, Markus Metz wrote:

On Mon, Jun 25, 2012 at 8:21 PM, Michael Barton <Michael.Barton@asu.edu> wrote:

On Jun 25, 2012, at 12:03 PM, Markus Metz wrote:

Here is an example based on on the landsat imagery in mapset landsat,
North Carolina sample dataset:

i.pca without rescaling of the output:
i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
output_prefix=lsat7_2000_pc rescale=0,0

The eigenvectors are
0.4494, 0.5128, 0.7315
0.6726, 0.3447,-0.6548
0.5879,-0.7863, 0.1901

this is a square 3x3 matrix. R gives the inverse matrix as

0.4493372, 0.6726549, 0.5879235
0.5128129, 0.3446144,-0.7862660
0.7315070,-0.6548316, 0.1899992

[...]

Without using R, the inverse of a 3x3 matrix can be manually
calculated as follows:

original matrix:
a b c
d e f
g h k

inverse matrix:
(ek - fh) (ch - bk) (bf - ce)
(fg - dk) (ak - cg) (cd - af)
(dh - eg) (gb - ah) (ae - bd)

Your example from R above *looks* like the inverse of matrix...

a b c
d e f
g h k

is (given rounding errors) ...

a d g
b e h
c f k

That is, transpose rows and columns.
Is this just a coincidence or a reasonable approximation?

Oops, right. A PCA uses an orthogonal transformation, in which case
the inverse of the transformation matrix is identical to the transpose
of the transformation matrix. Easy solution.

each entry in this inverse matrix needs to be multiplied with

1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))

by a(ek - fh)... , you mean a = factor score 1 of a cell, b = factor score 2 of a cell, c = factor score 3 of a cell (not eigenvectors a,b,c or the matrix), right?

No, a,b,c are here Eigenvector values. But this formula is the general
case for the inverse of a 3x3 matrix and not needed here, because the
inverse and the transpose are the same.

Markus M

in order to get the transformation matrix to be applied to the pc scores.

The formula is more complex for matrices with more than 3 dimensions,
e.g. for i.pca with 6 input bands.

So my questions are:

1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?

I don't think so. Using your formula and testing for the difference gives me
min=-546.255208609669
max=174.771853180844
range=721.027061790513
mean=79.9249815357317

a bit too much IMHO.

Right. Looks like I'll need to use r.univar to calculate the mean of each band to add back in. Thanks again. I will try this out manually and test before updating the pan sharpening routine.

2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.

The mean of each band is added back to each recalculated band. See
above r.mapcalc formula for band 10.

3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?

Normalization applies to the input of i.pca and is by default not
done. It needs to be done for heterogenous input data such as e.g.
rainfall, temperature, NDVI, etc. Rescaling is automatically applied
to the output of i.pca unless explicitly disabled with rescale=0,0.

Thanks. That is what I thought.

Michael

Markus M

Thanks much
Michael

On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:

An inverse PCA can be regarded as the inverse of a transformation
using matrix notation. PC scores are calculated with

b = A a

with A being the transformation matrix composed of the Eigenvectors, a
being the vector of the original values and b the PC scores. What you
now need is inverse of A, A^-1. The original values can then be
retrieved with

A^-1 b = a

A^-1 is the inverse of the transformation matrix A which you can get
in R with solve(A).

For a PCA, the original values are usually shifted to the mean and
optionally scaled to stddev before computing the Eigenvectors. The
mean shift is always performed by i.pca, scaling is optional. That
means that A^-1 b gives the original values shifted to the mean and
maybe scaled, and the mean of each original band needs to be added to
get the original values used as input to i.pca. With scaling applied,
the shifted values need to be multiplied by the stddev for each
original band.

HTH,

Markus M

On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton@asu.edu> wrote:

The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.

However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.

Michael

On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:

Dear all,
first, sorry for the delay...
Here are my 2 cents to be added to the discussion. I re-took in my
hands the John Jensen book.
Accordingly

new brightness values1,1,1 = a1,1*BV1,1,1 +a2,1*BV1,1,2..... + an,1*BV1,1,m

where a=eigenvector and BV=original brightness value.

I found no evidence for the mean term: "- ((b1+b2+b3)/3)"

Michael: do you have a proof/reference for that?

P.S. thanks for involving me in this discussion which is really stimulating!

Duccio

2012/6/7 Michael Barton <michael.barton@asu.edu>:

I think I've figured it out.

If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:

fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)

So to do an inverse PCA on the same data you need to do the following:

b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)

Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.

Michael

On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:

Hi Duccio,

On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton@asu.edu> wrote:

On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:

On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton@asu.edu> wrote:

...

I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.

Maybe there is material in (see m.eigenvector)
Principal Components Analysis - GRASS-Wiki

This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.

@Duccio: any idea about this by chance?

thanks
Markus

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

--
Duccio Rocchini, PhD

http://gis.cri.fmach.it/rocchini/

Fondazione Edmund Mach
Research and Innovation Centre
Department of Biodiversity and Molecular Ecology
GIS and Remote Sensing Unit
Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
Phone +39 0461 615 570
ducciorocchini@gmail.com
duccio.rocchini@fmach.it
skype: duccio.rocchini

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Consortium for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
grass-dev Info Page

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

Hamish · June 25, 2012, 10:23pm

Hi,

FWIW, it's a known feature that GRASS's i.pca and R's version
give transposed output. It's just cosmetic and the row and column
headers tell you what's what. GRASS lists the principal
components row-wise, R presents them column-wise.

See discussion and GRASS vs. R testing/validation w/Nikos from
c. March 2009. (& trac ticket #430)

See also other open i.pca tickets:
"i.pca fails to center data prior to analysis"
http://trac.osgeo.org/grass/ticket/576

"i.pca: close map before writing out metadata?"
http://trac.osgeo.org/grass/ticket/511

"i.pca metadata truncated"
http://trac.osgeo.org/grass/ticket/1279

Hamish

Hamish · June 25, 2012, 10:27pm

Hamish:

See discussion and GRASS vs. R testing/validation w/Nikos
from c. March 2009. (& trac ticket #430)

and the (colorized) discussion here:

http://grass.osgeo.org/wiki/Principal_Components_Analysis#SVD_solution

Hamish

cmbarton · June 25, 2012, 10:29pm

Fortunately, those are not issues for pan sharpening.

Michael

On Jun 25, 2012, at 4:23 PM, Hamish wrote:

Hi,

FWIW, it's a known feature that GRASS's i.pca and R's version
give transposed output. It's just cosmetic and the row and column
headers tell you what's what. GRASS lists the principal
components row-wise, R presents them column-wise.

See discussion and GRASS vs. R testing/validation w/Nikos from
c. March 2009. (& trac ticket #430)

See also other open i.pca tickets:
"i.pca fails to center data prior to analysis"
#576 (i.pca fails to center data prior to analysis) – GRASS GIS

"i.pca: close map before writing out metadata?"
#511 (i.pca: close map before writing out metadata?) – GRASS GIS

"i.pca metadata truncated"
#1279 (i.pca metadata truncated) – GRASS GIS

Hamish

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu

Markus_GRASS · June 26, 2012, 8:59am

On Tue, Jun 26, 2012 at 12:23 AM, Hamish <hamish_b@yahoo.com> wrote:

FWIW,

See discussion and GRASS vs. R testing/validation w/Nikos from
c. March 2009. (& trac ticket #430)

Ticket #430 is fixed in all branches.

See also other open i.pca tickets:
"i.pca fails to center data prior to analysis"
http://trac.osgeo.org/grass/ticket/576

Ticket #576 is fixed in 7.0 and 6.5, awaiting backport to 6.4.

"i.pca: close map before writing out metadata?"
http://trac.osgeo.org/grass/ticket/511

Ticket #511 is fixed in all branches.

"i.pca metadata truncated"
http://trac.osgeo.org/grass/ticket/1279

Ticket #1279 is fixed in trunk, difficult to backport to 6.x because
of the 80x40 limit of the history.

Markus M

NikosAlexandris · June 26, 2012, 10:02am

Michael Barton:

> Thanks Marcus,
> Please see my post below, starting with "I think I've figured it out..."

Markus Metz:

Yes, I saw, but the formula below "I think I've figured it out..."
looked strange to me.

> Since I'm working in GRASS instead of R,

[ should be "and", not "instead of" ]

Here is an example based on on the landsat imagery in mapset landsat,
North Carolina sample dataset:

i.pca without rescaling of the output:
i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
output_prefix=lsat7_2000_pc rescale=0,0

The eigenvectors are
0.4494, 0.5128, 0.7315
0.6726, 0.3447,-0.6548
0.5879,-0.7863, 0.1901

this is a square 3x3 matrix. R gives the inverse matrix as

0.4493372, 0.6726549, 0.5879235
0.5128129, 0.3446144,-0.7862660
0.7315070,-0.6548316, 0.1899992

Test with band 10:
the general formula is
b1' = (ev1-1' * pc1 + ev1-2' * pc2 + ev1-3' * pc3) + mean(b1)

The mean of lsat7_2000_b10 is 79.925.
r.mapcalc "lsat7_2000_10.pc = (lsat7_2000_pc.1 * 0.4493372 +
lsat7_2000_pc.2 * 0.6726549 + lsat7_2000_pc.3 * 0.5879235) + 79.925"

Difference to original:
r.mapcalc "diff = lsat7_2000_10 - lsat7_2000_10.pc"

r.univar diff -g
n=250325
null_cells=0
cells=250325
min=-0.00140021304849824
max=0.00723340429107111
range=0.00863361733956935
mean=-1.8473599814612e-05

That's close enough, given that the eigenvectors are printed by i.pca
with limited precision (only 4 decimal places).

The math demonstrated above is pretty clear and Correct! The above
should/will go to the wiki, i.e. in a separate "Inverse PCA" page. Hopefully
I'll get to that as soon as possible (=next month!).

A general note though:

in a new produced PC-coordinate system, one has to be very careful about its
interpretation. There is no prior knowledge on what (new values) is what
(i.e. relation to land cover) exactly, as in the way we do understand that
reflectance, for example, can be associated -- and explained -- with specific
land cover types or other surface features.

We have to keep in mind that the re-distribution of the information contained
in the input bands, does strictly follow math rules (i.e. achieve a minimal
RMSE, extract highest variances into first "principal" components). It does
not care about class labels.

Any manipulation of the data, prior to and while aiming at an inverse PCA,
might mix-up things in a way that the final data are very difficult to interpret
and, especially, to automatically post-process (i.e. use well known Remote
Sensing algorithms).

In the case of a pan-sharpening attempt, I would thoroughly test.

Without using R, the inverse of a 3x3 matrix can be manually
calculated as follows:

original matrix:
a b c
d e f
g h k

inverse matrix:
(ek - fh) (ch - bk) (bf - ce)
(fg - dk) (ak - cg) (cd - af)
(dh - eg) (gb - ah) (ae - bd)

each entry in this inverse matrix needs to be multiplied with

1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))

in order to get the transformation matrix to be applied to the pc scores.

The formula is more complex for matrices with more than 3 dimensions,
e.g. for i.pca with 6 input bands.

> So my questions are:
>
> 1) Do I have the inversion correct in terms of how it needs to be
> calculated in GRASS?
I don't think so. Using your formula and testing for the difference gives me
min=-546.255208609669
max=174.771853180844
range=721.027061790513
mean=79.9249815357317

a bit too much IMHO.

It is a "bit" more than a "bit" too much...

> 2) Even though the mean of all bands seems to be subtracted from each
> factor score, does inverting the matrix mean that the mean of *each* band
> is added back to its transformed value? Adding either the mean of all
> original bands or 1 original band seems to produce values that are much
> higher than the original, and so need to be rescaled. Maybe this is OK.

The mean of each band is added back to each recalculated band. See
above r.mapcalc formula for band 10.

> 3) I do not normalize or rescale in i.pca. This seems to make it easier to
> do the inverse PCA with fewer calculations. Is there any reason I
> *should* rescale and/or normalize?

Normalization applies to the input of i.pca and is by default not
done. It needs to be done for heterogenous input data such as e.g.
rainfall, temperature, NDVI, etc. Rescaling is automatically applied
to the output of i.pca unless explicitly disabled with rescale=0,0.

Adding my 1 old drachma:

PCA works on/with global stats. One would want to normalise exactly when it
comes to "heterogenous" input data, speak largely different ranges of input
data. For example there is an input "band" which ranges from 0 to 100 and
another one which ranges from 0 to 100000. Normalisation would help to make
different inputs comparable. That is mainly the reason to apply "scaling", as
far as I understand PCA.

Note, though, that a normalisation will swipe out "subtle" differences between
input "bands" of "similar" range which might or might _not_ be
useful/destructive -- it always depends on what you want to do with your data.
There is no guarantee that you don't want to normalise even for inputs of the
same nature/unit.

I guess that in the case of a pan-sharpening attempt, one would not want to
normalise the input data!

[rest deleted]

Nikos

NikosAlexandris · June 26, 2012, 10:02am

Hamish wrote:

FWIW,

> See discussion and GRASS vs. R testing/validation w/Nikos from
> c. March 2009. (& trac ticket #430)

Markus M wrote:

Ticket #430 is fixed in all branches.

> See also other open i.pca tickets:
> "i.pca fails to center data prior to analysis"
> http://trac.osgeo.org/grass/ticket/576

Ticket #576 is fixed in 7.0 and 6.5, awaiting backport to 6.4.

> "i.pca: close map before writing out metadata?"
> http://trac.osgeo.org/grass/ticket/511

Ticket #511 is fixed in all branches.

> "i.pca metadata truncated"
> http://trac.osgeo.org/grass/ticket/1279

Ticket #1279 is fixed in trunk, difficult to backport to 6.x because
of the 80x40 limit of the history.

I, too, noticed, that issues have been fixed. I keep wanting to "clean" the
respective wiki-page. I feel it is kind of "noisy". I hope that, with the
Universe conspiring on my side, I'll jump-in (grass ML) next month again

NikosAlexandris · June 26, 2012, 10:24am

Michael Barton:

[...]

However, my question is about performing an *inverse pca*--getting back to
the original values from the principal components maps.

Michael, getting back to the original values can _only_ be done if one does
not "touch" the data in any of the intermediate steps, i.e. Input > EVD (or
SVD) > Inverse PCA > Original values.

If one alters the data at any step prior to the Eigenanalysis or SVD, I don't
think it is possible to land back on "level 1". From the moment that global
stats of a multivariarte dataset, subject to PCA, are changed, one will
probably jump into a "new" reality. This means that it takes an extra effort
to interpret the "new" stuff.

The idea of pca sharpening is that you perform a pca, then do an inverse pca
substituting the pan band for pc1 to enhance the resolution.

I haven't tried PCA sharpening. So, apologies for my simplistic question(s),
I just want to understand the trick here.

Which resolution is to be enhanced? The geometric? Is it meant to keep PC1
and mix it with the rest, or keep the Pan and throw away PC1?

Principal Component 1 will contain the highest variance of your input data --
which, in fact, is a composition of different amount of information originated
from all input bands. If you throw that away you are left with a dataset which
is likely to be useless!

The equations I show below seem to work, but what I've read indicates that
it is not possible to *exactly* get the original values back; you can only
approximate them.

As Markus' demonstration showed in another post, the results can be close
enough so the differences can be disregarded. As far as I have understood PCA,
it depends on how many decimals are taken into account, while doing all the
math and _not_ effectively altering the data at any of the intermediate steps.

Pff, it's been a while I got my hands dirty with PCA and I might forget
something here.

[...]

>>>>>> I'm working on a pan sharpening script that will combine your
>>>>>> i.fusion.brovey with options to do other pan sharpening methods. An
>>>>>> IHS transformation is easy. I think that a PCA sharpening is doable
>>>>>> too if I can find an equation to rotate the PCA channels back into
>>>>>> unrotated space--since i.pca does provide the eigenvectors.>>>>>

[...]

Thanks, Nikos

Hamish · June 26, 2012, 12:25pm

Nikos wrote:

Which resolution is to be enhanced? The geometric? Is it meant
to keep PC1 and mix it with the rest, or keep the Pan and throw
away PC1?

Principal Component 1 will contain the highest variance of
your input data -- which, in fact, is a composition of different
amount of information originated from all input bands. If you
throw that away you are left with a dataset which is likely to
be useless!

(not talking about pan-sharpening, but in general,)
how about the situation where you have a map data which is
loudly dominated by a signal, and you want to try and remove
that loud signal so that you can look at the subtle variations
caused by a different source that the loud signal had been
masking?

is removing PC1 then back-inverting a suitable method for that
sort of task? or is there another more suitable method?

thanks,
Hamish

NikosAlexandris · June 26, 2012, 1:31pm

Nikos:

> Which resolution is to be enhanced? The geometric? Is it meant
> to keep PC1 and mix it with the rest, or keep the Pan and throw
> away PC1?

> Principal Component 1 will contain the highest variance of
> your input data -- which, in fact, is a composition of different
> amount of information originated from all input bands. If you
> throw that away you are left with a dataset which is likely to
> be useless!

Hamish:

(not talking about pan-sharpening, but in general,)
how about the situation where you have a map data

( we are talking about multi-dimensional data, right? )

which is loudly dominated by a signal, and you want to try and remove that
loud signal so that you can look at the subtle variations caused by a
different source that the loud signal had been masking?

Yes, this _can_ be a perfect use-case.

Especially if the presence of the feature in question, is in at least one or
in some of the input dimensions near/close to zero. This last statement is
based on Pielou's (flawless explanation of how PCA works) [1] and own
experiences [2].

A separation/isolation attempt of the feature in question from dominant
variances will be "supported". The "loud" signal would be channeled among the
first few PCs and the "subtle variations" _could_ then be more evident in some
of the higher order components.

All in all, one has to look at the numbers -- drawing conclusions from the PC
images is not safe!

is removing PC1 then back-inverting a suitable method for that sort of task?

Short answer: yes, it can be, but back-inverting might not be necessary!

Longer story: if the "subtle variations" (featueres of low(er) variance,
rather homogenous stuff) are, as expected, more evident (read: enhanced as
compared to the original data set) in some of the higher order components, why
bother to back-invert? Supervised classification techniques can directly
operate on selected PCs and attempt to extract whatever is of your interest.

More on the subject of back-inverting -- quotting from Dr. Koutsias paper:

"A critical issue in the back-transformation process is the
amount of information taken from each PC axis. The original
spectral pattern of the satellite image is modified to a degree that
depends on the amount of the information taken from each PC axis."

In this work (mapping burned areas), "back-transformation coefficients", in the
range of 0 to 1, were worked-out in order to 'grep' specific percentages (0 to
100%) from each of the produced PCs and channel them back (via inverse-PCA) to
a data set _similar_ to the original one, though different to the extent of the
removed information (excluded PC).

or is there another more suitable method?

Dunno more...
Kindest regards, Nikos

---
[1] Book: Pielou, E. C. The interpretation of ecological data: a primer on
classification and ordination Wiley, New York, 1984

[2] <Dissertation: Burned area mapping via non-centered PCA using Public
Domain Data and Free Open Source Software Institut für Forstökonomie, Fakultät
für Forst- und Umweltwissenschaften, Albert-Ludwigs-Universität Freiburg,
2011>

[3] <Koutsias, N.; Mallinis, G. & Karteris, M. A forward/backward principal
component analysis of Landsat-7 ETM+ data to enhance the spectral signal of
burnt surfaces ISPRS Journal of Photogrammetry and Remote Sensing, 2009, 64,
37>

NikosAlexandris · June 26, 2012, 1:34pm

Allow to me clarify the sentence below to let it fit into the frame of the
current thread!

On Tuesday 26 of June 2012 13:24:03 Nikos Alexandris wrote:

If one alters the data at any step prior to the Eigenanalysis or SVD, I
don't think it is possible to land back on "level 1".

I meant: if the data are manipulated, at any step, prior to the "back-
inverting" EVD or SVD...

Thanks, Nikos

cmbarton · June 26, 2012, 4:59pm

Hi Nicos
On Jun 26, 2012, at 4:24 AM, Nikos Alexandris wrote:

Michael Barton:

[...]

However, my question is about performing an *inverse pca*--getting back to
the original values from the principal components maps.

Michael, getting back to the original values can _only_ be done if one does
not "touch" the data in any of the intermediate steps, i.e. Input > EVD (or
SVD) > Inverse PCA > Original values.

If one alters the data at any step prior to the Eigenanalysis or SVD, I don't
think it is possible to land back on "level 1". From the moment that global
stats of a multivariarte dataset, subject to PCA, are changed, one will
probably jump into a "new" reality. This means that it takes an extra effort
to interpret the "new" stuff.

Right. That is why I'm not doing any alteration of the data after transforming to PC's.

The idea of pca sharpening is that you perform a pca, then do an inverse pca
substituting the pan band for pc1 to enhance the resolution.

I haven't tried PCA sharpening. So, apologies for my simplistic question(s),
I just want to understand the trick here.

Which resolution is to be enhanced? The geometric? Is it meant to keep PC1
and mix it with the rest, or keep the Pan and throw away PC1?

Principal Component 1 will contain the highest variance of your input data --
which, in fact, is a composition of different amount of information originated
from all input bands. If you throw that away you are left with a dataset which
is likely to be useless!

The way this works is to:

1) transform 3 lower resolution bands to 3 principal components
2) replace PC1 with the higher resolution panchromatic band (under the reasonable assumption that the pan band will include more of the total spectral variability than will any more spectrally limited band). Histogram matching the pan band to PC1 is recommended here.
3) do an inverse PCA to get back to the original bands with a similar range of spectral response but with higher spatial resolution.

There have been--and continue to be--studies of the performance of different pan sharpening algorithms from various perspectives. For myself, pan sharpening can help with visually resolving more features in greater detail. But this is at the cost of making it considerably more difficult to understand what the pixel values of the enhanced bands mean.

The equations I show below seem to work, but what I've read indicates that
it is not possible to *exactly* get the original values back; you can only
approximate them.

As Markus' demonstration showed in another post, the results can be close
enough so the differences can be disregarded. As far as I have understood PCA,
it depends on how many decimals are taken into account, while doing all the
math and _not_ effectively altering the data at any of the intermediate steps.

Yes. Markus' demo made me more comfortable with the algorithm overall. When you replace PC1 with the pan band, of course, you don't get back to the original values. But the ranges look pretty good now.

I'll attach the script here in case you want to try it. Ver. 2 and 3 represent different kinds of optimizing for calculation speed. v3 only works with a new GRASS python function that Hamish committed to trunk yesterday. V2 should work with all current versions of GRASS.

Here are some helpful references:

Amarsaikhan, D., & Douglas, T. (2004). Data fusion and multisource image classification. International Journal of Remote Sensing, 25(17), 3529–3539.
Behnia, P. (2005). Comparison between four methods for data fusion of ETM+ multispectral and pan images. Geo-spatial Information Science, 8(2), 98–103. doi:10.1007/BF02826847
Du, Q., Younan, N. H., King, R., & Shah, V. P. (2007). On the Performance Evaluation of Pan-Sharpening Techniques. Geoscience and Remote Sensing Letters, IEEE, 4(4), 518 –522. doi:10.1109/LGRS.2007.896328
Karathanassi, V., Kolokousis, P., & Ioannidou, S. (2007). A comparison study on fusion methods using evaluation indicators. International Journal of Remote Sensing, 28(10), 2309–2341. doi:10.1080/01431160600606890

Michael

(attachments)

i.pansharpen2 (14.2 KB)
i.pansharpen3 (14 KB)

NikosAlexandris · June 26, 2012, 9:59pm

Thank you for the details Michael!

(cc-ing to Dr. Koutsias, this discussion might be of his interest)

Michael Barton:

> [...]

>> However, my question is about performing an *inverse pca*--getting back
>> to the original values from the principal components maps.

Nikos:

> Michael, getting back to the original values can _only_ be done if one
> does not "touch" the data in any of the intermediate steps, i.e. Input >
> EVD (or SVD) > Inverse PCA > Original values.

> If one alters the data at any step prior to the Eigenanalysis or SVD, I
> don't think it is possible to land back on "level 1". From the moment
> that global stats of a multivariarte dataset, subject to PCA, are
> changed, one will probably jump into a "new" reality. This means that it
> takes an extra effort to interpret the "new" stuff.

Michael:

Right. That is why I'm not doing any alteration of the data after
transforming to PC's.
>> The idea of pca sharpening is that you perform a pca, then do an inverse
>> pca substituting the pan band for pc1 to enhance the resolution.

Nikos:

> I haven't tried PCA sharpening. So, apologies for my simplistic
> question(s), I just want to understand the trick here.

> Which resolution is to be enhanced? The geometric? Is it meant to keep
> PC1 and mix it with the rest, or keep the Pan and throw away PC1?

> Principal Component 1 will contain the highest variance of your input data
> -- which, in fact, is a composition of different amount of information
> originated from all input bands. If you throw that away you are left with
> a dataset which is likely to be useless!

Michael:

The way this works is to:

1) transform 3 lower resolution bands to 3 principal components

2) replace PC1 with the higher resolution panchromatic band (under the
reasonable assumption that the pan band will include more of the total
spectral variability than will any more spectrally limited band). Histogram
matching the pan band to PC1 is recommended here.

This assumption is, indeed, necessary and sounds pretty rational. Interesting
stuff.

3) do an inverse PCA to get back to the original bands with a similar range
of spectral response but with higher spatial resolution.

There have been--and continue to be--studies of the performance of different
pan sharpening algorithms from various perspectives. For myself, pan
sharpening can help with visually resolving more features in greater
detail. But this is at the cost of making it considerably more difficult to
understand what the pixel values of the enhanced bands mean.

>> The equations I show below seem to work, but what I've read indicates
>> that it is not possible to *exactly* get the original values back; you
>> can only approximate them.

Nikos:

> As Markus' demonstration showed in another post, the results can be close
> enough so the differences can be disregarded. As far as I have understood
> PCA, it depends on how many decimals are taken into account, while doing
> all the math and _not_ effectively altering the data at any of the
> intermediate steps.

Michael:

Yes. Markus' demo made me more comfortable with the algorithm overall. When
you replace PC1 with the pan band, of course, you don't get back to the
original values. But the ranges look pretty good now.

I'll attach the script here in case you want to try it. Ver. 2 and 3
represent different kinds of optimizing for calculation speed. v3 only
works with a new GRASS python function that Hamish committed to trunk
yesterday. V2 should work with all current versions of GRASS.

Thanks a lot. Noted on my ToDo list: "Check MichaelB's pan-sharpening scripts
(after next week)".

Here are some helpful references:

Amarsaikhan, D., & Douglas, T. (2004). Data fusion and multisource image
classification. International Journal of Remote Sensing, 25(17), 3529–3539.

Behnia, P. (2005). Comparison between four methods for data fusion of ETM+
multispectral and pan images. Geo-spatial Information Science, 8(2),
98–103. doi: 10.1007/BF02826847

Du, Q., Younan, N. H., King, R., & Shah, V. P. (2007). On the Performance
Evaluation of Pan-Sharpening Techniques. Geoscience and Remote Sensing
Letters, IEEE, 4(4), 518 –522. doi: 10.1109/LGRS.2007.896328

Karathanassi, V., Kolokousis, P., & Ioannidou, S. (2007). A comparison study
on fusion methods using evaluation indicators. International Journal of
Remote Sensing, 28(10), 2309–2341. doi: 10.1080/01431160600606890

NikosAlexandris · June 27, 2012, 12:34pm

[Addendum]

Hamish:

> (not talking about pan-sharpening, but in general,)
> how about the situation where you have a map data
> which is loudly dominated by a signal, and you want to try and remove that
> loud signal so that you can look at the subtle variations caused by a
> different source that the loud signal had been masking?

Nikos:

Yes, this _can_ be a perfect use-case.

Especially if the presence of the feature in question, is in at least one or
in some of the input dimensions near/close to zero. This last statement is
based on Pielou's (flawless explanation of how PCA works) [1] and own
experiences [2].

If the above holds true (near-zero projection, in one or some dimensions, of
the features of interest), and in some way, these "subtle variations" do form
distinct clusters, I'd be keen to try out non-centered PCA and (re-)confirm
Pielou and myself (claiming, for example, higher Producer's accuracies as in
--my-- burned area mapping).

Nikos

A separation/isolation attempt of the feature in question from dominant
variances will be "supported". The "loud" signal would be channeled among
the first few PCs and the "subtle variations" _could_ then be more evident
in some of the higher order components.

All in all, one has to look at the numbers -- drawing conclusions from the
PC images is not safe!

[...]