[GRASS-user] Creating PCA Plot

Hi,

How can I create a PCA plot for two channels of a Landsat image?

i.pca outputs eigenvalues, eigenvectors and percentage importance.

Could anybody explain how to plot it?

Does i.pca transform/change pixel values?

Regards,
Rashad

Rashad M wrote:

Hi,

Hi Rashad :-)

How can I create a PCA plot for two channels of a Landsat image?

In GRASS, use d.correlate; in R there are too many options to list!

i.pca outputs eigenvalues, eigenvectors and percentage importance.

Quoting myself :-p:

--%<---
The eigenvalues define proportionally the length of the axes of variation
and the eigen or characteristic vectors define the direction of the
variation (Ahearn and Wee, 1991). Since both the eigenvectors and the PCs
only define directions, they can be arbitrarily multiplied by −1
(Cadima and Jolliffe, 2009).
--->%--

Effectively, the percentages indicate the share of the total variance that
ends up in each Principal Component -- remember, PCs are sorted from
the one that holds the largest variance to the one that holds the smallest
variance.
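
To make those three outputs concrete, here is a minimal Python sketch for the two-band case. It is not what i.pca does internally, only the textbook calculation on the 2x2 covariance matrix of two bands. The function name `pca_two_bands` and the pixel values are made up for illustration, and the sketch assumes the bands are not both constant (otherwise the percentages would divide by zero).

```python
import math


def pca_two_bands(band1, band2):
    """Return (eigenvalues, eigenvectors, percentages) for two bands.

    band1 and band2 are flat lists holding the pixel values of two
    co-registered bands (hypothetical input, same length, n >= 2).
    """
    n = len(band1)
    m1 = sum(band1) / n
    m2 = sum(band2) / n
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum((x - m1) ** 2 for x in band1) / (n - 1)
    c = sum((y - m2) ** 2 for y in band2) / (n - 1)
    b = sum((x - m1) * (y - m2) for x, y in zip(band1, band2)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first
    half_trace = (a + c) / 2
    root = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [half_trace + root, half_trace - root]
    # Direction of the first principal axis; the second is perpendicular
    theta = 0.5 * math.atan2(2 * b, a - c)
    eigenvectors = [(math.cos(theta), math.sin(theta)),
                    (-math.sin(theta), math.cos(theta))]
    # "Percentage importance": share of the total variance per component
    total = sum(eigenvalues)
    percentages = [100 * ev / total for ev in eigenvalues]
    return eigenvalues, eigenvectors, percentages


# Invented pixel values for two strongly correlated bands
values, vectors, percent = pca_two_bands([10, 20, 30, 40, 50],
                                         [12, 19, 33, 38, 52])
print("eigenvalues:", values)
print("eigenvectors:", vectors)
print("percent:", percent)
```

Multiplying either eigenvector by −1 would give an equally valid result, which is the sign ambiguity mentioned in the quote above.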

Could anybody explain how to plot it?

You mean just a bi-variate scatter-plot?

PCA is a linear transformation for multivariate data sets. The new,
transformed variables (or dimensions or channels or you name them) can
then be plotted the same way as any other raster map, e.g. d.histogram
for a single map, d.correlate for a scatter-plot, and probably more.

Does i.pca transform/change pixel values?

Yes.

Normally, one would select Landsat bands of interest, i.e. bands that
profile the "wanted" landscape features. A PCA would then transform this
set of bands into something new: sorted variables in which the original
variance of the data is redistributed so that the first Principal
Components contain most of it, while the higher-order transformed
variables contain the smallest amounts of the original variance. Note that
changes tend to appear in some of the higher-order PCs. Noise is, most of
the time, accumulated in the last PC. And, of course, there are many and
diverse uses of PCs (like compression, fusion, etc.).

So, once you have your PCs of interest, you can scatter-plot them in
GRASS with d.correlate, for example. In R, however, you can load, in
theory, any number of dimensions (in your wording, channels) and
plot really nice and fancy stuff.
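
As a rough illustration of "yes, the values change", here is a small Python sketch that rotates two bands onto their principal axes and collects the point pairs a scatter-plot would draw. It is only a sketch under assumptions: `to_principal_components` is a made-up helper, the pixel values are invented, and it does no rescaling of the output, which i.pca can additionally apply.

```python
import math
from statistics import mean, variance


def to_principal_components(band1, band2):
    """Rotate two mean-centred bands (flat lists) onto their principal axes."""
    m1, m2 = mean(band1), mean(band2)
    x = [v - m1 for v in band1]
    y = [v - m2 for v in band2]
    n = len(x)
    # Entries of the sample covariance matrix [[a, b], [b, c]]
    a = sum(v * v for v in x) / (n - 1)
    c = sum(v * v for v in y) / (n - 1)
    b = sum(p * q for p, q in zip(x, y)) / (n - 1)
    # Angle of the axis along which the data varies the most
    theta = 0.5 * math.atan2(2 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    pc1 = [cos_t * p + sin_t * q for p, q in zip(x, y)]
    pc2 = [-sin_t * p + cos_t * q for p, q in zip(x, y)]
    return pc1, pc2


# Invented pixel values for two correlated bands
band1 = [10, 20, 30, 40, 50, 60]
band2 = [14, 18, 33, 37, 54, 58]
pc1, pc2 = to_principal_components(band1, band2)

# The pixel values are new, but the total variance is only redistributed
total_before = variance(band1) + variance(band2)
total_after = variance(pc1) + variance(pc2)
print("total variance before/after:", total_before, total_after)
print("variance in PC1 and PC2:", variance(pc1), variance(pc2))

# One (PC1, PC2) pair per pixel: the points of a bi-variate scatter-plot
scatter_points = list(zip(pc1, pc2))
```

The `scatter_points` list can be handed to whatever plotting tool you prefer; inside GRASS, d.correlate does the equivalent directly on the raster maps.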

I have tried to clearly present the PCA concept in my work. Will send you
a link and stuff of mine -- they might be useful for you to make them even
better (!).

Additionally, I have recently seen some very nice tri-variate PC plots in
some presentation... (I don't remember where now, but it was certainly
someone inside the GRASS GIS community!).

Ah, don't forget to have a look at the GRASS-Wiki (and maybe help iron out
the page!): <http://grasswiki.osgeo.org/wiki/Principal_Components_Analysis>.

Best, N

Thanks Nikos Alexandris,

This is helpful for me. So to get a PCA plot I just need to use d.correlate with the output rasters from i.pca, right?

Also, does i.pca change the spatial position, so that the output of i.pca will have changed pixel positions?

Regards,
Rashad

Rashad M wrote:

This is helpful for me. So to get a PCA plot I just need to use
d.correlate with the output rasters from i.pca, right?

Well,

PCA (in GRASS, i.pca) will output as many variables as you feed into it --
they will be the transformed variables.

Out of those, usually 2-dimensional plots, or even 3-dimensional plots,
can be created -- in various combinations. To exemplify, if you transform 6
Landsat bands and get 6 new Principal Components, then you would pick
combinations of 2 PCs to create 2D scatter-plots or, at most,
combinations of 3 PCs to create 3D plots.
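
For the six-band example, the candidate plots can be enumerated with a few lines of Python. This is just a counting sketch; the map names `pc.1` to `pc.6` are hypothetical placeholders, not names that i.pca is guaranteed to produce.

```python
from itertools import combinations

# Hypothetical names for the six output maps of a six-band PCA
pcs = [f"pc.{i}" for i in range(1, 7)]

pairs = list(combinations(pcs, 2))    # candidates for 2D scatter-plots
triples = list(combinations(pcs, 3))  # candidates for 3D plots

print(len(pairs), "pairs;", len(triples), "triples")
for first, second in pairs[:3]:
    print(first, "vs", second)
```

Six components give 15 possible pairs and 20 possible triples, so in practice one would narrow the choice down first, for example by looking at the percentages of variance.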

(We are, after all, bound to at most 3D, right? :-D)

Anna K also showed me the wxIClass stuff, which might help in searching for
class separabilities! <http://grasswiki.osgeo.org/wiki/WxIClass>.

It's the user's responsibility, however, to identify which PCs are of
interest and which combination will offer useful information.

Also, does i.pca change the spatial position, so that the output of i.pca
will have changed pixel positions?

No. It's simply a mathematical linear transformation -- it does not affect
the "spatial" domain. In my mind, I see it simply as a
redistribution of the original data's variance(s) (?).
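
A tiny Python sketch of why the spatial domain stays put: each output cell is computed only from the input cells at the same row and column. The helper `apply_component`, the 2x3 grids and the weights are all invented for illustration; in a real run the weights would be an eigenvector reported by the PCA.

```python
def apply_component(band1, band2, weights, means):
    """Compute one PC raster cell by cell from two co-registered band grids."""
    w1, w2 = weights
    m1, m2 = means
    return [
        [w1 * (v1 - m1) + w2 * (v2 - m2) for v1, v2 in zip(row1, row2)]
        for row1, row2 in zip(band1, band2)
    ]


# Two invented 2x3 band grids covering the same area
band1 = [[10, 20, 30],
         [40, 50, 60]]
band2 = [[12, 18, 33],
         [41, 47, 62]]

# Hypothetical unit-length weights, plus the mean of each band
pc = apply_component(band1, band2, weights=(0.6, 0.8), means=(35.0, 35.5))

# Same number of rows and columns as the input: only the values differ
print(len(pc), "rows x", len(pc[0]), "columns")
```

The value at row 0, column 0 of `pc` depends only on row 0, column 0 of the two input bands, so no pixel moves anywhere.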

Nikos

[..]