Vertex Component Analysis in Hyperspectral Unmixing

Edward Leaver
Icarus Resources LLC

July 10, 2013


Contents

 1 Introduction
 2 Image Classification and Statistics
  2.1 Image Mean Vector and Covariance Matrix
  2.2 Principal Component Transformation
  2.3 Minimum Noise Fraction
 3 Virtual Dimensionality
  3.1 HFC
  3.2 NWHFC
  3.3 NSP
  3.4 OSP
 4 Vertex Component Analysis
  4.1 VCA Algorithm

1 Introduction

Remote Sensing Analysis of Alteration Mineralogy Associated With Natural Acid Drainage
in the Grizzly Peak Caldera, Sawatch Range, Colorado
David W. Coulter
Ph.D. Thesis Colorado School of Mines, 2006


The Grizzly Peak Caldera is located in Colorado's Mineral Belt, approximately 15 miles southeast of Aspen. It is the product of a volcanic eruption 35 million years ago that ejected some 600 km³ of magma. With a volcanic explosivity index of 7, it was at least four times larger than Tambora.[3, sec 1.5]



Figure 1: West Red (Ruby Mountain) From Enterprise Peak

Coulter sought to identify acidic conditions by the different weathering states of iron oxide, which forms Hematite (red) in acidic conditions, and Goethite (light yellow) and Jarosite (dark yellow) as pH increases, Jarosite being the most immediate oxidation product of iron sulfide (pyrite).



Figure 2: West Red Iron Endmembers from AVIRIS.
Red: Hematite; Green: Goethite (high pH); Blue: Jarosite (low pH)

Note on Partial Unmixing:

“Unmixing is critical in imaging spectrometry since virtually every pixel contains some macroscopic mixture of materials. The theory of, and methods for, unmixing of spectroscopic signatures are found in a number of sources. Hapke (1993) provides models for linear and non-linear spectral mixing and a discussion of the criteria for using each approach. In Earth remote sensing, a linear mixing model is typically used. Full unmixing, which assumes that the spectra of all possible components are known, is described by van der Meer (2000) and Boardman (1989). Since it is often impossible to identify, a priori, all possible components, partial unmixing is an important tool. Match Filtering (MF) (Harsanyi and Chang 1994), Constrained Energy Minimization (CEM – similar to Match Filtering) (Farrand and Harsanyi 1997; Farrand 2001), and Mixture Tuned Match Filtering (MTMF™) (Boardman et al. 1995) are commonly used methods for partial unmixing. MTMF™ is probably the most popular unmixing method used for geologic remote sensing.”[3, Coulter sec 1.4.4]

(Emphasis Added.)

2 Image Classification and Statistics

2.1 Image Mean Vector and Covariance Matrix

Following [10, White 2005]: if the image has P bands and N pixels, the mean vector is

m_P = (m_1, m_2, \ldots, m_P)^T = \frac{1}{N}\sum_{j=1}^{N} f_j \qquad (1)

where f_j is the j-th pixel vector of the image

f_j = (f_1, f_2, \ldots, f_P)_j^T \qquad (2)

The image covariance matrix C_{P \times P} is [2, Chang 2013, eq. (6.6)]

C_{P \times P} = \frac{1}{N}\,(F_{P \times N} - M_{P \times N})(F_{P \times N} - M_{P \times N})^T \qquad (3)

where F_{P \times N} is the matrix of N pixel vectors, each of length P,

F_{P \times N} = [f_1, f_2, f_3, \ldots, f_N] \qquad (4)

and M_{P \times N} is the matrix of N identical mean vectors (P rows by N columns):

M_{P \times N} = [m_P, m_P, m_P, \ldots, m_P] = m_P \mathbf{1}_{1 \times N} \qquad (5)

where \mathbf{1}_{1 \times N} is a 1 \times N row vector of ones.
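For concreteness, here is a minimal NumPy sketch (not from the cited sources) of equations (1)-(5); F is a hypothetical P x N image matrix and the 1/N normalization follows Chang.

import numpy as np

def image_mean_and_covariance(F):
    """F: P x N image matrix (P bands, N pixels).
    Returns the P-vector of band means (eq. 1) and the P x P covariance
    matrix C = (1/N)(F - M)(F - M)^T (eq. 3)."""
    P, N = F.shape
    m = F.mean(axis=1)                # eq. (1): mean over the N pixels
    X = F - m[:, np.newaxis]          # zero-mean image matrix (cf. eq. 6)
    C = (X @ X.T) / N                 # eq. (3): 1/N normalization as in Chang
    return m, C

# toy usage: 5 bands, 1000 pixels of random data
F = np.random.rand(5, 1000)
m, C = image_mean_and_covariance(F)
print(m.shape, C.shape)               # (5,) (5, 5)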

2.2 Principal Component Transformation

The principal component (Karhunen-Loève) transformation [7, Smith 1985] [11, White 2005] (cf. the GRASS imagery module i.pca): let

X_{P \times N} = (F_{P \times N} - M_{P \times N}) \quad \text{zero-mean image matrix} \qquad (6)
Z_{P \times N} = A_{P \times P} X_{P \times N} \qquad (7)

where
F_{P \times N} is the input-image multi-pixel vector (P bands by N pixels),
M_{P \times N} is the mean-vector matrix,
Z_{P \times N} is the output-image multi-pixel vector, and
A_{P \times P} is the P \times P matrix whose rows are the eigenvectors of the covariance matrix C_{P \times P}, arranged by decreasing magnitude of eigenvalue, as typically returned by SVD routines.
Z_{ik} = \sum_{j=1}^{P} a_{ij} X_{jk}, \quad i = 1, 2, \ldots, P; \; k = 1, 2, \ldots, N \qquad (8)
Z_k^T = A_{P \times P} X_k^T \quad \text{orthogonal w.r.t. the } N \text{ pixels:} \qquad (9)

\lambda_i \delta_{il} = \frac{1}{N}\sum_{k=1}^{N} Z_{ik} Z_{lk} \qquad (10)
= \frac{1}{N}\sum_{k=1}^{N} \Big(\sum_{j=1}^{P} a_{ij} X_{jk}\Big)\Big(\sum_{m=1}^{P} a_{lm} X_{mk}\Big) \qquad (11)
= \sum_{j=1}^{P}\sum_{m=1}^{P} a_{ij} a_{lm} \Big[\frac{1}{N}\sum_{k=1}^{N} X_{jk} X_{mk}\Big] \qquad (12)
= \sum_{j=1}^{P}\sum_{m=1}^{P} a_{ij} a_{lm} C_{jm} \qquad (13)
= a_i^T C_{P \times P}\, a_l \qquad (14)

where C_{P \times P} is the symmetric positive-definite image covariance matrix and the a_i are its orthonormal eigenvectors with eigenvalues \lambda_i.

The magnitudes of the \lambda_i impose an ordering on the transformed component vectors Z_{ik} = \sum_{j=1}^{P} a_{ij} X_{jk}. Those with the largest \lambda_i, such that \lambda_i / \lambda_{max} > tol, are the Principal Components. The tolerance tol should be related to the noise floor.
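A corresponding sketch of the principal component transformation of equations (6)-(14), again assuming a hypothetical P x N image matrix F; the eigendecomposition is done here with numpy.linalg.eigh rather than an SVD routine, and the tolerance default is illustrative.

import numpy as np

def principal_components(F, tol=1e-4):
    """Principal component transform, eqs. (6)-(8), sketched with NumPy.
    F: P x N image matrix.  Returns (Z, A, eigenvalues, n_pc)."""
    P, N = F.shape
    X = F - F.mean(axis=1, keepdims=True)      # zero-mean image matrix, eq. (6)
    C = (X @ X.T) / N                          # covariance, eq. (3)
    evals, evecs = np.linalg.eigh(C)           # ascending order
    order = np.argsort(evals)[::-1]            # sort by decreasing eigenvalue
    evals, evecs = evals[order], evecs[:, order]
    A = evecs.T                                # rows of A are eigenvectors
    Z = A @ X                                  # transformed components, eq. (7)
    n_pc = int(np.sum(evals / evals[0] > tol)) # components above the tolerance
    return Z, A, evals, n_pc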

2.3 Minimum Noise Fraction

Minimum noise fraction [9, p. 38] [4]: we wish to find a particular coefficient matrix \{a_{ij};\, i,j = 1, \ldots, P\} that in some sense maximizes the image S/N, assuming the image pixel vectors \{X_k,\, k = 1, \ldots, N\} are the sum of uncorrelated signal and noise:

X_{ik} = S_{ik} + N_{ik}, \quad i = 1, \ldots, P; \; k = 1, \ldots, N \qquad (15)
Z_{ik} = \sum_{j=1}^{P} a_{ij} X_{jk} = \sum_{j=1}^{P} a_{ij} (S_{jk} + N_{jk}) \qquad (16)
Z_i^T = a_i^T X_{P \times N} \quad \text{where} \quad a_i^T = (a_{i1}, a_{i2}, \ldots, a_{iP}) \qquad (17)
= a_i^T S_{P \times N} + a_i^T N_{P \times N} \qquad (18)

Maximize

R = \frac{\mathrm{Var}(Z_i^{\,signal})}{\mathrm{Var}(Z_i^{\,noise})} = \frac{(a_i^T S_{P \times N})(a_i^T S_{P \times N})^T}{(a_i^T N_{P \times N})(a_i^T N_{P \times N})^T} \qquad (19)
= \frac{a_i^T (S_{P \times N} S_{P \times N}^T)\, a_i}{a_i^T (N_{P \times N} N_{P \times N}^T)\, a_i} \qquad (20)
= \frac{a_i^T C^S_{P \times P}\, a_i}{a_i^T C^N_{P \times P}\, a_i} = \frac{a_i^T (C_{P \times P} - C^N_{P \times P})\, a_i}{a_i^T C^N_{P \times P}\, a_i} \qquad (21)
= \frac{a_i^T C_{P \times P}\, a_i}{a_i^T C^N_{P \times P}\, a_i} - 1, \qquad \text{if} \quad C_{P \times P} = C^S_{P \times P} + C^N_{P \times P} \qquad (22)
= \lambda_i - 1 \qquad (23)

where \lambda_i is a generalized eigenvalue of C_{P \times P} with respect to C^N_{P \times P}, and the a_i are the corresponding generalized eigenvectors. Compare with PCA:

\lambda_i^{PCA} = a_i^T C_{P \times P}\, a_i \qquad \text{(eq. 14 with } l = i\text{)} \qquad (24)

Noise covariance C^N_{P \times P}: Green et al. [4] suggest the noise covariance be scaled to unit variance and band-to-band uncorrelated.
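Given covariance and noise-covariance estimates C and C_N (however obtained), the MNF rotation of equations (19)-(23) reduces to a generalized eigenproblem, which scipy.linalg.eigh solves directly. A hedged sketch, not a reference implementation:

import numpy as np
from scipy.linalg import eigh

def mnf_rotation(C, C_noise):
    """Minimum noise fraction rotation (sketch).
    Solves the generalized eigenproblem C a = lambda C_noise a of eqs. (21)-(23).
    Returns eigenvalues (descending; lambda_i - 1 is the component SNR) and the
    coefficient matrix whose rows a_i^T maximize Var(signal)/Var(noise)."""
    evals, evecs = eigh(C, C_noise)        # ascending generalized eigenvalues
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order].T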

Homogeneous Area Method [8, sec 2.9.1]: if possible, find a homogeneous area of N_h pixels in the image:

M_L = \frac{1}{N_h} \sum_{k=1}^{N_h} X_k \qquad \text{local mean, vector over bands} \qquad (29)
\sigma_L = \Big[ \frac{1}{N_h - 1} \sum_{k=1}^{N_h} (X_k - M_L)^2 \Big]^{1/2} \qquad (30)
\sigma_{Li} = \Big[ \frac{1}{N_h - 1} \sum_{k=1}^{N_h} (X_{ik} - M_{Li})^2 \Big]^{1/2} \qquad (31)
C_{Lij} = \frac{1}{N_h - 1} \sum_{k=1}^{N_h} (X_{ik} - M_{Li})(X_{jk} - M_{Lj}) \qquad \text{(general)} \qquad (32)
= \frac{\delta_{ij}}{N_h - 1} \sum_{k=1}^{N_h} (X_{ik} - M_{Li})^2 \qquad \text{(zero band-to-band)} \qquad (33)

Local Means and Local Variances [8, sec 2.9.2]

1. Divide the image into small blocks of N_b pixels (4x4, 5x5, ...).

2. For each (block, band) compute the local mean and variance:

M_{Li} = \frac{1}{N_b} \sum_{k=1}^{N_b} X_{ik}, \quad i = 1, 2, \ldots, P \text{ bands} \qquad (34)
\sigma_{Li}^2 = \frac{1}{N_b - 1} \sum_{k=1}^{N_b} (X_{ik} - M_{Li})^2 \qquad (35)
C_{Lij} = \frac{\delta_{ij}}{N_b - 1} \sum_{k=1}^{N_b} (X_{ik} - M_{Li})(X_{jk} - M_{Lj}) \qquad (36)

C_L is the local P \times P band covariance matrix of the block.

3. Bin the \{\sigma_{Li}^2\} into classes between band minimum and maximum values.

4. The bin with the most blocks represents the mean noise of the image. Hope this bin is the same for all bands...


5. Suppose the "most popular" bin is a P-cube, each side ranging over [\sigma_i, \sigma_i + \Delta\sigma_i], containing N points. Then the average value over the bin

\bar{C}_{Lij} = \frac{1}{N} \sum_{k=1}^{N} (C_{Lij})_k \qquad (37)

is the desired noise covariance matrix.

6. Caveat: assumes the image is "slowly varying enough" that enough of the N total blocks are homogeneous, i.e. their covariance is really due to noise, not true features in the image. Blocks and bins must both be "small enough" – but not too small! (A rough sketch of steps 1-5 follows this list.)
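The following is a rough per-band sketch of steps 1-5 under simplifying assumptions: square blocks, per-band binning rather than a single P-dimensional bin, and zero band-to-band noise covariance. The block size and bin count are illustrative choices, not values from the sources.

import numpy as np

def block_noise_covariance(cube, block=4, nbins=50):
    """Rough sketch of the local-means/local-variances noise estimate.
    cube: (rows, cols, P) image cube.  For each block x band compute the local
    variance (eq. 35), histogram the variances per band, and average the
    variances falling in the most populated bin (cf. eq. 37).
    Returns a diagonal P x P noise covariance estimate (band-to-band
    covariances assumed zero)."""
    rows, cols, P = cube.shape
    noise_var = np.zeros(P)
    for b in range(P):
        vars_b = []
        for r in range(0, rows - block + 1, block):
            for c in range(0, cols - block + 1, block):
                blk = cube[r:r + block, c:c + block, b]
                vars_b.append(blk.var(ddof=1))
        vars_b = np.asarray(vars_b)
        hist, edges = np.histogram(vars_b, bins=nbins)
        i = np.argmax(hist)                      # most populated bin
        sel = (vars_b >= edges[i]) & (vars_b <= edges[i + 1])
        noise_var[b] = vars_b[sel].mean()        # eq. (37), diagonal case
    return np.diag(noise_var)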

Other methods: "unsupervised training" derived-endmember classification schemes, e.g. LAS' search [12] and GRASS' cluster/maxlik, are based upon local covariance minimization.

3 Virtual Dimensionality

We implement the Harsanyi-Farrand-Chang (HFC) and noise-whitened Harsanyi-Farrand-Chang (NWHFC) estimators of the image virtual dimensionality VD = p \le L, as described in [1, C.-I Chang and Du 2004] and [2, C.-I Chang 2013], Chapter 5.3.6.

3.1 HFC

Let R_{L \times N} \equiv [r_1, r_2, \ldots, r_N] be the input image matrix of N pixel (sample) vectors r_n, each of length L spectral bands: r_i = (r_{i1}, r_{i2}, \ldots, r_{iL})^T, \; 1 \le i \le N. The image spectral correlation matrix R_{L \times L} and covariance matrix K_{L \times L} are then

R_{L \times L} = \frac{1}{N}\, R_{L \times N} R_{L \times N}^T \qquad (38)
K_{L \times L} = \frac{1}{N}\, (R_{L \times N} - \mu)(R_{L \times N} - \mu)^T, \quad \text{where} \qquad (39)
\mu_l = \frac{1}{N} \sum_{i=1}^{N} r_{il} \quad \text{is the vector of image spectral means} \qquad (40)

Let \{\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_L\} and \{\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L\} denote the (ordered) correlation and covariance eigenvalues. By assuming that signal sources are nonrandom unknown positive constants and noise is white with zero mean, we may expect

\hat\lambda_l > \lambda_l \quad \text{for } l = 1, \ldots, VD \qquad (41)
\hat\lambda_l = \lambda_l = \sigma_{n_l}^2 \quad \text{for } l = VD+1, \ldots, L \qquad (42)

where \sigma_{n_l}^2 is the noise variance in the l-th spectral channel. Formulate the VD determination as a binary hypothesis problem:

H_0 : z_l = \hat\lambda_l - \lambda_l = 0 \quad \text{versus} \qquad (44)
H_1 : z_l = \hat\lambda_l - \lambda_l > 0 \quad \text{for } l = 1, 2, \ldots, L \qquad (45)

The null hypothesis H_0 and the alternative hypothesis H_1 represent the case that the correlation eigenvalue is equal to its corresponding covariance eigenvalue (no signal), and the case that the correlation eigenvalue is greater than its corresponding covariance eigenvalue. When H_1 is true (i.e. H_0 fails), it implies there is an endmember contributing to the correlation eigenvalue in addition to noise, since the noise energy represented by the eigenvalue of R_{L \times L} in that particular component is the same as the one represented by the eigenvalue of K_{L \times L} in its corresponding component.

If we assume the noise in each spectral dimension has zero mean and variance \sigma_{n_l}^2, then \hat\lambda_l = \mu_l^2 + \sigma_{n_l}^2 and \lambda_l = \sigma_{n_l}^2, where \mu_l is the sample mean in the l-th spectral dimension and \sigma_{n_l}^2 is the channel noise. Although the \hat\lambda_l and \lambda_l are unknown constants, we can model each pair of eigenvalues (\hat\lambda_l, \lambda_l) under hypotheses H_0 and H_1 as random variables with asymptotic probability densities given by

p_0(z_l) = p(z_l \,|\, H_0) \simeq N(0, \sigma_{z_l}^2) \quad \text{and} \qquad (46)
p_1(z_l) = p(z_l \,|\, H_1) \simeq N(\mu_l, \sigma_{z_l}^2) \quad \text{for } l = 1, 2, \ldots, L \qquad (47)
N(\mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right] \qquad (48)

respectively, where \mu_l is an unknown constant for each l and the variance \sigma_{z_l}^2 is given by

\sigma_{z_l}^2 = \mathrm{Var}[\hat\lambda_l - \lambda_l] = \mathrm{Var}[\hat\lambda_l] + \mathrm{Var}[\lambda_l] - 2\,\mathrm{Cov}(\hat\lambda_l, \lambda_l) \qquad (49)

“When the total number of samples N is sufficiently large, \mathrm{Var}[\hat\lambda_l] \approx 2\hat\lambda_l^2/N and \mathrm{Var}[\lambda_l] \approx 2\lambda_l^2/N. Therefore, the noise variance \sigma_{z_l}^2 in (42) can be estimated and approximated using (49).” (Although it's not obvious exactly how. Play statistics on collections of subimages, perhaps? In his MATLAB code, Chang approximates \sigma_{z_l}^2 \approx (2/N)(\hat\lambda_l^2 + \lambda_l^2), so we do as well.)

Chang and Du ([1, eq. 8]) use Schwarz's inequality to bound

\mathrm{Cov}(\hat\lambda_l, \lambda_l) \le \sqrt{\mathrm{Var}[\hat\lambda_l]\,\mathrm{Var}[\lambda_l]} \approx \frac{2}{N}\,\hat\lambda_l \lambda_l \quad \text{so that} \qquad (50)
\sigma_{z_l}^2 \approx \frac{2}{N}(\hat\lambda_l - \lambda_l)^2 \qquad (51)
\sigma_{z_l}^2 \le \frac{2}{N}(\hat\lambda_l^2 + \lambda_l^2) \qquad (52)

From (46), (47), and (49), we define the false alarm probability and detection probability as

P_F = \int_{\tau_l}^{\infty} p_0(z)\,dz \qquad (54)
P_D = \int_{\tau_l}^{\infty} p_1(z)\,dz \qquad (55)

We then choose a given false-alarm probability PF (for example 0.001) and invert (54) to give

\tau_l = \sigma_l \sqrt{2}\;\mathrm{erfc}^{-1}(2 P_F), \quad \text{where} \qquad (56)
\sigma_l \approx \sqrt{2(\hat\lambda_l^2 + \lambda_l^2)/N} \qquad (57)

A case of \hat\lambda_l - \lambda_l > \tau_l fails the null hypothesis test and indicates there is signal energy assumed to contribute to the eigenvalue \hat\lambda_l in the l-th spectral dimension. Note the threshold \tau_l is different for each spectral dimension l.
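A compact sketch of the HFC test of equations (38)-(57), using Chang's approximation for \sigma_l; the function name and the default P_F = 0.001 are our illustrative choices.

import numpy as np
from scipy.special import erfcinv

def hfc_vd(R, pf=1e-3):
    """HFC virtual-dimensionality estimate (sketch of eqs. 38-57).
    R: L x N image matrix.  Returns the estimated VD."""
    L, N = R.shape
    corr = (R @ R.T) / N                              # eq. (38)
    mu = R.mean(axis=1, keepdims=True)
    cov = ((R - mu) @ (R - mu).T) / N                 # eq. (39)
    lam_hat = np.sort(np.linalg.eigvalsh(corr))[::-1] # correlation eigenvalues
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]      # covariance eigenvalues
    sigma = np.sqrt(2.0 * (lam_hat**2 + lam**2) / N)  # eq. (57), Chang's approx.
    tau = sigma * np.sqrt(2.0) * erfcinv(2.0 * pf)    # eq. (56)
    return int(np.sum(lam_hat - lam > tau))           # count failures of H0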

3.2 NWHFC

As noted by Chang ([2, Section 5.3.6]), the signature variance \sigma_{s_l}^2 is generally very small and can be influenced by interband covariances. Interband covariance may be reduced by "noise whitening": preprocessing by multiplying the input image by the inverse square root of a noise covariance matrix estimated with a technique developed by Roger and Arnold, which may be performed prior to the HFC method. This is discussed far too sparsely in ([1, Chang and Du]) and hardly at all in ([2, Chang 2013]), so we'll merely reference the former and translate MATLAB code from the latter (Section A.1).

Noise estimation:

K_{inv} = K_{L \times L}^{-1} \qquad (58)
\kappa = \mathrm{diag}(K_{inv}) \qquad (59)
K_{noise} = \mathrm{diag}(1./\kappa) \qquad (60)

The noise covariance matrix K_{noise} is the L \times L diagonal matrix whose elements are the inverses of the diagonal elements of K_{L \times L}^{-1}. For noise whitening we want the square root of its inverse, denoted K_n^{(-1/2)}, also diagonal:

K_n^{(-1/2)} = (K_{noise})^{-1/2} \qquad (62)
[K_n^{(-1/2)}]_{l,l} = \sqrt{[K_{L \times L}^{-1}]_{l,l}} \qquad (63)
Y_{L \times N} = K_n^{(-1/2)} R_{L \times N} \qquad (64)
VD = \mathrm{HFC}(Y, t) \qquad (65)

Y is the noise-whitened image matrix. The NWHFC VD estimate is obtained simply by running the HFC algorithm on Y rather than R.
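A sketch of the corresponding noise-whitening preprocessing of equations (58)-(65); it reuses the hfc_vd sketch above and assumes K_{L x L} is invertible.

import numpy as np

def noise_whiten(R):
    """Noise whitening for NWHFC (sketch of eqs. 58-64).
    R: L x N image matrix.  Returns the whitened image Y = K_n^(-1/2) R."""
    L, N = R.shape
    mu = R.mean(axis=1, keepdims=True)
    K = ((R - mu) @ (R - mu).T) / N          # covariance K_{LxL}, eq. (39)
    K_inv = np.linalg.inv(K)                 # eq. (58)
    kappa = np.diag(K_inv)                   # eq. (59)
    # diagonal of K_noise is 1/kappa; its inverse square root is sqrt(kappa)
    Kn_invsqrt = np.diag(np.sqrt(kappa))     # eqs. (60)-(63)
    return Kn_invsqrt @ R                    # eq. (64)

# VD_nwhfc = hfc_vd(noise_whiten(R), pf=1e-3)   # eq. (65), using the sketch above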

3.3 NSP

The Noise Subspace Projection method [1] estimates VD solely from the spectral covariance matrix K_{L \times L}. Let the matrix K_n^{(-1/2)} be as above and define the noise-whitened covariance matrix

\bar K = K_n^{(-1/2)} K_{L \times L} K_n^{(-1/2)} \qquad (66)

As a result, the noise variance of each band in the whitened \bar K is reduced to unity. Let \{u_l\}_{l=1}^{L} be a set of eigenvectors generated by \bar K, which can then be expressed as

\bar K = \sum_{l=1}^{VD} \bar\lambda_l u_l u_l^T + \sum_{l=VD+1}^{L} \bar\lambda_l u_l u_l^T \qquad (67)

where \{u_l\}_{l=1}^{VD} and \{u_l\}_{l=VD+1}^{L} span the signal subspace and noise subspace, respectively. The variances of the noise components of (67) have been whitened and normalized to unity, so \bar\lambda_l = 1 for l = VD+1, \ldots, L, and \bar\lambda_l > 1 otherwise. The problem of VD estimation can then be formulated as a binary hypothesis testing problem:

H_0 : y_l = \bar\lambda_l = 1 \quad \text{versus} \qquad (68)
H_1 : y_l = \bar\lambda_l > 1 \quad \text{for } l = 1, 2, \ldots, L \qquad (69)

where

p_0(y_l) = p(y_l \,|\, H_0) \simeq N(1, \sigma_{y_l}^2) \quad \text{and} \qquad (70)
p_1(y_l) = p(y_l \,|\, H_1) \simeq N(\mu_l, \sigma_{y_l}^2) \quad \text{for } l = 1, 2, \ldots, L \qquad (71)
N(\mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right] \qquad (72)

where \mu_l is an unknown constant for each l and the variance \sigma_{y_l}^2 is given by

\sigma_{y_l}^2 = \mathrm{Var}[\bar\lambda_l] \approx \frac{2\bar\lambda_l^2}{N} \qquad (73)

which can be further reduced under hypothesis H0 to

\sigma_{y_l}^2 \approx \frac{2}{N} \qquad (74)

Finally, find the Neyman-Pearson detector \delta_{NP} for (68) to determine VD. The false-alarm probability P_F is the probability that we think we've detected a signal when none was actually present. In terms of the eigenvalues \bar\lambda_l, this is the probability of measuring a value \bar\lambda_l > 1 when the "true" value was its minimum noise-only value \bar\lambda_l = 1. For a desired P_F we seek a threshold \tau_l such that

P_F = \int_{1+\tau_l}^{\infty} p_0(y)\,dy \approx \int_{1+\tau_l}^{\infty} N(1, \sigma_l^2)\,dy \qquad (75)
= \frac{1}{\sigma_l\sqrt{2\pi}} \int_{1+\tau_l}^{\infty} \exp\left[ -\frac{(y-1)^2}{2\sigma_l^2} \right] dy \qquad (76)
= \frac{1}{\sigma_l\sqrt{2\pi}} \int_{\tau_l}^{\infty} \exp\left[ -\frac{x^2}{2\sigma_l^2} \right] dx \qquad (77)
= \frac{1}{\sqrt{\pi}} \int_{\tau_l/(\sigma_l\sqrt{2})}^{\infty} \exp[-z^2]\,dz \qquad (78)
= \frac{1}{2}\,\mathrm{erfc}\!\left( \frac{\tau_l}{\sigma_l\sqrt{2}} \right) \qquad (79)

with solution

\tau_l = \sigma_l \sqrt{2}\;\mathrm{erfc}^{-1}(2 P_F) \qquad (80)
= \frac{2\bar\lambda_l}{\sqrt{N}}\;\mathrm{erfc}^{-1}(2 P_F) \qquad (81)
\approx \frac{2}{\sqrt{N}}\;\mathrm{erfc}^{-1}(2 P_F) \qquad (82)

The virtual dimensionality VD is then the number of eigenvalues \bar\lambda_l whose values exceed the threshold 1 + \tau_l: \bar\lambda_l \ge 1 + \tau_l, \; l = 1, 2, \ldots, VD.
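A sketch of the NSP estimate of equations (66)-(82); as above, the noise estimate is the Roger-Arnold style diagonal estimate, and the noise-only approximation of (82) is used for the threshold.

import numpy as np
from scipy.special import erfcinv

def nsp_vd(R, pf=1e-3):
    """NSP virtual-dimensionality estimate (sketch of eqs. 66-82).
    R: L x N image matrix."""
    L, N = R.shape
    mu = R.mean(axis=1, keepdims=True)
    K = ((R - mu) @ (R - mu).T) / N
    kappa = np.diag(np.linalg.inv(K))
    Kn_invsqrt = np.diag(np.sqrt(kappa))             # diagonal K_n^(-1/2)
    K_bar = Kn_invsqrt @ K @ Kn_invsqrt              # eq. (66)
    lam_bar = np.sort(np.linalg.eigvalsh(K_bar))[::-1]
    tau = 2.0 / np.sqrt(N) * erfcinv(2.0 * pf)       # eq. (82), noise-only sigma
    return int(np.sum(lam_bar >= 1.0 + tau))         # eigenvalues above 1 + tau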

3.4 OSP

Orthogonal Subspace Projection ([2, 5.4.1]). OSP is a general technique that also works well as part of VCA (Section 4).

Assume that there are p signal sources with L bands, \{s_1, s_2, \ldots, s_p\}, present in the data to be processed, and that every data sample vector r_i can be expressed as a linear mixture of these p signal sources:

r_i = S_p \alpha_i + n_i \qquad (83)

where S_p = [s_1\, s_2\, \cdots\, s_p] is a signal matrix made up of the p signal sources \{s_1, s_2, \ldots, s_p\}, and n_i can be interpreted as the noise vector or model error vector.

Let P_p = S_p (S_p^T S_p)^{-1} S_p^T be the p-signal projection matrix formed by the signal matrix S_p. It can then be used to map all data sample vectors \{r_1, r_2, \ldots, r_N\} into the spectral space linearly spanned by the p signal sources \{s_1, s_2, \ldots, s_p\}. In other words, every data sample vector r_i can be expressed by a linear mixture of the p signal sources \{s_1, s_2, \ldots, s_p\} specified by (83) via the p-signal projection matrix P_p, where the noise n_i in (83) is included to account for the linear mixture model error:

\alpha_i = S_p^{+} (r_i - n_i) \qquad (84)
= (S_p^T S_p)^{-1} S_p^T (r_i - n_i) \qquad (85)

(S_p is L \times p rather than square, so S_p^{+} denotes its pseudo-inverse.)

Since each sample vector produces a different model error and residual, the sample mean vector is used to represent an averaged model error and residual. In this case, from (83) the sample mean vector μ can be expressed by

\mu = \frac{1}{N}\left[ S_p \sum_{i=1}^{N} \alpha_i + \sum_{i=1}^{N} n_i \right] \qquad (86)
= S_p \left( \frac{1}{N} \sum_{i=1}^{N} \alpha_i \right) + \frac{1}{N} \sum_{i=1}^{N} n_i \qquad (87)
= S_p \bar\alpha_p + \bar n \qquad (88)

where \bar\alpha_p = \frac{1}{N} \sum_{i=1}^{N} \alpha_i and \bar n = \frac{1}{N} \sum_{i=1}^{N} n_i. There are two ways to find VD_{OSP}:

\mathrm{OSP}(p) = E\left[ (P_p \mu)^T (P_p \mu) \right] \quad \text{or} \qquad (89)
\mathrm{OSP}(p) \approx \bar\alpha_p^T S_p^T S_p \bar\alpha_p \qquad (90)

without involving noise covariance matrix estimation. Theoretically, the value of OSP(p) in (89) increases as the value of p increases. For any given error threshold \epsilon, VD can be determined by a stopping rule based on \epsilon. The value of p determining the OSP(p) in (89) or (90) is denoted by VD_{OSP}. Two criteria are developed to detect the abrupt change of the OSP(p) value. One, based on the gradient and denoted by "\nabla", is defined by

VD_{OSP,\nabla}^{algorithm}(\epsilon) = \arg\left\{ \min_{1 \le p \le L} \left| \frac{\mathrm{OSP}(p+1) - \mathrm{OSP}(p)}{\mathrm{OSP}(p) - \mathrm{OSP}(p-1)} \right| < \epsilon \right\}. \qquad (91)

The other is based on the difference and is denoted by minus “-”:

VD_{OSP,-}^{algorithm}(\epsilon) = \arg\left\{ \min_{1 \le p \le L} \left| \mathrm{OSP}(p+1) - \mathrm{OSP}(p) \right| < \epsilon \right\}. \qquad (92)

For the difference criterion, the sample mean vector μ is normalized before orthogonal projection so that the values of threshold 𝜖 for these two criteria are comparable for analysis. The threshold 𝜖 in (91) and (92) is generally selected according to a sudden drop or a clear gap between two consecutive values of p in plots of the gradient in (91) and difference (92) versus the value of p.

It should be noted that the above VD_{OSP} definitions involve two key parameters: one is the error threshold \epsilon, and the other is the algorithm used to produce the p-signal matrix S_p, which is not specified in (91) and (92). We shall use "VCA" for "algorithm", at least initially, but there are other choices, including target-specific algorithms such as ATGP, as well as generic SVD. See [2, 5.4.1] for further discussion.
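An illustrative sketch of the OSP(p) curve of equation (89) and the difference criterion (92). The Sp_builder callback, which must supply the L x p signal matrix S_p (e.g. from VCA or ATGP), is an assumption of this sketch, not something specified in the sources.

import numpy as np

def osp_curve(R, Sp_builder, max_p, normalize=True):
    """Compute OSP(p) of eq. (89) for p = 1..max_p (sketch).
    R: L x N image matrix.  Sp_builder(R, p) must return an L x p signal
    matrix S_p; it is a caller-supplied assumption."""
    mu = R.mean(axis=1)
    if normalize:
        mu = mu / np.linalg.norm(mu)           # normalized for the "-" criterion
    osp = []
    for p in range(1, max_p + 1):
        Sp = Sp_builder(R, p)
        Pp = Sp @ np.linalg.pinv(Sp)           # projector S_p (S_p^T S_p)^-1 S_p^T
        osp.append(float((Pp @ mu) @ (Pp @ mu)))
    return np.array(osp)

def vd_osp_difference(osp, eps):
    """Difference criterion of eq. (92): smallest p with |OSP(p+1) - OSP(p)| < eps."""
    for p in range(1, len(osp)):
        if abs(osp[p] - osp[p - 1]) < eps:
            return p
    return len(osp)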

4 Vertex Component Analysis

Vertex Component Analysis is an "unsupervised training" derived-endmember classification scheme. We use the notation of [6, Nascimento and Dias], which is slightly different from that used in previous sections.

4.1 VCA Algorithm

Assume linear mixing, and let

N is the number of pixels in the image.
L is the number of spectral bands recorded at each pixel.
p is the number of endmembers, p \le L, and usually p \ll L for hyperspectral images.
m_j is an endmember spectral vector of length L, 1 \le j \le p.
\gamma_i is a scale factor modeling illumination variability due to surface topography at pixel i.
\alpha_i = [\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{ip}]^T is the abundance vector containing the fractions of each endmember at pixel i. Positivity: \alpha_{ik} \ge 0 and 1_p^T \alpha_i = 1, that is \sum_{k=1}^{p} \alpha_{ik} = 1.
x_i is the "true" noise-free spectral vector of length L at pixel i, 1 \le i \le N.
n_i is the noise vector at pixel i.
M_{L \times p} = [m_1, m_2, \ldots, m_p] is an L \times p mixing matrix that maps an abundance vector \alpha to a "true" spectral vector x.

Then the recorded spectral vector at pixel i may be given by

r_i = x_i + n_i = M \gamma_i \alpha_i + n_i, \quad i = 1, \ldots, N \qquad (93)
= M s_i + n_i \qquad (94)
s_i \equiv \gamma_i \alpha_i \quad \text{scaled abundance vector} \qquad (95)

Our goal is to find a set of abundance vectors \{\alpha_i, i = 1, \ldots, N\} corresponding to some endmember set \{m_j, j = 1, \ldots, p\}. An appropriate endmember set \{m\} is to be determined as part of the VCA algorithm. Endmember identification – matching one or more of the VCA-generated endmembers to actual specimen spectral samples, such as from the USGS spectral library or field ground-truth sampling – may be done in subsequent processing steps.
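For testing, synthetic data following the mixing model of equations (93)-(95) can be generated as below; the Dirichlet abundances, the illumination range, and the SNR parameterization are illustrative assumptions, not part of the model itself.

import numpy as np

def simulate_mixture(M, N=500, snr_db=30.0, rng=None):
    """Generate synthetic pixels r_i = M (gamma_i alpha_i) + n_i, eqs. (93)-(95).
    M: L x p endmember (mixing) matrix.  Abundances are drawn from a flat
    Dirichlet so that alpha >= 0 and sums to one; gamma models illumination."""
    rng = np.random.default_rng(rng)
    L, p = M.shape
    alpha = rng.dirichlet(np.ones(p), size=N).T      # p x N, columns sum to 1
    gamma = rng.uniform(0.8, 1.0, size=N)            # scale factors gamma_i
    X = M @ (alpha * gamma)                          # noise-free spectra, L x N
    sig_pow = np.mean(X**2)
    noise_pow = sig_pow / 10**(snr_db / 10.0)
    R = X + rng.normal(0.0, np.sqrt(noise_pow), size=X.shape)
    return R, alpha, gamma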

Since the set \{\alpha \in \mathbb{R}^p : 1^T\alpha = 1,\, \alpha \ge 0\} is a simplex, the set S_x = \{x \in \mathbb{R}^L : x = M\alpha,\, 1^T\alpha = 1,\, \alpha \ge 0\} is also a simplex. However, even assuming n = 0, the observed vector set belongs to the convex cone C_p = \{r \in \mathbb{R}^L : r = M\gamma\alpha,\, 1^T\alpha = 1,\, \alpha \ge 0,\, \gamma \ge 0\} owing to the different scale factors \gamma.

But the projective projection of the convex cone C_p onto a properly chosen hyperplane is a simplex with vertices corresponding to the vertices of the simplex S_x. The simplex S_p = \{y \in \mathbb{R}^L : y = r/(r^T u),\, r \in C_p\} is the projective projection of the convex cone C_p onto the hyperplane r^T u = 1, where the choice of u assures there are no observed vectors r parallel to the hyperplane (i.e. orthogonal to its defining vector u): r_i^T u / |u| < |r_i|, \; i = 1, \ldots, N.

Geometry Note: I'm going to suggest there's just a shade more going on here. Suppose we neglect noise and normalize all the pixel values such that |r_i| = 1, which is easy enough to do. Consider a three-band image, such as an rgb color image. The \{r_i\} are now neatly mapped onto the surface of the first (all axes non-negative) octant of the rgb unit sphere – a pixel's intensity of any color cannot be negative. For convenience of visualization, assume we may choose u = (1,1,1)/\sqrt{3} which, if there were pure pixels lying on each unit axis, i.e. (1,0,0), (0,1,0), (0,0,1), assures the simplex of points projected onto the u plane is a pure equilateral triangle whose vertices correspond to the "1" values on the rgb unit axes.

But real photos rarely if ever span the complete rgb bandwidth. Their normalizations will usually map to some sub-patch of that first unit octant. Their projections onto the u plane will still form a simplex, but will it have more vertices than a triangle? The VCA algorithm asserts “no”, and starts by finding the vertices of that projected simplex on the u plane.

4.1.1 SNR

VCA algorithm accuracy depends on the image SNR. There are several ways to estimate it:

4.1.2 Algorithm 1: Vertex Component Analyis (VCA)


Inputs:

R_{L \times N} \equiv [r_1, r_2, \ldots, r_N] is the image matrix of N pixel vectors r_n, each of length L spectral bands.
p, the desired dimensionality of the problem, i.e. the number of endmembers. p may be estimated as the number of "large" eigenvalues of the image covariance matrix. p \le L.
SNR_{th} is the fixed threshold value 10^{1.5} L of the SNR measured w.r.t. the signal subspace. Different values might be used for different SNR estimators. See above.

Outputs:

\hat M is the L \times p estimated mixing matrix returned by VCA.

Notations:

[\hat M]_{:,j} is the j-th column of \hat M.
[\hat M]_{:,i:k} are the i-th through k-th columns of \hat M.
[X]_{:,[index]} is the matrix formed from the columns of X listed in the index vector [index].
A^{+} denotes the pseudo-inverse of the p \times p matrix A as computed by SVD. A^{+} is what you think it is. Write the SVD as A = U \Sigma V^T, where \Sigma is the p \times p diagonal matrix whose diagonal elements are the (ordered) singular values: \Sigma = \mathrm{diag}\{w_1, w_2, \ldots, w_p\}. Then the pseudo-inverse may be written A^{+} = V \Sigma^{+} U^T, where \Sigma^{+} = \mathrm{diag}\{w_1^{+}, w_2^{+}, \ldots, w_p^{+}\} and the pseudo-inverse of any scalar is w^{+} = 0 if w \simeq 0 and w^{+} = w^{-1} otherwise. Thus A^{+} = A^{-1} if A is non-singular. If P = A A^{+} and Q = A^{+} A, then
  • P is the orthogonal projector onto the range of A
  • Q is the orthogonal projector onto the range of AT
  • (I P) is the orthogonal projector onto the kernel (null-space) of AT
  • (I Q) is the orthogonal projector onto the kernel (null-space) of A
  • P = A A^{+} = U \Sigma V^T V \Sigma^{+} U^T = U \Sigma \Sigma^{+} U^T, and \Sigma \Sigma^{+} = \mathrm{diag}\{1, \ldots, 1, 0, \ldots, 0\} has d non-zero diagonal entries, d \le p. P is a full p \times p symmetric matrix, none of whose components contain values from the lower-right (p - d) block of U. If p = d then P and Q are the identity matrix. (A small numerical check follows this list.)
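A small numerical check (ours, not from the references) of the projector properties listed above, using numpy.linalg.pinv:

import numpy as np

A = np.random.rand(5, 3) @ np.random.rand(3, 5)   # a rank-3 5x5 matrix (d = 3 < p = 5)
A_plus = np.linalg.pinv(A)                        # Moore-Penrose pseudo-inverse
P = A @ A_plus                                    # projector onto the range of A
Q = A_plus @ A                                    # projector onto the range of A^T

assert np.allclose(P @ P, P) and np.allclose(Q @ Q, Q)   # idempotent
assert np.allclose(P @ A, A)                             # P acts as identity on range(A)
print(np.round(np.linalg.eigvalsh(P), 6))                # d eigenvalues equal 1, the rest 0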

See the Wikipedia article on the Moore-Penrose pseudoinverse. In particular, "the pseudoinverse for matrices related to A can be computed by applying the Sherman-Morrison-Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship." – a fact that might be quite useful here.

U_d is an L \times d matrix whose columns are the first d eigenvectors of either R R^T (which is L \times L) or the L \times L spectral covariance matrix C. U_d is the left eigenmatrix computed by SVD, whose columns are ordered (left to right) by decreasing singular value. Its first d columns are the most significant (principal) relative to the rest. We usually select d such that the singular values \omega_i, i > d, are in some sense "too small to be significant".

1: Compute SNR and Virtual Dimensionality p
2: if (SNR > SNR_{th}) then

d := p; \qquad (96)
X := U_d^T R; \quad \{U_d obtained by SVD of the correlation matrix of R; R is L \times N and X is d \times N\} \qquad (97)
u := \mathrm{mean}(X); \quad \{u is a d \times 1 vector, the mean over the N pixels\} \qquad (98)
[Y]_{:,j} := [X]_{:,j} / ([X]_{:,j}^T u); \quad \{projective projection\} \qquad (99)

else

d := p - 1; \qquad (100)
[X]_{:,j} := U_d^T ([R]_{:,j} - \bar r); \quad \{U_d obtained by PCA of (R - \bar r); R is L \times N and X is d \times N\} \qquad (101)
c := \max_{j=1,\ldots,N} \| [X]_{:,j} \|; \qquad (102)
c := [c, c, \ldots, c]; \quad \{c is a 1 \times N vector\} \qquad (103)
Y := \begin{bmatrix} X \\ c \end{bmatrix} \qquad (104)

Here \bar r is the sample mean of the pixel vectors [R]_{:,j}, j = 1, \ldots, N.
end if

14: A := [0, 0, \ldots, 0]; \quad \{initialize A, a p \times p auxiliary matrix, and its pseudo-inverse A^{+}\}
X_{cor} := the d \times d spectral correlation matrix of X, and U_{xcor} its SVD eigenvectors.
15: for (i = 1 to p) do

w := U_{xcor}[i]; \quad \{the i-th left SVD eigenvector of X_{cor}\} \qquad (105)
f := ((I - A A^{+}) w) / \| (I - A A^{+}) w \|; \quad \{f is a unit vector orthogonal to the subspace spanned by [A]_{:,1:i}\} \qquad (106)
v := f^T Y; \qquad (107)
k := \arg\max_{j=1,\ldots,N} |v_j|; \quad \{find the projection extreme: the index of the component of Y which has the maximum projection in the f direction\} \qquad (108)
[A]_{:,i} := [Y]_{:,k}; \quad \{if i = 1 find A^{+} by SVD, else by an update algorithm\} \qquad (109)
[indice]_i := k; \quad \{store the pixel index of the extreme, i.e. the i-th endmember\} \qquad (110)

22: end for
Stopping criteria: if p in line 15 above has been pre-selected, e.g. by HFC as described in Section 3.1, we're done. But if OSP (or similar) is desired, there are other tests that might be applied.

23: if (SNR > SNR_{th}) then
24: \hat M := U_d [X]_{:,[indice]}; \quad \{\hat M is the L \times p estimated mixing matrix\}
25: else
26: \hat M := U_d [X]_{:,[indice]} + \bar r; \quad \{\hat M is the L \times p estimated mixing matrix\}
27: end if

4.1.3 VCA Programming Notes
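
A compact Python sketch of Algorithm 1, restricted to the high-SNR branch (projective projection) and recomputing the pseudo-inverse by SVD at every iteration rather than by an update algorithm; variable names are ours, and the search directions follow the U_{xcor} eigenvector variant described above.

import numpy as np

def vca(R, p):
    """Sketch of the VCA endmember extraction of Algorithm 1 (high-SNR branch).
    R: L x N image matrix, p: number of endmembers.
    Returns (M_hat, indices): the L x p estimated mixing matrix and the
    pixel indices of the extracted endmembers."""
    L, N = R.shape
    d = p
    # project onto the d-dimensional signal subspace (SVD of the correlation matrix)
    U, _, _ = np.linalg.svd((R @ R.T) / N)
    Ud = U[:, :d]
    X = Ud.T @ R                                     # d x N, eq. (97)
    u = X.mean(axis=1)                               # mean vector in the subspace
    Y = X / (X.T @ u)                                # projective projection, eq. (99)
    # eigenvectors of the correlation matrix of X supply the search directions w
    Uxcor, _, _ = np.linalg.svd((X @ X.T) / N)
    A = np.zeros((d, p))                             # auxiliary matrix, line 14
    indices = np.zeros(p, dtype=int)
    I = np.eye(d)
    for i in range(p):
        w = Uxcor[:, i]                              # eq. (105)
        f = (I - A @ np.linalg.pinv(A)) @ w          # eq. (106)
        f = f / np.linalg.norm(f)
        v = f @ Y                                    # eq. (107)
        k = int(np.argmax(np.abs(v)))                # eq. (108): extreme projection
        A[:, i] = Y[:, k]                            # eq. (109)
        indices[i] = k                               # eq. (110)
    M_hat = Ud @ X[:, indices]                       # line 24
    return M_hat, indices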

4.1.4 Icarus Image GUI Notes

Needs:

References

[1]   C.-I Chang and Q. Du. Estimation of number of spectrally distinct signal sources in hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 42(3):608–619, March 2004.

[2]   Chein-I Chang. Hyperspectral Data Processing: Algorithm Design and Analysis. John Wiley & Sons, 111 River St, Hoboken, NJ 07030, 2013.

[3]   David W. Coulter. Remote Sensing Analysis of Alteration Mineralogy Associated with Natural Acid Drainage in the Grizzly Peak Caldera, Sawatch Range, Colorado. PhD thesis, Colorado School of Mines, Golden, Colorado, 2006.

[4]   A.A. Green, M. Berman, P. Switzer, and M.D. Craig. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. Journal of Geophysical Research, 90:797 – 804, 1988.

[5]   Fred A. Kruse. Comparison of AVIRIS and Hyperion for hyperspectral mineral mapping. http://w.hgimaging.com/PDF/Kruse_JPL2002_AVIRIS_Hyperion.pdf, 2002.

[6]   José M. P. Nascimento and José M. Bioucas Dias. Vertex component analysis: A fast algorithm to unmix hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing, 43(4), April 2005.

[7]   M.O. Smith, P.E. Johnson, and J.B. Adams. Quantitative determination of mineral types and abundances from reflectance spectra using principal component analysis. IEEE Transactions on Geoscience and Remote Sensing, 36:65 – 74, 1985.

[8]   Frank D. van der Meer and Steven M. de Jong. Imaging Spectroscopy. Kluwer Academic Publishers, Dordrecht, Boston, London, 2001.

[9]   Frank D. van der Meer, Steven M. de Jong, and W. Bakker. Imaging Spectroscopy: Basic analytical techniques, pages 17–62. Kluwer Academic Publishers, Dordrecht, Boston, London, 2001.

[10]   R. A. White. Image mean and covariance: http://dbwww.essc.psu.edu/lasdoc/user/covar.html, 2005.

[11]   R. A. White. Karhunen-loeve transformation: http://dbwww.essc.psu.edu/lasdoc/user/karlov.html, 2005.

[12]   R. A. White. Search unsupervised training site selection: http://dbwww.essc.psu.edu/lasdoc/user/search.html, 2005.