Optimization in Spectral Unmixing

Edward Leaver
Icarus Resources LLC

June 15, 2009

1 Introduction
2 Cross-Correlation Spectral Matching
3 Optimized Cross-Correlation Spectral Matching
4 Linear and Non-Linear Least-Squares Fitting
5 Image Classification and Statistics
  5.1 Image Mean Vector and Covariance Matrix
  5.2 Principal Component Transformation
  5.3 Minimum Noise Fraction

1 Introduction

Remote Sensing Analysis of Alteration Mineralogy Associated With Natural Acid Drainage
in the Grizzly Peak Caldera, Sawatch Range, Colorado
David W. Coulter
Ph.D. Thesis Colorado School of Mines, 2006

[Picture]

The Grizzly Peak Caldera is located in Colorado’s Mineral Belt, approximately 15 miles southeast of Aspen. It is the product of a volcanic eruption $\sim 35$ million years ago that ejected some 600 ${km}^{3}$ of magma. With a volcanic explosivity index of 7, it was at least four times larger than Tambora.[2, sec 1.5]

Figure 1: West Red (Ruby Mountain) From Enterprise Peak

Coulter sought to identify acidic conditions by the different weathering states of iron oxide: Jarosite (dark yellow), the most immediate oxidation product of iron sulfide (pyrite), forms under the most acidic conditions, giving way to Goethite (light yellow) and then Hematite (red) as pH increases.

Figure 2: West Red Iron Endmembers from Aviris:

Red: Hematite (high pH)	Green: Goethite	Blue: Jarosite (low pH)

Note on Partial Unmixing:

“Unmixing is critical in imaging spectrometry since virtually every pixel contains some macroscopic mixture of materials. The theory of, and methods for unmixing of spectroscopic signatures are found in a number of sources. Hapke (1993) provides models for linear and non-linear spectral mixing and a discussion of the criteria for using each approach. In Earth remote sensing, a linear mixing model is typically used. Full unmixing which assumes that the spectra of all possible components are known, is described by van der Meer (2000) and Boardman (1989). Since it is often impossible to identify, a priori, all possible components, partial unmixing is an important tool. Match Filtering (MF) (Harsanyi and Chang 1994), Constrained Energy Minimization (CEM – similar to Match Filtering) (Farrand and Harsanyi 1997; Farrand 2001), and Mixture Tuned Match Filtering (${MTMF}^{tm}$) (Boardman et al. 1995) are commonly used methods for partial unmixing. ${MTMF}^{tm}$ is probably the most popular unmixing method used for geologic remote sensing.” [2, Coulter sec 1.4.4]

(Emphasis Added.)

2 Cross-Correlation Spectral Matching


$\{x_{i};\ i = 1 \dots N\}$: wavelength vector with $N$ bands
$\{y_{i} = y (x_{i});\ i = 1 \dots N\}$: unknown (observed) image spectrum at a given pixel
$\{z_{i} = z (x_{i});\ i = 1 \dots N\}$: known reference spectrum (to be matched against $y$)

Linear Correlation Coefficient (Pearson’s $r$ ):

r = \frac{\sum_{i = 1}^{N} (y_{i} - \bar{y}) (z_{i} - \bar{z})}{\sqrt{\sum_{i = 1}^{N} {(y_{i} - \bar{y})}^{2}} \sqrt{\sum_{i = 1}^{N} {(z_{i} - \bar{z})}^{2}}} , \qquad - 1 \leq r \leq 1 .

(1)

See [4, eq. (13.7.1)]. Generalize: cross-correlation at position $m < N$ :

r_{m} = \frac{\sum_{i = 1}^{N - m} (y_{i + m} - \bar{y}) (z_{i} - \bar{z})}{\sqrt{\sum_{i = 1}^{N - m} {(y_{i} - \bar{y})}^{2}} \sqrt{\sum_{i = 1}^{N - m} {(z_{i} - \bar{z})}^{2}}}

(2)

Ref. [8, eq. (33)]. Measures of fit:

1. $r_{0}$ relative to 1 is not a particularly good measure of fit, but

t = r \sqrt{\frac{N - 2}{1 - r^{2}}}

(3)

(see [4, eq. (13.7.5) ff.]) is distributed in the null case of no correlation as Student’s $A (t | N - 2)$ .

2. $χ^{2}$ statistic:

χ^{2} = (1 - r^{2}) \sum_{i = 1}^{N} {(y_{i} - \bar{y})}^{2}

(4)

See [4, eq. (14.2.13)]. For non-uniform weights, weight the sums in eq. (1) by $1 ∕ σ_{i}^{2}$ . Then goodness-of-fit

Q = Q (\frac{N - 2}{2}, \frac{χ^{2}}{2})

(5)

([4, eq. (14.2.12)]) where the incomplete gamma function $Q (a, x)$ is

\begin{array}{rcll} Q (a, x) & = & \frac{Γ (a, x)}{Γ (a)} = \frac{1}{Γ (a)} \int_{x}^{\infty} e^{- t} t^{a - 1} d t & (6) \\ Q (a, 0) & = & 1 \quad \text{and} \quad Q (a, \infty) = 0 & (7) \end{array}

See [4, eq. (6.2.3)]. If e.g. $Q > 0.1$ , then the goodness-of-fit is believable. If $Q$ is larger than, say, 0.001, the fit may be acceptable if the errors are non-normal or have been moderately underestimated. If $Q < 0.001$ then question the model and/or the estimation procedure, eq. (2). If the latter, consider robust estimation.
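As a minimal sketch of eqs. (1), (3) and (4), assuming NumPy and uniform weights, the following computes $r$, $t$ and $χ^{2}$ for one pair of spectra. The function names and the six-band spectra are made up for illustration and do not come from the thesis data.

```python
import numpy as np

def pearson_r(y, z):
    """Linear correlation coefficient, eq. (1)."""
    dy = np.asarray(y, float) - np.mean(y)
    dz = np.asarray(z, float) - np.mean(z)
    return float(np.sum(dy * dz) / np.sqrt(np.sum(dy**2) * np.sum(dz**2)))

def t_statistic(r, n):
    """Student's t for the null hypothesis of zero correlation, eq. (3)."""
    return float(r * np.sqrt((n - 2) / (1.0 - r**2)))

def chi_square(y, r):
    """Unweighted chi-square of the straight-line fit of y on z, eq. (4)."""
    dy = np.asarray(y, float) - np.mean(y)
    return float((1.0 - r**2) * np.sum(dy**2))

# Hypothetical spectra: z is the reference, y a scaled, offset, perturbed copy.
z = np.array([0.10, 0.25, 0.40, 0.35, 0.20, 0.15])
y = 2.0 * z + 0.05 + np.array([0.01, -0.02, 0.00, 0.02, -0.01, 0.00])

r = pearson_r(y, z)
t = t_statistic(r, len(y))
chi2 = chi_square(y, r)
print(r, t, chi2)
```

Because eq. (4) is the residual sum of squares of the best straight line through the $(z_{i}, y_{i})$ pairs, `chi2` can be cross-checked against an explicit linear fit.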

Other useful measures

1. Skewness: the cross-correlogram of a perfect match is a parabola:

\mathrm{Skew} (r_{m}) = \frac{1}{M} \sum_{m = 1}^{M} {(r_{m} - r_{- m})}^{2} \geq 0

(8)

2. RMS Skewness error [8, eq. (34)]:

\mathrm{RMS} = \sqrt{\frac{\sum_{m = 0}^{M} {(r_{m} - r_{m}^{'})}^{2}}{M}}

(9)

where $r_{m}^{'}$ is auto-correlogram of z with itself, $M$ is number of match positions, $m$ is match number.
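A hedged sketch of the cross-correlogram and the two skewness measures follows. It assumes NumPy, and it evaluates $r_{m}$ as Pearson's $r$ over the overlapping samples at each shift, a common variant of eq. (2), which uses the full-spectrum means. The RMS error is averaged over all $2M + 1$ match positions, which is one reading of eq. (9). The Gaussian absorption features are synthetic.

```python
import numpy as np

def correlogram(y, z, max_shift):
    """Return shifts m = -M..M and r_m, pairing y[i+m] with z[i] on the overlap."""
    y = np.asarray(y, float)
    z = np.asarray(z, float)
    n = len(y)
    shifts = np.arange(-max_shift, max_shift + 1)
    r = np.empty(len(shifts))
    for idx, m in enumerate(shifts):
        if m >= 0:
            ys, zs = y[m:], z[:n - m]
        else:
            ys, zs = y[:n + m], z[-m:]
        r[idx] = np.corrcoef(ys, zs)[0, 1]
    return shifts, r

def skewness(r, max_shift):
    """Eq. (8): mean squared difference between r_m and r_{-m}, m = 1..M."""
    pos = r[max_shift + 1:]        # r_1 .. r_M
    neg = r[:max_shift][::-1]      # r_{-1} .. r_{-M}
    return float(np.mean((pos - neg) ** 2))

def rms_error(r, r_auto):
    """RMS difference between the cross- and auto-correlogram (cf. eq. (9))."""
    return float(np.sqrt(np.mean((r - r_auto) ** 2)))

# Synthetic example: the observed feature is shifted by 0.05 in wavelength.
x = np.linspace(0.0, 1.0, 60)
z = 1.0 - 0.5 * np.exp(-((x - 0.50) / 0.08) ** 2)   # reference spectrum
y = 1.0 - 0.5 * np.exp(-((x - 0.55) / 0.08) ** 2)   # observed spectrum
M = 8
shifts, r = correlogram(y, z, M)
_, r_auto = correlogram(z, z, M)
print(shifts[np.argmax(r)], skewness(r, M), rms_error(r, r_auto))
```

For the shifted feature the correlogram peaks away from $m = 0$ and the skewness is non-zero, while the auto-correlogram of $z$ is symmetric with $r_{0} = 1$.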

3. Another possibly useful statistical measure, if $(y, z)$ is binormal and $N$ is moderately large $(N \geq 10)$ , is Fisher’s z-transform $z (r) = (1 ∕ 2) \ln [(1 + r) ∕ (1 - r)]$ . Then if you measure the correlation coefficients of the unknown signal $y$ against two reference signals $(z_{1}, z_{2})$ , the significance of a difference between the two measured correlation coefficients $r_{1}$ and $r_{2}$ is given by the complementary error function [4, eq. (13.7.10)]

\mathrm{erfc} (\frac{| z (r_{1}) - z (r_{2}) |}{\sqrt{\frac{2}{N_{1} - 3} + \frac{2}{N_{2} - 3}}})

(10)

where $z (r)$ denotes the Fisher transform, not a reference spectrum. This can be useful when trying to assess whether a parameterized fit changes “significantly” when a given change of parameter produces the two references $(z_{1}, z_{2})$ .

4. Non-parametric or Rank Correlation (Spearman’s Rank-Order Correlation Coefficient). See [4, Sec. 13.8] and [6].
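The significance test of eq. (10) needs only the standard library. The sketch below applies Fisher's transform to each correlation coefficient before differencing, as in [4, eq. (13.7.10)]; the function names and the sample values are illustrative assumptions.

```python
import math

def fisher_z(r):
    """Fisher's z-transform of a correlation coefficient."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def difference_significance(r1, n1, r2, n2):
    """Probability that |z(r1) - z(r2)| this large arises by chance, eq. (10)."""
    dz = abs(fisher_z(r1) - fisher_z(r2))
    return math.erfc(dz / math.sqrt(2.0 / (n1 - 3) + 2.0 / (n2 - 3)))

# Hypothetical case: y correlates at 0.90 with z1 and 0.50 with z2, 50 bands each.
p_diff = difference_significance(0.90, 50, 0.50, 50)
p_same = difference_significance(0.90, 50, 0.90, 50)
print(p_diff, p_same)
```

A small value means the two correlations differ significantly; identical correlations give a value of 1.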

3 Optimized Cross-Correlation Spectral Matching

In “traditional” CCSM, the reference spectrum $z$ to which $y$ is compared is taken to be a single (pure) endmember.
Next assume $M$ endmembers $X_{k}$ of interest, and linear mixing. Seek weighting factors $\{a_{k};\ k = 1, \dots M\}$ for the mixture

z_{i} = \sum_{k = 1}^{M} a_{k} X_{i k}

(11)

(synthesized pixel intensity at band $i$ ) that maximize the cross correlogram

r_{m} = \frac{\sum_{i = 1}^{N - m} (y_{i + m} - \bar{y}) (z_{i} - \bar{z})}{\sqrt{\sum_{i = 1}^{N - m} {(y_{i} - \bar{y})}^{2}} \sqrt{\sum_{i = 1}^{N - m} {(z_{i} - \bar{z})}^{2}}}

(12)

at match position $m = 0$ , place that maximum at $m = 0$ , and minimize its skew, subject to $0 \leq a_{k} \leq 1$ and $\sum_{k = 1}^{M} a_{k} = 1$ . Significance of different values of $r_{0}$ resulting from different choices of endmember set $\{X_{k};\ k = 1, \dots M\}$ may be assessed using eq. (10). Coulter [2] chooses the endmember set that maximizes $r_{0}$ . As written, (12) is the cross-correlation of two unit vectors:

\begin{array}{rcll} r_{m} & = & \sum_{i = 1}^{N - m} {\hat{y}}_{i + m} {\hat{z}}_{i} , \qquad - 1 \leq r_{m} \leq 1 , \quad \text{where} & (13) \\ {\hat{z}}_{i} & = & \frac{\sum_{k = 1}^{M} a_{k} (X_{i k} - \bar{X_{k}})}{\sqrt{\sum_{i = 1}^{N} [\sum_{k = 1}^{M} a_{k} (X_{i k} - \bar{X_{k}})]^{2}}} & (14) \end{array}

which is independent of the normalization of $a$ . Likewise, if $\vec{y}$ is the spectrum at a particular image pixel, then the unit normalization $\hat{y}$ renders the $r_{m}$ somewhat independent of shadow and striping.

Equation (14) makes $r_{m}$ non-linear in the $a_{k}$ ; eq. (13) may be maximized w.r.t. the $a_{k}$ by the non-linear constrained optimizer of one’s choice. This formulation does not give much insight into the relative band-to-band errors inherent in $y$ . As written, it assumes they are all equal, and we can pretend:

σ^{2} = \frac{1}{N} \sum_{i = 1}^{N} {({\hat{y}}_{i} - {\hat{z}}_{i})}^{2}

(15)

Given spectrometer calibration, we can do better than this.
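As an illustration of the constrained maximization of $r_{0}$, here is a deliberately simple sketch that replaces the "optimizer of one's choice" with an exhaustive search over a coarse grid on the simplex $a_{k} \geq 0$, $\sum a_{k} = 1$. It assumes NumPy, considers only the $m = 0$ match position (skew is not minimized), and uses synthetic endmembers; it is practical only for small $M$.

```python
import itertools
import numpy as np

def r0(y, z):
    """Cross-correlation at match position m = 0 (Pearson's r)."""
    dy, dz = y - y.mean(), z - z.mean()
    return float(dy @ dz / np.sqrt((dy @ dy) * (dz @ dz)))

def best_mixture(y, X, steps=10):
    """Grid search over a_k = j/steps with a_k >= 0 and sum(a_k) = 1.

    X has shape (N bands, M endmembers). Returns (a, r0) at the best grid point.
    """
    m = X.shape[1]
    best_a, best_r = None, -np.inf
    for combo in itertools.product(range(steps + 1), repeat=m - 1):
        if sum(combo) > steps:
            continue                      # outside the simplex
        a = np.array(combo + (steps - sum(combo),), float) / steps
        r = r0(y, X @ a)                  # synthesized spectrum, eq. (11)
        if r > best_r:
            best_a, best_r = a, r
    return best_a, best_r

# Synthetic endmembers and an "observed" pixel that is a scaled, offset mixture.
x = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.exp(-((x - 0.3) / 0.1) ** 2), x, np.sin(6.0 * x)])
y = 1.7 * (X @ np.array([0.3, 0.7, 0.0])) + 0.2

a_best, r_best = best_mixture(y, X)
print(a_best, r_best)
```

The scale factor and offset applied to `y` do not affect the result, which reflects the normalization independence noted above.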

4 Linear and Non-Linear Least-Squares Fitting

Since $r_{m}$ is expressed in terms of unit vectors, maximizing $r_{0} = \hat{y} \cdot \hat{z}$ is equivalent to minimizing

\begin{array}{rcll} {(\hat{y} - \hat{z})}^{2} & = & \hat{y} \cdot \hat{y} + \hat{z} \cdot \hat{z} - 2 \hat{y} \cdot \hat{z} & (16) \\ & = & 2 (1 - \hat{y} \cdot \hat{z}) & (17) \end{array}

Minimizing the squared-difference of unit vectors $\hat{y}$ and $\hat{z}$ in (16) is still a non-linear problem in the $a_{k}$ because of how they appear in the denominator of the normalization of $\hat{z}$ (14). However, if we relax the restrictions that z be a unit vector (and that $\sum a_{k} \equiv 1$ ), we can define the equivalent problem

\min F (\hat{y}; a_{1} . . . a_{M}) = {(\hat{y} - \sum_{k = 1}^{M} a_{k} X_{k})}^{2}

(18)

which is linear in the $a_{k}$ , but not yet quite as general as we want. At each band $i$ there are (at least) three uncertainties in the sensor measurement: those in the received radiance $y_{i}$ , those in the center wavelength $x_{i}$ , and those in the sensor’s bandwidth, assumed to be the FWHM of a Gaussian. The FWHM is typically on the order of the band-to-band wavelength spacing. FWHM errors are usually of secondary significance and will be ignored.

If we allow for uncertain band centers $x_{i}$ and assume normal independent distributions, we can write

P (x_{i}^{0}, y_{i}^{0}) = e^{- \frac{{(y_{i}^{0} - y (x_{i}))}^{2}}{2 σ_{y_{i}}^{2}}} e^{- \frac{{(x_{i}^{0} - x_{i})}^{2}}{2 σ_{x_{i}}^{2}}}

(19)

as the joint probability that the sensor records band radiance $y_{i}^{0}$ at recorded wavelength $x_{i}^{0}$ , when it actually received (unknown) radiance $y (x_{i})$ at actual (and unknown) wavelength $x_{i}$ . The joint probability of the pixel’s measured spectrum across $N$ bands is

P (x^{0}, y^{0}) = \prod_{i = 1}^{N} e^{- \frac{1}{2} [{(\frac{y_{i}^{0} - y (x_{i})}{σ_{y_{i}}})}^{2} + {(\frac{x_{i}^{0} - x_{i}}{σ_{x_{i}}})}^{2}]}

(20)

Define

\begin{array}{rcll} χ^{2} & = & - 2 \log P (x^{0}, y^{0}) & (21) \\ & = & \sum_{i = 1}^{N} [{(\frac{y_{i}^{0} - y (x_{i})}{σ_{y_{i}}})}^{2} + {(\frac{x_{i}^{0} - x_{i}}{σ_{x_{i}}})}^{2}] & (22) \end{array}

Associate the (unknown) actual spectrum $y (x_{i})$ with the modeled mixture $z (x_{i}; a_{1}, \dots a_{M}) = \sum_{k = 1}^{M} a_{k} X_{k} (x_{i})$ . Then

χ^{2} = \sum_{i = 1}^{N} [{(\frac{y_{i}^{0} - z (x_{i}; a_{1}, \dots a_{M})}{σ_{y_{i}}})}^{2} + {(\frac{x_{i}^{0} - x_{i}}{σ_{x_{i}}})}^{2}]

(23)

represents a constrained optimization problem wherein we wish to minimize $χ^{2}$ as a function of $N + M$ variables $(a_{1}, \dots a_{M}, x_{1}, \dots x_{N})$ subject to $a_{k} \geq 0 \ \forall k$ . The motivation for including the $(x_{1}, \dots x_{N})$ is to (hopefully) obtain “more” non-negative $a_{k}$ in the optimal solution.
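To make the objective of eq. (23) concrete, the sketch below evaluates $χ^{2}$ for trial abundances and trial band centers. It assumes NumPy and assumes each endmember is available on a fine wavelength grid (the hypothetical arrays `lib_x` and `lib_X`), resampled at the trial centers by plain linear interpolation; that shortcut ignores the sensor band shape. All numerical values are invented for the example.

```python
import numpy as np

def model(a, x, lib_x, lib_X):
    """Mixture z(x_i; a) = sum_k a_k X_k(x_i), with X_k interpolated at x."""
    return sum(a_k * np.interp(x, lib_x, lib_X[:, k]) for k, a_k in enumerate(a))

def chi_square(a, x, x0, y0, sigma_y, sigma_x, lib_x, lib_X):
    """Eq. (23): radiance residuals plus band-center residuals."""
    z = model(a, x, lib_x, lib_X)
    return float(np.sum(((y0 - z) / sigma_y) ** 2 + ((x0 - x) / sigma_x) ** 2))

# Hypothetical two-endmember library on a 1 nm grid, and a 50-band sensor.
lib_x = np.linspace(400.0, 2500.0, 2101)
lib_X = np.column_stack([0.5 + 0.3 * np.sin(lib_x / 300.0),
                         0.4 + 0.2 * np.cos(lib_x / 500.0)])
x0 = np.linspace(450.0, 2450.0, 50)          # recorded band centers
a_true = np.array([0.4, 0.6])
y0 = model(a_true, x0, lib_x, lib_X)          # noise-free "measurement"
sigma_y = np.full(50, 0.01)
sigma_x = np.full(50, 2.0)

chi2_exact = chi_square(a_true, x0, x0, y0, sigma_y, sigma_x, lib_x, lib_X)
chi2_shift = chi_square(a_true, x0 + 1.0, x0, y0, sigma_y, sigma_x, lib_x, lib_X)
print(chi2_exact, chi2_shift)
```

A function of this form could be handed to any bound-constrained minimizer over the $N + M$ variables.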

Normal Equations

\begin{array}{rcll} \frac{\partial χ^{2}}{\partial a_{k}} & = & 0 , \quad k = 1, \dots M & (24) \\ \frac{\partial χ^{2}}{\partial x_{i}} & = & 0 , \quad i = 1, \dots N & (25) \end{array}

where

\frac{\partial χ^{2}}{\partial a_{k}} = - 2 \sum_{i = 1}^{N} [\frac{y_{i}^{0} - z (x_{i}; a_{1}, \dots a_{M})}{σ_{y_{i}}^{2}} \frac{\partial z (x_{i}; a_{1}, \dots a_{M})}{\partial a_{k}}]

(26)

and

\frac{\partial χ^{2}}{\partial x_{i}} = - 2 [\frac{y_{i}^{0} - z (x_{i}; a_{1}, \dots a_{M})}{σ_{y_{i}}^{2}} \frac{\partial z (x_{i}; a_{1}, \dots a_{M})}{\partial x_{i}} + \frac{x_{i}^{0} - x_{i}}{σ_{x_{i}}^{2}}]

(27)

(only the $i$-th term of the sum in eq. (23) depends on $x_{i}$ ), and

\begin{array}{rcll} z (x_{i}; a_{1}, \dots a_{M}) & = & \sum_{j = 1}^{M} a_{j} X_{j} (x_{i}) & (28) \\ \frac{\partial z (x_{i}; a_{1}, \dots a_{M})}{\partial a_{k}} & = & X_{k} (x_{i}) & (29) \\ \frac{\partial z (x_{i}; a_{1}, \dots a_{M})}{\partial x_{i}} & = & \sum_{j = 1}^{M} a_{j} \frac{\partial X_{j} (x)}{\partial x} \Big|_{x = x_{i}} & (30) \end{array}

Eqs. (28) and (30) make eq. (27) quadratic in the $a_{k}$ , so the optimization including center-wavelength uncertainties is non-linear. The $\frac{\partial X_{j} (x)}{\partial x}$ can be computed efficiently enough, but the $X_{k} (x_{i})$ will need to be reconvolved for each new set of center wavelengths $\{x_{i};\ i = 1 \dots N\}$ required during the optimization. The problem will obviously be much simpler if we assume the provided center wavelength values are “good enough” and simply ignore eq. (27). Dropping the superscript $0$ , equations (26) then become

\frac{\partial χ^{2}}{\partial a_{k}} = - 2 \sum_{i = 1}^{N} [\frac{1}{σ_{y_{i}}^{2}} (y_{i} - \sum_{j = 1}^{M} a_{j} X_{j} (x_{i})) X_{k} (x_{i})] = 0

(31)

which are the usual linear least-squares normal equations for the $a_{k}$ . For what it’s worth, for the unconstrained problem the minimizing solution vector $\{a_{k};\ k = 1, \dots M\}$ will be unique provided the matrix $M_{j k} = X_{j} \cdot X_{k}$ (dot product over bands, weighted by $1 ∕ σ_{y_{i}}^{2}$ ) is non-singular.
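The normal equations (31) can be assembled and solved directly. The following is a minimal sketch assuming NumPy, synthetic endmembers and a made-up per-band error vector `sigma`; it solves the unconstrained problem only, so negative $a_{k}$ are possible with real data.

```python
import numpy as np

def solve_normal_equations(y, X, sigma):
    """Solve eq. (31) for the abundances a.

    y: (N,) observed spectrum; X: (N, M) endmembers; sigma: (N,) band errors.
    """
    w = 1.0 / sigma**2
    A = X.T @ (w[:, None] * X)   # A_kj = sum_i X_k(x_i) X_j(x_i) / sigma_i^2
    b = X.T @ (w * y)            # b_k  = sum_i X_k(x_i) y_i / sigma_i^2
    return np.linalg.solve(A, b)

# Synthetic check: recover known abundances from a noise-free mixture.
x = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.exp(-((x - 0.3) / 0.1) ** 2), x, np.sin(6.0 * x)])
a_true = np.array([0.2, 0.5, 0.3])
sigma = 0.01 * (1.0 + x)          # hypothetical band-dependent errors
y = X @ a_true

a_fit = solve_normal_equations(y, X, sigma)
print(a_fit)
```

`np.linalg.solve` raises `LinAlgError` when the matrix is singular, which corresponds to the uniqueness condition stated above.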

5 Image Classification and Statistics

5.1 Image Mean Vector and Covariance Matrix

Following [9, White 2005]: if the image has $P$ bands and $N$ pixels, the mean vector is

{\vec{m}}_{P} = {(m_{1}, m_{2}, \dots m_{P})}^{T} = \frac{1}{N} \sum_{j = 1}^{N} {\vec{f}}_{j}

(32)

where ${\vec{f}}_{j}$ is the jth pixel vector of the image

{\vec{f}}_{j} = {(f_{1}, f_{2}, \dots f_{P})}_{j}^{T}

(33)

The image covariance matrix $C_{PxP}$ is [1, Chang 2013 (6.6)]

C_{PxP} = \frac{1}{N} (F_{PxN} - M_{PxN}) {(F_{PxN} - M_{PxN})}^{T}

(34)

where $F_{PxN}$ is the matrix of $N$ pixel vectors each of length $P$

F_{PxN} = [{\vec{f}}_{1}, {\vec{f}}_{2}, {\vec{f}}_{3}, \dots, {\vec{f}}_{N}]

(35)

$M_{PxN}$ is the matrix of $N$ identical mean vectors ( $P$ rows by $N$ columns):

M_{PxN} = [{\vec{m}}_{P}, {\vec{m}}_{P}, {\vec{m}}_{P}, \dots, {\vec{m}}_{P}] = {\vec{m}}_{P} 1_{1xN}

(36)

where $1_{1xN}$ is a $1 \times N$ row vector of ones.
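Eqs. (32) to (36) translate almost line for line into array operations. This sketch assumes NumPy and an image already flattened to a $P \times N$ array `F` (random numbers here, purely for illustration); NumPy broadcasting plays the role of the matrix $M_{PxN}$.

```python
import numpy as np

def mean_and_covariance(F):
    """Mean vector (eq. 32) and covariance matrix (eq. 34) of a (P, N) image."""
    P, N = F.shape
    m = F.mean(axis=1, keepdims=True)   # (P, 1) mean vector
    X = F - m                           # broadcasting subtracts m from every pixel
    C = X @ X.T / N                     # (P, P) covariance with 1/N normalization
    return m.ravel(), C

rng = np.random.default_rng(0)
F = rng.normal(size=(4, 500))           # hypothetical 4-band, 500-pixel image
m, C = mean_and_covariance(F)
print(m.shape, C.shape)
```

The result agrees with `np.cov(F, bias=True)`, which uses the same $1 ∕ N$ normalization.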

5.2 Principal Component Transformation

Principal Component Transformation [5, Smith 1985], also called the Karhunen-Loeve Transformation [10, White 2005]; in GRASS, the imagery module i.pca. Let

X_{PxN} = (F_{PxN} - M_{PxN}) \qquad \text{(zero-mean image matrix)}

(37)

Z_{PxN} = A_{PxP} X_{PxN}

(38)

$F_{PxN}$: input-image multi-pixel vector ( $P$ bands by $N$ pixels)
$M_{PxN}$: mean vector matrix,
$Z_{PxN}$: output-image multi-pixel vector,
$A_{PxP}$: $P \times P$ matrix whose rows are the eigenvectors of the covariance matrix $C_{PxP}$ , arranged by decreasing magnitude of eigenvalue, as typically returned by SVD routines.

\begin{array}{rcll} Z_{i k} & = & \sum_{j = 1}^{P} a_{i j} X_{j k} , \quad i = 1, 2 \dots P; \ k = 1, 2 \dots N & (39) \\ {\vec{Z}}_{k}^{T} & = & A_{PxP} {\vec{X}}_{k}^{T} & (40) \end{array}

The transformed components are orthogonal w.r.t. the $N$ pixels:

\begin{array}{rcll} λ_{i} δ_{i l} & = & \sum_{k = 1}^{N} Z_{i k} Z_{l k} & (41) \\ & = & \sum_{k = 1}^{N} (\sum_{j = 1}^{P} a_{i j} X_{j k}) (\sum_{m = 1}^{P} a_{l m} X_{m k}) & (42) \\ & = & \sum_{j = 1}^{P} \sum_{m = 1}^{P} a_{i j} a_{l m} [\sum_{k = 1}^{N} X_{j k} X_{m k}] & (43) \\ & \equiv & \sum_{j = 1}^{P} \sum_{m = 1}^{P} a_{i j} a_{l m} C_{j m} & (44) \\ & = & a_{i}^{T} C_{PxP} a_{l} & (45) \end{array}

Here $C_{j m} \equiv \sum_{k} X_{j k} X_{m k}$ omits the factor $1 ∕ N$ of eq. (34), which only rescales the $λ_{i}$ .

$C_{PxP}$: the symmetric positive-definite image covariance matrix,
${\vec{a}}_{i}$: its orthonormal eigenvectors, with eigenvalues $λ_{i}$ .

Magnitudes of $λ_{i}$ impose an ordering on the transformed component vectors $Z_{i k} = \sum_{j = 1}^{P} a_{i j} X_{j k}$ . Those with the largest $λ_{i}$ s.t. $λ_{i} ∕ λ_{\max} > \mathrm{tol}$ are the Principal Components. The tolerance $\mathrm{tol}$ should be related to the noise floor.
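A compact sketch of the principal component transformation, assuming NumPy and using `numpy.linalg.eigh` on the covariance matrix rather than an SVD routine. The covariance carries the $1 ∕ N$ normalization, the mixing matrix and data are synthetic, and the default `tol=0.0` keeps every component.

```python
import numpy as np

def principal_components(F, tol=0.0):
    """Return (eigenvalues, A, Z) for a (P, N) image F.

    Rows of A are eigenvectors of the covariance matrix, ordered by
    decreasing eigenvalue; Z = A X is the transformed zero-mean image.
    """
    X = F - F.mean(axis=1, keepdims=True)
    C = X @ X.T / X.shape[1]
    lam, vecs = np.linalg.eigh(C)          # ascending order, eigenvectors in columns
    order = np.argsort(lam)[::-1]
    lam, A = lam[order], vecs[:, order].T
    keep = lam / lam[0] > tol              # components above the tolerance
    return lam[keep], A[keep], A[keep] @ X

rng = np.random.default_rng(0)
mix = np.array([[1.0, 0.5, 0.0],
                [0.5, 2.0, 0.3],
                [0.0, 0.3, 0.2]])
F = mix @ rng.normal(size=(3, 1000)) + 5.0   # hypothetical correlated 3-band image
lam, A, Z = principal_components(F)
print(lam)
```

The transformed bands are mutually uncorrelated: `Z @ Z.T / N` is diagonal, with the eigenvalues on the diagonal.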

5.3 Minimum Noise Fraction

Following [8, pg. 38] and [3], we wish to find a particular coefficient matrix $\{a_{i j};\ i, j = 1, \dots P\}$ that in some sense maximizes the image S/N, assuming the image pixel vectors $\{{\vec{X}}_{k},\ k = 1, \dots N\}$ are the sum of uncorrelated signal and noise:

\begin{array}{rcll} X_{i k} & = & S_{i k} + N_{i k} , \quad i = 1, \dots P; \ k = 1, \dots N & (46) \\ Z_{i k} & = & \sum_{j = 1}^{P} a_{i j} X_{j k} = \sum_{j = 1}^{P} a_{i j} (S_{j k} + N_{j k}) & (47) \\ Z_{i}^{T} & = & a_{i}^{T} X_{PxN} \quad \text{where} \quad a_{i}^{T} = (a_{i 1}, a_{i 2}, \dots a_{i P}) & (48) \\ & = & a_{i}^{T} S_{PxN} + a_{i}^{T} N_{PxN} & (49) \end{array}

Maximize

R = \frac{V a r (Z_{i}^{signal})}{V a r (Z_{i}^{noise})} = \frac{(a_{i}^{T} S_{PxN}) {(a_{i}^{T} S_{PxN})}^{T}}{(a_{i}^{T} N_{PxN}) {(a_{i}^{T} N_{PxN})}^{T}}

(50)

\begin{array}{rcll} R & = & \frac{(a_{i}^{T} S_{PxN}) {(a_{i}^{T} S_{PxN})}^{T}}{(a_{i}^{T} N_{PxN}) {(a_{i}^{T} N_{PxN})}^{T}} = \frac{a_{i}^{T} (S_{PxN} S_{PxN}^{T}) a_{i}}{a_{i}^{T} (N_{PxN} N_{PxN}^{T}) a_{i}} & (51) \\ & = & \frac{a_{i}^{T} C_{PxP}^{S} a_{i}}{a_{i}^{T} C_{PxP}^{N} a_{i}} = \frac{a_{i}^{T} (C_{PxP} - C_{PxP}^{N}) a_{i}}{a_{i}^{T} C_{PxP}^{N} a_{i}} & (52) \\ & = & \frac{a_{i}^{T} C_{PxP} a_{i}}{a_{i}^{T} C_{PxP}^{N} a_{i}} - 1 , \quad \text{if } C_{PxP} = C_{PxP}^{S} + C_{PxP}^{N} & (53) \\ & = & λ_{i} - 1 & (54) \end{array}

where $λ_{i}$ is generalized eigenvalue of $C_{PxP}$ wrt $C_{PxP}^{N}$ , and $a_{i}$ are corresponding generalized eigenvectors. Compare with PCA:

λ_{i}^{\mathrm{PCA}} = a_{i}^{T} C_{PxP} a_{i} \qquad \text{(eq. (45) with } l = i \text{)}

(55)

Noise Covariance $C_{PxP}^{N}$ : Green suggests $C_{PxP}^{N}$ be of unit variance and band-to-band uncorrelated.

Sensor error estimate (Aviris .rcc file): call the supplied error vector $\{e_{j};\ j = 1, \dots P\}$ . Then

C_{l m}^{N} = e_{m}^{n} δ_{l m} \qquad (n = 1 \text{ or } 2)

(56)

is completely uncorrelated. In the ideal case all $e_{m}$ are equal:

\begin{array}{rcll} C_{PxP}^{N} & = & e I , \quad \text{and} & (57) \\ a_{i}^{T} C_{PxP} a_{i} & = & λ_{i} a_{i}^{T} C_{PxP}^{N} a_{i} = λ_{i} e (a_{i}^{T} I a_{i}) & (58) \\ & = & λ_{i} e (a_{i}^{T} \cdot a_{i}) = λ_{i} e = λ_{i}^{\mathrm{PCA}} & (59) \end{array}

since the eigenvectors $a_{i}$ are orthonormal. $λ_{i} = λ_{i}^{P C A}$ if the variance $e \equiv 1$ .
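One way to obtain the generalized eigenvalues and eigenvectors of $C_{PxP}$ with respect to $C_{PxP}^{N}$ using only NumPy is to whiten the noise first. The sketch below does this with a Cholesky factor of the noise covariance; the function name and the test matrices are illustrative assumptions, and the eigenvectors come out normalized so that $a_{i}^{T} C_{PxP}^{N} a_{i} = 1$ .

```python
import numpy as np

def mnf(C, C_noise):
    """Solve C a = lam * C_noise a by noise whitening.

    Returns eigenvalues in decreasing order and a matrix whose rows are
    the corresponding generalized eigenvectors a_i.
    """
    L = np.linalg.cholesky(C_noise)        # C_noise = L L^T
    Linv = np.linalg.inv(L)
    W = Linv @ C @ Linv.T                  # covariance in noise-whitened coordinates
    lam, V = np.linalg.eigh(W)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    A = (Linv.T @ V).T                     # map eigenvectors back to band space
    return lam, A

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 200))
C = B @ B.T / 200                          # hypothetical image covariance
C_noise = np.diag([0.1, 0.2, 0.1, 0.3])    # hypothetical uncorrelated noise

lam, A = mnf(C, C_noise)
ratios = np.array([a @ C @ a / (a @ C_noise @ a) for a in A])
print(lam, ratios - 1.0)                   # ratios - 1 is R of eq. (54)
```

With equal noise in all bands, `mnf(C, e * np.eye(P))` returns the PCA eigenvalues divided by `e`, matching eq. (59).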

Homogeneous Area Method [7, sec 2.9.1]: if possible, find a homogeneous area of $N_{h}$ pixels in the image:

\begin{array}{rcll} {\vec{M}}_{L} & = & \frac{1}{N_{h}} \sum_{k = 1}^{N_{h}} {\vec{X}}_{k} \quad \text{(local mean, vector over bands)} & (60) \\ {\vec{σ}}_{L} & = & [\frac{1}{N_{h} - 1} \sum_{k = 1}^{N_{h}} {({\vec{X}}_{k} - {\vec{M}}_{L})}^{2}]^{1 ∕ 2} & (61) \\ σ_{L i} & = & {[\frac{1}{N_{h} - 1} \sum_{k = 1}^{N_{h}} {(X_{i k} - M_{L i})}^{2}]}^{1 ∕ 2} & (62) \\ C_{L i j} & = & \frac{1}{N_{h} - 1} \sum_{k = 1}^{N_{h}} (X_{i k} - M_{L i}) (X_{j k} - M_{L j}) \quad \text{(general)} & (63) \\ & = & \frac{δ_{i j}}{N_{h} - 1} \sum_{k = 1}^{N_{h}} {(X_{i k} - M_{L i})}^{2} \quad \text{(zero band-to-band correlation)} & (64) \end{array}

Local Means and Local Variances [7, sec 2.9.2]

1. Divide the image into small $N_{b}$-pixel blocks (4x4, 5x5, ...).

2. For each (block, band) get the local mean and variance:

\begin{array}{rcll} M_{L i} & = & \frac{1}{N_{b}} \sum_{k = 1}^{N_{b}} X_{i k} , \quad i = 1, 2, \dots P \text{ bands} & (65) \\ {σ_{L i}}^{2} & = & \frac{1}{N_{b} - 1} \sum_{k = 1}^{N_{b}} {(X_{i k} - M_{L i})}^{2} & (66) \\ C_{L i j} & = & \frac{(δ_{i j})}{N_{b} - 1} \sum_{k = 1}^{N_{b}} (X_{i k} - M_{L i}) (X_{j k} - M_{L j}) & (67) \end{array}

$C_{L}$ is the local $P \times P$ covariance matrix.

3. Bin the $\{{σ_{L i}}^{2}\}$ into classes between the band min and max values.

4. The bin with the most blocks represents the mean noise of the image. Hope this bin is the same for all bands...

5. Suppose the “most popular” bin is a $P$-cube, each side ranging over $[σ_{i}^{*}, σ_{i}^{*} + \Delta σ_{i}^{*}]$ , and contains $N^{*}$ points. Then the average value over the bin

\bar{C_{L i j}} = \frac{1}{N^{*}} \sum_{k = 1}^{N^{*}} {(C_{L i j})}_{k}

(68)

is the desired noise covariance matrix.

6. Caveat: this assumes the image is “slowly varying enough” that enough of the $N_{total}$ blocks are homogeneous, i.e. their covariance is really due to noise, not true features in the image. Blocks and bins must both be “small enough” – but not too small!
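The block procedure above can be sketched as follows, under simplifying assumptions: NumPy, an image cube of shape (rows, cols, $P$), non-overlapping square blocks, and binning done separately for each band, so that only the diagonal of the noise covariance is estimated instead of the full $P$-cube bin of step 5. The test image is synthetic, a constant signal plus Gaussian noise.

```python
import numpy as np

def block_noise_variance(image, block=4, bins=10):
    """Estimate per-band noise variance from the modal bin of block variances.

    image: (rows, cols, P) array. Returns a length-P array.
    """
    rows, cols, P = image.shape
    r, c = rows // block, cols // block
    # Group pixels into (n_blocks, pixels_per_block, P).
    blocks = image[:r * block, :c * block].reshape(r, block, c, block, P)
    blocks = blocks.transpose(0, 2, 1, 3, 4).reshape(r * c, block * block, P)
    var = blocks.var(axis=1, ddof=1)                 # eq. (66) for every block
    noise = np.empty(P)
    for i in range(P):
        counts, edges = np.histogram(var[:, i], bins=bins)
        k = np.argmax(counts)                        # most populated bin
        in_bin = (var[:, i] >= edges[k]) & (var[:, i] <= edges[k + 1])
        noise[i] = var[in_bin, i].mean()             # average over that bin
    return noise

rng = np.random.default_rng(0)
sigma = np.array([1.0, 2.0, 3.0])                    # true noise level per band
image = 10.0 + rng.normal(size=(64, 64, 3)) * sigma  # homogeneous scene plus noise
estimate = block_noise_variance(image)
print(estimate, sigma**2)
```

With only $N_{b} - 1 = 15$ degrees of freedom per block, the modal bin tends to sit somewhat below the true variance, so the estimate should be read as approximate.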

Other methods: “unsupervised training” derived-endmember classification schemes, e.g. LAS’ search [11] and GRASS’ cluster/maxlik, are based upon local covariance minimization.

References

[1] Chein-I Chang. Hyperspectral Data Processing: Algorithm Design and Analysis. John Wiley & Sons, 111 River St, Hoboken, NJ 07030, 2013.

[2] David W. Coulter. Remote Sensing Analysis of Alteration Mineralogy Associated with Natural Acid Drainage in the Grizzly Peak Caldera, Sawatch Range, Colorado. PhD thesis, Colorado School of Mines, Golden, Colorado, 2006.

[3] A.A. Green, M. Berman, P. Switzer, and M.D. Craig. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing, 26:65–74, 1988.

[4] William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling. Numerical Recipes in C. Cambridge University Press, Cambridge, New York, Port Chester, Melbourne, Sydney, 1988.

[5] M.O. Smith, P.E. Johnson, and J.B. Adams. Quantitative determination of mineral types and abundances from reflectance spectra using principal component analysis. Journal of Geophysical Research, 90:C797–C804, 1985.

[6] Frank D. van der Meer. Extraction of mineral absorption features from high-spectral resolution data using non-parametric geostatistical techniques. International Journal of Remote Sensing, 15:2193–2214, 1994.

[7] Frank D. van der Meer and Steven M. de Jong. Imaging Spectroscopy. Kluwer Academic Publishers, Dordrecht, Boston, London, 2001.

[8] Frank D. van der Meer, Steven M. de Jong, and W. Bakker. Imaging Spectroscopy: Basic analytical techniques, pages 17–62. Kluwer Academic Publishers, Dordrecht, Boston, London, 2001.

[9] R. A. White. Image mean and covariance: http://dbwww.essc.psu.edu/lasdoc/user/covar.html, 2005.

[10] R. A. White. Karhunen-Loeve transformation: http://dbwww.essc.psu.edu/lasdoc/user/karlov.html, 2005.

[11] R. A. White. Search unsupervised training site selection: http://dbwww.essc.psu.edu/lasdoc/user/search.html, 2005.