Potential reasons for ionospheric anomalies detected by nonlinear principal component analysis just before the China Wenchuan earthquake, and their relationship to source conditions

Nonlinear principal component analysis (NLPCA) was performed to examine the total electron content (TEC) anomalies for the China Wenchuan earthquake of May 12, 2008 (= 7.9). This was applied to global ionospheric maps (GIMs) at heights from 150 km to 450 km, with the transforms conducted for the time period 00:00 to 06:00 UT on May 12, 2008. The earthquake occurred at 06:28 UT. The GIMs were analyzed by PCA and NLPCA, whereby they were separated into 100 smaller maps of 36° in longitude and 18° in latitude. These smaller maps are constructed at 71 × 71 pixels, which forms the transform matrix for NLPCA. The transform allows a principal eigenvalue to be assigned for each of the smaller maps. The results of the transforms provide 100 principal eigenvalues that cover the region, which includes the epicenter of the Wenchuan earthquake. The possibility of TEC anomalies caused by X-ray fluxes and geomagnetic activity is eliminated by reviewing X-ray flux data and the Kp index. The eigenvalues of NLPCA are compared with those of PCA. TEC anomalies were clearly detected using NLPCA, with large principal eigenvalues that represent earthquake-related TEC anomalies near the epicenter for the time period 00:00-0600 UT. The results of this study are discussed in terms of the potential causes of these TEC anomalies, especially for the time period 02:00-04:00 UT, when rocks around the epicenter were potentially subjected to a rapid increase in stress, which might have induced very pronounced p-type semiconductor effects. These effects will probably have been present for the other time periods, but might not have been so pronounced.


Introduction
The China Wenchuan earthquake occurred on May 12, 2008, at 06:28:00 (UT), at latitude 31.119˚N and longitude 103.258˚E with M = 7.9. It has been the subject of a great many investigations into earthquake precursors. One particular area of study has been the detection of total electron content (TEC) anomalies prior to the mainshock [Zhao et al. 2008, Liu et al. 2009, Jhung et al. 2010, Jin et al. 2010. These studies have shown either enhancement or depletion of the TEC over the earthquake preparation zone in the few days leading up to the Wenchuan earthquake. Lin [2010aLin [ , 2010bLin [ , 2011aLin [ , 2011bLin [ , 2011c presented the Karhunen-Loève transform (also known as principal component analysis, PCA) as an alternative method for the measurement of these earthquake-associated TEC anomalies. In these cases, the PCA recognized the same anomalies determined in other studies. In particular, Lin [2011a] focused on the detection of earthquake-associated TEC anomalies in the 24 h before 24 earthquakes of Richter magnitude scale M 5.0, and before six weaker earthquakes of Richter magnitude scale M <5 that occurred on Taiwan from January 1, 2000, to December 31, 2001. The results showed earthquake-related TEC anomalies before 21 of these 24 larger earthquakes, after making allowances for Xray flux and geomagnetic storm activity.
The present study takes a similar approach, but contrasts nonlinear PCA (NLPCA) with PCA. Like PCA, NLPCA is used to identify and remove correlations among problem variables, so as to reduce the dimensional space and to uncover correlations between variables in a given dataset. However, while PCA is useful for uncovering linear correlations, NLPCA can uncover both linear and nonlinear correlations; for example, the dynamics of ionospheric plasma when a TEC anomaly is in its nebulous stage.
The coupling of ionospheric TEC anomalies and earthquakes is not clearly understood. However, many studies have proposed reasonable explanations for the coupling mechanism. Pulinets and Boyarchuk [2004] suggested that radon emanating from active faults and cracks before earthquakes ionizes the near-ground atmosphere, to produce large vertical electric fields. Molchanov and Hayakawa [1998] suggested gravity waves that arise from fine vibrations in the Earth surface and lead to gas release that results in lower atmospheric turbulence and eventual ionospheric perturbations. Such fine vibrations might also be responsible for sub-audile gravity waves that cause such anomalies, and substantial post-earthquake research has been conducted in this area [Garcia et al. 2005].
Independent component analysis was applied recently by Orihara et al. [2009] to detect geoelectric field changes in noisy daytime data in Japan. They had previously determined that there were geoelectric field anomalies in relatively night-time data, 11 days before a mainshock earthquake (M = 4.8) that occurred in Nagano Prefecture on January 28, 1999. They attributed this precursor signal to being similar to those investigated by the Varotsos, Alexopoulos and Nomikos (VAN) group in Greece Alexopoulos 1984a, 1984b]. The VAN group has examined seismic electric signals (SES) to determine the epicenters and the magnitudes of impending earthquakes Lazaridou 1991, Varotsos et al. 1993], as well as time windows for forthcoming earthquakes using natural time analysis [Varotsos et al. 2002[Varotsos et al. , 2003a[Varotsos et al. , 2003b[Varotsos et al. , 2005. Examples of SES natural time analysis have been reported for both Greece [Varotsos et al. 2005[Varotsos et al. , 2006a[Varotsos et al. , 2006b and Japan ]. In addition, predictions based on SES of four major earthquakes in Greece during 2008 were recently described by Kamogawa [2008, 2010].
The exact cause of SES before an earthquake is not fully understood. However, experimental data have shown that stressed rock can produce p-type semiconductor effects, whereby, in metamorphic and igneous rock, charge separation occurs between electrons in the stressed areas of rock and small positive holes (p-holes) occur away from the stressed regions. Once the positive holes are generated, a number of phenomena occur at the Earth surface as the current propagates through rock, leading to electromagnetic emissions, positive surface potentials, corona discharges, positive ion emissions, and mid-infrared radiation [Freund 2003].
In the present study, NLPCA and PCA are applied to the TEC global ionospheric maps (GIMs) at heights from 150 km to 450 km, to identify TEC anomalies associated with the China May 12, 2008, Wenchuan earthquake, for the time period 00:00-6:00 (UT) of May 12, 2008. The earthquake occurred at 06:28 UT, so this time-frame is immediately before the earthquake. The results of both of these approaches are compared. The data comes from the FORMOSAT-3 satellite system and the International Global Navigation Satellite Systems (GNSS) Service (IGS). The satellite data are updated every 90 min, and there are approximately 2,500 GPS-occultation-density profiles per 24 h. The number of profiles is adequate to obtain meaningful results using either PCA or NLPCA.

Principal component analysis and nonlinear principal component analysis
PCA is a widely used technique in data analysis. It is a simple method that allows the extraction of relevant data from multivariate datasets. PCA is used to identify and remove correlations among problem variables, so as to reduce the dimensional space and to uncover linear correlations between variables in a given dataset.
NLPCA, on the other hand, can uncover both linear and nonlinear correlations; e.g., the dynamics of ionospheric plasma. Mathematically, for PCA the input data matrix is X, which has the dimensions n × m; S is called the scores matrix, and has dimensions n × r, where r is the given number of factors; R is called the loading matrix, and it has dimensions n × r (m > r); and the matrix of residuals is called E, with dimensions n × m [Kramer 1991]. The scores matrix is given by the following relation: Taking P T P = I as the unit matrix for mapping a gaseous form: ( 2) where is a row of S (a single data vector), and is the corresponding row of X, or the coordinates of in the feature space. R is the matrix for linear transformation. By reversing the projection, we get: ( 3) where is the reconstructed measurement vector, and the Euclidean norm is , which must be minimized for the given number of factors. To satisfy this criterion, the columns of R must be the eigenvector corresponding to the r eigenvalues of the covariance matrix of X when computed.
For NLPCA, by analogy with Equation (2), the mapping is in the form: where is a nonlinear vector function with r individual nonlinear functions: = {H 1 , H 2 , ... H r }. By analogy with Equation (3): where is the second nonlinear vector function with m individual nonlinear functions: = {G 1 , G 2 , ...G m }, the loss of information is again measured by to minimize , as with PCA, when and are selected. To generate and , a basis function approach is used [Cybenko 1989].

S XR
Similar to PCA, the largest eigenvalues m 1 of the covariance matrix X can then be computed. The largest eigenvalue is called the principal eigenvalue (m 1 ), which represents the principal characteristics of input signals X.

Global ionospheric map processing
First, the GIMs are divided into 100 smaller maps of 36i n longitude and 18˚in latitude, and then these smaller maps are constructed at 71 × 71 pixels. The smaller maps form the matrix of Equations (2), (3) (for PCA) and Equations (4), (5) (for NLPCA) as input data with dimensions 2 × 1 (i.e., m = 2, n = 1) through image processing [Vasilescu andTerzopoulos 2007, Zhang et al. 2009] for PCA and NLPCA. With 2,500 GPS-occultation electron-density profiles generated every 24 h, one to two measured electron-density profiles can be included in each of the 100 grids, which allows for principal eigenvalues to be computed for each of the 100 smaller maps. For comparison, PCA and NLPCA are used to examine the GIMs for 00:00-06:00 UT on May 12, 2008, and the respective results are given in Figures 1 to 3. Figure 1a shows the GIM of May 12, 2008, for 00:00-02:00 UT. Figure 1b gives the color-coded scale of the magnitudes of the principal eigenvalues using NLPCA that correspond to Figure 1a. The color intensity denotes magnitude. In this manner, 100 principal eigenvalues are assigned (i.e., each grid in the bottom figures represents a principal eigenvalue). Figure 1c gives a color-coded scale of the magnitudes of the principal eigenvalues using PCA that correspond to Figure 1a, to compare these with those using NLPCA. Figures 2 and 3 show the same approach, but for the time range of 02:00-06:00 UT.

Results
By examining the Figures 1b to 3b, using NLPCA, it can be readily seen that there were TEC anomalies on May 12, 2008, for the time from 00:00 to 06:00 UT, as given by the color intensities that represent large eigenvalues (i.e., values >0.5 in a normalized set) [Lin 2010a]. In particular, for the time period 02:00-04:00 UT, the principal eigenvalues are clearly large above the epicenter (see Figure 2b). These large eigenvalues cannot be detected using PCA (see Figures 1c to  3c). These results show that NLPCA is better at the detection of TEC anomalies than PCA for these time periods. Figure 4 shows the X-ray flux data (Figure 4a) and the Kp index (Figure 4b) for the corresponding time periods. It is evident that such disturbances were low for May 12, 2008. The large principal eigenvalues given in Figures 1, 2 and 3b are around and above the epicenter. Lin [2010a] applied PCA (or the Karhunen-Loève transform) to ionospheric TECs to investigate precursors to 12 earthquakes of M ≥5.0 for January 1, 2002, to December 31, 2003. That analysis recognized the same anomalies found by Liu et al. [2006] for this series of earthquakes, thereby providing a firm mathematical basis to these findings and TEC ANOMALY DETECTION USING NLPCA  . (b, c) Color-coded scales of the magnitudes of principal eigenvalues using NLPCA (b) and PCA (c) corresponding to (a). The colors within the grids give the magnitudes of a principal eigenvalue corresponding to (a), so that there are 100 principal eigenvalues assigned (i.e., each grid represents a principal eigenvalue). a b c establishing criteria for using PCA in the detection of earthquake-associated TEC anomalies. These established criteria allow PCA to detect earthquake-associated TEC anomalies when earthquakes are have M ≥5.0. Lin [2011c] applied PCA to show earthquake-associated TEC anomalies identified by Zhao et al. [2008] for the May 12, 2008, Wenchuan earthquake, and described their spatial extent. Similarly, Lin [2011b] applied PCA immediately after the Wenchuan earthquake, and found a cone-shaped TEC anomaly over the epicenter that gathered in intensity  between the heights of 200 km to 300 km, which is suggestive of a snowplow effect from a rising acoustic shockwave. The present study is the first to report TEC anomalies so close to the nucleation phase of the Wenchuan earthquake, through using NLPCA. The advantage of NLPCA in finding TEC anomalies so close to the mainshock is its high temporal resolution. As mentioned in the Introduction, the VAN group in Greece has been looking at the possibility of such TEC anomalies being caused by stressed rock that produces electric fields. This area of study has looked at pressurestimulated currents that can create electric fields in nonhomogenous crustal rock [Varotsos and Alexopoulos 1984c, Varotsos et al. 1998, Parsons et al 2008, Xu et al. 2009 and that produce SES, which they attempt to define as earthquake precursors.

Discussion
Another area of study is that of p-holes. These charge carriers are defect electrons in the O 2− sub-lattice that are chemically equivalent to O − in a matrix of O 2− . These occur in stressed igneous and metamorphic rock, which creates a ptype semiconductor effect [Freund 2003]. If the stress is pronounced, then the p-type semiconductor effect will also be pronounced. Pulinets and Boyarchuk [2004] suggested that such lower-atmosphere electric fields can travel unimpeded into the ionosphere along geomagnetic lines, causing TEC anomalies.
In the present study, the most intense TEC anomalies detected by NLPCA were for the time period 02:00-04:00 UT (Figure 2b). At this time, it would be reasonable to assume that the source fault would have been experiencing pronounced changes in stress. Rapid increases in stress might have caused pronounced p-type semiconductor effects. This would offer an explanation for the clear TEC anomalies observed above the epicenter for this time period. However, for the other time periods, of 00:00-02:00 UT and 04:00-06:00 UT, the source-fault stress might not have been as large, so that the p-type semiconductor effect was not as pronounced, and hence the anomalies were less evident. Lin [2010Lin [ , 2011b studied the spatial extent of the post Wenchuan earthquake TEC anomaly. However, that anomaly was assumed to have been caused by acoustic shock waves or by acoustic gravity waves that accompanied the earthquake. These types of anomaly require strong ground motion, which would have been absent before the earthquake.

Conclusions
TEC anomalies that are believed to be earthquake related were identified for the time period 00:00 to 06:00 UT using NLPCA. The M = 7.9 May 12, 2008, Wenchuan earthquake occurred at 06:28 UT. The TEC anomalies were around and over the earthquake epicenter, and then quickly spread out over a wide region. These TEC anomalies might be attributable to lower atmosphere electric field activity associated with the earthquake preparation processes, such as stressed rock producing p-type semiconductor effects. These might have occurred if the source fault underwent rapidly increasing stress in the time period of 02:00-04:00 UT. This stress might not have been as pronounced for the other time periods, and therefore the TEC anomalies recognized were lower.