Early warning of tsunami from seismo-ionospheric fluctuation after Japan ’ s March 11 , 2011 , M = 9 . 0 Tohoku earthquake using two-dimensional principal component analysis

Two-dimensional principal component analysis (2DPCA) is implemented to analyze the total electron content (TEC) anomalies after Japan’s Tohoku earthquake that occurred at 05:46 on March 11, 2011 (UTC) (Mw=9). 2DPCA and TEC data processing were conducted just after this devastating earthquake. Analysis results show an earthquakeassociated TEC anomaly near the epicenter that began at 05:47. This may represent an extension of the precursor of the earthquake, to the precursor of China’s Wenchuan earthquake on May 12, 2008, detected by the study of Lin [2012], for which the data were obtained at a height of 150200 km by the FORMOSAT-3 satellite system. It is impossible that such precursor caused by the acoustic shock waves. Another TEC anomaly near the epicenter occurred at 05:53, and this initiated the propagation of the tsunami effect related to the ionosphere through the acoustic shock waves from the epicenter. However, the TEC anomalies did not appear to be affected by a contemporaneous geomagnetic storm and other non-earthquake effects. The propagation of anomalous fluctuation could be an early warning of the tsunami for the regions far from the epicenter as it began to propagate with the higher speed of 3960-4950 km/h than the tsunami speed (720-800 km/h).


Introduction
Research into total electron content (TEC) anomalies associated with earthquakes has been greatly facilitated by the advent of global positioning system (GPS) technology, especially the number of GPS satellites and density of ground receivers [Heki and Ping 2005, Otsuka et al. 2006, Kakinami et al. 2010, Liu et al. 2011, Matsumura et al. 2011, Saito et al. 2011a, Saito et al. 2011b, Tsugawa et al. 2011, Khegai et al. 2013, Jin et al. 2014].A possible cause of earthquake-related TEC anomalies is acoustic gravity waves (AGWs).AGWs are caused by vibrations at the solid-Earth's surface and at sea, resulting from earthquakes, meteor strikes and even anthropogenic activity [Woulff and McGetchin 1976, Rothkaehl et al. 2004, Shinagawa et al. 2007].The data of global ionospheric maps (GIMs) analyzed with two-dimensional principal component analysis (2DPCA) will be used to detect such anomalies.Tsugawa et al. [2011] and Saito et al. [2011b] have investigate TEC maps over Japan for this earthquake event and shown that the TEC variation appeared approximately 7 minutes after the earthquake occurrence.This scenario is reproduced by computer simulations by Matsumura et al. [2011].Effects of tsunami on the ionosphere propagate from the epicenter have been shown by Makela et al. [2011].Occhipinti et al. [2011] describes the tsunami following M w =9 Tohoku earthquake that occurred at 05:46 UT on March 11, 2011, produced internal gravity waves into the neutral atmosphere and large disturbances in the overlying ionospheric plasma while propagating through the Pacific Ocean.Therefore the remote sensing of atmosphere and ionosphere can give new tools for tsunami monitoring and detection.Liu et al. [2011] describe fast propagating plasma disturbances as a type of Rayleigh waves [Yuen et al. 1969] quickly travel away above the epicenter along the main island of Japan with a speed of 2.3-3.3 km/s, followed by circular ripples and a tsunami speed of about is 720-800 km/h.GPS users with single-frequency receivers need ionospheric electron content information in order to achieve positioning accuracy similar to dual-frequency receivers.The Global Differential GPS (GDGPS) System provides a global real-time map of ionospheric electron content.These maps are also of value in monitoring the effect of the ionosphere on radio signals, power grids, and on space weather.The maps are derived using data from the ~100 real-time GDGPS tracking sites.The integrated electron density data along each receiver-GPS satellite link is processed through a Kalman filter to pro-duce the global ionospheric TEC maps (GIM).The maps are available from multiple GDGPS operation centers (GOCs) as images, as text files containing the gridded TEC values, and TEC is an integration of the plasma density from the height 150-450 km, or as a binary data stream containing the gridded TEC values.TEC measured errors (biases), and their correction using the Kalman filter, are described in the following references: Wu and Bar-Sever [2005], Kechine et al. [2004], Ouyang et al. [2008].The GDGPS system is a complete, highly accurate, and extremely robust realtime GPS monitoring and augmentation system.Employing a large ground network of real-time reference receivers, innovative network architecture, and awardwinning real-time data processing software, the GDGPS system provides decimetre (10 cm)-scale positioning accuracy and subnanosecond-scale time transfer accuracy on ground, in air, and in space, independent of local infrastructure.A complete array of real-time GPS state information, environmental data, and ancillary products are available in support of the most demanding GPS augmentation operations -assisted GPS (A-GPS) services, situational assessment, and environmental monitoring -globally, uniformly, accurately, and reliably.The largest real-time tracking network on the ground is shown in Figure 1.The global TEC data were real-time measured by the sensors in the GDGPS satellites, and then they were sent back to this network, therefore the spatial resolutions of the original TEC data would have 5.0 and 2.5 degrees in longitude and latitude, respectively [Hernández-Pajares et al. 2009].The estimated TEC data have been corrected for biases during measurements of dual-frequency delays of GPS signals e.g. carrier phase biases, satellite state corrections, ionospheric delay and troposphere, which need to be removed using ground-based post-processing software [Mannucci et al. 1998, Raman and Garin 2005, Wu and Bar-Sever 2005].The GIMs contain vertical (VTEC), which has been converted from the slant (STEC) at the ionospheric pierce points as STEC=VTEC • ME + b + r, where ME = 1/cos(i) is the mapping function, i is the zenith angle of GPS satellite at the single layer height of the ionosphere, b and r are the instrument biases of the satellites and receivers, respectively.Then VTEC and the instrument biases b and r are obtained by combining interpolation and least-square fitting procedure.For details of the method used to derive the VTEC from GPS measurements, please refer to Mao [2007] and Mao et al. [2008].The estimated VTEC data (after processing as stated above and later is named TEC in this study) at 05:47, 05:53 and 06:47 (UTC) on March 11, 2011, are examined just after Tohoku earthquake that occurred 05:46 UTC on March 11, 2011, at (38.322°N, 142.369°E) with a depth of 24.4 km during a contemporaneous geomagnetic storm.These estimated TEC data are not interpolations and GIMs are used only for observation.In this study, the estimated TEC data are used to process and plot the eigenvalues shown Figures 4-6.For the reasons stated previously, building GIMs via interpolation is beyond the scope of this study and therefore, artificial effects due to interpolation have been avoided.

2DPCA
For 2DPCA, signals are represented by a matrix A (the dimension of n × m).Linear projection of the form is considered as followed [Sanguansat 2012], Here x is an n dimensional project axis and y is the projected feature of signals on x called principal component vector.
Here S x is the covariance matrix of the project feature vector.
The trace of S x is defined; The matrix G is called signal covariance matrix.The vector x maximizing Equation (4) corresponds to the largest (principal) eigenvalue of G and is the largest eigenvalue of the most dominant component of the data and therefore represents the main characteristics and thus the inhomogeneity of the data is detected in this study.2DPCA can remove computing or artificial errors when dealing with small sample signal size (SSS) (low-dimensional data) problem.Non-linear PCA and PCA convert the one-dimensional data prior to the covariance matrix calculation.The covariance matrix of the non-linear PCA and PCA is based on an input matrix with dimensions of m × n, which is reshaped from one-dimensional data (length of m multiplied by n).Reshaping the data will cause computing errors because non-linear PCA and PCA are tools for handling one-dimensional data.Therefore, Lin's study [2012], for which the data were obtained at a height of 150-200 km by the FORMOSAT-3 satellite system and the international GNSS Service (IGS) had computing or artificial errors.The FORMOSAT-3 satellite system has dense TEC data and therefore, the computing or artificial errors should be suppressed by the TEC anomalies related to the earthquake.The TEC data used in this study were not dense; however, 2DPCA can process such TEC data to examine TEC anomalies related to the earthquake and tsunami effects after the removal of computing or artificial errors.Now backing to discuss the concept of 2DPCA, for 2DPCD, the spatial structure and information cannot be well preserved due to some original information loss when inverting to original dimension under the condition of the matrix being small sample size (SSS).Such information loss is called SSS problem.However, the covariance matrix in 2DPCA is full rank for a matrix of low dimension.The curse of dimensionality and SSS problem (low-dimensional data problem) can be avoided by 2DPCA.Therefore 2DPCA will be performed to detect TEC anomaly related to Tohoku earthquake due to the non-dense TEC data set.

Data processing
The estimated TEC data (after processing stated in Section 1) are processed at 05:47, 05:53 and 06:47 (UTC) in Figures 2-4 on March 11, 2011 (UTC).This global region is divided into 600 smaller areas, 12° in longitude and 9° in latitude to detect more detailed TEC information because the resolution of the original TEC data for this GPS system is 5.0 and 2.5 degrees in longitude and latitude, respectively [Hernández-Pajares et al. 2009], and therefore in each area at least 4 TEC data points are taken (12/5=2.4,9/2.5=3.6, the 2 and 2 TEC data are taken in longitude and latitude, respectively.If 6 points are taken, which are 2 and 3 points in longitude and latitude, then more computing time is needed, however the results are the same as using 4 TEC points).In any place of each smaller area (a grid) includes 4 TEC data and thus 4 TEC data are taken to analyze.The 4 TEC data must not have uniform distribution in each smaller area (a grid).The 4 TEC data form a matrix A of dimensions 2 × 2 in Equation (1) in Section 2.1.The matrix belongs to the SSS data.Therefore the 2DPCA is more suitable for SSS data.The TEC data are not uniformly distributed in each grid.It means that the TEC data are not uniformly distributed in the geographical coordinate.Four of the TEC data in each grid are SSS data (low-dimensional data), and 2DPCA is therefore able to manage such data.Results showed an eigenvalue in any place within each grid (each grid is filled with a uniform color for convenience).This study therefore shows large eigenvalues near the epicenter at 05:47, 05:53, which spread out until at 06:47 (UTC) on March 11, 2011.

Results
The first earthquake-related TEC anomaly with a large eigenvalue near the epicenter began at 05:47 on March 11, 2011 (UTC) shown in Figure 5.This anomaly may be an extension of precursor just before this earthquake that was the same as the precursor of China' Wenchuan earthquake on May 12, 2008, detected by EARLY WARNING OF TSUNAMI OF TOHOKU EARTHQUAKE   Lin's study [2012] because the acoustic waves did not generate on the seasurface propagate into the ionosphere at this time [Saito et al. 2011b, Tsugawa et al. 2011].After 05:47, the precursor disappeared quickly because it was slighter weaker than the TEC anomalous fluctuation caused by the tsunami effect (i.e., in Figure 5, the smaller magnitude of the eigenvalue near the epicenter compared with the later eigenvalue figures).This TEC disturbance was caused by a different mechanism to that responsible for the TEC anomaly, which will be discussed later, i.e., they represent different TEC effects.Therefore, these two TEC disturbances (05:47, 05:53) were caused by different mechanisms for which it is possible to have a short time lag.Moreover, in Lin's study [2012], the TEC data were obtained from an integration of the plasma density from the height 150-200 km by the FORMOSAT-3 satellite system, and the P-Type effects disappeared quickly.The P-type effects should result in lower temporal and spatial TEC variations caused by slow rock-cracking under stress as compared to those caused by acoustic shock waves [Freund 2003, Liu et al. 2011], and therefore, it could be detected from the TEC data of the FORMOSAT-3 satellite system with low temporal and spatial resolution [Lin 2012].The TEC variations of the GDGPS system would result in higher variations caused by acoustic shock waves as indicated by the findings of Liu et al. [2011].However, in this case, such TEC variations may have included some low temporal and spatial TEC variations at 05:47 (UT) on March 11, 2011, and therefore, the P-type effects could be found by 2DPCA at this time.The TEC data used in this study were obtained from the GDGPS system and therefore, both TEC data anomalies are different.Both TEC anomalies, separated by 1-7 minutes, should be considered as isolated events for which having a short time lag is reasonable.For comparing with other background noises, the global ionospheric TEC data were examined using 2DPCA.Through this examining, other TEC anomalies of other places (e.g.strong magnetic storm) and strong background noises resulted in small eigenvalues and thus a lager eigenvalue of 2DPCA over the epicenter should be a pattern of earthquake-related TEC anomaly.It means that the eigenvalues of non-earthquake effects have been suppressed by the eigenvalues of the earthquake and tsunami effects.Therefore 2DPCA method can distinguish an earthquake-related TEC anomaly with a large eigenvalue from other TEC anomalies during the examined time period.The reason to distinguish pattern of earthquake-related TEC anomaly may be from the technology of 2DPCA, which is suitable to deal with low dimension TEC data, thereby pattern of earthquake-related TEC anomaly was accidentally caught.The TEC data after 05:47 were examined by 2DPCA.Another earthquake-associated TEC anomaly (occurring simultaneously with the tsunami effects on the ionosphere at 05:53 initiated the propagation of an anomalous fluctuation through the acoustic shock waves from the epicenter (Figure 6).After about 1 hour, this anomalous fluctuation spread out in a similar way to the propagated effects of the tsunami on the ionosphere (from the epicenter at 36° longitude and 45° latitude, as shown in Figure 7).The types of TEC variations represented by the principal (largest) eigenvalues were TEC large disturbances, such as those in the study of Liu et al. [2011].Another interesting aspect of this study is that the Dst index (Figure 8) indicates a geomagnetic storm for this time period [Hamilton et al. 1988, Mukherjee 1999, Elsner and Kavlakov 2001, Plotnikov and Barkova 2007].Studies of earthquake related TEC anomalies often completely discount days when geomagnetic storms occur because such storms are believed to distort TEC analysis profiles.However, in this case the TEC anomalies were detected regardless of the geomagnetic storm and other background noises, which resulted in small eigenvalues.The principle eigenvalues shown in this study could represent the inhomogeneity of the TEC within an area of 12° longitude and 9° latitude, which is why TEC variations with a spatial scale size of several hundred kilometres are picked up.TEC variations caused by geomagnetic disturbances and other non-earthquake effects could be smaller than the TEC anomalies related to both earthquake and tsunami effects on the ionosphere.A large eigenvalue indicates the patterns of the tsunami effect and earthquake-related TEC anomalies, and these patterns may represent the types of certain particular spatial TEC frequencies or characteristics that were detected by the 2DPCA.It is worth mentioning that if artificial effects caused by 2DPCA really occurred and then the computing errors of the 600 eigenvalues (Figures 5, 6 and 7) will have almost the same magnitudes, thus the larger eigenvalues could also show a relation to the earthquake and tsunami effects.

Discussion
2DPCA was able to detect a TEC anomaly related to the earthquake that began at 05:47 near the epicenter, which is considered to possibly be an extension of the precursor of this earthquake which may be the same as stark p-type semiconductor effects [Lin 2012].After this TEC anomaly disappeared, a TEC anomaly at 05:53 initiated the propagation of an anomalous fluctuation and propagated the effects of the tsunami on the ionosphere through the acoustic shock waves from the epicenter.As stated previously, one reason should be acoustic shock waves.This anomaly resembled what one would expect from rising acoustic shock waves be-cause strong motion.As discussed in the introduction, Earth's atmosphere could act as a natural amplifier due to declining atmospheric density with height.A large earthquake, such as this earthquake, was characterized by many fine vibrations at the Earth's surface which could produce a vertical stark acoustic pressure wave of great amplitude by the time it reached the ionosphere.Such a description could possibly represent the concentrated energy of an acoustic shock wave with fast speed being formed in the lower atmosphere after the earthquake traveling up into the ionosphere to cause the anomalous fluctuation near the epicenter.This anomalous fluctuation could be an early warning for the regions far from the epicenter when this anomalous fluctuation beginning to propagate as a type of Rayleigh waves.As stated in Section 3, this anomalous fluctuation spread out around the region of TEC anomaly after 1 h in Figure 4   earthquake.Therefore the result in this study should be reasonable.Moreover, some studies had different results that the propagation velocity of the TEC disturbances from the epicenter was quite larger e.g.Liu et al. [2011] describe fast propagating plasma disturbances as a type of Rayleigh waves [Yuen et al. 1969] quickly travel away above the epicenter along the main island of Japan with a speed of 2.3-3.3 km/s.they used another data source.The tsunami arrived with the speed being about 720-800 km/h at the far regions very slower than such anomalous fluctuation propagated.Through this study, the TEC fluctuation propagation could be as an early warning of the tsunami caused by this earthquake.Please note that it resulted in an eigenvalue at any place within each grid (the largest eigenvalue locations were surely near the epicenter and spread out from the epicenter).However, for convenience, each grid was filled with uniform color.This study has shown many square regions that are presented only for convenience.The aim of this paper was to compute the propagation of the anomalous fluctuation with its speed of 3600-6120 km/h, and it was not focused on the shape of the propagating anomalous fluctuation.

Conclusion
2DPCA had the advantage to detect the TEC anomalies related to the March 11, 2011, Tohoku earthquake.Results have shown that a local ranging TEC anomaly may be an extension of the precursor of this earthquake and that another TEC anomaly at 05:53 began the propagation of the effects of the tsunami on the ionosphere through the acoustic shock waves from the epicenter.The TEC anomaly could be a TEC anomalous fluctuation with the speed being 3960-4950 km/h.The anomalous fluctuation could be an early warning of the tsunami for far regions because the anomalous fluctuation propagating speed was fast than the speed of tsunami with about 720-800 km/h.

Figure 1 .
Figure 1.The largest real-time tracking network on the ground (the Global Differential GPS (GDGPS) System; from www.gdgps.net).

Figure 5 .
Figure 5.The figure gives a color-coded scale of the magnitudes of principal eigenvalues corresponding to Figure 2. The color within a grid denotes the magnitude of assigned principal eigenvalues, respectively.Note it resulted in an eigenvalue in any place of each grid.However, for convenience to show, each grid was filled with a uniform color.This study has shown large eigenvalue near the epicenter.
Figure 6.The figure gives a color-coded scale of the magnitudes of principal eigenvalues corresponding to Figure 3.

Figure 7 .
Figure 7.The figure gives a color-coded scale of the magnitudes of principal eigenvalues corresponding to Figure 4.

Figure 8 .
Figure 8.This figure shows the Dst indices in March 2011 (World Data Center for Geomagnetism, Kyoto).