Ionospheric perturbations associated with two huge earthquakes in Japan , using principal component analysis for multiple subionospheric VLF / LF propagation paths

The presence of ionospheric perturbations in possible association with two huge earthquakes (Noto-hanto peninsula and Niigata-chuetu-oki earthquakes) in 2007 was studied on the basis of a conventional statistical study for a particular propagation path from the JJI transmitter in Miyazaki, Kyushu, to Moshiri in Hokkaido. This is based on automatic routine-based signal processing, in which the trend as the average nighttime amplitude is significantly decreased, with almost simultaneous significant enhancement in the night-time fluctuation as the night-time integration of negative fluctuation from the average. It is, however, shown that this routine-based signal analysis sometime suffers from artificial (or man-made) effects. Thus, in this study, we propose an additional use of principal component analysis (PCA) for simultaneous observation of a few VLF/LF propagation paths. With the application of this PCA method to multi-path data, the artificial effects can be reasonably removed, and also only the geophysical effects associated with earthquakes are detected, by focusing mainly on the third principal component. The satisfactory separation of the principal components is made possible by pre-analysis of the VLF data (extraction from the raw data of the average over a whole year). This PCA method enables us to identify the seismogenic effects in association with earthquakes with smaller magnitudes, down to M 5.5 or


Introduction
There has been an accumulation of a lot of electromagnetic phenomena that might be associated with earthquakes [e.g., Molchanov andHayakawa 2008, Hayakawa 2009a].These seismogenic effects can be classified as from the lithosphere [e.g., Fraser-Smith 2009, Freund 2009], in the atmosphere [Biagi 2009, Hayakawa 2009b] or in the ionosphere [Hayakawa 2009c, Liu 2009, Parrot 2009].In particular, there appears to be a consensus that the ionosphere is perturbed prior to large earthquakes, both in the lowest D/E region and in the upper F region [e.g., Liu et al. 2006].
We have been recording the network observations of subionospheric very low frequency (VLF)/low frequency (LF) propagation in and around Japan [see Hayakawa et al. 2004, Molchanov and Hayakawa 2008, Hayakawa 2009c].A substantial number of studies have been carried out on the presence of seismo-ionospheric perturbations and their temporal and spatial characteristics on the basis of case studies, and also of statistical studies.Case studies start from the 1995 Kobe earthquake [Hayakawa et al. 1996, Molchanov andHayakawa 1998], and many case studies were summarized by Hayakawa [2009c].In terms of statistical studies, there have been a few reports [Rozhnoi et al. 2004, Maekawa et al. 2006, Kasahara et al. 2008].Recently, Hayakawa et al. [2010a,b] published two studies that indicated that there are significant statistical correlations of these ionospheric perturbations with earthquakes with magnitudes >6.0 and with shallow depths (d <40 km).
Even under these conditions, there is the need to increase the numbers of case studies of seismo-ionospheric perturbations for the huge earthquakes that have occurred Special Issue: EARTHQUAKE PRECURSORS
in Japan.During 2007, there were two huge earthquakes in Japan with magnitudes >6: the Noto-hanto peninsula and Niigata-chuetsu-oki earthquakes.By making full use of our VLF/LF network data, we provide here a case study of the seismo-ionospheric perturbations for these two earthquakes.
We initially indicate the results based on conventional statistical analysis for a specific propagation path that is closest to these two earthquakes.Then, to confirm the presence of these seismogenic ionospheric perturbations, we suggest the additional important analysis method of principal component analysis (PCA).Ten years ago, we proposed the use of PCA of multi-stationed ultra-low frequency (ULF) data to identify the presence of seismo-ULF electromagnetic emissions [Gotoh et al. 2002], and here this method is applied to subionospheric VLF/LF data from different propagation paths.

Target earthquakes and VLF/LF network observations
We consider the complete year of data for 2007, as this period included two major earthquakes that were strong even for Japan: the Noto-hanto peninsula earthquake (March 25, 2007;M 6.8, d = 8 km) and the Niigata-chuetsu-oki earthquake ( July 16, 2007;M 6.6, d = 12 km).Figure 1 illustrates the propagation paths from the VLF transmitter with a call sign of JJI (Kyushu, Miyazaki, 22.2 kHz) to the three receiving stations, MSR (Moshiri, Hokkaido), KCK (Kamchatka, Russia) and TYM (Tateyama, Chiba).One additional propagation path is plotted from another Japanese LF transmitter, with a call sign of JJY (Fukushima, 40 kHz), to MSR.Their corresponding wave-sensitive areas defined by the 5th Fresnel zone are also indicated in Figure 1.For these propagation paths, we took earthquakes with magnitudes >5 from the US Geological Survey earthquake catalog, as also indicated in Figure 1, that might have effects on the VLF/LF propagation.Each circle in Figure 1 corresponds to an earthquake, with the earthquake epicenter at its center, and with the circle size (radius) proportional to the magnitude of the earthquake.The color of the circle indicates the earthquake depth, as indicated in Figure 1, and the numbers with the circles indicate the dates when the earthquakes occurred.

Conventional statistical data analysis
Conventional (routine based) signal processing was applied to the propagation path from JJI to MSR over the one year of data for 2007, using the night-time fluctuation method [as in Hayakawa et al. 2010a].This path was the most suitable for the study of these two earthquakes.Using this method, we estimated the average night-time amplitude, called the trend, the dispersion, as the night-time amplitude fluctuation, and the night-time fluctuation; the details of those parameters were described by Kasahara et al. [2008] and Hayakawa et al. [2010a].All of these parameters were normalized with their corresponding standard deviations (v) over 30 days before any current day.Figure 2 shows the result of an automatic (or routine based) analysis over the year 2007, for the JJI-MSR path.The top panel of Figure 2 shows the temporal evolution of the trend, the second panel shows the dispersion, and the third panel shows the night-time fluctuation.The bottom panel of Figure 2 shows the plot of RKp (the geomagnetic activity).As has already been shown statistically [Hayakawa et al. 2010a], the most important parameter, the trend, was seen to decrease several days before an earthquake, while the other two parameters, as dispersion and night-time fluctuation, respectively, tended to be simultaneously enhanced prior to an earthquake.Let us look at the result in Figure 2. A significant decrease in the trend is seen at least before the two huge earthquakes of interest.Over the week before the Noto-hanto peninsula earthquake, we note a significant decrease in the trend, and correspondingly, the other two parameters are seen to be enhanced.This was apparently a precursor to this earthquake.
Next, we move on to the second earthquake, that of Niigata-chuetsu-oki. Again, a noticeable decrease (exceeding 140 ONO ET AL. -3v) is seen in the trend over about one week prior to this second earthquake.The dispersion is not enhanced so much, but the night-time fluctuation is increased.This characteristic is considered to be a precursor to this second earthquake.One more thing that should be mentioned is that the geomagnetic activity was rather quiet during the VLF anomaly.Another clear anomaly was noted at the end of August, until the beginning of September, 2007, which was probably related to a series of rather strong after-shocks that occurred up to the beginning of September, 2007.
However, some abnormal behaviors were seen during this year (2007).For example, by looking at the temporal evolution of the trend in Figure 2 (top panel), there is one depletion peak (one day) in the middle of February, and another towards the end of May, with another one-day depletion peak at the end of December, 2007.These are seen to exceed -3v in the trend, which is too strongly abnormal and they look artificial.A normal perturbation that appears to be associated with an earthquake is known to persist for at least a few days, or even more.However, of course, a one-day anomaly is also possible for an earthquake with a smaller magnitude, so that it appears difficult to distinguish between a seismogenic effect and an artificial (man-made) effect, except by using a single propagation path.When only one propagation path was considered, we compared those anomalies (probably not associated with earthquakes) with the other corresponding phenomena (solar-terrestrial and meteorological effects), to study these peaks extensively.However, we have the advantage that we have the data for a few subionospheric VLF/LF propagation paths, and this is the reason why we propose the use of PCA for the data analysis of these few subionospheric VLF/LF propagation paths.The purpose of the use of PCA is to: (1) re-confirm the existence of seismo-ionospheric perturbations for these two huge earthquakes; (2) determine whether these one-day anomalies are really artificial or not; and (3) explore the possibility of identifying the ionospheric perturbations for even weaker earthquakes, with magnitudes down to M ~5.

PCA and preparation of the data for PCA
We present some fundamental descriptions of PCA, which involves a mathematical procedure that transforms a number of possibly correlated variables into a number of uncorrelated variables, called the principal components; these are related to the original variables by an orthogonal transformation.This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to the proceedings.
PCA is intended to identify different noise sources from multiple data.Gotoh et al. [2002] made the first attempt to investigate seismogenic ULF emissions by applying PCA to the multi-stationed ULF data observed at three stations on the Izu peninsula, in the case of the Izu Islands earthquake swarm.They then found that the third principal component was highly likely to be seismogenic.
We have to prepare the subionospheric VLF/LF data observed for three propagation paths.The propagation paths we chose were: 1) The JJY transmitter (Fukushima) -MSR (Moshiri, Hokkaido); 2) The JJI transmitter (Miyazaki) -MSR; 3) The JJI transmitter -KCK (Kamchatka, Russia).This was because we are interested in the two major earthquakes that occurred in the year of 2007 on the side of the Japan Sea.

Selection of analysis time interval
The above three propagation paths were chosen based on the relative locations of the transmitters and receivers with respect to our huge earthquakes, so that the local nighttime was different from one path to another.In particular, the local night was a few hours different for the two paths of JJI-MSR and JJI-KCK.Then, we chose the time interval of UT = 12-16 h (4 h) as the common night-time for all of the above three propagations, with exclusion of the terminator times, as studied by Hayakawa et al. [1996].The sampling rate of our VLF/LF receiving system was 120 s, so that we can use 120 data points (N = 120) for one day for one propagation path.As the total, we have 360 data points per day for all three of the propagation paths.

Selection of the analysis method
There were some differences in the propagation characteristics for these three propagation paths.So we tried the following three possible pre-analysis methods before the application of PCA, in order to have appropriate and acceptable results.
a) The use of the raw data.PCA was applied to the raw data sampled every 120 s during the same local night-time (UT = 12-16 h) observed for the three propagation paths.b) Extraction from the raw data of the night-time average amplitudes through the whole period.Initially we estimated the average night-time (same UT as before) amplitude over the whole period (one year) for a particular path, and we took the difference between the current data and the average above.The same procedure was used for the other two paths, and then we applied PCA to these data.
c) The difference between the current raw data and the average on each current day.Unlike the above procedure, we take the averages during the local night (UT = 12-16h) on each day, and we take the differences between the raw data and this average.The same procedure was applied to the other two paths, and then PCA was applied to these data.

Comparison of PCA results for the three different methods
We analyzed the data for the two years of 2006 and 2007.Although the main target was 2007, we also investigated another year, as 2006, because there were not so many large earthquakes in 2006, and so it was suitable to study which pre-analysis treatment in Subsection 4.2 was best for the further PCA analysis.
The PCA results are illustrated in Figures 3, 4 and 5, for the VLF/LF data in Subsection 4.2 (i.e., for the pre-analysis methods a, b and c, respectively).In these figures, the eigenvalues of the first, second and third principal components (m 1 , m 2 and m 3 ) are plotted for the whole year of 2006.The earthquake information is given at the top of each top panel of Figures 3, 4 and 5, as a downward arrow that indicates the earthquake magnitude, with all of the earthquakes with magnitudes <6.0.
Figure 3 is the PCA result for the data as (a) in Subsection 4.2 (i.e., use of the raw data).This PCA was intended to separate a few possible effects, such as geomagnetic variations, seasonal variations, and earthquake effects, among others.Figure 3 indicates that only the first principal component is well separated; although we noted the significant similarities in the temporal evolutions of the second and third principal components, we utterly failed to extract the second and third components.So, this preanalysis treatment (a) is not so good for PCA analysis.
Next, we move to Figure 5 using the data as (c) in Subsection 4.2.A comparison of the temporal evolution of the three principal components (m 1 , m 2 , m 3 ) suggested that they show similar temporal variations, which indicates apparent correlations among the three components.Due to this, this method was not effective in separating two components and is not appropriate for our PCA.
On the other hand, Figure 4 shows the PCA results for the data (b) in Subsection 4.2 (while considering the average over the whole period of one year), which indicates that there appears to be nearly no (or weak) correlation among the three principal components.So, the three principal components are likely to be well separated, such that in the following we adopt this pre-treatment (b) of Subsection 4.2 for further PCA procedures for our target year of 2007.

Analysis data and analysis results
Now we come back to Figure 4 to discuss this figure carefully, because the separation of the principle components is best using the data treatment as (b) in Subsection 4.2.The ordinate indicates the intensity of each principal component.From the top, there are the first, second and third principal components.When we have no plots, it means there are no data.When we focus on the first principal component (m 1 ), it tends to increase from spring to summer, decrease from summer to autumn, increase from autumn to winter, and decrease from winter to spring.The occurrences of the earthquakes are indicated along the top of the top panel in Figure 4.It appears that the first principal component does not depend on the earthquakes.Then, for the second principal component, the value of m 2 is seen to be enhanced in February, March, November and December.However, we cannot judge whether m 2 is correlated with any earthquakes or not.Finally, the third principal component (m 3 ) was enhanced in November and December, although it cannot be certain whether m 3 in Figure 4 is correlated with earthquakes, because the earthquakes for this year were too weak (magnitudes <5.5).
To study the detailed temporal evolution of m 2 and m 3 , in Figure 6 we used the conventional ratios of m 2 /m 1 (upper panel), m 3 /m 1 (lower panel) and RKp (bottom).The ratio of m 2 /m 1 shows enhancements in its value in February-April and in November-December.By comparing these temporal variations of m 2 /m 1 with the corresponding variation of RKp (geomagnetic activity) in the bottom panel of Figure 6, we can imagine that the temporal evolution of m 2 /m 1 approximately reflects the variation in the RKp index (albeit, not so perfectly).Thus, we can regard the ratio of m 2 /m 1 as an indicator of the geomagnetic activity.On the other hand, the second ratio, of m 3 /m 1 , in Figure 6 is relatively quiet during the whole of this year, as compared with the variation of m 2 /m 1 .When the earthquake epicenter was located within the Fresnel zones of a few propagation paths, we can easily identify the ionospheric perturbation.When we focus on earthquakes in Figure 1 for which two paths from the chosen three are within the relevant Fresnel zones, the ratio of m 3 /m 1 shows significant SEISMO-IONOSPHERIC PERTURBATIONS enhancements 2-10 days before an earthquake.This is true for three earthquakes out of the four.So the ratio of m 3 /m 1 is considered to reflect the earthquake signature well, even though there appears to be some effects of m 2 .
Figure 7 is the result of the PCA analysis for 2007 (November and December data are missing), the year of our main interest.This is based on the data treated as for (b) in Subsection 4.2, because we have already seen that the separation into the first to third principal components is best, as compared using the corresponding results for the previous year of 2006 for the other two data analyses, as (a) and (c) in Subsection 4.2.When we look at the temporal evolution of the first principal component (i.e., the eigenvalue of the first principal component (m 1 )) at the top panel of Figure 7, we see that the value of m 1 does not fluctuate very much, and that it has no relationship to any of the earthquakes.Next, we look at the temporal evolutions of the eigenvalues m 2 and m 3 of the second and third principal components in Figure 7.The second principal component is enhanced during January and through March, to vary similarly and to increase before the two huge earthquakes (Noto-hanto peninsula earthquake and Niigata-Chuetsu-oki earthquake; i.e., from 1~10 days before these earthquakes).
To enhance the variations of m 2 and m 3 more clearly, Figure 8 illustrates the temporal evolutions of the ratios m 2 /m 1 (top panel) and m 3 /m 1 (second panel), and of RKp (bottom panel).As was already seen for the previous year of 2006 in Figures 5 and 6, the ratio of m 2 /m 1 for 2007 is relatively disturbed over the whole year, although the ratio of m 3 /m 1 is relatively stable.In other words, the variability of m 2 /m 1 is much larger than that of m 3 /m 1 .A comparison of the temporal variation of m 2 /m 1 with the RKp index, might suggest a relatively close resemblance between these two.On the other hand, the ratio of m 2 /m 1 shows nearly no correlation with the variation of the RKp index.When we focus on the Noto-hanto peninsula earthquake (on March 26, 2007), we can see a clear enhancement in the ratio of m 3 /m 1 one to two weeks prior to the earthquake.A similar enhancement in m 3 /m 1 takes place a few days before the Niigata-Chuetsu-oki earthquake on July 16, 2007.Similar tendencies for enhancements in m 3 /m 1 are noted for some other earthquakes with smaller magnitudes, of the order of 5.0.The larger the earthquake magnitude, the larger the enhancements in the ratio of m 3 /m 1 .

Concluding remarks
To further define the presence of seismo-ionospheric perturbations on a particular subionospheric VLF/LF propagation path using our conventional night-time fluctuation analysis (trend, dispersion, and night-time fluctuation) [e.g., Muto et al. 2009, Kasahara et al. 2009, Hayakawa et al. 2010a], we proposed the use of PCA.This can make use of our advantage of simultaneously having a few propagation paths, which might be an impetus to VLF/LF propagation studies.In one sense, we would like to accumulate more evidence on the existence of seismoionospheric perturbations for huge earthquakes in Japan as case studies (in the present case, for two earthquakes, the Noto-hando peninsula earthquake and the Niigata-Chuetsuoki earthquake, both in 2007 and with magnitudes >6.0).
To cover the above two huge earthquakes as our target, we used the data from three VLF/LF propagation paths ( JJY-MSR, JJI-KCK and JJI-MSR).Initially, we compared three different pre-analysis methods for our VLF/LF data, to determine which pre-analysis is the best for the independent separation of the three eigenvalues of the PCA application.As the result, the extraction of the average amplitudes from the raw data over the whole year indicated that the three eigenvalues were well separated.Then, we found that the ratios of m 2 /m 1 and m 3 /m 1 are more suitable for the physical discussion.The ratio of 2/ 1 reflects the variations in the geomagnetic activity (RKp index) well, whereas the ratio of m 3 /m 1 appears to be an indicator of earthquakes.By looking at the temporal evolution of m 3 /m 1 , we can definitely conclude that there are definite ionospheric perturbations before these two major earthquakes of our present interest (Noto-hando peninsula earthquake and Niigata-Chuetsu-oki earthquake), that had large magnitudes (>6.0) and shallow depths.These perturbations are confirmed as precursors to these earthquakes.The values of m 3 /m 1 for these earthquakes are well pronounced due to their huge magnitudes, and we can conclude that the use of PCA based on multiple propagation paths allows us to re-confirm the presence of seismoionospheric disturbances detected by the conventional statistical approach for a particular path.Also, we can reject the anomalies seen in the conventional analysis (Figure 2), which are likely to be artificial.Finally, it becomes possible for us to identify the ionospheric perturbations in association with earthquakes with smaller magnitudes, down to M ~5, even though the peak value of m 3 /m 1 is less enhanced than that for the huge earthquakes used as our targets.

Figure 1 .
Figure 1.Relative locations of the two Japanese transmitters with call signs of JJY in Fukushima and JJI in Kyushu, and the VLF/LF observation stations: MSR (Moshiri), TYM (Tateyama, Chiba), and KCK (Kamchatka, Russia).The two large earthquakes of Noto-hanto peninsula (March 25, 2007) and Niigata-chuetsu-oki ( July 16, 2007) are indicated as the two largest circles.Other earthquakes are also shown by circles, with the earthquake epicenter at the circle center, the circle size (radius) proportional to the magnitude of the earthquake, the circle color indicating the earthquake depth, and the numerals indicating the date of occurrence.The ellipses show the wave-sensitive areas for each propagation path between the transmitters and receivers.

Figure 2 .
Figure 2. The temporal evolution for the conventional routine-based analysis for 2007.Top panel, trend; second panel, dispersion; third panel, night-time fluctuation (parameters normalized with their corresponding standard deviations); bottom panel, temporal evolution of RKp (geomagnetic activity).The two big earthquakes of interest are indicated at the top of the top panel (EQ, downward arrows).The likely seismogenic effects are marked by red arcs in the top panel, while some unlikely anomalies are indicated by question marks.

Figure 3 .
Figure 3. Temporal evolution of the eigenvalues of the three principal components (m 1 , m 2 and m 3 , as indicated) for 2006, using method (a) in Subsection 4.2.The information for the earthquakes (see Figure 1) is indicated by the marks (top panel), according to date, magnitude (length) and depth (color).

Figure 4 .
Figure 4. Temporal evolution of the eigenvalues of the three principal components (m 1 , m 2 and m 3 , as indicated) for 2006, using method (b) in Subsection 4.2.Details as for Figure 3.

Figure 5 .
Figure 5. Temporal evolution of the eigenvalues of the three principal components (m 1 , m 2 and m 3 , as indicated) for 2006, using method (c) in Subsection 4.2.Details as for Figure 3.

Figure 6 .
Figure 6.Temporal evolution of m 2 /m 1 (top), m 3 /m 1 (middle) and RKp (geomagnetic activity; bottom) for 2006.The information for the earthquakes (see Figure1) is indicated by the marks (middle panel), according to date, magnitude (length) and depth (color).

Figure 7 .
Figure 7. Temporal evolution of the eigenvalues of the three principal components (m 1 , m 2 and m 3 , as indicated) and RKp (geomagnetic activity; bottom) for 2007 (data for November and December 2007 are missing).Details as for Figure 3.