Analysis of soil radon data in earthquake precursory studies

Soil radon data were recorded at two selected sites along Mat fault in Mizoram (India), which lies in the highest seismic zone in India. The study was carried out during July 2011 to May 2013 using LR-115 Type II films. Precursory changes in radon concentration were observed prior to some earthquakes that occurred around the measuring sites. Positive correlation was found between the measured radon data and the seismic activity in the region. Statistical analysis of the radon data together with the meteorological parameters was done using Multiple Regression Method. Results obtained show that the method employed was useful for removing the effect of meteorological parameters and to identify radon maxima possibly caused by seismic activity.


Introduction
Variation of radon concentration in soil can give evidence on tectonic disturbances in the Earth's crust [King 1986].However, the changes in radon concentration are not only controlled by seismic activity but also influenced by meteorological parameters like soil moisture, rainfall, temperature, barometric pressure and relative humidity [Kraner et al. 1964, Pearson 1967, Singh et al. 1988, King and Minissale 1994, Virk et al. 2000, Zmazek et al. 2003, Walia et al. 2005].These parameters change the physical characteristics of the soil and thus affect the soil radon variation caused by geophysical processes.Therefore, theoretical and empirical algorithm is necessary for removing the effects of meteorological parameters [Zmazek et al. 2003, Ramola et al. 2008, Choubey et al. 2009].
Northeast India is seismically one of the six most active regions of the world, the other five being Mexico, Taiwan, California, Japan and Turkey.It lies at the junction of Himalayan arc to the north and Burmese arc to the east [Jaishi et al. 2014a, Singh et al. 2014].The high seismicity in this zone is due to the collision tectonics between the Indian plate and the Eurasian plate in the north and Subduction tectonics along the Indo-Myanmar Range (IMR) in the east [Dewey and Bird 1970].Based on the distribution of epicentres, fault plane solutions and geotectonic features, northeast region can be divided into five seismotectonic zones which include Eastern Himalayan collision zone, Indo-Myanmar subduction zone, Syntaxis zone of Himalayan arc and Burmese arc (Mishmi Hills), Plate boundary zone of the Shillong plateau and Assam valley, and Bengal basin and plate boundary zone of Tripura-Mizoram fold belt (Figure 1).The Indo-Myanmar subduction zone is highly seismic zone extending from the Bay of Bengal to the Mishmi

Subject classification:
Surveys, measurements and monitoring, Geochemical data, Statistical analysis.[Jaishi et al. 2014b].It obliquely cuts across the general north-south trend of the Indo-Burmese arc.It trends NW-SE (Figure 2) and is traceable across entire Mizoram on the satellite as well as on the geological maps.Mat River crosses the Mat fault and follows it for considerable distance.Most pronounced part of this fault is in Serchhip district along Serchhip-Thenzawl road.Hence this part has been selected for monitoring of radon anomaly and their possible correlation with the occurrence of earthquakes.Authors have done primary analysis of soil radon and thoron anomalies as a precursor to earthquakes [Jaishi et al. 2013, 2014a, 2014b, Singh et al. 2014].The present paper describes the multiple regression method to differentiate those radon anomalies which are solely caused by seismic events and not by meteorological parameters.

Materials and methods
For radon monitoring in soil, we have used LR-115(II) cellulose nitrate detector manufactured by Kodak Pathe in France, a useful solid state nuclear track detector (SSNTD) for registering the alpha tracks generating from radon gas.The LR-115 (II) films of size 3cm × 3cm was attached in the dosimeter with the sen-sitive side facing downwards for exposure.The dosimeter has been developed at the Bhabha Atomic Research Centre (BARC), India.The dosimeters were buried in holes of depth of about 80 cm in the soil (Figure 3).The detector films were exposed for a period of 15 days, after that the exposed films were removed and reinstalled with the new one in the dosimeters.The SSNTD's are etched with 10% sodium hydroxide solution in an etching bath at a temperature of 60°C for developing the registered tracks.The etching time was set for 1 hour.The track etched method removes a bulk thickness of 4µm leaving a residual thickness of 8µm.As soon as the etching process is complete, films were washed in cold running water and then kept in distilled water for two hours.The detector films are pre-sparked at a voltage of 900V to fully develop the partially etched track holes using spark counter.The tracks are then counted at a voltage corresponding to plateau region of the counter (around 450V).

Results and discussion
Radon concentration in soil gas have been measured after every 15 days using LR-115(II) films during July 2011 to May 2013 together with the meteorological parameters viz.; rainfall, relative humidity, air temperature and barometric pressure at two locations (Mat Bridge and Tuichang) in Serchhip District, Mizoram (India).Numerous criteria have been used by different authors [Rannou 1990, Price et al. 1994, King et al. 1996, Fu et al. 2005] in order to calculate soil radon gas anomalies which include mean plus 'n' (where n = 1, 2, 3,…) standard deviation (SD).Fleischer [1981] proposed the empirical relation between the magnitude (M) of the  earthquake and the epicentre distance (D) within which the earthquake prediction zone can be manifested. (1) The details of the earthquakes (www.imd.gov.in) that occurred around the study areas fulfilling Equation ( 1) is summarised in Table 1.

Multiple Linear Regression (MLR) Model
Multiple regression analysis permits the study of several independent (predictor) variables for a given dependent (criterion) variable.For 'n' predictor variables the raw score regression equation is given by where, is the predicted value of the dependent variable, a is the constant term and b 1 ,b 2 ,...,b n represents the regression coefficient of the independent variables X 1 ,X 2 ,...,X n respectively.The drawback of the raw score regression is that the size of the b-coefficients is dependent on the scale of the predictor variables.Therefore, they cannot be compared directly with each other to determine which predictor variable has the most influence on the criterion.By standardizing both the coefficients and the variables we could make direct comparison among the coefficients [Yount 2006].The standardized regression equation takes the form where, b's represents the standardized coefficients and Z's represents the Z-scores.In the present investigation we perform MLR taking soil radon concentration (Rn) as the dependent variable and the meteorological parameters viz.Rainfall (RF), relative humidity (RH), temperature (T) and pressure (P) as independent variables.The data used for analysis is reported in Table 2.

Data analysis at Mat Bridge
At Mat Bridge Rn 1 is regressed with RF, RH, T and P (taken from Table 2).All the predictor variables mentioned may not be a significant predictor of the dependent variable.To deal with this problem and to build a better model we perform backward regression method using SPSS statistical package.The backward regression method starts with all the predictor variables, test the significance of each predictor variable using a t-test and then delete the variable (if any) that improves the model the most by being deleted.This process is repeated until the maximum number of steps is reached or no more independent variable qualifies for removal [Hocking 1976].Table 3 shows the coefficients of the backward regression output.The backward regression output of the data has four distinct models (Model 1, Model 2, Model 3 and Model 4).The first model includes all the predictor variables (RF, RH, T and P).To check the significance of the predictor variables a t-test is performed at 95% confidence interval (i.e., the alpha value is set at 0.05) and its corresponding probability is shown in sig.column of the table (see Table 3).If we look at the significance of each predictor variable in Model 1, two predictor variables (T and P) are non-significant predictors of the dependent variable as the p-value (sig.) is more than a-value (0.05) in both the cases.The backward regression method eliminates one predictor variable at a time (the predictor variable with the highest p-value is deleted first).So, temperature (T) having the highest p-value (0.757) is removed from Model 1 and a second model is constructed with RF, RH and P as independent variables.In Model 2, P is a non significant predictor (since the pvalue is more than 0.05).Therefore, P is removed from the model and a third model is built with RF and RH as predictor variables.Model 3 tells us that RF is a non significant predictor (since the p-value is much more than 0.05).RF is therefore removed from the model and a fourth model is built with RH as a single predictor vari- To check how much of the criterion variability is accounted for by the predictor and the viability of the model-as-a-whole we examine the squared multiple R value and a significant F-ratio (p<0.05)respectively, which is reported in Table 4.In the model summary of Table 4, R-square and adjusted R-square is reported.The adjusted R-square is the measure of the shrinkage if we were to apply this model to another sample (in other words the amount of predictive loss that we would observe).The value of adjusted Rsquare in model 4 is 0.198 i.e. 19.8% of the variation in radon concentration is accounted by RH.By reducing the number of predictors from 4 to 1 the adjusted Rsquare dropped from 0.240 to 0.198, a change of 0.042 (or a little over 4%) which is good.We do not lose much R-square by dropping 3 of the 4 predictor variables.Also the standard error of estimates increase from 162.51 to 166.99 (only 2.68% increase) which is not much.To test the significance of the model-as-awhole an F-ratio (MS reg./MS res. ) is calculated for each model and the probability of the computed F-ratio is displayed in sig.column of Table 4.If we compare all the models, F-ratio is largest in Model 4 showing better ratio of regression to noise.Reducing the number of predictors from 4 to 1 reduces the p-value of F-ratio from 0.004 to 0.001 indicating a more significant and a better model.The raw score regression equation for Model 4 can be written as (4) .
.  and the corresponding standardized equation would be (5) where ZRn' 1 is the standardized value (Z-score) of Rn' 1 and ZRH is the Z-score of RH.We calculate the Zscores of the measured radon concentration (ZRn 1 ) and relative humidity (ZRH) during the entire period of observation and uses Equation ( 5) to find the standardized predicted value (ZRn' 1 ).When the difference (i.e. the standardized residuals) between ZRn 1 and ZRn' 1 is more than 1SD, we denote it as a possible radon anomaly.
The graphical representation of the standardized measured radon concentration and residuals during the observation period is shown in Figure 4.The radon maximum observed on 45th day of the observation period (Figure 4a) was ZRn 1 = 2.263 Tracks/cm 2 .The difference (ZRes.) between the measured and the predicted value was, 1.943 Tracks/cm 2 (close to 2SD; Figure 4b).This radon maximum could be correlated to the earthquake of which occurred on 65th day of the observation period (i.e.20 days after the anomaly was observed).The second radon peak was observed on 330th day.The difference between the measured and the predicted value was recorded to be ZRes.=1.458Tracks/cm 2 , which is greater than 1SD.The second peak may be correlated to the earthquake of M5.8 which occurred on 353rd day (23 days after the anomaly was observed).The third anomalous change in radon concentration was recorded on 375th day.The difference between the measured and the predicted value during this period was recorded to be 1.847 Tracks/cm 2 (much greater than 1SD).An earthquake of M6.0 occurred only 6 days after the anomaly was observed.So, the third radon maximum may be correlated to the mentioned M6.0 quake.The fourth maximum was recorded on 435th day.The difference in the concentration was recorded to be 1.305 Tracks/cm 2 (just above 1SD level).Though an earthquake of M6.7 occurred several days after the anomaly was observed, it was quite difficult to directly correlate the radon maximum to the mentioned event as the time interval between the anomaly observed and the event was quite large (51 days).In other words, this radon maximum may be treated as a false anomaly  which may have occurred due to variation in relative humidity.For the radon maximum that occurred on 540th day, the residual was 1.620 Tracks/cm 2 which is well above the 1SD level.This may be treated as a precursory signal before the event of M5.9 which occurred on 545th day (5 days after the anomaly was observed).

Data analysis at Tuichang
As in case of Mat Bridge we perform backward regression taking Rn 2 (measured radon concentration at Tuichang) as dependent variable and RF, RH, T and P as independent variables (data taken from Table 2).The SPSS print out of the results (coefficients) is shown in Table 5.The output of the backward regression method produces three models namely Model 1, Model 2 and Model 3. Model 1 includes all the predictor variables.Temperature (T) is having a p-value of 0.753 which is greater than the a-value (0.05); therefore T is removed from Model 1 to construct Model 2 with RF, RH and P as independent variables.It is clear from the sig.column of Table 5 that P is a non-significant predictor (since the pvalue >0.05).Consequently, P is removed from the Model and a third Model is established with two predictor variables (RF and RH).Both RF and RH are significant predictors since the p-value in both the cases is less than 0.05.The model summary and the analysis of variance are given in Table 6.To verify the significance of the model-as-a-whole, first we look at the adjusted Rsquare value.The adjusted R-square value is 0.178 (i.e.17.8 % of the variation in radon concentration is explained by RF and RH).Reducing the number of parameters from 4 to 2 does not reduce the adjusted R-square value.In fact, the value increases from 0.148 to 0.178 (i.e.16.86% increase).Second, there is no increment in standard error of estimates by reducing the predictor variables which is a good indication that Model 3 is better.Third, the F-ratio is largest in Model 3 (i.e. a better ratio of regression to noise) and the p-value of F-ratio is least.In conclusion Model 3 is the most significant and a bet-ter model.The raw score and the Z-score regression equations for Model 3 can be written as where ZRn' 2 , ZRF and ZRH are the standardized value of Rn' 2 , RF and RH respectively in units shown in Table 2.
The plot of standardized measured radon concentration and residuals during the observation period at Tuichang is shown in Figure 5.The first radon maximum was recorded on 90th day of the observation period (Figure 5a).During this period the difference between the  (6) measured and the predicted value of radon concentration was, ZRes.= 1.066Tracks/cm 2 (just above 1SD; Figure 5b).This radon maximum was quite difficult to directly correlate to any seismic event (as the time interval between the radon maximum and the closest earthquake M5.8 was 39 days).So, the first radon maximum may be treated as a false anomaly possibly may have caused by variation in meteorological parameters (RF and RH).The next increment in radon concentration was recorded on 360th day.The difference between the measured and the predicted value was recorded to be 1.902 Tracks/cm 2 (close to 2SD).This peak may be correlated to the event of M6.0 which occurred 21 days after the anomaly was observed.Third change in radon concentration was observed on 495th day.But it was quite difficult to correlate this peak to the event of M6.7 as during this period the difference between the measured and the predicted value was quite low (ZRes.= 0.338 Tracks/cm 2 ; less than 1SD).Next impulsive change was recorded on 540th day of the observation period.The difference between the measured and predicted value during this period was 1.934 Tracks/cm 2 (close to 2SD).The mentioned peak may be treated as a precursor signal before the earthquake of M5.9 which occurred 5 days after the signal.

Conclusion
We have performed statistical analysis of the radon data collected from July 2011 to May 2013 at two sites (Mat bridge and Tuichang) along Mat Fault, the most prominent fault in Mizoram.The multiple regression analysis of the radon data with the meteorological parameters provides useful evidence to extract a radon anomaly possibly caused by seismic event.The results obtained shows that at Mat Bridge about 57% of the anomaly observed could be correlated to the seismic events whereas at Tuichang 28.6% could be correlated, indicating that the radon activity in soil gas is mostly controlled by geophysical process.The precursor time between the observed radon anomalies and the earthquakes varies from 5-23 days.High percentage of correlation between radon anomalies and the earthquakes at Mat Bridge indicates that the area under investigation is seismically active and would be useful to carry out further studies.Several spike-like radon anomalies observed prior to the earthquakes in the present investigation may provide soil radon database and the preliminary trend of their response to earthquake activity in the region.The Backward regression method applied gives valuable information about the meteorological parameter(s) that have the strongest effect on radon concentration.However, long term and continuous data recording is necessary to confirm a relation between soil radon activity and earthquakes.

Figure 1 .
Figure 1.Seismo-tectonic map of North East India showing epicentres of damaging earthquakes (modified after Raghukant and Dash [2009]).Tectonic zones (zones A, B, C, D and E) and major thrusts (MBT-Main Boundary Thrust; MCT-Main Central Thrust; NT-Naga Thrust; DT-Disang Thrust) are also shown.Thick blue line represents the International boundary of North Eastern part of India bordering Myanmar and Bangladesh (not to scale) and red lines represents faults and thrusts.

Figure 3 .
Figure 3. Schematic description of the equipment used for radon measurements in soil.Figure 2. Location map of study area.

Figure 2 .
Figure 3. Schematic description of the equipment used for radon measurements in soil.Figure 2. Location map of study area.

Figure 4 .
Figure 4. Variation of (a) Standardized measured radon concentration and (b) Standardized residuals during the observation period at Mat Bridge.The solid vertical lines represent the earthquakes along with their magnitudes (M).

Figure 5 .
Figure 5. Variation of (a) Standardized measured radon concentration and (b) Standardized residuals during the observation period at Tuichang.The solid vertical lines represent the earthquakes along with their magnitudes (M).

Table 1 .
Details of earthquakes that occurred around the study area during the period of investigation (earthquake data obtained from: www.imd.gov.in).

Table 2 .
Radon concentrations and the meteorological parameters during the period of observation at Mat Bridge and Tuichang.

Table 4 .
Model summary and analysis of variance (Mat Bridge).

Table 3 .
Coefficients of the Backward regression output at Mat Bridge.

Table 5 .
Coefficients of the Backward regression output at Tuichang.

Table 6 .
Model summary and analysis of variance (Tuichang).