Probabilistic interpretation of «Bath’s Law»

Assuming that, in a catalog, all the earthquakes with magnitude larger than or equal to a cutoff magnitude $M_c$ follow the Gutenberg-Richter law, the compatibility of this hypothesis with «Bath’s Law» is examined. Considering the mainshock $M_0$ and the largest aftershock $M_1$ of a sequence as, respectively, the first and the second largest order statistic of a sample of independent and identically distributed exponential random variables, the distributions of $M_0$, $M_1$ and of their difference $D_1$ are evaluated. In particular, it is analyzed how the distribution of $D_1$ changes when only the sequences with the magnitude of the mainshock above a second threshold $M_c^* \ge M_c$ are considered. It results that the distributions of $M_0$, $M_1$ and $D_1$ depend on the difference $M_c^* - M_c$ and on the number of events $N$ in the sequence. Moreover, the expected value of $D_1$ increases with increasing $M_c^* - M_c$ for every value of $N$. It is then shown that «Bath’s Law» could be ascribed to the selection of data caused by the two thresholds $M_c$ and $M_c^*$, and that it is in qualitative agreement with the proposed model.


Introduction
In 1958, Richter, in a note to his book, wrote: «Dr. Bath has lately noted that in many instances the magnitude of the largest aftershock is about 1.2 less than that of the main shock» (Richter, 1958, p. 69).
Some years later, Bath (1965) confirmed this observation, asserting that, in a catalog of shallow shocks with magnitudes based on the surface-wave scale, the sample mean of the difference $D_1$ between the magnitude of the mainshock $M_0$ and that of the respective largest aftershock $M_1$ followed the rule
$$\bar{D}_1 = 1.2 \tag{1.1}$$
independently of the sequences examined. He also provided some examples for which formula (1.1) did not completely fit the data and tried to modify it. In spite of this, eq. (1.1) has since become, under the name of «Bath's Law», one of the most important and most frequently cited statistical laws concerning the distribution of the earthquakes in a seismic sequence. Several papers have been published concerning the law and many related aspects (e.g., the distributions of the mainshock and of the largest aftershock; the distribution of the events in a sequence; the relation of $M_0$, $M_1$ and $D_1$ with $b$ or with the number of events in the sequence) and, in particular, several attempts have been made to provide a statistical explanation of eq. (1.1).
Mailing address: Dr. Anna Maria Lombardi, Istituto Nazionale di Geofisica e Vulcanologia, Via di Vigna Murata 605, 00143 Roma, Italy; e-mail: lombardi@ingv.it
The most commonly accepted form for the magnitude distribution in a catalog is expressed by the well-known Gutenberg-Richter law
$$\log_{10} N(M) = a - bM \tag{1.2}$$
where $N(M)$ is the number of events with magnitude larger than or equal to $M$ (Gutenberg and Richter, 1954). This equation is equivalent to the statement that the magnitudes of the events in a catalog are independent and identically distributed random variables: their distribution is exponential with parameter $\beta = b \ln(10)$ (Ranalli, 1969). Considering that the estimated value of the parameter $b$ in eq. (1.2) is close to unity, the value of $\beta$ is about 2.3.
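As a minimal sketch of the equivalence stated above, the following Python fragment converts a b-value into the exponential rate $\beta = b \ln(10)$, draws synthetic magnitudes above a cutoff and checks the tenfold drop in the number of events per magnitude unit that eq. (1.2) predicts for $b = 1$. The function names, the cutoff of 2.0 and the sample size are illustrative assumptions and are not taken from the original analysis.

```python
import math
import random


def beta_from_b(b):
    """Exponential rate beta = b * ln(10) implied by the Gutenberg-Richter b-value."""
    return b * math.log(10)


def sample_magnitudes(n, b=1.0, m_c=2.0, rng=None):
    """Draw n magnitudes M = m_c + Exp(beta); m_c = 2.0 is an illustrative cutoff."""
    rng = rng or random.Random(0)
    beta = beta_from_b(b)
    return [m_c + rng.expovariate(beta) for _ in range(n)]


def gr_count_ratio(mags, m_low, m_high):
    """Empirical N(m_high) / N(m_low); eq. (1.2) predicts 10 ** (-b * (m_high - m_low))."""
    n_low = sum(m >= m_low for m in mags)
    n_high = sum(m >= m_high for m in mags)
    return n_high / n_low


mags = sample_magnitudes(50000, b=1.0, m_c=2.0)
print("beta for b = 1:", round(beta_from_b(1.0), 4))
print("N(3.0) / N(2.0):", round(gr_count_ratio(mags, 2.0, 3.0), 4), "(law predicts 0.1)")
```

With $b$ close to unity the rate is close to 2.3, which is the value quoted in the text.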
If $M_c$ is the cutoff magnitude (the lowest magnitude above which the data set is complete), the density function of the above-mentioned magnitudes is (Ranalli, 1969)
$$f(m) = \beta e^{-\beta (m - M_c)}, \qquad m \ge M_c. \tag{1.3}$$
Utsu (1961, 1969) noted that if the magnitude of the mainshock was included in the sample of random variables with density function (1.3), the Gutenberg-Richter law and «Bath's Law» were in strong contradiction. In fact, it is well known from the theory of order statistics that if $M_1, \ldots, M_N$ are independent and identically distributed random variables with density function (1.3), then the difference between the largest order statistic $M_{(N)}$ and the second largest order statistic $M_{(N-1)}$ is an exponential random variable with the same parameter as in (1.3), irrespective of the sample size $N$; furthermore, under the same assumptions, this difference is positively correlated with $M_{(N)}$ (Feller, 1966; Utsu, 1969; Vere-Jones, 1969). Therefore Utsu pointed out that, if the mainshock was included in the random sample, the observed sample mean of $D_1$ (equal to 1.2) was considerably larger than the expected value $1/\beta$ ($\simeq 0.5$). Moreover, the positive correlation between $M_{(N)}$ and $M_{(N)} - M_{(N-1)}$, predicted by probability theory, was in disagreement with the independence implicit in «Bath's Law», and with the marked negative correlation between $D_1$ and $M_0$ that he himself had observed in the Japanese aftershock sequences (Utsu, 1961, 1969). He inferred from these results that the mainshocks had a different distribution from the aftershocks: it was not acceptable to consider the mainshock as the largest order statistic of an exponential random sample.
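The order-statistics property recalled above can be checked numerically. The sketch below is an illustration under assumed settings ($b = 1$, arbitrary sample sizes and number of trials, hypothetical function names): it draws $N$ exponential excesses over $M_c$, takes the gap between the two largest and compares its average with $1/\beta$ for several values of $N$.

```python
import math
import random


def top_two(n, beta, rng):
    """Largest and second largest of n Exp(beta) excesses over M_c (n >= 2)."""
    first = second = 0.0
    for _ in range(n):
        x = rng.expovariate(beta)
        if x > first:
            first, second = x, first
        elif x > second:
            second = x
    return first, second


def mean_gap(n, beta, trials=20000, seed=0):
    """Monte Carlo average of M_(N) - M_(N-1) over `trials` simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        m0, m1 = top_two(n, beta, rng)
        total += m0 - m1
    return total / trials


beta = math.log(10)  # corresponds to b = 1
for n in (2, 10, 50):
    print(f"N={n:3d}  mean gap={mean_gap(n, beta):.3f}  1/beta={1 / beta:.3f}")
```

Within sampling error the average gap does not change with $N$, which is the property Utsu contrasted with the observed mean of 1.2.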
The statistical interpretation of Utsu was disputed by Vere-Jones (1969), who ascribed the discrepancies observed by Utsu to a bias in the selection of data and not to intrinsic properties of the aftershocks. In fact, only the sequences with the magnitude of the mainshock above a second threshold $M_c^*$ (larger than $M_c$) were included in the data set. This selection caused, in his opinion, a different distribution of the order statistics, an expected value of $D_1$ larger than $1/\beta$ and a negative correlation between $M_0$ and $D_1$. Therefore, he concluded that «Bath's Law» and the results of Utsu were compatible with the hypothesis that the mainshock is the largest member of a sample of independent exponential random variables. He corroborated this thesis in a second work (Vere-Jones, 1975).
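The selection effect described by Vere-Jones can be illustrated with a simple rejection scheme: simulate sequences of $N$ exponential magnitudes, keep only those whose largest event exceeds $M_c$ by at least a chosen amount, and average $D_1$ over the kept sequences. The code is a sketch under assumed values ($N = 10$, $b = 1$, threshold differences of 0, 1 and 2); the names are hypothetical and the scheme is not the analytical evaluation of Appendix A.

```python
import math
import random


def top_two(n, beta, rng):
    """Largest and second largest of n Exp(beta) excesses over M_c (n >= 2)."""
    first = second = 0.0
    for _ in range(n):
        x = rng.expovariate(beta)
        if x > first:
            first, second = x, first
        elif x > second:
            second = x
    return first, second


def mean_d1_given_threshold(n, delta, b=1.0, trials=40000, seed=0):
    """Mean of D1 over simulated sequences whose mainshock excess is >= delta.

    delta plays the role of M_c^* - M_c. Returns (mean, number of kept sequences).
    Assumes at least one sequence passes the selection.
    """
    beta = b * math.log(10)
    rng = random.Random(seed)
    total, kept = 0.0, 0
    for _ in range(trials):
        m0, m1 = top_two(n, beta, rng)
        if m0 >= delta:  # selection on the mainshock magnitude
            total += m0 - m1
            kept += 1
    return total / kept, kept


for delta in (0.0, 1.0, 2.0):
    mean, kept = mean_d1_given_threshold(n=10, delta=delta)
    print(f"delta={delta:.1f}  kept={kept:6d}  mean D1={mean:.3f}")
```

With no second threshold the average stays near $1/\beta$; raising the threshold on the mainshock alone raises the average of $D_1$.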
The subsequent papers on the distribution of $D_1$ (Papazachos, 1974; Purcaru, 1974; Tsapanos, 1990) do not seem to have accepted the interpretation given by Vere-Jones, preferring the one given by Utsu. Moreover, in other papers, «Bath's Law» is mentioned as proof that the mainshock does not come from the same population as the aftershocks (Evison, 1999; Evison and Rhoades, 2001; Lavenda and Cipollone, 2000). On the contrary, in my opinion, Vere-Jones provides a satisfactory explanation of the problem.
The purpose of this paper is to extend the interpretation of Vere-Jones through a thorough mathematical analysis of «Bath's Law» and of related matters, based only upon basic elements of probability theory, and to verify whether «Bath's Law» is compatible with the hypothesis that all the events come from the same population.

Data and statistical analysis
Considering the magnitudes of the mainshock and of its aftershocks as a sample of independent and exponentially distributed random variables, it is possible to evaluate the density functions of $M_0$, $M_1$ and $D_1$ conditioned by the event $\{M_0 \ge M_c^*\}$ and the respective expected values. In Appendix A, the mathematical elaboration is explained in detail: it is shown how the conditional distributions of $M_0$, $M_1$ and $D_1$ depend on $b$, on $M_c^* - M_c$ and on the number $N$ of aftershocks in the sequence (see eqs. (A.5), (A.7), (A.8)).
To verify the reliability of the hypothesis of full compatibility between the Gutenberg-Richter law and «Bath's Law», the catalog compiled by the Southern California Earthquake Data Center was analyzed. It includes the events with magnitude equal to or larger than 2.0 that occurred in the period 1990-2001, for a total of 62 394 events. As fig. 1 shows, the Gutenberg-Richter law fits the magnitudes of the events very well and the catalog can be considered complete.
The Reasenberg algorithm (Reasenberg, 1985) was used to decluster the above-mentioned catalog. It identified 1763 clusters with aftershocks, containing 33 360 clustered events.
The b-value was estimated with the maximum likelihood method (Utsu, 1966). The value obtained was $\hat{b} = 0.8851$.
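The text does not spell out the estimator, so the following is only a sketch of the commonly quoted maximum-likelihood form, $\hat{b} = \log_{10}(e) / (\bar{M} - M_c + \Delta M / 2)$, where $\Delta M$ is the magnitude rounding step. Whether this exact form and correction were applied to the catalog is an assumption; the function name, the synthetic data and the first-order uncertainty $\hat{b}/\sqrt{n}$ are illustrative.

```python
import math
import random


def b_value_mle(mags, m_c, bin_width=0.0):
    """Maximum-likelihood b-value for the magnitudes >= m_c.

    bin_width is the magnitude rounding step; bin_width / 2 is the usual
    correction for binned magnitudes (use 0.0 for continuous magnitudes).
    Returns (b_hat, first-order standard error b_hat / sqrt(n)).
    """
    selected = [m for m in mags if m >= m_c]
    mean_excess = sum(selected) / len(selected) - (m_c - bin_width / 2.0)
    b_hat = math.log10(math.e) / mean_excess
    return b_hat, b_hat / math.sqrt(len(selected))


# Synthetic check: recover a known b-value from simulated continuous magnitudes.
rng = random.Random(0)
true_b = 0.8851
synthetic = [2.0 + rng.expovariate(true_b * math.log(10)) for _ in range(20000)]
b_hat, std_err = b_value_mle(synthetic, m_c=2.0)
print(f"b_hat = {b_hat:.4f} +/- {std_err:.4f}")
```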
The dependence of the probability distributions of $M_0$, $M_1$ and $D_1$ on $N$ implies that, to make a statistical analysis, the data must be divided into groups according to the size of the clusters.
Table I shows the results of the analysis of the catalog: the data have been divided into groups and only the results relative to all clusters and to groups with at least 30 sequences have been reported. The clusters with $M_c^* = 4.0$ and $M_c = 2.0$ have a very variable size and it is impossible to make a statistical analysis with data relative to a single value of $N$; therefore only the results relative to all clusters have been reported. Comparing the sample means of $M_0$ and $D_1$ ($\bar{M}_0$, $\bar{D}_1$), relative to clusters with number of events $N$ (foreshocks have been left out) and with thresholds $M_c$ and $M_c^*$, with the corresponding theoretical averages expected by the model ($\mathbb{E}[M_0]$, $\mathbb{E}[D_1]$), a substantial agreement can be noted: for the same value of $N$ the sample mean of $D_1$ increases with increasing $M_c^* - M_c$, and the overall mean value 1.2 can be justified only for a difference between the two thresholds equal to 2.
The table also lists the estimated values of the correlation coefficient between $M_0$ and $D_1$ (EstCorr$[M_0, D_1]$) and the corresponding values predicted by the model (Corr$[M_0, D_1]$). For the sake of brevity, not all the calculations needed to evaluate it by the model are reported. However, its theoretical values are plotted in fig. 2: $M_0$ and $D_1$ are always positively correlated and the correlation coefficient is inside the range [0.6-0.9]. The analysis of the catalog also shows that $D_1$ and $M_0$ are positively correlated random variables and that the estimated values of the correlation coefficient agree with the theoretical ones. As table I shows, the overall estimated correlation coefficient between $M_0$ and $D_1$ for $M_c^* = 4.0$ and $M_c = 2.0$ is very near to 0. As I show in the following section, this result is not in contradiction with the positive values of the theoretical correlation coefficient plotted in fig. 2.
Table I. Results of the statistical analysis of the Southern California catalog divided according to the number of events in every cluster and to the difference between the two thresholds. $N$ is the number of events in the cluster (foreshocks have been left out); $N_{cl}$ is the number of clusters with number of events $N$ (only values with $N_{cl} \ge 30$ are shown); $\bar{D}_1$ is the sample mean of $D_1$; $\mathbb{E}[D_1]$ is the theoretical expected value of $D_1$; $\bar{M}_0$ is the sample mean of $M_0$; $\mathbb{E}[M_0]$ is the theoretical expected value of $M_0$; EstCorr$[M_0, D_1]$ is the estimated sample correlation coefficient; Corr$[M_0, D_1]$ is the theoretical correlation coefficient. In the columns relative to $\bar{D}_1$ and $\bar{M}_0$, the value after the symbol ± is the standard deviation of the respective variable.

Discussion
Unfortunately, it is very difficult to verify, with a rigorous statistical test, whether the distributions evaluated in Appendix A fit the data used by Utsu and other authors for their statistical analyses, above all because the sample size $N$ of every sequence is not known. However, some results presented in the works on «Bath's Law» seem to agree with the model. For example, an explanation of the results obtained by Purcaru (1974) in his analysis of Japanese and Greek data can be provided. In his paper, he observed that in both data sets $M_0$ followed the exponential distribution with the same parameter as the distribution of general earthquakes. Vere-Jones (1969) showed that this result was incompatible with probability theory (the maximum of a sample of $N$ independent and identically distributed random variables does not follow the same distribution as the elements of the sample) and that the distribution of $M_0$ was only asymptotically (i.e. when $M_c^* \to +\infty$) an exponential one. Figure A.1a-e shows that, for every value of $N$, $M_0$ is not an exponential random variable and that the mainshocks have a different distribution from that of any random variable in the sample: in fact $M_0$ is not a generic variable of the sample, but the largest one. However, the distribution of $M_0$ converges rather fast to an exponential when $M_c^* - M_c \to +\infty$, and the model is therefore completely in agreement with their conclusions, considering that in the catalogs analyzed by the two authors $M_c^* - M_c$ is larger than or equal to 2. To be precise: in Vere-Jones' paper, for the Japanese catalog, $M_0 \ge 6$ and $M_1 \ge 4$; in Purcaru's paper, for the Japanese catalog, $M_0 \ge 6$ and $M_1 \ge 3.2$; in Purcaru's paper, for the Greek catalog, $M_0 \ge 5.6$ and $M_1 \ge 3.5$.
It has been observed (Solov'ev and Solov'eva, 1962; Vere-Jones, 1969; Lavenda and Cipollone, 2000) that there is a linear relation between the sample mean of the magnitude of the mainshock and the natural logarithm of the number of events (or only of the aftershocks) in the sequences of a catalog. Moreover, Vere-Jones […] -d, it is evident that the distribution of $M_0$ is not equal to that of $M_1$. Therefore, the observed differences between the two above-mentioned distributions do not necessarily have to be ascribed to «different conditions in which the mainshocks and the largest aftershocks occur» (Purcaru, 1974): $M_0$ and $M_1$ are not variables selected at random, but are respectively the first and the second largest observations of a random sample and, in agreement with the theory of order statistics, they cannot have the same distribution. Thus, even considering the Gutenberg-Richter law true for all the events, probability theory predicts that the mainshock and the largest aftershock come from different populations.
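For the unconditioned case ($M_c^* = M_c$), the linear dependence of the mean mainshock magnitude on $\ln N$ follows from a standard result: the expected maximum of $N$ independent exponential variables with rate $\beta$ is $H_N / \beta$, where $H_N$ is the $N$-th harmonic number and $H_N \approx \ln N + \gamma$. The sketch below, with hypothetical function names, compares the exact value with the logarithmic approximation; it does not reproduce the conditional expectation of eq. (A.6).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max(n, b=1.0, m_c=0.0):
    """Exact E[M_(N)] = m_c + H_N / beta for N i.i.d. exponential magnitudes."""
    beta = b * math.log(10)
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return m_c + harmonic / beta


def log_approximation(n, b=1.0, m_c=0.0):
    """Large-N approximation m_c + (ln N + gamma) / beta, linear in ln N."""
    beta = b * math.log(10)
    return m_c + (math.log(n) + EULER_GAMMA) / beta


for n in (2, 10, 100, 1000):
    print(n, round(expected_max(n, m_c=2.0), 4), round(log_approximation(n, m_c=2.0), 4))
```

The slope with respect to $\ln N$ is $1/\beta = \log_{10}(e)/b$, so it is fixed by the b-value alone.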
As regards the distribution of $D_1$, Purcaru (1974) showed in his paper that it was not an exponential one: in fact the histograms of the differences between the mainshocks and the respective largest aftershocks for the catalogs tested by him did not agree with the exponential density. Moreover, the estimated values of the coefficient of variation (0.52 for Japan and 0.5 for Greece) were at odds with the value (equal to 1) of the coefficient of variation of an exponential variable. Furthermore, in his conclusions, he pointed out that the observations did not, in general, confirm «Bath's Law» and that, when the same cut-off was chosen for $M_0$ and $M_1$, the distribution of $D_1$ seemed to be an exponential one with a sample mean close to 0.5 (see Purcaru, 1974: tables 7 and 8) and an estimated coefficient of variation close to 0.8. Finally, he observed that the sample mean of $D_1$ increased linearly with the difference between the cut-off values of $M_0$ and $M_1$. Similar results had been obtained by Vere-Jones (1969) in his analysis of the catalog of Japanese aftershock sequences compiled by Utsu in 1961. In fact, he had shown that when the difference between the two thresholds decreased from 2 to 0, the sample mean of $D_1$ decreased from 1.39 to […].
Figure 4 shows the coefficient of variation of $D_1$ versus the sample size $N$ for $M_c^* - M_c = 2.0$ (the difference between the two thresholds of the Japanese catalog analyzed by Utsu (1961, 1969) and by Vere-Jones (1969)), for $M_c^* - M_c = 2.1$ (the Greek catalog analyzed by Purcaru (1974)) and for $M_c^* - M_c = 2.8$ (the Japanese catalog analyzed by Purcaru (1974)): these curves could be consistent with the values, close to 0.5, of the analyses of these authors.
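The coefficient of variation discussed above can also be estimated by simulation. The sketch below is only an illustration with assumed settings ($N = 10$, $b = 1$, hypothetical function names) and a plain rejection scheme rather than the conditional density of eq. (A.8): it collects the values of $D_1$ from the sequences that pass the mainshock threshold and compares the coefficient of variation with and without the selection.

```python
import math
import random
import statistics


def simulate_d1(n, delta, b=1.0, trials=40000, seed=0):
    """Values of D1 from simulated sequences whose mainshock excess is >= delta."""
    beta = b * math.log(10)
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        excesses = sorted(rng.expovariate(beta) for _ in range(n))
        m0, m1 = excesses[-1], excesses[-2]
        if m0 >= delta:  # delta plays the role of M_c^* - M_c
            gaps.append(m0 - m1)
    return gaps


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


for delta in (0.0, 2.0):
    gaps = simulate_d1(n=10, delta=delta)
    print(f"delta={delta:.1f}  n={len(gaps):6d}  CV={coefficient_of_variation(gaps):.3f}")
```

Without selection the coefficient of variation is close to 1, as expected for an exponential variable; with a threshold difference of 2 it drops well below 1.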
The plot in fig. 5 shows, for $b$ equal to 1 and four values of $N$, the relation between the conditional average $\mathbb{E}[D_1 \mid M_0 \ge M_c^*]$ and the difference between the two thresholds $M_c^*$ and $M_c$: it is consistent with the linear relation observed by Purcaru (1974: fig. 11), considering that in both catalogs analyzed by him $M_c^* - M_c$ is larger than 2.0.
In a more recent paper, Drakatos and Latoussakis (2001), in their description of the spatial and temporal characteristics of sequences in Greece, dealt with the distribution of $D_1$. They concluded that the data were fitted by a normal distribution with an average of 0.9. Considering that the two thresholds chosen were $M_c = 3.2$ and $M_c^* = 5.0$, this result is consistent with the model: the similarity between fig. 5 of the Drakatos-Latoussakis paper and fig. A.3a-d of the present work is evident.
The only point that seems to be in disagreement with the results and conclusions of Utsu and Purcaru is the correlation coefficient between $M_0$ and $D_1$. In fact fig. 2 shows that it is always positive for every value of $N$ and $M_c^* - M_c$. The analysis of the Southern California catalog, in the previous section, also shows a positive correlation between the two variables, but the overall estimated correlation coefficient for $M_c = 2.0$ and $M_c^* = 4.0$ is 0.08. To weigh the compatibility of this value with the model, a simulation was made of 1000 groups of 142 clusters of independent exponential variables, with the same sizes as the clusters identified in the California catalog and the same b-value. The values obtained for the correlation coefficient are inside the range [−0.010, 0.522]; moreover, 15 of them are less than 0.08 and 190 are less than 0.2. Thus, considering all clusters irrespective of their size, the value of the correlation coefficient between $M_0$ and $D_1$ can be rather low. It is not easy to justify this result with the proposed model: it becomes necessary to consider a distribution function for the cluster size $N$ and to study the distribution of $M_0$ and $D_1$ for a generic cluster (irrespective of $N$). However, it is not impossible for $M_0$ and $D_1$ to have a correlation coefficient very near to 0 or even negative, as for the data analyzed by Utsu and Purcaru. Moreover, as the same authors noted, for some clusters of their catalogs the values of $D_1$ were not known and these clusters were therefore excluded from the statistical analysis: this selection could have caused a bias in the results.
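A simulation of this kind can be sketched as follows. The sizes of the 142 clusters in the catalog are not given in the text, so the list `SIZES` below is purely hypothetical, as are the function names; the results of this sketch are therefore not expected to reproduce the figures quoted above. Instead of rejecting sequences, each pair ($M_0$, $D_1$) is drawn directly from the conditional distribution by inverse-CDF sampling: the maximum of $n$ variables with distribution function $F$ has distribution function $F^n$, and, given the maximum, the other $n - 1$ values are independent and restricted to lie below it.

```python
import math
import random

# Hypothetical cluster sizes (142 clusters); the real sizes are not given in the text.
SIZES = [2] * 60 + [3] * 25 + [5] * 20 + [10] * 15 + [30] * 12 + [100] * 7 + [500] * 3


def sample_m0_d1(n, delta, beta, rng):
    """Draw (M0 - M_c, D1) for a cluster of n >= 2 events, given M0 - M_c >= delta."""
    cdf_delta = 1.0 - math.exp(-beta * delta)
    low = cdf_delta ** n
    w = low + (1.0 - low) * rng.random()              # F(M0)**n, uniform on [low, 1)
    f_m0 = min(w ** (1.0 / n), 1.0 - 2.0 ** -53)      # F(M0), kept strictly below 1
    f_m1 = f_m0 * rng.random() ** (1.0 / (n - 1))     # F(M1) for the largest of the rest
    m0 = -math.log1p(-f_m0) / beta
    m1 = -math.log1p(-f_m1) / beta
    return m0, m0 - m1


def pearson(xs, ys):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_group_correlations(sizes, delta=2.0, b=0.8851, groups=1000, seed=0):
    """Correlation between M0 and D1 in each of `groups` simulated sets of clusters."""
    beta = b * math.log(10)
    rng = random.Random(seed)
    corrs = []
    for _ in range(groups):
        m0s, d1s = zip(*(sample_m0_d1(n, delta, beta, rng) for n in sizes))
        corrs.append(pearson(m0s, d1s))
    return corrs


corrs = simulate_group_correlations(SIZES)
print("range:", round(min(corrs), 3), round(max(corrs), 3))
print("below 0.08:", sum(c < 0.08 for c in corrs), " below 0.2:", sum(c < 0.2 for c in corrs))
```

Replacing `SIZES` with the actual cluster sizes of a catalog would be the natural way to adapt the sketch to real data.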
In my opinion, there are too many factors that influence the results: the declustering algorithm used, the precision in the estimate of the magnitudes and hence of the b-value, and the number of events not recorded (it is known that soon after a mainshock of high magnitude many aftershocks are not recorded). The question needs further investigation but, in my opinion, the analysis of the data alone, without hypotheses about the magnitude distribution, makes it evident that the value 1.2 of «Bath's Law» is justified only if the catalog is selected with thresholds that differ by 2.0 or more. Moreover, it does not seem to me that the previous studies exhaustively justify the incompatibility between the Gutenberg-Richter law and «Bath's Law».

Conclusions
Considering the Gutenberg-Richter law true for all the earthquakes of a catalog and using only the results of probability theory on order statistics, the distributions of $M_0$, $M_1$ and $D_1$ conditioned by the event $\{M_0 \ge M_c^*\}$ are evaluated. As shown in the previous sections, most of the results obtained in the past on «Bath's Law» are consistent with the probabilistic analysis expounded in the second section. In particular, assuming that the mainshock is the largest event of a sample of independent and identically distributed exponential random variables, it results that:
1) $M_0$ converges in distribution to an exponential variable when $M_c^* \to +\infty$: for low values of $N$ the convergence is very fast and this could justify the frequent conclusion that the mainshock magnitude is an exponential variable.
2) $M_0$ and $M_1$ are random variables with different distributions, not because the mainshocks have a different nature from the aftershocks, but because the order statistics, unlike the non-ordered random variables of the sample, are not independent and identically distributed.
3) The choice of the two thresholds $M_c$ and $M_c^*$ is crucial for the distributions of $M_0$, $M_1$ and $D_1$. Moreover, these distributions depend on $N$ and, of course, on $b$. In particular, when $M_c^*$ is larger than $M_c$, $D_1$ is not an exponential variable, its expected value is higher than $1/\beta$ and it could be consistent with «Bath's Law».
4) There is a substantial agreement between the theoretical model presented and both the results obtained in the past on the distributions of $M_0$, $M_1$ and $D_1$ and those obtained by the analysis of the Southern California catalog. In particular, the hypothesis of Vere-Jones, i.e. that «Bath's Law» can be ascribed to the selection of data caused by the choice of the threshold of the mainshocks $M_c^*$, seems to be confirmed.
5) Even if the model predicts a positive correlation between $M_0$ and $D_1$ for every value of $M_c^* - M_c$ and $N$, it has been shown by simulations that for all clusters, irrespective of $N$, the value of the correlation coefficient can become near to 0 or negative. This result could justify the negative or low correlation observed by Utsu and Purcaru, considering also that, for the catalogs analyzed by them, the values of $D_1$ were not all known and that a selection of data was made.
The purpose of the present study was to show that «Bath's Law» is justified only if catalogs are suitably selected and that it does not seem to be in disagreement with the very simple hypothesis that all the events follow the Gutenberg-Richter law. Of course, further analyses must be made on this matter. First of all, the model has to be tested on a suitable data set: in fact, the dependence on $N$ of the distribution of $D_1$ in the model requires that it be tested on a catalog with many aftershock sequences in which the size of every sequence is known. As observed in […] of the magnitude scale used. In conclusion, the same probabilistic elaboration could be made using distributions other than the exponential one for the magnitudes of a catalog (e.g., the Kagan distribution; Kagan, 1997).

Appendix A. Mathematical background.
Let $M_1, \ldots, M_N$ be $N$ independent and identically distributed random variables with density function (1.3) and let $M_{(1)} \le \ldots \le M_{(N)}$ be the corresponding order statistics; then (Feller, 1966; Casella and Berger, 1990) $M_{(1)}, \ldots, M_{(N)}$ are not independent, the density function of $M_{(i)}$ is
$$f_{M_{(i)}}(x) = \frac{N!}{(i-1)!\,(N-i)!}\, F(x)^{i-1}\, [1 - F(x)]^{N-i}\, f(x)$$
and the joint density function of $M_{(i)}$ and $M_{(j)}$, with $i < j$, is
$$f_{M_{(i)}, M_{(j)}}(x, y) = \frac{N!}{(i-1)!\,(j-i-1)!\,(N-j)!}\, F(x)^{i-1}\, [F(y) - F(x)]^{j-i-1}\, [1 - F(y)]^{N-j}\, f(x)\, f(y), \qquad x < y,$$
where $f$ is the density function (1.3) and $F(x) = 1 - e^{-\beta (x - M_c)}$ is the corresponding distribution function.
Let us consider the random variables $D_i = M_{(N-i+1)} - M_{(N-i)}$, $i = 1, \ldots, N$ (with $M_{(0)} = M_c$); then $D_1, \ldots, D_N$ are independent exponential random variables with parameters $\beta, 2\beta, \ldots, N\beta$, respectively. Therefore the distribution of $D_1$ is independent of the sample size $N$ and its expected value is $1/\beta$.
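The statement that the spacings between consecutive order statistics of an exponential sample are exponential with parameters $\beta, 2\beta, \ldots, N\beta$ can be checked with a short simulation. The sketch below assumes $b = 1$, $N = 5$ and $M_c = 0$ (only excesses over the cutoff matter) and uses hypothetical names; it compares the average of each spacing with $1/(i\beta)$.

```python
import math
import random


def spacing_means(n, beta, trials=20000, seed=0):
    """Average of D_i = M_(N-i+1) - M_(N-i), i = 1..n, with M_(0) = M_c taken as 0."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(trials):
        xs = sorted((rng.expovariate(beta) for _ in range(n)), reverse=True)
        xs.append(0.0)                    # M_(0) = M_c, i.e. zero excess
        for i in range(n):
            sums[i] += xs[i] - xs[i + 1]  # contribution to D_{i+1}
    return [s / trials for s in sums]


beta = math.log(10)  # corresponds to b = 1
for i, mean in enumerate(spacing_means(5, beta), start=1):
    print(f"D_{i}: simulated mean={mean:.4f}  1/(i*beta)={1 / (i * beta):.4f}")
```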
In fig. 2 the theoretical values of the correlation coefficient between $M_0$ and $D_1$ are plotted versus $N$ (for $N \le 100$) for four values of $M_c^* - M_c$ and for $b = 1$.

Fig. 2. Correlation coefficient between $M_0$ and $D_1$, relative to the conditional distributions of Section 2, versus the sample size $N$ for $M_c^* = M_c$, $M_c + 1$, $M_c + 2$, $M_c + 3$.

Fig. 3. Conditional expected value of $M_0$ by the event $\{M_0 \ge M_c^*\}$ versus the natural logarithm of the sample size $N$ for $M_c^* = M_c$, $M_c + 1$, $M_c + 2$, $M_c + 3$.

Fig. 4. Coefficient of variation of $D_1$ relative to the conditional density function of eq. (A.8) versus the sample size $N$ for $M_c^* - M_c = 2.0$ (value of the difference between the two thresholds of the Japanese catalog analyzed by Utsu (1961, 1969) and by Vere-Jones (1969)), for $M_c^* - M_c = 2.1$ (value of the difference between the two thresholds of the Greek catalog analyzed by Purcaru (1974)) and for $M_c^* - M_c = 2.8$ (value of the difference between the two thresholds of the Japanese catalog analyzed by Purcaru (1974)).

Fig. 5. Conditional expected value of $D_1$ by the event $\{M_0 \ge M_c^*\}$ versus the difference between the two thresholds $M_c^* - M_c$ for $N$ = 2, 10, 100, 1000.

Fig. A.1a-e. a) Conditional density function of $M_0$ by the event $\{M_0 \ge M_c^*\}$ (see eq. (A.5)) for $M_c^* = M_c$, $M_c + 1$, $M_c + 2$, $M_c + 3$, and for $N = 2$. b) The same as (a) but for $N = 10$. c) The same as (a) but for $N = 100$. d) The same as (a) but for $N = 1000$. e) Conditional expected value of $M_0$ by the event $\{M_0 \ge M_c^*\}$ (see eq. (A.6)) versus the sample size $N$ for four values of the difference between $M_c^*$ and $M_c$ ($M_c^* - M_c$ = 0, 1, 2, 3).
Values of the expected value of […] conditioned by the event $\{M_0 \ge M_c^*\}$ for some values of $b$, $N$, […]