The intensity attenuation of Colfiorito and other strong earthquakes: the viewpoint of forecasters and data gatherers

This article originates from reflections on previous analyses of the probabilistic treatment of macroseismic attenuation, which showed that in Italian territory the intensity decay ∆I varies greatly from one region to another, depending on many factors, some of them not easily measurable. By applying a clustering algorithm, we classified a number of macroseismic fields drawn from the Italian felt report database into three classes. Earthquakes in the same class constituted the input of a two-step procedure for the Bayesian estimation of the probability distribution of ∆I at any distance from the epicenter, conditioned on I0, where ∆I is treated as an integer random variable following a binomial distribution. The estimated distributions were validated by forecasting the macroseismic field of the Colfiorito earthquake. This article addresses the issues left open by those statistical analyses in two ways: on the one hand, we test the procedure by forecasting the macroseismic fields of other strong earthquakes recorded in Italy during the last century; on the other hand, we ask experts in other fields to explain the peculiarities in the results. The article is hence an introductory work, an example of the possibility of, and the need for, an exchange of knowledge.

Mailing address: Dr. Renata Rotondi, CNR, Istituto di Matematica Applicata e Tecnologie Informatiche, Via Bassini 15, 20133 Milano, Italy; e-mail: reni@mi.imati.cnr.it


Introduction
In the ten years since the 1997 Colfiorito earthquake, many articles have been published on every scientific feature of that dramatic event, and in some cases this has provided the occasion for developing new methodologies. Some of the authors dealt with the attenuation problem (Rotondi and Zonno (2004)) and found that the intensity decay shown by the Colfiorito macroseismic field does not fit the average trend of the fields of other Italian earthquakes of the same epicentral intensity (IX MCS). Whether the discrepancy is due to difficulties in the damage survey or to peculiar geological/geomorphological characteristics of the location cannot be established through statistical analysis; it is a matter for geologists, historians, and other experts in collecting data.
Notwithstanding this, we sought to draw some information from a comparison with other strong earthquakes of I0 ≥ VIII that occurred in Italy in the last century. To that end, we built a data set of 19 earthquakes and applied to them the probabilistic procedure described in Zonno et al. (2008) to forecast the corresponding macroseismic scenarios. The same data set was also analyzed following the approach given in Magri et al. (1994) to provide an evaluation of the probabilistic procedure. The time period considered was chosen so as to have well-documented macroseismic fields available. Both methods study the intensity at site Is as a function of the epicentral intensity I0, in line with most of the relationships adopted in the evaluation of the most recent official hazard map of Italy (see the final report of the Project S1 (2007) in the framework of the INGV-DPC 2004-2006 Agreement). However, while in the deterministic relationships the uncertainty is generally dealt with by including in the model a Gaussian error with an assigned standard deviation, in our approach we treat Is as a random variable and let its estimated distribution express the uncertainty on its realizations.
Section 2 describes the analysis carried out. We pass the questions that arise from the examination of the results on to field experts, whom we provide with some hints from the exploratory data analysis and the validation criteria.
Section 3 offers some comments on the qualitative features of the macroseismic fields considered in the analysis.

Local intensity attenuation: Colfiorito versus other strong earthquakes of the 20th century in Italy
In Rotondi and Zonno (2004) and Zonno et al. (2008), the 1997 Colfiorito earthquake was taken as a test case to which we applied a procedure that allows us to estimate the probability distribution of the intensity at site, given the epicentral intensity and the epicentre-site distance. For the details of the method we refer to the above-mentioned articles. Following the Bayesian approach, that estimation method exploits a learning set to assign the a priori distributions of the model parameters and then updates them through data considered similar to the quantity to be estimated, as far as the issue under study is concerned. The learning set chosen in Rotondi and Zonno (2004) included the earthquakes with the same epicentral intensity belonging to the seismogenic zones of the zonation ZS4 (available at http://emidius.mi.ingv.it/GNDT/P511/home.html) numbered 28, 29, 32, 33, 34, 36, 37, 44, 45, 46, 47, 50, 51 and 52. These intermediate zones are characterized by mixed expected rupture mechanisms, and are also considered homogeneous with respect to wave attenuation, as most of the area is located south of latitude 43°30' (Gasperini (2001) and Mele et al. (1997)). In particular, we used the earthquakes of intensity IX that occurred in those zones, except zone 47, as the learning set, and the earthquake of zone 47 (1799 Camerino) for estimation. Then we forecast the macroseismic field of the 1997 Colfiorito earthquake, which falls in zone 47, according to the estimated model. The result is illustrated at the top of fig. 1. We obtained two results according to two different distributions, namely the predictive and the binomial distributions, and we compared them with the result provided by the competing method based on the logistic distribution (Magri et al. (1994)).
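As a rough illustration of the learn-then-update scheme just described, the sketch below treats the decay ∆I at a given distance as a binomial variable and updates a prior on its parameter p with observed decays. The Beta prior, the choice of I0 as the number of trials, the function names and the toy numbers are all assumptions made here for illustration; they are not taken from the cited articles, which should be consulted for the actual model.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update_beta(alpha, beta, decays, i0):
    """Posterior Beta parameters after observing decays, each in 0..i0.

    Assumes (for illustration only) that each decay is Binomial(i0, p).
    """
    successes = sum(decays)
    failures = i0 * len(decays) - successes
    return alpha + successes, beta + failures


def binomial_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def predictive_pmf(k, n, alpha, beta):
    """Beta-binomial pmf: the binomial averaged over Beta(alpha, beta)."""
    return comb(n, k) * exp(log_beta(alpha + k, beta + n - k) - log_beta(alpha, beta))


# Toy example with hypothetical numbers (not data from the article).
i0 = 9                      # epicentral intensity
prior = (2.0, 6.0)          # prior as it might come from a learning set
observed = [1, 2, 2, 3, 1]  # decays observed in one distance bin

alpha_post, beta_post = update_beta(*prior, observed, i0)
p_hat = alpha_post / (alpha_post + beta_post)  # posterior mean of p

binom = [binomial_pmf(k, i0, p_hat) for k in range(i0 + 1)]
pred = [predictive_pmf(k, i0, alpha_post, beta_post) for k in range(i0 + 1)]
```

In this sketch `binom` plugs a point estimate of p into the binomial, whereas `pred` integrates over the remaining uncertainty on p, which is one plausible reading of the difference between the "binomial" and "predictive" forecasts.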
In Zonno et al. (2008) we revisited the problem of attenuation and let the data themselves indicate the sets that are homogeneous from the attenuation viewpoint. To this aim, we considered the macroseismic fields of 55 well-documented earthquakes with intensity between VII and XI degrees of the MCS scale, covering the period from 1560 to 1980 and representative of the spatial distribution of seismicity in Italy, except for the volcanic zones. Through a clustering technique we identified three decay trends, according to which the earthquakes were classified into three groups, denoted CA, CB and CC, from the fastest trend to the slowest one. The same method for estimating the distribution of the intensity at site presented in Rotondi and Zonno (2004) was applied here to each class separately. For instance, all the earthquakes of class CA except those of epicentral intensity IX were used as the learning set, whereas those of class CA with I0 = IX were used for updating the model parameters. In this case too, the 1997 Colfiorito earthquake was not included in the data set, and hence it could be used to judge the predictive power of the procedure through both probabilistic and deterministic validation criteria: the logarithmic scoring rule based on the marginal likelihood, the odds ratio, and the absolute discrepancy between estimated and observed intensities at site. Table I summarizes the results obtained in the two analyses. We note that the estimates based on the second data set improve the forecast of all the models, but there is no clear evidence in favour of one of the three models, even if the results based on the logistic distribution seem more satisfactory. We point out that, as suggested in Magri et al. (1994), we estimated two different logistic distributions: one for the earthquakes of intensity X and XI degrees and another for those of intensity VIII and IX degrees. The graphical representation of the predicted scenario for the 1997 Colfiorito earthquake is given at the bottom of fig. 1.
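To make two of the validation criteria more concrete, the following sketch computes a logarithmic score and a mean absolute discrepancy for a forecast expressed as one probability mass function per site. The exact definitions used in Zonno et al. (2008) may differ (the scoring rule there is based on the marginal likelihood), and the odds ratio is left out because its definition is not given here. The function names, the use of the mode as the point forecast and the toy numbers are assumptions.

```python
from math import log


def log_score(pmfs, observed):
    """Mean negative log probability assigned to the observed intensities.

    Smaller values mean the forecast put more probability on what occurred.
    """
    total = sum(log(pmf[obs]) for pmf, obs in zip(pmfs, observed))
    return -total / len(observed)


def mode(pmf):
    """Most probable intensity of a pmf given as {intensity: probability}."""
    return max(pmf, key=pmf.get)


def mean_abs_discrepancy(pmfs, observed):
    """Mean |estimated - observed|, with the mode taken as the estimate."""
    total = sum(abs(mode(pmf) - obs) for pmf, obs in zip(pmfs, observed))
    return total / len(observed)


# Toy forecast for two sites (hypothetical numbers, not from the article).
pmfs = [
    {7: 0.2, 8: 0.5, 9: 0.3},
    {6: 0.6, 7: 0.3, 8: 0.1},
]
observed = [8, 7]

score = log_score(pmfs, observed)
disc = mean_abs_discrepancy(pmfs, observed)
```

The first criterion rewards the whole distribution, while the second only looks at a single estimated value per site, which is why the two can rank competing models differently.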
A similar test procedure was applied to the 1799 Camerino earthquake, and the result clearly indicated the predictive distribution as the best model. Comparing the decay trends of the Camerino (fig. 2) and Colfiorito (fig. 1) earthquakes, we noted that the latter shows a more scattered set of intensity points in the range of 10-20 km from the epicentre; as a consequence, the intensities recorded at these locations were up to four degrees lower than the epicentral intensity.
We hypothesized that the unsatisfactory result of our method in the Colfiorito case was due to this fact, and so some questions arose:
Q1. Is the shape of the macroseismic field of Colfiorito an uncommon or a usual case? Is it due to uncertainty in the damage survey, and hence in the assessment of the macroseismic intensity, or is it due to the particular geological and geographical characteristics of the epicentral area?
Q2. If this is the case, can these features be recognized in other Italian locations, so that they can be taken into account in an improved formulation of the probabilistic treatment of the attenuation?
As statisticians we do not have the tools to answer these questions, so we must call for help from experts in other fields. However, for our part we can check whether other strong earthquakes behave in the same way and give the results of our analyses to the experts as support for their research.
In order to do that, we extracted from the database DBMI04 (Stucchi et al. (2007)) the macroseismic fields of the earthquakes of the last century with intensity larger than or equal to VIII degrees and with a fairly large number of intensity points. The resulting data set is listed in table II and shown in fig. 3. We added to it the field of the 1950 Gran Sasso earthquake, recently revised (Tertulliani et al. (2006)).
First of all, we analysed the dispersion of the decay ∆I in four bins of 10 km, mainly to examine the neighbourhoods of the epicentres. We used boxplots, which are powerful graphic tools for looking at several sets of data jointly and for visualizing data summaries such as the median, the upper and lower quartiles, and the minimum and maximum data values. The boxplots we obtained are shown in fig. 4.
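The binning behind such boxplots can be sketched as follows: intensity points are grouped into four 10 km distance bins and each bin is summarized by its minimum, quartiles and maximum. This is only an illustrative sketch; the function names, the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) and the toy intensity points are assumptions, and the plotting software used for fig. 4 may compute quartiles differently.

```python
from statistics import quantiles


def bin_decays(points, i0, width_km=10.0, n_bins=4):
    """Group decays dI = i0 - Is by epicentral distance bin.

    points: iterable of (distance_km, intensity_at_site) pairs.
    Points beyond n_bins * width_km are ignored.
    """
    bins = [[] for _ in range(n_bins)]
    for dist_km, i_site in points:
        idx = int(dist_km // width_km)
        if 0 <= idx < n_bins:
            bins[idx].append(i0 - i_site)
    return bins


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


# Hypothetical intensity points (distance in km, intensity at site).
i0 = 9
points = [(3, 9), (5, 8), (8, 7), (12, 7), (15, 7),
          (18, 6), (22, 6), (27, 5), (35, 6), (38, 5)]

bins = bin_decays(points, i0)
summaries = [five_number_summary(b) for b in bins]
```

Placing the four summaries side by side is what makes features such as a first-bin inter-quartile range that excludes the null decay easy to spot.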
Each panel of fig. 4 shows four boxplots, which represent the distribution of the intensity decay at distances of 0-10, 10-20, 20-30 and 30-40 km from the epicentre, respectively. By placing the boxplots side by side, the differences in location and dispersion among the various groups of data are easily detected. In the case of the Colfiorito field, we note that: a) the inter-quartile range of the decays observed in the first bin does not include the null decay (as for the 1905/09/08 and 1936/10/18 earthquakes); b) the first and third groups of observations are scattered over four degrees; while c) 50% of the second group is concentrated on the value 2 and the fourth group ranges between the values 2 and 3. Other irregular behaviors could be: d) a non-monotonic increase of the average trend in the subsequent groups (see the 1919/06/29 earthquake); e) a large inter-quartile range (see the 1908/12/28 earthquake); f) a considerable difference between the dispersion of adjacent groups (see the 1928/03/27 earthquake).
To verify whether the above peculiarities are really signals of an atypical field, we estimated the macroseismic fields of all the earthquakes of the data set by applying both the method described in Zonno et al. (2008) and the method based on the logistic distribution, and we compared the results by means of the validation criteria previously mentioned. The results are given in table II. We point out that, in order to estimate the macroseismic fields of the earthquakes of table II according to the method in Zonno et al. (2008), we associated each of them with the attenuation class of the nearest among the earthquakes used in that analysis for identifying the attenuation trends.
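The class assignment described above can be sketched as a nearest-neighbour lookup among already classified epicentres. The text does not specify how "nearest" is measured; the sketch assumes great-circle distance between epicentres. The function names and the reference coordinates are hypothetical placeholders, not values from the data set.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_class(epicentre, classified):
    """Return the class label of the classified epicentre closest to `epicentre`.

    epicentre: (lat, lon); classified: list of (lat, lon, label).
    """
    lat, lon = epicentre
    nearest = min(classified, key=lambda e: haversine_km(lat, lon, e[0], e[1]))
    return nearest[2]


# Placeholder reference epicentres with attenuation classes (hypothetical).
classified = [
    (43.0, 13.0, "CA"),
    (38.2, 15.6, "CB"),
    (46.1, 12.4, "CC"),
]

label = nearest_class((43.1, 12.9), classified)
```

A rule of this kind is simple to apply but inherits any misclassification of the reference earthquake, which is worth keeping in mind when reading the scores in table II.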
In table II, for the macroseismic field of each earthquake, we give the score that the three validation criteria assigned to the forecast produced by each of the three distributions; the best value for each criterion is reported in bold. It can be seen that the best score is given by the predictive distribution or by the binomial distribution for at least two of the three validation criteria for 15 of the 21 earthquakes taken into account. Only for six of them, namely the 1905, 1908, 1919, 1928, 1936, and 1997 (Colfiorito) earthquakes, does the logistic distribution provide a better score for at least two of the criteria, and these are the earthquakes for which the examination of the boxplots highlighted anomalies in the macroseismic fields. For lack of space, fig. 5 reports the comparison between predicted (blue asterisks) and observed (red crosses) intensities at site only for the earthquakes that are problematic from our viewpoint.
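The "at least two of three criteria" reading of table II amounts to a simple tally, sketched below. The sketch assumes that every criterion is expressed so that a smaller value is better, which may not match how the criteria are actually reported; the function names and the scores are hypothetical.

```python
def best_model_per_criterion(scores):
    """Map each criterion to the model with the smallest value.

    scores: {criterion: {model: value}}; smaller is assumed to be better.
    """
    return {crit: min(vals, key=vals.get) for crit, vals in scores.items()}


def bayesian_models_win(scores, bayesian=("predictive", "binomial"), needed=2):
    """True if the predictive or binomial model is best on >= `needed` criteria."""
    winners = best_model_per_criterion(scores)
    return sum(1 for model in winners.values() if model in bayesian) >= needed


# Hypothetical scores for one earthquake (not values from table II).
scores = {
    "log_score": {"predictive": 1.10, "binomial": 1.25, "logistic": 1.18},
    "odds": {"predictive": 0.90, "binomial": 0.85, "logistic": 0.95},
    "abs_discrepancy": {"predictive": 0.70, "binomial": 0.72, "logistic": 0.60},
}

winners = best_model_per_criterion(scores)
regular_field = bayesian_models_win(scores)
```

Earthquakes for which such a tally favours the logistic distribution are the ones flagged as problematic in the text.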

Discussion and conclusions
Both the exploratory data analysis carried out by means of the boxplots and the probabilistic forecast agree in distinguishing the six above-mentioned earthquakes from the others as regards the attenuation trend. This leads us back to questions Q1 and Q2, and to think that the case of Colfiorito is not unusual. Comparing the pictures in figs. 1 and 5, and observing fig. 4, the macroseismic field of the Colfiorito earthquake appears similar to those of the 1905 and 1908 Calabria earthquakes and to that of the 1936 Bosco Cansiglio earthquake; these events are geographically far apart, but they may share some aspects.
Let us consider the peculiarities of the Colfiorito earthquake (Q1). It shows several aspects that can be interpreted as perturbing elements: a high data density in the epicentral area, a mountainous environment and, furthermore, the fact that it is the sum of several strong events that occurred in about ten days. A seismic sequence due to different sources, as in this case, produces a macroseismic pattern that is the overlap of several macroseismic fields around several epicentres. So the high dispersion of the intensity values of the Colfiorito earthquakes can be due to the migration of the epicentres in a short lapse of time.
Concerning the other events (Q2), we can underline that the 1919 Mugello (Tuscany, Central Italy) earthquake and the Cansiglio event (Northern Italy) share with Colfiorito the problem of high data density in a range of 10-15 km from the epicentre. It seems that the higher the density of the sample, the larger its dispersion. In fact, it is common to collect macroseismic data in great detail during the field survey, especially in higher intensity areas, in order to describe the site response well. This can generate a high dispersion of intensities within a few kilometers from the epicentre.
The Carnia event (1928/03/27), which occurred near the Italian-Austrian border, shows a very asymmetric distribution of intensity points (all Austrian data are missing), together with a high variability in intensity within a few kilometers from the epicentre, probably due to the rugged territory. The same problems probably affect the Cansiglio (Veneto, Northern Italy) earthquake itself, as its intensity points are strictly constrained by the morphology of the territory.
The 1905 Calabrian (Southern Italy) earthquake is one of the most important but atypical events that ever occurred in Italy. Its real epicentre, probably offshore (Cucci and Tertulliani (2006); Michelini et al. (2006)), is still under debate and, like almost all Calabrian earthquakes, the macroseismic field shows a high inhomogeneity due to the morphology of the territory and the shape of the Calabria region itself. Towns and villages in Calabria are indeed preferably settled along the coasts, setting up a distribution in belts that strongly constrains the macroseismic field trend. The other Calabrian event, the 1908 Messina strait (Calabria-Sicily) earthquake, displays a two-lobed macroseismic field, because the epicentre falls in the Messina strait.

Table II. List of the earthquakes considered and values (in bold the best ones) of the validation criteria applied to the three probability distributions of the intensity at site: predictive, binomial (Zonno et al. (2008)) and logistic (Magri et al. (1994)). The asterisk indicates the revised field of the 1950 Gran Sasso earthquake (Tertulliani et al. (2006)).
In conclusion, we believe it is hard to define, through a fixed set of conditions, which macroseismic fields are anomalous in terms of attenuation. Nevertheless, we must point out that the statistical analysis has highlighted some recurrent behaviors in the intensity distribution. In summary, an earthquake that occurs in a coastal area, offshore, in a mountainous region with valleys and ranges, or in a border area between different countries has a higher probability of producing an anomalous intensity field, that is, an irregular attenuation trend, especially within the first 10-20 km from the epicentre. Another condition that can lead to anomalies is the occurrence of a seismic sequence, typified by the Colfiorito earthquakes. Finally, the non-homogeneous macroseismic procedures used in collecting intensity data (see Camassi et al., this volume) are another critical factor. These last two points obviously cannot be predicted by a model.
The convergence of statistical and macroseismic studies on the same conclusions is an encouraging result. Our conclusion is that, although it is necessary to translate the richness of the real phenomena into models, it is important not to fall into the trap of overfitting, which runs counter to the very idea of modeling.

Fig. 1. Observed (red crosses) and estimated (blue asterisks) intensities of the 1997 Colfiorito earthquake: at the top the learning set is built through the zonation ZS4, at the bottom through a clustering method.

Fig. 3. Some of the earthquakes of I0 ≥ VIII that occurred in the last century, numbered according to table II; in particular, the label 21 corresponds to the epicentre of the Colfiorito earthquake.

Fig. 4. Boxplots of the 20 earthquakes that constitute the database. The box contains the middle 50% of the data. The lower and upper edges of the box mark the first and third quartiles of the data set, so that the height of the box is the inter-quartile range. The line in the box indicates the median value. Box widths are proportional to the square root of the number of observations for the box. The whiskers extend up to 1.5 times the inter-quartile range beyond the upper and lower quartiles; the values at a greater distance are plotted individually as lines and represent potential outliers.

Fig. 5. Observed (red crosses) and estimated (blue asterisks) intensities of five «problematic» earthquakes of the database under examination.

Table I. Validation criteria applied to the 1997 Colfiorito earthquake; the asterisk denotes the best result.

Fig. 2. Observed (red crosses) and estimated (blue asterisks) intensities of the 1799 Camerino earthquake: at the top the learning set is built through the zonation ZS4, at the bottom through a clustering method.