Daily earthquake forecasts during the May-June 2012 Emilia earthquake sequence (northern Italy)

On May 20, 2012, at 02:03 UTC, a magnitude Ml 5.9 earthquake hit part of the Po Plain area (latitude, 44.89 ˚N; longitude, 11.23 ˚E) close to the village of Finale-Emilia in the Emilia-Romagna region (northern Italy). This caused a number of human losses and significant economic damage to buildings, and to local farms and industry. This earthquake was preceded by an increase in the seismicity the day before, with the largest shock of Ml 4.1 at 23:13 UTC (latitude, 44.90 ˚N; longitude, 11.26 ˚E). It was then followed by six other Ml 5.0 or greater events in the following weeks. The largest of these six earthquakes occurred on May 29, 2012, at 07:00 UTC (Ml 5.8), and was located 12 km southwest of the May 20, 2012, main event (latitude, 44.85 ˚N; longitude, 11.09 ˚E), resulting in the collapse of many buildings that had already been weakened, a greater number of victims, and most of the economic damage (see Figure 1). This sequence took place in one of the Italian regions that is considered to be at small-to-moderate seismic hazard [Gruppo di Lavoro MPS 2004]. Earthquakes of the M6 class have occurred in the past in this zone [Gruppo di Lavoro CPTI 2004], but with a much smaller time frequency with respect to the most seismically hazardous parts of Italy. […]


Introduction
On May 20, 2012, at 02:03 UTC, a magnitude M L 5.9 earthquake hit part of the Po Plain area (latitude, 44.89 ˚N; longitude, 11.23 ˚E) close to the village of Finale-Emilia in the Emilia-Romagna region (northern Italy).This caused a number of human losses and significant economic damage to buildings, and to local farms and industry.This earthquake was preceded by an increase in the seismicity the day before, with the largest shock of M L 4.1 at 23:13 UTC (latitude, 44.90 ˚N; longitude, 11.26 ˚E).It was then followed by six other M L 5.0 or greater events in the following weeks.The largest of these six earthquakes occurred on May 29, 2012, at 07:00 UTC (M L 5.8), and was located 12 km southwest of the May 20, 2012, main event (latitude, 44.85 ˚N;longitude, 11.09 ˚E), resulting in the collapse of many buildings that had already been weakened, a greater number of victims, and most of the economic damage (see Figure 1).This sequence took place in one of the Italian regions that is considered to be at small-to-moderate seismic hazard [Gruppo di Lavoro MPS 2004].Earthquakes of the M6 class have occurred in the past in this zone [Gruppo di Lavoro CPTI 2004], but with a much smaller time frequency with respect to the most seismically hazardous parts of Italy.To date, it appears that the largest amount of damage was caused to structures that were not built according to the building code, which was established after the publication of the latest Italian seismic hazard map [Gruppo di Lavoro MPS 2004] and become effective only after the L'Aquila 2009 earthquake.This highlights even more that our best defense against earthquakes is to build constructions according to a sound building code.Having said that, we also argue that the building code is not the only defense that we can have against earthquakes, and that different kinds of mitigation actions, more directed to reduce human loss that damage, can be taken over shorter time intervals [e.g. van Stiphout et al. 2010, Jordan et al. 2011].
Traditionally, seismic hazard maps forecast the expected ground motion in a time interval of 50 years or longer.The underlying earthquake occurrence process is presumed to be time independent [Cornell 1968], and the mean seismic hazard rate is expected to remain constant through time.Nonetheless, we know that the earthquake occurrence process has significant time variability in the seismic rate; these variations are much larger then would be anticipated with a pure time-independent process.The clearest of these variations is the time and space clustering of seismicity.An earthquake suddenly alters the dynamic conditions within the nearby fault systems, which can lead to future earthquakes over a time scale of days, and up to a few years.
The use of time-dependent models for practical purposes has been introduced recently in Italy [Marzocchi et al. 2012a] to forecast seismicity over a 10-year time window.This model was requested by the Civil Protection Authorities to select areas where building retrofitting is more urgent.This mitigation action comes from a decision of the Italian Government that provided funding at the beginning of 2010 for the reduction of seismic risk based on medium-term time scales (Interventi per la prevenzione del rischio sismico).
Time variations of the seismic rate are more evident over the short-term (days to weeks), and in particular after a large earthquake.The modeling of these time variations and its possible use for practical purposes was discussed in depth by the International Commission on Earthquake Forecasting for the Italian Civil Protection [Jordan et al. 2011], which was nominated by the Italian government after the L'Aquila 2009 earthquake.This Commission had two main tasks: (i) to report on the current state of knowledge of short-term prediction and forecasting of tectonic earthquakes; and (ii) to indicate guidelines for the use of possible forerunners of large earthquakes to drive Civil Protection actions, including the use of probabilistic seismic hazard analysis in the wake of a large earthquake.
The Commission recommended the development of operational earthquake forecasting (OEF), and of quantitative and transparent decision-making protocols that encompass mitigation actions at different impact levels, with each one associated with the surpassing of a specific probability threshold.In essence, OEF comprises procedures for gathering and disseminating authoritative information about the time dependence of seismic hazards, to help communities to prepare for potentially destructive earthquakes [Jordan et al. 2011].
To date, seismologists have already developed time-dependent models based on earthquake clustering, and these have been used to track the evolution of aftershock sequences in real-time.Here, we apply two versions of the Epidemic Type Aftershock Sequence (ETAS) model to the seismic sequence of the May-June 2012 Emilia earthquakes (see Figure 1a for the area considered), using the real-time earthquake data recorded by the Istituto Nazionale di Geofisica e Vulcanologia (INGV; National Institute of Geophysics and Volcanology) seismic network.The models were described by Falcone et al. [2010] (and references therein) and Lombardi and Marzocchi [2010a] (and references therein).They are thus referred to here as the FCM and LM models, respectively.Specifically, we consider the first three weeks of the Emilia 2012 sequence (May 19 to June 12, 2012), when most of the largest earthquakes occurred (see Supplementary Information for maps).For both models, we show the preliminary results of the daily forecasts of the number of events with M L ≥2.5 and the daily occurrence probability of one or more events with M L ≥4.0.

Short-term earthquake forecasts: the models
The short-term forecasting models so far proposed are of four types: (i) The Reasenberg and Jones [1989] model, which is based on the temporal clustering of earthquakes and which was used to forecast the aftershocks of the Octo- ber 17, 1989, Loma Prieta earthquake (see Lolli and Gasperini [2003] for a retrospective application to the Italian territory); (ii) The ETAS models [e.g., Ogata 1988, Ogata 1998, Console and Murru 2001, Zhuang et al. 2002, Console et al. 2003, Wossner et al. 2011], which were used to forecast aftershocks in real-time after the L'Aquila earthquake [Marzocchi and Lombardi 2009], and to set up a short-term forecasting model for the whole Italian territory [Murru et al. 2009, Falcone et al. 2010, Lombardi and Marzocchi 2010a]; (iii) The Short Term Earthquake Probability model [Gerstenberger et al. 2005], which was used to forecast the daily ground shaking in New Zealand after the September 3, 2010, Darfield earthquake; (iv) The Agnew and Jones model [Agnew and Jones 1991], which is mostly used to forecast mainshock occurrences, and which has been adopted by the California Earthquake Prediction Evaluation Committee.
The Reasenberg and Jones [1989] model is based only on temporal distribution after a main shock, while the other three models also include the spatial distribution.
The ETAS models are probably the most diffuse class of models for daily and weekly forecasts, as testified by the many that have been applied in different Collaboratory for the Study of Earthquake Predictability (CSEP) experiments in Japan, California (USA) and Italy [Falcone et al. 2010, Lombardi and Marzocchi 2010a, Ogata 2011, Zhuang 2011].For longer forecasting time windows, models mostly based on a single Omori Law decay can work reasonably well [see Lolli et al. 2011 for a more elaborated model].The ETAS models are based on simple physical components: a spatially variable seismic background that is characterized by a stationary Poisson distribution in time, and radially symmetric triggering.The ETAS models are generically described by an equation of this form: (1) where m is the rate of magnitude m events expected in the location , at time t and of magnitude m, and M min is the minimum magnitude considered.The first term represents the background varying with space (and not in time), while the summation takes into account the triggering effects of all the previous earthquakes, as a function of the distance, elapsed time, and magnitude of the triggering event.The last factor outside the square brackets is the frequency-magnitude law, which is supposed to be independent from space and time.Once has been estimated, the probability P of one event at least in a specific time-space-magnitude window Ω, is: (2) In our real-time application, we provide the daily probability for M L 4 events or larger, and for M L 5.5 events or larger, in the region shown in Figure 1b.We adopt two different and independent versions of the ETAS-class models.The detailed descriptions of these were reported in Falcone et al. [2010] (the FCM model) [Murru et al. 2009, Console et al. 2010a, Console et al. 2010b] and in Lombardi and Marzocchi [2010a] (the LM model).Both of these models have been submitted to the CSEP experiment that is now running in Italy.Since the CSEP experiment covers the whole of the Italian area, the real-time application for aftershock sequences in small areas can require some optimization/modifications of the ETAS models.This optimization aims to account for the spatial variability of the completeness magnitude [see Schorlemmer et al. 2010], the specific model parameters of the area considered [see, e.g., Lolli and Gasperini 2003], and the increase in M min soon after the occurrence of a large shock [e.g.Gasperini and Lolli 2009].In the following, we briefly describe these models, with emphasis on the differences among them and the modifications made with respect to the model used for the CSEP experiment that was carefully described in Lombardi and Marzocchi [2010a] and Falcone et al. [2010].
For the LM model, Equation (1) reads as: (3) where o is the Poisson background seismic rate, is the spatial distribution of the background seismicity, a is the coefficient of the exponential magnitude productivity law, k is the productivity coefficient, c is the time constant of the generalized Omori Law, p is the exponent of the generalized Omori Law, d is a characteristic distance for triggering, r i is the distance with the i-th past earthquake, c dqc is a normalization factor for the spatial distribution, b is the parameter of the probability density function derived from the Gutenberg-Richter law.
For the FCM model, equation (1) reads as: (4) where o is the average proportion of background events to the total number of events, is the average spatially variable rate of seismicity, r i and a have the same meaning as for Equation (3), b is the b-value of the Gutenberg-Richter law, and the average triggering distance of the aftershock zone d i is related to the magnitude of the triggering event, by the / equation [Zhuang et al. 2004[Zhuang et al. , 2005]], . For both of the models, the set of free parameters are estimated by applying a maximum-likelihood procedure to the local seismicity observed before May 20, 2012.The parameters are reported in Table 1.
It is worth noting that the different parameterization adopted by the two models makes it difficult to compare the parameters, like the productivity k and the background rate.Some important differences between the models are worth stating: (i) In the LM model, the function represents the spatial probability density function of background events, and it is estimated using the iteration algorithm developed by Zhuang et al. [2002], while in the FCM model, represents the average spatially variable seismic rate of the area, and it is computed using the method introduced by Frankel [1995].Moreover, the parameter o indicates the rate of background events in the whole area in the LM model, whereas it indicates the proportion of background events in the FCM model.(ii) The LM and FCM models assume different parametrizations for the dependence on the magnitude of the triggering earthquakes.Specifically, for the LM model, the parameters a and c determine how the triggering capability and the characteristic distance depend on the magnitude of the triggering earthquake, respectively.For the FCM model, the dependence on the magnitude of the triggering earthquakes is included in the spatial response function through parameter a alone.(iii) For the FCM model first, the integral in Equation ( 2) is replaced by the instantaneous rate calculated in the middle of the forecasting time window, considering the triggering effect of the earthquakes occurred before the time t, and without taking into account the contribution of unknown events that occurred inside.The model LM adds the contribution of unknown events that occurred during the forecast period to the triggering effect of all of the real events that occurred up to the start of the forecasting time window.The latter is computed as the median of the rates coming from 1,000 different simulated catalogs.(iv) For the LM model, the confidence interval reported in Figure 2 is computed as the 15-th and 85-th percentiles of the rates obtained by the 1000 simulations [see also Lombardi and Marzocchi 2010b].The 70-th confidence interval is expected to mimic the 1-v confidence interval.The confidence interval of the expected number of events for the FCM model is computed by assuming a Poisson distribution of the number of events in each forecasting period.In this case, for large mean m, the distribution is well approximated by a normal distribution with standard deviation m 0.5 .Then, for each forecast period, the confidence interval is computed as [m-m 0.5 , m+m 0.5 ].
To cope with the strong variation of M min immediately after a large shock, the LM model includes an empirical correction; in particular, instead of using the Gutenberg-Richter relationship obtained before May 20, 2012, the LM model uses the frequency-magnitude distribution that is observed before the time t of the start of the forecast.Therefore, it is implicitly assumed that any discrepancy from a Gutenberg-Richter Law will hold also for the next 24 h.Usually, this empirical correction is useful for the first days immediately after the large shocks, when the variation in M min is strong, and it becomes negligible as soon as M min returns to the value observed before the start of the sequence.

Daily earthquake forecasts for the Emilia 2012 earthquake sequence
The area considered for the forecasts is (10.90-11.80˚E) × (44.70-45.10˚N).During the evolution of the sequence, we changed the area considered once only (on May 29, 2012, immediately after the second-largest earthquake), to account for the East-West expansion of the sequence (see Figure 1a; see also the maps in Supplementary Information).We provided daily earthquake forecasts in real-time for two threshold magnitudes (M L ≥4.0, and M L ≥5.5) from May 20, 2012, at 13:00 UTC, a few hours after the first large event (M L 5.9; 02:03 UTC), which occurred close to Finale-Emilia village.All of the forecasts were then regularly provided at 6:00 UTC.In some cases, we also provided intermediate forecasts if something important was to happened.Figure 1b shows the typical space-time forecast format.Each forecast was based on the seismicity that occurred before 6:00 UTC, and provided an estimation of the expected seismicity in the following 24 h.
Immediately after large events, as well as the time variation of the completeness threshold magnitude that was discussed in the previous section, the real-time seismic catalog might have other potential shortcomings, like low-quality magnitude estimations, an increase in the completeness magnitude, and some medium-to-high earthquakes missed in real-time and included in the database only a few hours or days from their occurrence.This also happened during the L'Aquila 2009 sequence, which caused a general underestimation of the forecasts in the first few days [Marzocchi and Lombardi 2009].The same problem also appeared during the Emilia 2012 earthquake sequence, as we will show later.
Figure 2 summarizes the evolution of the forecasts made with the FCM (left panel) and LM (right panel) models, including the confidence interval bands and the comparisons with the real number of data observed with magnitude ≥2.5.Retrospectively, we ran the same models with the updated seismic catalogs.Here, we do not attempt any rigorous tests for checking the consistency of the forecasts; nonetheless, Figure 2 shows that the two models forecast the number of earthquakes with magnitude ≥2.5 well, with the notable exception of the days immediately after the two largest shocks (note that the confidence intervals of the models reported in Figure 2 are expected to encompass about 65% to 70% of the data).Retrospectively, we ran the same models with the updated seismic catalog (that was not yet the official catalog, which will be released after a few months).The revised forecasts of the LM model were closer to the real number of earthquakes observed, while the FCM model forecasts remained almost unaltered.This is due to the different strategies of the models to cope with incomplete seismic catalogs (see previous section).More importantly, these results highlight the importance of reliable real-time catalogs to provide reliable forecasts.We also note that both of the models performed well in terms of the spatial distribution of the earthquakes.All of the forecasting maps for M L ≥4.0 are reported in the Supplementary Information.For example, all of the M L ≥5.0 events occurred in parts of the grid that were characterized by higher spatial probabilities (see, for instance, Figure 1b).
The importance of reliable real-time seismic catalogs is clearly highlighted in Figure 3, where we report on the temporal evolution of the earthquake probability of M L ≥4.0 for the whole area (Figure 1a).The LM model has a significant variability between the forecasts made in real-time and the forecasts made using the updated seismic catalog.Nonetheless, as also shown in Figure 2, these differences are probably inside the natural variability of the process, and therefore this bias is unlikely to produce completely unreliable results, excluding in some cases the first days immediately after a large shock [see also Woessner et al. 2011].
Finally, the presence of two independent models poses the problem of how to condense the forecasts of each model in one single forecast.For now, we simply present the results of the two models separately.This choice has been made to highlight the epistemic uncertainty of the forecasts, but it is not yet satisfactory from a scientific point of view (see next section).

Conclusions and perspectives
The preliminary results shown here highlight some important issues that are worth mentioning: 1.Both models forecast the number of earthquakes observed and the locations of the largest shocks reasonably well.We have not yet checked the consistency of the forecasts through statistical tests, but the plot of Figure 2 shows that the number of forecast earthquakes matches well the number of aftershocks observed.This indicates that seismologists are ready to provide reliable information in real-time to decision makers in order to set up short-term risk mitigation actions.
2. To provide accurate forecasts, we need accurate realtime seismic catalogs.In particular, the reduction in the detec- tion efficiency immediately after a large shock usually results in an underestimation of the earthquake forecasting models, as the triggering effects of all of these earthquakes cannot be considered in the forecast.This bias tends to decrease sharply after the occurrence of a large earthquake.A reliable real-time seismic catalog would allow us to produce daily forecasts every few hours; in this way, we might be able to better capture the variations in the seismicity that can occur during the day.3.After a large shock, the completeness magnitude changes through time.These ETAS models need robust strategies to cope with this issue.To date, those who use these models sometimes use ad-hoc empirical strategies (see the LM model).More satisfactory and robust strategies are required to tackle this issue in the future.
4. An OEF implies authoritative scientific forecasts.With different models, we argue that an authoritative forecast must incorporate all of the information available, to quantify the epistemic uncertainties among the models, and it should represent the 'best-available' model [cf.Marzocchi and Zechar 2011].In the case of the 2012 Emilia earthquake sequence, we simply report the output of the different models without making any attempt to provide a single forecast.In the near future we plan to: (i) increase the number of earthquake forecasting models, with the only restriction that each model to be included has to be under testing in the CSEP experiment in Italy or elsewhere; and (ii) create an ensemble model by averaging all of the forecasts.This last step can be achieved by weighting of the forecasts of each model ac-cording to the performance of each model, using the CSEP testing phase [see Marzocchi et al. 2012b].In this way, the ensemble model will represent an average estimation of the forecasts.This approach has some similarity with the use of the logic tree in seismic hazard, where all of the branches are merged into one single hazard assessment (usually the average).With respect to this kind of averaging, here we have the advantage that we can to assign objective weights to each forecast [see Marzocchi et al. 2012b], while the weight of the branches in the logic tree are usually assigned subjectively. .Daily probabilities of one or more events with M L ≥4.0 in the forecasting areas.Green, violet lines, forecast made with the FCM and LM models, respectively.Blue, red lines, forecasts made with the revised seismic bulletin with the FCM and LM models, respectively (see text for details).Vertical lines, occurrence of the seven M L ≥5 earthquakes (some earthquakes appear as overlapping).

Figure 1 .
Figure 1.(a) Location of the events with M L ≥2.5 that occurred from January 1, 2009, in the area hit by the Emilia 2012 sequence.Stars, squares, circles, earthquakes with M L ≥5.0, 4.0 ≤ M L < 5.0, and 2.5 ≤M L <4.0, respectively.Black, red, blue symbols, events that occurred in the period of January 1, 2009, to April 30, 2012, May 1 to May 28, 2012, and May 29 to June 12, 2012, respectively.Red, blue rectangles, areas chosen for the real-time daily forecasting, before and after May 30, 2012, respectively (see text for details).(b) Example of a spatial forecast.The map reports the density of the expected number of earthquakes with M L ≥5.5 per km2 in the 24 h starting from May 29, 2012, 06:00 UTC.Three black stars, earthquakes with M L ≥5.0; gray dots, all earthquakes with 4.0 ≤ M L < 5.0 that occurred before.Red star, ML 5.8 earthquake that occurred during the forecasting time window.
Figure3.Daily probabilities of one or more events with M L ≥4.0 in the forecasting areas.Green, violet lines, forecast made with the FCM and LM models, respectively.Blue, red lines, forecasts made with the revised seismic bulletin with the FCM and LM models, respectively (see text for details).Vertical lines, occurrence of the seven M L ≥5 earthquakes (some earthquakes appear as overlapping).

Table 1 .
Parameters of the two ETAS models estimated by the maximumlikelihood method (learning phase).
Daily observed (black circles) and forecast number of events of M L ≥2.5 in the whole area.The two plots represent the forecasts made with the FCM model (left) and LM model (right).We report two forecasts: one that has been provided in real-time, and one that is a revised forecast using an updated seismic catalog.The black lines represent the 1-v confidence interval for the FCM model and the 70% confidence interval for the LM model applied to the revised seismic catalog (see text for details).Vertical black dotted lines, times of occurrence of the two main shocks of the sequence (May 20, 2012, M L 5.9; May 29, 2012, M L 5.8).