“ MULTISCALE PROCESSES TO DESCRIBE THE EASTERN SICILY SEISMIC SEQUENCES „

In this paper, a version of hybrid of Gibbs point process models is proposed as method to characterise the multiscale interaction structure of several seismic sequences occurred in the Eastern Sicily in the last decade. Seismic sequences were identiﬁed by a clustering tech-nique based on space-time distance criterion and hierarchical clustering. We focus our analysis on ﬁve small seismic sequences, showing that two of these are described by an inhomogeneous Poisson process (not signiﬁcant interaction among events) while the other three clusters are described by a Hybrid-Geyer process (mutiscale interaction between events). The proposed method, although it still needs extensive testing on a larger catalogue, seems to be a promising tool for the characterization of seismogenic sources through the analysis of induced seismicity.


INTRODUCTION
Given an area observed during a time interval, earthquakes can be considered as a realization of a marked space-time point process, assuming that each event, with a given magnitude (the mark), is identified by a point in space (by its geographical coordinates) and time (fixing its time of occurrence).
Generally in the point process theory, the attention is focused on two complementary aspects that can be untangled with summary statistics and model formulations [Diggle, 2013;Baddeley et al., 2015]. The first aspect is the description of the first-order characteristics of the point process estimating the intensity function; whilst the second aspect concerns the description of the events dependence, and so of the second-order properties of the process. In a given pattern, the events may exhibit regularity (where points tend to avoid each other), independence (complete spatial randomness), and clustering (where points tend to be close together).
To describe real point process phenomena, like seismic events, more complex models than the stationary Poisson that assumes statistical independence of events are defined. Indeed, seismic data represent an interesting situation where the study and the interpretation of features like self-similarity, long-range dependence and fractal dimension are crucial.
Moreover, the distribution of epicentres generally shows diverse interaction structures at different spatial and spatio-temporal scales varying according to the location of earthquake sources in the study area.
Therefore, the study of the second-order properties of a point process has a relevant role in the comprehension of the process and its realization.
Earthquakes are typically observed as clusters in space and time. In space, earthquakes epicentres are typically highly concentrated along the plate boundaries and, in general, along active faults [Baddeley et al., 2015]. On the other hand, clustering in time can be seen as a significant increase of seismic activity immediately after large earthquakes leading to aftershock sequences.
Currently in the seismological context when events are clustered together, the modelling formulation is usually based on self-exiting point processes assuming that the occurrence of an event increases the probability of occurrence of other events in time and space (such as the Hawkes model, Hawkes and Adamopoulos [1973] or the Epidemic Type Aftershock-Sequences (ETAS) model, Ogata [1988]).
Although the models cited above represent a wide variety of solutions to describe different types of spatio and spatio-temporal patterns, more complex models need to account for environmental heterogeneity while detecting interactions at several spatial scales (multiscale dependences). For this reason, it is attractive and motivating the definition and estimation of models that account for both attractive and repulsive dependences, such as the Hybrid of Gibbs models [Baddeley et al., 2013], while also including the effect of observed covariates in order to better describe the probability of occurrence of earthquakes in fixed regions. Siino et al. [2017] provide a novel contribution to the literature on point processes and seismological applications using a hybrid approach and the distance to seismic sources as covariates for the analysis of earthquakes in the Hellenic area.
In this paper, we try to characterize the interdependence of events occurred, at several spatial scales, in the East Sicily area. Indeed, an accurate study of the moderate and small seismicity of the Eastern Sicily could be useful to better understand the seismogenic sources of this area. For this reason, starting from a seismic catalogue over the last decade, we identify some sequences and we analyze their spatial distribution with a point process approach.
In particular, we describe the spatial seismicity of the selected clusters using an advanced spatial point process model formulation based on the hybrid of Gibbs processes [Baddeley et al., 2013;Siino et al., 2017]. The main characteristics of Hybrid of Gibbs models is that it is possible to describe the multiscale interaction between the points in presence of larger scale inhomogeneity. Therefore, in this paper, after describing each sequence in terms of the previous model, the results are related to the underlying geological information of the studied areas and some comparisons are done.
The paper is organized as follows. Section 2 presents a review of the main geological and seismic characteristics of the Eastern Sicily. The methodological methods used to select clusters are explained in Section 3. The model formulation used in this paper is reported in Sec-tion 4. The results of the analysis and the estimated models are presented in Section 5.
The last section is devoted to conclusions and final remarks.

SETTING: THE STUDY AREA
East Sicily and South Calabria is the area with greater deformation rate and seismic strain release in Italy [Serpelloni et al., 2010;Palano, 2015]. In this area, the Nubia and Eurasia plates collide, giving birth to a complex tectonic puzzle dominated by a NNW-SSE-oriented convergence where also crustal extension and strike-slip zones coexist in a relative small region [Hollenstein et al., 2003;Serpelloni et al., 2007;Caporali et al., 2009], see Figure 1.
Two main zones of crustal contraction are located in the Tyrrhenian Sea, and at the front of the Calabrian Arc in the Ionian Sea [Billi et al., 2007;Lavecchia et al., 2007;Giunta et al., 2009;Polonia et al., 2011Polonia et al., , 2012 characterized by compressional directions roughly transversal to the chain axis. They all are limited by a strike-slip transfer zone extending from the Aeolian Islands to the Malta escarpment [Palano et al., 2012;Polonia et al., 2012;Gallais et al., 2013;Orecchio et al., 2014]. Eastern Sicily and southern Calabria are affected by crustal extension mainly WNWESE oriented whose origin has been related to different tectonic processes [Tortorici et al., 1995;Monaco et al., 1997;Catalano et al., 2008;De Guidi et al., 2012]. All these domains are well documented at local scales by the occurrence of active fault segments, marine terracing, growing folds and active basins [Catalano and De Guidi, 2003;Bianca et al., 1999Bianca et al., , 2011Catalano et al., 2011;Sulli et al., 2013;Argnani et al., 2013;De Guidi et al., 2013].
Moderate and small crustal earthquakes are di used throughout the whole area, well marking the active zones aforementioned and with coherent seismogenic stress orientation [Neri et al., 2005;Scarf et al., 2013;Presti et al., 2013]. Deep seismicity marks the eastern Aeolian Island sector and the Tyrrhenian o shore of the Calabria; here deep seismicity (down to 450 km) is reported and is in-terpreted as the subducing Ionian slab under the Calabrian Arc [Selvaggi and Chiarabba, 1995;Chiarabba et al., 2005].
In historical time, the whole area experienced several M > 7.0 earthquakes that caused surface faulting, tsunamis, and large landslides [Guidoboni et al., 2007;Locati et al., 2011;Tinti et al., 2004]. Some authors framed all the large earthquakes in a single regional fault system [Neri et al., 2006;Catalano et al., 2008] running roughly N-S for about 370 km and consisting in a belt of several normal fault segments controlled by a WNW-ESE local extension confirmed either from structural analysis [Monaco et al., 1997;Catalano et al., 2008;De Guidi et al., 2013] and geodetic measurements [D'Agostino and Selvaggi, 2004]; however the situation is likely to be more complex. If in the southern Calabria the seismic sources of such large earthquakes were somehow recognized [Tortorici et al., 1995;Neri et al., 2006;Galli and Bosi, 2002;Jacques et al., 2001;Catalano et al., 2008;Galli and Peronace, 2015], the seismic sources of the most destructive earthquakes in the eastern Sicily are still controversial. In particular, several different possible faults, with different geometry and kinematics, were proposed for the two largest earthquakes that stroke Sicily in the last centuries: 1693 South Eastern Sicily (M = 7.5) and 1908 (M = 7.1) Messina Straits earthquakes.
As seismogenic source of the former was proposed a normal fault in the near offshore of eastern Sicily (Bianca et al., 1999), a thrust fault at the front of the Sicilian belt [Azzaro and Barbano, 2000;Barbano and Rigano, 2001], and a subduction fault plane in the Ionian basin [Gutscher et al., 2006].
The seismogenic source of the Messina earthquake is debated although the epicentral area, and the focal mechanism are somehow well constrained [Pino et al., 2000[Pino et al., , 2009. The different fault plane solutions proposed in the literature are somehow coherent each other, suggesting a roughly NNE-SSW to SSE-NNW trending structure with prevailing extensional motion. However, the dip directions are sometime opposite and angles variable; in fact some authors propose various west-dipping faults in the Calabrian side of Messina Straits [Schick, 1977;Bottari et al., 1986;Catalano et al., 2008;Aloisi et al., 2013], while some others propose east-dipping structures in the Sicilian side [Capuano et al., 1988;Boschi et al., 1989;De Natale and Pingue, 1991;Valensise and Pantosti, 1992;Amoruso et al., 2002;Neri et al., 2004;Valensise et al., 2008;Bonini et al., 2011].

INSTRUMENTAL SEISMICITY
In the following statistical analyses, each seismic event will be considered as a point in the space individuated by its three spatial coordinates (latitude, longitude, depth). Of course an earthquake source should be considered as a volume rather than a single point source; furthermore, earthquake location is affected by errors, but we can assume that all the errors associated to all the single events average out each other when they are treated together. The errors represent the standard confidence interval, expressed in km, of the horizontal and vertical coordinates after the localization of the event has been calculated. They depend partly on the capability of the seismic network and partly on the reliability of the velocity models.
However, to avoid the coarser earthquake location, we considered the seismic catalogue only since 2006, when the Italian National Seismic Network was upgraded and earthquake location was sensibly improved, the average errors are 0.85 km and 0.76 km for the horizontal and vertical coordinates, respectively.
The study area is focused on eastern Sicily: it extends   Figure 2). The magnitude of completeness of the catalogue is M c = 2.3, considering the empirical frequency-magnitude distribution (FMD) [Mignan and Woessner, 2012]. The corresponding estimate of the b parameter of the Gutenberg-Richter law obtained with moment estimator on transformed data is 1.06.

CLUSTER DETECTION
Instrumental seismicity can be generally distinguished into independent earthquakes (back-ground seismicity) and dependent earthquakes, like aftershocks, foreshocks, (events that occurs before and after a larger seismic event called mainshock, and related to it both in time and space) or multiplets. The process of splitting events of a seismic catalogue in background seismicity and dependent earthquakes is generally called seismic declustering [Van Stiphout et al., 2012]. Seismicity declustering is fundamental process in order to better study the seismicity characteristic and seismogenetic structure interaction, for seismic hazard assessment and in earthquake prediction models. While the identification of multiplets, characterized by very similar waveforms [Adelfio et al., 2012;D'Alessandro et al., 2013] is a simple task, the identification of foreshock and aftershock is more complicated because they cannot be distinguished by any particular outstanding feature in their waveforms. The identification of seismic sequences, foreshock-mainshock-aftershock is only possible after the full sequence of events has happened on the basis of their spatio-temporal proximity to each other. In the last decades, a lot a declustering techniques was proposed [Van Stiphout et al., 2012] almost all based to the definition of a measure of the space-time distance between earthquakes. One of the most simple procedure for identifying aftershocks within seismicity catalogues was introduced by Gardner and Knopo [1974]. In this method, widely known as the "Window Method" (WM), the identification of dependent earthquakes is based on the determination and analysis of an inter-event distance in time and space. Since this first work, a lot of declustering methods based on similar principle were proposed [Reasenberg, 1985;Uhrhammer, 1986;Frohlich and Davis, 1990;Molchan and Dmitrieva, 1992]. In WM, foreshocks and aftershocks are treated in the same way. For each earthquake in the catalogue with magnitude M, previous or subsequent shocks are identified as foreshock or aftershocks, respectively, if they occur within a specified time interval T(M), and within a distance interval L(M), whose width is clearly function of the mainshock magnitude. Frohlich and Davis [1990] proposed a simple criteria to define a space-time distance between two earthquakes i and j (1) where r ij is the distance between the hypocentres, t ij is the absolute difference between the event origin time and C is a scaling constant. Two earthquakes i and j are SIINO ET AL.
where D is a time-space distance threshold to optimize on the basis of the analysis of the background activity of the study region. According to Davis and Frohlich [1991] in Equation (1), we set C = 1 km/day -1 while in this paper we suggest a different procedure for threshold optimization based on Agglomerative Hierarchical Cluster (AHC) algorithm [Gan et al., 2011;Everitt et al., 2011;D'Alessandro et al., 2013Martorana et al., 2016].
In the clustering procedure, we adopt the Average Linkage (AL) criterion while the dendrogram cut level was determined using internal entropy level criterion to maximize the homogeneity within classes Celeux and Soromenho [1996]. After the clustering procedure, we identified nine clusters (Figure 3), and their characteristics are summarized in Table 1.

SPATIAL POINT PROCESS MODELS
In this section, we explain the basic concepts of a spatial point process describing its first-and second-order properties. Moreover, the Gibbs processes and their hybrid formulation are introduced explaining how to fit these models to data, to assess the goodness of fit and how to interpret the main parameters. Definitions and notations used throughout this section are introduced in Illian et al. [2008] and Baddeley et al. [2015].
A spatial point pattern υ = {u 1 ,...u n } is an unordered set of points in the region W ⊂ R d where n(υ) = n is the number of points, |W| < ∞ and usually d = 2. A point process model assumes that υ is a realization of a finite  point process X in W without multiple points. The firstorder property of X is described by the intensity function. It is defined as (2) where du is an infinitesimal region that contains the point u ∈ W, |du| is its area and E(Y (du )) denotes the expected number of events in du. When the intensity in Equation (2) is constant the process is called homogeneous. On the other hand, in the inhomogeneous case the intensity is not constant in the study area, but may depend, for instance, on the coordinates of points, still assuming independence of the past. A Poisson process model, which assumes that the random points are independent of each other, is completely described by its intensity function ρ(u).

Mainshock
The interpoint interaction between events is measured by second-order moment quantities. Several functional summary statistics are used to study the secondorder characteristics of a point pattern and so measuring dependence. Two widely used summary statistics, for descriptive analysis and diagnostics, are the Ripley's Kfunction and G-function (also called nearest-neighbour distance distribution function) [Ripley, 1976[Ripley, , 1988.
Knowing that, in a stationary process the distribution of X is the same as the distribution of the shifted process X + v, for any vector v, the K-function is defined as for any distance r ≥ 0 and any location u. When the process is stationary, the most commonly used estimator for (3) is due to Ripley [1976] is (4) where n is the number of points in the pattern, |W| is the area of the window, w(u i , u j ) is the Ripley's edge correction weight and d ij = ||u i -u j || is the pairwise distances between all distinct pairs of points u i and u j in the pattern.
The G-function is the cumulative distribution function of the nearest-neighbour distance at a typical point of X. It is a function of d i = d(u i , υ\u i ) that indicates the shortest distance from u i to the pattern υ\u i consisting of all points of υ except u i . It is defined as (5) for any distance r ≥ 0 and any location u. The estima-tor of (5) is based on the empirical cumulative distribution functions of the nearest-neighbour distances at all data points. G(r) is a non-decreasing step function and G(0) = 0. Under the homogeneous Poisson assumption the following theoretical relations hold K (r) = pr2 and G(r) = 1 -exp(-ρpr 2 ). Visual inspection of empirical K-function (G-function) with the theoretical summary statistic of the homogeneous Poisson process is used to study correlation in a descriptive analysis of the data.

GIBBS AND HYBRID OF GIBBS POINT PROCESS MODELS
Gibbs point process is a class of flexible and natural models for point patterns that postulates interactions between the points of the process defining a density for a point process with respect to a Poisson process of unit intensity [Møller and Waagepetersen, 2003].
The class of Gibbs processes X, also called Markov point processes, is determined through a probability density function f : < X > [0, ∞), where X = {υ ⊂ W : n(υ) < ∞} is a set of point configurations contained in W. In the literature several Gibbs models have been proposed such as the area-interaction, Strauss, Geyer, hard core processes [Møller and Waagepetersen, 2003]. For further details on them and on the desirable properties for an unnormalized density see Møller and Waagepetersen [2003] Baddeley et al. [2013] and Baddeley et al. [2015]. We explain with more detail the Geyer process and the interpretation of its parameters in terms of interaction between the points, since Geyer models are used mainly when the aim is to study attractive interaction among points. The unnormalized density of the Geyer saturation process [Geyer, 1999] is equal to (6) where s > 0 is the saturation parameter and t( . ) is the number of unordered pairs of distinct points in the pattern υ that lie closer than r units apart. When in Equation (6) the s is set to infinity, this model reduces to the Strauss process with interaction parameter γ 2 . Instead if s = 0, the model reduces to the Poisson point process. If s is a finite positive number, then the interaction parameter γ may take any positive value. The process is inhibitive when γ ≤ 1, and clustered when γ > 1.
However, Gibbs processes have some drawbacks when points have a strong clustering and show spatial dependence at multiple scales [Illian et al., 2008;Møller and Waagepetersen, 2003;Baddeley et al., 2013]. Baddeley et al. [2013] proposes hybrid models as a SIINO ET AL.
general way to generate multi-scale processes combining Gibbs processes. Given m unnormalized densities f 1 , f 2 , ..., f m the hybrid density is defined as f (υ) = f 1 (υ) x.
..x f m (υ), respecting some properties. For example the density of the stationary hybrid process obtained considering m Geyer components (3) (with interaction ranges r 1 , ...r m and saturation parameters s 1 , ..., s m ) is (7) where This density indicates that the spatial interaction between points changes with the distances r j and the parameters that capture this information are the interaction parameters γ j . If an inhomogeneous version of (7) is considered, the parameter β is replaced by a function β (u i ) that expresses a spatial trend and it can be a function of the coordinates of the points and covariate information defined in all the study area. Generally, we can specify that the density f is a function of a vector of regular parameters θ and a vector of irregular parameters η. In the case of (7), the previous two parametric vectors are θ = {log (β), log (γ m )} = {θ 1 , θ 2 } for regular parameters and η = {(r 1 , s 2 ), ...., (r m , s m )} for the irregular ones. Moreover, the regular vector can be subdivided into parameters for the description of the spatial trend (θ 1 ) and parameters for the interaction effects (θ 2 ). Gibbs models can be fitted to data by pseudolikelihood (composite likelihood) [Besag, 1975[Besag, , 1977Jensen et al., 1991], and it is function of the Papangelou conditional intensity ρ φ (u|υ) at location u ∈ W given υ, where φ are the parameters to estimate. In hybrid models φ = {θ, η} and the conditional intensity is the product of the conditional intensities of the components [Baddeley et al., 2013] (8) where B(u) is an offset term, θ T 1 V 1 (u, η) is the firstorder potential and θ T 2 V 1 (u, υ, η) is the sum of terms with k ≥ 2. The term G( . ) accounts for the interaction effects, and in the following analysis, it will be a combination of Geyer processes. The irregular parameters are estimated through the profile pseudolikelihood, maximizing p (η, υ) = max θ log PL(θ, η) over η. For details for computing approximate maximum pseudolikelihood estimates see Baddeley and Turner [2000]. Different specifications can be considered for the trend part B(u) and θ T 1 V 1 (u) of (8). We consider a non-parametric formulation for the spatial term depending on the coordinates using kernel estimators [Silverman, 1986].

DIAGNOSTICS FOR THE HYBRID MODELS
Diagnostics is based on a graphical approach, involving also the residual K-and G-functions [Baddeley et al., 2011]. Moreover, we compare and assess the goodness-of fit of the estimated models in terms of AIC, spatial raw residuals and number of simulated points under the estimated model . Furthermore, the diagnostic plots based on the residual Kand G-functions are used to decide which component has to be added at each step to the hybrid model.
The residual K-function (G-function) with respect to a fitted model is a modification of the K-function which has mean equal to zero if the model is true. Indeed, these graphs show for which spatial distances the current model has a lack of fit in describing the interaction between points. The residual K-function for a fitted model, evaluated at a specific distance r, is the score residual used for testing the current model against the alternative of the current model plus a Strauss process with the same trend and with interaction distance r.
Similarly, the residual G-function for a fitted model, evaluated at a given r, is the score residual used for testing the current model against the alternative of a Geyer saturation model with saturation parameter 1 and interaction radius rr Baddeley et al. [2011]. In the analysis, for model selection and to assess the goodness of fit, we computed both the residual G and K-functions and in section 5 we report the graphs for the residual Gfunctions of the estimates inhomogeneous Poisson models and hybrid of Gibbs process models. In the spatstat package  of R [R Development Core Team, 2005], there are most of the functions that have been used for fitting, prediction, simulation and validation of Hybrid models.

RESULTS
The models introduced in Section 4 are fitted to the largest sequences 2, 4, 5, 6 and 9 (at least 70 events), see Table 1.
For each sequence, the magnitude of completeness M c is estimated with a visual evaluation, and in a conservative way, of the empirical frequency-magnitude distribution (FMD) and events of magnitude M < M c are discarded [Mignan and Woessner, 2012]. Table 2 shows for the selected clusters, the value of M c and the corresponding estimate of the b parameter of the Gutenberg-Richter law obtained with moment estimator on transformed data.
Such parameters are useful to characterize the clusters, to compare them each other, and, partially, with the background seismicity (i.e. the whole catalogue).
The b-value of the earthquake frequency-magnitude distribution generally ranges between 0.8 and 1.1; variations are associated to several aspects, e.g. fracturing degree, material properties, stress concentration degree, pore-fluid pressure, tectonic setting, and to the petrological-environmental-geophysical characteristics [El-Isa and Eaton, 2014].
While the b-values of the clusters could be indicative of these local conditions, the corresponding values for the catalogue (b = 1.06) results from the combination of a multiplicity of geological/geodynamic conditions and, in this perspective, it is not useful to discuss about. Probably there is not a significant reason for the b-values of the clusters being always lower than b-value of the whole catalogue. Small b-values (0.7 -0.4) are generally associated to intra-plate seismicity or, for small magnitudes, to the incompleteness of the catalogues. Cluster 4, 5, and 6 show lower b-values (0.6-0.7). Excluding the incompleteness of the catalogue, low b-values for these three clusters can be related to the local geological and tectonic context that may influence the real state of stress. For descriptive purposes, the summary statistics K-and G-functions are estimated for each cluster assuming stationarity, and we report the estimation of the last summary statistics. In Figure 4, for each sequence, the nearest neighbour distance distribution function (black line) and the true value of G for a stationary completely random point process (red line) together with the envelopes considering 39 simulations under CSR assumption are compared. For the clusters 2, 4, 5 and 6, we observe remarkable deviations between the empirical and theoretical G curves, and this behaviour indicates spatial clustering. However, for the cluster 9, the curve is inside the envelope except for dis-tances less than 0.5 km. The graphical output obtained with the estimated K-functions and the related envelopes under the CSR assumption are not reported and they are coherent with the previous results. However, we want to point out that generally a summary function does not completely characterize a point process and the G-and K-functions are sensitive to different aspects of the point process. In particular, the G-function summaries information at a shorter scale than the K-function [Baddeley et al., 2015].
For each cluster, the first attempt in model estimation is to fit an inhomogeneous Poisson model with intensity depending on the spatial coordinates in which points do not interact with each other. We consider a non-parametric formulation for the spatial term in (5), considering kernel estimators with respect to the Cartesian coordinates and a null value for the interaction term.
To determine the smoothing bandwidth for the kernel estimation of point process intensity several methods have been proposed in the literature: Scott's rule, cross-validation, FLP (Forward Likelihood for Prediction) approach ; Adelfio and Chiodi, [2015]]. For each cluster, the best bandwidth has been selected considering the inhomogeneous Poisson model with the smallest AIC. Table 3 shows the used method to estimate the smoothing bandwidth vector and the corresponding values for each cluster (h = {h x , h y }).
The corresponding AIC and the range of the spatial raw residuals of the fitted inhomogeneous Poisson models are in Table 4. Given the point process models fitted to the point patterns, the residual K-and G-functions are used as diagnostic tools to assess the goodness-offit of the models [Baddeley et al., 2011]. In Figure 5, the residual G-functions for the estimated inhomogeneous SIINO ET AL. Estimation results for the inhomogeneous Poisson and the selected hybrid of Gibbs models: AIC, range of spatial raw residuals, vector of irregular parameter η (an interaction distance r j and a saturation parameter s j for each Geyer component, G j , j = 1,2) and interaction parameter for each Geyer component γ j .

FIGURE 4.
Envelope of the G-function (shaded region) to test Complete Spatial Randomness (CSR) condition. The black curve is the estimated G-function instead the red one corresponds to the function under the CSR assumption, for cluster 2 (4a), 4 (4b), Poisson models are reported. For the sequences 2, 5 and 6 the residual G-functions wander substantially outside the limits showing peaks. This suggests that assuming an inhomogeneous Poisson model for those clusters is not satisfactory since there is unexplained attractive interaction between points (positive association). The sharpness of the peak for the residual G-function encourages to include a Geyer saturation process. By the way, for the sequences 5 and 6, at very short distances (less than 70 meters), the first Geyer component is similar to an inhomogeneous Poisson one, since the value of the saturation parameter s1 is close to 0 (Table 4). While, for the sequences 2, 5 and 6, for distances up to about 200 meters, a Geyer component is necessary for explaining the cluster behaviour among points: for all these clusters both the saturation parameter s and the interaction parameter are greater than 1. In particular, according to the estimated parameters, the clustering behaviour is stronger for the sequence 6. On the other hand, we have seen before in Figure 4b and 4e that for the clusters 4 and 9, the G-functions are not so far from the CSR assumption. This is also confirmed from the residual G-functions after fitting an inhomogeneous Poisson model: in Figures 5b and 5e, the curves are not significantly far from zero. In particular, for the cluster 4, the estimated residual G-function is outside the band just for very short distances ( Figure  5b), and adding a hybrid component does not seem to improve significantly the results in terms of interpretation. Indeed, the parameter s = 0.1 of the Geyer component (Table 4) is very low and close to zero, indicating that the added component is close to an inhomogeneous Poisson process. Instead for the cluster 9, this behaviour is more evident, since in the residual G-function (Figure 5e) there is no evidence of unexplained association in the observed data. Therefore, for this sequence an inhomogeneous Poisson process well describes the spatial inhomogeneity. Moreover, this is a very small sequence (54 earthquakes) and therefore fitting a complex model, as a hybrid one, could be misleading.
In terms of goodness of fit considering hybrid models, we have an improvement in terms of AIC and range SIINO ET AL.
of spatial raw residual with respect to the inhomogeneous Poisson models (Table 4). For example, for the sequence 2 moving from an inhomogeneous Poisson model to a hybrid formulation, the AIC decreases by 26.658 and the range of the spatial raw residuals also decreases. Moreover, the residual G-functions of the final selected models (Figure 6a, 6b, 6c and 6d) oscillate around zero and they are inside the envelopes indicating that the interaction structure between the earthquakes is well described by the hybrid models. For a visual diagnostics of the final model, we also report the images of the estimated spatial intensity for the five selected clusters, together with the observed points ( Figure 7). As it can be observed, this results is also influenced by the size of clusters (see Figure 7e).

CONCLUSIVE REMARKS
Starting from previous results, we could say that the spatial seismicity of the sequences occurred in San Teodoro in 2013 and in Barcellona Pozzo di Gotto in 11 MULTISCALE PROCESSES FOR SICILIAN SEISMICITY FIGURE 6. Residual G-functions for the final selected model, that is an inhomogeneous hybrid model of Geyer processes with one or two components, and parameters reported in Table 4, for cluster 2 (6a), 4 (6b), 5 (6c) and 6 (6d). The x-axis shows the value of distances in km 2003 seems to be adequately described by an inhomogeneous Poisson process and, therefore, it seems that for these sets of earthquakes there is a weak interaction, i.e. a weak contagious effect between events. On the other hand, for the sequences occurred in Patti in 2013, San Salvatore di Fitalia in 2011 and Maletto in 2009, we explicitly postulate a multiscale interaction between events, fitting a hybrid of Geyer process with positive association and for distances up to 200 meters. For these last three sequences, it seems that the earthquakes occur in a more clustered behaviour. One reason for these results, and differences between the sequences, might be ascribed to the geological characteristics of the area, as the differences in the b-values among the clusters could indicate. On the other hand, to find a possible correlation of such few clusters with their positions within the complex geodynamic context of the study area, or even with the main seismogenic sources, would be a daring goal. It would be even more speculative considering that beyond the well described seismogenic structure of the study area  there is evidence of seismic belts not clearly associated to the well-constrained fault belt Billi et al. [2007]. Indeed, it is reasonable to assume that different seismogenic sources could be associated to seismic sequences with specific characteristics.
The errors associated to the determination of the hypocentral coordinates could have represented a limitation to the methodology, but our methodology proves to discriminate clustering effect at a multiscale level and provides promising results.
Finding possible geological explanations accounting for the differences in the clusters requires a larger catalogue and, for this purpose, application at greater scale (i.e. the whole Italy) should be performed. FIGURE 7. Estimated spatial intensity of the final selected model, for cluster 2 (7a), 4 (7b), 5 (7c), 6 (7d) and 9 (7e). The plots refer to Mercator projection UTM33 in km