Estimation of gravity noise variance and signal covariance parameters in least squares collocation with considering data resolution

The article describes an implementation of the negative log-likelihood function in the determination of uncorrelated noise standard deviation together with the parameters of spherical signal covariance model in least squares collocation (LSC) of gravity anomalies. The correctness and effectiveness of restricted maximum likelihood (REML) estimates are fully validated by leave-one-out validation (LOO). These two complementary methods give an opportunity to inspect the parametrization of the signal and uncorrelated noise in details and can provide some guidance related to the estimation of individual parameters. The study provides the practical proof that noise variance is related with the data resolution, which is often neglected and the information on a priori noise variance is based on the measurement error. The data have been downloaded from U.S. terrestrial gravity database and resampled to enable an analysis with four different horizontal resolutions. These data are intentionally the same, as in the previous study of the same author, with the application of the planar covariance model. The aim is to compare the results from two different covariance models, which have different covariance approximation at larger distances. The most interesting outputs from this study conﬁrm previous observations on the relations of the data resolution, a priori noise variance, signal spectrum and LSC accuracy.


Introduction and objective
The spatial correlation of the signal in least squares collocation (LSC) can be described by the analytical covariance model selected from a large group of models available today.The LSC based on the covariance model is widely applied in geodesy [Arabelos and Tscherning 2003, Albertella et al. 2004, Kotsakis 2007, Sansò et al. 2008, Reguzzoni and Tselfes 2009, Pavlis et al. 2012] and has some common assumptions with the group of techniques originating from other fields of geosciences, called kriging [Reguzzoni et al. 2005].The LSC has been extensively tested with planar finite models [Iliffe et al. 2003, Grebenitcharsky et al. 2005, Núñez et al. 2008, Klees and Prutkin 2010], as well as with spherical reproducing kernels [Schwarz and Lachapelle 1980, Knudsen 1987, Esan 2000, Arabelos et al. 2007, Pail et al. 2010, Yildiz 2012].The advantages of one or the other model have been often discussed and a better representation of the long-term correlations has been usually pointed regarding the spherical models [Tscherning and Rapp 1974, Forsberg 1984, Forsberg 1987, Strykowski 1996].These correlations are neglected in planar models, which tend to zero and do not oscillate around the distance axis, as opposed to spherical models.On the other hand, reproducing spherical kernels are more composed numerically and their generation is more time-consuming.The compromise between spherical and planar models has been investigated many times in the recent decades, by enhancing the planar model with long-wavelength correlations [Strykowski 1996] or finding the planar model that optimally represents gravity field correlation [Forsberg 1987].Barzaghi et al. [2001], in turn, propose some modifications of the traditional spherical covariance model.
The family of spherical reproducing covariance models is based on the summation of the orthogonal functions, such as e.g.Legendre polynomials.The discussion on the rotationally symmetric spherical covariance functions considering their relation with the planar approximation and some additional LSC problems can be found in Moritz [1976Moritz [ , 1980] ] and Tscherning [2004].Arabelos and Tscherning [1998] have prepared a study of the similarities and differences related to the application of finite and full covariance models.They have found some close results between these two kinds of models in selected applications, which are of particular interest in this article.Additionally, they have indicated an important role of the a priori noise variance, which is an issue of my special interest.Many of the LSC applications apply noise variance matrix assuming spatially uncorrelated noise.This is advisable for many types of the data and readily used [Núñez et al. 2008, Barzaghi andBiagi 2014].The assessment of the noise standard deviation is, however, often a discussed topic.In many works this parameter is primarily associated with the measurement error [Bouman 1997, El-Fiky et al. 1997, Osada et al. 2005, Vergos et al. 2005].Nevertheless, recently many LSC users provide the proofs that more factors have to be considered in the proper estimation of the LSC noise [Sadiq et al. 2010, Filmer et al. 2013, Saleh et al. 2013, Jarmołowski 2015].This paper applies selected spherical covariance model and aims at joining this discussion, providing a numerical example and some resulting observations on the signal covariance parameters and the noise standard deviation.The numerical testing of the selected covariance model assessed alone or compared with the handy test of the planar model shown in Jarmołowski [2013] can provide some common observations related to the noise variance parameter.
An extensive study of the spherical radial basis functions based on Legendre polynomials can be found in Tenzer and Klees [2008] and in Klees et al. [2008].They investigate popular spherical functions and apply them in local gravity field modeling.They also analyze the problems of regularization and covariance parameter estimation and calculate the relations between different covariance parameters.In this paper, I follow with similar investigations employing Tscherning-Rapp (TR) model in the LSC [Tscherning and Rapp 1974].TR model can represent the correlation of the global gravity; however, its construction based on Legendre polynomials enables an evaluation of the local form suitable for local gravity field characteristics modeled by local parameters [Schwarz andLachapelle 1980, Knudsen 1987].TR model is widely implemented in the analyses of the regional and local gravity field with the use of the newest satellite data.Sadiq et al. [2010] apply the local TR function in the determination of the gravity field from CHAMP and GRACE data combined with the ground observations.Yildiz [2012] uses TR model in the analysis of GOCE vertical gravity gradient by LSC.Sansò et al. [2008] provide a detailed study based on the satellite and ground data, employing a wide group of the reproducing, spherical kernels.The aforementioned and other LSC applications use various approaches to the a priori noise, e.g.: mentioned use of measurement error, cross-validation methods or regularization.In some cases the explanation of noise variance derivation is omitted.
This article describes the estimation of uncorre-lated noise standard deviation together with the parameters of TR covariance model, for various data resolutions.The analyses have been intentionally made with the use of the same gravity data as in Jarmołowski [2013] in order to compare the results of LSC with TR model and LSC with the planar Gauss-Markov model.The primary objective is to compare numerically the a priori noise variance estimates and prediction accuracy resulting from the use of particular models.The parameters of TR model, together with a priori noise standard deviation, are estimated with the use of two complementary methods.First, the restricted maximum likelihood (REML) method is used to estimate the parameters, and then leave-oneout (LOO) validation [Kohavi 1995] is performed using the same ranges of the parameters.The a priori noise variance found with the use of REML and LOO, the RMS of LOO differences and a posteriori error estimates are found as three converging numbers that validate the study.The use of the same data as in Jarmołowski [2013] creates an opportunity to compare corresponding three estimates for the planar covariance model and additionally authenticate obtained results.The new contributions in this article are REML estimation and LOO validation of the signal covariance parameters and noise variance in case of local TR model application.The extension of previously proven methods to spherical model case is especially desirable due to the discussions and uncertainties mentioned above, which refer to the importance of long-wavelength terms in the local application of spherical models and to noise variance problems.

Spherical covariance model and its parameters
When assuming a rotational anisotropy, the spherical covariance function of gravity anomaly can be written as [Tscherning andRapp 1974, Moritz 1980]: (1) where c n are degree variances, R B is the radius of Bjerhammar sphere [Arabelos and Tscherning 2003] and P n are Legendre polynomials with spherical distance }.This is the model of global gravity anomalies, since we start from the lowest harmonics, representing the most global characteristics of the gravity field.If we take into account the so-called "remove-restore" technique or stepwise collocation [Lachapelle and Tscherning 1978] and remove the global part of the signal, the residual gravity anomalies have to be modeled using the form [Knudsen 1987, Arabelos and Tscherning 2003, Sansò et al. 2008, Yildiz 2012 The lower cutoff degree n min is a parameter analyzed in the paper.The error degree variances related with the reference global model are denoted f n , whereas v n are signal degree variances above the maximum degree of the global geopotential model used (n min ).The anomaly degree variances for n> n min can be modeled by different analytical models.A number of them have been tested by Tscherning and Rapp [1974] and Tscherning [1974].The model used in this work i.e. the substitution for v n is The variables A and B are also the covariance parameters analyzed in this work.Another tested parameter is s, which is smaller than 1 and replaces the quotient of squared R B and product of radial distances to points [Moritz 1980, p. 181] : (4) The residual data represent the signal part above degree n min .The error degree variances n representing degrees below n min are, in turn, often omitted in the research studies.It has been shown that the analytical model can be equivalently determined using only degree variances of the residual signal and effectively fitted into the empirical covariance [Tscherning 1974, Lachapelle and Tscherning 1978, Schwarz and Lachapelle 1980].I follow this assumption employing also a shortened model of the form: The fitting of the model from Equation (5) to the empirical covariance function is easier than that in Equation (2), due to a smaller number of the parameters.

REML estimation of parameters and LOO validation
LSC with parameters can be represented by the equation [Moritz 1980] (6) where l is the vector of the observations, s is the stochastic part of the signal and n denotes the noise.Xb represents the long-wavelength part of the signal composed of design matrix X and the parameters vector b.Assuming now that our observations are gravity anomalies, the necessary step is to remove lower frequencies of the harmonic expansion from the signal, because: (7) In the presented investigations the long-wavelength part comes from the geopotential global model EGM2008 [Pavlis et al. 2012] and replaces deterministic part Xb with the harmonic expansion.
The probability density function (PDF) of the multivariate normal distribution based on the residual gravity Dg r and covariance C(i) reads [Kusche 2003, Koch 2007, van Loon 2008] where R is the matrix represented by the product of covariance matrix inverse and projection matrix [Searle et al. 1992]: (9) The covariance matrix C(i) is based on the vector i, which includes the covariance parameters.In practice, C(i) is composed of the signal and noise covariance matrices.Therefore for our case we have: (10) as the study assumes all correlated parts of the data together as a signal.Only uncorrelated part is investigated as noise, i.e.: REML is a method usually preferred instead of basic maximum likelihood (ML) in case of the bias suspected in the data [Searle et al. 1992, Jarmołowski andBakuła 2014].Therefore, I decided to apply the simplest first order polynomial trend X to exclude potentially remaining bias from the residual data, i.e.: (12) This makes the method more accurate and has no practical disadvantages in case of unbiased residuals.

T S S R S S S S S S S S S S S S S S S S S S S S S T S S S S T
.
• REML aims at the maximization of PDF by the selection of the covariance parameters.This maximization is simplified when using the logarithm of PDF.Sometimes it is more convenient to minimize PDF with the opposite sign [Searle et al. 1992, Koch 2007, van Loon 2008].In this case the logarithm takes the form of negative log-likelihood function (NLLF), as follows: (13 This equation is used in the step-by-step procedure of the parameters analysis in the space of parameters vector i.The global minimum of NLLF indicates the optima of covariance parameters, which are further controlled in the LSC process, using pointwise LOO validation at all sample data points following the straightforward rule: The variable k denotes number of the observations in Equation ( 11), Equation ( 12) and therefore also indicates sizes of the arrays in the subscripts in Equation ( 14).The expectation from REML and LOO is to obtain the same estimates of the parameters at the reasonable level of precision.LOO validation, as it is based on the LSC process, provides two additional measures of accuracy: RMS of differences between predictions and data and a posteriori error variance.

Applied gravity data
Bouguer gravity anomalies for the numerical test were downloaded from the U.S. gravity database available at the website of University of Texas at El Paso [Hildenbrand et al. 2002, Saleh et al. 2013].Figure 1a shows the large, regional area of gravity anomalies (6°× 9°), which was used for preparing the four subsets of different resolutions, selected using grids of various densities.The points were selected to be as close to the nodes as possible.Figure 1b describes the sampling scheme and shows four subsets: set 1 -the largest area with an approximate resolution of 0.5°, set 2 -0.25°, set 3 -0.1°and set 4, which covers the smallest area and has an original average resolution of about 0.03°.These resolutions are used later to fix the upper frequencies of the summation in TR covariance models, assuming that they establish a maximum required upper term of the Legendre series.In other words, the maximum degrees of the summation for particular datasets (n max ) are equivalent to their approximate resolution and equal 360, 720, 1800 and 6000, respectively in all calculations.
The global geopotential model EGM2008 was used as the long-wavelength field and subtracted from the signal.Each set has been detrended using lower harmonics up to the degree and order that correspond to the coordinate ranges smaller than selected area sizes.It was set approximately, assuming that the removed long-term trend for each area must represent more resolution than its size, but not too much, to preserve some correlated signal in the residuals.The corresponding degrees of the subtracted spherical harmonic expansion are equal 120, 180, 450 and 1800, respectively.
The samplings of the empirical covariance functions for the respective datasets cannot be smaller than their approximate data resolutions.Otherwise, an in-  1b).Black curves are assessed as well fitted, compared to a large number of examples in the literature, where a similar level of fit can be seen [Knudsen 1987, Sadiq et al. 2010, Yildiz 2012].The coloured curves are based on the parameters found in REML and LOO analyses, realized and explained in the next section.

REML analysis of covariance parameters and LOO validation
The analyzed parameters of the signal are n min , A, B and s (Equation 10).The noise is represented by a diagonal covariance matrix and the average noise standard deviation dn has been taken as another parameter.The parameter B in TR model can be often found as fixed in the literature since the early years of the model development [Tscherning and Rapp 1974].This study confirms that its fixing is relatively easy and reasonable.The maximum and minimum degrees of the Legendre polynomials are related with the upper and lower truncation levels.These levels are specific to particular data and are associated with the actual selected area size and data horizontal resolution.The upper frequencies have minor influence on TR model shape and therefore n max is fixed relatively to the approximate data resolution.The long-wavelength gravity parts (EGM2008) are calculated and subtracted up to the degree equal n min .With that in mind, the parameter n min has been additionally estimated by REML and LOO for comparison.The parameters A and s must be always adjusted to the actual data.These parameters are most responsible for the scale and shape of the function and therefore are different for the global data [Tscherning andRapp 1974, Rummel 1975] and for the local samples [Schwarz andLachapelle 1980, Knudsen 1987].Arabelos and Tscherning [2003] have estimated the regional values of some parameters in a global range.
The scheme of the parameters testing is in some sense sequential, i.e.only the selected parameters are si- multaneously variable, making the analysis more transparent.Therefore, the study starts with the REML estimation of three parameters and is followed by LOO validation of these estimates.Additionally, two of the previously fixed parameters have been estimated by REML and LOO, after fixing the remaining ones to previously estimated or commonly known values.First, the parameters A and s responsible for the shape of TR function are fixed for the four selected data samples in REML estimation, as well as in LOO validation.The number of simultaneously calculated parameters is limited to three, i.e.: B, noise variance (dn) and lower harmonic degree of the subtracted long-wavelength field (n min ).The fixing is also useful due to the specific correlation between A and s, which will be discussed later, using additional REML and LOO validation related to these parameters.The parameters A and s have been fixed by fitting the TR model to the empirical function (Figure 2, Tables 1 and 2), which is usually found as good approximation [Arabelos et al. 2007, Yildiz 2012].Figure 3 describes 3D wireframe plots of two selected constant NLLF values computed using C(i) matrix with varying B, n min and dn.The axes show the ranges of the parameters used in REML calculation.Table 1 groups the optima of the parameters indicated by the global NLLF minima.

ESTIMATION OF GRAVITY NOISE VARIANCE
The same data and parameters were then used in LOO validation, to confirm the estimates from Figure 3 and Table 1.LSC predictions have been made in the positions of the data points, always excluding the selected one from the computation (Equation 14).The ranges of the distances to the data points used in all LSC (LOO) computations have been limited to the ranges of the horizontal axes in Figure 2. All data points inside the circle of this radius are employed for point calculation.The values of RMS of LOO differences (Equation 14) have been calculated for four data samples.Two small constant values close to the minima of RMS for each dataset are shown as the wireframe surfaces in Figure 4.
The global minima of RMS in Figure 4 are placed somewhere inside the wireframe surfaces and indicate the optima of the parameters.These optimum parameters found by LOO, are then listed in Table 2 together with the minima of the RMS values.To complete LSC estimation results, the standard deviation of the prediction has been calculated, however, only for the parameters found as optimal in LOO (Table 2).Respectively to the order of the datasets, the ranges of the predictions standard deviations are equal to: 11.12 -13.04 mGal, 5.72 -7.01 mGal, 2.69 -3.49mGal and 1.01 -1.80 mGal.It should be pointed that the prediction errors are found in some works as more sensitive to the change of the parameters than LSC results [Sansò et al. 1999, Darbeheshti and Featherstone 2009, Jarmołowski and Bakuła 2014].Therefore, their estimates are worth considering in the assessment of the correctness of the parameters estimation.The above four ranges of the errors estimated for the selected datasets are very close to the selected a priori noise levels and the minima of RMS in LOO validation.The additional REML and LOO tests are performed to estimate A and s parameters, in order to compare them to the estimates from the covariance function fit.REML and LOO start here after fixing of the remaining three parameters to the values commonly used or based on the previous LOO estimation.B is fixed to 24, as in Tscherning and Rapp [1974].This value is the most common in the literature; moreover, its good performance is also confirmed in the current work (Figure 4).More precisely, a wide range of different B values provide the same results in REML and LOO, which indicates that local TR model applications are not very sensitive to the change of B (Figures 3 and  4).The lower cutoff n min is based on the degree of the subtracted EGM2008 gravity.This is theoretically correct and further confirmed by the estimates from Table 2, except n min for the fourth dataset, which has got a better estimate in REML (Table 1).The noise parame-ter dn is individual, cannot be pre-assumed and is hard to determine from the covariance model fit.Therefore LOO estimates of dn discussed above (Table 2) have been taken to fix this parameter in the current estimations of A and s, which are summarized in Figures 5 and  6 and Tables 3 and 4. The parameters A and s observed in the places of minimum NLLF (Figure 5) and in minimum RMS of differences (Figure 6) are subsequently used to draw coloured plots of TR model in Figure 2. The black dots in Figures 5 and 6 indicate the parameters A and s selected for the black curves in Figure 2. The difference between the parameters estimated by REML, LOO and fitting of the model results in the altered shapes of the models (Figure 2) and in different RMS values (Figures 6c and 6d).The model is sensitive to the parameter s, which is reflected in the elongated shapes of NLLF (Figure 5) and RMS contours (Figure 6).It should be noted that some plots in Figures 5 and 6 have the ranges of the axes insufficient to show the global minima.These minima can indicate significantly different parameters than those from the model fit, due to numerically sensitive parameter s.
The most interesting observation coming from all above results are the three estimates related with the noise, which is individual for the particular data set.These estimated parameters are: a priori noise dn, RMS of differences in LOO validation and a posteriori error estimate [Moritz 1980].The convergence between them has been attained and moreover, they converge well with the respective parameters found for the planar model in Jarmołowski [2013].The computation of RMS in LOO and a posteriori errors used for the comparison in Table 5 uses parameters from Table 2.All noise estimators observed for TR model and Gauss-Markov model are summarized in Table 5.Their desirable agreement is discussed in the next section.

Discussion and outcomes
The most valid conclusion is related with a priori noise standard deviation, however, some properties of the other parameters have been also observed.The parameters A and s are significantly correlated and even a small change in s induces a large rescaling of A (Figures 5 and 6).Therefore the covariance model is also very sensitive to the change of the parameter s numerically (Figure 2).It is possible then, that the manipulation of the Bjerhammar sphere radius applied in TR model (Equation 4) can be a better option than the direct selection of s.The parameters dn, n min and B can be fairly based on their estimates from REML (Figure 3, Table 1), which are successfully confirmed by the LOO validation (Figure 4, Table 2).B is found to be most flexible and easy to determine for the local data.A wide range of B (including B = 24) shows a good performance in terms of LSC accuracy (Figure 4).The estimates of the ESTIMATION OF GRAVITY NOISE VARIANCE lower cutoff frequency n min coincide or almost coincide with the subtracted lower degree in REML, as well as in LOO validation.
The estimates of dn by REML and LOO are consistent with each other and moreover, coincide with the prediction errors (Table 5) and RMS of LOO (Figures 4  and 6, Table 5).These estimates of noise are different for different datasets, which suggests their relation with the data resolution.The analogous observation can be found in Jarmołowski [2013], where similar analysis LOO is made with the planar, finite covariance model and dn values show almost identical increase with decreasing data resolution.Additionally, Jarmołowski [2013] compares obtained dn estimates to the selected signal parts from EGM2008 below the data resolution and obtains sensible coincidence.A wrong choice of dn gives significantly worse RMS of LOO differences, i.e. worse LSC accuracy with any model.A wrong dn also leads to underestimated or overestimated prediction errors.This is avoided here, because four data resolutions with TR model and Gauss-Markov planar model in Jarmołowski [2013].The differences between the models at larger distances, also appear to have no influence on the uncorrelated a priori noise.Summarizing, the suspicion arises that limited data resolution also limits the upper gravity harmonics that can be found as correlated and interpolated by LSC.The noise variance is associated with the size of the signal variance at the frequency corresponding to the data resolution, if the measurement error is smaller than signal variance associated with this resolution.In other words, the data point values preserve the gravity signal up to the measured frequencies; however, some signal must be lost in the interpolation process and handled by the diagonal noise covariance matrix as "uncorrelated" at the resolution of the interpolated field.This idea is consistent with many ideas of filtering or regularization, which are mentioned in Jarmołowski [2013] and in the following paper.Additionally, we know why the regularization was necessary so many times.The data resolution limits achievable resolution of the interpolation.The noise variance depends then on the signal variance at the data frequency or on the survey error, but only if it is larger than signal variance at this frequency.In my opinion, REML estimation via fast technique like scoring or the known signal degree variances should be used to determine the noise variance of the data every time.

Figure 1 .
Figure 1.Bouguer gravity used and scheme of sampling.

Figure 2 .
Figure 2. Empirical covariance function of residuals and TR model fitted (black line: A and s manually selected, B = 24, n min equal to degree of subtracted global harmonic expansion, red line: A and s inferred from REML, the same B and n min , blue line: A and s inferred from LOO validation, the same B and n min ).

Figure 3 .
Figure 3. NLLF values in REML estimation of B, dn (mGal) and n min after fixing of A and s to values from covariance function fitting.

Figure 4 .
Figure 4. RMS of differences in LOO validation (mGal) of B, dn (mGal) and n min after fixing of A and s to values from covariance function fitting.

Figure 5 .
Figure 5. Search of parameters A and s by REML (B, dn and n min fixed).Black dots indicate parameters from fitting of TR model into empirical.

Figure 6 .
Figure 6.Search of parameters A and s by LOO (mGal) validation (B, dn and n min fixed).Black dots indicate parameters from fitting of TR model into empirical.

Table 1 .
Parameters fixed and estimated by REML, together with global NLLF minima.

Table 2 .
Parameters fixed and estimated by LOO, together with minima of RMS of LOO differences.

Table 3 .
dn is based on REML, confirmed by LOO and, as a consequence, three noise indicators (dn, RMS in LOO and prediction error) are almost equal for Parameters fixed and estimated by REML, together with global NLLF minima.

Table 4 .
Parameters fixed and estimated by LOO, together with minima of RMS of LOO differences.