Comparison of the MIPAS products obtained by four different level 2 processors

Piera Raspollini (a,*), Enrico Arnone (b), Flavio Barbara (a), Bruno Carli (a), Elisa Castelli (b), Simone Ceccherini (a), Bianca Maria Dinelli (b), Anu Dudhia (c), Michael Kiefer (d), Enzo Papandrea (e), and Marco Ridolfi (e)

(a) Istituto di Fisica Applicata "N. Carrara" (IFAC) del Consiglio Nazionale delle Ricerche (CNR), Firenze, Italy
(b) Istituto di Scienza dell’Atmosfera e del Clima (ISAC) del Consiglio Nazionale delle Ricerche (CNR), Bologna, Italy
(c) Atmospheric, Oceanic and Planetary Physics, Clarendon Laboratory, Oxford University, UK
(d) Karlsruhe Institute of Technology (KIT), Institute for Meteorology and Climate Research (IMK), Karlsruhe, Germany
(e) Università di Bologna, Bologna, Italy


I. Introduction
MIPAS [Fischer et al., 2008] is a mid-infrared Fourier transform spectrometer that sounded the atmospheric limb emission onboard the ENVISAT satellite. It was launched on 1 March 2002 into a sun-synchronous polar orbit at about 800 km altitude and operated until 8 April 2012.
The limb scanning sequence observed by the instrument is made of spectra that sound the atmosphere at different tangent altitudes (the tangent altitude is the minimum altitude reached by the line of sight and is also the altitude from which most of the observed signal originates). The inversion of the measurements allows the determination of the vertical profiles of the atmospheric quantities of interest in the range of tangent altitudes covered by the limb scanning sequence, which is 6-70 km for the nominal measurement mode.
Four different retrieval codes have been developed for the analysis of MIPAS measurements and have been used for the processing of the entire MIPAS mission. These are:
• the ESA processor [Raspollini et al., 2013, Raspollini et al., 2006, Ridolfi et al., 2000]: version ML2PP V6 of this processor and the corresponding dataset will henceforth be referred to as ML2PP;
• the GMTR, developed at Bologna University [Carlotti et al., 2006, Dinelli et al., 2010]: this processor and the corresponding dataset V2.3 will henceforth be referred to as BOL;
• the algorithm obtained by the joint effort of the Institut für Meteorologie und Klimaforschung (IMK) at Karlsruhe Institute of Technology and the Instituto de Astrofísica de Andalucía (IAA) [von Clarmann et al., 2003a, von Clarmann et al., 2009]: this processor and the corresponding dataset (V5R_222 for CH4 and N2O, V5R_220 for the other species) will henceforth be referred to as IMK;
• the algorithm MORSE, developed at Oxford University [Dudhia, 2008]: this processor and the corresponding dataset V1.4 will henceforth be referred to as OXF.
The four retrieval algorithms use the same Level 1b data (calibrated and geolocated spectra) provided by ESA.
The objective of this paper is to assess the internal consistency of the products of the four algorithms. So far this consistency has been tested only by means of a blind test retrieval experiment based on synthetic spectra [von Clarmann et al., 2003b], while a comparison using real data has been performed only for ozone products [Laeng et al., 2013]. A comprehensive data set is now available, and it offers the opportunity to make a statistically significant comparison aimed at determining possible systematic errors and identifying possible improvements.

II. Differences between the four algorithms
All algorithms use the global fit approach, i.e. the spectra of each scan are fitted simultaneously (minimizing the quadratic norm of the noise-weighted residuals between measurements and forward model calculations, with a constraint). However, while three of them perform a one-dimensional (1D) retrieval, i.e. each scan is fitted separately, the fourth one (BOL) performs the simultaneous retrieval of all scans of the orbit (two-dimensional (2D) retrieval). Generally the different species are retrieved sequentially, with some exceptions for BOL and IMK, and the profiles obtained in previous retrievals are used as assumed profiles in subsequent retrievals.
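As a hedged illustration of the global fit described above, the constrained least-squares cost function can be written in the following generic form. The symbols are assumptions introduced here for illustration only and are not taken from the individual processors: y is the vector of all spectral points fitted simultaneously (one scan for the 1D retrievals, all scans of an orbit for the 2D retrieval), F(x) is the forward model evaluated for the state vector x, S_y is the measurement noise covariance matrix, x_a is an a priori or reference profile, and R is a constraint matrix (for example an inverse a priori covariance matrix or a smoothing operator). The actual form of the constraint differs between the four algorithms.

```latex
\chi^2(\mathbf{x}) =
  \bigl(\mathbf{y} - \mathbf{F}(\mathbf{x})\bigr)^{T}
  \mathbf{S}_y^{-1}
  \bigl(\mathbf{y} - \mathbf{F}(\mathbf{x})\bigr)
  + \bigl(\mathbf{x} - \mathbf{x}_a\bigr)^{T}
  \mathbf{R}\,
  \bigl(\mathbf{x} - \mathbf{x}_a\bigr)
```

The first term is the quadratic norm of the noise-weighted residuals; the second term is the constraint.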
The forward models of the four algorithms compute the radiative transfer integral along the line of sight taking into account the atmospheric vertical inhomogeneities, but different assumptions are made by the four algorithms about the horizontal inhomogeneities. The atmosphere is assumed to be in Local Thermodynamic Equilibrium (LTE) by all algorithms except IMK, which is able to properly handle deviations from LTE (non-LTE) but routinely uses this capability only for selected species. Scattering is not included in the radiative transfer integral, and the spectra affected by thick clouds, identified by the cloud filtering algorithm [Spang et al., 2002, Spang et al., 2004], are not included in the analysis. Retrievals are performed limiting the fit to selected spectral intervals (called microwindows, MWs) that contain most of the information on the target parameters and minimize the systematic errors, including those introduced by the assumptions in the forward model. The main differences between the four algorithms (constraints, retrieval grid, microwindows, forward model, cloud filtering thresholds) are summarized in Table 1.

III. Procedure for the comparison
The performances of the four algorithms are compared in terms of seasonal averages of the single scan retrieval error and seasonal averages of the single scan vertical resolution of their products. Possible biases between the products of the four algorithms are searched for by computing the differences between the seasonal averages of each of the three algorithms OXF, BOL and IMK and those of ML2PP.
Seasonal averages, over a three-month period, of temperature profiles and Volume Mixing Ratio (VMR) profiles of water vapor, ozone, nitric acid, methane, nitrous oxide and nitrogen dioxide are compared for six latitude bands (90 ) and for daytime and nighttime observations. Daytime (nighttime) profiles are identified using the sun elevation angle at the geolocation of the tangent altitude of the middle sweep of the scan. The analysis considers the nominal mode measurements made in two years (2008 and 2009) that belong to the second phase of the MIPAS mission (years 2005-2012). With respect to the first phase (years 2002-2004), this phase is characterized by a reduced spectral resolution but an improved spatial resolution (Optimized Resolution, OR). The two years are analyzed independently to evaluate whether the results are consistent enough for one year to be considered representative of the OR measurement performances. Comparisons are made on a common pressure grid, corresponding to an altitude grid with a step of 1 km below 56 km and 2 km above 56 km. This fine grid, chosen in order to reduce the resampling error in the comparison, approximately corresponds to the finest of the four retrieval grids.
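The resampling onto a common pressure grid can be sketched as follows. This is a minimal illustration, not the procedure actually used for the comparison: the function name `to_common_grid` is hypothetical, and linear interpolation in log-pressure is an assumption, since the interpolation method is not specified in the text.

```python
import numpy as np

def to_common_grid(p_native, values, p_common):
    """Interpolate a profile onto a common pressure grid (pressures in hPa).

    Assumption: interpolation is linear in log-pressure. Common-grid levels
    outside the native pressure range are returned as NaN, so that a profile
    is never extrapolated beyond its retrieval range.
    """
    p_native = np.asarray(p_native, dtype=float)
    values = np.asarray(values, dtype=float)
    p_common = np.asarray(p_common, dtype=float)
    # np.interp requires increasing abscissae, so sort by pressure first.
    order = np.argsort(p_native)
    log_p = np.log(p_native[order])
    return np.interp(np.log(p_common), log_p, values[order],
                     left=np.nan, right=np.nan)
```

Returning NaN outside the native range lets a subsequent NaN-aware average skip levels that a given algorithm does not retrieve.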
The averages are computed including all the retrieved profiles that passed the filtering procedure. For all datasets, only profile levels not flagged as cloud-contaminated and profiles for which convergence has been reached are included in the average. Furthermore, in the BOL dataset, all profiles with information gain less than 0.3 are filtered out, as well as all profile levels $x_i$ whose absolute deviation from the median is larger than 6 times the median absolute deviation (MAD). The MAD is the median of the absolute deviations between the observations $x_i$ and the median, $M_n$, of the $n$ data points: $\mathrm{MAD}_n = \mathrm{median}(|x_i - M_n|)$. In the OXF dataset, any profile level whose retrieval random error is greater than 70% of the profile value is discarded. In the ML2PP dataset, profiles characterized by a chi-square larger than a species-dependent threshold are discarded. In the IMK dataset, profile levels for which the diagonal value of the Averaging Kernel Matrix (AKM) [Ceccherini et al., 2010] is smaller than 0.03 are discarded.
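The MAD-based screening described above for the BOL dataset can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mad_outlier_mask` is hypothetical, and the sample over which the median is taken (here, all values passed in, for example the values at one pressure level within one latitude band and season) is an assumption, because the text does not specify it.

```python
import numpy as np

def mad_outlier_mask(x, k=6.0):
    """Return True where |x_i - M| > k * MAD, with M the median of x.

    MAD = median(|x_i - M|). NaN entries are ignored when computing the
    median and the MAD, and are never flagged as outliers.
    """
    x = np.asarray(x, dtype=float)
    med = np.nanmedian(x)
    abs_dev = np.abs(x - med)
    mad = np.nanmedian(abs_dev)
    return abs_dev > k * mad

# Example: keep only the values that pass the 6 x MAD screening.
values = np.array([1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 50.0])
kept = values[~mad_outlier_mask(values)]
```

The median and the MAD are used instead of the mean and the standard deviation because they are barely affected by the outliers that the filter is meant to remove.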
A consequence of the individual filtering procedures is that the averages of the different algorithms do not include exactly the same measurements. This has been verified not to introduce a significant difference in the seasonal averages, because the number of averaged profiles for each latitude band and each season is always much larger than the number of measurements that are not common to all datasets. An exception may be encountered for night polar summer and day polar winter, where averages are performed on a statistically small number of samples. However, these differences are also of interest, because the main objective here is to compare the seasonal averages obtained by the different MIPAS algorithms in order to see how the different choices made in the four algorithms (cloud filtering, microwindows, regularization, filtering and so on) affect them.

IV. Comparison of diagnostic parameters of the four algorithms
The performances of each retrieval are characterized by the trade-off that the retrieval constraints (either regularization or a-priori) determine between the two diagnostic parameters: the retrieval error and the vertical resolution.
The retrieval error is the propagation of the measurement noise through the retrieval, and its Covariance Matrix (CM) [Ceccherini et al., 2010] is computed by the four algorithms and provided for each scan (OXF and BOL products include only the diagonal elements of the CM). The retrieval error depends on the sensitivity of the measurements to the target parameters, which, in turn, is driven by the amplitude of the emitted radiance and by the temperature of the atmosphere. Given the large seasonal and latitudinal variability of the temperature profile, the retrieval errors are characterized by a large variability.
The vertical resolution is estimated using the AKM calculated by the four algorithms. The vertical resolution at altitude $z_i$ is computed, for each profile, as $\Delta z_i / \mathrm{AKM}_{ii}$, where $\Delta z_i$ is the retrieval step at altitude $z_i$ and $\mathrm{AKM}_{ii}$ is the diagonal element of the AKM at altitude $z_i$. ML2PP and IMK provide for each scan the full AKM and its diagonal elements, respectively, and for their products it is possible to compute the vertical resolution in a rigorous way. BOL provides representative AKMs for selected seasons and latitude bands. OXF does not routinely provide the AKM of each scan in its products, nor the complete CM, and hence $\mathrm{AKM}_{ii}$ has been estimated as $1 - (sd_x / sd_a)^2$, with $sd_x$ and $sd_a$ the retrieval error and the a priori error, respectively, that is, considering only the diagonal terms of the CM. This approach is rigorous only if the off-diagonal terms of the CM are negligible, and hence if the retrieved points are uncorrelated. When an a priori constraint is used, it overestimates the diagonal terms of the AKM and consequently underestimates the value of the vertical resolution.
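The two formulas above can be sketched in a few lines. This is an illustrative sketch, not code from any of the four processors: the function names are hypothetical, and returning NaN where the AKM diagonal is not positive is an assumption made here to avoid meaningless resolutions.

```python
import numpy as np

def vertical_resolution(dz, akm_diag):
    """Vertical resolution dz_i / AKM_ii for each retrieval level.

    dz and akm_diag must have the same shape. Levels with a non-positive
    AKM diagonal element are returned as NaN (assumption of this sketch).
    """
    dz = np.asarray(dz, dtype=float)
    akm_diag = np.asarray(akm_diag, dtype=float)
    res = np.full(dz.shape, np.nan)
    ok = akm_diag > 0
    res[ok] = dz[ok] / akm_diag[ok]
    return res

def akm_diag_from_errors(sd_x, sd_a):
    """Diagonal AKM estimate 1 - (sd_x / sd_a)**2 used for OXF.

    Rigorous only if the off-diagonal terms of the covariance matrix
    are negligible.
    """
    sd_x = np.asarray(sd_x, dtype=float)
    sd_a = np.asarray(sd_a, dtype=float)
    return 1.0 - (sd_x / sd_a) ** 2
```

For example, a retrieval error equal to half the a priori error gives an estimated AKM diagonal of 0.75, so a 3 km retrieval step corresponds to a vertical resolution of 4 km.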
Seasonal averages of the diagnostic parameters are computed after interpolating both retrieval error profile and vertical resolution profile of each scan with the same method used for the temperature and VMR profiles, because the average performances on the native retrieval grid rather than the performances of the average are the objective of our analysis.
Figure 1 shows, for ozone, the seasonal averages of the single scan absolute retrieval error (left plot) and the seasonal averages of the single scan vertical resolution profile (right plot) for the four processors, for equatorial and Southern hemisphere polar winter conditions. BOL has the worst vertical resolution, with a peak of 8 km at high altitudes. This is a consequence of its retrieval grid, which is significantly coarser than the retrieval grids of the other algorithms (see Table 1). Furthermore, for temperature, ozone and water vapor, which BOL obtains with a joint retrieval, the retrieval error also takes into account the propagation of the error of the profiles that are jointly retrieved, and this leads to a more comprehensive, but larger, retrieval error. Among the other three algorithms, OXF has the largest retrieval error and the best vertical resolution. The use of a self-adapting regularization strength based on the retrieval error of each scan allows ML2PP to maintain a fairly constant vertical resolution for different atmospheric conditions, while the retrieval error changes significantly. Conversely, IMK uses a regularization with a fixed strength: its regularization is effectively stronger (and hence the vertical resolution is worse) when the information content of the measurement is lower, while the retrieval error does not change significantly. Similar considerations apply to the other species. The plots of the seasonal averages of the single scan absolute retrieval error and of the single scan vertical resolution profile for the four processors for temperature, water vapor, nitric acid, methane, nitrous oxide and nitrogen dioxide are reported in the supplementary material (Figs. S1-S6).

V. Differences in mean temperature and VMR profiles
The differences between the zonal means of BOL, OXF and IMK and those of ML2PP have been computed and reported in three pressure-latitude maps for each season and each year. Absolute differences are reported for temperature and relative differences for the VMR of the various species. Averages are also computed separately for daytime and nighttime profiles to highlight possible problems due to diurnally varying systematic errors. Figure 2 shows the temperature maps for the June-July-August 2009 period, for both daytime and nighttime conditions. The maps for the other seasons and the other species are reported in the supplementary material (Figs. S7-S61), and the overall results are summarized below. Discontinuities in the maps are due to the coarse discretization of the latitude bands on which the comparison is performed.
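The difference computation described above can be sketched as follows. This is a minimal illustration: the function name `mean_difference` is hypothetical, the input arrays are assumed to be already resampled onto the common pressure grid (one row per profile), and normalizing the relative difference by the reference (ML2PP) mean is an assumption, since the text does not state the denominator.

```python
import numpy as np

def mean_difference(profiles, reference_profiles, relative=False):
    """Difference between the mean of `profiles` and of `reference_profiles`.

    Both inputs have shape (n_profiles, n_levels) on the same pressure grid;
    NaN marks filtered or missing levels and is ignored in the averages.
    With relative=False the absolute difference is returned (as used for
    temperature); with relative=True the difference is returned in percent
    of the reference mean (as used for VMR).
    """
    mean = np.nanmean(np.asarray(profiles, dtype=float), axis=0)
    ref = np.nanmean(np.asarray(reference_profiles, dtype=float), axis=0)
    diff = mean - ref
    if relative:
        return 100.0 * diff / ref
    return diff
```

Because each dataset is averaged before differencing, the two inputs do not need to contain the same number of profiles.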
The mean profiles of the four algorithms, as well as the differences between the OXF, IMK and BOL products and the ML2PP products, computed for five 'reference' atmospheres (polar summer, polar winter, equatorial, midlatitude daytime and midlatitude nighttime), are collected in summary plots for each species. Mean differences are compared with the systematic error profiles estimated for the OXF microwindows for each reference atmosphere [Dudhia, 2008], which are therefore representative of the OXF, ML2PP and, partly, BOL systematic errors. IMK systematic errors are generally comparable [von Clarmann et al., 2009]. Figure 3 reports the summary plots for ozone; the plots for the other species are contained in the supplementary material (Figs. S62-S67).
For all species, no significant difference was observed between the independent analyses performed for years 2008 and 2009.

V.1 Temperature
Between 100 hPa and 1 hPa temperatures from the four algorithms are consistent within 1 K, apart from a few exceptions (see Fig. 2 and Figs. S7-S13 and S62 of the supplementary material):
• In the polar winter, between 100 and 30 hPa, ML2PP and OXF are colder than BOL by up to 3 K and colder than IMK by up to 6 K. Unaccounted polar stratospheric clouds could play a role here.
• In midlatitude winter and autumn, mainly BOL, but also IMK, differ from ML2PP and OXF by about 1-3 K, and this difference changes sign between daytime and nighttime conditions.
The behavior seen at midlatitudes can be ascribed to the observation geometry of MIPAS and to the assumption of horizontal homogeneity made in the ML2PP and OXF forward models. Indeed, in the nominal observation mode MIPAS looks backwards with respect to the satellite's flight direction. This means that during the descending part of the orbit (when flying from North to South), which away from the Poles corresponds to daytime observations, the instrument looks northward, while during the ascending part (when flying from South to North), which away from the Poles corresponds to nighttime observations, it looks southward. As a consequence, in a region with temperature increasing northwards it sees a negative temperature gradient in nighttime observations and a positive temperature gradient in daytime observations [Kiefer et al., 2010]. In the presence of non-linearity effects, this asymmetry in the observations leads to a bias in the retrieved temperature when horizontal gradients are not taken into account, and this appears as a bias between the algorithms that take inhomogeneities into account and the others. The bias in the retrieved temperature of ML2PP and OXF is only visible when the means of ascending and descending profiles are compared separately: when averages include measurements from both daytime and nighttime observations, as well as from different seasons, the differences compensate each other and the resulting bias is very small (see Fig. S62 of the supplementary material).
At high altitudes, in particular between 1 and 0.2 hPa, BOL and OXF are 2-4 K warmer than ML2PP in the tropics and, at midlatitudes, in the summer and spring hemisphere. Between 0.2 and 0.08 hPa IMK is up to 3 K colder than BOL, ML2PP and OXF at almost all latitudes. At low altitudes, in particular below 10 hPa, OXF and ML2PP are comparable within 1-1.5 K with the exception of the tropical regions in some seasons, while IMK is more than 3 K warmer than OXF and ML2PP at almost all latitudes. Further investigations are needed to understand the causes of these differences, which are to be sought in the use of different microwindows, different cloud filtering thresholds and different interferences due to sequential or multi-target retrievals.

V.2 Water vapor (H2O)
Between 60 hPa and 0.2 hPa water vapor profiles from the four algorithms are consistent within 10% (Figs. S14-S21 and S63 of the supplementary material). In the troposphere, in particular between 200 and 70 hPa, differences between the four algorithms are larger than 40%, with the IMK mean profile being the highest and the OXF and BOL profiles being the lowest. Differences between the vertical resolutions of the algorithms, and hence difficulties in matching the hygropause, can be responsible for some of the observed differences.
In the mesosphere, in particular between 0.9 and 0.2 hPa, the BOL mean profile is the smallest, while for pressures lower than 0.2 hPa the OXF and IMK profiles are the largest. Errors due to non-LTE may play a major role here.

V.3 Ozone (O3)
Between 40 and 0.5 hPa ozone profiles from the four algorithms are consistent within 5-10% (see Fig. 3 and Figs. S22-S29 of the supplementary material), with the largest differences around the ozone peak. In particular, in the Southern hemisphere polar winter, IMK is biased low with respect to the other three algorithms by more than 10% at ~3 hPa (near the ozone peak) and is biased high by ~15% at ~20 hPa (see Fig. 3 and Figs. S26 and S27 of the supplementary material). This could be an effect of the different regularization approaches used by the four algorithms and of the choice of microwindows.
In the tropics, between 60 and 40 hPa ML2PP is biased high by about 15% with respect to the other three algorithms, while between 200 and 80 hPa ML2PP is always biased high with respect to IMK but biased low with respect to BOL and OXF. Outside the tropics, OXF and ML2PP are consistent within 10% between 500 and 0.1 hPa. ML2PP and OXF are biased high with respect to IMK for pressures larger than 100 hPa at all latitudes, and with respect to BOL for pressures larger than 200 hPa. The positive bias of ML2PP, OXF and BOL with respect to IMK in the troposphere, also confirmed by validation of MIPAS products with ground-based measurements [Laeng et al., 2013], can be attributed to the different microwindows used by the different retrievals.
For pressures smaller than 0.08 hPa ML2PP is biased low by more than 30% with respect to all other algorithms. This is also true between 0.2 and 0.08 hPa, but only with respect to IMK and partially BOL, while OXF is even smaller than ML2PP in this pressure range.
From Fig. 3 we can also see that the mean differences between the profiles of the four algorithms are generally smaller than the systematic errors of the individual retrievals, and this is also true for the other species. Indeed, the estimate of the systematic errors includes all errors that are not simply the propagation of the random measurement error through the retrieval. Depending on the length and time scale over which each systematic error varies, some of these errors may change sign and compensate each other when averaging over a long period and a latitude band.
The results found for ozone are consistent with those of the comparison of the ozone profiles retrieved by the four algorithms made in the framework of the Ozone Climate Change Initiative [Laeng et al., 2013].

V.4 Nitric acid (HNO3)
Between 100 and 5 hPa mean differences between the nitric acid profiles from the four algorithms are generally within 5-10% (see Figs. S30-S37 of the supplementary material for the different seasons and Fig. S64 for the summary plots), apart from in the tropics and in the Southern hemisphere winter and spring polar conditions.
In the tropics BOL is about 10% larger than the other three algorithms between 20 and 6 hPa and up to 20% smaller between 60 and 25 hPa. The OXF nitric acid value around 100 hPa is significantly larger than the values retrieved by the other three algorithms, probably due to difficulties in resolving the knee of the profile. In the Southern hemisphere polar winter, mostly ML2PP, but also OXF, are biased low by more than 50% with respect to BOL and IMK between 80 and 30 hPa (see Figs. S34-S35 and S64 of the supplementary material).
At midlatitudes, especially in winter and summer (see Figs. S30-S31 and S34-S35 of the supplementary material), the (BOL - ML2PP) differences change sign between daytime and nighttime profiles. As in the case of temperature, this behavior can be attributed to the horizontal inhomogeneities, which are accurately handled only in BOL retrievals. The diurnal change of the differences is significantly less evident in the (IMK - ML2PP) differences, indicating that the approach used by IMK for handling the horizontal inhomogeneities (i.e. modeling the gradient of temperature) accounts for only a part of the problem and that only BOL corrects for errors due to HNO3 gradients. The fact that a similar behavior is not seen for the other species can be explained by the following two considerations. First, the temperature bias caused by horizontal gradients does not cause biases in the minor constituent retrievals, because the non-linearities present in the radiative transfer of temperature and minor constituents are similar and the "effective" temperature, retrieved when neglecting the horizontal gradients, is the most suitable for the minor constituent retrieval. Second, with the exception of nitric acid, both the horizontal concentration gradients and the species-dependent non-linearities present in minor constituent retrievals are usually small [Carlotti et al., 2013].
At high altitudes, i.e. for pressures smaller than 4 hPa, ML2PP is more than 40% smaller than the other three algorithms at all latitudes. At low altitudes, i.e. for pressures larger than 100 hPa, ML2PP is more than 20% larger than the other three algorithms. Further investigations are needed to understand this behavior.

V.5 Methane (CH4)
Differences between the methane profiles from the four algorithms are within 10% between 50 hPa and 0.1 hPa (Figs. S38-S45 and S65 of the supplementary material), with the exception of the Southern hemisphere winter and spring polar conditions, where the differences of OXF and IMK with respect to ML2PP are larger.
For pressures larger than 50 hPa, IMK is 10-20% larger than ML2PP and OXF at all latitudes and seasons, and the difference is even larger with respect to BOL, which is about 10% smaller than ML2PP below 150 hPa. Around 30 hPa, IMK is smaller than the other three algorithms at almost all latitudes. The cause may be ascribed to the different microwindows used by IMK.
For pressures smaller than 0.2 hPa ML2PP is larger than IMK and smaller than OXF.

V.6 Nitrous oxide (N2O)
As for methane, differences are within ~10% in the pressure range 100-1 hPa (Figs. S46-S53 and S66 of the supplementary material), with the exception of the Southern hemisphere polar winter and spring, where the OXF and IMK nitrous oxide values are significantly larger than the ML2PP and BOL ones. For almost all latitudes and seasons, IMK is about 10-15% larger than the other three algorithms for pressures smaller than 2 hPa, while between 1 hPa and 0.8 hPa ML2PP is more than 20% smaller than the others.

V.7 Nitrogen dioxide (NO2)
The nitrogen dioxide profiles have very different retrieval ranges in the four algorithms and, even within the common retrieval range, consistent results are obtained only in a limited interval: differences between the algorithms are within 10% in the pressure range 10-0.3 hPa for nighttime measurements and in the pressure range 10-1 hPa for daytime measurements (Figs. S54-S61 and S67 of the supplementary material). The impact of non-LTE, which is taken into account only in IMK retrievals, can be responsible for some of the differences found. In the polar winter, OXF profiles are significantly different from the others.
For pressures larger than 10 hPa ML2PP has a positive difference up to 30-40% with respect to the other three algorithms, especially for daytime measurements.This could be explained by the error coming from a wrong assumption of the profile below the lowest retrieved altitude.

VI. Conclusions
The consistency of the different retrieval procedures implemented in the four algorithms used for MIPAS processing has been evaluated here, with the objective of obtaining information for the refinement of the algorithms themselves and for a better assessment of the systematic errors that affect the MIPAS products. Very similar results are obtained for years 2008 and 2009, and the conclusions can therefore be considered representative of MIPAS Optimized Resolution measurements.
The trade-off between retrieval error and vertical resolution of the MIPAS products varies for different atmospheric conditions and different algorithms, and these differences are explained by the different retrieval strategies and regularization approaches adopted by the four algorithms.
Despite the significant differences between the four algorithms, in the stratosphere the seasonal averages of their products are in general consistent within 1 K for temperature and within 5-10% for the VMR of the analyzed species, with differences smaller than the estimated systematic error of the individual retrievals. Differences larger than 10% are generally found in the troposphere, where clouds may play a major role, in the mesosphere, where non-LTE may have an impact, and in the Southern hemisphere winter polar conditions, where the retrieval error is significantly larger than in other conditions and hence the impact of the different retrieval constraints may also be larger.
Evidence of the impact of unaccounted horizontal inhomogeneities in ML2PP and OXF is seen, for the midlatitude bands, in the temperature (BOL - ML2PP) and (IMK - ML2PP) differences and in the nitric acid (BOL - ML2PP) differences. For the other species no evidence of the impact of the horizontal inhomogeneities is found in the (BOL - ML2PP) differences, indicating that the "effective" temperature retrieved by the 1D retrievals (ML2PP and OXF) is sufficient to compensate for horizontal temperature inhomogeneities in the VMR retrievals. Some of the observed differences can be explained by the known differences in the forward model or in the retrieval of the four algorithms (handling of horizontal inhomogeneities, selected microwindows, cloud filtering, regularization strategies), but some differences are still unexplained and deserve a deeper analysis. These findings, which nevertheless contribute to a better assessment of the systematic errors of MIPAS products, will be used as guidance for further investigations and future improvements of the algorithms. The final assessment of the accuracy of MIPAS products also requires the comparison of MIPAS measurements with accurate correlative measurements.

Figure 1: Ozone zonal means of single scan random error profile (left plot) and vertical resolution profile (right plot) of BOL (red curves), OXF (green curves), IMK (blue curves) and ML2PP (pink curves) for equatorial (curves with circles) and Southern hemisphere polar winter (curves with triangles) conditions.

Figure 2: Maps of the differences between the temperature zonal means of BOL and ML2PP (left plots), OXF and ML2PP (center plots), IMK and ML2PP (right plots) for June-July-August 2009 period; upper plot: daytime conditions, lower plot: nighttime conditions.

Table 1: Main differences between the four algorithms