Reflection Full Waveform Inversion with Decoupled Elastic-wave Equations in Inhomogeneous Medium

Reflection full-waveform inversion (RFWI) can reduce the nonlinearity of inversion providing an accurate initial velocity model for full-waveform inversion (FWI) through the tomographic components (low-wavenumber). However, elastic-wave reflection full-waveform inversion (ERFWI) is more vulnerable to the problem of local minimum due to the complicated multi-component wavefield. Our algorithm first divides kernels of ERFWI into four subkernels based on the theory of decoupled elastic-wave equations. Then we try to construct the tomographic components of ERFWI with only single-component wavefields, similarly to acoustic inversions. However, the S-wave velocity is still vulnerable to the coupling effects because of P-wave components contained in the S-wavefield in an inhomogeneous medium. Therefore we develop a method for decoupling elastic-wave equations in an inhomogeneous medium, which is applied to the decomposition of kernels in ERFWI. The new decoupled system can improve the accuracy of S-wavefield and hence further reduces the high-wavenumber crosstalk in the subkernel of S-wave velocity after kernels are decomposed. The numerical examples of Sigsbee2A model demonstrate that our ERFWI method with decoupled elastic-wave equations can efficiently recover the low-wavenumber components of the model and improve the precision of the S-wave velocity.


Introduction
The traditional theory of velocity recovery is divided into two parts: background velocity construction and amplitude projection. In the first part, the background velocity model is built with methods such as traveltime tomography (including ray-based or wave-based, first-arrival traveltime tomography and reflection traveltime tomography) and velocity analysis. The high-wavenumber velocity is recovered using different migration methods [Claerbout and Doherty, 1972;Gazdag, 1978;Stolt, 1978;Baysal et al., 1983;Yilmaz, 2001;Biondi and Symes, 2004]. The problem of the traditional theory is its difficulty to update the velocity of intermediate wavelengths [Jannane et al., 1989]. In order to update velocity of different scales of wavelength simultaneously, Lailly [1983] and Tarantola [1984] proposed the full waveform inversion (FWI) theory that matches all the wavefield information without any approximation between the calculated data and the observed data in model parameter inversion. However in practice, this method is highly non-linear, thus suffering from many local minima (cycle-skipping) [Bunks et al., 1995].
To address the local minima issue, there are two strategies. The first strategy is inversion using the lowfrequency information in seismic data and the second strategy is inversion starting with an accurate initial model. The long-wavelength background velocity coverage with low-frequency data [Sirgue and Pratt, 2004] can reduce the dependence of FWI on the initial model. Bunks et al. [1995] proposed the multi-scale full waveform inversion method to combine the frequency of data with the scale of velocity. The multi-scale FWI method divides seismic data into different frequency and adopts the inverted results of low-frequency data as the initial velocity of the inversion of higher-frequency data. If there are sufficient low-frequency data, the multi-scale method can alleviate effects of an inaccuracy initial velocity. In general, low-frequency sources are not always available and the seismic data can go down only to approximately 5 Hz [Baeten et al., 2013;Shin and Cha, 2008;Wu et al., 2014]. So, if an accurate initial velocity model is crucial to FWI, more and more research is focusing on how to construct a good initial model and how to reduce the nonlinearity of FWI [Biondi and Almomin, 2013a;Biondi and Almomin, 2013b;Hardy, 2013;Djebbi, 2014].
According to the equation = (2 / ₀)cos ( /2) which is derived from the ray and Born approximate scattering theory [Virieux and Operto, 2009], where ₀ is the reference velocity, is the double of the incidence angle, the wavenumber is proportional to the frequency and is the direction vector of , the longwavelength background velocity may correspond to the data near zero frequency. Wu et al. [2014] simulated the pseudo-low-frequency data by Hilbert transformation and proposed the envelope inversion method. The frequency of envelope data is independent of the seismic frequency band, focusing around zero frequency.
Similarly, Hu [2014] proposed beat tone inversion using two adjacent high frequencies to obtain a good initial model. Bharadwaj et al. [2016] proposed a bump data function to obtain the long-wavelength structure.
Alternatively, on the basis of the equation = (2 / ₀)cos ( /2) , fixing the frequency of seismic data, the wavenumber decreases as (from 0° to 180°) increases. So wave components with large scattering angles such as direct waves, diving waves and refraction waves mainly update the background velocity of shallow layers, while reflection waves with smaller scattering angles mainly recover the perturbation velocity of high wavenumbers in the middle and deep layers. Accordingly, Mora [1989] divided the gradient of FWI into migration components and tomographic components corresponding to high-wavenumber and low-wavenumber respectively. However, the energy of migration components is far stronger than that of tomographic components [Tang et al., 2013]. Therefore, FWI mainly just updates the high-wavenumber velocity. With respect to reflection data, the (gradient) model update of FWI is similar to a reverse time migration (RTM) image of the data residual [Wang et al., 2018b]. Therefore, Díaz and Sava [2014] adopted a directional filter based on Poynting vectors in the gradient computation to preserve the smooth components of the gradient, thereby spreading information away from the sharp boundary. However, as the initial velocity model of FWI is smooth, there are only the downgoing waves in the wavefield rather than the upgoing waves. To address this issue, Xu et al. [2012aXu et al. [ , 2012b proposed a reflection FWI (RFWI) method with a migration/demigration [Symes and Kern, 1994] process to generate the upgoing waves and recover the background velocity along the tomographic components. For the issue that RFWI with the least square objective function of amplitude can easily fall into local minima, Chi et al. [2015] developed a correlation-based RFWI method with better convexity of the objective function.
Over the past few years, most geophysicists have focused on the development of acoustic FWI methods, based on the assumption of acoustic medium only. However, S-waves also play an important role in the illumination of targets under a gas cloud, salt body and fractures, motivating the development from singleparameter acoustic-wave FWI to multi-parameter elastic-wave FWI (EFWI) [Tarantola, 1986;Mora, 1987;Crase et al., 1990. However, between the inversion of multiple parameters trade-off is inevitable, which increases the nonlinearity of inversion. Wang et al. [2018b] decomposed kernels of gradient by decoupling elastic-wave equations to update the P-wave velocity and S-wave velocity separately like the acoustic case. In addition, the update for the S-wave velocity in FWI intrinsically brings even higher wavenumbers than P-waves Zhanyuan Liang et al.
because of the lower velocity of S-waves. Thus, optimizing the low-wavenumber components for the S-wave velocity is even more crucial in preventing the elastic FWI from converging to local minima [Wang et al., 2018a].
Methods of elastic-wave equation decoupling are generally divided into two categories. The first category of methods uses Helmholtz decomposition for elastic wavefield separation [e.g., Morse and Feshbach, 1953;Clayton, 1981;Mora, 1987]. These methods not only increase the numerical complexity and computational cost of the elastic-wave simulation, but also affect the amplitude and phase of the wave field extrapolation [Chang and McMechan, 1987;Sun and McMechan, 2001;Sun et al., 2011;Du et al., 2012Du et al., , 2014Wang and McMechan, 2015;Cheng et al., 2016]. The second category of decoupling method starts from a set of decoupling equations which maintain the vector characteristics and elastic characteristics of the seismic wave propagation. Ma and Zhu [2003] decoupled the P-and S-wave equations similarly to wavefield extrapolation by defining the P-wave stress. Xiao and Leaney [2010] used different but equivalent equations to decompose the elastic-wave field for RTM of vertical seismic profiling data. Subsequently, several similar ideas were proposed to separate the P-and S-waves for imaging and migration velocity analysis [Shabelansky et al., 2015;Wang and McMechan, 2015;Du et al., 2017].
In this paper, we propose a set of decoupled elastic-wave equations in an inhomogeneous medium to reduce the crosstalk artifact in the elastic-wave reflection full waveform inversion (ERFWI) after kernels are decomposed. We first introduce kernels of RFWI based on the first-order Born approximation. Accordingly, kernel of ERFWI is divided into four subkernels based on the method of Wang et al. [2018b]. After kernels are decomposed, the ERFWI of S-wave velocity can be recovered with only S-wavefield, similarly to acoustic RFWI.
The problem is that ERFWI begins with an initial velocity model which is actually for an inhomogeneous medium. Hence, we derive a set of equations to address the decoupling in an inhomogeneous medium to obtain the S-wavefield with less P-wave components. Then we test the new decoupled method with a simple inhomogeneous model, showing that it can enhance the accuracy of S-wavefield and it can further improve subkernels of ERFWI. In the numerical example session, we apply ERFWI with the conjugation gradient method to the Sigsbee2A model to demonstrate the effectiveness of our proposed method.

Kernels of Reflection Full Waveform Inversion
The purpose of reflection full waveform (RFWI) inversion is to update low-wavenumber velocity along to the "rabbit-ear" path [Xu et al., 2012]. The key of RFWI is to calculate the contribution of background velocity perturbation to wavefield perturbation. So, we firstly decompose the velocity into background component and perturbation component in the process of kernels derivation. Correspondingly, the wavefield is also decomposed into background and perturbation.

Decomposition of model parameters into background ₀( ) and perturbation
( ) is as below: (1) Similarly, wavefield ( , ) can be decomposed into background ₀( , ) and perturbation ( , ): (2) According to the first-order Born approximation, the scattering wavefield is far weaker than the incident wavefield, i.e. ( , ) = ( , ). Meanwhile, we express the integral solution of the perturbation wavefield in the frequency domain in order to void the derivative calculation [Xu et al., 2012]: where represents frequency, ( ) is the source signature, ₀( , ; ') and ₀( ', ; ) are Green's function of source wavefield and receiver wavefield respectively. ₀ means the Green's function for the background velocity and in ( ', ; ), ' is the scattered point and is the source position. According to the reciprocity of Green's function, and can be exchanged: Substitute Equation 5 into Equation 4: The Green's function of first-order scattering wavefield can be expressed as: When = , is the location of geophone, we can get: The Equation 8 is the kernel of RFWI which represents the contribution of wavefield perturbation corresponding to the background model parameters perturbation.

Elastic-wave Reflection Full Waveform Inversion and Kernels decomposed
The objective function of ERFWI is established as below: where ( , ; ) represents the calculated scattering wavefield at the location of receivers, while ( , ; ) represents the observed wavefield. The gradient of ERFWI is the derivative of the objective function with respect to background model parameters, Substituting equation 8 into equation 10, we get the gradient expression of ERFWI which is the summation of the cross-correlation between the source-side background wavefield and the receiver-side scattered wavefield, and the cross-correlation between the source-side scattered wavefield and the receiver-side background wavefield. For simplicity, equation 10 can be expressed as: where represents the source-side background wavefield, represents the receiver-side background wavefield, represents the source-side scattered wavefield, and δ represents the receiver-side scattered wavefield.
As we know, there is only PP wave mode in the acoustic RFWI and the inverted parameter is only , so no crosstalk exists. However, due to the co-existence of P-and S-waves, ERFWI suffers from crosstalk noise, especially for the S-wave velocity [Wang et al., 2018b]. Under the assumption of homogeneous medium, the elastic wavefield can be completed decoupled into P-wave wavefield and S-wave wavefield [Ma and Zhu, 2003;Xiao and Leaney, 2010]. According to the decoupled wavefield, we decompose the gradient of ERFWI into four parts: where: (i) is the subkernel through the summation of the cross-correlation between the source-side P-wave background wavefield and the receiver-side P-wave scattered wavefield, and the cross-correlation between the source-side P-wave scattered wavefield and the receiver-side P-wave background wavefield; (ii) is the subkernel through the summation of the cross-correlation between the source-side P-wave background wavefield and the receiver-side S-wave scattered wavefield, and the cross-correlation between the source-side P-wave scattered wavefield and the receiver-side S-wave background wavefield; (iii) is the subkernel through the summation of the cross-correlation between the source-side S-wave background wavefield and the receiver-side P-wave scattered wavefield, and the cross-correlation between the source-side S-wave scattered wavefield and the receiver-side P-wave background wavefield; (iv) is the subkernel through the summation of the cross-correlation between the source-side S-wave background wavefield and the receiver-side S-wave scattered wavefield, and the cross-correlation between the source-side S-wave scattered wavefield and the receiver-side S-wave background wavefield.
Here we assume the density is constant, so that inversion of just P-wave velocity and S-wave velocity is needed, and inversion effects of the two model parameters will be considered. Firstly, we assume ≠ 0, ≠ 0 and hence simply invert the P-wave velocity without the S-wave velocity. According to the radiation pattern (Figure 1), we can see there is only PP scattered wave, without PS converted wave. So in this case, we need to consider only the subkernel in the gradient of ERFWI (Figure 2), similarly as in the acoustic RFWI.

5
Decoupled ERWI in Inhomogeneous Medium Figure 1. Radiation pattern of perturbation with P-wave incidence.
Next we assume = 0, ≠ 0, meaning only S-wave velocity will be inverted instead of P-wave velocity. In this case, P-wave and S-wave exist simultaneously in the wavefield ( Figure 3) and all the four subkernels , , , in the gradient of ERFWI together cause the "cross-talk" due to the superposition of different wave paths. As shown in Figure 4, which takes the wave path of receiver-side as an example, the subkernel shown by the green ellipse area can be contaminated by the subkernel shown by the purple ellipse area. The subkernel with converted waves may bring many high-wavenumber information which undermines the inversion of the low-wavenumber velocity.
6 Figure 2. A schematic of ERFWI with only inversion (receiver-side).  The method of decoupled elastic-wave equations can reduce the crosstalk problem of S-wave velocity inversion in ERFWI [Wang et al., 2018a;Wang et al., 2018b]. The decoupled method can decompose the elastic wavefield into a P-wavefield and S-wavefield, which are then inverted separately.
We establish a velocity model with = 0, ≠ 0 to invert only . The P-wave velocity of the model is homogenous and there is a layer to generate the scattered wave and converted wave in the deep area of the S-wave velocity model. By decomposition of the elastic wavefield into P-wavefield and S-wavefield, we can calculate subkernels separately as shown in Figure 5a, 5b, 5c and 5d. From the Figure 5, we can see that Figure 5b and 5c mainly bring high-wavenumber noise to ERFWI and the Figure 5d is beneficial to the update of low-wavenumber components. As shown in Figure 6, the decoupled kernel of contains only withwavefield. Now, the kernel of can be expressed: 7 Decoupled ERWI in Inhomogeneous Medium   Notice that the P-wavefield and the S-wavefield can be completely decoupled only under the homogenous medium assumption. As a matter of fact, most models are inhomogeneous to some extent, which means homogenous decoupled theory is not sufficient to decompose inhomogeneous elastic wavefield. As a consequence, the decoupled S-wavefield will contain P-wavefield components, as shown in Figure 8a and 8b, so the will be contaminated by other kernels. In this paper, we propose a set of elastic vector decoupled equations for inhomogeneous medium to improve the accuracy of S-wavefield for the ERFWI of low-wavenumber .

Elastic Vector Decoupled Equations in Inhomogeneous Medium
The second-order elastic-wave equation in inhomogeneous medium is written with the form of matrix vector symbols as: where ü = ² / ² is the vector wavefield of acceleration, ∇ = [ , , ] represents the gradient operator, ∇· where = ∇ · . Equation 15 implies the corresponding relation between the P-wave and the velocity perturbation: → (PP scattered wave by perturbation, the same below). → , → .
Take the calculation of the curl of Equation 14 [Li et al., 2018]: where S = ∇× . Equation 16 shows the corresponding relation between the S-wave and the velocity perturbation: → , → .
Based on Helmholtz's theory, we can decompose the elastic-wave wavefield into P-and S-wave wavefield as: Different from other methods of decoupled elastic-wave equations [Ma et al., 2003;Li et al., 2007;Tang and McMechan, 2018], there is no inhomogeneous-term in the new S-wave decoupled equation (Equation 23) and hence most → exist in the P-wavefield. Therefore the new decoupled equations are beneficial to improve the accuracy of the S-wavefield to reduce contamination on from other kernels.
Next consider an inhomogeneous velocity model, where the velocity increases gradually along the radius and the minimum velocity value is at the center of the model. The P and S-waves are simulated using a Ricker source wavelet with peak frequency of 25Hz and source location is at the center of the model. Snapshots of S-wavefield are shown in Figure 7.  Figure 7e shows the difference between Figure 7a and 7c, and Figure 7f represents the difference between Figure 7b and 7d. Obviously, the difference mostly comes from the P-wave component. Below, we call the elastic-wave decoupled method with inhomogeneous-term in the S-wave equation as "conventional decoupled method", and call the elastic-wave decoupled method without inhomogeneous-term in the S-wave equation as "our decoupled method". To demonstrate the effectiveness of our decoupled ERFWI, we compare the two types of transmission kernels (gradient) with a single source-receiver pair in an inhomogeneous medium (the velocity increases with depth). The source and receiver positions are located on the top and bottom of the model, respectively. The P and S-waves are simulated using a Ricker source wavelet with peak frequency of 13Hz. As we can see from Figure 8, the conventional decoupled kernel (Figure 8a) produces more high-wavenumber artifacts than our decoupled method (Figure 8b). The difference (Figure 8c) between the two methods shows the high-wavenumber contamination due to the coupled effect from the P-wave.

Numerical Example
In the following section, we apply our decoupled method to a resampled Sigsbee2A model shown in Figure 9a and 9b. The dimension of the resampled model in the horizontal and vertical direction is * = 351 * 184. The grid size is 10m in both directions. In the initial model, velocity increases linearly with depth (Figure 9c and 9d).
During the inversion, we keep the density constant (  show gradients for the conventional and our decoupled ERFWI, respectively. We can see that Figure 10b has more low-wavenumber information than the conventional decoupled ERFWI gradient (Figure 10a). To emphasize this aspect, we show the wavenumber spectral distribution of the two gradients, as shown in Figure 10c and 10d. It can be seen that the energy in Figure 10d focuses more on low-wavenumber components than that in Figure 10c. These tests suggest that our new decoupled ERFWI method is feasible and superior for updating the low-wavenumber components for the S-wave velocity. Figure 11a shows the Sigsbee2A S-wave velocity model after resampling and its imaging result of reverse time migration (RTM) with cross-correlation imaging conditions (Figure 11d). Only when the initial velocity model is suited, the RTM image can be accurately focused. Figure 11e depicts the RTM images using the linear initial models ( Figure 11b). Because the initial model is far away from the actual model, the travel-time information is inaccurate and the migration result is not focused, especially for the three diffraction points below the model. In addition, some deep reflectors are deeper than their true location, due to higher velocity value of the initial model. Figure11c and   11f show the updated low-wavenumber background models of S-wave velocity after 30 iterations ( Figure 11c) and its migration result (Figure 11f) respectively. Accordingly, the RTM image, as illustrated by Figure 11f, is significantly improved with our decoupled ERFWI results. Besides, Figure 11f shows that the three diffraction points are better focused and are nearly the same as those in Figure 11d. Vertical velocity profiles from the different distance of the model are shown in Figure 12. The linear initial velocity (blue solid line) deviates from the true one (black solid line), and after our decoupled ERFWI, the trend of the inverted velocity curve (red solid line) is closer to the true curve. To further prove the ability of our new method in updating low-wavenumber velocity, we perform elastic-wave full waveform inversion (EFWI) by taking the inverted velocity by our decoupled ERFWI as initial velocity. All the parameters of numerical simulation are the same as the RFWI above. Inverted velocities after 96 iterations with EFWI are shown as Figure 13c (the inverted P-wave velocity) and 13d (the inverted S-wave velocity). After normalizing, the convergence curve of errors in the process of EFWI is shown in Figure 14. From Figure 14, we can see that the convergence of errors is fast and stable without falling into the local minimum, which shows the accuracy of the low-wavenumber velocity obtained by our decoupled ERFWI. Vertical velocity profiles from the different distance of the P-wave velocity and the S-wave velocity are shown in Figure 15 and Figure 16, respectively.
After EFWI, we can get the ideal velocity profiles.

Conclusions
A good initial velocity model with accurate low-wavenumber components is the key factor for FWI. According to the equation = (2 / 0)cos ( /2) , the update of low-wavenumber velocity mainly depends on low-frequency information in the seismic data and tomographic components in gradients (cross-correlation of wavefield with scattering angle close to 180°). However in reality, there is little low-frequency information below 5Hz in the seismic data. To address this problem, in this paper, we construct the initial velocity model through RFWI method that is dependent on tomographic components, which is more in line with the physical meaning between wave propagation and velocity imaging.
The parameter of the acoustic method is only P-wave velocity with PP scattered wavefield. However, not only PP scattered waves but also PS converted waves exist in the elastic wavefield (Figure 3), there will be the issue of crosstalk between the inverted P-wave velocity and inverted S-wave velocity (Figure 4). In particular, the inversion of S-wave velocity is more susceptible to the coupling effects of elastic wavefield than P-wave velocity. To overcome that, we calculate the S-wave velocity with only S-wavefield according to the theory of elastic-wave equations decoupling [Wang et al., 2018b] which is beneficial to reduce the P-wave components of S-wavefield. This requires a sufficiently high-precision S-wavefield, but in inhomogeneous medium, the decoupled S-wavefield is still subject to strong P-wave components. Hence, we further derive a set of equations suitable for the decoupling in an inhomogeneous medium to reduce the coupling effects of P-wavefield on S-wavefield. The numerical results of Sigsbee2A model demonstrate that our decoupled ERFWI can provide an accurate initial velocity model for FWI.