“ PRELIMINARY VALIDATION OF LAVA BENCHMARK TESTS ON THE GPUSPH PARTICLE ENGINE „

Lava ﬂow modeling is important in many practical applications, such as the simulation of potential hazard scenarios and the planning of risk mitigation measures, as well as in scientiﬁc research to improve our understanding of the physical processes governing the dynamics of lava ﬂow emplacement. Existing predictive models of lava ﬂow behavior include various methods and solvers, each with its advantages and disadvantages. Codes differ in their physical implementations, numerical accuracy, and computational efﬁciency. In order to validate their efﬁciency and accuracy, several benchmark test cases for computational lava ﬂow modeling have been established. Despite the popularity gained by the Smoothed Particle Hydrodynamics (SPH) method in Computational Fluid Dynamics (CFD), very few validations against lava ﬂows have been successfully conducted. At the Tecnolab of INGV-Catania we designed GPUSPH, an implementation of the weakly-compressible SPH method running fully on Graphics Processing Units (GPUs). GPUSPH is a particle engine capable of modeling both Newtonian and non-Newtonian ﬂuids, solving the three-dimensional Navier–Stokes equations, using either a fully explicit integration scheme, or a semi-implicit scheme in the case of highly viscous ﬂuids. Thanks to the full coupling with the thermal equation, and its support for radiation, convection and phase transition, GPUSPH can be used to faithfully simulate lava ﬂows. Here we present the preliminary results obtained with GPUSPH for a benchmark series for computational lava-ﬂow modeling, including analytical, semi-analytical and experimental problems. The results are reported in terms of correctness and performance, highlighting the beneﬁts and the drawbacks deriving from the use of SPH to simulate lava ﬂows.


INTRODUCTION
The mechanisms controlling lava flow emplacement are not yet fully understood, due to the complexity of the fluid, its non-Newtonian rheology, and the strong effect on the rheology by the physical and chemical properties of the fluid (temperature, chemical composition, degree of crystallization, etc.) and their evolution over time.Mathematical modeling and computer simulations can play an essential role in improving our un-derstanding of the lava flow patterns, its morphology, and thermal evolution [Del Negro et al., 2008;Ganci et al., 2018;Vicari et al., 2009].The complex nature of the fluid, aspects such as free surface and irregular topographies, and phenomena like phase transition and the consequent formation of levees and tunnels make simulation of lava flows an extremely challenging task for Computational Fluid Dynamics (CFD).In the modeling of lava flow hazards [Cappello et al., 2011[Cappello et al., , 2016a;;Del Negro et al., 2013, 2016;Ganci et al., 2013;Hérault et al., 2009], common approaches to the simulation of lava flows start by reducing the complexity with a number of different strategies, such as reduced dimensionality, simplified thermal or dynamic models, or the use of stochastic approaches with little or no physical modeling [Costa and Macedonio, 2005].These simplifications allow easier implementations and higher performance, and while the results may still be useful for real-time forecasting [Cappello et al., 2016b], risk mitigation [Scifoni, 2010] and the production of long-term scenarios [Del Negro et al., 2013], they are inadequate for a more thorough study of the behavior of the fluid and the laws underlying its rheology, which require the detailed modeling of the full three-dimensional flow and its rheological aspects.
The Smoothed Particle Hydrodynamics (SPH) method, recently introduced in the field of Computational Fluid Dynamics (CFD), is a Lagrangian mesh-free particlebased method that allows a three-dimensional modeling of the fluid, taking into account in an efficient way many physical aspects that are typical of lava flows, such as the free surface, solidification fronts, the high dynamicity of the fluid and the interaction with irregular boundaries, such as solid lava emplacements and natural topographies.One main drawback of the SPH method is low accuracy, since in its common form it is a first order method [Monaghan, 2005].A second drawback comes from the frequent adoption of a weakly-compressible model [Hérault et al., 2010] instead of a fully incompressible one.As stated by Cordonnier et al. [2016], this choice affects the quality of the simulation, requiring an adjustment of parameters in order to improve the results.Moreover, the adoption of a weakly-compressible model affects the simulation performance, since the time stepping is driven by the speed of sound [Zago et al., 2018].On the other hand, a weakly-compressible model allows a complete parallelization of the computations that can be run on massively parallel hardware like Graphics Processing Units (GPU), giving an advantage in terms of performance [Hérault et al., 2010].
The TecnoLab at the Istituto Nazionale di Geofisica e Vulcanologia (INGV) in Catania has created the GPUSPH particle engine [Hérault et al., 2011;Bilotta, 2014;Zago et al., 2017], the first implementation of the Weakly-Compressible Smoothed Particle Hydrodynamics (WCSPH) method to run entirely on Graphic Processing Units (GPUs).To better handle the computational needs of lava flow simulations, GPUSPH has been extended to distribute computations across multiple GPUs [Rustico et al., 2012], even across separate nodes in a cluster [Rustico et al., 2014], and it includes a semi-implicit integration scheme for highly viscous flows [Zago et al., 2018].
GPUSPH has already been validated in a number of contexts, both against classical theoretical problems, and real-world applications [Wei et al., 2015;2016].However, we need to evaluate the accuracy and robustness of the particle engine results in the context of lava flow simulation, starting from the benchmark tests introduced by Cordonnier et al. [2016].These benchmark tests of growing complexity (from a simple dam break without thermal effects to a physical experiment including both dynamic and thermal aspects) can be used to validate lava flow models in terms of completeness, accuracy and computing performance.In their work, Cordonnier et al. [2016] also compare the models that represented the state of the art at that time, and illustrated the respective main features, advantages and disadvantages.Here we present preliminary results obtained with GPUSPH, discussing the influence of the model and formulation on the accuracy and computational performance of the results, and the possible strategies to improve them.

SPH IN GPUSPH FOR LAVA
The SPH method discretizes the fluid by means of particles that act as interpolation nodes.Each particle carries information about a small volume of fluid, such as velocity, position, density, mass, temperature and so on, and moves according to equations of motion.The basis of the method relies on the SPH smoothing, that is the way in which the fields are interpolated at the position of the particles [Monaghan, 2005].For each particle the effect of its neighbors is weighted by the value of a function called smoothing kernel, that is centered in the particle position and is usually indicated with W(r,h).Here, r indicates the distance from the neighbouring particle and h is a parameter called smoothing length.The smoothing kernel is usually chosen with a compact support, which radius is called influence radius, usually determined as a multiple of h.
The dynamics of fluid bodies is modeled according to the continuity equations for mass: (1) where ρ is the density, u the velocity and D/Dt the total derivative with respect to time, and momentum (Navier-Stokes equations):

ZAGO ET AL.
(2) where P is the pressure, μ the dynamic viscosity, and G represents external forces, such as gravity, and the thermal evolution is described by the heat equation where T is the temperature, c p is the specific heat at constant pressure and κ is the thermal conductivity.
Even though we are considering the incompressible form of the Navier-Stokes equations, solving a fully incompressible fluid model would be rather complex and computationally expensive, since it would imply to solve the pressure form Poisson equation.Moreover, incompressible SPH models are known for exhibiting unstable behaviors in the density field and issues dealing with free surfaces [Ihmsen et al., 2014].As an alternative, in SPH one uses frequently a weakly compressible formulation where the pressure is derived from the density using an equation of state, such as [Cole, 1948]: (4) with ρ 0 the at rest density, c 0 the speed of sound and γ the polytropic constant.By construction, the compressible model allows to integrate independently each particle, making possible the adoption of explicit integration schemes without the need to solve any linear system.While this constitutes an advantage in terms of parallelizability of computations [Hérault et al., 2010], it also introduces a drawback in terms of time-stepping.
Explicit integration schemes exhibit stability requirements concerning the maximum allowed time step.The latter is in fact linked to several factors like the speed of sound, the viscosity, the maximum acceleration and the thermal diffusivity, by some CFL-like conditions [Monaghan, 1989;Monaghan and Kos, 1999;Morris et al., 1997, and references within]: (5) where a β is the acceleration of the particle β, c β is the speed of sound at density ρ β , and C 1 , C 2 , C 3 and C 4 are stability constants.In GPUSPH we use C 1 =C 2 =0.3, C 3 =0.125 and C 4 =0.1.The maximum allowed time step for the whole system is then the minimum time step over all particles, Dt=min β Dt β .
Because of these constraints, the speed of sound used for SPH simulations is usually taken smaller than the physical one, but still enough to prevent a compressible behavior.Weak compressibility is achieved as long as the speed of sound is large enough to bound density variations within certain ranges.A common choice is to take the sound speed one order of magnitude above the maximum velocity experienced in the flow, so that the maximum variation in density is contained within 1% of the fluid density.When the simulation involves high fluid columns, the hydrostatic velocity (i.e. the maximum velocity that a particle at the highest position would achieve in free-fall) can be predominant with respect of the maximum flow speed, and must be used instead.This is important to eliminate the vertical collapse of the fluid column due to excessive compressibility.Therefore, choosing a high speed of sound can help to get closer to the incompressible behavior, which would be more appropriate for lava, but such improvement is paid with a small time step, and then in terms of computational time and numerical precision.

SURFACE THERMAL DISSIPATION
Thermal dissipation is modeled both at the contact with the ground, using Equation ( 3) and on the free surface.The latter occurs according to two phenomena: 1. Thermal radiation: according to Stefan-Boltzmann law, we express the radiated heat per unit surface as: (6) with K B the Stefan-Boltzmann constant, e the emissivity, m the mass and T a the air temperature.

Air convection:
We do not model air particles, but we account the heat lost due to air convection by means of a convection coefficient η, according to the following law, per unit surface: (7)

SPH DISCRETIZATION OF THE EQUATIONS
The dynamical equations seen above are discretized according to the SPH method as described in Bilotta et al. [2016], Hérault et al. [2011] and Zago et al. [2018] obtaining: (8) for the continuity Equation ( 1 The two formulas for the surface thermal dissipation, introduced in section 2.1, are applied to surface particles that are identified as described in Bilotta et al. [2016] and Hérault et al. [2011].To apply these two per unit surface formulas, a simple way of computing the particle surface would be to consider it a square with the average inter-particle spacing as particle side.But we are dealing with flows that usually evolve towards a condition of under-resolved flow, then we would underestimate the surface in the regions where the simulation becomes more rarefied (i.e.where we have a thin flow).
To face this problem we compute the particle surface using the numerical volume [Hu and Adams, 2006] (11) and considering a spherical particle volume and a circular particle surface.

SPH SMOOTHING KERNEL
Given a symmetric SPH smoothing kernel W, its gradient can be written as: (12) where x αβ = x α -x β , and h is the kernel smoothing length.By choosing a kernel for which (13) has an analytical expression, and given For the simulations presented in this work we adopt a Wendland kernel [Wendland, 1995], defined as where, working in three dimensions, C w =21/(16 πh 3 ) and C F =5 C w /(8h 2 ).

BOUNDARY CONDITIONS AND BOUNDARY MODEL
From a mechanical point of view, at the interfaces between fluid and either walls and ground we impose a no-slip condition; this is obtained prescribing normal and tangential velocity along the analytical boundary: (15) ( 16) where v w is any physical sliding velocity of the wall (in our examples, we will always have v w =0).In our SPH discretization, physical boundaries are implemented according to the Dummy boundary model [Adami et al., 2012], where solid walls are discretized with multiple layers of boundary particles, enough to cover a full influence radius and thus complete the kernel support for fluid particles adjacent to the boundary.No-slip boundary conditions with dummy boundaries is obtained assigning to the boundary particles a velocity obtained as v w plus the opposite of the Shepard-averaged velocity of the neighboring fluid: (17) (where F represents the set of fluid particles) while the density is computed to achieve a pressure that matches the Shepard-averaged pressure of the neighboring fluid: (18) In order to reduce the accumulation of numerical error in the computations for the boundary model, all summations are performed using the Kahan method [Kahan, 1965].For the thermal model we use absorbing boundary conditions, implemented using the sponge layer approach: the boundary is assumed to have a sufficiently large thickness H s , through which heat propagates using the standard heat equation.Given a one-dimensional reference system with the origin on the boundary interface and oriented along the inwards normal n, the conditions for the temperature T(n,t) (with n the wall depth coordinate, and t the time) can be described analytically as where T w is the initial physical temperature of the wall [Hérault, 2008].H s must be chosen large enough to guarantee the last condition over all the simulation time.

BENCHMARKING THE GPUSPH MODEL
The SPH discretization, as any numerical method, introduces some errors that lead to a discrepancy between the simulation results and the exact solutions, which can be reduced adopting finer resolutions.In the following we will analyze the results of the simulation in terms of convergence with respect to the discretization fineness.In particular, the outputs of GPUSPH have been compared to the first three benchmark tests introduced by Cordonnier et al. [2016].

BM1: VISCOUS DAM-BREAK
Dam breaks are one of the simplest test cases in the field of CFD, and is described as a defined amount of confined fluid that is suddenly freed from one side and allowed to spread onto an horizontal plane, driven by gravity.In the simplest configurations, validation is made against the progress of the front of the flow over time.
In Cordonnier et al. [2016], the initial configuration of the fluid is a box with length L = 6.6m, height H = 1m and width W = 6.6m.The fluid has density ρ = 2700 kg/m 3 and a dynamic viscosity μ = 10 4 Pa•s.According to Balmforth et al. [2007] and Saramito et al. [2013], the evolution of the front over time is analytically described by (21) where T is the characteristic time of the problem, defined as T = (L/H) 2 (μ/ρgh).In our case, we have T = 16.45 s.

1) Implementation in GPUSPH:
The simulation do-main consists of a base plane and three walls containing the fluid.The fourth side of the main fluid box is left free to allow the fluid to slump, simulating the sudden opening of a gate (Figure 1).The solid walls are modeled using dummy boundaries, as introduced in section 2.3.The density diffusion approach introduced by Molteni and Colagrossi [2009] is used to smooth out the noise that naturally develops in the density field.For the speed of sound, the usual choice in WCSPH is to pick a value c 0 around 10 or 20 the maximum velocity value.Our experiments however show that much lower errors at a given spatial resolution can be obtained by using a higher speed of sound.There are diminishing returns in raising the value of c 0 , though, due to the smaller time-step, and even a reversal when the time-step becomes too small for the available precision.The main results that we illustrate are thus obtained with a speed of sound 100 times higher than the hydrostatic velocity, resulting in c 0 = 443 m/s.We measure the front of the fluid as the position of the furthest particle in the flow, plus Dp/2 to take into account the particle volume.Results for the front position at the three resolutions, compared to the front position predicted by Equations (21) are presented in Figure 2. We observe that GPUSPH slightly overestimates the theoretical solution for the front progress, and that the error becomes smaller at higher resolutions.Table 1 shows the errors obtained as the difference between the simulated and theoretical front position in respectively the low, intermediate and high resolution cases at time t, and the error ratios, obtained as the error at lower resolution over the error at higher resolution.
From Table 1 and Figure 3 we can assess the convergence of the model and that the order of convergence grows over time, being in the best case around a second order trend.The latter result is due to the low-resolution simulation becoming under-resolved as the flow progresses, due to the decrease in thickness, leading to larger errors and to the formation of artifacts, as shown in Figure 4, illustrating the situation for BM1 IR at t = 500s: we can observe that the front profile is no more well reconstructed and some artifacts are arising, like the formation of a second head and the detachment of the fluid from the ground.For BM1 HR we can further observe that at t = 45s there is a temporary inversion, with the simulation being slightly behind the theoretical result.This may be explained by the change in the expression of the analytical law, that presents a small discontinuity at t = 2.5T = 41.125s,giving a bigger value from the right hand side.
Finally, concerning the fluid height, we have from Saramito et al. [2013] that for short times the fluid height at the dam position and the end of the reservoir should remain constant (i.e.h(t,0)/H = 0.684 and h(t,-L)/H = 1) while the surface shape rearranges, whereas for long times (t >> 2.5T) the height at x = -L and x = 0m is the same, and evolves according to the law t -2 .For short times (less than around 100s) all of the three different resolution simulations match the analytical result, with an error of less than Dp.
3) Simulation performance: All the simulations were performed on a NVIDIA Titan X GPU (Maxwell architecture); with the adopted spatial discretization, Error at t = 10 s Error at t = 500 s linear quadratic FIGURE 3. Logarithmic plot of the error for BM1 over the spatial discretization interval for different times.The convergence rate is higher at longer times due to the under-resolved condition of lower resolution simulations.

BM2: INCLINED VISCOUS ISOTHERMAL SPREAD-ING
This benchmark test regards the simulation of a fluid spreading onto an inclined plane, and follows the analytical solution derived by Lister [1992].The fluid has a Newtonian rheology and is injected at a constant rate Q from a point source through the plane.We are interested in the evolution of the down-slope and cross-slope extent (L d and y p , respectively as shown in Figure 5) of the flow.The plane is inclined by an angle α = 2.5° and the fluid, with kinematic viscosity ν = μ/ρ = 11.3 • 10 -4 m 2 /s, flows at a rate Q = 1.48 • 10 -6 m 3 /s.
The density is left free by Cordonnier et al. [2016], and we have opted to use the same value adopted in BM1, ρ = 2700 kg/m 3 .According to Cordonnier et al. [2016], the analytical solution at long time for the downslope extent over time is given by: ( 22) and for the cross slope extent, at long times, is given by:  quently very small time steps.On the other side, having a too large piston implies more stresses on the fluid that would lead to higher disorder in the flow, with consequent increase in the discretization error [Monaghan, 2005].Moreover a large piston surface, coupled with the small value of Q, can determine small particles velocities that can be affected by numerical precision during the integration process.
In order to make a simpler definition of the geometry, with reduced numerical rounding, the floor is parallel to the xy coordinate plane, and gravity is oriented by an angle of -α.A side view of the simulation at t = 15s is presented in Figure 6.
As in BM1, the speed of sound is chosen so as to minimize compressibility while avoiding numerical in-stabilities due to loss of precision.For BM2, we use c 0 = 125,5 m/s.
2) Results for BM2: The evolution of L d is shown in Figure 7.The comparison between the simulated and analytical solutions is to be performed at long time, anyway an irregularity can be observed at the beginning of the simulation for both L d and y p , where, while a theoretical solution would have a growing trend, starting from zero, the simulated solutions maintain a non-zero constant value.This is due to the fact that the vent is not a point source and its size corresponds to the initial extension of the flow.We observe that for L d (Figure 7), the convergence is apparent, with a resolution increase leading to results closer to the theoretical solutions.As in BM1, as the flow spreads out, it becomes under-resolved at lower resolution, leading to considerably worse results that diverge from the analytical solution over time.Moreover, the under resolved flow front generates some artifacts that result in a shaky position advance.
The errors are reported in Table 2 and their trend is shown in Figure 8; as for BM1 we have a convergence that gains orders over time.Also in here, this can be explained considering the thinning evolution of the flow, which makes the low-resolution simulation becoming under-resolved as the flow progresses.When it comes to the cross-slope extent, we have a discrepancy between the simulated and analytical extension, as shown in Figure 9, which leads at long time to a wider flow.The causes are not yet fully known, and could likely be due to the approximations done at numerical level such as finite size of the inlet, as well as compressibility.Although the value of the density does not affect the theoretical behavior of the problem, it can be relevant in numerical terms.The impact that the density has in the discrepancies mentioned above is currently being investigated.
Running on the same hardware as BM1, one second of BM2 HR evolution is simulated in around 7'165 s.BM2 IR involves 1'171'807 particles and a time step DtBM2 IR = 6.22 • 10 -6 s, with one second of BM2 IR evolution being simulated in around 1'673s.Finally, BM2 LR involves 645'444 particles and a time step DtBM2 LR = 8.27• 10 -6 s, with one second of BM2 IR evolution being simulated in around 646s.

BM3: AXISYMMETRIC COOLING AND SPREADING
This benchmark test deals with a non-isothermal flow.The setup exhibits many similarities with BM2, including a point source with fluid spreading on a plane.The floor is however horizontal in BM3, leading to axial symmetry in the flow emplacement.Thermal effects are also taken into account in BM3, with only one-way coupling, as the fluid rheology is assumed to be independent from the temperature, which therefore acts as a passive tracer.The parameters for BM3 are reported in Table 3.
The radius of the expanding fluid evolves according to: (24) 1) Implementation in GPUSPH: The implementation in GPUSPH of BM3 is similar to that of BM2, though the difference in the magnitude of some physical parameters introduces more stringent constraints.The very small flow-rate sets an upper bound in the dimension of the source, since a larger inlet would result in a smaller velocity of the piston, that could lead to numerical issues for its The error ratios are computed as the ratio of the error at lower resolution over the error at higher resolution.
integration in single precision.The upper bound also imposes limitations on the coarsest resolution that can be used to discretize the problem and finally on the time-step.On the other hand, to be the piston capable of containing enough fluid to run the simulation, a smaller piston section implies a large piston height, requiring a higher speed of sound that would affect the time step.A solution to the fluid column could be to make a horizontal piston, parallel to the plane, but the junction with the hole would introduce disorder in the flow.The size of the piston has been chosen in order to approximate a trade-off with the issues just mentioned.The piston has a squared section, 0.005m wide and is 0.13m high.The plane has a side 0.12m long and the vent is a square hole placed in the middle.In order to avoid introducing disorder in the flux, the hole shape and size matches with the piston section.We use for the wall the same thermal parameters that we use for the fluid, and the thickness required by the dynamic boundary model is also sufficient to implement the absorbing conditions for the thermal model.The speed of sound is 40 m/s.We perform a convergence test using three levels of discretization: a low resolution with inter-particle distance Dp LR = 1/512 m= 1.95 • 10 -3 m, an Intermediate resolution with Dp IR = Dp LR /1.5 = 1/768 m = 1.3 • 10 -3 m, and a High Resolution with Dp HR = Dp IR /1.5 = 1/1024 m = 9.7 • 10 -4 m. 2) Results for BM3: For the flow dynamics, the model convergence can be assessed by looking at the radius of the emplacement.We take as objective the simulated time t = 144s the fluid is already spread enough to show a clear temperature profile, and artifacts due to the low resolution are yet to ap-pear.We observe an over estimation of the emplacement radius, being R BM3HR (144)=0.038m and R(144)=0.027m, probably due to discrepancies between the numerical and analytical setup.Concerning the temperature profile, we perform a graphical comparison between the simulated profile, and the analytical and measured ones, as shown in Figure 10, where we plot the normalized temperature (T*=(T-T a )/(T 0 -T a )) with respect to the radius normalized by the current flow extension.In the two reference curves, the temperature at the vent is lower than the temperature specified in the problem data, which has been faced by setting a lower initial temperature of the simulated fluid.From a graphical estimation, the temperature is chosen as the 93% of the normalized temperature, that is 313.6 K.
We can see that although the simulated solutions do not apparently match the reference, they qualitatively tend to those as the resolution is increased.The high mismatch in the shape of the profile can be due to an incomplete implementation of the thermal model for the boundary that is not described in Codonnier et al. [2016].The width of the simulated curves comes from the particles marked as surface, not being distributed on an imaginary smooth surface, but rather being recognized at different heights.This is also cause of the double curve effect (more apparent in the green curve) that is due to the lower surface particles cooling down for both the effect of surface dissipation and ground transmission.

CONCLUSIONS AND FUTURE WORK
We have presented the first three benchmark tests introduced by Cordonnier et al. [2016] simulated using GPUSPH.They state that the main drawback of SPH is the Weakly Compressible formulation and therefore a tuning of the parameters is needed to improve the quality of the simulations.Following this reasoning we mainly acted on the speed of sound and the boundary model in order to find a good compromise among accuracy, stability and simulation time.We have shown that for BM1 we have a strong convergence, while for the two benchmarks cases involving a fluid injection, larger discrepancies with the analytical results are present, but, we have proven the existence of a convergent behavior in most of the cases, and that the errors can be mitigated choosing a proper resolution for the spatial discretization.We have also encountered some issues related to under-resolved conditions of the problem, and again we have seen that they can be eliminated using finer discretization.
One main general improvement will be given by the introduction of an open boundary model.Although pistons are quite easy to implement, we have seen that it is worth to develop a more sophisticated way to implement a constant flux, in order to obtain a cleaner flow and a more stable simulation.For our future work we will focus on the introduction of the inlet feature, which should lead to significant improvements on the results for BM2 and BM3.
Anyway, the biggest obstacle to reproduce the BM3 test case has been the incomplete description of the problem setup from Cordonnier et al. [2016], with respect to the thermal boundary conditions; while this can be partially solved referring to the tests from which the benchmark setup is taken [Garel et al., 2012].We believe that more complete benchmarks should be developed.Cordonnier et al. [2016] also set up a fourth benchmark test (BM4), regarding experimental lava flow and the simulation involves realistic lava parameters.Before passing to real lava we need to be sure that the first three tests are reproduced in a good way that has been preliminary done here.We measured the performance of GPUSPH reporting the simulation times; which are around two orders of magnitude lower than what required by naïve implementations of the WCSPH method, as demonstrated by Hérault et al. [2010].Further improvement will be sought in the next work introducing open boundaries, which will contribute to reduce the noise content within the simulation and have simpler setups.The inlet will also give advantage in terms of performance and robustness.In fact, the high viscosity values that will be involved in BM4 require the adoption of the semi implicit integration scheme, recently introduced in GPUSPH [Zago et al., 2018]; in that case, we shown that the performances of the semi implicit scheme have a strict dependence on the number of particles involved in the simulation, then simpler geometries obtained substituting the piston structure with an open boundary will be more suited for the adoption of the semi implicit scheme.
2) Results for BM1: We show results for three different resolution, a Low Resolution with interparticle distance Dp LR = 1/8 m = 0.125m, an Intermediate Resolution with inter-particle distance Dp IR = Dp LR /2 = 1/16 m = 0.0625m, and a High Resolution with Dp LHR = Dp LR /4 = 1/32 m = 0.03125m.In the following we will refer to these three simulations as BM1 LR , BM1 IR and BM1 HR .

FIGURE 1 .
FIGURE 1. Lateral slice of BM1 HR at t = 10s.Particles are colored by velocity magnitude.
FIGURE 4.Under resolved flow front for BM1 IR at t = 500s: this is one of the resolution issues arising in BM1 as the flow evolves.

FIGURE 6 .FIGURE 7 .
FIGURE 6. Implementation of BM2 in GPUSPH, lateral view of a slice.Particles are colored by type: fluid in blue, walls in green and the moving piston in red.

FIGURE 10 .
FIGURE 10.Time evolution of the cross-slope extension for BM2.

TABLE 1 .
Errors for BM1 at short and long time.The error ratios are computed as the ratio of the error at lower resolution over the error at higher resolution.
LR = 5.63 • 10 -5 s and 1s of BM1 LR evolution is simulated in around 168s.Finally, BM1 LR involves 187'244 particles and a time step ZAGO ET AL.

TABLE 2 .
Time evolution of the cross-slope extension for BM2.Errors for BM2.
• 10 -5 s. 1s of BM3 LR evolution is simulated in around 15s on the same hardware used for the other benchmarks.BM3 IR involves 60'432 particles and a time step DtBM3 IR = 1.31 • 10 -5 s. 1s of BM3 IR evolu-ZAGO ET AL. 10 tion is simulated in around 134s.Finally, BM3 HR involves 106'724 particles and a time step DtBM3 LR = 9.83 • 10 -6 s. 1s of BM3 HR evolution is simulated in around 267s.