Scalable multi-GPU implementation of the MAGFLOW simulator


Eugenio Rustico
Giuseppe Bilotta
Alexis Hérault
Ciro Del Negro
Giovanni Gallo


We have developed a robust and scalable multi-GPU (Graphics Processing Unit) version of the cellular-automaton-based MAGFLOW lava simulator. The cellular automaton is partitioned into strips that are assigned to different GPUs, with minimal overlap between adjacent strips. For each GPU, a host thread is launched to manage allocation, deallocation, data transfers and kernel launches; the main host thread coordinates all of the GPUs to ensure temporal coherence and data integrity. The overlapping borders and the maximum temporal step must be exchanged among the GPUs at the beginning of every evolution step of the cellular automaton; these data transfers are asynchronous with respect to the computations, to hide the overhead they introduce. The GPUs need not have the same speed or capacity; the system runs correctly on both homogeneous and heterogeneous hardware. The speed-up factor differs from the ideal one (#GPUs×) only by a constant overhead loss of about 4E−2 · T · #GPUs, with T as the total simulation time.
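To make the strip decomposition concrete, the following is a minimal sketch of how a grid of rows might be divided among GPUs with a small overlapping border. It is not taken from the MAGFLOW code: the function name `partition_rows`, the one-row default for `halo`, and the choice of equal-sized strips are all assumptions made for illustration. An implementation aimed at heterogeneous hardware might instead size the strips according to each device's speed.

```python
def partition_rows(n_rows, n_gpus, halo=1):
    """Split rows [0, n_rows) into n_gpus contiguous strips.

    Hypothetical helper, not the MAGFLOW API. Each strip owns the rows
    [start, end) and additionally holds copies of up to `halo` border
    rows from each neighbouring strip, giving the padded range [lo, hi).
    """
    if n_gpus < 1 or n_rows < n_gpus:
        raise ValueError("need at least one row per GPU")
    base, extra = divmod(n_rows, n_gpus)
    strips = []
    start = 0
    for gpu in range(n_gpus):
        # The first `extra` strips take one additional row each.
        end = start + base + (1 if gpu < extra else 0)
        strips.append({
            "gpu": gpu,
            "start": start,
            "end": end,
            # Clamp the overlap at the edges of the whole grid.
            "lo": max(0, start - halo),
            "hi": min(n_rows, end + halo),
        })
        start = end
    return strips


if __name__ == "__main__":
    for strip in partition_rows(10, 3):
        print(strip)
```

In this sketch the rows in [lo, start) and [end, hi) are the overlapping borders: they are owned by a neighbouring strip and would have to be refreshed from that neighbour before each evolution step.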

Article Details

How to Cite
Rustico, E., Bilotta, G., Hérault, A., Del Negro, C. and Gallo, G. (2011) “Scalable multi-GPU implementation of the MAGFLOW simulator”, Annals of Geophysics, 54(5). doi: 10.4401/ag-5342.
