Using neural networks to study the geomagnetic field evolution

Considering the three components of the geomagnetic field as stochastic quantities, we used neural networks to study their time evolution in years. In order to find the best NN for the time predictions, we tested many different kinds of NN and different ways of their training, when the inputs and targets are long annual time series of synthetic geomagnetic field values. The found NN was used to predict the values of the annual means of the geomagnetic field components beyond the time registration periods of a Geomagnetic Observatory. In order to predict a time evolution of the global field over the Earth, we considered annual means of 105 Geomagnetic Observatories, chosen to have more than 30 years registration (1960.5-2005.5) and to be well distributed over the Earth. Using the NN technique, we created 137 «virtual geomagnetic observatories» in the places where real Geomagnetic Observatories are missing. Then, using NN, we predicted the time evolution of the three components of the global geomagnetic field beyond 2005.5.


Introduction
Artificial Neural networks (ANN or shortly NN) are sets of connected neuron layers, which contain one or more neurons.Each neuron represents a known function 1 f which transfers the input quantity p, multiplied by a weight w and added by a bias b, to the output a: w and b are both adjustable parameters of the neuron.
Commonly, neural networks are adjusted, or trained, so that a particular input leads to a specific target output.During training the weights and biases of the network are iteratively adjusted to minimize the network performance function.The default performance function is the mean square error (mse) (Demuth and Beale, 2004).
The neural networks have been used to study time series and predictions of different quantities that have a stochastic behavior (Frank et al., 2001).The geomagnetic field components have such behavior not only in the time scale of seconds or minutes (Hongre et al., 1999), but even in the larger time scale of years (Duka, 2005).For our intentions, the network inputs are the annual time series of the geomagnetic field component values and the targets are the known annual values of the geomagnetic field components that are shifted in some way from the input values.The predicted values of the geomagnetic field components are simulated by the trained networks when the inputs are shifted from the initial inputs.evolution without discussing any physical implication of the prediction technique.

Choosing the neural network
Without analyzing the different algorithms of training functions, but only by testing different networks and different training functions, we aimed to find the best ones that could supply the longest and the best time prediction of the geomagnetic field.
The training and simulation procedures of NN need statistically significant datasets that are difficult to be provided by the real geomagnetic observations.Therefore, we started with the time series of synthetic data that are annual values of the geomagnetic field components calculated by the Gufm1 model at a given place 2 .The time series S (t) are the values of the geomagnetic field components X (t), Y (t), Z (t) in the series of years from 1650 to 1980.
In the simplest case, during the network training, the input is a vector P(1 × 300), which values are taken from the series X(t) or Y(t) or Z(t), where t = 1650, 1651, …1950.The target is also vector T (1 × 300), which values are taken from the same series X (t) or Y (t) or Z (t), where t = 1650+1, 1651+1, …, 1950+1.During the simulation process the input is B (1 × 300), from the same series X (t) or Y (t) or Z (t) where t = 1650+1+1, 1950+1+1).So, the target T is shifted from the input P by d = 1 year, B is shifted from the target T by the same d.Then the simulation supply the prediction of the value d = 1 beyond the last value of B. d is called the horizon of prediction.
In order to enlarge the prediction horizon, we followed two approaches: 1.By increasing d directly from 1 until a value where the prediction error is unacceptable.
2. By fixing d = 1 and inserting the predicted value as a known value in the input of network retraining, and then by simulation the next value is predicted.This procedure is repeated until the prediction error is unacceptable.
Using the first approach and the input series of 300 years long, we studied the influence of the kind of network on the relative error 3 of the prediction of known values of X, Y, Z compo-We have also applied the neural networks to generate «virtual geomagnetic observatories» in different places on the Earth where the real geomagnetic observatories are missing.In this case, the inputs are the series of the geodetic coordinates of the known geomagnetic observatories, the targets are the known values of the geomagnetic field components and simulations supply the geomagnetic field values at a given place near by the known observatories.
The MatLab software package (www.Mathworks.com/products/Matlab),used to create the appropriate neural networks, contains different kind of networks, but the most successful for the time prediction are those that use the «multi-layer feed-forward error backpropagation algorithm» (Kugblenu et al. 1999).«Backpropagation» is a gradient descent algorithm, in which the network weights are moved along the negative of the gradient of the performance function (Demuth and Beale, 2004).There are a number of variations on the basic algorithm that are based on other standard optimization techniques and are used in different kind of training function, as: -trainscg (Scaled conjugate gradient), -trainbfg (Quasi Newton methods),trainbr (Bayesian regularization backpropagation), -trainlm (Levenberg-Marquardt backpropagation) etc. of the Matlab software (Aggarwal et al., 2005).
The different networks known as newff (feed-forward backpropagation network), newcf (cascade-forward backpropagation network), newelm (Elman backpropagation network) (Demuth and Beale, 2004), are tested.In these tests, the input time series were synthetic data generated by the Gufm1 model (Jackson et al., 2000) or IGRF 10 th Generation, version 4.0 (http://www.geomag.bgs.ac.uk) at different places and different years (from 1650 to 1980 in case of Gufm1 model, and from 1940 to 2014 in case of IGRF model).
We stress that the aim of this work is to test different NN forecasting algorithm and their application to the mid term geomagnetic field ( 2 ) We have chosen the coordinates of the NGK observatory which has also a long series of real data.
ing from 1 + d value of the last row of P and the input matrix of simulation is B(m × n), where the m th row of length n starts from m + d + 1 beyond the starting value of the last row of P(m × n).
For n = 200, m from 1: to 50 and d = 1, the «newcf» network provides different predictions for the next to last value of B. In the fig.2, there is shown the variation of absolute error of this prediction (only for Y -component) by the dimension m of the inputs, when their length n is fixed.It is noticed an immediate diminution of the absolute error when m = 3, but there are abundant fluctuations for the greater values of m.Therefore, in the following all inputs and target are chosen to have the format The NN functioning depends on several properties as the training function, the transfer function, the nn structure (the number of the layers and the number of neurons in each layer), the training epoch (the number of repetition of the training algorithm) etc., (Demuth and Beale, 2004).Up to here, we used the simplest structure of the NN that has two layers with one neuron each.In order to get the better structure of the given kind of neural network («newcf»), for the same series of inputs (synthetic data of In the same order are the absolute errors of the other components. In order to improve the results of predictions, we changed the input size from one dimensional to m dimensional, choosing from the same time series 200 years long, m time series with the same length n.So, the training inputs are: P (m × n) where the rows m = 2, 3, are shifted by 2, 3,… from the first row (m = 1), the target series is one dimension The ratio of the absolute error (difference between the predicted value for a given year and the known value of this year) with the known value from the long series of 300 years.It seems that we have the smallest error of prediction using the «trainbr» function and when the geomagnetic field components are considered independently from each other.This is explainable, because the result of prediction is very sensitive from the time variation of the input quantity and in case of the geomagnetic field the time variation is quite different for different component.Therefore in the following we will use the «trainbr» training function and will consider the X, Y, Z separately as independent quantities.
In order to illustrate the influence of the NN structure on the quality of prediction, we present We evaluated the error of the predictions of this method, by calculating Rms (Root mean square of deviations) for the whole prediction horizon (d), for each component, according to the formula: Where S i p is the absolute value, predicted by NN that could be X, Y or Z component in nT and S i m is the absolute value, given by Gufm1 model for the respective year and component.The results are presented in the table I, which also lists the respective averaged relative error of prediction.
We followed the same procedure for the second approach of horizon enlarging.The re-first factor, we reduced the horizon of prediction, and in cases of the second one we have applied the normalization of the input values.
An important internal factor influencing the NN behavior is the initialization of weights and bias.Usually the weights and bias are initialized randomly in the segment [-1, 1].We noticed that there are fluctuations of the NN prediction during the retraining and simulation of the same NN and the same inputs.In cases of receiving large fluctuations, we reduced the segment [-1, 1] to a very narrow one.

Evaluation of the prediction error for the chosen NN
In order to enlarge the horizon of the NN prediction, we used both ways mentioned above (subtitle 2).We compared the results of these ways when the prediction horizon is the same for both ways and when the same NN is applied on the same synthetic data generated by Gufm1 model for the NGK observatory.
The following graphs (figs.6a, 6b, 6c) present the predicted values (the red curve) of X, Y, Z components of the geomagnetic field for   Comparing these results, it is evident that the second approach is the better one.For that sults are presented in the analogue graphs (figs.7a, 7b, 7c) and table II as in the first approach.

Prediction of the global geomagnetic field
In order to study the time evolution and prediction of the global geomagnetic field, annual means of the geomagnetic observatories, which have longer than 30 years time series registrations, were considered.The spreading of geomagnetic observatories over the Earth is non uniform, there are places were the observatories are very dense as Europe while elsewhere as in South Pacific there is a lack of observatories.Numerous observatories have very short registrations or in their registration there are several years of missing registrations.Most of them have registration from the 60' to nowadays.For this reason, 105 observatories that have registrations of the 1960.5-2005.5 period and that are uniformly allocated over the Earth were selected (shown in the fig.9 by the empty circles).68 of them have different missing periods of registrations ranging from one year to 15 years.Before filling out this missing data, we have reduced data of all observatories that have different altitude to the same altitude zero (in geodetic coordinates), considering only the dipolar contribute to the reducing of the geomagnetic field components.

Virtual geomagnetic observatories
The chosen geomagnetic observatories are not sufficient to represent the geomagnetic field at a given epoch.We needed a denser network of observatories over the Earth.For that reason we created several virtual geomagnetic observatories especially in those places where real observatories are missing.
The time series of the geomagnetic field at a place S (t) depends on only one parameter (the reason, in the following we have used only the second approach of enlarging the prediction horizon.

Using NN prediction for real data of one geomagnetic observatory
Aiming to apply the NN technique to the time series of real data, we chose the Niemegk observatory (NGK: latitude: 52 0 4 ' , longitude 12 0 41 ' , altitude 78 m), which has continued registrations from 1932.5 to 2005.5, and annual mean time series of components 74 years long, published in the site: http://www.geomag.bgs.ac.uk/gifs/annual_ means.shtmIn order to compare the results of the NN prediction for the time beyond 2005.5, we used the results of IGRF 10 model.We followed the same technique as for the synthetic data (the second way of the inputs).The rms of deviations between predicted values (from 1991.5 to 2005.5) by NN and real values of the NGK Observatory for X, Y, Z components are respectively: 38.56 nT, 17.12 nT and 32.04 nT.Comparing these results with those of the table 1 and 2 of the synthetic data (Gufm1 model), an increase of the prediction error is noticed.It is explainable considering that the real data series are shorter, the time dependence of synthetic data is very smoothed, while the real data have almost erratic behavior in time.In the figs.8a, 8b, 8c, there are presented the graphs of X, Y, Z component values predicted by NN (2000NN ( .5-2015.5.5), calculated by IGRF (2000IGRF ( .5-2014.5.5) and observed in NGK (2000.5-2006.5).
Comparing the prediction of the IGRF 10 model and the NN prediction with the real data for the period 2000.5-2015.5, it can be noticed that the Neural Network gives better results.Maybe the reason is that the IGRF 10 model represents the main field, while the NN prediction represents the whole field.
ries terms, while in the series of different observatories there are irregular distances between their coordinates.
After too many tests, we managed to construct a new NN: where the input matrix P [p1; p2], is composed by the two vectors: p1 whose elements in the beginning are the latitudes of the 105 known observatories and p2 whose elements in the beginning are the longitudes of the time t), while the series of geomagnetic field values at a given time on different places depends on two parameters: latitude and longitude 4 .Apart of this, in the case of time series there is an equal interval (1 year) between se-   III.
105 known observatories.The target vector t is composed by the respective geomagnetic field component values of the 105 observatories.After the first training of the chosen NN network, we simulated the geomagnetic field component value of the virtual geomagnetic observatory at the chosen place.Then, the coordinates of the simulated observatory and obtained values of its geomagnetic field were added respectively to the input matrix P and to the target vector t.The NN is retrained with the added dimensions of inputs and outputs (106) and then another virtual geomagnetic observatory is simulated.The process was continued until 137 new virtual observatories were created, that are shown in the fig.9 (the empty circles are the real observatories and black filled circles are the virtual ones).The number of the generated observatories is limited by the elapsed time for running of the program in the computer 5 .
In order to use the same neural network for the different components of the geomagnetic field, the input and target values are reduced by appropriate coefficients chosen to have values of different components at the same order It is important to choose the path from one generated observatory to the other in a way that starts from a very dense region (like Europe) and gradually goes to the less dense regions.
In order to evaluate the accuracy of the virtual observatory generation, we compared the geomagnetic field component values of one observatory in Europe (WING-observatory: 53.45° latitude, 9.4° longitude, 50 m altitude, that is not included in the group of 105 real observatories), with the respective values of the virtual observatory, that is generated by the NN in same coordinates.The resulting differences for the respectively X, Y, Z components are: 50 nT, 63 nT, 31 nT.For another observatory (PST -51.42° latitude; 302.7° longitude), located far ( 5 ) Note that the NN used has 14 layers where the first layer has 14 neurons and the others have different numbers of neurons.For such NN and the PC used (Pentium IV, 2Ghz, 512 Mb RAM), for about 400 cyles of NN training and simulation of the three components of the geomagnetic field at one observatory at a given year, it was needed more than 5 hours.until the year 2015.5 as the error of NN prediction for longer time horizon is greater than 100 nT.
Using the time series of all observatories and the NN technique for creation of virtual observatories, we simulated the values of the geomagnetic field components at a regular grid of points (2048 points) that covers uniformly the Earth surface for several epochs.In the figures (10÷12), the isolines of the X, Y, Z components in Mercator projection for the two epochs, 2005.5 and 2015.5, according to the values predicted by NN technique, are shown.
It can be easy to see that the predicted values by the NN fit the observed value better than other models.The time prediction process for each of the 242 observatories (real and virtual) is almost the same as that followed for the NGK observatory, but having time series from 1960.5 to 2005.5 year.We continued the time prediction  over the Earth, we obtained an accurate mid term geomagnetic field evolution.This is illustrated by the plots of different components of the geomagnetic field for two epochs of the horizon prediction.
As the first attempt of using the NN techniques to predict the mid term geomagnetic field evolution, we are aware of the necessity of the technique improving in order to achieve a better and longer prediction horizon.As most of the error prediction comes from the virtual geomagnetic observatory generation process, we believe that creating a denser network of virtual observatories by a better choice of the path from the known observatory to the new ones, the error prediction would be reduced.

Concluding Remarks
Testing a variety of NN offered by the Mat-Lab software package, we identified those NN that give the most accurate and the longest prediction when are applied on the geomagnetic field synthetic data.We tried also different way of inputting the data and different kind of NN training.It resulted that the best way is to input the data separately for different geomagnetic field components and to input data in the matrix form with three rows of data shifted from each other by the same time lag (one year).The best training function resulted trainbr.
In the best case of applying NN on the real data, when the time series of geomagnetic field data (annual means) are 74 years long, we achieved a horizon prediction of 15 year with relative errors of the prediction for X, Y, Z components respectively about 40 nT, 20 nT and 30 nT We used the NN techniques not only for filling the gaps in the data registrations of several geomagnetic observatories, but also for filling gaps in the geographical distribution of the geomagnetic observatories on the Earth.In this case, using the geodetic coordinates as inputs, we created virtual geomagnetic observatories situated in the places where the real geomagnetic observatories are missing.
Applying NN techniques on the data from real and virtual geomagnetic Observatories nents.It emerges that so called «newcf» network has a smaller error of prediction than two known networks «newff» and «newelm».The prediction error for a given horizon (d = 15 years), was different for different component X, Y, Z.In the fig. 1 there are shown (only for X -component) the relative error variations by the prediction year for three different networks.Meantime the average absolute error of X component for the whole horizon was 76.6035 nT («newcf» network), 200.7186 nT («newff» network) and 198.2094 nT («newelm» network).

Fig. 1 .Fig. 2 .
Fig. 1.Relative error of prediction of X-component for three different networks

Fig
Fig. 3a,b.Variation of X-component prediction error by the year of prediction, for different training functions.ab

Fig. 4 .
Fig. 4. Variation of relative error of prediction by the prediction year, for different number n of neurons per layer.

Fig. 5 .
Fig. 5. Relative error prediction variation by the prediction year for different number L of layers.

Fig. 7a- c .
Fig. 7a-c.Predicted values of a) X, b) Y, and c) Z component by NN for 1956 -1970 period and the respective values calculated by Gufm1 model (Second way of data input).
Fig. 8a-c.Predicted values of a) X, b) Y, c) Z component by NN for 2000.5 -2015.5 period and the respective values calculated by Gufm1 model and observed on NGK observatory.
The component field values of all observatories are reduced to the same altitude.wayfrom the group of 105 observatories, where there is a great lack of real observatories, we got the greatest differences of several hundred nT.In order to compare the results of NN model, Gufm 1 model and IGRF 10 model, we calculated the rms of deviations between the observed X, Y, Z component values of several real observatories and the respective values predicted by NN, calculated by Gufm 1 model and calculated by IGRF 10 model, for two different epochs: 1980.5 (TRO, PAF, NCK, WNG, BFE, CBB observatories) and 2005.5 (KDU, PST, TRO, PAF, NCK, WNG, AMS, BFE, BSL, CBB observatories).These deviations are shown in the table

Fig. 10 .
Fig. 10.Isolines of X-Component predicted by the Neural Networks for two different epochs.

Fig. 11 .
Fig. 11.Isolines of Y-Component predicted by the Neural Networks for two different epochs.

Fig. 12 .
Fig. 12. Isolines of Z-Component predicted by the Neural Networks for two different epochs.

Table I .
Rms and relative errors of prediction (first way of data input).

Table II .
Rms and relative errors of prediction (second way of data input).

Table III .
Deviation between values of observed components and respective values calculated by IGRF, Gufm1 and predicted by NN