Creation of a neural network in NTL+

Introduction

Artificial neural networks are mathematical models, together with their software or hardware implementations, built on the principles by which biological neural networks (networks of nerve cells, or neurons, of the brain) are organized and function.

Neural networks are now widely used in recognition, classification, associative memory, pattern detection, forecasting and many other tasks.

There are standalone mathematical products for working with neural networks, as well as add-on modules for the major mathematical software packages, that offer wide capabilities for constructing networks of different types and configurations.

We, on the other hand, will create our own neural network from scratch in the NTL+ language and, at the same time, try out its object-oriented programming capabilities.

Choosing a task

In this article, let us test the hypothesis that the type of the next bar (rising or falling) can be predicted with a certain probability from the preceding bars.

This is a classification task. We also have extensive historical data on financial instruments at our disposal, which can be used to obtain the desired (expert) output. For these reasons, we will build a network based on the multi-layer perceptron.

Let us try to create a network that predicts the type of the following bar from the closing prices of k bars. If the closing price of the following bar is higher than the closing price of the previous bar, we take the desired output to be 1 (the price rose). Otherwise, we take the desired output to be 0 (the price did not change or fell).

Constructing the network

In general, a multi-layer perceptron has one input layer, one or several hidden layers and one output layer. A single hidden layer is enough to transform the inputs so that the input representation becomes linearly separable. Using more hidden layers often slows training down significantly without any noticeable gain in learning quality.

Let us settle on a network with one input layer, one hidden layer and one output layer. The following figure shows the architecture of our network in general terms: x1 ... xn are the inputs (closing prices); wi,j are the weights of the edges going from node i to node j; y1 ... ym are the neurons of the hidden layer; o1 ... ok are the outputs of the network.


Additionally, the network has bias inputs. They allow the network to shift the activation function along the x-axis, so the network can change not only the steepness of the activation function but also its position.

Figure: graphs of the activation function: 1. the base curve; 2. the curve with a changed steepness; 3. the curve with a changed steepness and a shift along the x-axis.

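The value of each node of the network is calculated according to the formula:

$$y_j = f\left(\sum_{i=1}^{n} w_{i,j}\, x_i\right),$$

where f(x) is the activation function, n is the number of nodes in the previous layer, $x_i$ is the value of node i of that layer and $w_{i,j}$ is the weight of the edge from node i to node j.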
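As an illustration of the node formula, here is a minimal sketch in Python (not NTL+). The function names `rational_sigmoid` and `node_output` and the sample numbers are hypothetical; the bias is assumed to be an extra input fixed at 1, as in the layer class shown later in the article.

```python
def rational_sigmoid(x):
    """Rational sigmoid x / (|x| + 1), the activation used in the article's code."""
    return x / (abs(x) + 1.0)


def node_output(inputs, weights, bias_weight):
    """Weighted sum of the inputs plus the bias term, passed through the activation."""
    total = sum(w * x for w, x in zip(weights, inputs))
    total += bias_weight * 1.0  # the bias input is fixed at 1
    return rational_sigmoid(total)


if __name__ == "__main__":
    # Three normalized inputs, three edge weights and one bias weight (made-up values).
    print(node_output([0.0, 0.5, 1.0], [0.2, -0.4, 0.6], 0.1))
```

Changing `bias_weight` moves the argument of the activation function without touching the input weights, which is the shift along the x-axis described above.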

Activation functions

The activation function calculates the output signal of a neuron from the value produced by the accumulator (the weighted sum of the inputs). It is usually a non-linear function of a single argument. The following activation functions are used most often.

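Fermi function (exponential sigmoid):

$$f(x) = \frac{1}{1 + e^{-x}}$$

Rational sigmoid:

$$f(x) = \frac{x}{|x| + 1}$$

Hyperbolic tangent:

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$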
The rational sigmoid was chosen for computing the outputs of each layer because it takes less processor time to calculate.
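The three activation functions can be compared with a short Python sketch (Python rather than NTL+; the function names are hypothetical and the functions are taken in their simplest form, without a steepness parameter).

```python
import math


def fermi(x):
    """Fermi function (exponential sigmoid), values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rational_sigmoid(x):
    """Rational sigmoid, values in (-1, 1); needs no exponent."""
    return x / (abs(x) + 1.0)


def hyperbolic_tangent(x):
    """Hyperbolic tangent, values in (-1, 1)."""
    return math.tanh(x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, fermi(x), rational_sigmoid(x), hyperbolic_tangent(x))
```

Note that the value ranges differ: the Fermi function stays between 0 and 1, while the rational sigmoid and the hyperbolic tangent are symmetric around 0.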

Training process of the network

To train our network, we will implement the backpropagation method. It is an iterative algorithm used to minimize the error of a multi-layer perceptron. The fundamental idea is this: after the outputs of the network are calculated, an adjustment δ is calculated for each node and an error Δω for each edge, and this calculation proceeds from the network outputs towards its inputs. The weights ω are then corrected in accordance with the calculated error values. The algorithm imposes only one requirement on the activation function: it must be differentiable. The sigmoids and the hyperbolic tangent meet this requirement.

Now, to carry out the training process, we need the following sequence of steps:

  1. Initialize the weights of all edges with small random values.
  2. Calculate the adjustments for all outputs of the network:
    $$\delta_j = -o_j (1 - o_j)(t_j - o_j),$$
    where $o_j$ is the calculated output of the network and $t_j$ is the expected (actual) value.
  3. For each node except those of the last layer, calculate the adjustment according to the formula
    $$\delta_j = o_j (1 - o_j) \sum_k \delta_k\, \omega_{j,k},$$
    where $\omega_{j,k}$ are the weights on the edges coming from the node for which the adjustment is calculated and $\delta_k$ are the calculated adjustments for the nodes situated closer to the output layer.
  4. Calculate the adjustment for each edge of the network:
    $$\Delta\omega_{i,j} = -\nu\, \delta_j\, o_i,$$
    where $o_i$ is the calculated output of the node from which the edge comes, $\delta_j$ is the calculated adjustment for the node to which the edge goes and $\nu$ is the training rate (the parameter nu in the code).
  5. Correct the weight values for all edges:
    $$\omega_{i,j} = \omega_{i,j} + \Delta\omega_{i,j}$$
  6. Repeat steps 2 - 5 for all training examples or until the set criterion of learning quality is met.
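Steps 2 - 5 can be sketched for a network with one hidden layer as follows. This is a Python illustration, not the article's NTL+ code; the names `forward` and `train_step` and the weight layout (one row of weights per receiving node, bias weight last) are assumptions. The sketch uses the exponential sigmoid, because the factor o(1 - o) in the adjustment formulas is the derivative of that function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return (inputs with bias, hidden outputs with bias, network outputs)."""
    x_b = x + [1.0]  # bias input fixed at 1
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x_b))) for row in w_hidden]
    h_b = hidden + [1.0]
    out = [sigmoid(sum(w * v for w, v in zip(row, h_b))) for row in w_out]
    return x_b, h_b, out


def train_step(x, target, w_hidden, w_out, nu):
    """One backpropagation step; modifies the weight lists in place."""
    x_b, h_b, out = forward(x, w_hidden, w_out)
    # Step 2: adjustments for the output nodes.
    d_out = [-o * (1.0 - o) * (t - o) for o, t in zip(out, target)]
    # Step 3: adjustments for the hidden nodes (the bias node has none).
    d_hid = []
    for j in range(len(w_hidden)):
        back = sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
        d_hid.append(h_b[j] * (1.0 - h_b[j]) * back)
    # Steps 4 and 5: edge adjustments and weight correction.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] -= nu * d_out[k] * h_b[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= nu * d_hid[j] * x_b[i]
    return out


if __name__ == "__main__":
    w_hidden = [[0.1, -0.2, 0.05], [0.3, 0.1, -0.1]]  # 2 inputs + bias -> 2 nodes
    w_out = [[0.2, -0.3, 0.1]]                        # 2 hidden + bias -> 1 output
    print("before:", forward([0.0, 1.0], w_hidden, w_out)[2])
    train_step([0.0, 1.0], [1.0], w_hidden, w_out, nu=0.5)
    print("after: ", forward([0.0, 1.0], w_hidden, w_out)[2])
```

Both sets of adjustments are computed before any weight is changed, so the hidden-layer adjustments use the old output-layer weights.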

Preparing input data

Input data must be prepared before the network can be trained. High-quality input data have a significant impact on the work of the network and on how quickly its coefficients ω stabilize, i.e. on the training process itself.

It is recommended to normalize all input vectors so that their components lie in the range [0;1] or [-1;1]. Normalization makes the input vectors commensurate during training, which is needed for the training procedure to work correctly.

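We will normalize our input vectors, transforming the values of their components to the range [0;1]. To do so, let us apply the formula:

$$x_i' = \frac{x_i - x_{min}}{x_{max} - x_{min}},$$

where $x_{min}$ and $x_{max}$ are the smallest and the largest components of the vector.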
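A minimal Python sketch of this normalization (not NTL+; the function name `normalize` is hypothetical). The threshold below which all components are set to 0 mirrors the check used in the article's Normalize code.

```python
def normalize(values, eps=0.000005):
    """Scale the components of a vector to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi - lo < eps:
        # All components are (almost) equal: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    print(normalize([2.0, 4.0, 3.0]))
    print(normalize([1.5, 1.5]))
```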

Closing prices of bars will be used as the components of the vector. The bars with indices [n+k; n+1] will be fed as inputs, where k is the input vector dimension (bars are numbered from the most recent one, so bar n+1 precedes bar n). We will determine the desired value based on the n-th bar, using the following logic: if the closing price of the n-th bar is higher than the closing price of the (n+1)-th bar (the price rose), we set the desired output to 1; if the price fell or did not change, we set it to 0.
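The formation of training examples can be sketched in Python (not NTL+). The function name `make_examples` is hypothetical, and the sketch assumes the closing prices are given in chronological order, oldest first, which is the reverse of the bar indexing in the terminal. Leftover prices that do not form a whole set are dropped from the end of the list here; the article's script instead drops them from the oldest end of the history.

```python
def make_examples(closes, k):
    """Split chronological closing prices into non-overlapping sets.

    Each set holds k input prices followed by one bar that defines the label:
    1 if its close is higher than the last input close, otherwise 0.
    """
    examples = []
    step = k + 1
    for start in range(0, len(closes) - step + 1, step):
        window = closes[start:start + k]
        following = closes[start + k]
        label = 1 if following > window[-1] else 0
        examples.append((window, label))
    return examples


if __name__ == "__main__":
    print(make_examples([1, 2, 3, 2, 5, 4, 4], k=2))
```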

The bars involved in the formation of each set are highlighted in yellow in the figure, the bar that is used to determine the desired value is highlighted in orange.

The order in which the data is provided also affects the training process. Training is more stable if the input vectors corresponding to 1 and to 0 are fed evenly, i.e. in alternation.
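One way to obtain such an even sequence is sketched below in Python (not NTL+; the name `alternate` is hypothetical). It is a simplification: it keeps only complete 1, 0 pairs and drops the surplus examples of the more frequent class.

```python
def alternate(examples):
    """Reorder (inputs, label) pairs so that labels go 1, 0, 1, 0, ..."""
    ones = [e for e in examples if e[1] == 1]
    zeros = [e for e in examples if e[1] == 0]
    result = []
    for one, zero in zip(ones, zeros):
        result.extend([one, zero])
    return result


if __name__ == "__main__":
    data = [("a", 1), ("b", 1), ("c", 0), ("d", 1), ("e", 0)]
    print(alternate(data))
```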

Additionally, let us form a separate data set for assessing the effectiveness of our network. The network will not be trained on this data; it will only be used for calculating the error with the method of least squares. We will put 10% of the initial examples into this test set. As a result, 90% of the examples are used for training and 10% for testing.
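The split itself is simple; a Python sketch (not NTL+; `split_train_test` and its `train_percent` parameter are hypothetical names) might look like this. Integer arithmetic is used so that the size of the training part is rounded down in a predictable way.

```python
def split_train_test(examples, train_percent=90):
    """Return (training part, test part) of an already ordered example list."""
    n_train = len(examples) * train_percent // 100
    return examples[:n_train], examples[n_train:]


if __name__ == "__main__":
    train, test = split_train_test(list(range(10)))
    print(len(train), len(test))
```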

The function calculating the error with the method of least squares looks like this:

$$E = \frac{1}{2}\sum_{j} (t_j - o_j)^2,$$

where $o_j$ is the output signal of the network and $t_j$ is the desired value of the output signal.
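In Python (not NTL+), the same error measure can be sketched as follows; the function name `sum_squared_error` is hypothetical.

```python
def sum_squared_error(outputs, targets):
    """Half of the sum of squared differences between desired and actual outputs."""
    return 0.5 * sum((t - o) ** 2 for o, t in zip(outputs, targets))


if __name__ == "__main__":
    # Two network outputs compared with their desired values.
    print(sum_squared_error([0.5, 1.0], [1.0, 0.0]))
```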

Now let us examine the script code that prepares the input vectors for our neural network.

Let us consider the DataSet class, a set of data. The class includes:

  • Array of input values - input
  • Formed output value - output
  • 'Normalize' method - normalization of data
  • 'OutputDefine' method - determining the desired value from the relationship of prices
  • 'AddData' method - writing values to the input array and to the output variable
  • 'To_file' method - collecting data in one common array for subsequent writing to the file
class DataSet
{
    array <double> input(inputVectorSize);
    double output;

    // Normalization
    void Normalize()
    {
        double min = input[0];
        double max = input[0];
        // Finding minimum and maximum values
        for(uint i=0;i<input.length();i++)
        {
            if(input[i]>max) max=input[i];
            if(input[i]<min) min=input[i];
        }
        // Normalizing data
        for(uint i=0;i<input.length();i++)
        {
            // If min==max, setting all values to 0
            if(max-min<0.000005)
            {
                input[i] = 0;
            }
            else
            {
                input[i]=(input[i]-min)/(max-min);
            }
        }
    }

    // Calculating the output
    void OutputDefine()
    {
        // If the price goes up, output is set to 1
        if(input[inputVectorSize-1]<output)
        {
            output=1;
        }
        // If the price goes down or does not change, output is set to 0
        else
        {
            output=0;
        }
    }

    // Splitting 'rawInput' array into two parts: input array and output value
    void AddData(array <double> rawInput)
    {
        // If the sizes of the arrays don't match, we show a message and stop,
        // so that we never read past the end of 'rawInput'
        if(rawInput.length()!=uint(inputVectorSize+1))
        {
            System.Print("Wrong input array size");
            return;
        }
        // Writing to the 'input' array and 'output' variable
        for(int i=0;i<inputVectorSize;i++)
        {
            input[i]=rawInput[i];
        }
        output=rawInput[inputVectorSize];
    }

    // Merging the 'input' array and 'output' variable into one array
    array<double> To_file()
    {
        array<double> temp(inputVectorSize+1);
        for(int i=0;i<inputVectorSize;i++)
        {
            temp[i]=input[i];
        }
        temp[inputVectorSize]=output;
        return temp;
    }
}

Now let us examine the program code that is used in the Run() function and performs the following sequence of actions:

  • Loading all historical data available in the terminal into an internal array. Loading is performed for the symbol and timeframe of the current chart.
  • Trimming the bar array to a size that is a multiple of the set size (input vector + output value)
  • Forming the arrays of input vectors; normalizing these vectors; determining the desired value 0 or 1
  • Forming the array of sorted input vectors, where the vectors corresponding to 0 and 1 alternate
  • Writing the main part of the sorted array to the file with training data and the remaining part to the file with test data, which is later used for estimating the error with the method of least squares.
int Run()
{
    // The array that holds closing prices of all loaded bars
    array <double> data (Chart.Bars-1);
    // Writing data to array 'data'
    for(int i=1; i < Chart.Bars;i++)
    {
        data[i-1]=Close[i];
    }
    // Truncating excessive data that can't form the whole set
    data.resize((data.length()/(inputVectorSize+1))*(inputVectorSize+1));
    int num=0;
    // The array is for holding all input vectors
    array<DataSet> sets(data.length()/(inputVectorSize+1));
    array <double> temp(inputVectorSize+1);
    // The array is for holding input vectors in the correct order
    array <DataSet> alternation;

    // Forming elements in array 'sets': normalizing each vector and calculating output (0 or 1)
    for(int i=data.length()-1;i>=0;i-=(inputVectorSize+1))
    {
        for(int j=0;j<inputVectorSize+1;j++)
        {
            temp[j]=data[i-j];
        }
        sets[num].AddData(temp);
        sets[num].OutputDefine();
        sets[num].Normalize();
        num++;
    }

    // Forming 'alternation' array, where all the sets are sorted, so that they go 1,0,1,0 etc.
    // 'state' is the global variable of this file (int state = 0)
    uint len = sets.length();
    for(uint i=0;i<len;i++)
    {
        for(uint j=0;j<sets.length();j++)
        {
            if(sets[j].output==0 && state==1)
            {
                alternation.insertLast(sets[j]);
                sets.removeAt(j);
                state = 0;
                break;
            }
            if(sets[j].output==1 && state==0)
            {
                alternation.insertLast(sets[j]);
                sets.removeAt(j);
                state = 1;
                break;
            }
        }
    }

    // The file for writing data for training
    file f;
    f.Open("data.txt",fmWrite|fmText);
    // Calculating the number of sets with data for training and testing data
    uint datasets = int(alternation.length()*0.9);
    uint testsets = alternation.length() - datasets;
    System.Print("datasets SETS = "+datasets);
    // Writing data for training to the file
    for(uint i=0;i<datasets;i++)
    {
        for(int j=0;j<inputVectorSize+1;j++)
        {
            f.WriteDouble(alternation[i].To_file()[j]);
        }
    }
    f.Close();

    System.Print("testsets SETS = "+testsets);
    // Data for testing
    // Writing to the file
    file t;
    t.Open("test.txt",fmWrite|fmText);
    for(uint i=0;i<testsets;i++)
    {
        for(int j=0;j<inputVectorSize+1;j++)
        {
            t.WriteDouble(alternation[i+datasets].To_file()[j]);
        }
    }
    t.Close();
    return(0);
}

You also need to declare the global variable int state = 0 in this file; it is required for alternating the input vectors.

Creating network classes

For our network we will need:

  • an instance of 'layer' class for connecting input and hidden layers of the network
  • an instance of 'layer' class for connecting hidden and output layers of the network
  • an instance of 'net' class for connecting layers of our network

Creating the class 'layer'

We will need a class containing the following properties and methods:

  • array 'input' for storing inputs of the network
  • array 'output' for storing outputs of the network
  • array 'delta' for storing adjustments
  • two-dimensional array 'weights' for edges' weights
  • method 'LoadInputs' for assigning the values specified in the input array to inputs of the layer
  • method 'LoadWeights' for loading weight values of the layer from the hard drive
  • method 'SaveWeights' for saving weight values of the layer to the hard drive
  • constructor allocating the necessary memory volume for the used arrays
  • method 'RandomizeWeights' - filling weights with random values
  • method 'OutputCalculation' - calculating output values
  • methods 'CalculatingDeltaLast' and 'CalculatingDeltaPrevious' - calculating adjustment values
  • method 'WeightsCorrection' - correcting weight values
  • additional (diagnostic) methods for displaying information on the screen:
    • PrintInputs() - printing the inputs of a layer
    • PrintOutputs() - printing the output of a layer (is called after calculating output by means of 'OutputCalculation')
    • PrintDelta() - printing adjustments for a layer (should be called after calculating adjustments by means of 'CalculatingDeltaLast' or 'CalculatingDeltaPrevious')
    • PrintWeights() - printing weights ω of the layer

Let us move the created class to a separate file. This file will also contain the network configuration: the number of nodes in the input, hidden and output layers. These variables are placed outside the class so that they can also be used in the script for input data preparation.

extern int inputVectorSize = 8; // input vector size
extern int L1_Innersize = 8;    // number of nodes in the hidden layer without bias
extern int L2_Innersize = 1;    // number of nodes in the output layer without bias
extern double nu = 0.1;         // training rate for the backpropagation method

class layer
{
    // input values
    array<double> input;
    // output values
    array<double> output;
    // output array size
    int outputSize=0;
    // weights
    array<array <double>> weights;
    // adjustments
    array<double> delta;

    // default constructor
    layer()
    {
    }

    // loading input values from array 'inp' (the last element of 'input' is the bias)
    bool LoadInputs(array <double> inp)
    {
        if(inp.length()+1 != input.length())
        {
            System.Print("Loading is not possible. Array sizes do not match");
            return false;
        }
        for(uint i=0;i<inp.length();i++)
        {
            input[i]=inp[i];
        }
        return true;
    }

    // loading weights from a specified file
    bool LoadWeights(string filename)
    {
        file f;
        if(!f.Open(filename,fmRead|fmText))
        {
            return false;
        }
        for(uint i=0;i<weights.length();i++)
        {
            for(uint j=0;j<weights[0].length();j++)
            {
                weights[i][j]=f.ReadDouble();
            }
        }
        f.Close();
        return true;
    }

    // saving weights to a specified file
    bool SaveWeights(file f)
    {
        for(uint i=0;i<weights.length();i++)
        {
            for(uint j=0;j<weights[0].length();j++)
            {
                if(!f.WriteDouble(weights[i][j]))
                {
                    f.Close();
                    return false;
                }
            }
        }
        return true;
    }

    // constructor
    // inp - array of values
    // size - size of output array
    layer(array<double> inp, int size)
    {
        input = inp;
        input.insertLast(1); // Bias
        outputSize = size;
        // weights array
        array<array<double>> temp_weights(size, array<double>(inp.length()+1));
        array<double> temp_output(size);
        output = temp_output;
        weights = temp_weights;
        array<double> delta_temp(size);
        delta = delta_temp;
    }

    // randomizing weights
    void RandomizeWeights()
    {
        for(uint i=0;i<weights.length();i++)
        {
            for(uint j=0;j<weights[0].length();j++)
            {
                weights[i][j]=(-0.5+double(Math.Rand())/32767)*0.1;
            }
        }
    }

    // printing weights
    void PrintWeights()
    {
        string line;
        for(uint i=0;i<weights.length();i++)
        {
            line = "weights";
            for(uint j=0;j<weights[0].length();j++)
            {
                line += " ["+i+"]["+j+"]="+weights[i][j];
            }
            System.Print(line);
        }
    }

    // printing input values
    void PrintInputs()
    {
        for(uint i=0;i<input.length();i++)
        {
            System.Print("inputs["+i+"]="+input[i]);
        }
    }

    // printing output values
    void PrintOutputs()
    {
        for(uint i=0;i<output.length();i++)
        {
            System.Print("outputs["+i+"]="+output[i]);
        }
    }

    // printing delta values
    void PrintDelta()
    {
        for(uint i=0;i<delta.length();i++)
        {
            System.Print("delta ["+i+"]="+delta[i]);
        }
    }

    // output calculation
    void OutputCalculation()
    {
        // temporary array to store output values
        array<double> a(outputSize,0);
        for(int k=0;k<outputSize;k++)
        {
            for(uint i=0;i<input.length();i++)
            {
                a[k]+=weights[k][i]*input[i];
            }
            a[k]=Activation(a[k]);
        }
        output = a;
    }

    // Calculating adjustments for the last layer
    void CalculatingDeltaLast(array <double> realValues)
    {
        if(realValues.length()!=delta.length())
        {
            System.Print("Mismatch between array sizes");
            return;
        }
        for(uint i=0;i<realValues.length();i++)
        {
            delta[i]=-output[i]*(1-output[i])*(realValues[i] - output[i]);
        }
    }

    // Calculating adjustments for any layer except for the last
    void CalculatingDeltaPrevious(layer &inout LayerLast)
    {
        // the bias input of the next layer has no node in this layer, hence length()-1
        for(uint j=0;j<LayerLast.input.length()-1;j++)
        {
            double sum = 0;
            for(uint k=0;k<LayerLast.output.length();k++)
            {
                sum += LayerLast.delta[k]*LayerLast.weights[k][j];
            }
            delta[j]=output[j]*(1-output[j])*sum;
        }
    }

    // Correcting weights
    void WeightsCorrection()
    {
        for(uint i=0;i<weights.length();i++)
        {
            for(uint j=0;j<weights[0].length();j++)
            {
                weights[i][j] = weights[i][j] - nu*delta[i]*input[j];
            }
        }
    }
}

We will also need two additional functions placed outside the classes: normalization of an input vector and the activation function.

void Normalize(array<double> &inout input)
{
    double min = input[0];
    double max = input[0];
    // finding minimum and maximum
    for(uint i=0;i<input.length();i++)
    {
        if(input[i]>max) max=input[i];
        if(input[i]<min) min=input[i];
    }
    for(uint i=0;i<input.length();i++)
    {
        // If min==max, setting all values to 0
        if(max-min<0.000005)
        {
            input[i] = 0.0;
        }
        else
        {
            input[i]=(input[i]-min)/(max-min);
        }
    }
}

// Activation function (rational sigmoid)
double Activation(double x)
{
    return x/(Math.Abs(x)+1);
}

Creating the class 'net'

We need a class that combines our layers into one network. Such a class should have:

  • Method 'Calculate' - binding our layers for the calculation of the network output. This method will be used later for working with the trained network
  • Method 'CalculateAndLearn' - computing outputs, adjustments and errors. It calls the previous method to calculate the outputs, and for the adjustments and errors it calls the corresponding methods of each layer
  • Method 'SaveNetwork' - saving coefficients (weights) of the network
  • Method 'LoadNetwork' - loading coefficients (weights) of the network
class net
{
    array <double> x(inputVectorSize);
    array <double> y(L1_Innersize);
    layer L1(x,L1_Innersize);
    array<double> L1_outputs(L1_Innersize);
    layer L2(y,L2_Innersize);

    // Calculating output using current weights
    array<double> Calculate(array <double> data, bool randomWeights)
    {
        if(data.length()!=x.length())
        {
            System.Print("Array sizes do not match");
        }
        x = data;
        Normalize(x);
        L1.LoadInputs(x);
        if(randomWeights)
        {
            L1.RandomizeWeights();
            L2.RandomizeWeights();
        }
        L1.OutputCalculation();
        L2.LoadInputs(L1.output);
        L2.OutputCalculation();
        return L2.output;
    }

    // Calculating output and changing weights
    void CalculateAndLearn(array <double> data, array <double> realValues, bool randomWeights)
    {
        Calculate(data, randomWeights);
        // both sets of adjustments are calculated before any weight is corrected
        L2.CalculatingDeltaLast(realValues);
        L1.CalculatingDeltaPrevious(L2);
        L2.WeightsCorrection();
        L1.WeightsCorrection();
    }

    // Saving the weights of the whole network
    bool SaveNetwork(string filename)
    {
        file f;
        if(!f.Open(filename,fmWrite|fmRead|fmText))
        {
            f.Open(filename,fmWrite|fmText);
        }
        L1.SaveWeights(f);
        L2.SaveWeights(f);
        f.Close();
        return true;
    }

    // Loading all weights
    bool LoadNetwork(string filename)
    {
        file fn;
        if(fn.Open(filename,fmRead|fmText)==false)
        {
            return false;
        }
        for(uint i=0;i<L1.weights.length();i++)
        {
            for(uint j=0;j<L1.weights[0].length();j++)
            {
                L1.weights[i][j]=fn.ReadDouble();
            }
        }
        for(uint i=0;i<L2.weights.length();i++)
        {
            for(uint j=0;j<L2.weights[0].length();j++)
            {
                L2.weights[i][j]=fn.ReadDouble();
            }
        }
        // closing the file after reading
        fn.Close();
        return true;
    }
}

Checking the work of the network

In this subsection, we will check the work of our network on an elementary example that shows whether input vectors are classified correctly. To this end, we will use a network with two inputs, two nodes in the hidden layer and one output. We will set the parameter nu to 1. The input data will be alternating sets with the following contents:
input 1,2 and expected value 1
input 2,1 and expected value 0
They are specified in the file in this form:

1.00000000 2.00000000 1.00000000 2.00000000 1.00000000 0.00000000
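To make the layout of this file explicit, here is a Python sketch (not NTL+) that splits such a sequence of numbers into sets. The function name `parse_records` is hypothetical, and the sketch assumes the values are separated by whitespace, as in the sample above.

```python
def parse_records(text, input_size, output_size=1):
    """Split a flat sequence of numbers into (inputs, outputs) sets."""
    values = [float(token) for token in text.split()]
    record_len = input_size + output_size
    records = []
    for start in range(0, len(values) - record_len + 1, record_len):
        chunk = values[start:start + record_len]
        records.append((chunk[:input_size], chunk[input_size:]))
    return records


if __name__ == "__main__":
    sample = "1.00000000 2.00000000 1.00000000 2.00000000 1.00000000 0.00000000"
    for inputs, outputs in parse_records(sample, input_size=2):
        print(inputs, "->", outputs)
```

With two inputs per set, the sample line is read as the set (1, 2) with expected value 1 followed by the set (2, 1) with expected value 0.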

By feeding different numbers of training sets to the input, we can observe how the training of our network progresses.


The red line on the graph shows the training sets corresponding to 1, the blue line those corresponding to 0. The number of training examples is plotted along the x-axis and the calculated network value along the y-axis. It is clear that with few training examples (25 or fewer) the network does not distinguish the different input vectors, while with more of them the division into 2 classes is apparent.

Evaluation of the network functioning

In order to evaluate the effectiveness of training, let us use the function calculating the error with the method of least squares. To do so, let us create a utility with the following code and run it. It needs two files: "test.txt", with the data and desired values (outputs) used in the error calculation, and "NT.txt", the file with the calculated coefficients (weights) of the network.

This script performs the following sequence of actions:

  • creates the object NT of the class 'net'
  • reads the coefficients (weights) ω and loads them into NT
  • creates the arrays of inputs and outputs of the network
  • reads inputs in a loop and places them in the 'x' array, reads outputs and places them in the 'reals' array
  • calculates the output of the network, feeding the 'x' array as an input
  • computes the error as the sum of squares of differences between all calculated values and expected (actual) values
  • when the end of the file is reached, displays the error and terminates
#include "Libraries\NN.ntl"

int Run()
{
    net NT;
    file f;
    int counter = 0;
    // Loading weights
    if(!NT.LoadNetwork("NT.txt"))
    {
        System.Print("Weights file not found");
        return -1;
    }
    double error=0;
    // Declaring an array for storing calculated outputs
    array <double> CalculatedOutput (L2_Innersize);
    // Opening the file with test data
    if(f.Open("test.txt",fmRead|fmText)==false)
    {
        System.Print("Input data not found");
        return -1;
    }
    // Array holds input values
    array <double> x(inputVectorSize);
    // Actual output
    array <double> reals(L2_Innersize);
    while(true)
    {
        for(int i=0;i<inputVectorSize;i++)
        {
            x[i]=f.ReadDouble();
            // End of file: printing the accumulated error and finishing
            if(f.IsEOF())
            {
                System.Print("counter="+counter+" error="+0.5*error+" error/counter="+(0.5*error/counter));
                f.Close();
                return 0;
            }
        }
        // forming real examples
        for(int j=0;j<L2_Innersize;j++)
        {
            reals[j]=f.ReadDouble();
            System.Print("Real exit = "+reals[j]);
        }
        CalculatedOutput = NT.Calculate(x,false);
        System.Print("Calculated exit = "+CalculatedOutput[0]);
        // calculating error
        for(uint i=0;i<reals.length();i++)
        {
            error+=(reals[i]-CalculatedOutput[i])*(reals[i]-CalculatedOutput[i]);
        }
        counter++;
    }
    return(0);
}

Creating an indicator

Let us create an indicator that shows the network's decision about the next bar as a histogram. The logic of the indicator is straightforward: the value 0 corresponds to a falling next bar, the value 1 to a rising one, and 0.5 means that a decrease and an increase of the closing price are equally possible.

To show the indicator in a separate window, we specify #set_indicator_separate. We also need to reference the file where the class 'net' is defined, which is done in the line #include "Libraries\NN.ntl". Moreover, we need a variable for storing the computed output of the network for a given input. With a single output value one variable would be enough, but the network can have several outputs, so we need an array of values, declared in the line array <double> CalculatedOutput (L2_Innersize);.

In the initialization function of the indicator we specify the indicator's parameters and its type and bind the histogram values to two buffers. We also need to restore the parameters of our network, i.e. load all the weight coefficients computed during training. This is done by the method LoadNetwork(string s) of the object NT, whose only parameter is the name of the file containing the weight coefficients.

In the function Draw we form the input vector x from the closing prices of the bars with indices [pos+inputVectorSize-1; pos], where 'pos' is the number of the bar for which the value is computed and 'inputVectorSize' is the dimension of the input vector. Finally, the method 'Calculate' of the object NT is called. It returns an array of values (for the case when the network has several outputs), but since we have only one output, we use the element with index 0.

#set_indicator_separate
#include "Libraries\NN.ntl"

double ExtMapBuffer1[];
double ExtMapBuffer2[];
int ExtCountedBars=0;
net NT;
array <double> CalculatedOutput (L2_Innersize);

int Initialize()
{
    Indicator.SetIndexCount(2);
    Indicator.SetIndexBuffer(0,ExtMapBuffer1);
    Indicator.SetIndexStyle(0,2,0,3,0xFF0000);
    Indicator.SetIndexBuffer(1,ExtMapBuffer2);
    Indicator.SetIndexStyle(1,2,0,3,0xFF0000);
    // Restoring the weights computed during training
    NT.LoadNetwork("NT.txt");
    return(0);
}

int Run()
{
    ExtCountedBars=Indicator.Calculated;
    if (ExtCountedBars<0)
    {
        System.Print("Error");
        return(-1);
    }
    if (ExtCountedBars>0) ExtCountedBars--;
    Draw();
    return(0);
}

void Draw()
{
    int pos=Chart.Bars-ExtCountedBars-1;
    array <double> x(inputVectorSize);
    while(pos>=0)
    {
        ExtMapBuffer1[pos]=0;
        // forming input vector: the oldest bar goes first,
        // in the same order as in the training vectors
        for(int i=pos+inputVectorSize-1; i>=pos; i--)
        {
            x[pos+inputVectorSize-1-i]=Close[i];
        }
        ExtMapBuffer2[pos]=NT.Calculate(x,false)[0];
        pos--;
    }
}

The diagram above shows the network's forecast for the subsequent bar. Values from 0.5 to 1 suggest that the subsequent bar is more likely to rise than to fall. Values from 0 to 0.5 suggest that it is more likely to fall than to rise. The value 0.5 indicates uncertainty: the next bar might go up or down.
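This reading of the indicator value can be written down as a small Python sketch (not NTL+; the function name `interpret` and the returned labels are hypothetical).

```python
def interpret(output, neutral=0.5):
    """Translate the network output into a verbal forecast for the next bar."""
    if output > neutral:
        return "rise"
    if output < neutral:
        return "fall"
    return "undecided"


if __name__ == "__main__":
    for value in (0.8, 0.5, 0.1):
        print(value, interpret(value))
```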

Summary

Neural networks are powerful tools for data analysis. This article covered the creation of a neural network by means of object-oriented programming in the NTL+ language. Object-oriented programming simplified the code and made it much easier to reuse in future scripts. In the presented implementation we created the class 'layer', which defines a layer of the neural network, and the class 'net', which defines the network as a whole. We used the class 'net' in the indicator, in computing the error and in calculating the internal weights of the network. We also showed how to use the trained network, taking as an example a histogram indicator that displays the network's forecast of the price change.