Neural network step by step guide

A neural network in machine learning is a type of model inspired by the human brain. It consists of interconnected nodes called “neurons” that process information in layers. The network learns complex patterns from data by adjusting the connections between neurons (the weights), and its predictions improve over time. This makes it particularly useful for tasks like image recognition and natural language processing. The running example in this post is an image recognition neural network that recognizes handwritten digits. This post goes over the steps needed to build this network.

Step 1 – Data gathering

To train any neural network, having a lot of data is very important, and more data is generally better. A neural network that recognizes handwritten digits needs many examples of handwritten digits. They can be stored as images, which serve as the raw data.

Step 2 – Data labeling

A neural network for digit recognition is trained with supervised learning, and supervised learning requires labeled data. Every image that contains a digit needs a label indicating which digit it is. For example, if the digit in the image is a 3, then the image needs to be labeled with 3. With labeled data, the network gets feedback on whether it recognized each image correctly.

Step 3 – Data processing

To be usable, the data needs to be transformed into a format the computer can work with. For the handwritten digit recognition network, each raw image can be transformed into pixel values. For example, a 28 by 28 pixel image has 784 pixels in total, so it can be transformed into an array of 784 pixel values. Each value is a grayscale level from 0 to 1, where 0 means completely white, 1 means completely black, and anything in between is a level of darkness.
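The transformation described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `image_to_input` is hypothetical, and it assumes the raw image is already available as a 28 by 28 grid of 8-bit grayscale values in which 0 is black and 255 is white, so the values are inverted to match the 0 = white, 1 = black convention used in this post.

```python
def image_to_input(pixels):
    """Flatten a 2-D grid of 8-bit grayscale values into a flat list of floats.

    Assumption: in the raw image 0 is black and 255 is white. The result is
    inverted so that 0.0 means white and 1.0 means black, as in this post.
    """
    return [1.0 - value / 255.0 for row in pixels for value in row]


# Hypothetical 28x28 image: all white, except one black pixel in the top-left corner.
image = [[255] * 28 for _ in range(28)]
image[0][0] = 0

inputs = image_to_input(image)
print(len(inputs))           # 784
print(inputs[0], inputs[1])  # 1.0 0.0
```

Reading the grid out of an actual image file would need an image library, which is left out here to keep the sketch self-contained.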

Step 4 – Data modeling

A neural network consists of interconnected nodes called neurons that process information in layers. Each neuron holds a value, and we need to define what the values in each layer mean. The values in the input layer mean something different from the values in a hidden layer, and the values in a hidden layer mean something different from the values in the output layer.

In the handwritten digit recognition network, the input layer has 784 neurons, each holding a value from 0 to 1 that represents the grayscale of the pixel at a particular spot in the image. The hidden layer can have whatever number of neurons works best. If we choose 15, there will be 15 values, each obtained by applying weights, a bias and an activation function to the 784 input values. The output layer has 10 neurons, representing the digits 0 to 9. It is an array of 10 floats between 0 and 1, where the value at index 0 represents the digit 0, the value at index 1 represents the digit 1, the value at index 2 represents the digit 2, and so on. If the network predicts the digit in the image with complete certainty, the output for an image containing a 3 will be the array [0,0,0,1,0,0,0,0,0,0], the output for an image containing a 9 will be the array [0,0,0,0,0,0,0,0,0,1], etc.
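As a small sketch of how the output layer relates to the labels from step 2, the snippet below builds the ideal 10-value array for a label and reads a predicted digit back out of an output array. The helper names `one_hot` and `predicted_digit`, and the idea of picking the index with the largest value as the prediction, are assumptions made for illustration.

```python
def one_hot(digit, num_classes=10):
    """Return the ideal output array for a label: 1.0 at the digit's index, 0.0 elsewhere."""
    target = [0.0] * num_classes
    target[digit] = 1.0
    return target


def predicted_digit(output):
    """Return the index of the largest output value, taken here as the network's prediction."""
    return max(range(len(output)), key=lambda i: output[i])


print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# A hypothetical, less-than-perfect network output for an image of a 9.
output = [0.02, 0.01, 0.05, 0.10, 0.03, 0.01, 0.02, 0.20, 0.04, 0.85]
print(predicted_digit(output))  # 9
```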

Step 5 – Data transformation between layers

In a neural network, data goes through layers of transformation until it reaches the output layer. In the simplest neural network, with only 1 hidden layer, the data goes from the input layer to the hidden layer and then from the hidden layer to the output layer. To transform the data, we apply weights, biases and an activation function to the values of the input layer and of the hidden layer. After the data in the input layer is transformed into the hidden layer, the hidden layer becomes the new input, and it goes through the same kind of transformation to produce the output layer. The transformations between layers progressively extract more complex features from the data by building upon the representations learned in previous layers. This process is key to how neural networks can learn intricate patterns in data.

For neurons in the input layer, the value is simply the input value. For example, an input neuron in the handwritten digit recognition network is simply a value between 0.0 and 1.0 that represents the grayscale of a pixel in an image. A 28 by 28 pixel digit image has 784 pixel values, hence 784 neurons as the inputs.

A neuron in a hidden layer gets its value by calculating a weighted sum of the values of the connected neurons in the previous layer, adding a bias, and then passing the result through an activation function, which transforms the sum into the final output value of the neuron. For example, to get the value of the first neuron of the hidden layer in the handwritten digit recognition network with 784 input neurons, we calculate the weighted sum of the 784 input values, add a bias and then pass the result through an activation function.

Since there are 784 input values, there are 784 weights, because each input has its own weight. A weighted value is obtained by multiplying an input value by its weight, so doing this for 784 inputs gives 784 weighted values. The weighted sum is obtained by summing up these 784 weighted values. A bias is then added to the weighted sum, and the result is passed through an activation function to get the final value of the neuron.

Each neuron in the hidden layer goes through all of these calculations to get its value. The input values stay the same, but the weights and the bias are different for every neuron in the hidden layer. In a diagram of the handwritten digit recognition network, each neuron in the hidden layer has 784 lines coming into it because there are 784 input values. If there is more than one hidden layer, each neuron in the next hidden layer goes through the same calculation process. The only difference is that the inputs are now the neurons from the previous hidden layer instead of the input layer.
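The calculation for a single neuron can be sketched as follows. The post does not say which activation function is used, so the sigmoid below is an assumption chosen because it maps any number into the range 0 to 1. The three inputs, weights and bias are made-up numbers, kept small so the arithmetic is easy to follow. A real hidden neuron in the digit network would have 784 of each.

```python
import math


def sigmoid(z):
    """Assumed activation function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_value(inputs, weights, bias):
    """Weighted sum of the inputs, plus a bias, passed through the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum + bias)


# Made-up example with 3 inputs instead of 784.
inputs = [0.0, 0.5, 1.0]
weights = [0.2, -0.4, 0.6]
bias = 0.1

# Weighted sum: 0.0*0.2 + 0.5*(-0.4) + 1.0*0.6 = 0.4, plus the bias gives 0.5.
value = neuron_value(inputs, weights, bias)
print(round(value, 4))  # 0.6225, which is sigmoid(0.5)
```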

For the output layer, each neuron essentially gets its value the same way as the neurons in the hidden layer. The only difference is that the input neurons are now from the hidden layer right before the output layer.
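Putting the layers together, a full pass from the 784 inputs through 15 hidden neurons to the 10 outputs might look like the sketch below. It assumes a sigmoid activation and small random starting weights, neither of which is specified in this post, and the helper names `layer_forward` and `random_layer` are hypothetical. Because the weights are random and untrained, the output values carry no meaning yet.

```python
import math
import random


def sigmoid(z):
    """Assumed activation function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute every neuron in a layer; weights[j] holds the weights of neuron j."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def random_layer(n_inputs, n_neurons, rng):
    """Create small random weights and biases (an assumed way to initialize a layer)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_neurons)]
    biases = [rng.uniform(-0.1, 0.1) for _ in range(n_neurons)]
    return weights, biases


rng = random.Random(0)  # fixed seed so the run is repeatable
hidden_weights, hidden_biases = random_layer(784, 15, rng)
output_weights, output_biases = random_layer(15, 10, rng)

pixels = [rng.random() for _ in range(784)]  # stand-in for a real image

hidden = layer_forward(pixels, hidden_weights, hidden_biases)  # input layer -> hidden layer
output = layer_forward(hidden, output_weights, output_biases)  # hidden layer -> output layer

print(len(hidden), len(output))  # 15 10
```

The same `layer_forward` function is used for both transformations, which mirrors the point above: the output layer is computed the same way as the hidden layer, only with different inputs.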

Step 6 – Calculating the cost via a cost function

A cost function in a neural network is a mathematical function that measures how well the network is performing on a given task by calculating the difference between the predicted output and the actual target value. It acts as a metric that guides the network in adjusting its weights and biases to minimize the error and improve accuracy during training. The goal is to find the set of parameters that produces the lowest cost value, which represents the best prediction the network can make.
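This post does not commit to a particular cost function, so the sketch below uses mean squared error as one common choice. It compares a network output against the ideal 10-value array for the label. The two sample outputs are made up for illustration.

```python
def mean_squared_error(output, target):
    """Average of the squared differences between predicted and target values."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


# Ideal output for an image labeled 3.
target = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical outputs: one close to the target, one that is undecided.
confident = [0.01, 0.02, 0.01, 0.95, 0.01, 0.02, 0.01, 0.03, 0.01, 0.02]
undecided = [0.5] * 10

print(mean_squared_error(target, target))     # 0.0, a perfect prediction
print(mean_squared_error(confident, target))  # small cost
print(mean_squared_error(undecided, target))  # 0.25, a larger cost
```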

Step 7 – Taking the derivative of the cost function and updating the model’s parameters

The derivative of a cost function represents the “slope” of the cost function at a given point. It indicates how much the cost changes in response to a small change in the model’s parameters (weights and biases), and so it tells us in which direction to update the parameters to reduce the cost during training with gradient descent. By calculating the derivative of the cost function with respect to each parameter, we can determine how much adjusting that parameter will affect the overall cost, which lets us update the parameters in the direction that reduces the cost the most. Gradient descent relies on this: the algorithm iteratively updates the parameters in the direction of the negative gradient, which is the direction in which the cost decreases fastest.
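To make the update rule concrete, here is gradient descent on a deliberately tiny, made-up cost function with a single parameter `w`, whose lowest cost is at `w = 3`. A real network has many weights and biases, but each one is updated by the same rule: subtract the learning rate times the derivative of the cost with respect to that parameter. The learning rate of 0.1 is an arbitrary choice for this sketch.

```python
def cost(w):
    """Made-up cost function with its minimum at w = 3."""
    return (w - 3.0) ** 2


def cost_derivative(w):
    """Slope of the cost at w: positive when w is too large, negative when too small."""
    return 2.0 * (w - 3.0)


learning_rate = 0.1  # how big a step to take on each update
w = 0.0              # starting guess

for step in range(50):
    # Move against the slope, i.e. in the direction that lowers the cost.
    w -= learning_rate * cost_derivative(w)

print(w)        # very close to 3.0
print(cost(w))  # very close to 0.0
```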

Step 8 – Train the neural network model

Assume steps 1 to 4 have been done for all the data available for training. The rest is simply feeding each data entry into the network to go through steps 5 to 7, until all the data has been used or the model is able to produce accurate predictions.
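The loop below is a rough end-to-end sketch of steps 5 to 7 on a toy problem rather than on real digit images. Everything specific in it is an assumption made for illustration: 4-pixel "images" with 2 classes instead of 784 pixels and 10 digits, 3 hidden neurons, a sigmoid activation, mean squared error as the cost, a learning rate of 0.5, and 500 repeated passes over the data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Step 5: input layer -> hidden layer -> output layer."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(xi * w for xi, w in zip(x, ws)) + b) for ws, b in zip(w1, b1)]
    output = [sigmoid(sum(h * w for h, w in zip(hidden, ws)) + b) for ws, b in zip(w2, b2)]
    return hidden, output


def cost(output, target):
    """Step 6: mean squared error between the output and the target."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


def train_step(x, target, params, learning_rate):
    """Step 7: compute the derivatives and update the weights and biases in place."""
    w1, b1, w2, b2 = params
    hidden, output = forward(x, params)
    n = len(output)
    # Derivative of the cost with respect to each neuron's weighted sum plus bias.
    delta_out = [2.0 * (o - t) / n * o * (1.0 - o) for o, t in zip(output, target)]
    delta_hidden = [
        sum(delta_out[k] * w2[k][j] for k in range(n)) * hidden[j] * (1.0 - hidden[j])
        for j in range(len(hidden))
    ]
    # Gradient descent update: move each parameter against its derivative.
    for k in range(n):
        for j in range(len(hidden)):
            w2[k][j] -= learning_rate * delta_out[k] * hidden[j]
        b2[k] -= learning_rate * delta_out[k]
    for j in range(len(hidden)):
        for i in range(len(x)):
            w1[j][i] -= learning_rate * delta_hidden[j] * x[i]
        b1[j] -= learning_rate * delta_hidden[j]


def average_cost(data, params):
    return sum(cost(forward(x, params)[1], t) for x, t in data) / len(data)


# Toy labeled data: dark pixels on the left are class 0, on the right are class 1.
data = [
    ([1.0, 1.0, 0.0, 0.0], [1.0, 0.0]),
    ([0.9, 0.8, 0.1, 0.0], [1.0, 0.0]),
    ([0.0, 0.0, 1.0, 1.0], [0.0, 1.0]),
    ([0.1, 0.0, 0.8, 0.9], [0.0, 1.0]),
]

rng = random.Random(0)
params = (
    [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(3)],  # hidden weights
    [0.0] * 3,                                                       # hidden biases
    [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)],  # output weights
    [0.0] * 2,                                                       # output biases
)

initial_cost = average_cost(data, params)
for epoch in range(500):
    for x, target in data:
        train_step(x, target, params, learning_rate=0.5)
final_cost = average_cost(data, params)

print(initial_cost, final_cost)  # the cost after training should be lower
```

Updating the parameters after every single example, as done here, is only one option. Averaging the derivatives over a batch of examples before each update is a common alternative.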
