It's time to move on to the architecture of a neural network, and along the way we'll pick up some deep learning jargon. At a high level, a generic neural network has three main components: the input layer made up of input neurons, the output layer made up of output neurons, and the hidden layer. A hidden layer is literally any layer that is neither an input nor an output layer. It's funny, for a while I wasn't sure what a hidden layer meant. Does a hidden layer get its name from some mathematical significance? The answer is no. Like I said before, it's just a layer that is neither an input nor an output layer.
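To make the layer structure concrete, here's a minimal sketch in Python with NumPy. The layer sizes are made up for illustration and aren't prescribed by anything above:

```python
import numpy as np

# A made-up network shape: 4 input neurons, one hidden layer of
# 5 neurons, and a single output neuron.
layer_sizes = [4, 5, 1]

# One weight matrix and one bias vector per pair of adjacent layers.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

print([w.shape for w in weights])  # [(5, 4), (1, 5)]
```

Note that the hidden layer doesn't appear anywhere special in the code; it's just the middle entry of `layer_sizes`.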

[Figure: Architecture of Neural Network]

Let's go ahead and introduce another hidden layer to our architecture. Now that we have an additional layer, some folks might refer to this architecture as a multilayer perceptron. The name is used largely for historical reasons, and it sticks around despite the fact that most architectures use sigmoid neurons instead of perceptrons - misleading, to say the least. An example of this can be found in this article by Microsoft on applying multilayer perceptrons to polymorphic malware classification. I find it hard to believe that the temperamental perceptron is doing the heavy lifting in this case, but I could be wrong.
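To see why the naming is misleading, here's a quick sketch (mine, not from the Microsoft article) contrasting the perceptron's hard threshold with the sigmoid neuron's smooth output:

```python
import numpy as np

def perceptron(z):
    # Perceptron: a hard threshold -- the output flips abruptly at z = 0.
    return np.where(z > 0, 1, 0)

def sigmoid(z):
    # Sigmoid neuron: a smooth curve between 0 and 1, so a small change
    # in the input produces a small change in the output.
    return 1 / (1 + np.exp(-z))

z = np.array([-2.0, -0.1, 0.1, 2.0])
print(perceptron(z))  # [0 0 1 1]
print(sigmoid(z))     # [0.119 0.475 0.525 0.881] (approximately)
```

That smoothness is what makes sigmoid neurons much easier to train than the temperamental perceptron.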

[Figure: Architecture of Neural Network]

The number of layers and neurons is defined by the task. For example, if we want to classify 256 x 256 grayscale images of cats, we would define an input layer of 65,536 input neurons (256 x 256). Each input neuron represents the intensity of its corresponding pixel, scaled to a value between 0 and 1. The output layer, containing a single output neuron, would produce a value between 0 and 1, with values above 0.5 indicating the image contains a cat and values below 0.5 indicating it doesn't.
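Here's a rough sketch of that setup; the image is random noise standing in for a real photo, and the output value is hard-coded where a trained network's forward pass would go:

```python
import numpy as np

# Random noise standing in for a real 256 x 256 grayscale photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256))  # pixel intensities 0-255

x = image.flatten() / 255.0      # 65,536 input neurons, each in [0, 1]
assert x.shape == (256 * 256,)   # == (65536,)

# Pretend a trained network produced this output value; in practice it
# would come from the network's forward pass.
output = 0.73
print("cat" if output > 0.5 else "not a cat")
```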

Another important property of this network is that each layer of neurons provides the inputs to the next layer in the network; in other words, information flows strictly forward and there are no loops. Networks with this property are called feedforward networks. So there we have it, a quick introduction to feedforward neural networks.
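A minimal sketch of such a forward pass, assuming sigmoid neurons and a tiny made-up network, might look like this:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def feedforward(x, weights, biases):
    # Each layer's output becomes the next layer's input -- data flows
    # strictly forward, which is what makes the network "feedforward".
    a = x
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a

# Tiny made-up network: 3 inputs -> 4 hidden neurons -> 1 output.
rng = np.random.default_rng(1)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((1, 4))]
biases = [rng.standard_normal(4), rng.standard_normal(1)]

print(feedforward(np.array([0.2, 0.5, 0.9]), weights, biases))
```

The loop visits each layer exactly once and never feeds an output back to an earlier layer, which is the "no loops" property in action.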

This post was developed with the help of: