chapter two

2 Neural network architectures

 

This chapter covers

  • The need for different network types for different data types
  • Using fully connected neural networks for tabular-like data
  • Using 2D convolutional neural networks for image-like data
  • Using 1D convolutional neural networks for ordered data

The vast majority of DL models are based on one or a combination of three types of layers: fully connected, convolutional, and recurrent. The success of a DL model depends in great part on choosing the right architecture for the problem at hand.

If you want to analyze data whose features have no inherent ordering or local structure, like tabular data in Excel sheets, then you should consider fully connected networks. If the data has a special local structure, as images do, then convolutional NNs are your friend. Finally, if the data is sequential, like text, then the easiest option is to use 1D convolutional networks. This chapter gives you an overview of the different architectures used in DL and provides hints as to when to use which architectural type.
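To get a first feeling for what "local structure" means for ordered data, here is a minimal sketch of a 1D convolution in plain Python. It is only an illustration under a few assumptions: the names `conv1d`, `signal`, and `edge_kernel` are made up for this example, the kernel values are chosen by hand rather than learned, and, as is usual in DL, the kernel is slid over the sequence without being flipped.

```python
def conv1d(sequence, kernel):
    """Slide the kernel along the sequence (stride 1, no padding)."""
    k = len(kernel)
    return [
        # Each output value only looks at k neighboring inputs.
        sum(w * v for w, v in zip(kernel, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]

# Hypothetical toy sequence: a flat signal with a plateau in the middle.
signal = [0, 0, 1, 1, 1, 0, 0]

# Hand-picked kernel that responds to a change between two neighbors.
edge_kernel = [-1, 1]

response = conv1d(signal, edge_kernel)
print(response)  # [0, 1, 0, 0, -1, 0]
```

The response is nonzero only where the signal jumps up or down. The same two weights are reused at every position, which is why a convolutional layer needs far fewer weights than a fully connected layer on the same input.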

2.1   Fully connected neural networks

Before diving into the details of the different DL architectures, let’s look at figure 2.1 and recall the architecture of a typical traditional artificial NN that we discussed in chapter 1. The visualized NN has three hidden layers, each holding nine neurons. Each neuron within a layer is connected with each neuron in the next layer. This is the reason why this architecture is called a densely connected NN or a fully connected neural network (fcNN).
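The following sketch shows what "each neuron is connected with each neuron in the next layer" means in code. It builds a network with three hidden layers of nine neurons each, as in the description above, using only the Python standard library. Several things are assumptions made for illustration and are not taken from the figure: the input size of 4, the output size of 2, the sigmoid activation, the random weight initialization, and the helper names `init_layer`, `dense`, and `forward`.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Create one dense layer: one weight per (input, output) pair."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, weights, biases):
    """Every output neuron sees every input value (fully connected)."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(row, x)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid (assumed activation)
    return outputs

def forward(x, layers):
    """Pass the input through all layers, one after the other."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x

rng = random.Random(0)

# Input size 4 and output size 2 are assumptions; the three hidden
# layers with nine neurons each follow the description in the text.
sizes = [4, 9, 9, 9, 2]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

output = forward([0.5, -1.2, 3.0, 0.1], layers)
n_weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
print(len(output), n_weights)  # 2 216
```

Because every neuron connects to every neuron in the next layer, the number of weights between two layers is the product of their sizes: 4·9 + 9·9 + 9·9 + 9·2 = 216 for the sizes assumed here.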

2.1.1   The biology that inspired the design of artificial NNs

2.1.2   Getting started with implementing an NN

2.1.3   Using a fully connected NN to classify images

2.2   2D convolutional NNs for image-like data

2.2.1   Main ideas in a CNN architecture

2.2.2   A minimal CNN for edge lovers

2.2.3   Biological inspiration for a CNN architecture

2.2.4   Building and understanding a CNN

2.3   One-dimensional CNNs for ordered data

2.3.1   Format of time-ordered data

2.3.2   What’s special about ordered data?

2.3.3   Architectures for time-ordered data

2.4   Summary