Deep Learning: Foundations and Concepts

296

10. CONVOLUTIONAL NETWORKS

Figure 10.7 The multi-dimensional convolutional filter layer shown in Figure 10.6 can be extended to include multiple independent filter channels.

We now make a further important extension to convolutions. Up to now we have created a single feature map in which all the points in the feature map share the same set of learnable parameters. For a filter of dimensionality M × M × C, this will have M 2 C weight parameters, irrespective of the size of the image. In addition there will be a bias parameter associated with this unit. Such a filter is analogous to a single hidden node in a fully connected network, and it can learn to detect only one kind of feature and is therefore very limited. To build more flexible models, we simply include multiple such filters, in which each filter has its own independent set of parameters giving rise to its own independent feature map, as illustrated in Figure 10.7. We will again refer to these separate feature maps as channels. The filter tensor now has dimensionality M × M × C × COUT where C is the number of input channels and COUT is the number of output channels. Each output channel will have its own associated bias parameter, so the total number of parameters will be (M 2 C + 1)COUT . A useful concept in designing convolutional networks is the 1 × 1 convolution (Lin, Chen, and Yan, 2013), which is simply a convolutional layer in which the filter size is a single pixel. The filters have C weights, one for each input channel, plus a bias. One application for 1 × 1 convolutions is simply to change the number of channels (typically to reduce the number of channels) without changing the size of the feature maps, by setting the number of output channels to be different to the number of input channels. It is therefore complementary to strided convolutions or pooling in that it reduces the number of channels rather than the dimensionality of the channels.

10.2.6

Pooling

A convolutional layer encodes translation equivariance, whereby if a small patch of pixels, representing the receptive field of a hidden unit, is moved to a different

Turn static files into dynamic content formats.

Create a flipbook