What are Capsule Networks and why do they exist?
The Capsule Network is a new type of neural network architecture conceptualized by Geoffrey Hinton. The motivation behind Capsule Networks is to address some of the shortcomings of Convolutional Neural Networks (ConvNets), which are listed below:
Problem 1: ConvNets are Translation Invariant [1]
What does that even mean? Imagine that we have a model that detects cats. You show it an image of a cat, and it predicts that it’s a cat. You show it the same image, but shifted to the left, and it still predicts that it’s a cat, without providing any additional information about the shift.
Figure 1.0: Translation Invariance
What we want to strive for is translation equivariance. That means when you show it an image of a cat shifted to the right, it predicts that it’s a cat shifted to the right. When you show it the same cat shifted to the left, it predicts that it’s a cat shifted to the left.
Figure 1.1: Translation Equivariance
Why is this a problem? ConvNets are unable to identify the position of one object relative to another; they can only identify whether an object exists in a certain region or not. As a result, they have difficulty correctly identifying objects whose sub-objects hold positional relationships relative to one another.
For example, a bunch of randomly assembled face parts will look like a face to a ConvNet, because all the key features are there, regardless of their arrangement.
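The distinction between invariance and equivariance can be sketched with a toy 1-D example. This is an illustrative sketch, not a real ConvNet: a single activation spike stands in for a detected feature, global max pooling stands in for the invariant readout, and the argmax stands in for a position-aware (equivariant) readout.

```python
import numpy as np

# Toy 1-D "feature map": one activation spike standing in for a detected
# cat feature (illustrative only, not an actual ConvNet activation).
feature_map = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
shifted_map = np.roll(feature_map, 2)  # same feature, shifted 2 positions right

# Translation INVARIANCE: a global max pool only reports "the feature
# exists" -- the output is identical for both inputs, so position is lost.
print(feature_map.max(), shifted_map.max())      # 1.0 1.0

# Translation EQUIVARIANCE: the argmax shifts along with the input, so
# the output still encodes WHERE the feature is.
print(feature_map.argmax(), shifted_map.argmax())  # 2 4
```

The pooled values are identical for the original and shifted inputs, while the argmax moves by exactly the amount of the shift, which is the behavior the text describes ConvNets lacking at the object level.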
Link:
https://kndrck.co/posts/capsule_networks_explained/