Background: What is Deep Learning?
This article is primarily focused on how to use Deep Learning tools. Before wading out of the shallows, though, it is very helpful to understand what Deep Learning is. In traditional (or classical) machine learning, we use statistical techniques to understand how input "features" explain the variance of an outcome (called a target), and then use that relationship to predict the outcome on new data.
When building a classical machine learning model, while some work might go into choosing which machine learning algorithm to use and the best choice of algorithmic parameters, the most important decisions involve which set of features to include as inputs. This process is sometimes called "feature engineering." Tremendous amounts of time are spent inspecting features, combining datasets, assessing correlations, encoding categorical values, and cleaning the data.
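To make "feature engineering" concrete, here is a minimal sketch of turning a raw record into a hand-chosen feature vector. The record fields (monthly spend, income, region, tenure) and the specific features are hypothetical examples, not anything prescribed by a particular library:

```python
# Toy illustration of hand feature engineering for a classical model.
# All field names and feature choices below are made up for illustration.

def engineer_features(record):
    """Turn a raw record into a hand-chosen feature vector."""
    # Ratio feature: domain knowledge says spend relative to income matters.
    spend_ratio = record["monthly_spend"] / record["monthly_income"]
    # One-hot encode a categorical field by hand.
    region_onehot = [1.0 if record["region"] == r else 0.0
                     for r in ("north", "south", "east", "west")]
    # Simple cleaning: clip an outlier-prone field to a sane range.
    tenure = min(record["tenure_months"], 120)
    return [spend_ratio, tenure] + region_onehot

raw = {"monthly_spend": 500.0, "monthly_income": 2000.0,
      "region": "south", "tenure_months": 300}
print(engineer_features(raw))  # [0.25, 120, 0.0, 1.0, 0.0, 0.0]
```

Every entry in that vector reflects a human decision; in a real project, dozens of such decisions accumulate and must each be validated.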
Deep learning, in contrast, works very differently. Instead of hand-tuning features and trying to decide which are most important to the target, those decisions are delegated to the computer. Neural nets are able to ingest raw data and find useful representations which contribute to the outcome of interest. At lower levels in the network, these representations tend to be simple and abstract, such as edges or textures. As you move toward higher levels, they take on more meaningful form as recognizable features (such as eyes, ears, or object attributes). In fact, attempting to inject additional hand-crafted features may not even be desirable, as the machine is often very good at drawing its own inferences.
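A tiny example makes this concrete. The XOR function cannot be predicted from either raw input alone, so a model must invent an intermediate representation to solve it. The sketch below trains a minimal two-layer network with plain numpy; the layer sizes, learning rate, and step count are illustrative choices, not canonical values:

```python
import numpy as np

# Minimal sketch: a tiny network learns its own internal features for XOR.
# The hidden layer `h` is the learned representation -- no feature was
# engineered by hand. Sizes and learning rate are illustrative.

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for step in range(5000):
    # Forward pass: h is the representation the network discovers on its own.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((p - y) ** 2)))
    # Backward pass: mean-squared-error gradients, computed by hand.
    dp = 2 * (p - y) / len(X)
    dz2 = dp * p * (1 - p)
    dW2 = h.T @ dz2; db2 = dz2.sum(0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ dz1; db1 = dz1.sum(0)
    lr = 0.5
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The only human inputs here are the raw bits and the labels; the intermediate features live in `h` and are found by gradient descent.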
This ability of the model to draw its own inferences is transformational. It's this ability that makes Deep Learning so powerful, and also a little bit mysterious.
Aside: Historically, understanding why Deep Learning models draw the conclusions they do has been difficult, though this is starting to change. Using libraries such as SHAP, it's possible to inspect how the model perceives the input data and which features may have contributed to a specific outcome. The figure below shows how the predictions for two input images are explained by the presence of certain attributes. Red pixels represent positive contributions to the label probability, while blue pixels represent negative contributions. In the case of the bird image, the model recognizes the presence of a narrow beak and certain patterns in the plumage, patterns which are not seen in the next most likely label (red-backed_sandpiper). In the case of the meerkat image, it recognizes the wide striping around the eyes.
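The idea underlying SHAP is the Shapley value from game theory: a feature's contribution is its average effect on the model's output across every possible subset of the other features. The sketch below computes exact Shapley values for a made-up three-feature model (the shap library approximates this efficiently for real models, which is necessary because exact computation is exponential in the number of features):

```python
from itertools import combinations
from math import factorial

# Sketch of the idea behind SHAP: exact Shapley values for a toy model.
# The model, baseline, and input point are made up for illustration.

def model(x):
    # A made-up model with a feature interaction between x[0] and x[2].
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]

baseline = [0.0, 0.0, 0.0]   # reference values meaning "feature absent"
point = [1.0, 1.0, 1.0]      # the input we want to explain
n = len(point)

def value(subset):
    # Evaluate the model with only the features in `subset` switched on.
    x = [point[i] if i in subset else baseline[i] for i in range(n)]
    return model(x)

def shapley(i):
    # Average feature i's marginal contribution over all subsets of the rest.
    total = 0.0
    others = [j for j in range(n) if j != i]
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += w * (value(set(S) | {i}) - value(set(S)))
    return total

phis = [shapley(i) for i in range(n)]
print(phis)  # [2.25, 1.0, 0.25]
```

Note that the contributions sum exactly to the gap between the model's output at the explained point and at the baseline (3.5 - 0.0), which is the property that makes these values useful as explanations.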
It's worth noting that it's not always possible to directly inspect why the machine came to the conclusions it did. The second set of images shows a network that has been run in reverse, where it was asked to draw a picture based on a particular label of interest. The top example shows what came out when the network was asked about a "knight," while the second shows how the AI perceives "buildings."
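Mechanically, "running a network in reverse" usually means holding the trained weights fixed and using gradient ascent to nudge the input itself until the chosen label's score is maximized. The sketch below shows that loop with a tiny made-up linear scorer standing in for a trained network; the weights, sizes, step size, and iteration count are all illustrative:

```python
import numpy as np

# Sketch of "running a network in reverse": instead of updating weights,
# hold them fixed and adjust the *input* to maximize one class's score.
# The linear scorer and its random weights are stand-ins for a real network.

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 64))   # made-up weights: 10 classes, 64 input "pixels"
x = np.zeros(64)                # start from a blank input image
target = 3                      # the label we want the input to depict

before = float(W[target] @ x)
for _ in range(100):
    # For a linear scorer, the gradient of the target score w.r.t. the
    # input is simply the target row of W; step the input along it.
    x += 0.1 * W[target]
    x = np.clip(x, 0.0, 1.0)    # keep "pixels" in a valid range

after = float(W[target] @ x)
print(f"target score: {before:.2f} -> {after:.2f}")
```

With a real deep network the gradient comes from backpropagation rather than a single weight row, and extra regularization is needed to keep the optimized input looking like a natural image, but the loop is the same.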