Abstract

Deep learning with artificial neural networks has become a fast-paced field of research that has grown rapidly since the introduction of AlexNet in 2012 (Krizhevsky, Sutskever, & Hinton, 2012). Since then, object recognition networks have matched or exceeded human performance on image databases (He, Zhang, Ren, & Sun, 2015), and deep networks have found a wide variety of applications, ranging from speech recognition, to algorithmic generation of faces and landscapes, to reconstruction of visual stimuli from neural recordings. Given the success of these approaches, deep learning’s utility for data analysis, and its potential for modelling aspects of sensory processing and decision making, it is essential that researchers in perception become familiar with the variety of techniques bundled under the term deep learning. Deep learning research and applications have been dominated by computer science and engineering, but many cutting-edge tools and algorithms are openly available for data analysis and as a platform for biologically plausible or biologically inspired modelling.
Deep learning implies several things. First, it is an implementation of an artificial neural network that has many layers (more than three, and sometimes hundreds). Second, the network is trained on large datasets (often millions of examples). Third, innovations in graphics processing unit (GPU)-based computation allow these complex models (AlexNet, for example, has ∼62 million parameters) to be optimized in a computationally efficient way using gradient descent methods and the backpropagation algorithm. Beyond this, deep learning is a catch-all term for a large variety of algorithms and approaches that can benefit from the above.
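To make the terms gradient descent and backpropagation concrete, here is a minimal sketch in plain NumPy. It is not taken from Charniak’s book: the toy data, the network size (3 inputs, 4 hidden units, 1 output), the tanh activation, the squared-error loss, and the learning rate are all assumptions chosen for illustration. The sketch computes gradients by applying the chain rule layer by layer, checks one of them against a finite-difference estimate, and then takes a few hundred gradient descent steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for illustration): 8 examples, 3 features, scalar target.
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))

# A tiny two-layer network: 3 inputs -> 4 tanh hidden units -> 1 linear output.
params = {
    "W1": rng.normal(scale=0.5, size=(3, 4)),
    "b1": np.zeros(4),
    "W2": rng.normal(scale=0.5, size=(4, 1)),
    "b2": np.zeros(1),
}

def forward(p, X):
    """Return hidden activations and network output."""
    h = np.tanh(X @ p["W1"] + p["b1"])
    out = h @ p["W2"] + p["b2"]
    return h, out

def loss(p, X, y):
    """Half the mean squared error over the examples."""
    _, out = forward(p, X)
    return 0.5 * np.mean((out - y) ** 2)

def backprop(p, X, y):
    """Gradients of the loss with respect to every parameter (chain rule)."""
    h, out = forward(p, X)
    d_out = (out - y) / X.shape[0]           # dL/d(output)
    d_h = d_out @ p["W2"].T                  # propagate back to hidden layer
    d_pre = d_h * (1.0 - h ** 2)             # through tanh: tanh' = 1 - tanh^2
    return {
        "W1": X.T @ d_pre,
        "b1": d_pre.sum(axis=0),
        "W2": h.T @ d_out,
        "b2": d_out.sum(axis=0),
    }

# Sanity check: compare one analytic gradient with a central finite difference.
grads = backprop(params, X, y)
eps = 1e-6
params["W1"][0, 0] += eps
loss_plus = loss(params, X, y)
params["W1"][0, 0] -= 2 * eps
loss_minus = loss(params, X, y)
params["W1"][0, 0] += eps                    # restore the original value
numeric = (loss_plus - loss_minus) / (2 * eps)
analytic = grads["W1"][0, 0]

# Plain gradient descent: step each parameter against its gradient.
learning_rate = 0.1
initial_loss = loss(params, X, y)
for _ in range(200):
    grads = backprop(params, X, y)
    for name in params:
        params[name] -= learning_rate * grads[name]
final_loss = loss(params, X, y)

print(f"gradient check: analytic={analytic:.6f}, numeric={numeric:.6f}")
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Real deep learning models differ mainly in scale: libraries such as TensorFlow derive the backward pass automatically and run it on GPUs, which is what makes networks with millions of parameters practical to train.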
Eugene Charniak’s ‘Introduction to Deep Learning’ is a small volume with seven chapters that briefly cover several important topics in the theory and implementation of artificial neural networks. These include simple artificial neural networks, convolutional networks, recurrent networks, reinforcement learning (RL), and unsupervised learning methods such as autoencoders and generative adversarial networks (GANs).
Charniak is a computer scientist specializing in computational linguistics, but he writes in a relatively casual way that is accessible to readers from other backgrounds. The text is intended as a companion to an introductory course he teaches, and without his lectures, some topics are treated less comprehensively than in other available texts, as I outline below. Readers should be familiar with (but need not be expert in) Python, linear algebra, multivariate calculus, probability theory, and statistics. The book gives several Python programming examples using the NumPy and, most importantly, TensorFlow libraries (https://www.tensorflow.org/). Each chapter spends about 20 pages exploring a particular topic, covering the mathematical basis, giving some code examples, and providing follow-up reading suggestions and end-of-chapter exercises. For instance, the first chapter discusses the general theory behind artificial neural networks, their grounding in linear algebra, and how such a network can be optimized using the backpropagation algorithm. The following chapter introduces TensorFlow and the basics of setting up a computational architecture (inputs, network layers, loss function, optimization function, etc.) that can then be run on a data set. Relatively quickly, you’ll be up and running examples of single-layer, double-layer, and convolutional networks.
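As a rough illustration of the four ingredients named above (inputs, a network layer, a loss function, and an optimization step), here is a minimal single-layer softmax classifier. This is a sketch under stated assumptions rather than the book’s code: it uses plain NumPy instead of TensorFlow, and a small synthetic data set of three Gaussian clusters stands in for MNIST. The cluster layout, learning rate, and number of steps are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1. Inputs: a synthetic stand-in for a labelled data set (3 classes, 4 features).
n_per_class, n_features, n_classes = 50, 4, 3
centers = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize each feature
labels = np.repeat(np.arange(n_classes), n_per_class)
Y = np.eye(n_classes)[labels]                # one-hot targets

# 2. Network layer: one linear map followed by softmax.
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)     # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_proba(X):
    return softmax(X @ W + b)

# 3. Loss function: mean cross-entropy between predictions and one-hot targets.
def cross_entropy(probs, Y):
    return -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))

# 4. Optimization: plain gradient descent on W and b.
learning_rate = 0.5
initial_loss = cross_entropy(predict_proba(X), Y)
for step in range(200):
    probs = predict_proba(X)
    grad_logits = (probs - Y) / X.shape[0]   # gradient of the loss w.r.t. logits
    W -= learning_rate * (X.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)
final_loss = cross_entropy(predict_proba(X), Y)
accuracy = np.mean(predict_proba(X).argmax(axis=1) == labels)

print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}, training accuracy: {accuracy:.2f}")
```

In a framework such as TensorFlow the same four pieces are declared rather than hand-coded, and the gradient in step 4 is computed automatically, but the overall structure of the program is the same.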
Later chapters cover more complex topics, and the clarity of Charniak’s explanations varies. He is especially good at describing in plain language how his code examples work and at providing useful distillations of reinforcement learning, autoencoders, and GANs, but his writing can be less helpful in other areas. In particular, the chapters on recurrent networks and sequence-to-sequence learning involve many references to data sets and results that the reader doesn’t have immediate access to (unlike the early chapters, which use more explicit code and the MNIST handwritten digit data set). This can make the material hard to follow unless the reader is sufficiently advanced to implement the networks and find the correct data sets online. Generally, Charniak makes a good effort to be clear on each topic presented and on how the code examples work, but without further explanation some elements may be cryptic. Therefore, I would highly recommend that readers consult the TensorFlow online documentation and tutorials. Similar attention to online documentation and tutorials is needed in the reinforcement learning chapter to follow the examples Charniak provides using OpenAI Gym (https://gym.openai.com/), a library supporting environments commonly used for training RL systems. At the time of this review, the book does not feature an online component. I have therefore created a GitHub repository with direct implementations of nearly all the code examples, located here: https://github.com/VisionResearchBlog/Introduction-to-deep-learning-code-examples
Charniak’s book arrives at a time when there are many avenues for learning or teaching about deep learning. One of the most striking aspects of the current deep learning revolution is the number of open source online tutorials and codebases to learn from. Additionally, there are several books (online or print, several free and all under $80) that cover similar topics to Charniak’s book (notes 1–5). Despite such stiff competition, Charniak’s book is affordable and approachable for beginners; it has useful code examples and is a quick read. He gives many references for extra reading if the reader wants a more thorough understanding. The book would work best, however, as a companion to a course in which an instructor could provide more in-depth treatment of the mathematics and more code instruction, supplemented with primary texts and readings like the sources above.
Given the wide variety of resources for learning about deep learning, I would highly encourage Perception readers to explore all available options to find what suits their needs: there are texts for beginners through advanced readers, several with large codebases and tutorials to explore.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
