Some Pitfalls of Neural Networks

aputunn
4 min read · Jun 23, 2024


The MIT Deep Learning community has significantly influenced our understanding of Neural Networks. For a deeper dive, see MIT’s original resources [1][4].

Deep Learning has certainly revolutionized many fields. In autonomous vehicles, it helps cars perceive and navigate. In healthcare, it aids in diagnosing diseases and personalizing treatments. Reinforcement learning allows AI to excel at decision-making in robotics and games, learning complex tasks. Generative modeling can produce realistic images, music, and text. The list of applications goes on, including natural language processing and security. Although we are far from experts in Deep Neural Nets, we have gained some understanding of how these algorithms drive advancements.

Put simply, a Neural Network transforms input data (signals, images, sensor measurements) into decisions (predictions, classifications, actions in reinforcement learning). Run the other way around, it can also generate new data from desired outcomes, as in generative modeling. All the algorithm is really trying to do is estimate a function that maps some inputs to some outputs, building up a representation of that mapping [8] from the available data in order to make decisions on new inputs.
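To make that concrete, here is a minimal sketch of a network as nothing more than a parameterized function from inputs to outputs. All weights, sizes, and inputs below are made up for illustration; training would adjust the parameters to fit data.

```python
import numpy as np

# Illustrative only: a two-layer network is just a parameterized function f(x) -> y.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 3)), np.zeros(16)   # hidden layer: 3 inputs -> 16 units
W2, b2 = rng.normal(size=(2, 16)), np.zeros(2)    # output layer: 16 units -> 2 outputs

def f(x):
    """Map an input vector (e.g., sensor readings) to an output (e.g., class scores)."""
    h = np.maximum(0.0, W1 @ x + b1)  # ReLU nonlinearity
    return W2 @ h + b2                # raw decision scores

x = rng.normal(size=3)   # a stand-in for real input data
print(f(x))              # training would tune W1, b1, W2, b2 so f fits the data
```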

Back in 1989, the Universal Approximation Theorem [7] established that a feedforward network with a single hidden layer and enough neurons can approximate any continuous function on a bounded domain to arbitrary accuracy. This result highlighted neural networks’ theoretical potential to solve diverse problems by learning from data [2][4]. However, it says nothing about the practical challenges: choosing the network architecture, finding good weights, or ensuring generalization to new tasks [2][5]. This gap may partly explain why the limitations of neural networks are overlooked and their capabilities overestimated in real-world problems.
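As a hedged illustration of the theorem, the PyTorch sketch below fits a one-hidden-layer network to sin(x) on a bounded interval. The width, learning rate, and step count are arbitrary choices, and a low final error says nothing about the practical challenges just mentioned.

```python
import torch
from torch import nn

# A single hidden layer with "enough" neurons can fit a smooth target
# such as sin(x) on a bounded interval. Hyperparameters are arbitrary.
torch.manual_seed(0)
x = torch.linspace(-3.14, 3.14, 256).unsqueeze(1)
y = torch.sin(x)

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()

# A small error here shows approximation, not trainability in general,
# nor generalization outside the interval we trained on.
print(f"final MSE: {loss.item():.5f}")
```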

One harmless, straightforward example: suppose a Convolutional Neural Network (CNN) attempts to colorize a black-and-white image of a dog but ends up tinting part of the dog’s ear green and its chin pink, as if a tongue were sticking out (Figure 1). This can occur because the training data likely included many images of dogs with their tongues out or standing in grass, causing the CNN to misread those features. Even such a silly example shows how heavily deep learning models depend on their training data, which can lead to issues like algorithmic bias and potential failures in critical applications.

Figure 1. Source: https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini

Another example comes from the safety-critical setting of autonomous driving. Cars on autopilot sometimes crash or perform nonsensical maneuvers, occasionally with fatal consequences. In one case, a Tesla crashed into a construction pylon on the road (Figure 2). When the training data used for the network controlling the vehicle was examined, it turned out that Google’s street view of that road segment did not include construction barriers and pylons. Gaps and issues in training data cascade into exactly these downstream consequences, a prominent failure mode of neural networks: they arise when a network encounters situations it has not been carefully trained on, where it should report high uncertainty but instead handles the input poorly.

Figure 2. Source: https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini

There are numerous other failure modes, such as facial recognition systems misidentifying individuals and chatbots generating inappropriate responses, and the list of limitations is far from exhaustive. Here we collect some of the key, currently perceived limitations of Neural Networks:

  • Very data hungry (often millions of examples)
  • Computationally intensive to train and deploy
  • Easily fooled by adversarial examples (see the FGSM sketch below this list)
  • Can be subject to algorithmic bias
  • Poor at representing uncertainty (how do we know what the model knows? see the Monte Carlo dropout sketch below this list)
  • Uninterpretable black boxes, difficult to trust
  • Often requires expert knowledge to design and fine-tune architectures
  • Difficult to encode structure and prior knowledge during learning
  • Struggles with extrapolation (generalizing beyond the training data)
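
To make the adversarial point concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM) against a toy, untrained classifier. The model, input, label, and epsilon are all placeholders; on a real trained network, such a perturbation is often imperceptible to humans yet flips the prediction.

```python
import torch
from torch import nn

# FGSM sketch: nudge the input in the direction that increases the loss.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

x = torch.randn(1, 4, requires_grad=True)  # stand-in for a real input
label = torch.tensor([1])                  # its true class

loss = nn.functional.cross_entropy(model(x), label)
loss.backward()

eps = 0.1                         # perturbation budget (assumed)
x_adv = x + eps * x.grad.sign()   # tiny step that maximizes the loss

print("clean prediction:", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```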
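And for the uncertainty point, Monte Carlo dropout is one common approximate way to ask what the model knows: keep dropout active at inference time and measure the spread across stochastic forward passes. The architecture and sample count below are illustrative.

```python
import torch
from torch import nn

# Monte Carlo dropout sketch: prediction spread as a rough uncertainty signal.
torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(4, 32), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(32, 3)
)

x = torch.randn(1, 4)  # stand-in input
model.train()          # keep dropout active at inference time

with torch.no_grad():
    probs = torch.stack([model(x).softmax(dim=1) for _ in range(100)])

mean, std = probs.mean(dim=0), probs.std(dim=0)
print("mean class probabilities:", mean)
print("per-class std (higher = less certain):", std)
```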

These are some of the open issues in Deep Learning research today, and the references below provide further information about the current state of the field. See Dr. Moitra’s course [1] for a deeper treatment.

References

[1] Moitra, Ankur. “18.408 Theoretical Foundations for Deep Learning, Spring 2021.” People.csail.mit.edu, Feb. 2021, people.csail.mit.edu/moitra/408c.html. Accessed 23 June 2024.

[2] Thompson, Neil, et al. “The Computational Limits of Deep Learning.” 2020.

[3] Talaei Khoei, Tala, et al. “Deep Learning: Systematic Review, Models, Challenges, and Research Directions.” Neural Computing and Applications, vol. 35, 7 Sept. 2023, https://doi.org/10.1007/s00521-023-08957-4.

[4] MIT Deep Learning 6.S191. introtodeeplearning.com/.

[5] Raissi, Maziar. “Open Problems in Applied Deep Learning.” 2023.

[6] Nielsen, Michael A. “Neural Networks and Deep Learning.” Neuralnetworksanddeeplearning.com, Determination Press, 2019, neuralnetworksanddeeplearning.com/chap4.html.

[7] Zhou, Ding-Xuan. “Universality of Deep Convolutional Neural Networks.” Applied and Computational Harmonic Analysis, vol. 48, no. 2, June 2019, https://doi.org/10.1016/j.acha.2019.06.004.

[8] Schäfer, Anton Maximilian, and Hans Georg Zimmermann. “Recurrent Neural Networks Are Universal Approximators.” Artificial Neural Networks — ICANN 2006, 2006, pp. 632–640, https://doi.org/10.1007/11840817_66.
