Machine Learning  


Evolution of Algorithms / Challenges and Solutions

 

When I started studying neural networks/Deep Learning, most things seemed relatively clear, and a simple fully connected feedforward network seemed to work for every problem. As I learned more, I realized it is not as simple as I expected: I started seeing many different algorithmic tweaks (e.g., the choice of activation (transfer) function, the choice of optimization algorithm) and many different network structures (e.g., fully connected networks, CNNs, RNNs). Then this question came to my mind: why do we need to go through so many tweaks and selections? It turns out that there is no single 'fits all' algorithm. An algorithm that works well in one situation may not work well in another. To make neural networks work better in more diverse situations, many different tricks have been invented, and as a result of such inventions over a long period of time, we now see neural networks with so many options to choose from.

 

Following is a brief summary of neural network models.

Perceptrons (1958): Single-layer neural networks for binary classification tasks.

Backpropagation (1974-1986): Training algorithm for multi-layer neural networks.

Convolutional Neural Networks (CNNs) (1989): Early LeNet by Yann LeCun for image recognition tasks (LeNet-5 followed in 1998).

Long Short-Term Memory (LSTM) (1997): RNN variant for improved handling of long-term dependencies.

Deep Belief Networks (DBNs) (2006): Unsupervised pre-training with restricted Boltzmann machines.

ImageNet Challenge (2012): AlexNet revolutionized computer vision with deep CNNs.

Generative Adversarial Networks (GANs) (2014): Unsupervised learning with two competing neural networks.

Neural Machine Translation (NMT) (2014): Sequence-to-sequence models for improved machine translation.

Attention Mechanism (2014): Improved handling of long sequences in sequence-to-sequence models.

Transformer Models (2017): Attention-based models for large-scale pre-training and fine-tuning.

BERT (2018): Pre-trained model that set new benchmarks in NLP tasks.

 

 

Following is a list of important tricks that were invented to handle some of the early problems of neural networks. A minimal code sketch for each technique follows the list.

Vanishing gradient: Replace the classic sigmoid activation with ReLU (Ref [1])

Slow convergence: Replace classic GD (Gradient Descent) with SGD (Stochastic Gradient Descent) (Ref [1])

Severe fluctuation in training due to SGD: Replace SGD with mini-batch SGD (Ref [1])

Falling into local minima: Introduce adaptive learning rate algorithms (e.g., Adagrad, RMSProp, Momentum, Adam) (Ref [1])

Overfitting: Introduce early stopping, regularization, dropout (Ref [1])

Overly large fully connected layers in a CNN: Introduce a pooling layer before the fully connected layers (Ref [1])

No memory: Introduce RNN (Recurrent Neural Network) (Ref [1])
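To see why ReLU helps with the vanishing gradient, here is a minimal NumPy sketch (my own illustration, not code from Ref [1]). It multiplies the per-layer activation gradients of a deep stack: with sigmoid the product shrinks toward zero, while with ReLU it stays at 1 for positive inputs. The layer count and input value are arbitrary.

```python
# Minimal sketch (illustration only): gradient surviving a stack of sigmoid
# layers versus ReLU layers. Layer count and pre-activation values are arbitrary.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # at most 0.25, so deep products shrink fast

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs

x = np.full(20, 2.0)              # pre-activations of 20 stacked layers
print("sigmoid chain grad:", np.prod(sigmoid_grad(x)))  # ~1e-20, vanished
print("relu chain grad   :", np.prod(relu_grad(x)))     # 1.0, preserved
```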

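For the slow-convergence and fluctuation issues, the following sketch (again my own illustration, not from Ref [1]) fits the same least-squares model with full-batch GD, per-sample SGD, and mini-batch SGD; only the batch_size argument changes. The data, learning rate, and epoch count are arbitrary.

```python
# Minimal sketch (illustration only): full-batch GD, SGD and mini-batch SGD
# on the same toy least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

def grad(w, Xb, yb):
    # gradient of the mean squared error on the batch (Xb, yb)
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

def train(batch_size, epochs=200, lr=0.02):
    w = np.zeros(3)
    for _ in range(epochs):
        idx = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            b = idx[start:start + batch_size]
            w -= lr * grad(w, X[b], y[b])
    return w

print("full-batch GD :", train(batch_size=len(y)))  # one smooth update per epoch
print("SGD           :", train(batch_size=1))       # noisy but frequent updates
print("mini-batch SGD:", train(batch_size=32))      # compromise between the two
```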
 
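For the local-minima issue, this sketch shows the Momentum and Adam update rules on a toy one-dimensional loss. The hyper-parameter values are the commonly quoted defaults; the loss function is just an illustration, not a setting taken from Ref [1].

```python
# Minimal sketch (illustration only): Momentum and Adam updates on the
# toy loss (w - 3)^2. Hyper-parameters are common default values.
import numpy as np

def loss_grad(w):
    return 2.0 * (w - 3.0)              # gradient of (w - 3)^2

def momentum(steps=100, lr=0.1, beta=0.9):
    w, v = 0.0, 0.0
    for _ in range(steps):
        v = beta * v + loss_grad(w)     # accumulated velocity smooths the updates
        w -= lr * v
    return w

def adam(steps=100, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = loss_grad(w)
        m = b1 * m + (1 - b1) * g       # first moment (running mean of gradients)
        v = b2 * v + (1 - b2) * g * g   # second moment (running scale of gradients)
        m_hat = m / (1 - b1 ** t)       # bias correction for the early steps
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w

print("momentum result:", momentum())   # approaches 3.0
print("adam result    :", adam())       # approaches 3.0
```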

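For overfitting, this sketch shows (inverted) dropout applied to a hidden-layer activation: a random subset of units is silenced during training and the survivors are rescaled, while inference uses the layer unchanged. The keep probability and sizes are arbitrary illustration values.

```python
# Minimal sketch (illustration only): inverted dropout on a hidden activation.
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, keep_prob=0.8, training=True):
    if not training:
        return h                              # no dropout at inference time
    mask = rng.random(h.shape) < keep_prob    # randomly silence ~20% of units
    return h * mask / keep_prob               # rescale so the expected value is unchanged

h = rng.normal(size=(4, 8))                   # a batch of hidden activations
print(dropout(h, training=True))              # different units zeroed on each call
print(dropout(h, training=False))             # identical to h
```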
 
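For the overly large fully connected layers in a CNN, this sketch shows 2x2 max pooling with stride 2, which shrinks a feature map by a factor of four before it is flattened into the fully connected layers. The input size is an arbitrary illustration value.

```python
# Minimal sketch (illustration only): 2x2 max pooling with stride 2 on one feature map.
import numpy as np

def max_pool2d(fmap, size=2, stride=2):
    h, w = fmap.shape
    out_h, out_w = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()          # keep only the strongest response in each window
    return out

fmap = np.arange(36, dtype=float).reshape(6, 6)   # a 6x6 feature map
print(max_pool2d(fmap))                           # 3x3 output, 4x fewer values
```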

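For the no-memory issue, this sketch unrolls a vanilla RNN cell over a short sequence: the hidden state produced at each step is fed back in at the next step, which is exactly the memory that a plain feedforward network lacks. All sizes and weights are arbitrary illustration values.

```python
# Minimal sketch (illustration only): forward pass of a vanilla RNN cell.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 3, 5, 4

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (the "memory" path)
b_h  = np.zeros(hidden_dim)

xs = rng.normal(size=(seq_len, input_dim))   # one input vector per time step
h = np.zeros(hidden_dim)                     # initial hidden state

for t, x in enumerate(xs):
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # new state depends on the input AND the previous state
    print(f"t={t} hidden state:", np.round(h, 3))
```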
 

Reference

 

[1] Deep Learning for Wireless Physical Layer: Opportunities and Challenges