Machine Learning  


Evolution of Algorithms / Challenges and Solutions

 

When I started studying neural networks/Deep Learning, most things seemed relatively clear, and a simple fully connected feedforward network seemed to work for every problem. As I learned more, I realized it is not as simple as I expected: I started seeing many different algorithmic tweaks (e.g., the choice of activation (transfer) function, the choice of optimization algorithm) and many different network structures (e.g., fully connected networks, CNNs, RNNs). Then this question came to my mind: why do we need to go through so many tweaks and selections? It turns out that there is no single 'fits all' algorithm. An algorithm that works well in one situation may not work well in another. To make neural networks work better in more diverse situations, many different tricks have been invented, and as a result of such inventions over a long period of time, we now see neural networks with so many options to choose from.

 

Following is a brief summary of neural network models.

Perceptrons (1958): Single-layer neural networks for binary classification tasks.

Backpropagation (1974-1986): Training algorithm for multi-layer neural networks.

Convolutional Neural Networks (CNNs) (1989): Early LeNet by Yann LeCun for image recognition tasks (LeNet-5 followed in 1998).

Long Short-Term Memory (LSTM) (1997): RNN variant for improved handling of long-term dependencies.

Deep Belief Networks (DBNs) (2006): Unsupervised pre-training with restricted Boltzmann machines.

ImageNet Challenge (2012): AlexNet revolutionized computer vision with deep CNNs.

Generative Adversarial Networks (GANs) (2014): Unsupervised learning with two competing neural networks.

Neural Machine Translation (NMT) (2014): Sequence-to-sequence models for improved machine translation.

Attention Mechanism (2014): Improved handling of long sequences in sequence-to-sequence models.

Transformer Models (2017): Attention-based models for large-scale pre-training and fine-tuning.

BERT (2018): Pre-trained model that set new benchmarks in NLP tasks.

 

 

Following is a list of important tricks that were invented to handle some of the early problems of neural networks. A minimal code sketch for each technique follows the list.

Vanishing gradient: Replace the classic sigmoid activation with ReLU (Ref [1])

Slow convergence: Replace classic GD (Gradient Descent) with SGD (Stochastic Gradient Descent) (Ref [1])

Severe fluctuation in training due to SGD: Replace SGD with mini-batch SGD (Ref [1])

Falling into local minima: Introduce adaptive learning rate algorithms (e.g., Adagrad, RMSProp, Momentum, Adam) (Ref [1])

Overfitting: Introduce early stopping, regularization, dropout (Ref [1])

Overly large fully connected layers in a CNN: Introduce a pooling layer before the fully connected layers (Ref [1])

No memory: Introduce RNN (Recurrent Neural Network) (Ref [1])
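To see why ReLU helps with the vanishing gradient, here is a minimal NumPy sketch (my own illustration, not code from Ref [1]). It multiplies the per-layer activation gradients of a deep stack: with sigmoid the product shrinks toward zero, while with ReLU it stays at 1 for positive inputs. The layer count and input value are arbitrary.

```python
# Minimal sketch (illustration only): gradient surviving a stack of sigmoid
# layers versus ReLU layers. Layer count and pre-activation values are arbitrary.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # at most 0.25, so deep products shrink fast

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for positive inputs

x = np.full(20, 2.0)              # pre-activations of 20 stacked layers
print("sigmoid chain grad:", np.prod(sigmoid_grad(x)))  # ~1e-20, vanished
print("relu chain grad   :", np.prod(relu_grad(x)))     # 1.0, preserved
```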

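For the slow-convergence and fluctuation issues, the following sketch (again my own illustration, not from Ref [1]) fits the same least-squares model with full-batch GD, per-sample SGD, and mini-batch SGD; only the batch_size argument changes. The data, learning rate, and epoch count are arbitrary.

```python
# Minimal sketch (illustration only): full-batch GD, SGD and mini-batch SGD
# on the same toy least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

def grad(w, Xb, yb):
    # gradient of the mean squared error on the batch (Xb, yb)
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

def train(batch_size, epochs=200, lr=0.02):
    w = np.zeros(3)
    for _ in range(epochs):
        idx = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            b = idx[start:start + batch_size]
            w -= lr * grad(w, X[b], y[b])
    return w

print("full-batch GD :", train(batch_size=len(y)))  # one smooth update per epoch
print("SGD           :", train(batch_size=1))       # noisy but frequent updates
print("mini-batch SGD:", train(batch_size=32))      # compromise between the two
```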
 
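For the local-minima issue, this sketch shows the Momentum and Adam update rules on a toy one-dimensional loss. The hyper-parameter values are the commonly quoted defaults; the loss function is just an illustration, not a setting taken from Ref [1].

```python
# Minimal sketch (illustration only): Momentum and Adam updates on the
# toy loss (w - 3)^2. Hyper-parameters are common default values.
import numpy as np

def loss_grad(w):
    return 2.0 * (w - 3.0)              # gradient of (w - 3)^2

def momentum(steps=100, lr=0.1, beta=0.9):
    w, v = 0.0, 0.0
    for _ in range(steps):
        v = beta * v + loss_grad(w)     # accumulated velocity smooths the updates
        w -= lr * v
    return w

def adam(steps=100, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = loss_grad(w)
        m = b1 * m + (1 - b1) * g       # first moment (running mean of gradients)
        v = b2 * v + (1 - b2) * g * g   # second moment (running scale of gradients)
        m_hat = m / (1 - b1 ** t)       # bias correction for the early steps
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w

print("momentum result:", momentum())   # approaches 3.0
print("adam result    :", adam())       # approaches 3.0
```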

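For overfitting, this sketch shows (inverted) dropout applied to a hidden-layer activation: a random subset of units is silenced during training and the survivors are rescaled, while inference uses the layer unchanged. The keep probability and sizes are arbitrary illustration values.

```python
# Minimal sketch (illustration only): inverted dropout on a hidden activation.
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, keep_prob=0.8, training=True):
    if not training:
        return h                              # no dropout at inference time
    mask = rng.random(h.shape) < keep_prob    # randomly silence ~20% of units
    return h * mask / keep_prob               # rescale so the expected value is unchanged

h = rng.normal(size=(4, 8))                   # a batch of hidden activations
print(dropout(h, training=True))              # different units zeroed on each call
print(dropout(h, training=False))             # identical to h
```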
 
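For the overly large fully connected layers in a CNN, this sketch shows 2x2 max pooling with stride 2, which shrinks a feature map by a factor of four before it is flattened into the fully connected layers. The input size is an arbitrary illustration value.

```python
# Minimal sketch (illustration only): 2x2 max pooling with stride 2 on one feature map.
import numpy as np

def max_pool2d(fmap, size=2, stride=2):
    h, w = fmap.shape
    out_h, out_w = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()          # keep only the strongest response in each window
    return out

fmap = np.arange(36, dtype=float).reshape(6, 6)   # a 6x6 feature map
print(max_pool2d(fmap))                           # 3x3 output, 4x fewer values
```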

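For the no-memory issue, this sketch unrolls a vanilla RNN cell over a short sequence: the hidden state produced at each step is fed back in at the next step, which is exactly the memory that a plain feedforward network lacks. All sizes and weights are arbitrary illustration values.

```python
# Minimal sketch (illustration only): forward pass of a vanilla RNN cell.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 3, 5, 4

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (the "memory" path)
b_h  = np.zeros(hidden_dim)

xs = rng.normal(size=(seq_len, input_dim))   # one input vector per time step
h = np.zeros(hidden_dim)                     # initial hidden state

for t, x in enumerate(xs):
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # new state depends on the input AND the previous state
    print(f"t={t} hidden state:", np.round(h, 3))
```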
 

Reference

 

[1] Deep Learning for Wireless Physical Layer: Opportunities and Challenges