PyTorch - Regression
We can use neural networks for regression by training them on a dataset of input-output pairs. The network learns to map the inputs to the corresponding outputs, which allows it to make predictions on new input data. The network architecture, activation functions, and loss function used for training all affect how well the network performs regression. Popular neural network architectures for regression include feedforward neural networks and convolutional neural networks.
Linear Regression : Single Variable
I used a very simple model, shown below. The code implements a simple linear regression model using PyTorch. The model is defined as a subclass of nn.Module and consists of a single linear layer with one input and one output. The code defines the input and output data as tensors, initializes the model, defines the loss function and optimizer, trains the model for 100 epochs, and then plots the loss over time and the predicted output for some test input data. Optimization uses optim.SGD with a learning rate of 0.01; because all three training samples are passed at every step, each update is effectively a full-batch gradient descent step. The matplotlib library is used to visualize the training process and the predicted output.
import torch
from torch import nn, optim
import matplotlib.pyplot as plt

# Linear model with one input feature and one output: y = w * x + b
class LinearRegression_i1_o1(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# Training data sampled from y = 2x + 1
x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[3.0], [5.0], [7.0]])

net = LinearRegression_i1_o1()
criterion = nn.MSELoss(reduction='sum')  # sum of squared errors
optimizer = optim.SGD(net.parameters(), lr=0.01)

losses = []
for epoch in range(100):
    y = net(x_data)               # forward pass
    loss = criterion(y, y_data)
    optimizer.zero_grad()         # clear gradients from the previous step
    loss.backward()               # backpropagate
    optimizer.step()              # update weight and bias
    print('epoch = ', epoch, ',', 'loss = ', loss.item())
    losses.append(loss.item())

# Predict on test inputs (0, 4 and 5 lie outside the training range)
x_test = torch.tensor([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = net(x_test)
x = x_test.detach().numpy().flatten()
y = y.detach().numpy().flatten()

# Left panel: training loss per epoch
plt.subplot(1, 2, 1)
plt.plot(losses, 'r-')
plt.title('Epoch vs Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')

# Right panel: model predictions on the test inputs
plt.subplot(1, 2, 2)
plt.plot(x, y, 'bo')
plt.xlim([0, 6])
plt.ylim([0, 12])
plt.xticks(ticks=range(0, 6))
plt.yticks(ticks=range(0, 12))
plt.title('test x vs. estimated y')
plt.xlabel('test x')
plt.ylabel('estimated y')
plt.grid()
plt.show()
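The listing creates the loss with nn.MSELoss(reduction='sum'), so the printed loss is a sum of squared errors rather than an average. The short sketch below compares the two reductions on made-up predictions and targets (the values are illustrative assumptions, not taken from the training run).

```python
import torch
from torch import nn

# Toy predictions and targets (illustrative values only)
y_pred = torch.tensor([[2.5], [5.0], [7.5]])
y_true = torch.tensor([[3.0], [5.0], [7.0]])

# reduction='sum' adds up the squared errors over all samples
loss_sum = nn.MSELoss(reduction='sum')(y_pred, y_true)

# reduction='mean' (the default) divides that sum by the number of elements
loss_mean = nn.MSELoss(reduction='mean')(y_pred, y_true)

# The same sum computed by hand
manual_sum = ((y_pred - y_true) ** 2).sum()

print('sum  =', loss_sum.item())
print('mean =', loss_mean.item())
print('hand =', manual_sum.item())
```

With 'sum', the loss and its gradients grow with the number of training samples, so a learning rate that works for three samples may need to be lowered for a larger dataset.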
The output below lists the loss value at each epoch of training.
The loss starts high at the first epoch and decreases steadily. As the model is trained on the input data, its weight and bias move toward values that fit the output data, and the loss decreases.
At the final epoch the loss is about 0.077 (a sum of squared errors over the three training points), so the model fits the training data closely. A low training loss alone does not show that the model generalizes to unseen data; that would need a check on data held out from training.
epoch = 0 , loss = 109.12733459472656
epoch = 1 , loss = 48.755165100097656
epoch = 2 , loss = 21.876678466796875
.....
epoch = 96 , loss = 0.08057259768247604
epoch = 97 , loss = 0.07941444963216782
epoch = 98 , loss = 0.078273244202137
epoch = 99 , loss = 0.07714822888374329
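Since the training data follows y = 2x + 1, one way to check the fit is to read the learned weight and bias directly. The sketch below is a minimal variant of the training loop, not the listing itself: it uses nn.Linear directly instead of the wrapper class, fixes the random seed, and trains for 2000 epochs instead of 100 (all three are assumptions made for this illustration).

```python
import torch
from torch import nn, optim

torch.manual_seed(0)  # fixed seed, assumed here only for repeatability

# Same training data as the listing: y = 2x + 1
x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[3.0], [5.0], [7.0]])

model = nn.Linear(1, 1)
criterion = nn.MSELoss(reduction='sum')
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Train longer than the 100 epochs used in the listing
for epoch in range(2000):
    loss = criterion(model(x_data), y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The single linear layer holds exactly one weight and one bias
w = model.weight.item()
b = model.bias.item()
print(f'learned: y = {w:.3f} * x + {b:.3f}')
```

After enough epochs the weight should approach 2 and the bias should approach 1. After only 100 epochs the loss is still about 0.077, so the parameters have not fully reached those values yet.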

Regression Quadratic Function : Single Variable
In this example, I will try to fit simple non-linear data (e.g., quadratic data). The very simple model from the previous example cannot fit non-linear data like this, so I needed to revise and expand the model as below.
This code implements a neural network model with two hidden layers to perform regression on a quadratic function. The model is defined as a subclass of the nn.Module class, and it consists of a sequential container with three linear layers separated by sigmoid activation functions. The input and output data are defined as tensors, and the code trains the model using stochastic gradient descent (SGD) with a learning rate of 0.003 and the mean squared error (MSE) loss function.
The code trains the model for 2000 epochs and prints the loss value every 100 epochs. After training, the code evaluates the model on some test data and plots the loss over time and the predicted output for the test data.
With its hidden layers and non-linear activations, this model can represent a curved input-output relationship, which the single linear layer of the previous example cannot. The fitting problem is also harder, so the loss starts higher and takes more epochs to come down. The plot of the predicted output shows how well the model follows the quadratic trend of the training data.
import torch
from torch import nn, optim
import matplotlib.pyplot as plt
import numpy as np  # only needed for the commented-out test grid below

# Network with one input, two hidden layers (4 and 6 units), and one output
class Regression_Quad_i1_h2_o1(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 4),
            nn.Sigmoid(),
            nn.Linear(4, 6),
            nn.Sigmoid(),
            nn.Linear(6, 1)
        )

    def forward(self, x):
        return self.net(x)

# Training data: roughly quadratic (y = x^2 + 1 for x = 2, 3, 4)
x_data = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y_data = torch.tensor([[1.0], [5.0], [10.0], [17.0]])

net = Regression_Quad_i1_h2_o1()
criterion = nn.MSELoss(reduction='sum')  # sum of squared errors
optimizer = optim.SGD(net.parameters(), lr=0.003)

losses = []
for epoch in range(2000):
    y = net(x_data)               # forward pass
    loss = criterion(y, y_data)
    optimizer.zero_grad()         # clear gradients from the previous step
    loss.backward()               # backpropagate
    optimizer.step()              # update the parameters
    if (epoch % 100) == 0:
        print('epoch = ', epoch, ',', 'loss = ', loss.item())
    losses.append(loss.item())    # record every epoch for the loss plot

# Test inputs (0.0, 0.5 and 5.0 lie outside the training range)
x_test = torch.tensor([[0.0], [0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0], [5.0]])
# Alternative: an evenly spaced grid (float32 to match the model's parameters)
# x_test = torch.tensor(np.linspace(0, 5, 10), dtype=torch.float32)
# x_test = x_test.view(10, 1)
y = net(x_test)
x = x_test.detach().numpy().flatten()
y = y.detach().numpy().flatten()

# Left panel: training loss per epoch
plt.subplot(1, 2, 1)
plt.plot(losses, 'r-')
plt.title('Epoch vs Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')

# Right panel: model predictions on the test inputs
plt.subplot(1, 2, 2)
plt.plot(x, y, 'bo')
plt.xlim([0, 6])
plt.ylim([-5, 30])
plt.xticks(ticks=range(0, 6))
plt.yticks(ticks=range(-5, 30, 5))
plt.title('test x vs. estimated y')
plt.xlabel('test x')
plt.ylabel('estimated y')
plt.grid()
plt.show()
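The listing mentions np.linspace, in commented-out lines, as another way to build the test inputs. One detail to watch: np.linspace returns float64 values, while nn.Linear layers hold float32 parameters by default, so passing a float64 tensor to the model typically raises a dtype mismatch error. The sketch below shows two ways to build a float32 column of evenly spaced inputs; the grid of 10 points from 0 to 5 follows the commented-out lines.

```python
import numpy as np
import torch

# Option 1: build the grid directly in PyTorch (float32 by default)
x_test = torch.linspace(0.0, 5.0, 10).view(-1, 1)

# Option 2: go through NumPy and convert float64 to float32 explicitly
x_test_np = torch.tensor(np.linspace(0, 5, 10), dtype=torch.float32).view(-1, 1)

# Both are column tensors of shape (10, 1), ready to pass to the model
print(x_test.shape, x_test.dtype)
print(x_test_np.shape, x_test_np.dtype)
```

Using view(-1, 1) lets PyTorch infer the number of rows, so the same line keeps working if the number of grid points changes.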
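The loss values in the result below start high and decrease over time, indicating that the model is learning the input-output relationship. The decrease is not monotonic: the loss dips to about 1.00 at epoch 900, rises to about 1.58 at epoch 1100, and then falls again to about 0.57 at epoch 1900, the lowest value printed. Because the loss is still decreasing at the end, more epochs might lower it further. The plot of the predicted output gives a visual check of how closely the model follows the training points. The test inputs 0.0, 0.5 and 5.0 lie outside the training range of 1 to 4, so the predictions there are extrapolations.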
epoch = 0 , loss = 459.1344909667969
epoch = 100 , loss = 115.44194030761719
epoch = 200 , loss = 35.4473762512207
epoch = 300 , loss = 6.033346652984619
epoch = 400 , loss = 2.90378737449646
epoch = 500 , loss = 2.3812389373779297
epoch = 600 , loss = 2.0026025772094727
epoch = 700 , loss = 1.609250545501709
epoch = 800 , loss = 1.264915108680725
epoch = 900 , loss = 1.0008598566055298
epoch = 1000 , loss = 1.4073749780654907
epoch = 1100 , loss = 1.5786477327346802
epoch = 1200 , loss = 1.4552806615829468
epoch = 1300 , loss = 1.2912174463272095
epoch = 1400 , loss = 1.1246427297592163
epoch = 1500 , loss = 0.9764596819877625
epoch = 1600 , loss = 0.8541290163993835
epoch = 1700 , loss = 0.7479215264320374
epoch = 1800 , loss = 0.6534777879714966
epoch = 1900 , loss = 0.5714238882064819
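For comparison, it helps to know how well a plain straight line can fit the same four training points. The sketch below is not part of the original listing: it computes the closed-form least-squares line with torch.linalg.lstsq and reports its sum of squared errors, the same quantity that nn.MSELoss(reduction='sum') measures. The variable names are chosen for this illustration only.

```python
import torch

# Same training data as the quadratic listing
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = torch.tensor([[1.0], [5.0], [10.0], [17.0]])

# Design matrix: one column for x, one column of ones for the intercept
A = torch.cat([x, torch.ones_like(x)], dim=1)

# Closed-form least-squares solution for slope and intercept
solution = torch.linalg.lstsq(A, y).solution
slope = solution[0].item()
intercept = solution[1].item()

# Sum of squared errors of the best straight line
sse = ((A @ solution - y) ** 2).sum().item()
print(f'best line: y = {slope:.2f} * x + {intercept:.2f}')
print(f'sum of squared errors = {sse:.2f}')
```

The best straight line, roughly y = 5.3x - 5.0, leaves a sum of squared errors of about 2.30. In the log above the network's loss is below that level from epoch 600 onward and reaches about 0.57 by epoch 1900, which is only possible because the hidden layers let the model bend away from a straight line.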
