




PyTorch - nn.Module


In most cases, a neural network is implemented as a class. This page shows how to implement a neural network as a Python class. I will keep the structure as simple as possible to make it easy to understand. The network created on this page is basically the same as the one I created without a class in the nn.Linear page.



You can implement a network class from scratch, but in most cases we create a class that inherits from the PyTorch class nn.Module, because that lets us use many of the high-level APIs that PyTorch provides without implementing them ourselves. Every class created on this page inherits from nn.Module. If you are not familiar with the basic structure of a Python class or how inheritance works, check this basic Class tutorial.




Structure of nn.Module Class


The nn.Module class is not used as it is. To use it, you create a class based on nn.Module (i.e., a class that inherits from nn.Module). The general form of such a class is shown below. In the simplest form, you need to define (override) only two member functions: __init__() and forward(). Usually the network structure is defined in __init__() and the forward process in forward(). For details on how to define the network structure, my notes on nn.Linear and nn.Sequential would help. It is worth practicing until you are familiar enough with network definition and forward-flow definition that you can write these two components almost instantly for a given problem.

    class YourNNClass(nn.Module):

        def __init__(self):

            super().__init__()   # initialize nn.Module itself before defining any layers

            # define the structure of a Neural Net


        def forward(self,x):

            # define the flow of forward process


            return # output variable of forward()
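
One detail worth checking in any class of this form is that nn.Module's own __init__() runs before any layer is assigned to self. The sketch below is only an illustration under that assumption; the class names WithoutSuper and WithSuper are hypothetical and not part of the original notes.

```python
from torch import nn


class WithoutSuper(nn.Module):           # hypothetical name, for illustration only
    def __init__(self):
        # nn.Module.__init__() is NOT called here
        self.linear = nn.Linear(2, 1)    # assigning a layer now raises AttributeError


class WithSuper(nn.Module):              # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()               # sets up the module's internal registries
        self.linear = nn.Linear(2, 1)    # now registered as a submodule

    def forward(self, x):
        return self.linear(x)


try:
    WithoutSuper()
    failed = False
except AttributeError as err:
    failed = True
    print("WithoutSuper() failed:", err)

net = WithSuper()
n_params = len(list(net.parameters()))   # weight and bias of self.linear
print("number of parameter tensors:", n_params)
```

Because the layer is registered only after super().__init__() has run, net.parameters() can find its weight and bias, which is what an optimizer needs later on.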




Single Perceptron and Forward only


This section shows an example of the simplest neural net class that can be created from nn.Module. The minimum requirement for an nn.Module-based class is to override the __init__() and forward() methods. This class does exactly the same thing as this example without using nn.Module.

    import torch

    from torch import nn


In this class, I define a simple linear network with two inputs and one output and use Sigmoid() function as the activation function of the network.

    class Network(nn.Module):

        def __init__(self):

            super().__init__()               # nn.Module must be initialized before layers are assigned

            self.linear = nn.Linear(2,1)     # self.linear is an attribute of the instance. you can use any name

            self.activation = nn.Sigmoid()   # self.activation is an attribute of the instance. you can use any name


        def forward(self,x):                 # x in the forward() method is the input vector.

            o = self.linear(x)               # the input vector x is passed to the linear layer

            o = self.activation(o)           # the output of the linear layer is passed to the activation function


            return o                         # the output of the activation function is returned as the result


Now let's test if the network is working as intended.

    net = Network()                         # create an instance of the Network() that we defined

    print(net)                                 # print out the Network() class information

    ==> Network(

              (linear): Linear(in_features=2, out_features=1, bias=True)

              (activation): Sigmoid()

        )




    x = torch.tensor([[1.0,1.0]])        # define a test input vector

    print(x)                             # print the input vector

    ==> tensor([[1., 1.]])



    o = net.forward(x)                     # put the input vector into the forward() method of the class

    print('o = net.forward(x) :',o)       # print the output of the forward() method.

    ==> o = net.forward(x) : tensor([[0.3510]], grad_fn=<SigmoidBackward>)
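
To see what the forward pass computes, the output can be reproduced by hand as sigmoid(x W^T + b). The following is a minimal sketch, assuming a bare nn.Linear(2,1) layer with the same shape as in the Network class; the seed value is arbitrary and is set only to make the run repeatable.

```python
import torch
from torch import nn

torch.manual_seed(0)                      # arbitrary seed, only for repeatability

linear = nn.Linear(2, 1)                  # same layer shape as in the Network class
activation = nn.Sigmoid()

x = torch.tensor([[1.0, 1.0]])
o = activation(linear(x))                 # what forward() computes

# Same computation written out by hand: sigmoid(x @ W^T + b)
z = x @ linear.weight.T + linear.bias
o_manual = 1.0 / (1.0 + torch.exp(-z))

matches = torch.allclose(o, o_manual)
print(o, o_manual, matches)
```

The actual numbers depend on the randomly initialized weight and bias, which is why the output differs from run to run unless a seed is fixed.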




Single Perceptron, Forward, Loss and Backpropagation


In the previous section we defined an nn.Module class and followed the forward path. In this section, I will add the second step, namely backward() and backpropagation.


    import torch

    from torch import nn

The class definition is the same as in the previous section, so no further explanation.

    class Network(nn.Module):

        def __init__(self):

            super().__init__()               # nn.Module must be initialized before layers are assigned

            self.linear = nn.Linear(2,1)

            self.activation = nn.Sigmoid()


        def forward(self,x):

            o = self.linear(x)

            o = self.activation(o)


            return o


Creating an instance of Network Class. Same as shown in previous section. No further explanation.

    net = Network()

    print(net)

    ==> Network(

              (linear): Linear(in_features=2, out_features=1, bias=True)

              (activation): Sigmoid()

        )



Creating an input vector. Same as shown in previous section. No further explanation.


    x = torch.tensor([[1.0,1.0]])

    print(x)

    ==> tensor([[1., 1.]])


Following through the forward path. Same as shown in previous section. No further explanation.

    o = net.forward(x)

    print('o = net.forward(x) :',o)

    ==> o = net.forward(x) : tensor([[0.6695]], grad_fn=<SigmoidBackward>)


Now let's define a loss function and optimizer.   

    lossFunc = torch.nn.MSELoss()

    optimizer = torch.optim.SGD(net.parameters(), lr = 0.01)


Define the desired output for the test input vector created above.

    t = torch.tensor([[1.0]])

    print('t = ',t)

    ==> t =  tensor([[1.]])


Compute the loss from the output value 'o' and the desired output value 't' as shown below.

    loss = lossFunc(o,t)

    print('loss = ',loss)

    ==> loss =  tensor(0.1092, grad_fn=<MseLossBackward>)
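
For a single output value, MSELoss reduces to the squared difference between output and target. As a quick sanity check, the sketch below plugs in the output value 0.6695 printed above as a plain constant (so there is no grad_fn here) and compares nn.MSELoss with the formula written by hand.

```python
import torch
from torch import nn

o = torch.tensor([[0.6695]])      # output value printed earlier on this page
t = torch.tensor([[1.0]])         # desired output

loss = nn.MSELoss()(o, t)
loss_manual = ((o - t) ** 2).mean()

print(loss, loss_manual)          # both are about 0.1092
```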


Initialize the optimizer by calling optimizer.zero_grad(). This resets the gradient values to zero so that any previously accumulated value does not affect the newly calculated one. Usually you need to do this for every iteration of the training process, before calling backward().
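
The reset matters because PyTorch adds new gradients to whatever is already stored in .grad. The sketch below illustrates this with optimizer.zero_grad(), the standard PyTorch call for the reset. It assumes a bare nn.Linear(2,1) layer followed by torch.sigmoid in place of the Network class, and an arbitrary seed, purely to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)                      # arbitrary seed, only for repeatability
linear = nn.Linear(2, 1)
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)
loss_func = nn.MSELoss()
x = torch.tensor([[1.0, 1.0]])
t = torch.tensor([[1.0]])

loss_func(torch.sigmoid(linear(x)), t).backward()
grad_once = linear.weight.grad.clone()

# Second backward() without zero_grad(): the new gradient is ADDED to the old one
loss_func(torch.sigmoid(linear(x)), t).backward()
grad_twice = linear.weight.grad.clone()
print(grad_once, grad_twice)              # grad_twice is twice grad_once

optimizer.zero_grad()
# Depending on the PyTorch version, zero_grad() either sets .grad to None
# or fills it with zeros; both count as "cleared" here.
g = linear.weight.grad
cleared = g is None or bool(torch.all(g == 0))
print("cleared:", cleared)
```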


Now calculate the gradient for all the network parameters (i.e., the weight and bias of each neuron in the network). This is done just by calling the backward() function on the loss, i.e. loss.backward(). backward() only calculates the gradient values. It does not update the weight/bias values themselves yet.
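
The following sketch checks both statements: that backward() fills in .grad, and that it leaves the weights untouched. It again assumes a bare nn.Linear(2,1) layer with torch.sigmoid and an arbitrary seed instead of the Network class. The hand-written gradient uses the chain rule for this particular setup, dL/dw = 2(o - t) * o(1 - o) * x.

```python
import torch
from torch import nn

torch.manual_seed(0)                      # arbitrary seed, only for repeatability
linear = nn.Linear(2, 1)
x = torch.tensor([[1.0, 1.0]])
t = torch.tensor([[1.0]])

grad_before = linear.weight.grad          # None: nothing has been computed yet
w_before = linear.weight.detach().clone()

o = torch.sigmoid(linear(x))
loss = nn.MSELoss()(o, t)
loss.backward()                           # computes gradients only

weights_unchanged = torch.equal(w_before, linear.weight.detach())

# Chain rule by hand: dL/do = 2(o - t), do/dz = o(1 - o), dz/dw = x
grad_manual = (2 * (o - t) * o * (1 - o)).detach() * x
grad_matches = torch.allclose(linear.weight.grad, grad_manual)

print(linear.weight.grad, grad_manual, weights_unchanged)
```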


Then, let's update the weight/bias using the gradient values calculated in the previous step. This is done by calling the step() function of the optimizer, i.e. optimizer.step().
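
For plain SGD (no momentum, no weight decay, as configured on this page), step() amounts to w_new = w_old - lr * grad. The sketch below checks that rule numerically; the bare nn.Linear(2,1) layer and the seed are assumptions made only to keep the example self-contained.

```python
import torch
from torch import nn

torch.manual_seed(0)                      # arbitrary seed, only for repeatability
lr = 0.01
linear = nn.Linear(2, 1)
optimizer = torch.optim.SGD(linear.parameters(), lr=lr)
x = torch.tensor([[1.0, 1.0]])
t = torch.tensor([[1.0]])

optimizer.zero_grad()
loss = nn.MSELoss()(torch.sigmoid(linear(x)), t)
loss.backward()

w_old = linear.weight.detach().clone()
grad = linear.weight.grad.clone()

optimizer.step()                          # this is where the weights actually change

w_expected = w_old - lr * grad
follows_rule = torch.allclose(linear.weight.detach(), w_expected)
print(w_old, linear.weight.detach(), follows_rule)
```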


The following just checks whether the weight and bias have been updated. I feed in the same input x that I used before and get a different result, so I can infer that the weight/bias have changed, but I cannot tell exactly which values they changed to. The next example shows one way to see exactly how the weight and bias have changed.
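
Putting the steps of this section together, a single training step followed by a second forward pass can be sketched as below. The seed is arbitrary and is an assumption of this sketch; the Network class is the one used on this page, with super().__init__() called first.

```python
import torch
from torch import nn


class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)
        self.activation = nn.Sigmoid()

    def forward(self, x):
        return self.activation(self.linear(x))


torch.manual_seed(0)                      # arbitrary seed, only for repeatability
net = Network()
lossFunc = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

x = torch.tensor([[1.0, 1.0]])
t = torch.tensor([[1.0]])

o = net.forward(x)                        # forward pass before the update
loss = lossFunc(o, t)

optimizer.zero_grad()                     # reset gradients
loss.backward()                           # compute gradients
optimizer.step()                          # update weight and bias

o_after = net.forward(x)                  # same input, updated parameters
loss_after = lossFunc(o_after, t)

print('before :', o.item(), loss.item())
print('after  :', o_after.item(), loss_after.item())
```

With a small learning rate, the output moves slightly toward the target value 1.0 and the loss decreases slightly, which confirms that the parameters were updated.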


    ==>  Weight of linear :

                  Parameter containing:

                         tensor([[-0.7002, -0.0495]], requires_grad=True)

            Bias of linear :

                  Parameter containing:

                         tensor([-0.6640], requires_grad=True)

            Grad of weight :

                         tensor([[-0.2524, -0.2524]])

            Grad of bias :





Some Basics on the Class


I won't go over Python class syntax here. For the basic syntax of a Python class, refer to my note on Python Class. When you create your own class or try to understand a class written by someone else, it is a good idea to understand exactly when each member function (behavior) in the class is triggered (executed). A common trick I use to figure out exactly what happens (i.e., which member functions are executed) when I do something with the class is to put a print() in each and every member function in the class.

The following example applies this trick to see how a PyTorch Module-based class (a class derived from nn.Module) works.


    import torch

    from torch import nn


    class Network(nn.Module):

        def __init__(self):

            super().__init__()               # nn.Module must be initialized before layers are assigned

            print("__init__() is called")    # I put the print() here to figure out when __init__() is executed.


            self.linear = nn.Linear(2,1)

            self.activation = nn.Sigmoid()


        def forward(self,x):

            print("forward() is called")     # I put the print() here to figure out when forward() is executed.

            o = self.linear(x)

            o = self.activation(o)


            return o


        def printInfo(self):

            print("Weight of linear : \n     ", self.linear.weight)

            print("Bias of linear : \n     ", self.linear.bias)

            print("Grad of weight : \n     ", self.linear.weight.grad)

            print("Grad of bias : \n     ", self.linear.bias.grad)


    # This instantiates the Network class defined above. This line prints the string "__init__() is called", which means that just instantiating the class (creating an object of the class) automatically executes def __init__(self) of the class.

    net = Network()  

    ==> __init__() is called


    x = torch.tensor([[1.0,1.0]])


    # This is obvious. You would easily guess that forward() is executed, since you explicitly called the member function forward().

    o = net.forward(x)

    ==> forward() is called


    # If you want to print out a result of the execution of a certain member function, you can print the result as shown below.

    print('o = net.forward(x) :',o)

    ==> o = net.forward(x) : tensor([[0.3122]], grad_fn=<SigmoidBackward>)


    # This example may not look so intuitive, but it shows an important property of the nn.Module class. Here you just pass a value to the object (remember that the variable net is an instance (object) of the Network() class created above) and the string 'forward() is called' is printed. This indicates that calling the object like a function automatically executes the forward() function of the class. This is how nn.Module works: its __call__() method invokes forward().

    o = net(x)

    ==> forward() is called


    # You can print out the result of net(x) as shown below. You will notice that the result is the same as net.forward(x).

    print('o = net(x) :',o)

    ==> o = net(x) : tensor([[0.3122]], grad_fn=<SigmoidBackward>)
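
Although net(x) and net.forward(x) return the same value, they are not fully equivalent: net(x) goes through nn.Module's __call__(), which also runs any registered hooks, while a direct net.forward(x) call skips them. The sketch below illustrates the difference with register_forward_hook(); the hook function log_hook and the list calls are hypothetical names introduced only for this illustration.

```python
import torch
from torch import nn


class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)
        self.activation = nn.Sigmoid()

    def forward(self, x):
        return self.activation(self.linear(x))


calls = []                                # records every time the hook fires


def log_hook(module, inputs, output):     # hypothetical hook, for illustration only
    calls.append(type(module).__name__)


net = Network()
net.register_forward_hook(log_hook)
x = torch.tensor([[1.0, 1.0]])

o1 = net(x)                               # goes through __call__(): hook fires
n_after_call = len(calls)

o2 = net.forward(x)                       # bypasses __call__(): hook does not fire
n_after_forward = len(calls)

print(n_after_call, n_after_forward, torch.equal(o1, o2))
```

For this reason net(x) is generally the preferred way to run a module, with forward() being the method you define rather than the one you call.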



