Building a Dilated ConvNet in PyTorch
It is no mystery that convolutional neural networks are computationally expensive. In this story we will build a dilated convolutional neural network in PyTorch.
We will start by importing the necessary libraries. We will also import torchvision,
because it makes our life easier by handling the download and loading of the CIFAR-10 dataset.
import torch
import torchvision as tv
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
We will now load and transform the dataset. We will create a transformation pipeline that converts the images to tensors and then normalizes them.
# Loading and transforming the data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4466),
                         (0.247, 0.243, 0.261))
])

trainset = tv.datasets.CIFAR10(root='../data', train=True,
                               download=True, transform=transform)
dataloader = torch.utils.data.DataLoader(trainset, batch_size=1,
                                         shuffle=False, num_workers=4)
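To make sure the pipeline works as expected, we can grab a single batch from the loader and inspect its shape (a quick sanity check I am adding here, not part of the original code):

# Sanity check: fetch one batch and inspect the tensor shapes
images, labels = next(iter(dataloader))
print(images.shape)  # torch.Size([1, 3, 32, 32]), since batch_size=1
print(labels.shape)  # torch.Size([1])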
Now it is time for us to write the model.
# Writing our model
class DilatedCNN(nn.Module):
    def __init__(self):
        super(DilatedCNN, self).__init__()
        # Two dilated convolutional layers
        self.convlayers = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=6, kernel_size=9,
                      stride=1, padding=0, dilation=2),
            nn.ReLU(),
            nn.Conv2d(in_channels=6, out_channels=16, kernel_size=3,
                      stride=1, padding=0, dilation=2),
            nn.ReLU(),
        )
        # Fully connected classifier head
        self.fclayers = nn.Sequential(
            nn.Linear(2304, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        x = self.convlayers(x)
        x = x.view(-1, 2304)   # flatten the 16 x 12 x 12 feature maps
        x = self.fclayers(x)
        return x
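In case you are wondering where the 2304 in the first linear layer comes from: a dilated convolution with kernel size k and dilation d has an effective kernel size of d*(k-1)+1, so the 32x32 CIFAR-10 images shrink to 16x16 after the first layer and to 12x12 after the second, giving 16 * 12 * 12 = 2304 features. Here is a quick check (my addition, not in the original post):

# Effective kernels: conv1 is 2*(9-1)+1 = 17, conv2 is 2*(3-1)+1 = 5,
# so the spatial size goes 32 -> 32-17+1 = 16 -> 16-5+1 = 12.
dummy = torch.randn(1, 3, 32, 32)
print(DilatedCNN().convlayers(dummy).shape)  # torch.Size([1, 16, 12, 12])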
It is really simple to define dilated conv layers in PyTorch: we just pass a dilation=<int> argument to nn.Conv2d.
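To see what this argument does, compare a standard convolution with its dilated counterpart: the same 3x3 kernel covers a 5x5 region of the input when dilation=2 (an illustrative snippet I am adding, not from the original post):

# Same kernel size, different receptive fields
standard = nn.Conv2d(3, 6, kernel_size=3, dilation=1)  # sees a 3x3 region
dilated  = nn.Conv2d(3, 6, kernel_size=3, dilation=2)  # sees a 5x5 region
x = torch.randn(1, 3, 32, 32)
print(standard(x).shape)  # torch.Size([1, 6, 30, 30])
print(dilated(x).shape)   # torch.Size([1, 6, 28, 28])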
We will now train our model.
net = DilatedCNN()
# Optimization and loss function
loss_function = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.5)

# Training the model
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(dataloader, 0):
        inputs, labels = data

        optimizer.zero_grad()

        # forward prop
        outputs = net(inputs)

        # backward prop
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print("Training finished! Yay!!")
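Once training finishes, we can check how well the network generalizes by running it on the CIFAR-10 test split (an evaluation snippet I am adding here; the original story stops at training):

# Evaluate on the test split, reusing the transform defined above
testset = tv.datasets.CIFAR10(root='../data', train=False,
                              download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=1,
                                         shuffle=False, num_workers=4)

correct, total = 0, 0
with torch.no_grad():
    for inputs, labels in testloader:
        outputs = net(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Test accuracy: %.2f%%' % (100 * correct / total))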
That is all! Notice that even though we used a 9x9 filter, training didn’t take more than 120% of my CPU. Using a dilated convolutional neural network, I was able to cover more spatial area per filter without adding any extra parameters.
To learn more about dilated convolutional neural networks, see https://towardsdatascience.com/understanding-2d-dilated-convolution-operation-with-examples-in-numpy-and-tensorflow-with-d376b3972b25