This assignment is meant to emulate a day’s work as a CV/DL developer here at Polarr. Expect some tasks to be trivial, while others are more challenging and open-ended. Feel free to use Stack Overflow, Google, and any other resource to research questions or issues you have. You are also welcome to use any programming language. The only thing we ask is that you do this work independently, without help from family or friends. Recommended time limit: 8 hours (you should not spend more time than this, but if you do, that's okay as well).
When you have completed the assignment, please name the file in accordance with this naming pattern: [YOUR FULL NAME]-Assignment for [NAME OF THE ROLE THE ASSIGNMENT IS FOR] and upload it to **this dropbox.**
Question 1: Is this Photo Blurry? (1 hour)
Download the following 3 photos, each of which has had a certain amount of Gaussian blur applied:
Write a function that calculates a blurriness score for each of the 3 photos and rank them in terms of the score. Confirm visually that your ranking of the photos makes sense. For instance, a photo that is very sharp could have a score of 5, while a photo that is very blurry could have a score of 1.
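One common way to approach such a score is the variance of the image Laplacian, since blur removes the fine detail that the Laplacian responds to. The sketch below is only an illustration under stated assumptions: the assignment does not prescribe this method, the photos are assumed to be already loaded as 2-D grayscale NumPy arrays, and `box_blur` is a hypothetical stand-in for the Gaussian blur so the example runs on synthetic data. The raw score is unbounded; mapping it onto a 1 to 5 scale is left out.

```python
import numpy as np


def blurriness_score(gray: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian. Higher values mean a sharper image."""
    img = gray.astype(np.float64)
    # Discrete Laplacian evaluated on interior pixels only.
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())


def box_blur(gray: np.ndarray, passes: int) -> np.ndarray:
    """Hypothetical stand-in for Gaussian blur: repeated 3x3 box filter."""
    out = gray.astype(np.float64)
    h, w = out.shape
    for _ in range(passes):
        padded = np.pad(out, 1, mode="edge")
        # Average the nine shifted copies of the padded image.
        out = sum(padded[dy:dy + h, dx:dx + w]
                  for dy in range(3) for dx in range(3)) / 9.0
    return out


# Synthetic test data in place of the three photos (an assumption).
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
images = {"sharp": sharp, "mild": box_blur(sharp, 1), "heavy": box_blur(sharp, 5)}

scores = {name: blurriness_score(img) for name, img in images.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # sharpest first
print(ranking)
```

With real photos, the arrays in `images` would come from an image-loading library of your choice, and the ranking should still be checked by eye as the question asks.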
Question 2: Simple Neural Networks (1.5 hours)
Assume we have a fully-connected neural network that does image recognition with one hidden layer where the input is an image with channels R,G,B, and dimensions 224x224.
The input is X, W1 is weight 1, B1 is bias 1, W2 is weight 2, B2 is bias 2, and y is the output.
The output is y = choose_max_index(softmax(ReLU(X x W1 + B1) x W2 + B2)), with the products ordered to match the shapes listed below.
X: N x D, N is the number of samples, D is 3x224x224
W1: D x H, D is 3x224x224, H the hidden layer size
B1: 1 x H, a vector of size H
ReLU: np.maximum(value, 0)
W2: H x C, H is the hidden layer size, C the number of output labels
B2: 1 x C, vector of size C
softmax: softmax(z)_j = e^(z_j) / Sum_i(e^(z_i)), with the sum over i = 1, ..., C, for each j = 1, ..., C
choose_max_index: for all values of softmax function, choose the index with the max output
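The definitions above can be sketched as a NumPy forward pass. This is a minimal illustration, not a reference solution: the hidden size `H = 16`, the label count `C = 10`, the batch size, and the random weights are all assumptions chosen so the example runs quickly. Following the listed shapes (X is N x D and W1 is D x H), the products are written as `X @ W1` and `hidden @ W2`.

```python
import numpy as np


def relu(value: np.ndarray) -> np.ndarray:
    return np.maximum(value, 0)


def softmax(z: np.ndarray) -> np.ndarray:
    # Subtract the row-wise max before exponentiating for numerical stability;
    # this does not change the result.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def forward(X, W1, B1, W2, B2):
    hidden = relu(X @ W1 + B1)        # N x H, B1 broadcasts over the N rows
    scores = hidden @ W2 + B2         # N x C, B2 broadcasts over the N rows
    probs = softmax(scores)           # N x C, each row sums to 1
    return probs.argmax(axis=1), probs  # choose_max_index per sample


# Assumed sizes for illustration only.
N, D, H, C = 4, 3 * 224 * 224, 16, 10
rng = np.random.default_rng(0)
X = rng.random((N, D))                      # flattened RGB images
W1 = rng.normal(scale=0.01, size=(D, H))
B1 = np.zeros((1, H))
W2 = rng.normal(scale=0.01, size=(H, C))
B2 = np.zeros((1, C))

y, probs = forward(X, W1, B1, W2, B2)
print(y.shape, probs.shape)
```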
Question 3: Slightly More Complicated Neural Networks (2 hours)
Scan through the paper on depthwise-separable factorization using MobileNets: https://arxiv.org/abs/1704.04861
For this question, you can decide which ML framework (e.g. TensorFlow, PyTorch) you want to use. Find a pre-trained MobileNetV2 model and download the weights.
Given the weights, write a test script to run image classification on this image. Find the top 5 labels and their respective probabilities.
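The last step, turning the model output into the top 5 labels and probabilities, is framework-independent and can be sketched in NumPy. This assumes the model returns a 1-D vector of raw logits for a single image; the `logits` values and `labels` list below are made up for illustration and stand in for the real model output and the real class names. Loading the model and preprocessing the image are not shown, since they depend on the framework you choose.

```python
import numpy as np


def top_k(logits: np.ndarray, labels: list, k: int = 5):
    """Return the k most probable (label, probability) pairs, highest first."""
    shifted = logits - logits.max()              # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    order = np.argsort(probs)[::-1][:k]          # indices by descending probability
    return [(labels[i], float(probs[i])) for i in order]


# Hypothetical logits and label names, in place of real model output.
logits = np.array([0.1, 2.0, -1.0, 3.5, 0.7, 1.2, -0.3])
labels = [f"class_{i}" for i in range(len(logits))]

top5 = top_k(logits, labels)
for name, p in top5:
    print(f"{name}: {p:.4f}")
```

If the framework's model already ends in a softmax layer, the exponentiation step should be skipped and the output sorted directly.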
Print the first three layers of the network and the shapes of the corresponding parameters (weights, biases, ...) and save them to a .txt file.