# Abridged task description
In each test case, you are given 6 images, each representing a different information capture from a radar feed. For each test case, you must output a segmentation map classifying each pixel as a background pixel, a wall pixel or a human pixel.
However, the scoring function is extremely imbalanced: each correctly identified non-background pixel is worth 50 points (not to be confused with your actual score), whilst each correctly identified background pixel is worth only 1 point. Your total points are divided by the maximum possible number of points to give your final score.
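To make the scoring concrete, here is a minimal sketch of how such a score might be computed. The class ids (0 = background, 1 = wall, 2 = human) and the NumPy representation are assumptions for illustration, not taken from the official judge.

```python
import numpy as np

BACKGROUND = 0  # assumed class id; walls and humans are the non-background classes

def score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Normalised score for one test case, per the description above."""
    # Each pixel is worth 50 points if its true class is non-background,
    # and 1 point if it is background; points are earned only when the
    # prediction matches the ground truth.
    weights = np.where(truth != BACKGROUND, 50, 1)
    earned = np.where(pred == truth, weights, 0).sum()
    return earned / weights.sum()
```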
# Unofficial writeup
# 82+ points
Credit: many countries
It is well known that U-net performs extremely well on image segmentation tasks with a lot of training data. This problem fits that description. So we feed the training data into a standard U-net with 3 encoder layers and 3 decoder layers. In particular, having 512 channels at the bottleneck yielded strong results.
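As a reference point, here is a minimal PyTorch sketch of a U-net of this shape. The 6 input channels correspond to the 6 radar images stacked along the channel dimension; the channel widths (64 → 128 → 256 → 512) and layer names are assumptions consistent with a standard U-net, not the exact competition code.

```python
import torch
import torch.nn as nn

def double_conv(in_ch: int, out_ch: int) -> nn.Sequential:
    # Two 3x3 conv + BatchNorm + ReLU blocks, the standard U-net building unit.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    # 3 encoder levels, a 512-channel bottleneck, and 3 decoder levels.
    def __init__(self, in_channels: int = 6, num_classes: int = 3):
        super().__init__()
        self.enc1 = double_conv(in_channels, 64)
        self.enc2 = double_conv(64, 128)
        self.enc3 = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(256, 512)
        self.up3 = nn.ConvTranspose2d(512, 256, 2, stride=2)
        self.dec3 = double_conv(512, 256)  # 256 upsampled + 256 skip
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)  # 128 upsampled + 128 skip
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)   # 64 upsampled + 64 skip
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input height and width must be divisible by 8 so the skip
        # connections line up after three 2x downsamplings.
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        b = self.bottleneck(self.pool(e3))
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # per-pixel class logits, shape (batch, 3, H, W)
```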
It seemed that the size of the U-net (when no other modifications were made) had a substantial impact on the score. For example, in Australia’s experience, 2 encoder/decoder layers gave 37 points, whilst 4 encoder/decoder layers and 1024 channels at the bottleneck gave 0 points (i.e. lower than baseline).
One possible theory is that, due to the imbalance in the scoring function, both underfitting and overfitting to CrossEntropyLoss (the ‘default’, sensible choice) were heavily penalized. For example, if an overfitted model learnt to aggressively classify pixels as background, it would consistently miss out on a lot of points.
# 98+ points
Credit: many countries
We keep U-net as the base model, but change the weights of our loss function to reflect the actual scoring conditions. In particular, we want to make the background class 50 times less important than all the other classes. In a standard PyTorch setup, you can do this by setting the weight parameter of torch.nn.CrossEntropyLoss, as sketched below.
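A minimal sketch of this weighting, assuming the same class ids as before (0 = background, 1 = wall, 2 = human):

```python
import torch
import torch.nn as nn

# Background (class 0) gets weight 1; the non-background classes get 50,
# mirroring the points awarded per correctly identified pixel.
weights = torch.tensor([1.0, 50.0, 50.0])
criterion = nn.CrossEntropyLoss(weight=weights)

# Shapes for a semantic segmentation batch:
logits = torch.randn(4, 3, 64, 64)           # (batch, classes, H, W) U-net outputs
targets = torch.randint(0, 3, (4, 64, 64))   # (batch, H, W) ground-truth class ids
loss = criterion(logits, targets)
```

Note that with the default reduction='mean', PyTorch divides the weighted loss by the sum of the per-pixel weights rather than by the pixel count, which loosely mirrors the normalisation by maximum possible points in the scoring function.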