Yet, despite these encouraging advancements, some of the best models are easily misled when presented with slightly modified or noisy inputs1,2. I found this quite interesting, as it reminds us that we are still quite far from human performance when it comes to generalisation: seeing things in context and transferring the knowledge we have about one domain to another.
I have been doing research in the field of video games for a while, and with the recent advancements in DNNs I thought it would be interesting to test some of these methods on video games. There is already some interesting work on teaching agents to play games using deep reinforcement learning methods, learning actions by looking only at the pixels3,4. My interest, however, lies in training DNNs to pick up similarities between games by looking at gameplay videos. The problem is very interesting and challenging, and I expect it will keep me busy for a while. I published initial results from classifying 200 games in a previous post (here); for now, I want to share something more fun.
When I started my experiments, I took the obvious first step: feeding images of games to an accurate pretrained deep neural network and checking what the model thinks of them. In my case, I used VGG-16, a very popular and accurate model trained on the ImageNet dataset that achieves near-human performance5. I thought this would be a good starting point for later fine-tuning the model on images from games, but I wanted first to check the performance of the original model without fine-tuning. So here I want to share some of the results I got, which I think are fun to look at. The model outputs probabilities of what it sees in an image (selected from 1000 different categories covering a wide range of objects), and I'm showing the five highest probabilities in the figures below.
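In case you want to reproduce this step yourself, the top-5 readout is just a softmax over the network's 1000 class logits followed by sorting. Here is a minimal sketch of that readout; the logits and labels below are made up for illustration (a real run would take the logits from a pretrained VGG-16 and the labels from the ImageNet class list):

```python
import numpy as np

def top5(logits, labels):
    """Return the five most probable (label, probability) pairs."""
    z = np.asarray(logits, dtype=float)
    p = np.exp(z - z.max())   # softmax, shifted by the max for numerical stability
    p /= p.sum()
    order = np.argsort(p)[::-1][:5]  # indices of the five largest probabilities
    return [(labels[i], float(p[i])) for i in order]

# Toy example: six made-up classes standing in for the 1000 ImageNet ones.
fake_labels = ["castle", "garbage truck", "seashore", "joystick", "maze", "tank"]
fake_logits = [2.0, 5.0, 1.0, 3.5, 0.5, 4.0]
for name, prob in top5(fake_logits, fake_labels):
    print(f"{name}: {prob:.3f}")
```

This is the same ranking shown in the figures, just without the convolutional forward pass in front of it.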
First, let's look at images where the model failed to capture what's in the image and produced wrong classes with high confidence. (The one I particularly like is the image of a treasure box that the network sees as garbage! I guess these networks are of no use to treasure hunters yet!)
Now, to be fair, the network did a good job of recognising quite a lot of images, especially those that simulate real objects with high-quality graphics. Here are some examples from games such as Bridge Project, Star Conflict, Among the Sleep, Alien Rage Unlimited, Maia, and Oniken.
- Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
- Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition
- Playing Atari with Deep Reinforcement Learning
- Continuous control with deep reinforcement learning
- Very Deep Convolutional Networks for Large-Scale Image Recognition