Google's researchers know how much people like pretending they're on the moon, or that it's night instead of day, and other tricks that are only fun if you're in a movie studio in front of a green screen. So they did what any good 2018 coder would do: build a neural network that lets you do it.
This “video segmentation” tool, as they call it (well, as everyone calls it), is beginning to roll out on YouTube Stories in a limited way – if you see the option, congratulations, you're a beta tester.
There's a lot of ingenuity in this feature. It's a piece of cake to work out where the foreground ends and the background begins if you have a depth-sensing camera (like the iPhone X's TrueDepth camera) or lots of processing time and no battery constraints to worry about (like a desktop).
On a mobile device, however, with only an ordinary RGB image, it's not so easy. And if segmenting a still image is hard, video is harder still: the computer has to do the math at least 30 times per second.
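To put that constraint in concrete terms, here's the simple arithmetic behind the per-frame time budget (pure back-of-the-envelope math; the only figures taken from the article are the frame rates it mentions):

```python
# Per-frame time budget: at N frames per second, all processing for
# one frame must finish within 1000 / N milliseconds.
def budget_ms(fps):
    return 1000.0 / fps

print(round(budget_ms(30), 1))   # 33.3 ms per frame at 30 fps
print(round(budget_ms(100), 1))  # 10.0 ms per frame at 100 fps
```

So a segmentation network that targets 100 fps has roughly 10 milliseconds per frame for everything: inference, compositing, and display.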
Well, Google's engineers took this as a challenge and built a convolutional neural network architecture, trained on thousands of labeled images like the one at right.
The network learned to recognize the common features of a head and shoulders, and a series of optimizations reduced the amount of data it needed to crunch to do so. And – although it's a bit of a cheat – the result of the previous frame's calculation (essentially, a cutout of your head) is reused as raw material for the next frame, further reducing the load.
The result is a fast and relatively accurate segmentation engine that runs more than fast enough to be used in video – 40 frames per second on the Pixel 2 and over 100 on the iPhone 7 (!).
This is great news for many people – removing or replacing a background is a handy tool to have in your toolbox, and this makes it very easy. Hopefully without killing your battery.