How generating images from a sketch helps to improve your Zoom call

Hello magnificent human,

Last week I promised you a paper about how to do video encoding via neural networks. To be frank, I have to break that promise, as the paper does not exist in public :/ The technology I had in mind was the AI Video Compression by NVIDIA, but sadly there is no public paper on it, as (I assume) NVIDIA considers it a trade secret. But after digging through the internet I found a public paper by NVIDIA that is considered to contain the precursor to the video compression product they plan to sell. The paper describes how to generate images from a drawing, which is a similar problem to video encoding: create a good-looking image based on incomplete information.

Software exists to create business value

I am Simon Frey, the author of the Weekly CS Paper Newsletter. And I have great news: you can work with me.

As CTO as a Service, I will help you choose the right technology for your company, build up your team and be a deeply technical sparring partner for your product development strategy.

Check out my website simon-frey.com to learn more or contact me directly.

Let’s work together!

Abstract:

We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal as the normalization layers tend to “wash away” semantic information. To address the issue, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantic and style.
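To make the "wash away" argument a bit more tangible, here is a minimal sketch in plain Python. It is not the authors' implementation. In the paper the scale and shift are produced from the layout by a small learned network; here I assume a simple per-label lookup table instead. The function names (`normalize`, `spade`), the label ids and all numbers are made up for illustration.

```python
from statistics import fmean, pvariance

def normalize(channel, eps=1e-5):
    """Parameter-free normalization of one H x W channel (zero mean, unit variance)."""
    values = [v for row in channel for v in row]
    mean = fmean(values)
    std = (pvariance(values, mu=mean) + eps) ** 0.5
    return [[(v - mean) / std for v in row] for row in channel]

def spade(channel, labels, gamma, beta, eps=1e-5):
    """Normalize, then scale and shift every pixel depending on its semantic label.

    Assumption: gamma and beta are lookup tables indexed by label id. They stand
    in for the learned, layout-conditioned transformation described in the abstract.
    """
    normed = normalize(channel, eps)
    return [
        [gamma[label] * x + beta[label] for x, label in zip(x_row, label_row)]
        for x_row, label_row in zip(normed, labels)
    ]

# Hypothetical labels: 0 = sky, 1 = grass.
gamma = {0: 1.0, 1: 1.0}
beta = {0: 0.5, 1: -0.5}

# A layout with a single label everywhere leads to a flat feature map.
flat_features = [[3.0, 3.0], [3.0, 3.0]]
all_sky = [[0, 0], [0, 0]]
all_grass = [[1, 1], [1, 1]]

washed_away = normalize(flat_features)               # the same for sky and grass
sky_out = spade(flat_features, all_sky, gamma, beta)
grass_out = spade(flat_features, all_grass, gamma, beta)
```

Plain normalization turns the flat feature map into all zeros, no matter whether the layout said "sky" or "grass", so the label is lost. With the layout-dependent scale and shift, `sky_out` and `grass_out` differ (0.5 versus -0.5 everywhere), so the semantic information survives the normalization step.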

Download Link:

https://arxiv.org/pdf/1903.07291.pdf