Hello, magnificent human,
Last week I promised you a paper about how to do video encoding via neural networks. To be frank, I will fail to deliver that paper, as it does not exist in public :/ The technology I had in mind was the AI Video Compression by NVIDIA, but sadly there is no public paper on that topic, as (I assume) NVIDIA considers it a trade secret. But after digging through the internet, I found a public paper by NVIDIA that is considered to contain the precursor to the video compression product they plan to sell. The paper describes how to generate images from a drawing, which is a similar problem to video encoding: create a good-looking image based on incomplete information.
We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal as the normalization layers tend to “wash away” semantic information. To address the issue, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantic and style.
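To make the core idea of the abstract a bit more tangible, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes a single image, per-channel statistics over all pixels, and it replaces the learned convolutions that produce the modulation parameters with two hypothetical lookup tables (`gamma_table`, `beta_table`) indexed by semantic label. The function name `spade_normalize` is made up for this example as well.

```python
import math


def spade_normalize(features, seg_map, gamma_table, beta_table, eps=1e-5):
    """Simplified spatially-adaptive normalization for one image.

    features:    [C][H][W] activations (nested lists of floats)
    seg_map:     [H][W] integer semantic labels (the input layout)
    gamma_table: [num_labels][C] scale per label and channel
    beta_table:  [num_labels][C] shift per label and channel
    The two tables are an ASSUMPTION: they stand in for the learned
    layers that map the layout to per-pixel scale and shift values.
    """
    out = []
    for c, channel in enumerate(features):
        # Step 1: parameter-free normalization of the whole channel.
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)

        # Step 2: scale and shift every pixel based on its semantic label,
        # so the layout information is re-injected after normalization.
        out_channel = []
        for y, row in enumerate(channel):
            out_row = []
            for x, v in enumerate(row):
                label = seg_map[y][x]
                gamma = gamma_table[label][c]
                beta = beta_table[label][c]
                out_row.append(gamma * (v - mean) / std + beta)
            out_channel.append(out_row)
        out.append(out_channel)
    return out


# A completely flat feature map: plain normalization turns it into zeros,
# which is the "wash away" effect. The label-dependent shift survives.
flat_features = [[[5.0, 5.0], [5.0, 5.0]]]  # 1 channel, 2x2 pixels
layout = [[0, 0], [1, 1]]                   # top row label 0, bottom row label 1
gammas = [[1.0], [2.0]]
betas = [[-1.0], [3.0]]
result = spade_normalize(flat_features, layout, gammas, betas)
print(result)  # [[[-1.0, -1.0], [3.0, 3.0]]]
```

In the flat example the normalized activations are all zero, yet the output still differs between the two label regions, because the scale and shift depend on the layout at every pixel. That is the property the abstract refers to when it talks about modulating the activations with the input layout.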
It would be awesome if you could help grow our little paper community even more by sharing it with your circles (you can also @eu_frey me on Twitter for retweets :D).
If you have any paper recommendations for me, please do not hesitate to reach out to me via [email protected] (please keep the Backend & DevOps topic focus in mind).
With much love,