Novel-View Human Action Synthesis aims to synthesize the appearance of a dynamic scene from a virtual viewpoint, given a video from a real viewpoint. Our approach uses a novel form of 3D reasoning to synthesize the target viewpoint. We first estimate the 3D mesh of the target object, a human actor, and transfer rough textures from the 2D images to the mesh. This transfer may leave the mesh only sparsely textured because of limited frame resolution or occlusions. To solve this problem, we produce a semi-dense textured mesh by propagating the transferred textures both locally, within geodesic neighborhoods, and globally, across symmetric semantic parts. Next, we introduce a context-based generator that learns to correct and complete the residual appearance information. This allows the network to focus independently on learning the foreground and background synthesis tasks.
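The two propagation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the adjacency-list mesh representation, and the `mirror` lookup table (mapping each vertex to its counterpart on the symmetric body part) are assumptions made for illustration, and edge-hop distance is used as a crude stand-in for geodesic distance.

```python
from collections import deque

import numpy as np


def propagate_local(colors, known, adjacency, max_hops=2):
    """Fill untextured vertices from textured ones within max_hops edges.

    Assumption: hop count approximates geodesic distance on the mesh.
    """
    colors = colors.copy()
    filled = known.copy()
    # Multi-source breadth-first search starting at every textured vertex.
    queue = deque((int(v), 0) for v in np.flatnonzero(known))
    while queue:
        v, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not spread beyond the neighborhood radius
        for u in adjacency[v]:
            if not filled[u]:
                colors[u] = colors[v]
                filled[u] = True
                queue.append((u, dist + 1))
    return colors, filled


def propagate_symmetric(colors, filled, mirror):
    """Copy texture from a vertex's mirrored counterpart if only that side is textured.

    Assumption: mirror[v] is a hypothetical precomputed index of the
    vertex symmetric to v (for example, left arm to right arm).
    """
    colors = colors.copy()
    out = filled.copy()
    for v, m in enumerate(mirror):
        if not filled[v] and filled[m]:
            colors[v] = colors[m]
            out[v] = True
    return colors, out
```

Running the local step first and the symmetric step second mirrors the order in which the text introduces them; on a real body mesh, the hop-based neighborhood would likely be replaced by a true geodesic distance computation.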
M. Lakhal, D. Boscaini, F. Poiesi, O. Lanz and A. Cavallaro, "Novel-View Human Action Synthesis," ACCV, 2020
@InProceedings{Lakhal_2020_Arxiv,
  author    = {Mohamed Ilyes Lakhal and Davide Boscaini and Fabio Poiesi and Oswald Lanz and Andrea Cavallaro},
  title     = {Novel-View Human Action Synthesis},
  booktitle = {Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2020},
}