ACM Multimedia 2020 - Industrial Invited Talk

Talk 1

Title: Building Digital Human
Presenter: Dong Yu, Tencent AI Lab
Date: 13 October 2020, 16:30 - 17:00 (UTC)

Abstract: Digital humans find applications in areas such as virtual companions, virtual reporters, and virtual narrators. In this talk, I will introduce our recent work on building digital humans. I will focus on our progress in multi-modal text-to-speech synthesis and multi-modal voice separation and recognition.

Bio: Dr. Dong Yu is an IEEE Fellow and an ACM Distinguished Scientist. He currently serves as a distinguished scientist and vice general manager at Tencent AI Lab. Prior to joining Tencent in 2017, he was a principal researcher at Microsoft Research (Redmond). His research focuses on speech processing and recognition and on multi-modal interactive systems. His work has been widely cited and was recognized with the IEEE Signal Processing Society Best Paper Award in 2013 and 2016.
Dr. Dong Yu currently serves as the vice chair of the IEEE Speech and Language Processing Technical Committee (SLPTC). He has served as a member of the IEEE SLPTC (2013-2018), a distinguished lecturer of APSIPA (2017-2018), an associate editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing (2011-2015), an associate editor of the IEEE Signal Processing Magazine (2008-2011), and a member of the organizing and technical committees of many conferences and workshops.

Talk 2

Title: Cloud Drive App - Closing the Gap Between AI Research and Practice
Presenter: Itamar Friedman, Alibaba DAMO Academy
Date: 15 October 2020, 15:00 - 15:30 (UTC)

Abstract: In the past few years, Cloud Drive Apps have attracted growing interest from end users and enterprise customers. During this period, numerous AI-based features have been introduced, such as functions that let users intelligently organize, search, share, edit, and recreate content with their images and videos.
In this talk, I will introduce our latest work on highly efficient image understanding, which aims to make various novel methods (such as neural architecture search and advanced training techniques) applicable to Cloud Drive App use cases. I will discuss use cases such as image search through free-text queries, focusing on difficult real-world problems and suggested solutions. I will also demonstrate the usefulness of the proposed techniques when applied to public competitions.

Bio: Itamar holds an MSc and a BSc (summa cum laude) in Computer Vision and Machine Learning from the Faculty of Electrical Engineering at the Technion (Israel Institute of Technology). His research interests include efficient video and image analysis, with a focus on automated deep learning. Prior to Alibaba, Itamar was a serial entrepreneur, founding startups in the fields of web development, robotics, and machine-vision offline-to-online technologies. He was a mentor at Microsoft Accelerator TLV, mentoring leading Israeli AI startups in the medical and drone fields. He holds several patents, and his research outcomes have been integrated into various products at Alibaba Group.