CV4MR

Workshop on Computer Vision for Mixed Reality

June 18, 2023

In conjunction with CVPR 2023

Vancouver, Canada

OVERVIEW

VR technologies have the potential to transform the way we use computing to interact with our environment, do our work, and connect with each other. VR devices provide users with immersive experiences at the cost of blocking visibility of the surrounding environment. With the advent of passthrough techniques such as those in Quest Pro, users can now build deeply immersive experiences that blend the virtual and the real world into one, often called Mixed Reality. Mixed Reality poses a set of unique computer vision research problems that are not covered by VR. Our focus is on capturing the real environment around the user with cameras placed away from the user’s eyes, yet reconstructing the environment with high fidelity and augmenting it with virtual objects and effects, all in real time. This workshop offers the research community an opportunity to deeply understand the unique challenges of Mixed Reality and to pursue novel methods encompassing view synthesis, scene understanding, and efficient on-device AI, among other topics.

CALL FOR PAPERS

Dates:

  • Paper Submission Deadline: March 15, 2023
  • Notification to authors: April 04, 2023
  • Camera ready deadline: April 11, 2023

Topics of Interest include:

  • Real-time View Synthesis for Passthrough
  • Depth Estimation for Stereoscopic Reconstruction
  • 3D capture, reconstruction and rendering for virtual objects
  • Scene understanding
  • Real-time Style Transfer for Passthrough
  • Novel Applications of Mixed Reality in areas such as Healthcare, Manufacturing, etc.

Submission Guidelines:

  • We invite submissions of up to 8 pages (excluding references), as well as 4-page extended abstracts.
  • Submitted manuscripts should follow the CVPR 2023 paper template.
  • If you have other media to attach (videos, etc.), please feel free to add anonymized links.
  • Submissions will be rejected without review if they:
      1. Contain more than 8 pages (excluding references).
      2. Violate the double-blind policy.
      3. Violate the dual-submission policy for papers with more than 4 pages excluding references.

KEYNOTE

Richard Newcombe is VP of Research Science at Meta Reality Labs, leading the Surreal team in Reality Labs Research. The Surreal team is creating a new generation of machine perception technologies called LiveMaps that combines novel always-on wearable sensing and compute with efficient algorithms for device location, 3D scene understanding, and user state estimation. The Surreal team pioneered a new generation of machine perception glasses, Project Aria, that provides a new generation of data for egocentric multimodal AI research. Richard received his undergraduate degree in Computer Science and his master's in Robotics and Intelligent Machines from the University of Essex in England, and his PhD from Imperial College London, followed by a postdoc at the University of Washington. Richard went on to co-found Surreal Vision, Ltd., which was acquired by Meta in 2015. As a research scientist, his original work introduced the dense SLAM paradigm demonstrated in KinectFusion and DynamicFusion, which influenced a generation of real-time and interactive systems in AR/VR and robotics by enabling systems to efficiently understand the geometry of the environment. Richard received the best paper award at ISMAR 2011, the best demo award at ICCV 2011, the best paper award at CVPR 2015, and the best robotic vision paper award at ICRA 2017. In 2021, Richard received the ICCV Helmholtz award for his research on DTAM, and the ISMAR and UIST test-of-time awards for KinectFusion.

SPEAKERS

De-An Huang is a research scientist at NVIDIA. His research interests include video understanding, embodied agents, and large-scale AI systems. He received his Ph.D. in Computer Science from Stanford University in 2020, advised by Juan Carlos Niebles and Fei-Fei Li. He was awarded the NVIDIA Graduate Fellowship in 2018. His Ph.D. thesis was titled “Purposive Visual Imitation for Learning Structured Tasks from Videos”. De-An did research internships at NVIDIA Research, Meta AI, Microsoft Research, and Disney Research. He holds a Master's degree in Robotics from Carnegie Mellon University and a Bachelor's degree in Electrical Engineering from National Taiwan University.

Angjoo Kanazawa is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. She leads the Kanazawa AI Research (KAIR) lab under BAIR and serves on the advisory board of Wonder Dynamics. She earned her BA in Mathematics and Computer Science from NYU working with Rob Fergus, and her PhD in Computer Science at the University of Maryland, College Park, where she was advised by David Jacobs. While in graduate school, she visited the Max Planck Institute in Tübingen, Germany, under the guidance of Michael Black. Before taking up her current teaching post, she worked as a Research Scientist at Google Research, and as a BAIR postdoc at UC Berkeley advised by Jitendra Malik, Alexei A. Efros, and Trevor Darrell. Kanazawa's research lies at the intersection of computer vision, computer graphics, and machine learning. She is focused on building systems that can capture, perceive, and understand the complex ways that people and animals interact dynamically with the 3D world, and can use that information to correctly identify the content of 2D photos and videos portraying scenes from everyday life.

Andrea Vedaldi is Professor of Computer Vision and Machine Learning at the University of Oxford, where he has co-led the Visual Geometry Group since 2012. He is also a research scientist at Meta AI Research in London. His recent work has focused on unsupervised learning of representations and geometry in computer vision. He is the author of more than 180 peer-reviewed publications in the top machine vision and artificial intelligence conferences and journals. He is a recipient of the Mark Everingham Prize for selfless contributions to the computer vision community, the Open Source Software Award from the ACM, and the best paper award at the Conference on Computer Vision and Pattern Recognition in 2020. He is the recipient of ERC Starting and Consolidator Grants and a co-investigator on two EPSRC Programme Grants.

Rafal Mantiuk is a Professor of Graphics and Displays at the Department of Computer Science and Technology, the University of Cambridge in the United Kingdom. He received his PhD from the Max Planck Institute for Computer Science in Germany. His recent interests focus on computational displays, and on rendering and imaging algorithms that adapt to human visual performance to deliver the best image quality given limited resources such as computation time or bandwidth. He contributed to early work on high dynamic range imaging, including quality metrics (HDR-VDP), video compression, and tone mapping.

Pratul Srinivasan is a research scientist at Google, where he works on problems at the intersection of graphics and computer vision, specializing in view synthesis and inverse rendering. He completed his PhD at UC Berkeley in 2020, advised by Ravi Ramamoorthi and Ren Ng, and received the ACM Doctoral Dissertation Award Honorable Mention and the David J. Sakrison Memorial Prize for his thesis work on neural radiance fields. He has received Best Paper Honorable Mentions at ECCV 2020, ICCV 2021, and CVPR 2022.

ORGANIZERS

Rakesh Ranjan is a Senior Research Scientist Manager at Reality Labs, Meta. Rakesh and his team pursue research in AI-based low-level computer vision, 3D reconstruction, and scene understanding for Augmented and Virtual Reality devices. Prior to Meta, Rakesh was a Research Scientist at NVIDIA, where he worked on AI for real-time graphics (DLSS) and AI for cloud gaming (GeForce Now). Rakesh also spent five years at Intel Research as a PhD student and full-time researcher.

Peter is a Research Manager in computer vision at Meta. Before joining Meta in 2014, he was a Visiting Assistant Professor in Professor Bernd Girod’s group at Stanford University, Stanford, USA, where he worked on personalized multimedia systems and mobile visual search. He received an M.Sc. in Computer Science from the Vrije Universiteit, Amsterdam, Netherlands, and an M.Sc. as a Program Designer Mathematician from Eötvös Loránd University, Budapest, Hungary. Peter completed his Ph.D. with Prof. Touradj Ebrahimi at the École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, in 2012.

Laura Leal-Taixé is a Senior Research Manager at NVIDIA and an Adjunct Professor at the Technical University of Munich (TUM), leading the Dynamic Vision and Learning group. From 2018 until 2022, she was a tenure-track professor at TUM. Before that, she spent two years as a postdoctoral researcher at ETH Zurich, Switzerland, and a year as a senior postdoctoral researcher in the Computer Vision Group at the Technical University of Munich. She obtained her PhD from Leibniz University Hannover in Germany, spending a year as a visiting scholar at the University of Michigan, Ann Arbor, USA. She pursued her B.Sc. and M.Sc. in Telecommunications Engineering at the Technical University of Catalonia (UPC) in her native city of Barcelona, and went to Boston, USA, to complete her master's thesis at Northeastern University with a fellowship from the Vodafone Foundation. She is a recipient of the Sofja Kovalevskaja Award of 1.65 million euros in 2017, the Google Faculty Award in 2021, and an ERC Starting Grant in 2022.

Xiaoyu Xiang is a Research Scientist at Reality Labs, Meta. Her primary research areas include image and video reconstruction, novel view synthesis, and generative models. Prior to Meta, Xiaoyu obtained her Ph.D. in Electrical and Computer Engineering from Purdue University, advised by Jan P. Allebach.