Article Highlight | 24-Jul-2024

Edge-enhanced spatial computing based on binocular metalens

Compuscript Ltd

A new publication from Opto-Electronic Science (DOI 10.29026/oes.2024.230033) discusses edge-enhanced spatial computing based on binocular metalens.

 

Spatial computing and the emerging metaverse represent a paradigm shift in the way humans interact with machines. Common augmented reality devices rely on spatial computing to perceive the depth of the physical world while embedding virtual objects into real scenes in three dimensions. A key technology of spatial computing is depth perception, which bridges the gap between the physical and digital worlds. It enables intuitive, natural interaction with virtual objects, so that digital information can be placed and manipulated in a scene in accordance with the laws of physics. However, traditional depth sensing systems are heavy and bulky, which makes wearable human-computer interaction devices that carry many sensors (mainly cameras and LiDAR) uncomfortable. The space occupied by bulky sensors also limits battery life, so the device needs frequent charging. At the same time, disparity calculation for ill-posed regions that inherently lack texture remains challenging. Advances in portable, accurate imaging and depth sensing systems are critical to the next generation of wearable devices for human-computer interaction.

 

The authors of this article have developed an edge-enhanced depth perception system based on binocular metalens for spatial computing. The system is miniaturized, lightweight, compact, and intelligent. Its hardware consists of a binocular metalens, a 532 nm filter, and a CMOS sensor.

 

Depth acquisition in binocular imaging relies on a stereo image pair that exhibits discernible disparities. Disparity is the horizontal displacement between corresponding pixels in the left and right images. Despite significant advances in the accuracy and speed of binocular stereo systems, it remains challenging to find precise corresponding points for disparity calculation within inherently ill-posed regions, such as textureless regions and reflective surfaces. Ambiguous depth predictions have serious implications for subsequent machine decisions. The application scenario shown in Fig. 1 contains ill-posed regions, such as unpatterned backgrounds and untextured surfaces of letter objects. The raw captured image is processed directly, without preprocessing, by the proposed pyramid stereo-matching neural network, H-Net, to obtain the disparity. A novel symmetric H-module with an attention mechanism allows the H-Net to dynamically allocate resources based on the significance of the contextual features of each view and the correlation between the left and right views. Edge enhancement is then applied to the depth-sensing results, filtering for the feature information that marks gradients in 3D space.
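As a rough illustration of the two steps described above, the sketch below converts a disparity map to depth with the standard pinhole stereo relation Z = f·B/d and then highlights depth discontinuities with a simple central-difference gradient. It is not the authors' implementation: the function names, the focal length, the baseline, and the toy disparity map are all hypothetical, and the gradient filter is only a stand-in for whatever edge operator the system actually uses.

```python
def disparity_to_depth(disparity, focal_px, baseline_mm):
    """Convert a disparity map (pixels) to depth (mm) via Z = f * B / d.

    Pixels with zero disparity are mapped to infinity (no finite depth).
    """
    return [
        [focal_px * baseline_mm / d if d > 0 else float("inf") for d in row]
        for row in disparity
    ]


def gradient_magnitude(depth):
    """Central-difference gradient magnitude; border pixels are left at 0.

    Assumes every depth value is finite.
    """
    h, w = len(depth), len(depth[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            gy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    return edges


# Hypothetical 3x5 disparity map: a far surface (10 px) next to a near one (20 px).
disparity = [[10, 10, 10, 20, 20] for _ in range(3)]
# focal_px and baseline_mm are made-up values, not the device's parameters.
depth = disparity_to_depth(disparity, focal_px=1000.0, baseline_mm=4.0)
edges = gradient_magnitude(depth)
print(depth[1])  # [400.0, 400.0, 400.0, 200.0, 200.0]
print(edges[1])  # [0.0, 0.0, 100.0, 100.0, 0.0]
```

The nonzero entries in the edge row sit exactly where the depth jumps from 400 mm to 200 mm, which is the kind of 3D gradient that an edge-enhancement step is meant to bring out.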

 

The researchers characterized the fabricated binocular metalens using scanning electron microscopy (SEM) images (Fig. 2). The fabricated nanopillars show no cracks or pores, and the 750 nm high nanopillars are well aligned. Each metalens has a diameter of 2.6 mm, a volume of 4.25×10⁻⁶ cm³, and a weight of only 2.61×10⁻⁵ g, which is less than one percent of the weight of a human hair.

 

The H-Net follows an end-to-end learning framework from stereo input images to disparity map prediction without any other pre- or post-processing, as shown in Fig. 3. Global context aggregation is vital for deriving disparity information from stereo image pairs. Besides the conventional encoder-decoder architecture and pyramid pooling, H-Net adopts cross-pixel interaction and cross-view interaction to exploit contextual information and integrate the two views, leading to improved performance and more comprehensive analysis. Built on the PSMNet backbone, the head of H-Net is a Siamese network whose two branches share weights. These Siamese CNNs use residual blocks to extract features and weight-sharing spatial pyramid pooling (SPP) modules to aggregate context information. The features are then combined through cross-pixel interaction and cross-view interaction in an H-Module. A 4D cost volume is built from the left and right image features and passed to a 3D CNN for depth estimation. A disparity regression module is applied before the final disparity map prediction.
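The last two stages of this pipeline can be sketched in a few lines of plain Python. The example below assumes the PSMNet-style construction, in which the 4D cost volume concatenates left features with right features shifted by each candidate disparity, and the regression step is a soft argmin over per-disparity costs. Whether H-Net follows exactly this recipe is not stated here. The function names and toy feature maps are hypothetical, and the 3D CNN between the two stages is omitted.

```python
import math


def build_cost_volume(left, right, max_disp):
    """Concatenation cost volume, indexed [channel][disparity][row][col].

    left, right: feature maps as nested lists indexed [channel][row][col].
    For each candidate disparity d, left features at column x are paired with
    right features at column x - d; columns with x < d stay zero-padded.
    """
    c, h, w = len(left), len(left[0]), len(left[0][0])
    volume = [[[[0.0] * w for _ in range(h)] for _ in range(max_disp)]
              for _ in range(2 * c)]
    for d in range(max_disp):
        for y in range(h):
            for x in range(d, w):
                for ch in range(c):
                    volume[ch][d][y][x] = left[ch][y][x]
                    volume[c + ch][d][y][x] = right[ch][y][x - d]
    return volume


def soft_argmin(costs):
    """Disparity regression: expected disparity under softmax(-cost)."""
    lowest = min(costs)  # shift costs for numerical stability
    weights = [math.exp(-(cost - lowest)) for cost in costs]
    total = sum(weights)
    return sum(d * wgt for d, wgt in enumerate(weights)) / total


# Toy one-channel features in which the true disparity is 1 pixel.
left = [[[9.0, 2.0, 3.0, 4.0]]]
right = [[[2.0, 3.0, 4.0, 5.0]]]
volume = build_cost_volume(left, right, max_disp=3)
print(len(volume), len(volume[0]), len(volume[0][0]), len(volume[0][0][0]))  # 2 3 1 4

# Stand-in for the 3D CNN output at one pixel: one cost per candidate disparity.
print(soft_argmin([5.0, 0.0, 5.0]))  # close to 1.0
```

Because the soft argmin is a weighted average and not a hard minimum, it yields sub-pixel disparities and stays differentiable, which is what allows this kind of network to be trained end to end.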

 

This edge-enhanced depth perception system will facilitate accurate 3D scene modeling, thereby promoting the development of machine vision, autonomous driving and robotics. 

 

Keywords: metasurfaces / meta-lenses / deep learning / depth perception / edge detection

 

# # # # # #

Prof. Mu Ku Chen received his Ph.D. degree from the Department of Physics at the National Taiwan University in 2019. He was a Postdoctoral Fellow in the Research Center for Applied Sciences at Academia Sinica in 2019. He was a Postdoctoral Fellow in the Department of Electronic and Information Engineering at The Hong Kong Polytechnic University from September 2019 to May 2020, and a Research Assistant Professor in the same department from June 2020 to July 2021. He is currently a Research Assistant Professor in the Department of Electrical Engineering, City University of Hong Kong. His research interests include photonic information, nanophotonics, micro- and nano-electronics fabrication, and artificial nano-antenna array based meta-devices for photonic applications.

 

Prof. Takuo Tanaka received his Ph.D. degree from Osaka University in 1996. He then joined the Faculty of Engineering Science, Osaka University, as an Assistant Professor. In 2003, he moved to RIKEN as a Research Scientist with the Nanophotonics Laboratory. He was promoted to Associate Chief Scientist in 2008 and to Chief Scientist in 2017. His research interests include three-dimensional microscopy, such as confocal and two-photon microscopy. More recently, he has been studying nanophotonics, plasmonics, and metamaterials while developing many new nanofabrication techniques. He also has experimental and theoretical experience in high-precision optical measurement and spectroscopy.

# # # # # #

Opto-Electronic Science (OES) is a peer-reviewed, open access, interdisciplinary and international journal published by The Institute of Optics and Electronics, Chinese Academy of Sciences as a sister journal of Opto-Electronic Advances (OEA, IF=15.3). OES is dedicated to providing a professional platform to promote academic exchange and accelerate innovation. OES publishes articles, reviews, and letters on fundamental breakthroughs in the basic science of optics and optoelectronics.

# # # # # #

 

More information: https://www.oejournal.org/oes

Editorial Board: https://www.oejournal.org/oes/editorialboard/list

OES is available on OE journals (https://www.oejournal.org/oes/archive)

Submissions to OES may be made using ScholarOne (https://mc03.manuscriptcentral.com/oes)

CN 51-1800/O4

ISSN 2097-0382

Contact Us: oes@ioe.ac.cn

Twitter: @OptoElectronAdv (https://twitter.com/OptoElectronAdv?lang=en)

WeChat: OE_Journal

# # # # # #


Liu XY, Zhang JC, Leng BR et al. Edge enhanced depth perception with binocular meta-lens. Opto-Electron Sci 3, 230033 (2024). doi: 10.29026/oes.2024.230033 
