MediaPipe Iris: real-time iris tracking and depth estimation (2023)

Posted by Andrey Vakunov and Dmitry Lagun, Research Engineers, Google Research

A wide range of real-world applications, including computational photography (e.g., portrait mode and glint reflections) and augmented reality effects (e.g., virtual avatars), rely on estimating eye position by tracking the iris. Once accurate iris tracking is available, we show that it is possible to determine the metric distance from the camera to the user - without using a dedicated depth sensor. This, in turn, can improve a variety of use cases, from computational photography, to virtual try-on of properly sized glasses and hats, to usability enhancements that adapt the font size to the viewer's distance.

Iris tracking is a challenging task to solve on mobile devices because of limited computing resources, variable lighting conditions and the presence of occlusions, such as hair or people squinting. Sophisticated, specialized hardware is often employed, which limits the range of devices on which the solution can be applied.

Figure: FaceMesh can be adapted to drive virtual avatars (middle). With the additional use of iris tracking (right), the avatar's liveliness is significantly enhanced.
Figure: An example of eye re-coloring enabled by MediaPipe Iris.

Today we are announcing the release of MediaPipe Iris, a new machine learning model for accurate iris estimation. Building on our work on MediaPipe Face Mesh, this model is able to track landmarks involving the iris, pupil and eye contours using a single RGB camera, in real time, without the need for specialized hardware. By using the iris landmarks, the model is also able to determine the metric distance between the subject and the camera with a relative error of less than 10%, without using a depth sensor. Note that iris tracking does not infer where people are looking, nor does it provide any form of identity recognition. Because this system is implemented in MediaPipe - an open source, cross-platform framework for researchers and developers to build world-class ML solutions and applications - it can run on most modern mobile phones, desktops, laptops and even on the web.

Figure: Usability prototype for far-sighted individuals: the observed font size remains constant regardless of the distance from the device to the user.

ML pipeline for iris tracking
The first stage of the pipeline builds on our previous work on 3D Face Meshes, which uses high-fidelity facial landmarks to generate a mesh of the approximate face geometry. From this mesh, we isolate the eye region in the original image for use in the iris tracking model. The problem is then divided into two parts: eye contour estimation and iris location. We designed a multi-task model consisting of a unified encoder with a separate component for each task, which allowed us to use task-specific training data.
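The eye-region isolation step described above can be illustrated with a minimal sketch. This is not the MediaPipe implementation: the function names `eye_crop_box` and `crop_eye`, the square crop, and the 25% margin are assumptions made for illustration, and the landmarks are assumed to be already available as pixel coordinates from a face mesh.

```python
import numpy as np


def eye_crop_box(eye_landmarks_px, image_shape, margin=0.25):
    """Return a square (x0, y0, x1, y1) box around one eye's landmarks.

    eye_landmarks_px: iterable of (x, y) pixel coordinates (hypothetical input
    taken from a face mesh). margin: fraction of the eye extent added on each
    side; 0.25 is an illustrative choice, not a value from the post.
    """
    pts = np.asarray(eye_landmarks_px, dtype=float)
    height, width = image_shape[:2]

    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0

    # Use the larger extent so the crop is square, then pad it by the margin.
    half = max(x_max - x_min, y_max - y_min) * (1.0 + 2.0 * margin) / 2.0

    # Clamp the box to the image bounds.
    x0 = int(max(0, np.floor(cx - half)))
    y0 = int(max(0, np.floor(cy - half)))
    x1 = int(min(width, np.ceil(cx + half)))
    y1 = int(min(height, np.ceil(cy + half)))
    return x0, y0, x1, y1


def crop_eye(image, eye_landmarks_px, margin=0.25):
    """Crop the eye region from an H x W x C image array."""
    x0, y0, x1, y1 = eye_crop_box(eye_landmarks_px, image.shape, margin)
    return image[y0:y1, x0:x1]
```

A crop produced this way would then be resized to whatever input size the iris model expects; that size is not stated in this post.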

Figure: Examples of iris (blue) and eyelid (red) tracking.

To train the model on the cropped eye region, we manually annotated approximately 50,000 images representing a variety of illumination conditions and head poses from geographically diverse regions, as shown below.

Figure: Eye region annotated with eyelid (red) and iris (blue) contours.
Figure: The cropped eye regions form the input to the model, which predicts landmarks using separate components.

Depth from Iris: depth estimation from a single image
Our iris tracking model is able to determine the metric distance from the subject to the camera with an error of less than 10%, without the need for specialized hardware. This relies on the fact that the horizontal iris diameter of the human eye remains approximately constant at 11.7 ± 0.5 mm across a wide population [1,2,3,4], along with some simple geometric arguments. For illustration, consider a pinhole camera model projecting onto a sensor of square pixels. The distance to the subject can be estimated from facial landmarks using the focal length of the camera, which can be obtained from the camera capture APIs or directly from the EXIF metadata of the captured image, along with other intrinsic camera parameters. Given the focal length, the distance from the subject to the camera is directly proportional to the physical size of the subject's eye, as shown below.

Figure: The distance of the subject (D) can be computed from the focal length (F) and the size of the iris using similar triangles.
Figure: Left: MediaPipe Iris predicting the metric distance in cm on a Pixel 2 from iris tracking alone, without the use of a depth sensor. Right: Ground-truth depth.
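The similar-triangles relation can be written out as a short sketch. It assumes an ideal pinhole camera with square pixels, an iris close to the optical axis and roughly facing the camera, and a focal length already known in millimetres together with the sensor width; the function names and the example camera values are hypothetical and do not come from MediaPipe.

```python
IRIS_DIAMETER_MM = 11.7  # average horizontal iris diameter quoted in the post


def focal_length_px(focal_length_mm, sensor_width_mm, image_width_px):
    """Convert a focal length in mm (e.g. from EXIF) to pixels.

    Assumes square pixels and that the image spans the full sensor width.
    """
    return focal_length_mm * image_width_px / sensor_width_mm


def depth_from_iris_mm(iris_diameter_px, focal_px,
                       iris_diameter_mm=IRIS_DIAMETER_MM):
    """Estimate the camera-to-subject distance in mm via similar triangles.

    depth / iris_diameter_mm == focal_px / iris_diameter_px
    """
    if iris_diameter_px <= 0:
        raise ValueError("iris_diameter_px must be positive")
    return focal_px * iris_diameter_mm / iris_diameter_px


# Illustrative numbers only (not measurements from the post):
# a 4 mm lens on a 5 mm wide sensor and a 1000 px wide image give 800 px.
f_px = focal_length_px(focal_length_mm=4.0, sensor_width_mm=5.0,
                       image_width_px=1000)
# An iris spanning 20 px then lands at about 468 mm, i.e. roughly 47 cm.
distance_mm = depth_from_iris_mm(iris_diameter_px=20.0, focal_px=f_px)
```

Because the estimate scales linearly with the assumed iris diameter, the ± 0.5 mm population spread alone corresponds to roughly ± 4% in depth (0.5 / 11.7), independent of any tracking error.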

To quantify the accuracy of this method, we compared it to the depth sensor on an iPhone 11 by collecting synchronized front-facing video and depth images from over 200 participants. Using a laser ranging device, we experimentally verified that the error of the iPhone 11 depth sensor is < 2% for distances up to 2 meters. Our evaluation shows that our approach to depth estimation using iris size has a mean relative error of 4.3% and a standard deviation of 2.4%. We tested our approach on participants with and without eyeglasses (not accounting for contact lenses on participants) and found that eyeglasses slightly increase the mean relative error to 4.8% (standard deviation 3.1%). We did not test this approach on participants with eye diseases (such as arcus senilis or pannus). Given that MediaPipe Iris requires no specialized hardware, these results suggest that it may be possible to obtain metric depth from a single image on devices across a wide range of price points.
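For readers who want to reproduce this kind of comparison on their own data, the mean and standard deviation of the relative error can be computed as in the sketch below. The function name is hypothetical, and the post does not say whether a population or sample standard deviation was used; the population form (NumPy's default) is an assumption here.

```python
import numpy as np


def relative_error_stats(estimated, ground_truth):
    """Return (mean, std) of |estimated - ground_truth| / ground_truth.

    Both inputs are distances in the same unit. The standard deviation is the
    population form (ddof=0), which is an assumption, not a detail from the post.
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("estimated and ground_truth must have the same shape")
    if np.any(gt <= 0):
        raise ValueError("ground_truth distances must be positive")
    rel = np.abs(est - gt) / gt
    return float(rel.mean()), float(rel.std())


# Made-up distances in cm, purely to show the call; not data from the study.
mean_err, std_err = relative_error_stats([52.0, 95.0, 147.0],
                                         [50.0, 100.0, 150.0])
```

With these made-up values the per-sample relative errors are 4%, 5% and 2%, so the mean is a little under 3.7%.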

Figure: Histogram of estimation errors (left) and comparison of the actual distance with the distance estimated from the iris (right).

Release of MediaPipe Iris
We are releasing the iris and depth estimation models as a cross-platform MediaPipe pipeline that can run on desktop, mobile and the web. As described in our recent Google Developer Blog post about MediaPipe on the web, we use WebAssembly and XNNPACK to run our Iris ML pipeline locally in the browser, without sending any data to the cloud.

Figure: Using MediaPipe's WASM stack, you can run the models locally in your browser! Left: Iris tracking. Right: Depth from Iris computed from just a photo with EXIF data. Iris tracking can be tried out here and iris depth measurement here.

Future directions
We plan to extend our MediaPipe Iris model with even more stable tracking for lower error, and to deploy it for accessibility use cases. We strongly believe in sharing code that enables reproducible research, rapid experimentation and the development of new ideas in different areas. In our documentation and the accompanying Model Card, we detail the intended uses, limitations and fairness of the model to ensure that the use of these models aligns with Google's AI Principles. Please note that any form of surveillance or identification is explicitly out of scope and not enabled by this technology. We hope that providing this iris perception functionality to the wider research and development community will result in the emergence of creative use cases, stimulating responsible new applications and new research directions.

For more MediaPipe ML solutions, see our solutions page and the Google Developer Blog for the latest updates.

Acknowledgements
We would like to thank Artsiom Ablavatski, Andrei Tkachenka, Buck Bourdon, Ivan Grishchenko and Gregory Karpiak for their support in model evaluation and data collection; Yury Kartynnik, Valentin Bazarevsky and Artsiom Ablavatski for developing the mesh technology; Aliaksandr Shyrokau and the annotation team for their diligence in data preparation; Vidhya Navalpakkam, Tomer Shekel and Kai Kohlhoff for their domain expertise; Fan Zhang, Esha Uboweja, Tyler Mullen, Michael Hays and Chuo-Ling Chang for their help integrating the model into MediaPipe; and Matthias Grundmann, Florian Schroff and Ming Guang Yong for their continued help in building this technology.



