Abstract
BACKGROUND:
Premature newborns have a higher risk of abnormal visual development and visual impairment.
OBJECTIVE:
To develop a computational methodology to help assess functional vision in premature infants by tracking iris distances.
METHODS:
This experimental study was carried out with children up to two years old. An image-capture pattern with a visual stimulus was proposed to evaluate vertical and horizontal visual tracking, visual field, vestibulo-ocular reflex, and fixation. The participants’ visual responses were filmed to compose a dataset and to develop a detection algorithm using the OpenCV library together with FaceMesh for detection and selection of the face, detection of specific facial points, and tracking of the iris positions. A feasibility study was also conducted using the videos processed by the software.
RESULTS:
Forty-one children of different ages and diagnoses participated in the experimental study, forming a robust dataset. The software tracked iris positions during the visual function evaluation stimuli. In the feasibility study, 8 children participated, divided into Pre-term and Term groups. There was no statistically significant difference between groups in any visual variable analyzed.
CONCLUSION:
The computational methodology developed was able to track the distances traveled by the iris, and thus can be used to help assess visual function in children.
Introduction
It is estimated that about 15 million children are born prematurely each year worldwide (Koullali et al., 2016). According to the World Health Organization (WHO), preterm birth is birth that occurs before 37 weeks of gestation, and it is the main cause of neonatal morbidity and mortality worldwide (Galindo-Sevilla et al., 2019).
Premature newborns (PNEB) are born at an early, immature stage of central nervous and visual system development. They therefore have a higher risk of abnormal visual development and visual impairment, whether through changes in the axial length of the eyeball, in retinal vascularization and image definition, in refraction, in optic nerve transmission, or in the maturation of the visual cortex (Brémond-Gignac et al., 2011).
The impairment of the visual system in premature infants may lead to the development of typical diseases with high incidence such as glaucoma, cataract, or retinal disorders such as retinal detachment or retinoblastoma and retinopathy of prematurity (Clark-Gambelunghe et al., 2015).
Some studies have found that early exposure to the extrauterine environment may accelerate some aspects of visual function, because visual and visuomotor experiences occur prematurely (Sale et al., 2007; Landi et al., 2007). However, this does not entirely define the quality of visual function, which also depends on brain maturity and on cortically and subcortically mediated aspects (Ricci, Cesarini, Romeo et al., 2008).
Health information technology systems can contribute to the evaluation of visual function, since they are effective tools that complement the practice of health professionals and can contribute to clinical processes with respect to productivity, patient safety, and the efficiency of health systems (Asan et al., 2015).
Eye tracking is one of the technologies used in healthcare. In this method, the position of the eye is used to determine the direction of gaze in relation to the head: a visual stimulus is offered in different directions (up and down, right and left), generating eye movements that are subsequently evaluated as saccadic movement, fixation, and smooth pursuit (Ali et al., 2021). The tool is based on video eye trackers that record data using corneal/iris reflection: light in the visible and infrared spectrum is reflected by the eyes and captured by a camera, which makes it possible to track eye movements (Hansen et al., 2005).
With the objective measurements it provides, eye tracking can improve vision assessment protocols and support the assessment of progress during rehabilitation, training, or visual therapy (Ali et al., 2021). With this in mind, an accessible eye screening test in health care units would generate economic benefits in the context of the Unified Health System (SUS), since many visual alterations, if detected and treated early, would require fewer financial resources, and early detection would also help prevent severe and irreversible visual problems (Kooiker et al., 2016). Thus, the objective of the study was to develop a computational methodology to aid in the evaluation of functional vision in newborns and infants.
Methods and procedures
Methods of experimental study
This experimental study was approved by the Research Ethics Committee CAAE (No. 49995121.7.0000.0121) and was conducted in the outpatient clinic for high-risk newborn follow-up at a public university, with children from the Neonatal Intensive Care Unit (NICU) administered by the Unified Health System (SUS) that serves several municipalities in the extreme south of Santa Catarina, Brazil. The collections for building the internal dataset took place between May 2021 and February 2022.
The inclusion criteria for participation in the study were: 1. Children up to two years of age, regardless of gestational age at birth; 2. With or without a diagnosis of visual, neurological, or genetic changes; 3. Informed Consent Form signed by the legal guardian; 4. Active alertness according to Brazelton’s neonatal behavioral assessment scale at the time of the assessment (Brazelton, 1995).
The exclusion criteria for participation in the study were: 1. Children in the immediate postoperative period of cardiac, neurological, gastrointestinal, urogynecological, and orthopedic surgeries; 2. Children with eye infections or using any medication that compromises their vision.
The study was carried out in three stages: defining the capture pattern, obtaining the images, and developing the detection algorithm.
Definition of the capture pattern
The purpose of defining the capture pattern was to limit variation between the captured videos in order to guarantee better performance of the detection algorithm. The authors and health professionals studied and tested several alternatives before arriving at the final pattern. The capture pattern used was as follows.
30-second videos were recorded with the infrared camera, with the child 30 cm away from the evaluator, and the following patterns were performed:
Horizontal visual tracking: slow pursuit movement with the stimulus directed from right to left for three repetitions.
Vertical visual tracking: slow pursuit movement with the stimulus directed from top to bottom for three repetitions.
Visual field: the evaluator presents stimuli at the right and left extremities of the visual field.
Vestibulo-ocular reflex: the vestibulo-ocular reflex movement is performed three times to the right and three times to the left, alternately, while the evaluator, who is 30 cm away, presents a frontal stimulus to the eyes.
Fixation: evaluated from the time the child fixates the gaze on the stimulus during the vestibulo-ocular reflex.
For the stimulus, we used a black and white contrast plate similar to a human face, developed and used in the visual function evaluation battery (Ricci, Cesarini, Groppo et al., 2008). The infant remained seated on the lap of the evaluator or guardian with trunk support.
Obtaining the images
The filming took place in a controlled environment, with the child in stable condition, accompanied by parents and/or guardians to obtain a robust and diverse dataset for the development of the future detection algorithm and neural network training.
To obtain the images, the evaluation was separated into two stages. The first stage was to screen the infants and interview the parents or guardians about the general data of the child and the mother and possible clinical diagnoses. Subsequently, the parents or guardians were invited to participate in the study and, after they signed the Informed Consent Form (ICF), the capture pattern protocol specified above was applied.
The images of the infant’s visual and behavioral responses were captured with a HardLine Cutie 6809® webcam modified to capture the near-infrared spectrum, positioned perpendicular to the participant’s face using supports and mobile tripods for height adjustment during acquisition.
The captures followed the protocol defined in this study, allowing the reproducibility of the results and ensuring the quality of the images. The images and videos were stored on an Avell 16 GB i7® computer to compose an image dataset.
Development of the detection system
The infant vision detection system was developed following the steps shown in Fig. 1. These steps are, in order: input of the original video into the algorithm; image processing (detection and selection of the infant’s face, detection of specific facial points, and, once the ocular structure is recognized, tracking of the iris positions); and output of the processed video, which contains the visual responses recognized by the software during image processing.

Stages of the development of the detection system.
The first step, face detection and selection, is performed using the FaceMesh library combined with OpenCV. FaceMesh detects the face and uses four corner points (upper right, upper left, lower right, and lower left) to form a square bounding the face, which is then aligned in the center of the image with OpenCV. The cropping parameters were refined to the face shapes commonly found in the child population, according to the bank of captured videos and images.
In the next step, facial point detection, the FaceMesh solution was used to find the facial points of interest: 478 points in total, 468 for the face and 10 for the eyes. The detected points can be used to identify various facial structures, such as the mouth and eyes; in the present solution, they were used for eye detection. FaceMesh was chosen for its ease of use, since it is only necessary to import the library to use its parameters, and because it is a solution with high performance and robustness (Google Mediapipe FaceMesh, 2019).
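The square bounding step can be sketched as follows. This is a minimal illustration, not the authors’ implementation: it assumes the facial points arrive as coordinates normalized to the range 0 to 1, as FaceMesh reports them, and the function name `square_face_box` and the `margin` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # normalized (x, y), each in [0, 1]


def square_face_box(landmarks: List[Point], width: int, height: int,
                    margin: float = 0.10) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a square crop around the landmarks.

    `margin` is a hypothetical padding factor; the study does not report one.
    """
    # Convert normalized coordinates to pixels.
    xs = [x * width for x, _ in landmarks]
    ys = [y * height for _, y in landmarks]

    # Center of the tight bounding rectangle.
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2

    # Side of the square: the longer rectangle side plus padding,
    # never larger than the frame itself.
    side = max(max(xs) - min(xs), max(ys) - min(ys)) * (1 + margin)
    side = min(side, width, height)
    half = side / 2

    # Shift the center so the square stays inside the frame.
    cx = min(max(cx, half), width - half)
    cy = min(max(cy, half), height - half)

    left, top = int(round(cx - half)), int(round(cy - half))
    size = int(round(side))
    return left, top, left + size, top + size
```

With a frame stored as a NumPy array, the returned box could be applied as `frame[top:bottom, left:right]` to keep only the face region.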
For eye detection, the points of interest collected in the previous stage were used together with the OpenCV image parameters. With the image parameters, the left and right edges of the eye were detected. To find the iris, a native FaceMesh function was called, which returns the center of the iris and 4 points on the edges of the iris. To get the iris radius, OpenCV’s circles function was used, which calculated the radius from the 5 returned points (the 4 on the edge and the center). With the recognition of the iris, the tracking of its traveled positions was performed, using the values saved during the iris movements.
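A minimal sketch of the radius calculation from the five iris points is given below. The text does not name the exact OpenCV function used (`cv2.minEnclosingCircle` is one candidate), so the sketch substitutes a plain mean of the center-to-edge distances. The landmark indices are an assumption based on the MediaPipe FaceMesh layout with refined landmarks enabled and should be checked against the library documentation.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # pixel coordinates (x, y)

# Assumed FaceMesh indices with refined landmarks: the first index of each
# group is the iris center, the remaining four lie on the iris edge.
IRIS_GROUP_A = (468, 469, 470, 471, 472)
IRIS_GROUP_B = (473, 474, 475, 476, 477)


def iris_circle(points: Sequence[Point]) -> Tuple[Point, float]:
    """Return (center, radius) from one iris group.

    points[0] is the iris center and points[1:] are the edge points.
    The radius is the mean center-to-edge distance, a stand-in for the
    OpenCV circle fit mentioned in the text.
    """
    center = points[0]
    edge_points = points[1:]
    radius = sum(math.dist(center, p) for p in edge_points) / len(edge_points)
    return center, radius
```

The center returned for each frame is the value that would be saved for tracking the positions traveled by the iris.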
The algorithm is based on classes that act on detection and tracking, each with its own responsibility. The main features implemented in the algorithm were: the positions traversed module, which records the positions traversed by each child’s iris; FaceMesh, which creates the landmarks of the face; the face adjustments, which adjust the landmarks to each child’s face size; and the iris detection, which detects and tracks the position of the children’s irises.
The positions traversed module accumulates the traversed positions and displays the N positions traversed in the most recent frames. The number of frames displayed is a configurable parameter, which makes the module useful for several kinds of analysis.
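One way to picture such a module is a fixed-length buffer of recent positions plus a running distance. The class below is a sketch under that assumption; the name `IrisTrail` and its attributes are hypothetical and are not taken from the study’s code.

```python
import math
from collections import deque
from typing import Deque, List, Optional, Tuple

Point = Tuple[float, float]  # pixel coordinates (x, y)


class IrisTrail:
    """Keeps the last `max_frames` iris positions and the total distance traveled."""

    def __init__(self, max_frames: int = 30) -> None:
        # deque(maxlen=...) silently drops the oldest position when full.
        self._recent: Deque[Point] = deque(maxlen=max_frames)
        self._last: Optional[Point] = None
        self.total_distance = 0.0

    def add(self, position: Point) -> None:
        """Register the iris position found in the current frame."""
        if self._last is not None:
            self.total_distance += math.dist(self._last, position)
        self._last = position
        self._recent.append(position)

    def recent(self) -> List[Point]:
        """Positions from the most recent frames, oldest first."""
        return list(self._recent)
```

The list returned by `recent()` could be drawn on each frame as a polyline, for example with OpenCV’s `cv2.polylines`, to display the path of the iris over the last N frames.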
The face point detection module uses the imported FaceMesh library, which is responsible for detecting the facial points of interest.
The facial adjustments standardize the position of the face and eyes by aligning the eyes, aligning the face, and isolating the face in the image, discarding the background. Iris detection uses the positions of the 5 points found by FaceMesh to calculate the iris radius for display.
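Eye alignment is commonly done by measuring the tilt of the line joining the two eyes and rotating by the opposite angle. The sketch below shows that geometry on points only; it is an illustration of the general technique, the function names are hypothetical, and the study does not state that alignment was implemented this way.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # pixel coordinates (x, y)


def roll_angle_deg(eye_a: Point, eye_b: Point) -> float:
    """Angle, in degrees, of the line eye_a -> eye_b against the image x-axis."""
    return math.degrees(math.atan2(eye_b[1] - eye_a[1], eye_b[0] - eye_a[0]))


def rotate_about(point: Point, pivot: Point, angle_deg: float) -> Point:
    """Rotate `point` about `pivot` by `angle_deg` using the standard 2D rotation."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + dx * math.cos(a) - dy * math.sin(a),
            pivot[1] + dx * math.sin(a) + dy * math.cos(a))


# Usage: rotating by the negative of the measured angle levels the eyes.
eye_a, eye_b = (100.0, 100.0), (200.0, 150.0)
leveled_b = rotate_about(eye_b, eye_a, -roll_angle_deg(eye_a, eye_b))
```

After the rotation, `leveled_b` has the same vertical coordinate as `eye_a`. For a whole frame, the same angle could feed an image rotation such as `cv2.getRotationMatrix2D` followed by `cv2.warpAffine`, after checking the sign convention of the rotation.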
Methods of feasibility study
A clinical feasibility study was conducted after the experimental study. The clinical feasibility study was approved by the research ethics committee CAAE and conducted in the outpatient clinic for follow-up of high-risk newborns in the extreme south of Santa Catarina, Brazil. Data were collected from May 2021 to July 2022. The study aimed to evaluate the applicability of the proposed system through the clinical evaluation of children’s visual function from the videos processed by the system.
The videos processed by the software were clinically analyzed by two independent evaluators, and a third evaluator was consulted in cases of discrepancy between the two. For vertical and horizontal visual tracking, the variables evaluated were the presence of visual alterations, such as strabismus and nystagmus, and gaze continuity, classified as typical when continuous and as atypical when not performed, brief, or discontinuous. For the vestibulo-ocular reflex, the evaluation was separated into fixation time (stable when greater than 3 seconds and unstable when less), ocular alignment (positive when present and convergent when misaligned), and left and right gaze performance (performs when the gaze remains fixed on the stimulus in the midline during head movement to the contralateral side, and does not perform when the gaze is not fixed on the stimulus).
Statistical analysis was performed using the Statistica® software (version 13.0). The Shapiro-Wilk test was used to assess normality. The descriptive analysis used mean and standard deviation for parametric data and median, minimum, and maximum for non-parametric data. In the comparative analysis, the t-test for independent samples was used for parametric data, and the Mann-Whitney U test for non-parametric data. Fisher’s exact test was used for the association between the groups of visual variables.
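In the study, fixation time was judged by the clinical evaluators. Purely as an illustration of the 3-second criterion, the sketch below shows how the same rule could be expressed over per-frame data. The per-frame flag `on_target`, the frame rate, and the function names are hypothetical inputs, not outputs of the described software.

```python
from typing import Iterable


def longest_fixation_seconds(on_target: Iterable[bool], fps: float) -> float:
    """Duration, in seconds, of the longest unbroken run of on-target frames."""
    longest = current = 0
    for flag in on_target:
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest / fps


def classify_fixation(on_target: Iterable[bool], fps: float,
                      threshold_s: float = 3.0) -> str:
    """Return 'stable' when the longest fixation exceeds the threshold.

    The text defines stable as greater than 3 seconds, so a fixation of
    exactly 3 seconds is treated here as 'unstable'.
    """
    if longest_fixation_seconds(on_target, fps) > threshold_s:
        return "stable"
    return "unstable"
```

For a 30 frames-per-second video, a run of 100 consecutive on-target frames lasts about 3.3 seconds and would be labeled stable under this rule.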
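For readers unfamiliar with Fisher’s exact test, the following sketch computes the two-sided p-value for a 2×2 table directly from the hypergeometric distribution. It is an independent illustration rather than the Statistica® procedure, and the counts in the usage line are hypothetical, not data from this study.

```python
from math import comb
from typing import Sequence


def fisher_exact_two_sided(table: Sequence[Sequence[int]]) -> float:
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)

    # Sum every table at least as unlikely as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical counts: rows are groups, columns are altered / not altered.
p_value = fisher_exact_two_sided([[3, 1], [1, 3]])
```

With these hypothetical counts for two groups of four, the p-value is 34/70, approximately 0.49. SciPy users can obtain the same quantity with `scipy.stats.fisher_exact`.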
Results
Results of the experimental study
Forty-one children participated in the study. The characteristics of the sample are described in Table 1.
Characterization of the sample
Legend: GA: gestational age; BW: birth weight.
The participants had a history of the following clinical diagnoses: prematurity, COVID-19, acute viral bronchiolitis, neonatal jaundice, testicular dystopia, strabismus, and unknown neurological syndrome.
With the construction of the internal dataset and the development and configuration of the algorithm, the software functioned according to the initial proposal, achieving its final function of tracking the positions traveled by the iris. At this stage, because the assessment videos were used only to make the dataset more robust, the children received a clinical visual function assessment from the study physiotherapists.
Figures 2 and 3 compare images without software processing and with the processing applied to track iris movement.

Image of child’s eye structure without the application of software.

Image of child’s visual response to the application of the software.
As the examples in Figs. 2 and 3 show, the software performs face detection, eye detection, and tracking of the iris movement, which is indicated by the red line drawn on the image.
In the feasibility study, children were divided into two groups: Term (n = 4) and Pre-term (n = 4). The sample characteristics are described in Table 2. Gestational age and birth weight were significantly lower in the Pre-term group than in the Term group.
Sample characteristics
Legend: SD, standard deviation; GA, gestational age; BW, birth weight; 1’: first minute; 5’: fifth minute; *p < 0.05.
The proposed system, based on the distances traveled by the iris during stimuli, was used to help in the evaluation of the visual functions of the participants. As shown in Table 3, no visual variable differed significantly between groups; however, for every variable, the Pre-term group had more participants with alterations than the Term group.
Association of visual variables
Legend: RVO, vestibulo-ocular reflex; *p < 0.05.
Discussion
The present study found that the software developed was able to track the positions traveled by the iris and that, in the feasibility study, the videos processed by the software helped professionals in the area assess the visual function of term and preterm infants.
Werchan et al. (2022) developed an eye-tracking methodology to estimate the gaze behavior of infants by extracting the face, eye, and pupil from the image using OpenCV, and found, as in the present study, that the software was able to perform gaze tracking and also showed high internal reliability and robust external validity.
In another study (Kooiker et al., 2016), eye-tracking technology was used to quantify the assessment of visual function in children with visual impairment between 1 and 14 years of age. The authors found quantitative similarity between the results of the assessment by the technology and by professionals in the area without the use of the tool; however, they did not assess children under one year of age. Thus, the literature shows visual screening methodology to be a feasible tool for the child population, with satisfactory results.
However, the cited studies use cameras such as webcams as assessment tools, filming the visual responses to stimuli that appear on a computer screen. Although this allows home assessment, the use of screens is not recommended for children until at least two years of age, and the size of the screen limits the use of the methodology in the assessment of children’s visual functions (World Health Organization, 2019).
Despite recently published studies on eye tracking in childhood, a scientific gap persists in the use of eye-tracking technology without screen stimuli and in children born prematurely. In this context, existing tools for assessing visual function in the infant population rely on physical stimuli and clinical assessment of responses, without computational tools to assist (Van Hof-van Duin et al., 1992), which supports the novelty of the present study.
Early assessment of visual functions plays an important role in preventing severe and irreversible problems in the child population. However, due to social issues, some countries have limitations in performing early visual assessments by professionals specialized in eye health. Therefore, eye-tracking technology assists professionals in assessing and treating the dysfunction appropriately, through the sequence of eye movements from one point to another (Ali et al., 2021).
In the feasibility study, there was no statistically significant difference in visual functions between term and preterm infants, although the preterm group had more participants with alterations of the visual system in all variables.
This response in visual tracking may be associated with the participants’ mean corrected age of only two months and mean chronological age of five months. Maturation of the frontal cortex, cerebellum, and caudal and supplemental frontal eye fields occurs between two and six months of life, and these regions are related to pursuit eye movements (Pieh et al., 2012).
Changes in the vestibulo-ocular reflex may be related to delayed visual maturation. The absence of the movement can occur in healthy children up to 3 weeks of life; however, as the participants in our study were older than 21 days, an absent reflex is an indication of a visual sensory defect (Costa, 2007).
Still in this context, the literature shows that the premature population, when reaching term age, may show greater visual responsiveness than term infants, supposedly due to the visual experience of the extrauterine period (Chau et al., 2013). However, studies show that at 3 months and 2 years of age, premature infants have problems with visual attention and visual fixation change (Ricci, Cesarini, Groppo et al., 2008; Atkinson and Braddick, 2007), reinforcing the need for tools to help screen the visual function of this population.
All children with a visual function disorder were invited to attend a follow-up outpatient service run by the laboratory of this study for early stimulation and continued assessment.
Thus, this study brings an innovative methodology to help evaluate visual functions in the infant population, especially premature infants, aiming to fill the scientific gap regarding the availability of technological tools for this purpose. Furthermore, this tool may help society and the children’s families by enabling early treatment and lower health expenditure, and by reaching regions that lack specialized professionals.
Conclusion
The computational methodology developed with the described protocol was able to perform visual function screening in children, helping health professionals in the evaluation of functional visual variables in premature and term infants.
Data availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Funding
The study was supported by the Foundation for Research Support of Santa Catarina (FAPESC) grant number 2021TR000344.
Acknowledgments
We are grateful to the Foundation for Research Support of Santa Catarina (FAPESC) grant and the UNIEDU/FUMDES program for the financial support.
