Abstract
Random-dot stereograms have been widely used to explore the neural mechanisms underlying binocular vision. Although they are a powerful tool to stimulate motion-in-depth (MID) perception, published results report some difficulties in the capacity to perceive MID generated by random-dot stereograms. The purpose of this study was to investigate whether the performance of MID perception could be improved using an appropriate stimulus design. Sixteen inexperienced observers participated in the experiment. A training session was carried out to improve the accuracy of MID detection before the experiment. Four aspects of stimulus design were investigated: presence of a static reference, background texture, relative disparity, and stimulus contrast. Participants’ performance in MID direction discrimination was recorded and compared to evaluate whether varying these factors helped MID perception. Results showed that only the presence of background texture had a significant effect on MID direction perception. This study provides suggestions for the design of 3D stimuli in order to facilitate MID perception.
Introduction
Good stereo vision and motion perception are valuable assets in daily life. Correctly processed binocular information allows us to have a more vivid perception of our surroundings and to make precise movements, for example, driving a car, playing sports, or observing dynamic objects in space. People with impaired stereo vision have worse performance on motor visual tasks than their peers with normal stereo vision (Grant, Melmoth, Morgan, & Finlay, 2007; Read, 2014), possibly leading to negative consequences for their quality of life (Koc, Erten, & Yurdakul, 2013).
Among the different stimuli designed to investigate stereo vision and motion perception, random-dot stereograms (RDSs) are of particular interest, as they only provide disparity and motion information. This ensures that observers base their judgment only on these two sources of information, avoiding the influence of other cues such as texture and perspective (Julesz, 1964). Here, we define an RDS as a stereogram in which dots are correlated across the eyes and across time. In each video frame, the dots in the two images of the stereo pair move in opposite directions, leading to the perception of motion-in-depth (MID) towards or away from the observer; both changing disparity (CD) and interocular velocity difference (IOVD) cues are thus present. The relationship between these elements is given (Nefs, O'Hare, & Harris, 2010) by
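In a common notation (assumed here, since the symbols are not defined in the text), let α_L(t) and α_R(t) be the horizontal angular positions of a dot in the left and right eye, and let δ = α_L − α_R be its binocular disparity. The rate of change of disparity then equals the difference between the two monocular velocities:

```latex
\frac{d\delta}{dt} = \frac{d(\alpha_L - \alpha_R)}{dt} = \frac{d\alpha_L}{dt} - \frac{d\alpha_R}{dt}
```

Read from the left, disparity is computed first and then differentiated (the CD signal); read from the right, the monocular velocities are computed first and then subtracted (the IOVD signal). In a temporally and binocularly correlated RDS, both signals are therefore available at once.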
One difficulty with using RDSs in MID studies is that MID perception with RDSs is not trivial. Large individual differences in MID perception have been reported (Barendregt, Dumoulin, & Rokers, 2014; Regan, Erkelens, & Collewijn, 1986a). Nefs et al. (2010) also reported that participants have different sensitivity for the use of CD and IOVD cues for MID perception, and many investigations of MID were carried out using small numbers of highly experienced participants or authors who might be familiar with the protocols (Brooks, 2002; Czuba, Cormack, Rokers, & Huk, 2008; Nefs et al., 2010; Shioiri, Nakajima, Kakehi, & Yaguchi, 2008).
The aim of this study was therefore to identify which stimulus factors are essential for the perception of MID when using RDSs (as defined here). This could not only facilitate studies on MID with RDSs but could also find applications in clinical practice.
It is well known that certain aspects of the design of static stereoscopic stimuli are important for correct and easy perception, such as the eccentricity of the stimulus, the contrast and the color of the signal, the luminance, or the relationship between reference and target (Chen, Shi, Tai, & Yun, 2012; Zhang et al., 2016). Perception of MID relies on different cues from those for the perception of static 3D scenes (for the impact of dynamic changes, see Brenner, Van Den Berg, & Van Damme, 1996; Tidbury, Brooks, O'Connor, & Wuerger, 2016; Watanabe et al., 2008).
The impact of stimulus features on 3D motion perception has been the subject of numerous studies before. For instance, Regan, Erkelens, and Collewijn (1986b) investigated the influence of reference marks in the visual field and the role of vergence eye movements. Brenner et al. (1996) showed that target vergence, relative disparity, and retinal image size can all affect the perceived velocity of the MID and the final perceived position (and commented that these different sources of information will be weighted differently to make the least conflict between these sources). Brooks (2001) showed that the perception of stimulus speed for stereo motion is contrast dependent (lower contrast meaning slower speed) and later investigated the issue of relative disparity pedestals (Brooks & Stone, 2004). Allison and Howard (2011) looked at the effects of several stimulus features including dot speed, temporal frequency, dot density, dot size, and dot lifetime on the percept of MID in spatially uncorrelated moving displays. However, the stimuli used in these studies were complex objects (Brenner et al., 1996) or dots placed in the same depth position which could provide cues from motion-defined boundaries. This is why, in our setup, paired dots are displayed at a random position within a depth volume.
Due to the nature of the factors considered in this study, the experiment was carried out in two parts denoted Experiment 1 (a training session) and Experiment 2 (impact of the stimulus design). Considering the relatively small size of our population sample, we decided to train the participants before measuring the impact of the stimulus design. This training session aimed at reducing individual differences in task performance. In the second experiment, we explored four stimulus factors for their effect on MID perception: presence of a static reference, background texture, relative disparity, and stimulus contrast. We also considered the impact of dot density in both experiments. Dot density in static RDS stimuli is considered to be an irrelevant factor for depth perception (Harris & Parker, 1992; Iizuka, 1992), and we wanted to check whether it has an effect on MID perception. Our hypothesis was that MID perception performance could be improved by adding or optimizing some of these factors.
General Methods
Participants
Sixteen healthy participants (25.8 ± 2.3 years old), inexperienced with RDS stimuli, were recruited. Inclusion criteria involved monocular visual acuity equal to or better than 10/10 (evaluated by a decimal scale chart), with no history of ocular pathology (functional and organic), no vertical or horizontal phoria (checked by cover test), no glasses to avoid prismatic effect (contact lenses were accepted), and stereo acuity equal to or better than 60’’ (tested by Titmus Stereo Test). Approval was obtained from Brest University Hospital institutional review board, according to the tenets of the Declaration of Helsinki. All participants were naive to the experimental procedures and informed about the nature of the study.
RDS Stimulus
The stimulus was generated within the refresh rate of the monitor on a computer (Dell Precision, M6700), using OpenGL libraries and Visual C#. Stereoscopic vision was produced by displaying the stimulus on a white Lambertian screen using a 3D projector NEC 310 (Figure 1(a)); the 3D projector works in a bottom-up mode to render the 3D scene. The images for the left and right eyes were provided to the observer by 3D goggles synchronized with the projector. The image resolution was 1,280 × 960 pixels and the projector refresh rate was 120 Hz. The fixation cross and all the RDS dots were white (392 cd/m²) on a dark (20 cd/m²) background (Figure 1(a)). The angular size of the black circle was 7° and the dot size was 15 arcmin (each dot was a 4 × 4 pixel square) at a viewing distance of 70 cm (the same viewing distance was used throughout the two experiments). The surrounding texture of the black circle was composed of randomly located black dots on a gray background (dot size: 2 × 2 pixels). During the entire stimulus exposure time, participants were asked to fixate on the white cross in the middle of the RDSs. The white cross subtended an angle of 24 arcmin and was 4.8 arcmin thick. The experiment was performed in a dark room. The only light source was the projector.
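The two luminance values above can be checked against the Michelson contrast quoted later for the dots. The following is a minimal sketch, assuming the standard Michelson definition; the function name is illustrative and not taken from the experiment software.

```python
def michelson_contrast(l_max: float, l_min: float) -> float:
    """Michelson contrast (L_max - L_min) / (L_max + L_min), as a fraction."""
    if l_max + l_min <= 0:
        raise ValueError("luminances must sum to a positive value")
    return (l_max - l_min) / (l_max + l_min)


# Luminances reported for the stimulus, in cd/m^2.
dot_luminance = 392.0
background_luminance = 20.0

contrast_percent = 100.0 * michelson_contrast(dot_luminance, background_luminance)
print(f"{contrast_percent:.1f}%")  # 90.3%
```

The result, 372/412, rounds to 90.3%, which matches the dot contrast reported for the stimulus.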
(a) Stimulus images for the left and right eye (images are shown successively either to the left or the right eye and not to both at the same time). The inlay image (bottom right) illustrates the moving direction of the dots. (b) Top view: in-depth distribution of the dots when the RDSs are observed by the participant through 3D glasses.
Pairs of correlated dots were displayed at a random position within a depth volume (Figure 1(b)), so that participants perceived a cloud of dots moving in depth instead of a dot plane (the edge of the plane moving in depth could provide some monocular cues to the perception of motion direction; Barendregt et al., 2014). The volume size was ±0.6° (9 cm behind the screen, 6 cm in front of the screen). The RDSs were then shifted in each frame, giving the impression that the dots were moving towards or away from the observer with a velocity of 0.6°/s (Figure 1(b)). Dots disappeared after a 1 s lifetime or if they reached the volume borders. New dots were then generated and located at the back of the depth volume if the original MID direction was towards the participant when the trial began. Similarly, if the original MID direction was away from participants, the newly generated dots were located at the front of the depth volume. To make the disappearance of the dots at the edges of the volume smoother, we increased the stimulus volume up to ±0.9°, that is, we added ±0.3° to the initial volume, and within this part of the stimulus volume, the dots’ Michelson contrast was gradually reduced from 90.3% to 0% (the luminance of the background was unchanged). When the dots reached the border of the volume, they were relocated at a random position on the other side of the volume, so the participants could not anticipate the direction of the dots from trial to trial.
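The contrast fade at the edges of the volume can be sketched as a function of a dot's disparity. This is only an illustration: the text says the contrast was "gradually reduced", so the linear ramp, the function name, and the constants' names below are assumptions rather than the actual implementation.

```python
CORE_HALF_DEPTH_DEG = 0.6  # core volume: +/-0.6 deg of disparity
FADE_WIDTH_DEG = 0.3       # added margin on each side, giving +/-0.9 deg
FULL_CONTRAST = 0.903      # Michelson contrast of the dots inside the core


def dot_contrast(disparity_deg: float) -> float:
    """Contrast of a dot at a given disparity (linear fade is an assumption)."""
    excess = abs(disparity_deg) - CORE_HALF_DEPTH_DEG
    if excess <= 0.0:
        return FULL_CONTRAST          # inside the core volume
    if excess >= FADE_WIDTH_DEG:
        return 0.0                    # at or beyond the outer border
    # Inside the fade zone: ramp from full contrast down to zero.
    return FULL_CONTRAST * (1.0 - excess / FADE_WIDTH_DEG)


for d in (0.0, 0.6, 0.75, 0.9):
    print(d, round(dot_contrast(d), 4))
```

With this ramp a dot halfway through the fade zone (0.75°) is shown at half of the full contrast, and a dot at ±0.9° is invisible, so dots never vanish abruptly at the border.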
Experiment 1
Protocol and methods
Improvement with training of static stereo stimulus perception among naïve participants or participants who have impaired stereo vision has been demonstrated (Chang, Kourtzi, & Welchman, 2013; Ding & Levi, 2011; Sakano & Allison, 2014; Skrandies, Jedynak, & Fahle, 2001). We carried out a training session to familiarize participants with the experiment and to improve their performance in MID direction detection. Sixteen participants (all those recruited) participated in this experiment. Because our aim was to improve MID perception, we did not use a control group to assess if the improvement was related to learning only or also to practice. The stimulus is shown in Figure 1.
This experiment was conducted in three phases.
In Phase 1 (pre-training), each participant underwent 60 trials using the RDS stimulus: 10 trials for each of the six considered dot densities, that is, 1 dot/°(A), 6 dots/°(A), 10 dots/°(A) and 1 dot/°(T), 6 dots/°(T) and 10 dots/°(T), where the “A” stands for dot cloud moving away and the “T” stands for dot cloud moving towards the observer. For each trial, one of the dot densities was chosen randomly. RDS stimuli were presented to participants for 1.8 s, followed by 1 s with no RDS. During these 2.8 s, participants had to choose between two motion directions (towards or away from the participant) using a joystick. If no answer was given, the following trial started automatically and a null answer was recorded.
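The structure of one 60-trial series can be sketched as follows. This is a hedged illustration: shuffling a balanced list is one way to obtain a random order with exactly 10 trials per condition, and is an assumption about how the randomization was done; all names are hypothetical.

```python
import random

DENSITIES = (1, 6, 10)      # dots per degree, as listed above
DIRECTIONS = ("A", "T")     # A = away from, T = towards the observer
TRIALS_PER_CONDITION = 10
STIMULUS_S = 1.8            # RDS presentation time, in seconds
BLANK_S = 1.0               # interval without RDS, in seconds


def build_series(rng: random.Random) -> list:
    """Return one shuffled series of (density, direction) trials."""
    trials = [
        (density, direction)
        for density in DENSITIES
        for direction in DIRECTIONS
        for _ in range(TRIALS_PER_CONDITION)
    ]
    rng.shuffle(trials)  # assumption: balanced list presented in random order
    return trials


series = build_series(random.Random(0))
response_window_s = STIMULUS_S + BLANK_S  # time available to answer each trial
print(len(series), response_window_s)
```

Each of the six conditions appears exactly 10 times, and a full series lasts 60 × 2.8 s, that is, a little under three minutes.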
In Phase 2 (training), participants received some training to help them to improve their MID perception. Each participant underwent a series of 60 trials as in Phase 1, but after each response, they were told if they had estimated the direction of motion correctly (Watanabe, 2015). Phase 2 stopped if the percentage of correct answers at the end of a series had reached 80% for this series (i.e., of 60 trials) or after three series (whether the percentage of correct answers was improved or not). A maximum number of three series was chosen because we aimed to train the participants rapidly with minimum fatigue.
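The stopping rule for Phase 2 can be written as a small predicate. This is a sketch of the rule as described above, with hypothetical names; "had reached 80%" is read here as "at least 80%".

```python
def training_should_stop(series_completed: int, n_correct: int,
                         n_trials: int = 60, criterion: float = 0.80,
                         max_series: int = 3) -> bool:
    """True if training ends after the series that was just completed."""
    reached_criterion = n_correct / n_trials >= criterion
    out_of_series = series_completed >= max_series
    return reached_criterion or out_of_series


print(training_should_stop(1, 50))  # True: 50/60 is above the 80% criterion
print(training_should_stop(2, 47))  # False: below criterion, one series left
print(training_should_stop(3, 30))  # True: third series is always the last
```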
In Phase 3 (post-training), participants again underwent 60 trials as in Phase 1. The results are presented in Figure 2 (dashed line).
Mean participant performances for pre-training and post-training experiment (16 participants). Horizontal axis is dot density per degree, where “A” stands for the dot cloud moving away and the “T” stands for the dot cloud moving toward the observer. The error bars designate the standard error of the mean.
Results and discussion
In this study, naïve participants were trained before the main experiments were carried out. Their performance before and after training was recorded and evaluated in terms of accuracy in MID direction discrimination.
As illustrated in Figure 2, participants’ performance in MID direction discrimination clearly improved after the training session. In the pre-training experiment, the average of all correct-answer rates (16 participants, 6 dot densities) was 59.2% ± 9.9% (mean ± 1 SD), which meant that participants had difficulty detecting the correct MID direction. According to the two-way repeated measures ANOVA (two factors: direction, dot density), performance for the “towards” direction was not significantly better than for the “away” direction, F(1, 15) = 0.76, p = .57. The perception of stereoscopic motion can be created by monocular cues (expanded/contracted stimuli) or by binocular disparity. Asymmetries in the sensitivity to MID perception using monocular cues (centrifugal/centripetal stimuli) have been reported (Edwards & Badcock, 1993; Lewis & McBeath, 2004). In the current study, the results did not reveal differences between the perception of approaching and receding stereoscopic motion generated by binocular disparity. This discrepancy might be related to the relatively small amplitude of the stereoscopic motion (0.6°) and the lack of geometrical cues (the size of the retinal image of a real object moving in depth would vary). In the post-training session, the average of all correct-answer rates was 85.8% ± 20.2%. The mean improvement in MID direction perception was 45.5% ± 30.6% (i.e., mean ± 1 SD of all the individual improvements). However, four participants did not improve after training (hence the very large standard deviation here with respect to the mean). They reported difficulty in perceiving MID before and during the training, and their answers were close to chance level (i.e., 50%) in the pre- and post-training tests.
The statistical analysis (three-way repeated measures ANOVA; factors: direction, dot density, training effect) indicated that the difference in correct answer percentage between pre-training and post-training was significant, F(1, 15) = 14.25, p < .001. This result confirmed previous studies reporting that perceptual training could improve visual performance (Ding & Levi, 2011; Di, Vincent, Xin-Zhu, & Jean-Louis, 2017). The improvement did not vary with dot density, F(2, 30) = 1.52, p = .24.
Experiment 2
Methods
The aim of this second experiment was to assess the influence of the stimulus design on MID perception. The same RDS stimulus as in Experiment 1 was used, but the influence of four different factors was investigated as illustrated in Figure 3(a-f).
The different stimulus protocols used in Experiment 2 to investigate the influence of the stimulus on MID perception: (a) neither background texture nor fixation cross is presented, (b) both background texture and fixation cross are presented, (c) only the fixation cross is presented, (d) only the background texture is presented, and the position of background texture and fixation cross in depth is changed (e) to the back of the depth volume and (f) to the front of the depth volume.
The presence of a fixation cross (Figure 3(a) and (c)): Since the fixation cross is a static object, it may facilitate the participants’ ability to discriminate the dot motion. The blue circle that appears in Figure 3(a) and (c) was not part of the stimulus. It was only represented in the figure for convenience to illustrate the area used for the RDSs. Participants were instructed to fixate on the fixation cross throughout the experiment. When there was no fixation cross on the screen (Figure 3(a)), participants had to fixate the center of the stimulus. Participants were asked to report the direction of MID perception using a joystick as soon as they perceived it, and this was the case for the whole experiment.
The presence of a background texture surrounding the RDSs (Figure 3(a) and (d)): The background texture was composed of randomly located gray dots (2 × 2 pixels). Its presence provided a static reference for the participants to facilitate accommodation and vergence.
The stimulus contrast (Figure 3(b); i.e., same stimulus with a lower signal contrast). Two additional levels of contrast were tested: 80% and 62%.
Finally, the relative disparity between the fixation stimulus and the RDSs (Figure 3(a) and (e)). Three types of relative disparity were compared. In the first, the fixation cross and background texture were at the same location in depth as the screen. In the other two, the fixation cross and background texture were at the back (Figure 3(e)) or at the front of the depth volume (Figure 3(f)). Since relative disparity was larger in Figure 3(e), the rationale to involve this factor was to investigate whether MID direction detection is related to the value of relative disparity (the fixation cross was moved to the back or front of the depth volume, together with the background texture). The location-in-depth of the fixation cross and background texture was changed only when the relative disparity was considered as a factor. In Figure 3(a) to (d), the fixation cross and background texture are at the same depth as the screen (zero disparity).
Only the 12 participants from Experiment 1 who were able to easily detect MID participated in this experiment (i.e., participants who had at least 80% correct performance in detecting MID direction). For the four other participants, whose performances were basically at chance level after training, we considered that the variations of stimulus factors would not induce significant changes, and so they were not invited to participate in Experiment 2.
To assess the influence of these factors, each participant underwent, for each factor, six series of 60 trials (10 trials for each of the six considered dot densities as in the training session). No feedback was given. In the absence of the fixation cross, participants were instructed to look at the center of the black area. Otherwise, participants were instructed to fixate the cross. The order of the series was randomized to reduce the impact of the learning effect.
In order to assess the participants’ fixation during the second experiment, an eye-tracker (Face lab 5) was used to monitor gaze position (3D) during each stimulus condition (after a calibration phase). With the help of the data analysis software (Eyework), the fixation time on a certain area of the screen could be extracted. The fixation duration (as a percentage of the total experiment time) is defined as the amount of time spent looking into a square area centered on fixation divided by the duration of the session. In order to compare the stability of eye fixation in static depth perception and dynamic stereo perception, we measured the fixation time using a static stimulus. This stimulus is a single frame extracted randomly from the ones used in the main experiment (Figure 3(b)) and displayed for 60 s.
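The fixation duration defined above can be sketched as the share of gaze samples falling inside a square centered on the fixation point. This is a minimal illustration and not the algorithm of the analysis software: it assumes uniformly sampled 2D gaze positions (so the share of samples equals the share of time), and the function name and the size of the square are hypothetical, since the size is not given in the text.

```python
def fixation_percentage(gaze_samples, center, half_width):
    """Percentage of gaze samples inside a square centered on `center`.

    gaze_samples: iterable of (x, y) positions sampled at a constant rate.
    half_width:   half the side of the square, in the same units
                  (hypothetical value; not specified in the text).
    """
    samples = list(gaze_samples)
    if not samples:
        raise ValueError("no gaze samples")
    cx, cy = center
    inside = sum(
        1 for x, y in samples
        if abs(x - cx) <= half_width and abs(y - cy) <= half_width
    )
    return 100.0 * inside / len(samples)


# Toy data: two of the four samples fall inside a square of half-width 1.
toy_samples = [(0.0, 0.0), (0.5, 0.5), (2.0, 0.0), (0.0, -3.0)]
print(fixation_percentage(toy_samples, center=(0.0, 0.0), half_width=1.0))  # 50.0
```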
Results and discussion
Four stimulus features that could be critical to MID perception were investigated: background texture, fixation cross, contrast, and relative disparity.
Reference object: background texture or fixation cross
Two reference objects were used in the experiment: the background texture and the fixation cross. MID detection performance when both background texture and fixation cross were presented was significantly better than when no reference (background or cross) was presented, F(1, 11) = 8.00, p < .001. The presence of the background texture alone also improved MID detection performance when compared to the “no reference” case, F(1, 11) = 7.00, p < .001. In contrast, the effect of the fixation cross alone (i.e., no reference vs. only fixation cross) was not statistically significant, F(1, 11) = 1.59, p = .12. It could be assumed that the presence of the texture helped to achieve a more stable fixation by providing a larger fixed reference than the fixation cross alone. In addition, the texture of the background might help the observers to accommodate on the screen, which could further facilitate MID perception. The lack of impact of the fixation cross might be due to its limited size.
Effect of the signal contrast
Three levels of signal contrast were compared: 90.3%, 80%, and 62% (note that as only the luminance of the dots was reduced, the luminance of the entire area was also reduced). As illustrated in Figure 5, contrast was not a significant factor in the current study, F(2, 22) = 0.295, p = .08. This result was quite surprising, since high contrast can lead to better performance in static depth perception (Halpern & Blake, 1988). For this reason, we further analyzed the participants’ response time (for each trial under the different contrast levels). The statistical analysis indicated that responses were faster with lower contrast, F(2, 22) = 5.41, p < .001.
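The three contrast levels can be related to dot luminance with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: the contrasts are Michelson contrasts, and the background stays at 20 cd/m² while only the dots are dimmed. The resulting luminances are therefore illustrative estimates and not reported measurements.

```python
def dot_luminance_for_contrast(contrast: float, background: float) -> float:
    """Dot luminance giving a target Michelson contrast on a fixed background.

    Solves C = (L_dot - L_bg) / (L_dot + L_bg) for L_dot.
    """
    if not 0.0 <= contrast < 1.0:
        raise ValueError("contrast must be in [0, 1)")
    return background * (1.0 + contrast) / (1.0 - contrast)


BACKGROUND_CD_M2 = 20.0  # assumed constant across contrast levels

for level in (0.903, 0.80, 0.62):
    luminance = dot_luminance_for_contrast(level, BACKGROUND_CD_M2)
    print(f"contrast {level:.3f} -> dots at about {luminance:.0f} cd/m^2")
```

Under these assumptions the highest level corresponds to dots at about 392 cd/m², and the two lower levels to dots at roughly 180 and 85 cd/m², which illustrates how much overall luminance drops as contrast is reduced.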
Influence of reference objects on the percentage of correct answers for detecting MID direction. The error bars designate standard error of the mean.
Influence of contrast on the percentage of correct answers for detecting MID direction. The error bars designate standard error of the mean.

Contrast is one of the most critical factors for image quality. It can affect performance in object detection and motion discrimination (Nakayama & Silverman, 1985). Contrast range can even affect judgment of object speed (Brooks, 2001; Thompson, Brooks, & Hammett, 2006). In previous studies of MID perception, the rationale behind the choice of contrast for stereo motion perception has rarely been mentioned; some studies used mid-gray contrast while others used maximum contrast (Czuba, Rokers, Guillet, Huk, & Cormack, 2011; Sakano & Allison, 2014). In our study, the variation of stimulus contrast did not lead to any significant changes in accuracy. One possible reason is a ceiling effect: since participants’ performance had reached its peak with the optimal stimulus (background texture plus fixation cross), the change of contrast was not sufficient to affect it. Another reason could be the limited range of contrasts considered; performance might be affected if the contrast were further reduced (Simmons, 1998). In the current study, the analysis of response time showed that participants responded faster in MID direction discrimination at lower contrast. Although this may appear counterintuitive, it is in agreement with results on suppressive/summative mechanisms in visual motion processing (Tadin, 2015).
Effect of relative disparity
The position of the reference objects (fixation cross and background texture) was varied in order to test whether larger relative disparities could help MID perception. When the references were at the same depth as the screen (zero disparity), the largest relative disparity was 0.9°. When the reference objects were at the back or front of the depth volume, the largest relative disparity was 1.8°. The results are shown in Figure 6. Surprisingly, we observed that changing the relative disparity did not significantly affect performance in MID perception, F(2, 22) = 0.33, p = .72.
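The two maxima follow from simple arithmetic, assuming that the dots span the full extended volume of ±0.9° and writing d_ref for the disparity of the reference objects (a symbol introduced here for illustration):

```latex
\max_{\text{dots}} \left| d_{\text{dot}} - d_{\text{ref}} \right|
  = 0.9^\circ + \left| d_{\text{ref}} \right|
  =
  \begin{cases}
    0.9^\circ, & d_{\text{ref}} = 0^\circ \ \text{(reference at screen depth)} \\
    1.8^\circ, & d_{\text{ref}} = \pm 0.9^\circ \ \text{(reference at the back or front)}
  \end{cases}
```

Moving the reference to either edge of the volume thus doubles the largest relative disparity without changing the dots themselves.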
Relative disparity impact on percentage of correct answers for detecting MID direction. The error bars designate standard error of the mean.
For static stereo perception, the relative disparity is important for the observers to identify the relative positions of the objects in depth (Adrien, Dennis, David, & Daphne, 2016). Our results highlight that MID perception relied on different mechanisms from those for static depth perception (i.e., CD and IOVD cues; Harris, Nefs, & Grafton, 2008).
Fixation time
Percentage of Fixation Time on Cross Versus Experiment Duration.
Strong individual differences were also observed in the fixation time and in how it changed between the static test (reference; RDSs displayed on the screen without motion) and the dynamic tests (RDSs moving in depth). The fixation time varied significantly across the three dynamic stimulus conditions, F(2, 8) = 8.75, p = .0097, indicating that fixation stability could be one reason for differences in MID perception performance with dynamic stimuli (the presence of fixation cross and background texture).
Conclusion
Perceiving MID in random dot stereograms is not a trivial task. Results from the first experiment showed that a combination of training and practice can help some participants to significantly improve their performance for detecting MID. These results suggest that the degree of experience of the participant (from “first timer” to highly experienced) may limit the possibility of experiment replication. The impact of training was however not the aim of this study and the difference in impact between training (i.e., practice with feedback) and practice alone should be investigated in another study as well as why some participants (4 out of 16 here) did not improve their performance.
Results from Experiment 2 showed that even if a “training” session can allow most of the participants to significantly improve their MID perception, some specific stimulus features, particularly a static background texture, can further improve the accuracy of MID direction discrimination. (A possible explanation is that this texture introduces a static reference against which the dot movement is highlighted.)
The other stimulus factors that we considered, that is, the dot density, the presence of a fixation target, the relative disparity, and the dot contrast, did not significantly change the observers’ performances. The limited size of the cross used may explain our results, as may the fact that fixation stability does not seem to be a critical factor for MID perception. The lack of impact of the contrast (albeit with a shorter response time) may be due to the limited range of contrast variation we explored, and additional experiments need to be carried out to explore the optimum dot contrast and luminance to facilitate MID perception.
This study emphasizes the importance of 3D stimulus design for MID perception tests using RDSs. In a further study, we plan to investigate the influence of additional factors (speed, size, number, coherence, stimulus duration, etc.) to improve the protocol and to further reduce individual variations in MID perception. This could improve RDSs/MID as a powerful research tool, not only to investigate the associated neural mechanisms but also to help generate new tools for clinical optometry.
Acknowledgements
The authors thank Yulia Fattakhova, Luay Ahdab, and students and staff of Telecom Bretagne for their help and participation in the experiments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
