Abstract
We examined whether the thresholds of motion and depth perception produced by motion parallax could be specified by the concept of a disparity gradient. We manipulated both the motion parallax amplitude and the angular separation of two dots and calculated the percentages of trials in which participants perceived motion or depth. The results showed that the amplitude of motion parallax for the threshold increased as the separation became larger with the gradients of 0.023, 0.072, and 0.430 for the lower depth, the lower motion, and the upper depth thresholds, respectively. These findings indicate that the gradient is a useful concept to specify the motion and depth thresholds together rather than parallax amplitude alone.
It is known that observer-produced motion parallax provides the same depth information as binocular disparity (e.g., Bourdon, 1898; Graham & Rogers, 1982; Heine, 1905). However, it is still unknown whether the concept of gradient that is used in binocular vision (e.g., Burt & Julesz, 1980a, 1980b) is applicable to the thresholds of motion parallax. The concept of gradient in binocular vision can be applied to motion parallax because both depth cues are calculated from the same triangulation geometry. The question here, however, is whether the gradient for motion parallax can describe both the motion thresholds (for what Gogel, 1990, called concomitant motion) and the depth thresholds.
In addition to the depth thresholds, the motion threshold is key to understanding motion parallax perception (H. Ono & Ujike, 2005; M. E. Ono, Rivest, & Ono, 1986). Ono and Ujike demonstrated that motion parallax produces four different motion or depth perceptions as a function of the amplitude of the motion parallax: (a) no motion and no depth when the amplitude of parallax is small (below both the motion threshold and the depth threshold), (b) no motion with depth when the amplitude is larger, (c) motion with depth when the amplitude is even larger, and (d) motion with no depth when the amplitude is very large. With observer-produced motion parallax, unlike object-produced motion parallax, the locations of objects can be perceived as stationary ([b] above) as long as the displacements of the retinal images are fully converted to depth information (i.e., location constancy in Ono & Ujike). If there is a nonconverted portion, then the displacement of retinal images is perceived as the objects’ motion (Sakurai & Ono, 2000). Ono and Ujike argue that the limit of motion parallax for location constancy is analogous to the limit of binocular disparity for fusion.
By measuring motion thresholds in addition to depth thresholds, as in H. Ono and Ujike (2005), we tested whether the concept of gradient (Burt & Julesz, 1980a, 1980b) is applicable to motion parallax. The reason for measuring both is that an analogy can be made between patent stereopsis and motion perception (i.e., a lack of location constancy). Specifically, the upper limit of stationary perception (i.e., depth without motion) can be said to correspond to the upper limit of fused stereopsis (i.e., depth without diplopia). We defined the gradient for a motion parallax stimulus as the equivalent disparity between two points divided by the angular separation between those two points. Equivalent disparity (see Rogers & Graham, 1982) is the relative retinal displacement of two points when the head moves by an amount equal to the interocular separation; we used 6.5 cm in this study. (The word disparity used alone henceforth means equivalent disparity.) By manipulating the amplitudes of both the disparity and the angular separation, we obtained disparity thresholds for motion and depth perception for each separation. The disparity thresholds were plotted as a function of angular separation, and best fit lines were computed. The slope of the best fit line represents the disparity gradient.
Method
Participants
Seven observers (three males, including the first author, and four females) participated. They were aged from 20 to 35 years (M = 24.2, standard deviation [SD] = 5.0). (Although one more male participated, we did not include his data because the range of the experimental conditions was insufficient for his thresholds.) All had normal or corrected-to-normal vision and, except for the first author, were naive as to the purpose of the experiment. This study was approved by the Office of Research Ethics at York University, Toronto, Canada and was conducted in accordance with the Declaration of Helsinki.
Stimuli and Apparatus
A chinrest on a rail was laterally movable over a 13 cm span. The stimuli were two white dots presented at eye level on a computer screen (a 17-inch Dell cathode-ray tube display). The diameter of the dots was 0.08°. The experimental room was made as dark as possible. The background on the screen was black, and the frame of the screen was not visible; only the dots were visible. Some participants might have seen a very dim frame of the screen after dark adaptation. We did not ask the naive participants about this, but the author who participated did not see the frame at any time during the experiment. The movement of the dots was yoked to that of the chinrest. One dot moved in the same direction as the chinrest, and the other moved in the opposite direction; both moved by the same extent. The viewing distance was 114 cm. There were five levels of horizontal separation between the dots (see Note 1): 0′, 15′, 30′, 60′, and 120′ when the chinrest was at the central position (the prime symbol denotes arc min). We used 10 disparity levels of 0′, 0.3′, 0.6′, 1.2′, 2.4′, 4.8′, 9.6′, 19.2′, 38.4′, and 76.8′ for the first three participants and added a 153.6′ condition for the other four participants (see Participants section). For a 13 cm head movement, the disparity was equal to the visual angle of the amplitude of the motion of one dot (e.g., when each dot on the screen moved left to right or right to left by 0.3′ of visual angle, the disparity calculated from the two dots was 0.3′). The disparities simulated 0, 0.17, 0.35, 0.70, 1.40, 2.79, 5.58, 11.17, 22.33, 44.67, and 89.33 cm of depth. There were two directions of depth; either the left or the right dot was simulated to appear closer. The yoking of the dot and head movement was implemented using a Nintendo Wii remote controller that was attached to the chinrest.
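As an illustration, the simulated depths listed above are consistent with the small-angle approximation depth ≈ disparity (in radians) × D² / I, with a viewing distance D of 114 cm and a head displacement I of 6.5 cm. The text does not state which formula was used, so the following Python sketch is an assumption that happens to reproduce the listed values; the function name `simulated_depth_cm` is hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 114.0  # viewing distance given in the text
INTEROCULAR_CM = 6.5         # head displacement that defines equivalent disparity

def simulated_depth_cm(disparity_arcmin):
    """Depth simulated by an equivalent disparity.

    Assumes the small-angle approximation depth = disparity * D**2 / I,
    which is not stated in the text but matches the listed depths.
    """
    disparity_rad = math.radians(disparity_arcmin / 60.0)
    return disparity_rad * VIEWING_DISTANCE_CM ** 2 / INTEROCULAR_CM

# Disparity levels from the text, in arc min.
DISPARITIES_ARCMIN = [0, 0.3, 0.6, 1.2, 2.4, 4.8, 9.6, 19.2, 38.4, 76.8, 153.6]
DEPTHS_CM = [round(simulated_depth_cm(d), 2) for d in DISPARITIES_ARCMIN]
```

Because the approximation is linear in disparity, each doubling of disparity doubles the simulated depth.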
Procedure
After receiving the instructions and signing the informed consent form, participants sat behind the chinrest and covered their nonpreferred eye with an eye patch. The preferred eye was determined by checking which eye a participant used when looking through a small hole in a card held near her or his face. The trials were performed in a completely darkened room; only the two dots were visible.
Each trial began with a short beep, after which the stimulus was presented for 10 seconds, followed by a blank (black) screen (Figure 1). Participants were instructed to move their head laterally back and forth during each trial while fixating on the center of the display (there were no fixation marks, but the center of the display was always at the midpoint between the dots). When the chinrest touched the end of the rail, it made a click sound; the observer then moved their head back in the opposite direction to the other end, which also produced a click sound. Participants were trained to produce a head motion frequency of approximately 0.67 Hz (i.e., 0.75 seconds for one way).
The task was to report (a) whether the dot(s) appeared to move—yes or no and (b) which dot appeared closer—left or right. Participants were asked to report “motion” when at least one of the two dots appeared to move. They were asked to report “no-motion” as long as the dots appeared stationary in the simulated three-dimensional space, even if a change in the visual (headcentric) directions of the dots was detectable on the display (i.e., location constancy). The participants gave their responses verbally.
Each block of trials consisted of all possible combinations of disparity, separation, and depth direction; there were 100 trials for three participants and 110 trials for four participants. The trial order was randomized within each block. Running one block took approximately 30 minutes. All participants completed six blocks of trials on separate days.
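The block structure described above can be sketched as a full factorial design. This is a minimal Python illustration, not the software used in the experiment; the names `make_block`, `block_a`, and `block_b` and the fixed random seed are hypothetical.

```python
import itertools
import random

# Stimulus levels taken from the Stimuli and Apparatus section (arc min).
SEPARATIONS_ARCMIN = [0, 15, 30, 60, 120]
DISPARITIES_10 = [0, 0.3, 0.6, 1.2, 2.4, 4.8, 9.6, 19.2, 38.4, 76.8]
DISPARITIES_11 = DISPARITIES_10 + [153.6]
NEAR_DOT = ["left", "right"]  # which dot is simulated to appear closer

def make_block(disparities, rng):
    """Return one block: every (disparity, separation, near dot) once, shuffled."""
    trials = list(itertools.product(disparities, SEPARATIONS_ARCMIN, NEAR_DOT))
    rng.shuffle(trials)
    return trials

rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
block_a = make_block(DISPARITIES_10, rng)  # 10 x 5 x 2 combinations
block_b = make_block(DISPARITIES_11, rng)  # 11 x 5 x 2 combinations
```

The block lengths follow directly from the number of levels: 10 × 5 × 2 = 100 and 11 × 5 × 2 = 110.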
Before the six blocks of trials, all participants completed two blocks as training. The training trials were the same as those in the main experimental session. During the training sessions, participants learned to maintain the amplitude (13 cm) and the frequency (0.67 Hz) of their head movement by synchronizing to the sound of a metronome. We measured neither the speed nor the exact span of head motion during the experimental trials.
Results
Thresholds for Each Separation
Motion
We calculated the percentage of trials in which concomitant motion was perceived for each separation and disparity condition, for each participant. Generally, the mean percentage across observers increased as a function of disparity, in a sigmoidal manner (Figure 2).

Figure 1. Schematic of the sequence of a trial.

Figure 2. Mean percentage of trials in which motion was seen as a function of disparity for each separation. The threshold values shifted as the separation increased. The horizontal solid line indicates the 50% level.
Next, we estimated the motion threshold (the 50% point) for each separation by fitting a normal distribution function to the percentage data and obtaining its parameters. These thresholds were computed for each participant. For the separations of 15′, 30′, 60′, and 120′, the mean disparity thresholds across participants were 2.92′, 3.95′, 6.21′, and 9.17′ (SD = 1.46′, 1.54′, 3.15′, and 5.80′), respectively.
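One way to obtain a 50% point from such data is a least-squares fit of a cumulative normal function, whose mean is the threshold. The Python sketch below is only an illustration under several assumptions that the text does not confirm: the fit is done on a linear disparity axis, by least squares, with a simple coarse-to-fine grid search. The function name `fit_cumulative_normal` and the response proportions are hypothetical.

```python
from statistics import NormalDist

def fit_cumulative_normal(xs, proportions, grid=41, rounds=6):
    """Least-squares fit of a cumulative normal; returns (mean, sd).

    Uses a coarse-to-fine grid search over the mean and the standard
    deviation. The mean of the fitted function is the 50% threshold.
    """
    mu_lo, mu_hi = min(xs), max(xs)
    span = mu_hi - mu_lo
    sd_lo, sd_hi = span / 100.0, span
    best = None  # (squared error, mean, sd)
    for _ in range(rounds):
        for i in range(grid):
            mu = mu_lo + (mu_hi - mu_lo) * i / (grid - 1)
            for j in range(grid):
                sd = sd_lo + (sd_hi - sd_lo) * j / (grid - 1)
                dist = NormalDist(mu, sd)
                err = sum((dist.cdf(x) - p) ** 2 for x, p in zip(xs, proportions))
                if best is None or err < best[0]:
                    best = (err, mu, sd)
        # Halve the search window around the best point found so far.
        _, mu_b, sd_b = best
        mu_w, sd_w = (mu_hi - mu_lo) / 4.0, (sd_hi - sd_lo) / 4.0
        mu_lo, mu_hi = mu_b - mu_w, mu_b + mu_w
        sd_lo, sd_hi = max(sd_b - sd_w, 1e-6), sd_b + sd_w
    return best[1], best[2]

# Hypothetical proportions of "motion" reports for one separation.
disparities = [0, 0.3, 0.6, 1.2, 2.4, 4.8, 9.6, 19.2]
seen_motion = [0.0, 0.02, 0.05, 0.15, 0.40, 0.75, 0.97, 1.0]
threshold, spread = fit_cumulative_normal(disparities, seen_motion)
```

A maximum-likelihood fit, or a fit on a logarithmic disparity axis, would be equally plausible readings of the procedure and could give somewhat different thresholds.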
Depth
We calculated the percentage of trials in which simulated depth was seen for each combination of separation and disparity conditions, for each participant. Generally, the percentage was below the 75% criterion when the disparity was small, above it when the disparity was in the midrange, and below it again when the disparity was large (Figure 3). Hence, the percentage as a function of disparity crossed or touched the criterion (75%) at least twice. The exception to this general shape was the 0′ separation (a gradient of infinity); the function never crossed or touched the criterion at any disparity for any participant. Next, we estimated the intersections of the percentage function and the 75% level. We defined the disparities at these intersections as the lower and upper thresholds, which represent the range of location constancy. Each intersection was estimated using linear approximation between the two nearest data points. These thresholds were computed for each participant. The function touched or crossed the criterion (75%) more than twice for 4.7% of all conditions of all participants; in these cases, we used only the smallest and largest values. For the separations of 15′, 30′, 60′, and 120′, the mean lower disparity thresholds across participants were 0.88′, 1.57′, 1.88′, and 1.98′ (SD = 0.60′, 1.07′, 1.58′, and 5.44′), respectively, and the mean upper disparity thresholds across participants were 4.68′, 16.23′, 35.11′, and 61.12′ (SD = 6.01′, 18.87′, 27.30′, and 46.36′), respectively.
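The threshold estimation described above, linear approximation between the two data points nearest each 75% crossing and then the smallest and largest crossing, can be sketched in Python as follows. The sketch assumes interpolation on a linear disparity axis, which the text does not specify; the function names and the example percentages are hypothetical.

```python
def criterion_crossings(disparities, percentages, criterion=75.0):
    """Disparities where the piecewise-linear percentage curve meets the criterion."""
    points = list(zip(disparities, percentages))
    crossings = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 == criterion:
            # The curve touches the criterion exactly at a data point.
            crossings.append(x0)
        elif (y0 - criterion) * (y1 - criterion) < 0:
            # The criterion lies strictly between two neighbouring points.
            crossings.append(x0 + (criterion - y0) * (x1 - x0) / (y1 - y0))
    if points and points[-1][1] == criterion:
        crossings.append(points[-1][0])
    return crossings

def depth_thresholds(disparities, percentages, criterion=75.0):
    """Return (lower, upper) thresholds, or None with fewer than two crossings."""
    crossings = criterion_crossings(disparities, percentages, criterion)
    if len(crossings) < 2:
        return None
    return min(crossings), max(crossings)

# Hypothetical percentages of "depth seen" for one participant and separation.
example_disparities = [0, 0.3, 0.6, 1.2, 2.4, 4.8, 9.6, 19.2, 38.4, 76.8]
example_percentages = [50, 55, 65, 85, 95, 100, 90, 80, 60, 50]
lower, upper = depth_thresholds(example_disparities, example_percentages)
```

With these made-up data the lower threshold falls between 0.6′ and 1.2′ and the upper threshold between 19.2′ and 38.4′. A curve that never reaches the criterion, as at the 0′ separation, yields no thresholds.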

Figure 3. Mean percentages of trials in which simulated depth was seen as a function of disparity for each separation. The curves shift as the separation increases. The horizontal solid line indicates the 75% level.
Gradient
To estimate the gradients, we computed the slope of the best fit line of the disparity threshold as a function of the dot separation, for each participant. Figure 4 shows the geometric mean across participants. The method used was linear regression analysis with the assumption that the intercept is zero; the slope represents the gradient (we followed Burt and Julesz’s [1980a, 1980b] model). The geometric mean of the gradients (i.e., the slopes of the best fit lines) of all participants’ motion thresholds was 0.072 (95% confidence interval [CI] = [0.040, 0.129]). The geometric mean of the gradients across all participants was 0.023 (95% CI [0.011, 0.046]) for the lower depth threshold and 0.430 (95% CI [0.246, 0.751]) for the upper depth threshold. These CIs did not include zero, indicating that the computed functions had statistically significant gradients (slopes) that deviated from zero.
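For a line constrained to pass through the origin, the least-squares slope is the sum of the products of separation and threshold divided by the sum of the squared separations. The Python sketch below illustrates this and the geometric mean; the function names are hypothetical. Because per-participant thresholds are not listed in the text, the example applies the regression to the group mean motion thresholds, so its result is not expected to equal the reported geometric mean of individual gradients.

```python
import math

def slope_through_origin(separations, thresholds):
    """Least-squares slope of threshold = gradient * separation (zero intercept)."""
    sxy = sum(x * y for x, y in zip(separations, thresholds))
    sxx = sum(x * x for x in separations)
    return sxy / sxx

def geometric_mean(values):
    """Geometric mean, computed as the exponential of the mean of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Group means reported in the Motion section (arc min).
SEPARATIONS = [15, 30, 60, 120]
MEAN_MOTION_THRESHOLDS = [2.92, 3.95, 6.21, 9.17]

group_slope = slope_through_origin(SEPARATIONS, MEAN_MOTION_THRESHOLDS)
```

Here `group_slope` is about 0.086, somewhat above the reported 0.072, as expected when a slope fitted to arithmetic means is compared with a geometric mean of individual slopes. In the analysis described above, `geometric_mean` would be applied to the seven per-participant slopes.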

Figure 4. Mean lower motion thresholds (filled triangles) and upper and lower depth thresholds (circles and blank triangles) as a function of angular separation, across all participants. The solid lines are the best fit linear functions. The broken line has a slope of unity.
Finally, we analyzed the order of the upper depth, lower depth, and lower motion thresholds. The equivalent disparity gradients for the motion and depth thresholds of each participant were logarithmically transformed and then submitted to a one-way analysis of variance. The result of the analysis of variance was significant, F(2, 12) = 22.88, p < .0001.
Discussion
Our findings are summarized in Figure 5, which divides the two-dimensional space (top view) into the four types of perceptions described by H. Ono and Ujike (2005). The four types parallel Burt and Julesz’s (1980a, 1980b) different zones. Our zone of depth without motion corresponds to their “depth with fusion,” and our zone of depth with motion corresponds to their patent stereoscopic depth (their Figure 5[b]). In our Figure 5, the zone for depth without motion is outside the two regions of depth with motion. The forbidden zone for depth is the innermost region containing the median plane. The zone for concomitant motion comprises the three zones in the middle that straddle the median plane.

Figure 5. Zones and boundaries specified by gradients for the horizontal plane that passes through the fixation point. The border between depth with motion and depth with no motion (i.e., the motion threshold) is analogous to the fusion limit in binocular vision. This figure is based on the average of all participants’ data. Only a horizontal distribution of the zones is presented in this figure, but the actual zones would form an imaginary solid if longitudinal separations work the same as horizontal separations.
Although the concept of gradient described the thresholds for motion parallax perceptions, the numerical values we found are considerably smaller than the binocular gradient value of unity for fusion reported by Burt and Julesz (1980a, 1980b). We speculate that these small gradient values reflect the sparseness of the visual information about depth in our experiment. To calculate the veridical depth between two dots, the visual system needs to estimate the amplitude of the retinal motion relative to the extent of the head movement and the absolute distance from the eye to the dots (M. E. Ono et al., 1986). However, in our experimental condition, there was no optic flow information to serve as a depth cue, and the stimulus was viewed monocularly. By contrast, Burt and Julesz’s stimulus was a wallpaper pattern, which had many elements to fuse binocularly. This speculation is also supported by Chung-Fat-Yim and Ono’s (2012) finding of a gradient of 0.15 for binocular fusion of two simple line segments (thin needles).
Perhaps the concept of gradient itself needs further examination. Burt and Julesz (1980a, 1980b) defined the gradient in terms of two points, and it is not clear how one applies it to the random-dot stimulus pioneered by Rogers and Graham (1982) and Graham and Rogers (1982) to study motion parallax. One possibility is to average the gradients of all possible pairs of points, but it is not possible to compute such a value for a square wave or a sawtooth modulation of the simulated depth, as the set of gradients would contain a value of infinity. Another possibility is to consider the separation between the peak and trough of a simulated depth stimulus, as it is known that different spatial frequencies have different thresholds (Bradshaw, Hibbard, Parton, Rose, & Langley, 2006; Rogers & Graham, 1982). Whether such an application would eliminate the difficulty we had with studying vertical separations (see Note 1) and whether it would describe the threshold values for the random-dot stimulus remain interesting questions.
Footnotes
Acknowledgements
The authors thank Linda Lillakas for many helpful comments that greatly improved this article. Special thanks go to Brian Rogers for his insightful, expert comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Grant A0296 from the Natural Sciences and Engineering Research Council of Canada.
