Abstract

Minar and Lewkowicz (2013) previously reported their failure to replicate the results of two experiments we published in this journal (Walker et al., 2010), despite using the same stimuli. Their commentary (Lewkowicz & Minar, 2014) now explains that the sound they used differed significantly from ours, and we show here how the nature of their sound critically undermined their attempted replication. They also propose an alternative explanation of our results (though one that remains unsubstantiated). We show how their alternative explanation actually predicts that they also should have observed the results we reported, assumes infants are sensitive to the pitch-height correspondence they are attempting to deny, and is contradicted by the evidence.
Lewkowicz and Minar propose that when the pitch and loudness of a sound both rise and fall together, the sound has an internal congruence that attracts attention. Because our sound diminished in loudness as it approached the upper and lower limits of its pitch range (peaking in loudness at midpitch), such congruence was present only when the sound was in its lower pitch range (e.g., during the first 1.25 s of the 2.5 s it took to go from lowest to highest pitch). Lewkowicz and Minar speculate that such congruence during the very first moments of an animation lasting up to 60 s will induce infants to look longer toward the screen across the remainder of the trial. Because they assume that all our animations began with the ball in its lowest location, or with the morphing shape in its most rounded form, Lewkowicz and Minar reason that the initial internal congruence of our sound was confounded with the audiovisual congruence of the animations. Trials starting with the ball at the upper location, or the morphing shape in its most pointy form, would be expected to reveal a reverse audiovisual-congruity effect because the initial internal congruence of the sound would then be confounded with audiovisual incongruence.
We can confirm that, although our ball animations always began with the ball in its lowest location, the morphing shape in Experiment 2 always began in its most pointy form. In the latter case, therefore, the initial internal congruence of the sound was confounded with audiovisual incongruity and so should have encouraged longer looking times on audiovisually incongruent trials than on audiovisually congruent trials. The near-identical audiovisual-congruity effect observed in both cases thus contradicts Lewkowicz and Minar’s proposal.
It is noteworthy that Lewkowicz and Minar’s sounds also were internally incongruent above the midpoint of their pitch range, with pitch changing while loudness remained constant. With their moving-ball animations always beginning with the ball in its lowest location, their own account should have predicted the audiovisual-congruity effect we reported, rather than null results.
Lewkowicz and Minar’s proposal that infants are sensitive to the internal pitch-loudness congruity of sounds requires infants to be sensitive to the correspondence between pitch and height. Pitch-loudness congruity cannot arise from the common coding of pitch and loudness on the basis of their magnitude or strength because pitch is not a prothetic dimension (e.g., people do not refer to sounds as having more or less pitch and, in any case, they judge loud to be big and high pitch to be small). Further, although pitch and loudness share terms in expressions such as “high pitch” and “high levels of loudness,” which provides a possible linguistic basis for their correspondence, this is irrelevant for 3- to 4-month-old infants. What remains, therefore, is for pitch-loudness congruity to reflect their common coding onto a nonlinguistic notion of height. But infants’ sensitivity to a correspondence between pitch and height is just what we demonstrated and what Lewkowicz and Minar wish to deny.
The nature of Lewkowicz and Minar’s sound critically undermined their efforts to observe the correspondences we reported. The creation and maintenance of a tight link between visual change (object motion and morphing) and the perception of a sound is a prerequisite for exploring the effects of congruity in the directions of visual and auditory change. This was why we ensured that, at each moment the visual object appeared to cease changing (i.e., at each of its extreme states of elevation and sharpness), the sound also appeared to cease (i.e., its intensity approached ambient levels). To accomplish this, we gradually increased and then decreased the loudness of the sound as it moved between its highest and lowest levels of pitch. However, by arranging for their sound to maintain its maximum level of loudness as it moved toward and away from its highest level of pitch, Lewkowicz and Minar dissociated changes in the visual object from the presence of a sound (i.e., movement of the ball could not be responsible for the sound because there was a point in each cycle of events when the ball stopped moving but the sound continued at maximum intensity).
The case for claiming infants are sensitive to cross-sensory correspondences remains intact. Indeed, several studies have since confirmed and extended our observations (e.g., to include brightness-pitch, size-pitch, and thinness-pitch correspondences). After conducting a proper replication of our pitch-height experiment (i.e., using sounds with the same pitch-loudness profile) and closely replicating our results, Dolscheid, Hunnius, Casasanto, and Majid (2013) provided evidence for infants’ sensitivity to a thinness-pitch correspondence. Tellingly, because the latter animations always began with the visual object in its thinnest state, the congruity effect Dolscheid et al. reported is further compelling evidence contradicting Lewkowicz and Minar’s position.
Footnotes
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
