Abstract
Objective:
This study assessed listeners’ ability to localize spatially differentiated virtual audio signals delivered by bone conduction (BC) vibrators and circumaural air conduction (AC) headphones.
Background:
Although the skull offers little intracranial sound wave attenuation, previous studies have demonstrated listeners’ ability to localize auditory signals delivered by a pair of BC vibrators coupled to the mandibular condyle bones. The current study extended this research to other BC vibrator locations on the skull.
Method:
Each participant listened to virtual audio signals originating from 16 different horizontal locations using circumaural headphones or BC vibrators placed in front of, above, or behind the listener’s ears. The listener’s task was to indicate the signal’s perceived direction of origin.
Results:
Localization accuracy with the BC front and BC top positions was comparable to that with the headphones, but responses for the BC back position were less accurate than those for both the headphones and the BC front position.
Conclusion:
This study supports the conclusion of previous studies that listeners can localize virtual 3D signals equally well using AC and BC transducers. Based on these results, it is apparent that BC devices could be substituted for AC headphones with little to no localization performance degradation.
Application:
BC headphones can be used when spatial auditory information needs to be delivered without occluding the ears. Although vibrator placement in front of the ears appears optimal from the localization standpoint, the top or back position may be acceptable from an operational standpoint or if the BC system is integrated into headgear.
Introduction
Bone conduction (BC) listening devices transmit sounds via vibrations to the bones of the skull, which stimulate the cochlea. BC headsets are used in a variety of applications, including military, law enforcement, safety and rescue, industry, and recreation. Such devices offer several advantages over traditional air conduction (AC) devices. For instance, they can be worn without obstructing the ear canal (Henry & Letowski, 2007) and can transfer acoustic signals to the user with little environmental leakage and from concealable locations (e.g., under a hat). Survey results also provide evidence that they are comfortable and easy to use (McBride, Patrick, Letowski, & Tran, 2010).
Their usefulness for practical applications has, however, been scrutinized. For instance, the effectiveness and quality of BC signals depend on where the vibrators are placed on the head (McBride, Letowski, & Tran, 2008). Some have also indicated that BC listening devices may act as tactile devices and their effectiveness may be compromised in environments where listeners are exposed to external vibrations. However, within normal BC communication sound levels, the user does not feel vibrations from BC listening devices and the devices are not affected by external vibrations (Henry & Mermagen, 2004).
Since the skull offers minimal intracranial sound wave attenuation, another concern involves limitations of BC listening devices in multichannel and spatial audio systems. However, previous studies (MacDonald, Henry, & Letowski, 2006) have demonstrated that acceptable localization accuracy can be achieved with BC devices by preprocessing signals using a head-related transfer function (HRTF). Some also believe BC headsets can be effectively used in virtual three-dimensional (3D) audio systems (MacDonald & Tran, 2008). Such systems have been used to create auditory environments similar to the multichannel communication process by which attention is directed by varying signal intensities (Begault & Wenzel, 1993; Blue, McBride, Weatherless, & Letowski, 2013; Doll & Hanna, 1995; Doll, Hanna, & Russotti, 1992; King & Oldfield, 1997; Ricard & Meirs, 1994).
Previously, BC was not considered to be an effective means of transmitting spatial information due to small interaural intensity and time differences (IID and ITD, respectively) of signals. Transcranial attenuation of BC signals is around 0 dB to 25 dB (e.g., Kirikae, 1959; Stenfelt, 2011; Stenfelt & Goode, 2005), whereas AC interaural attenuation is approximately 50 dB to 70 dB (e.g., Algom, Adam, & Cohen-Raz, 1988; Snyder, 1973; Zwislocki, 1953). Maximum ITD via BC is about 0.2 ms, whereas proper AC sound source localization in the 180° range requires about 0.8 ms ITD (e.g., Agterberg et al., 2011). Such limited IIDs and ITDs make it difficult for listeners to completely isolate BC sounds presented to either side. However, despite small natural IIDs and ITDs, clear sound lateralization of BC signals delivered bilaterally with supplemental IIDs and ITDs has been reported and confirmed by several authors (e.g., Bosman, Snik, van der Pouw, Mylanus, & Cremers, 2001; Jahn & Tonndorf, 1982; Kaga, Setou, & Nakamura, 2001).
The current study explored the feasibility of using BC devices in 3D audio systems by investigating how well spatially differentiated virtual audio signals (in the horizontal plane only) delivered by a stereo BC headset can be localized given the position of the vibrators on the skull. BC localization accuracy (closeness of perceived signal location to actual signal location) and precision (closeness of multiple perceived signal locations for the same actual location) data were compared to data obtained from loudspeakers and spatially differentiated virtual audio signals delivered through circumaural AC headphones.
Method
Participants
Fifteen listeners (7 male, 8 female) ages 18 to 36 years (M = 25.9 years, SD = 5.6 years) participated in the study. All listeners underwent a hearing test using a Madsen Electronics Orbiter 922 clinical audiometer and TDH39P headphones. The tests took place in a sound-treated booth with noise levels complying with ANSI S3.1-2008 standards. Qualified listeners had pure-tone AC hearing thresholds of 20 dB HL or lower at audiometric frequencies from 250 to 8000 Hz, inclusive (ANSI S3.6, 2010); no history of otologic pathology; and a difference between the pure-tone hearing levels of the two ears of no more than 10 dB at any test frequency.
Apparatus and Equipment
This study took place inside the Sphere Room in the U.S. Army Research Laboratory’s Environment for Auditory Research (EAR) at Aberdeen Proving Ground, Maryland. The Sphere Room is a 5.4 × 5.3 × 4.9 m audiometric testing area with an array of 57 Meyer Sound MM-4XP loudspeakers configured in a sphere (Figure 1). The loudspeakers at the 0° elevation (i.e., listener’s ear level) are spaced 22.5° apart. At ±30° and ±60° elevations, the spacing is 30° and 45°, respectively. Only the loudspeakers at 0° elevation were used in this study. The listening station consisted of Sennheiser HD 280 Professional circumaural AC headphones, an AfterShokz Sport stereo BC headset, a swiveling chair, a computer display, a laser pointer, and response buttons.

Figure 1. Sphere Room configuration showing the swiveling chair in the middle of a raised platform surrounded by loudspeakers. The laser pointer and small computer screen are located in the center of the security bar that has response buttons located on both ends.
Stimuli
Auditory signals were generated by a personal computer and transmitted via loudspeaker array, stereo AC headphones, and stereo BC headset. The test stimulus was a stream of five 250-ms Gaussian white noise bursts separated by 300-ms intervals. The test stimuli for the AC headphones and BC headset were created for each listener based on their HRTF to account for head and torso effects on sound wave perception. Signal intensity was perceptually equivalent to a 65 dB (A) sound presented over a loudspeaker directly in front of the listener.
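The burst stream described above can be sketched in a few lines. This is only an illustration, not the software used in the study: the sample rate, the unit-variance noise, the absence of onset/offset ramps, and the reading of "separated by 300-ms intervals" as silent gaps between bursts are all assumptions, and the function name make_burst_train is hypothetical.

```python
import random


def make_burst_train(sample_rate=44100, n_bursts=5, burst_ms=250,
                     gap_ms=300, seed=0):
    """Return a list of samples: Gaussian white noise bursts with silent gaps.

    Assumptions (not stated in the paper): sample rate, unit variance,
    no amplitude ramps, and gaps measured from burst offset to next onset.
    """
    rng = random.Random(seed)
    burst_len = sample_rate * burst_ms // 1000
    gap_len = sample_rate * gap_ms // 1000
    samples = []
    for i in range(n_bursts):
        # One burst of zero-mean, unit-variance Gaussian noise.
        samples.extend(rng.gauss(0.0, 1.0) for _ in range(burst_len))
        # Silence between bursts, but not after the last one.
        if i < n_bursts - 1:
            samples.extend([0.0] * gap_len)
    return samples


train = make_burst_train(sample_rate=8000)
duration_s = len(train) / 8000  # 5 * 0.250 s + 4 * 0.300 s
```

Under these assumptions the whole stimulus lasts 2.45 s (five bursts plus four gaps), regardless of the sample rate chosen.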
Procedure
Each consenting participant who met the hearing criteria was seated at the listening station in the center of the Sphere Room and participated in the HRTF measurement session. The HRTF process directly captures changes in intensity, spectrum, phase, and timing that occur when the original stimulus reaches each ear. For example, a sound played from 90° will be received sooner and with greater intensity by the right ear than the left. Mathematically convolving an arbitrary sound with the intensity and time differences associated with a signal presented directly to the right of a listener yields stimuli that appear to originate from a virtual 90° location when delivered through a binaural system.
Due to head and pinnae differences, each individual’s HRTF is unique; thus, localization is generally superior when using individualized HRTFs. Although HRTFs derived from a different person or manikin can also provide helpful localization cues (e.g., Yost, Dye, & Sheft, 1996) and can be used when individualized HRTFs are impractical or not available, our study used individualized HRTFs. To measure the HRTFs, two probe microphones (Etymotic Research Inc. model ER-7C) were placed at the openings of the listener’s ear canals. The chair was adjusted so the listener’s ears were positioned in the center of the sphere of loudspeakers. Auditory signals from the 16 loudspeakers located at ear level were presented. During this 10-minute session, the listener was asked to face “Home” (i.e., 0° azimuthal position) while 75 dB (A) signals were presented from the loudspeakers using the EAR hardware and a modified version of HeadZap™, a commercial HRTF measurement software system. The Golay code method (Golay, 1961) was used to calculate the HRTFs. The result was 16 virtual locations of the test sound’s source placed on the horizontal (0° elevation) plane at 22.5° azimuth intervals around the listener’s head. Test stimuli were convolved with the HRTFs to produce the spatialized stimuli.
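The convolution step can be illustrated with a minimal sketch. It assumes the measured HRTFs are available in their time-domain form (head-related impulse responses, one per ear and azimuth); the toy impulse responses below are invented for illustration and are not measured data, and the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Return (left, right) channels for one virtual source location."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# The 16 virtual azimuths at 22.5 degree spacing (0 = Home).
AZIMUTHS = [i * 22.5 for i in range(16)]

# Toy impulse responses for a source at 90 degrees (listener's right):
# the right ear receives the sound earlier and at a higher level.
toy_hrir_right = [1.0, 0.0, 0.0]
toy_hrir_left = [0.0, 0.0, 0.5]

left, right = spatialize([1.0, 2.0], toy_hrir_left, toy_hrir_right)
# left  -> delayed by two samples and halved
# right -> undelayed and unattenuated
```

A practical implementation would normally use FFT-based convolution for speed; the direct form is shown only because it makes the delay and level differences between the ears easy to see.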
The data collection session was divided into five test blocks. During the first (loudspeaker) test block, test signals were presented directly from the loudspeakers. These trials were used to ensure all listeners were able to localize sound under natural listening conditions. During the second (headphone) test block, signals derived from the listener’s HRTF measurement session were presented through circumaural headphones. These trials were designed to evaluate how well listeners were able to localize spatially differentiated virtual sounds using a traditional method. The last three blocks were the BC headset trials that also used signals derived from the HRTF session. Each of these trials corresponded to a pair of bone vibrator test positions around the ear—front (BC front), top (BC top), and back (BC back; Figure 2).

Figure 2. Bone vibrator test positions (starting from the left and moving clockwise: BC back, BC top, and BC front).
Within each block, each signal was presented to the listener five times in random order (80 signal presentations per condition). The BC headset blocks were counterbalanced across listeners; however, the loudspeaker and headphone blocks were not randomized but were presented in the same order prior to the BC blocks. Because the AC and BC blocks differed in hardware setup and procedural requirements, the AC blocks were kept in a fixed order to reduce experiment time and listener fatigue. Since multiple AC localization studies have been performed using both loudspeakers and headphones and the assessment of BC localization performance was the primary contribution of this study, presenting the two AC blocks first as a means of obtaining general performance baseline data for comparisons was deemed acceptable. By not randomizing all conditions, it is possible that performance may be underestimated for the AC conditions since practice time was shorter. However, AC localization is an everyday activity and the additional practice effect should be small.
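A block schedule of this kind can be sketched as follows. The paper does not describe its counterbalancing scheme, so cycling through the six possible orders of the three BC positions is an assumption, as are the function names.

```python
import itertools
import random

LOCATIONS = [i * 22.5 for i in range(16)]  # 16 azimuths, 22.5 degrees apart
BC_POSITIONS = ["BC front", "BC top", "BC back"]


def block_schedule(rng, reps=5):
    """Return the 16 locations, each repeated `reps` times, in random order."""
    trials = LOCATIONS * reps
    rng.shuffle(trials)
    return trials


def bc_block_order(listener_index):
    """Assumed scheme: rotate through all 3! = 6 orders of the BC blocks."""
    orders = list(itertools.permutations(BC_POSITIONS))
    return orders[listener_index % len(orders)]


rng = random.Random(1)
trials = block_schedule(rng)  # 80 presentations for one block
session = ["loudspeakers", "headphones", *bc_block_order(3)]
```

With 15 listeners and six possible orders, such a rotation cannot be perfectly balanced (three orders would be used three times and three only twice), which is one reason the actual scheme may have differed.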
Sound intensity for the loudspeaker trials was preset to approximately 65 dB (A) as measured by a sound level meter positioned at ear level in the middle of the room, where the listener’s head would be centered. To ensure that each listener heard the headphone and BC headset signals as equally loud as the loudspeaker signals, a loudness matching procedure was used to set the intensity of both devices (Pollard, Tran, & Letowski, 2013). For the headphones, three listeners who did not participate in the study performed this procedure before the experimental sessions began. The procedure required them to adjust the signal intensity of the headphones to match the loudness of a 65 dB (A) signal presented through a loudspeaker directly ahead. The test signal alternated between the loudspeaker and headphones, and the listener adjusted the headphone volume using an amplifier until the two signal streams were equally loud. The amplifier setting for these three listeners was identical, so it was used for each listener in the experimental sessions. Each BC block began by placing the BC headset at a different test position. Each listener completed the loudness-matching procedure described earlier for each bone vibrator position. Loudspeakers surrounding the listener were used to present 55 dB (A) background noise throughout the BC trials to mask any BC signal leakage that could potentially be heard via AC.
At the start of each trial, the listener faced the Home position and pressed the response button to begin. A set of noise bursts was presented from one of 16 azimuthal locations in virtual space. The listener indicated the perceived sound location by turning the chair, pointing the laser at the perceived sound location, and pressing the response button. Listeners then returned the chair to Home position and pressed the button again to receive the next signal. Listeners set the pace of the experiment and breaks were taken as needed between trials. Proprietary in-house computer software was used to present the signals and record responses. Each experimental session took approximately 3 hours.
Data Analysis and Results
To compare localization errors between device conditions, separate values of signed error and unsigned error were calculated per listening condition and virtual signal location (henceforth referred to as signal location; Letowski & Letowski, 2011). Signed error is the average of all errors including their direction, whereas unsigned error is the average absolute value of the errors. Localization error was calculated by computing the smallest angular distance between the signal location and perceived target location based on the listener’s response (Cabot, 1977; Letowski & Letowski, 2011; Mardia & Jupp, 2000). For instance, if the signal originated from 0° (which is the same as 360°) and the listener perceived the origin to be 330°, signed error would be −30° (i.e., 330 – 360) and unsigned error would be 30° (i.e., |330 – 360|). The mean localization errors are shown in Table 1.
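The error definitions can be made concrete with a short sketch. The wrap-around handling reproduces the 0°/330° example above; the sign convention (positive when the response lies at a larger azimuth than the signal) and the function names are assumptions.

```python
def signed_error(signal_deg, response_deg):
    """Smallest signed angular distance from signal to response, in (-180, 180]."""
    diff = (response_deg - signal_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


def unsigned_error(signal_deg, response_deg):
    """Magnitude of the smallest angular distance."""
    return abs(signed_error(signal_deg, response_deg))


def mean(values):
    return sum(values) / len(values)


# Example from the text: signal at 0 (= 360) degrees, response at 330 degrees.
example_signed = signed_error(0.0, 330.0)      # -30.0
example_unsigned = unsigned_error(0.0, 330.0)  # 30.0

# Two responses that miss by 30 degrees in opposite directions cancel in the
# mean signed error but not in the mean unsigned error.
responses = [330.0, 30.0]
mean_signed = mean([signed_error(0.0, r) for r in responses])      # 0.0
mean_unsigned = mean([unsigned_error(0.0, r) for r in responses])  # 30.0
```

The last two lines show why the two measures can diverge: errors of opposite sign average out in the signed measure, so a small mean signed error does not by itself imply that individual responses were close to the signal.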
Table 1. Mean Signed Error and Unsigned Error for the Loudspeakers (S), Headphones (H), BC Front (F), BC Top (T), and BC Back (B) Device Conditions.
Table 2 and Figure 3 illustrate the standard error across listeners for each condition. Figure 3 shows that the loudspeaker condition had the lowest level of variability overall, whereas headphones had the second lowest level of variability for most signal locations. The BC back (BCB) condition had the highest level of variability for most signal locations, whereas the BC front (BCF) condition demonstrated less variability than the BC top (BCT) condition for most signal locations. Overall, the BC and headphone data per location had similar patterns.
Table 2. Standard Error for the Loudspeakers (S), Headphones (H), BC Front (F), BC Top (T), and BC Back (B) Device Conditions.

Figure 3. Standard error across listeners per device condition: loudspeakers, headphones, bone conduction headset positioned in front of the ear (BC front), bone conduction headset positioned above the ear (BC top), bone conduction headset positioned in back of the ear (BC back). The loudspeaker condition had the lowest error overall followed by the headphone condition. The BC back condition typically had the highest level of variability.
Table 3 shows reversal error percentages. Reversal errors are responses located in the opposite hemisphere of the virtual source location. Front/back and back/front (FB/BF) errors are typical in sound localization studies and arise from similar IID and ITD cues received by both ears for symmetrical front and back sound source locations, resulting in a natural “cone of confusion.” For this reason, FB/BF reversals should be removed or resolved prior to data analysis (Letowski & Letowski, 2011). Resolving FB/BF reversals is common practice in hearing studies because retaining them increases localization blur, making it difficult to draw conclusions from the data (Begault & Wenzel, 1993; Oldfield & Parker, 1984). For example, if a sound source is located at 0° and a listener makes two judgments of 0° and 180°, the average location would be 90°, which is inaccurate. In the current study, the FB/BF reversal errors were resolved by converting them to their mirror images, thus placing them into the proper hemisphere. Such a procedure has been recommended in similar localization studies to preserve the sample size (e.g., Cabot, 1977; Gerzon, 1975; Wightman & Kistler, 1989).
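The mirror-image correction can be sketched as follows. The sketch assumes azimuth is measured from Home (0°, straight ahead) so that the front hemisphere lies within 90° of 0°, treats responses exactly on the interaural axis (90° or 270°) as belonging to neither hemisphere, and mirrors every response that falls in the hemisphere opposite the signal; the study's actual decision rule may have differed, and the function names are hypothetical.

```python
def hemisphere(azimuth_deg):
    """Classify an azimuth as 'front', 'back', or 'side' (on the interaural axis)."""
    a = azimuth_deg % 360.0
    if a < 90.0 or a > 270.0:
        return "front"
    if 90.0 < a < 270.0:
        return "back"
    return "side"


def mirror_front_back(azimuth_deg):
    """Reflect an azimuth across the interaural (left-right) axis."""
    return (180.0 - azimuth_deg) % 360.0


def resolve_reversal(signal_deg, response_deg):
    """Return (corrected_response, was_reversal) for one judgment."""
    sig, resp = hemisphere(signal_deg), hemisphere(response_deg)
    if "side" in (sig, resp) or sig == resp:
        return response_deg % 360.0, False
    return mirror_front_back(response_deg), True


# Example from the text: signal at 0 degrees, judgments of 0 and 180 degrees.
judgments = [0.0, 180.0]
raw_mean = sum(judgments) / len(judgments)               # 90.0, misleading
resolved = [resolve_reversal(0.0, r)[0] for r in judgments]
resolved_mean = sum(resolved) / len(resolved)            # 0.0
```

Counting the `was_reversal` flags per condition gives a reversal percentage of the kind reported in the text, while the corrected responses feed into the error calculations.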
Table 3. Percentage Front/Back and Back/Front (FB/BF) and Left/Right and Right/Left (LR/RL) Reversal Error per Device Condition.
In Table 3, the FB/BF errors are combined, as are the left-right and right-left (LR/RL) errors. The data suggest that LR/RL localization cues are strong; thus, LR/RL reversals are rarely seen. LR/RL reversals that do appear are likely the result of an inadvertent response button push or a forced response after missing the signal presentation due to distraction. Since these errors are not natural, occur less frequently, and cannot easily be discerned from truly random responses, they were not resolved or removed prior to data analysis.
A two-way repeated-measures ANOVA with a univariate approach and Huynh–Feldt correction for degrees of freedom was used to determine which experimental conditions were significantly different from one another based on the error. The two factors were device condition (5 levels) and signal location (16 levels). An alpha level of .05 was used to determine significance. Analyses were performed for both signed and unsigned error. Signed error primarily indicates the accuracy of listeners’ judgments, whereas unsigned error is a combination of both accuracy and precision. Figure 4 displays mean signed and unsigned errors of each target signal for the loudspeaker condition. The magnitude of signed error is much smaller than that of unsigned error; hence, using signed error alone would misrepresent the level of deviation between the response and actual target. Therefore, unsigned error is frequently reported in localization studies as a comprehensive measure of localization errors, and it was used for the post hoc analyses in the current study.

Figure 4. Illustration of the difference between the mean signed error and unsigned error for the loudspeaker device condition. Reporting only signed error would misrepresent the magnitude of the deviation between the target signal and listener’s response.
Table 4 displays the ANOVA statistics. For both sets of data, the Huynh–Feldt correction for degrees of freedom was used. The analyses of both signed and unsigned error indicated the presence of significant differences between device conditions and signal locations. There was also a significant device condition × signal location interaction.
Table 4. Statistics for the Signed Error and Unsigned Error Repeated-Measures ANOVAs With a Univariate Approach and Huynh–Feldt Correction Used for Degrees of Freedom for All Devices.
To identify which device conditions were significantly different from one another, post hoc paired comparison tests with Bonferroni correction were performed on the unsigned error. Results confirm that performance under the loudspeaker condition was the most accurate overall (p values < .001). In addition, the headphone condition resulted in more accurate performance than the BCB condition (p value = .001). No other significant differences were identified between device conditions. Post hoc tests with Bonferroni correction were also performed on unsigned error to identify which signal locations resulted in errors significantly smaller in size than others. Table 5 lists significantly different locations based on Bonferroni-corrected p values.
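For readers unfamiliar with the correction, the sketch below shows how a Bonferroni adjustment scales with the number of pairwise comparisons. The family sizes are assumptions (all pairs of the five device conditions and of the 16 signal locations), and the p values are made up for illustration; they are not results from this study.

```python
from itertools import combinations

DEVICES = ["S", "H", "BCF", "BCT", "BCB"]
LOCATIONS = [i * 22.5 for i in range(16)]


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


device_pairs = list(combinations(DEVICES, 2))      # 10 pairwise comparisons
location_pairs = list(combinations(LOCATIONS, 2))  # 120 pairwise comparisons

# Hypothetical uncorrected p values for three tests (illustration only).
adjusted = bonferroni([0.001, 0.02, 0.5])  # roughly [0.003, 0.06, 1.0]
```

The jump from 10 to 120 comparisons shows why location-level contrasts need much smaller uncorrected p values to remain significant after correction.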
Table 5. Significantly Different Signal Locations for All Device Conditions Where the Mean Unsigned Error of Location 1 Was Lower Than the Mean Unsigned Error of Location 2.
To illustrate the interaction, unsigned error per device condition was plotted against the signal locations. Figure 5 shows the loudspeaker condition consistently had the lowest average unsigned error, except at 247.5°. There is little consistency when comparing the other four device conditions. For instance, half of the time, listeners performed better (i.e., had smaller errors) with AC headphones than with the BC headset, but in three cases they performed the worst with AC headphones.

Figure 5. Device condition × signal location interaction plot showing distribution of mean unsigned errors across all signal locations observed in all five listening conditions: loudspeakers, headphones, bone conduction headset positioned in front of the ear (BC front), bone conduction headset positioned above the ear (BC top), bone conduction headset positioned in back of the ear (BC back).
Separate repeated-measures ANOVAs were performed to compare BC device conditions only. Table 6 displays the ANOVA statistics. The Huynh–Feldt correction for degrees of freedom was used. The unsigned error analysis indicated significant differences between BC device conditions and between signal locations. The signed error analysis indicated no significant differences between BC device conditions; however, significant differences were detected between signal locations. For both data sets, there was no significant device × signal location interaction.
Table 6. Statistics for the Repeated-Measures ANOVA With a Univariate Approach and Huynh–Feldt Correction for Degrees of Freedom for Signed Error and Unsigned Error (BC Devices Only).
Post hoc tests with Bonferroni correction were performed on the unsigned errors for the BC device conditions. Results showed that BCF had a significantly lower mean unsigned error than BCB (p value = .022). Table 7 lists significantly different locations based on the Bonferroni-corrected p values. No other significant differences were detected.
Table 7. Significantly Different Signal Locations for BC Device Conditions Only Where the Mean Unsigned Error of Location 1 Was Lower Than the Mean Unsigned Error of Location 2.
A circular regression model was used to investigate effects of device condition and signal location on precision and the reversal rate. Localization performance was modeled using the two-part wrapped Cauchy data model described in McMillan et al. (McMillan, Hanson, Saunders, & Gallun, 2013; McMillan, Saunders, & Hanson, 2011). Specifically, the judgment response to a particular signal follows a finite mixture of two wrapped Cauchy distributions, one centered at the target signal location S and one centered at the mirror image of the target signal location, referred to as the confused location C given by
To investigate the effects of device condition and signal location on localization performance, ρ and θ of the ith localization judgment were modeled with a logistic link function, such that
D, Q, and L are device condition, signal quadrant, and listener indicators, respectively, for the ith localization judgment. In this study there were five device conditions and four signal quadrants distinguishing signals emitted from front left, back left, back right, and front right of the listener. The decision to combine signal locations into quadrants, as opposed to treating signal location as a continuous, circular covariate (McMillan et al., 2013) was based on a visual understanding of the similarity of judgments within quadrants. The listener-level indicator models variability in localization performance among listeners and effectively accounts for the repeated-measures structure of the data set. Finally, both models include a device condition × quadrant interaction to model how localization performance with each device condition varies across signal locations. The FB/BF confusion model also includes a listener × signal quadrant interaction, which models FB/BF reversal rate variability among listeners within signal locations.
The α and δ terms were modeled as normally distributed random effects. The variance components for each random effect distribution were given weakly informative half-Cauchy priors with common scale parameter A, which itself was given a uniform (0,30) hyper-prior distribution. This model is challenging to fit using maximum likelihood methods as in McMillan et al. (2011) and so was fit from a Bayesian perspective using Markov chain Monte Carlo (McMillan et al., 2013). Three separate chains were run with dispersed starting values for 50,000 iterations each thinned to 1,000 samples. Trace plots and Gelman–Rubin diagnostics showed reasonably good convergence of the fitted models. Final inferences were based on the combined 3,000 iterations from the three separate chains.
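The structure of the two-part wrapped Cauchy model can be illustrated with a minimal sketch. The wrapped Cauchy density itself is standard; everything else is an assumption made for illustration: ρ is taken to be the concentration (precision) of both components, θ the mixing weight of the confused component (the reversal rate), and the confused location the front-back mirror image (180° − S) mod 360°. The function names are hypothetical, and no attempt is made to reproduce the regression terms or priors.

```python
import math


def wrapped_cauchy_pdf(angle_deg, center_deg, rho):
    """Wrapped Cauchy density per radian; rho in [0, 1) is the concentration."""
    d = math.radians(angle_deg - center_deg)
    return (1.0 - rho ** 2) / (
        2.0 * math.pi * (1.0 + rho ** 2 - 2.0 * rho * math.cos(d)))


def confused_location(signal_deg):
    """Assumed front-back mirror image of the signal location."""
    return (180.0 - signal_deg) % 360.0


def mixture_pdf(angle_deg, signal_deg, rho, theta):
    """Mixture: weight (1 - theta) at the signal, theta at its mirror image."""
    c = confused_location(signal_deg)
    return ((1.0 - theta) * wrapped_cauchy_pdf(angle_deg, signal_deg, rho)
            + theta * wrapped_cauchy_pdf(angle_deg, c, rho))


def inv_logit(x):
    """Inverse of the logistic link: maps a linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Thinning implied by the text: 50,000 iterations kept as 1,000 samples per
# chain means every 50th draw; three chains give 3,000 samples in total.
thin_interval = 50_000 // 1_000
total_samples = 3 * 1_000
```

Because both ρ and θ lie between 0 and 1, a logistic link is a natural way to relate them to a linear predictor built from device, quadrant, and listener effects.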
Figure 6 shows the fitted model error sources. Essentially it shows the posterior standard deviation (x-axis) among levels of the effects included in Models 1 and 2. For example, for “Quadrants,” the point and error bar are the estimated standard deviation among quadrants for FB/BF reversal effects (left panel) and precision effects (right panel). Based on the left panel of Figure 6, the highest source of variation in FB/BF reversal rates is among quadrants within device condition. FB/BF reversal rates also vary markedly among listeners, but these effects depend on the quadrant. There is comparatively little overall variation in FB/BF reversals among listeners and quadrants. The right panel indicates that much of the variability in localization precision is described by device condition, with comparatively less dependence on the quadrant. Listeners also vary considerably in terms of precision.

Figure 6. Finite sample standard deviation of the effects included in Models 1 and 2 where error bars are the posterior interquartile range of the standard deviation.
Table 8 shows estimates of localization precision. BCF and BCT generate nearly identical precision across signal position (i.e., quadrant). The loudspeaker and BC conditions all appear to have somewhat higher precision in response to signals emitted from the left than from the right. The converse is true for the headphone condition.
Table 8. Summary of Localization Precision by Stimulus and Signal Position Where LCL = Lower 90% Bayesian Confidence Limit and UCL = Upper 90% Bayesian Confidence Limit.
Localization precision is plotted for each signal position against FB/BF reversal rates in Figure 7. BCF and BCT generate nearly identical precision. Loudspeakers result in the best performance when the signal is in front of the listener, whereas for signals from behind the listener, BCB appears to give somewhat better results than the other conditions.

Figure 7. Scatterplots of the estimated localization precision (y-axis) by front–back reversal rates (x-axis) as a function of signal position.
Discussion
The main question to be answered by this study was, “Which bone vibrator position facilitates the best sound localization performance for spatially differentiated virtual auditory signals?” Since AC headphones are typically the device used to transmit spatially differentiated virtual auditory signals, this condition was used as the benchmark for comparing the BC results.
Values of 1% to 2% of LR/RL localization reversal errors are typical for localization studies (e.g., Smith-Abouchacra, 1993). The rate of FB/BF errors observed in this study was also in line with, or smaller than, rates reported in the literature for virtual audio signals reproduced by earphones. For instance, Begault (1992) reported error rates of 27.5% and 33%, Wenzel, Arruda, Kistler, and Wightman (1993) reported error rates from 20% to 43%, and Schonstein, Ferre, and Katz (2008) reported error rates between 37.5% and 52%.
In most of the studies, the rate of FB errors (perceiving signals presented from the front as coming from the back) is much higher than the rate of BF errors. For example, Wenzel et al. (1993) reported 25% and 6% and Begault and Wenzel (1993) reported values as high as 47% and 11% of FB and BF errors, respectively. In the current study, the proportion of FB to BF errors was about 2:1 for the headphones and BCB, which is typical for AC localization studies. However, for BCF and BCT, the proportion was 1:2. Such a reversed proportion is not unusual in localization studies (e.g., Smith-Abouchacra, 1993) and is frequently observed in studies involving audiovisual context (Letowski & Letowski, 2011). In the case of BC signals, the proportion of FB to BF errors may be related to the transducer location along the front–back axis of the head. The farther back the transducers are placed on the listener’s head, the greater the tendency to perceive signals as coming from behind.
Regarding overall localization accuracy, this study shows that performance for all but one of the BC positions (i.e., BCB) was as accurate as the headphones. It is important to note that because the BC signals were presented in 55 dB (A) of background noise, the associated signal-to-noise ratio (SNR) for the BC signals was 10 dB, whereas the loudspeaker and headphone signals were presented at a much higher SNR (≥30 dB). Despite this difference, two of the three BC positions resulted in approximately the same level of accuracy as the headphones.
Overall sizes of the reported errors also were similar to those found in earphone-based studies. For example, Wenzel and Foster (1993) and Begault, Wenzel, and Anderson (2001) reported signed errors of 10° to 20° in the horizontal plane. The headphone data collected in the current study provided lower signed error values after reversal correction, and BC signed errors were comparable to those for the headphone condition.
In addition, this study identified differences between the accuracy of BCF and BCB vibrator positions. Figure 8 shows the mean error for BCF was never higher than that for BCB and, for several signal locations, was considerably lower. There were no significant differences between BCF and BCT nor between BCT and BCB. As seen in Figure 8, for all but six signal locations, BCF resulted in more accurate responses than the other two BC conditions. For most of the signal locations, BCB resulted in the least accurate responses. Ultimately, the frontal vibrator position seems to benefit localization accuracy for signals originating in front, while placing the vibrators above the ears seems to benefit localization for signals originating in the back. The signs of the signed errors (Table 1) indicated that lateral errors were always made toward the more lateral direction. This pattern was essentially the same for all three BC vibrator positions as well as for the loudspeaker and headphone conditions.

Figure 8. Unsigned error and associated standard error bars per bone conduction headset condition: bone conduction headset positioned in front of the ear (BC front), bone conduction headset positioned above the ear (BC top), bone conduction headset positioned in back of the ear (BC back).
Overall, localization precision was fairly high (>0.75) across device conditions and signal locations. In terms of BC precision, listeners had a tendency to localize signals more precisely with BCF and BCT when the signal was in back, whereas BCB resulted in a higher level of precision when the signal was in front. However, there is a tradeoff since, with BCB, listeners also more often perceived signals as coming from behind when they were actually coming from the front.
Conclusions
BC devices provide several benefits for radio communication tasks (e.g., they are small and lightweight and leave the ears open). However, to deliver spatial information, they need to yield small localization errors. Adding spatial cues to audio streams can help users isolate particular sources for attentional focus and enhance navigation or spatial target acquisition. For additional examples, see Begault, Wenzel, Godfroy, Miller, and Anderson (2010).
The ability to localize sounds using BC interfaces has important industrial and military implications. When the ears are occluded by hearing protection devices or an encapsulating helmet, BC interfaces can be used to provide spatially different messages. Such messages can provide navigation support through spatially separated multichannel communication or spatially separated warning signals to reduce masking caused by simultaneous messages. Using spatially enhanced BC interfaces with the ears open enables multichannel radio communication without jeopardizing natural hearing and person-to-person communication. Such interfaces may be used in virtual and augmented reality applications. In addition, various forms of spatial audio media can be delivered via spatially enhanced BC interfaces for safer entertainment options (e.g., jogging and listening to music).
The results of our study indicate that HRTF-processed BC signals allow sound localization in the horizontal plane with accuracy similar to that of AC headphone-based sound reproduction. Of the locations evaluated, placement of the BC vibrators in front of the ears seems to be optimal for localization. Operationally, the top or back position may be acceptable for BC systems integrated into tactical headgear. Further studies should investigate vertical localization of virtual BC signals and compare various BC systems to identify performance variability between devices.
Key Points
Five auditory device conditions (loudspeakers, AC headphones, and a BC headset located in front of the ears, above the ears, and behind the ears) were used to deliver spatially separated signals to listeners to compare listeners’ ability to localize virtual 3D signals.
Gaussian white noise burst signals were randomly presented from 16 azimuthal locations in virtual space and listeners were instructed to point in the direction from which they perceived the signal to have originated.
Localization performance when the BC headset was positioned behind the ear was significantly worse than localization performance for the AC headphones condition.
Based on the unsigned error, localization performance was significantly better when the BC headset was positioned in front of the ear than when it was positioned in back of the ear.
Footnotes
Maranda McBride is an associate professor of management in the School of Business and Economics at North Carolina Agricultural and Technical State University in Greensboro, North Carolina. She earned her PhD in industrial and systems engineering with a concentration in human machine systems engineering from North Carolina Agricultural and Technical State University in 2003.
Phuong Tran is a senior electrical engineer at the U.S. Army Research Laboratory, Human Research and Engineering Directorate at Aberdeen Proving Ground, Maryland. Her work focuses on bone conduction communication research. She earned a bachelor’s degree in electrical engineering from Drexel University in 1989 and a master’s degree from Penn State University in 1994.
Kimberly A. Pollard is a research biologist in the Human Research and Engineering Directorate at the U.S. Army Research Laboratory in Aberdeen Proving Ground, Maryland. She earned her PhD in biology from the University of California, Los Angeles in 2009.
Tomasz Letowski is a senior research scientist at the Army Research Laboratory, Human Research and Engineering Directorate. He earned a PhD in acoustics and telecommunications from Wroclaw Technical University in 1973 and a DSc in technical sciences from Warsaw Technical University in 1986.
Garnett P. McMillan is a biostatistician with the National Center for Rehabilitative Auditory Research in the VA Portland Health Care System. He earned his PhD in anthropology and MS in mathematics and statistics from the University of New Mexico in 2001, and completed a postdoctoral fellowship with the World Health Organization’s International Agency for Research on Cancer in 2003–2004.
