Abstract
This study examined the utility of five popular assessments of work as a calling. A large and diverse group of working adults completed the Calling Paragraph, Brief Calling Scale (BCS), Calling and Vocation Questionnaire (CVQ), Calling Scale (CS), and Multidimensional Calling Measure (MCM) at two time points, along with a face valid measure of having a calling (yes or no) and three work-related outcomes. All measures were found to be reliable; have strong test–retest reliability; and moderately to strongly correlate with work meaning, career commitment, and job satisfaction at Time 1 and Time 2. Confirmatory factor analyses revealed mixed evidence concerning the ability of all instruments to load onto one factor. The BCS and CVQ were the best predictors of having a calling, whereas the CS and MCM were the best predictors of work outcomes. The discussion highlights the complexities of each of these instruments in accurately assessing a calling versus a more global, positive work outlook. Recommendations are offered for researchers seeking to study work as a calling.
Over the last decade, research on what it means to have and live out a calling to a particular line of work has grown exponentially in the fields of industrial/organizational psychology and management. A calling is typically viewed as a type of work that is highly personally meaningful, prosocial in nature, and often arises as the result of an internal or external summons (Dik & Duffy, 2009; Duffy & Dik, 2013). A recent review by Duffy and Dik (2013) discussed findings from dozens of studies and noted that viewing one’s career as a calling has been consistently linked to positive work, well-being, and organizational outcomes, including career maturity, work meaning, career commitment, organizational commitment, life meaning, job satisfaction, and life satisfaction. These links are especially pronounced when individuals are living out the career to which they feel called. However, as noted by Duffy and Dik (2013) and several other scholars (Dobrow & Tosti-Kharas, 2012; Hagmaier & Abele, 2012), calling has been measured in a variety of ways (e.g., categorically, unidimensionally, and multidimensionally) using a variety of conceptual definitions of the construct. These differences in measurement and conceptualization of calling have “muddied the waters” for this emerging research area.
In this article, we seek to provide clarity on the measurement of calling that can inform future research conducted on the construct. A large and diverse sample of American adults completed the most popular calling instruments in the literature along with a variety of instruments assessing work outcomes. The same group of adults was surveyed again 3 months later. Evidence will be presented on (1) the normality, reliability, and temporal stability of calling as measured by each instrument; (2) the degree to which these instruments measure one underlying construct; (3) the ability of each instrument to accurately predict having a calling (yes or no); and (4) the relation of each instrument with key work outcomes typically associated with a calling (work meaning, career commitment, and job satisfaction) at two time points. In the following section, we review how calling has been conceptualized and measured in previous research and summarize general findings regarding how calling has been linked to healthy work and well-being outcomes.
Conceptualization and Measurement
Categorization
At a basic level, several studies have assessed calling categorically, asking how well participants fit a particular prompt concerning their attitudes toward work. The Work-Life Questionnaire (WLQ; Wrzesniewski, McCauley, Rozin, & Schwartz, 1997) was developed in one of the first empirical studies conducted on calling. The authors defined calling as work that is usually seen as socially valuable and that involves activities that may be, but are not necessarily, pleasurable. The authors asserted that a person with a calling does not work for financial gain or career advancement but rather for the fulfillment that doing the work brings to the individual. In this initial study, participants were provided with three paragraphs that described work as a job (e.g., work primarily as a means to an income), a career (e.g., work as a status ladder that meets achievement needs), or a calling and were asked to rate each paragraph on a 4-point scale with the options very much, somewhat, a little, or not at all like me (scored 3–0). The questionnaire also included 18 true or false items, all but 3 of which appeared in at least one of the job/career/calling paragraphs. Responses to these items were examined to confirm expectations about the correlations of the items with paragraph scores, and the items were found to significantly and substantially correlate with their respective paragraphs.
Wrzesniewski, McCauley, Rozin, and Schwartz (1997) found that approximately one third of participants viewed their work as a calling and that those with a calling orientation reported significantly higher levels of life and job satisfaction. Using the same questionnaire, Cardador, Dane, and Pratt (2011) revealed significant relations between a calling orientation (regardless of the job one actually holds) and organizational identification and turnover intention. Specifically, those with calling orientations were attached to their organization because they viewed it as instrumental to the fulfillment of their calling. Another study using the WLQ found viewing work as a calling was related to the character strength zest (Peterson, Park, Hall, & Seligman, 2009), and also using the WLQ, Harzer and Ruch (2012) found that individuals who used four to seven signature strengths at work had higher levels of calling than those who used fewer than four strengths. In sum, studies using the WLQ have demonstrated its validity and link to positive work and well-being outcomes.
Unidimensional
Brief Calling Scale (BCS)
A recent review by Duffy and Dik (2013) identified the unidimensional BCS (Dik, Eldridge, Steger, & Duffy, 2012) as the most popular instrument used to assess calling. Dik, Eldridge, Steger, and Duffy (2012) conceptualized the presence of a calling in line with the three-part definition proposed by Dik and Duffy (2009) who defined calling as “a transcendent summons, experienced as originating beyond the self, to approach a particular life role (in this case work) in a manner oriented toward demonstrating or deriving a sense of purpose or meaningfulness and that holds other-oriented values and goals as primary sources of motivation” (p. 427). The BCS consists of 4 items and assesses the presence of and search for a calling; participants respond on a 5-point scale ranging from 1 (not at all true of me) to 5 (totally true of me). In a validation study of the BCS, Dik et al. (2012) conducted a multitrait–multimethod analysis with 134 undergraduate students and 365 informants who answered the same instruments in terms of their perception of the students. Scores on the BCS correlated positively with scores on other measures of calling and with the informant reports of participants’ perceptions of calling. BCS scores also correlated in the predicted directions with scores on well-being criterion variables and other work-related variables, establishing evidence of convergent and discriminant validity.
Two studies with diverse samples assessed the prevalence of a calling with the BCS, and both found that over 40% of the participants felt that the statement “I have a calling to a particular kind of work” was mostly or totally true of them (Duffy, Allan, Autin, & Bott, 2013; Duffy & Sedlacek, 2010). Other studies have found no meaningful differences across racial or class groups in experiencing a calling, but results have demonstrated that students seeking advanced professional degrees were more likely to feel a career calling (Duffy & Sedlacek, 2010). A host of studies using the BCS have consistently found the perception of a calling to correlate with well-being outcomes such as meaning in life and life satisfaction (Duffy et al., 2013; Duffy, Manuel, Borges, & Bott, 2011c; Duffy & Sedlacek, 2010; Steger & Dik, 2009; Steger, Pickering, Shin, & Dik, 2010; Torrey & Duffy, 2012).
Scores on the BCS have also been found to strongly correlate with work-related outcomes. For example, with a sample of 255 college students, Dik, Sargent, and Steger (2008) found that a sense of calling positively correlated with career decision self-efficacy (CDSE) and outcome expectations (cf., Hirschi & Hermann, 2013; Steger & Dik, 2009). In another study, perceiving a calling was strongly correlated with vocational self-clarity, comfort with one’s career choice, choice-work salience, and career decidedness (Duffy & Sedlacek, 2007). Along with the noted studies, several others have exhibited similar positive correlations between the perception of calling and vocational development, career commitment, work meaning, job satisfaction, work engagement, and occupational identity (Duffy, Bott, Allan, Torrey, & Dik, 2012b; Duffy, Dik, & Steger, 2011b; Hirschi, 2011; Hirschi & Hermann, 2012). To summarize, the BCS is a widely used measure to assess calling, plainly assessing the presence of calling in an individual’s life, and scores on the BCS have been found to positively relate to a plethora of well-being and career outcomes.
Calling Scale (CS)
Another unidimensional measure of calling is the CS developed by Dobrow and Tosti-Kharas (2011). In the instrument development study, calling was defined as a consuming, meaningful passion people experience toward a domain. Dobrow and Tosti-Kharas (2011) tested the reliability and validity of the scale with 1,500 individuals from four samples in different domains: a 7-year, four-wave longitudinal study in the music domain; a 6-week, two-wave longitudinal study in the art domain; and two single-wave studies in the general business and management domains. Twelve items were included in the final version of the scale, and respondents used a 7-point response scale ranging from 1 (strongly disagree) to 7 (strongly agree). Scores on the scale correlated in the predicted directions with other calling measures and career-related constructs, which demonstrated convergent and discriminant validity.
Across four different samples, scores on the CS were found to positively correlate with work engagement, job involvement, intrinsic and extrinsic motivation, religiosity, domain satisfaction, career-related self-efficacy, career insight, and professional association involvement (Dobrow & Tosti-Kharas, 2011). In another longitudinal study of musicians using the same instrument, type of musical involvement and college major were significant predictors of initial calling (Dobrow, 2013). Specifically, noninstrumentalists with music-oriented majors started off with higher callings. In the same study, individuals who were more behaviorally involved and felt higher social comfort in the calling domain experienced higher levels of calling early (Dobrow, 2013). Interestingly, in contrast to the positive findings noted previously, Dobrow and Tosti-Kharas (2012) used the CS and found that those with a calling were less receptive over time to advice from trusted mentors that threatened their sense of calling; this finding points to a potential negative side of having a calling. In sum, the CS has been found to be reliable and valid and correlated with positive outcomes as well as a potential negative outcome of career foreclosure.
Multidimensional
Calling and Vocation Questionnaire (CVQ)
The CVQ (Dik et al., 2012) is a multidimensional measure of calling comprising 24 items and is divided into six subscales (Transcendent Summons Presence/Search, Purposeful Work Presence/Search, Prosocial Orientation Presence/Search) that form CVQ-Presence and CVQ-Search scores. Calling was conceptualized in line with Dik and Duffy’s (2009) three-part definition of calling noted previously. A four-level response scale ranging from 1 (not at all true of me) to 4 (absolutely true of me) was used to eliminate a neutral or midpoint response option. The instrument was developed in two studies. The first study utilized exploratory and confirmatory factor analyses in a cross-validated split-sample approach with 456 undergraduate students, and the CVQ was found to correlate in the hypothesized directions with work hope, prosocial motivation, life meaning, and the search for meaning, thus establishing convergent and discriminant validity. In Study 2, the same study that validated the BCS, a multitrait–multimethod analysis was used with 134 undergraduate students and 365 informants, and again, the CVQ demonstrated convergent and discriminant validity (Dik et al., 2012).
Numerous studies have used the CVQ to examine the relation of calling to work and well-being outcomes. The CVQ was administered to a sample of 855 Canadian college students, and a sense of calling was found to positively correlate with CDSE and outcome expectations (Domene, 2012). Among another sample of undergraduate students, the presence of a calling weakly correlated with life satisfaction and moderately correlated with religiousness and meaning in life (Duffy, Allan, & Bott, 2012a). Academic satisfaction and meaning in life were also found to fully explain the link between calling and well-being. Another study explored the link between calling and academic satisfaction and revealed that calling moderately correlated with academic satisfaction, and this link was mediated by CDSE and work hope (Duffy, Allan, & Dik, 2011a). Finally, in a study of working adults, Duffy, Dik, and Steger (2011b) found that approximately half of the sample identified as having a calling and that calling weakly correlated with withdrawal intentions and moderately correlated with career commitment, job satisfaction, and organizational commitment. The authors found career commitment to fully mediate the calling–job satisfaction relation, partially mediate the calling–organizational commitment relation, and act as a suppressor in the relation between calling and withdrawal intentions. As evidenced by the results of these studies, scores on the CVQ mirror findings from studies utilizing alternate measures of calling.
Multidimensional Calling Measure (MCM)
A second multidimensional scale used to measure calling is the MCM (Hagmaier & Abele, 2012). Hagmaier and Abele conceptualized calling as a career with which one strongly identifies, that contributes to a sense of meaning, and that is guided by a transcendent force. Participants respond to this 9-item measure of calling on a 6-point scale ranging from 1 (strongly disagree) to 6 (strongly agree). The MCM consists of three subscales: Identification and Person–Environment Fit (IP), Transcendent Guiding Force (TGF), and Sense and Meaning and Value-Driven Behavior (SMVB). The MCM development occurred over four different studies. Study 1 was qualitative and established five core categories of the experience of calling, and Study 2 validated the core categories and developed a quantitative measure of calling. Studies 3 and 4 tested the quantitative measure by conducting confirmatory factor analyses and established convergent and criterion validity in German and U.S. samples.
Hagmaier and Abele (2012) found all three subscales of the MCM to correlate with job satisfaction in a German sample. The same correlations were found in a U.S. sample, but a measure of burnout was also included, with which the MCM had a negative relation. The TGF subscale was most closely related to calling, and the IP subscale was more related to job satisfaction. In both studies, the MCM positively correlated with the BCS. Scores on the MCM did vary across samples; whereas the U.S. sample endorsed SMVB the most, those in the German sample were most likely to endorse IP. These results reiterate the consistently found positive relation between calling and job satisfaction and also provide evidence for cross-cultural differences in the experience of calling.
The Present Study
In this study, a large and diverse group of adults will complete the five most used/valid calling instruments (Calling Paragraph, BCS, CS, CVQ, and MCM) at two time points 3 months apart. Using these data, we will first examine the normality, reliability, and temporal stability of each instrument. Second, using confirmatory factor analysis, we will explore whether the instruments are measuring one underlying construct. Third, logistic regression will be used to examine the ability of each instrument to accurately predict having a calling (yes or no). Fourth, scores on each instrument will be correlated with career commitment, work meaning, and job satisfaction at Time 1 and 3 months later. It is hoped that findings from this study will provide clarity in the assessment of work as a calling.
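The third analytic step, predicting the yes/no calling item from an instrument's score, can be illustrated with a minimal sketch. It assumes a single predictor and fits the model by plain gradient ascent; the data, function names, learning rate, and number of epochs are all hypothetical choices for illustration and are not the software or settings used in this study.

```python
from math import exp

def sigmoid(z):
    """Logistic function mapping any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(scores, has_calling, learning_rate=0.1, epochs=5000):
    """Fit P(calling = yes) = sigmoid(intercept + slope * score).

    Uses batch gradient ascent on the log-likelihood. Scores are
    mean-centered during fitting so the updates stay well behaved.
    """
    n = len(scores)
    mean_score = sum(scores) / n
    centered = [s - mean_score for s in scores]
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(centered, has_calling):
            error = y - sigmoid(b0 + b1 * x)
            grad0 += error
            grad1 += error * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    # Convert the intercept back to the original (uncentered) score scale.
    return b0 - b1 * mean_score, b1

def predict_probability(score, intercept, slope):
    """Predicted probability of answering yes to the calling item."""
    return sigmoid(intercept + slope * score)

# Hypothetical instrument scores and yes (1) / no (0) answers, not study data.
scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 4, 8, 6]
has_calling = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]

intercept, slope = fit_logistic(scores, has_calling)
predicted = [1 if predict_probability(s, intercept, slope) >= 0.5 else 0
             for s in scores]
accuracy = sum(p == y for p, y in zip(predicted, has_calling)) / len(scores)
```

A positive slope means that higher instrument scores go with a higher predicted probability of reporting a calling, and the classification accuracy gives one simple way to compare instruments on this criterion.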
Method
Participants
The sample comprised 897 adults, 473 (53%) of whom were female, 418 (47%) of whom were male, and 6 (<1%) of whom were transgender; their mean age was 33.27 (SD = 10.79). Participants were recruited through Amazon’s Mechanical Turk (MTurk), a participant recruitment website where individuals usually earn between US$0.20 and US$0.50 per 30 min for completing questionnaires. For this study, we compensated participants US$0.50 for completing our questionnaire. Participants were predominantly White (n = 708, 79%), followed by African American (n = 62, 7%), Hispanic/Latino/a (n = 47, 5%), Asian/Asian American (n = 36, 4%), Asian Indian (n = 18, 2%), American Indian (n = 11, 1%), Pacific Islander (n = 2, <1%), Arab American (n = 1), and other (n = 11, 1%); two participants did not respond to this question. Regarding the highest level of educational attainment, 6 (1%) reported some high school, 78 (9%) high school, 29 (3%) trade/vocational school, 294 (33%) some college, 344 (38%) college, and 143 (16%) graduate/professional school; 3 participants did not respond to this question. Participants identified their social class as follows: lower class (n = 55, 6%), working class (n = 331, 37%), middle class (n = 431, 48%), upper middle class (n = 68, 8%), and upper class (n = 5, <1%); seven participants did not respond to this question. For participants who identified their annual household income (n = 411), the average was US$46,185.
Instruments
Having a calling
Participants were asked the following item: “People sometimes describe having a calling in life, often to a specific job or career. Do you have a calling?” They were given options of yes and no. In total, 443 participants (49%) answered yes on this item and 454 (51%) answered no.
Calling Paragraph
In Wrzesniewski et al.’s (1997) study, the authors presented participants with three paragraphs describing work as a job, career, or calling. These same three paragraphs were also presented in this study, and participants were asked to rate how much they are like the person described in each paragraph on a 4-point Likert-type scale ranging from very much to not at all. The Calling Paragraph was as follows: “C’s work is one of the most important parts of his life. He is very pleased that he is in this line of work. Because what he does for a living is a vital part of who he is, it is one of the first things he tells people about himself. He tends to take his work home with him and on vacations, too. The majority of his friends are from his place of employment, and he belongs to several organizations and clubs relating to his work. Mr. C feels good about his work because he loves it, and because he thinks it makes the world a better place. He would encourage his friends and children to enter his line of work. Mr. C would be pretty upset if he were forced to stop working, and he is not particularly looking forward to retirement.”
BCS
The degree to which participants perceived a calling in their career was assessed by the Presence subscale of the BCS (Dik et al., 2012). Participants were presented with the following prompt: “Some people, when describing their careers, talk about having a ‘calling.’ Broadly speaking, a ‘calling’ in the context of work refers to a person’s belief that she or he is called upon (by the needs of society, by a person’s own inner potential, by God, by Higher Power, etc.) to do a particular kind of work. Although at one time most people thought of a calling as relevant only for overtly religious careers, the concept is frequently understood today to apply to virtually any area of work.” Using this prompt, the 2 items on this subscale were “I have a calling to a particular kind of work,” and “I have a good understanding of my calling as it applies to my career.” Items were answered on a 5-point scale ranging from (1) not at all true of me to (5) totally true of me, and scores on the 2 items were added for a total presence of calling score. This scale has been used in numerous previous studies with working adults (e.g., Duffy & Autin, 2013; Duffy et al., 2013; Torrey & Duffy, 2012) and has demonstrated adequate convergent and divergent validity and reliability (Dik et al., 2012). For instance, in the validation study, Dik et al. (2012) found the BCS scores to correlate positively with scores on other measures of calling and with informant reports of participants’ perceptions of calling. BCS scores have also been correlated with aspects of career maturity and well-being in previous studies (Duffy & Sedlacek, 2007, 2010). In this study, the correlation of these 2 items was .79 at Time 1 and .83 at Time 2.
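As an illustration of the scoring just described, the sketch below sums the two presence items and computes their inter-item Pearson correlation, the statistic reported in place of an alpha coefficient for a 2-item scale. The function names and the six responses are hypothetical and are not data from this study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def bcs_presence_score(item1, item2):
    """Sum the two presence items (each rated 1-5), giving a total of 2-10."""
    for rating in (item1, item2):
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError("BCS items are rated on a 1-5 scale")
    return item1 + item2

# Hypothetical responses from six participants (illustration only).
item1 = [1, 2, 4, 5, 3, 5]  # "I have a calling to a particular kind of work"
item2 = [1, 3, 4, 4, 2, 5]  # "I have a good understanding of my calling..."

totals = [bcs_presence_score(a, b) for a, b in zip(item1, item2)]
inter_item_r = pearson_r(item1, item2)
```

The same `pearson_r` helper could be applied to Time 1 and Time 2 totals to obtain a test–retest coefficient.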
CVQ
The CVQ (Dik et al., 2012) was developed based on Dik and Duffy’s (2009) three-part definition of calling. The CVQ is a 24-item questionnaire that assesses both the presence of and search for a calling. In this study, only the 12-item Presence Scale was used, which contains 3- and 4-item subscales assessing Transcendent Summons, Purposeful Work, and Prosocial Orientation. Example items include “I was drawn by something beyond myself to pursue my current line of work,” “My work helps me live out my life purpose,” and “My work contributes to the common good.” Participants were asked to respond on a 4-point scale ranging from not at all true of me to absolutely true of me. In their instrument development study, Dik et al. (2012) found an internal consistency of .88 and a 1-month test–retest reliability of r = .75. Calling was also found to correlate in the expected directions with life meaning, work hope, and two other measures of calling. In this study, the estimated internal consistency of the Presence Subscale scores was .91 at Time 1 and .91 at Time 2.
MCM
The MCM (Hagmaier & Abele, 2012) is an instrument that assesses calling according to three dimensions: IP with one’s work, SMVB, and a TGF. These three dimensions (identification, meaning, and guiding force) are each assessed by 3 items on the MCM, with the total scale containing 9 items. Example items include “I am passionate about doing my job,” “My job helps make the world a better place,” and “An inner voice is guiding me in doing my job.” Participants answered each item on a 6-point Likert-type scale ranging from strongly disagree to strongly agree. In their instrument development study, Hagmaier and Abele (2012) found that the three-factor model was a good fit with the data, that the subscales and overall scale have strong internal consistency reliability and test–retest reliability over a 3.5-month period, and that the measure was equally valid with German and English samples. The authors also found the MCM to strongly correlate with the BCS (r = .61) and to be predictive of job satisfaction and work disengagement above and beyond demographic variables. In this study, the estimated internal consistency reliability of the total scale scores was .92 at Time 1 and .92 at Time 2.
CS
The degree to which participants felt a calling, defined as a “consuming, meaningful passion people experience toward a domain” (p. 1001), was assessed by Dobrow and Tosti-Kharas’s (2011) CS. The scale consists of 12 items that participants answered on a 7-point Likert-type scale ranging from strongly disagree to strongly agree. The authors recommend that items be answered according to a specific domain, and in the instrument development study, the domains of music, art, business, and management were used. As such, example items were “I am passionate about playing my instrument/singing/engaging in my artistic specialty/business/being a manager” and “I would sacrifice everything to be a musician/an artist/in business/a manager.” In this study, we adapted the instrument to be applicable to those in all occupations. For example, these 2 items were reworded to be “I am passionate about my work” and “I would sacrifice everything to do my job.” In the instrument development study using multiple samples at different time points, Dobrow and Tosti-Kharas (2011) found the scale to have strong internal consistency reliability, strong test–retest reliability, and to be equally valid across different occupational groups. Additionally, this study and several others (Dobrow, 2013; Dobrow & Tosti-Kharas, 2012) have found scores on this measure to correlate in the expected directions with other calling measures, intrinsic motivation, self-efficacy, job involvement, behavioral involvement, and choosing careers related to one’s calling. In this study, the estimated internal consistency reliability of scale scores was .95 at Time 1 and .96 at Time 2.
Work meaning
The degree to which participants felt their work was meaningful was measured by the Work as Meaning Inventory (WAMI; Steger, Dik, & Duffy, 2012). The WAMI is composed of 10 items with three subscales: positive meaning (4 items), meaning making through work (3 items), and greater good motivations (3 items). Example items from each subscale are, “I understand how my work contributes to my life’s meaning,” “My work helps me better understand myself,” and “I know my work makes a positive difference in the world.” Items were answered on a 7-point Likert-type scale ranging from strongly disagree to strongly agree. Steger et al. (2012) found scores on the total scale to have strong internal consistency reliability (α = .93) and to correlate in the expected directions with career commitment and job satisfaction. In this study, the estimated internal consistency reliability for the total scale scores was .94 at Time 1 and .95 at Time 2.
Career commitment
The Career Commitment Scale (Blau, 1985, 1988) was used to measure one’s level of commitment to one’s occupation or career field. This is a 7-item scale with example items being, “I like my job too well to give it up” and “My current job is an ideal line of work.” Items were answered on a 5-point scale ranging from strongly disagree to strongly agree, and scores on the 7 items were added together to get a total career commitment score. The scale has been found to be reliable (e.g., αs of .83 and .84 in two samples reported by Blau, 1988), and convergent and discriminant validity has been demonstrated across multiple samples and time points (Blau, 1985, 1988). In this study, the estimated internal consistency reliability of scale scores was .91 at Time 1 and .92 at Time 2.
Job satisfaction
The degree to which participants felt satisfied with their current job was measured by a 5-item scale developed by Judge, Locke, Durham, and Kluger (1998). Example items are, “I feel fairly well satisfied with my present job,” “Most days I am enthusiastic about my work” and “I find real enjoyment in my work.” Items were answered on a 7-point Likert-type scale ranging from strongly disagree to strongly agree. Strong internal consistency reliability was found for the 5-item scale by Judge et al. (1998), and scores on the instrument were found to correlate in the expected directions with well-established measures of job satisfaction and core self-evaluations. Additionally, previous studies have linked scores on this instrument to the experience of a calling (Duffy et al., 2012b, 2013). In this study, the estimated internal consistency reliability of scale scores was .91 at Time 1 and .92 at Time 2.
Procedure
The data used for this study were taken from the Career and Life Longitudinal Study (CALLS). The CALLS is a five-wave survey that gathered data from American working adults over a 1-year period in 3-month intervals. The primary data used for this study are taken from the first wave, with the second wave being used to examine the 3-month test–retest reliability of the instruments. A total of 1,129 participants completed Wave 1 of this study. Given the large sample size, we removed 232 individuals who were missing data on any of the scales used in the analyses. This resulted in a final total sample of 897 adults. Of this group, between 360 and 373 participants completed the same instruments 3 months later. This represents approximately 41% of the sample from Time 1. We conducted t tests to examine group differences in the five calling instruments plus having a calling for those who did or did not complete both waves of the survey. No significant differences were found on any of these measures, indicating that sense of calling did not differ between those who completed one wave and those who completed both.
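The sample-size arithmetic and the attrition comparison can be sketched as follows. The counts come from the paragraph above; the choice of Welch's t statistic is an assumption, since the text does not say which form of t test was run, and the two score lists are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Counts reported in the text.
wave1_total = 1129
removed_incomplete = 232
final_n = wave1_total - removed_incomplete  # 897

# Between 360 and 373 participants completed each instrument at Time 2.
retention_low = 360 / final_n
retention_high = 373 / final_n

def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups.

    Assumption: the paper does not specify the t-test variant; Welch's
    form is used here because it does not require equal variances.
    """
    na, nb = len(group_a), len(group_b)
    standard_error = sqrt(variance(group_a) / na + variance(group_b) / nb)
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical Time 1 calling scores (illustration only, not study data).
completed_both_waves = [6, 7, 5, 8, 6, 7]
completed_wave1_only = [6, 6, 5, 8, 7, 6]

t_statistic = welch_t(completed_both_waves, completed_wave1_only)
```

A t statistic close to zero, as in this made-up example, is the pattern that would indicate no meaningful difference between those who returned for the second wave and those who did not.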
For all waves of this survey, participants were recruited through MTurk, an online participant source. MTurk has been used increasingly by researchers in recent years, and several studies have examined the reliability and validity of data collected through this service (see Buhrmester, Kwant, & Gosling, 2011; Goodman, Cryder, & Cheema, 2013). Compared to Internet and community samples, MTurk samples are equally reliable and diverse, and they better represent the population than samples of college students (Buhrmester et al., 2011; Goodman et al., 2013; Paolacci et al., 2010).
Regarding validity, concerns have been raised about individuals’ motivation to participate in studies on MTurk, given the relatively low compensation rate. Addressing these concerns, Buhrmester, Kwant, and Gosling (2011) found that some MTurk users participated both for financial gain and for leisure and that data quality was similar regardless of compensation rate. Additionally, Goodman et al. (2013) found that many classic studies in areas such as behavioral economics and decision making replicate with MTurk data. In light of the recent data demonstrating MTurk as a valid, generalizable, and cost-effective participant source, it was deemed an appropriate method of recruitment for the present study.
On the MTurk website, participants have “worker” accounts and researchers have “requester” accounts. Requesters create “hits” leading to tasks. It is also possible to set limitations as to who can participate in a given hit; thus, we limited the access of our questionnaire to working adults in the United States. The hit we created consisted of a link to our questionnaire; after respondents finished the questionnaire, they were given a code, which they then entered into MTurk in order to receive compensation (US$0.50). To collect follow-up data, we utilized MTurk’s “Bonus” feature, which allows requesters to award minimum bonus compensation of US$0.01 to workers, along with a message. We used this function to contact workers with the link to the next wave of the study, awarding them US$0.01 with instructions for completing the survey. All participants from the first wave were contacted through the Bonus system 3 months after they completed the original questionnaire. Reminder messages (also using the Bonus system) were sent approximately 1 and 2 weeks after the initial bonus message, so participants received three bonus messages in total inviting them to take the second round of surveys. After completing the questionnaire linked in the bonus message, participants were awarded US$0.50.
Results
Normality, Reliability, and Temporal Stability
The normality, reliability, and temporal stability of each instrument were assessed. As seen in Table 1, all multi-item instruments had strong estimated internal consistency reliability/correlations ranging from .79 to .95. None of the measures had skewness levels that exceeded one, though the BCS and Calling Paragraph each had kurtosis levels over one in the negative direction. This suggests that there was more of a peak in the proportion of respondents answering at the low end of each scale (not feeling a calling) than would be expected from a normal curve. The temporal stability of each instrument was assessed by correlating scores from an identical survey taken 3 months later. Between 360 and 373 participants completed the same instrument at both time points. As seen in Table 1, all instruments had strong test–retest reliabilities, ranging from .61 to .75.
Table 1. Reliability and Normality Information for Calling Instruments.
ᵃThese are correlations.
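The descriptive checks summarized above (skewness, kurtosis, and a test–retest correlation) can be sketched in a few lines of Python. This is a minimal illustration only: the response vectors are invented for demonstration, the helper names (`skewness`, `excess_kurtosis`, `pearson_r`) are hypothetical, and the simple moment-based formulas used here may differ slightly from the bias-corrected versions that statistical packages report.

```python
import math

def _central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0

def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Invented 1-5 scale scores for ten respondents at two time points
# (illustrative only; not data from this study).
time1 = [1, 1, 1, 2, 2, 3, 4, 5, 5, 5]
time2 = [1, 2, 1, 2, 3, 3, 4, 4, 5, 5]

print("skewness:", round(skewness(time1), 2))
print("excess kurtosis:", round(excess_kurtosis(time1), 2))
print("test-retest r:", round(pearson_r(time1, time2), 2))
```

In this invented example, responses pile up at both ends of the scale, which yields a negative excess kurtosis value.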
One Underlying Construct
The degree to which all five instruments were measuring one underlying construct was assessed. Confirmatory factor analysis was conducted to examine (a) a model where scores on all five instruments were hypothesized to be correlated and (b) a model where scores on all five instruments were hypothesized to be correlated and load on a higher order factor. Each calling measure was represented by its own construct, except for the Calling Paragraph, since it is only 1 item. For the multidimensional measures (i.e., the CVQ and the MCM), subscales were calculated and used to load onto each factor. The BCS was not parceled, since it only contains 2 items. For the CS, items were parceled using the method recommended by Weston and Gore (2006). Specifically, parcels were created by conducting an exploratory factor analysis for the items in the CS and assigning items to parcels in countervailing order according to the magnitude of factor loading, in order to have relatively equal loadings for each parcel. Three parcels were created for the scale.
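One possible reading of the "countervailing order" parceling step is a serpentine assignment: items are ranked by factor loading and dealt into parcels left to right, then right to left, so that each parcel ends up with a similar average loading. The sketch below illustrates that reading under stated assumptions; the function names and the six loadings are hypothetical and are not the values or the exact procedure used in this study.

```python
def assign_parcels(loadings, n_parcels=3):
    """Distribute items across parcels in serpentine (countervailing) order.

    `loadings` maps item name -> factor loading from an exploratory factor analysis.
    """
    ranked = sorted(loadings, key=lambda item: abs(loadings[item]), reverse=True)
    parcels = [[] for _ in range(n_parcels)]
    for rank, item in enumerate(ranked):
        block, position = divmod(rank, n_parcels)
        # Reverse direction on every other pass so strong and weak items balance out.
        index = position if block % 2 == 0 else n_parcels - 1 - position
        parcels[index].append(item)
    return parcels

def mean_loading(parcel, loadings):
    """Average loading of the items assigned to one parcel."""
    return sum(loadings[item] for item in parcel) / len(parcel)

# Hypothetical loadings for a 6-item scale (not the values from this study).
example_loadings = {"item1": 0.90, "item2": 0.80, "item3": 0.70,
                    "item4": 0.60, "item5": 0.50, "item6": 0.40}
example_parcels = assign_parcels(example_loadings)
for parcel in example_parcels:
    print(parcel, round(mean_loading(parcel, example_loadings), 2))
```

With these made-up loadings, the strongest item is paired with the weakest, the second strongest with the second weakest, and so on, so the three parcels have equal mean loadings.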
To evaluate the models, we used confirmatory factor analysis with maximum likelihood estimation in AMOS 18 (Arbuckle, 2007). Indices of fit that minimized the likelihood of Type I and Type II error were chosen (Hu & Bentler, 1999). These included the chi-square test (χ²), the comparative fit index (CFI), the standardized root mean square residual (SRMR), and the root mean square error of approximation (RMSEA). A significant χ² can indicate a poor fitting model, but this test is not reliable in larger samples (Tabachnick & Fidell, 2013). In determining the cutoff criteria for the remaining indices, we followed recommendations by Hu and Bentler (1999). Recommended ranges are as follows: ≥.90 as acceptable fit to ≥.95 as excellent fit for CFI, ≤.10 as marginal fit to ≤.05 as excellent fit for SRMR, and ≤.10 as marginal fit to ≤.05 as excellent fit for RMSEA. Some scholars have argued that researchers should be careful when using these criteria as strict cutoffs and should consider other factors, such as sample size and model complexity, when judging the fit of models (Weston & Gore, 2006).
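The cutoff ranges above can be expressed as a small lookup, sketched here in Python. This is an illustration rather than part of the analysis: the function names and the "poor" label for values outside the stated ranges are assumptions, and the SRMR and RMSEA bounds are read as upper limits because smaller values of those indices indicate better fit. As the caution from Weston and Gore (2006) implies, such labels are a reading aid and not a strict decision rule.

```python
def classify_higher_is_better(value, acceptable=0.90, excellent=0.95):
    """Label an index such as CFI, where larger values mean better fit."""
    if value >= excellent:
        return "excellent"
    if value >= acceptable:
        return "acceptable"
    return "poor"  # label assumed for values below the stated range

def classify_lower_is_better(value, excellent=0.05, marginal=0.10):
    """Label an index such as SRMR or RMSEA, where smaller values mean better fit."""
    if value <= excellent:
        return "excellent"
    if value <= marginal:
        return "marginal"
    return "poor"  # label assumed for values above the stated range

def summarize_fit(cfi, srmr, rmsea):
    """Apply the cutoff ranges to one model's fit indices."""
    return {
        "CFI": classify_higher_is_better(cfi),
        "SRMR": classify_lower_is_better(srmr),
        "RMSEA": classify_lower_is_better(rmsea),
    }

# Hypothetical indices for illustration only.
print(summarize_fit(cfi=0.93, srmr=0.06, rmsea=0.08))
```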
Correlational model
In this model, the constructs were only allowed to correlate, rather than load onto a higher order factor. The measurement model initially had acceptable fit indices: χ²(45) = 562.66, p < .001, CFI = .95, SRMR = .05, and RMSEA = .11, p < .001. However, two pairs of subscales from different calling measures had very similar content: the MCM TGF and CVQ Transcendent Summons subscales, and the MCM Sense and Meaning and CVQ Purposeful Work subscales. The fit of the model improved after allowing the errors of these subscales to correlate, χ²(43) = 460.44, p < .001, CFI = .96, SRMR = .04, and RMSEA = .10, p < .001. Each item/parcel loaded on its respective factor at a value of .77 or higher, and the factors correlated at values of .54 or higher. The strong CFI values indicate high correlations among the model constructs; however, the borderline RMSEA suggests that the model may be overly complex.
Higher order model
The higher order model built upon the measurement model by having each construct load onto a second-order calling factor. The same two subscale errors were allowed to correlate as in the measurement model. The fit of the higher order model was worse than that of the measurement model, χ²(48) = 645.28, p < .001, CFI = .94, SRMR = .07, and RMSEA = .12, p < .001. Each calling measure loaded on the construct at a value of .67 or higher. The higher order factor explained 39% of the variance in the BCS, 85% in the CS, 97% in the MCM, and 67% in the CVQ. As with the correlational model, the strong CFI indicates strong correlations among the constructs, but the unacceptable RMSEA value indicates that the model is overly complex.
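The RMSEA values reported for the three models can be approximately recovered from each χ² and its degrees of freedom using the common point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The sketch below assumes N = 897, the number of respondents who answered the yes/no calling item; the sample size actually used in the confirmatory factor analyses is not stated in this section, so this value is an assumption, as are the function and variable names.

```python
import math

def rmsea(chi_square, df, n):
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Assumption: sample size taken from the yes/no calling item counts (443 + 454),
# not from the CFA output itself.
ASSUMED_N = 897

# Chi-square values and degrees of freedom as reported in the text.
models = {
    "correlational (initial)": (562.66, 45),
    "correlational (correlated errors)": (460.44, 43),
    "higher order": (645.28, 48),
}
for name, (chi_square, df) in models.items():
    print(name, round(rmsea(chi_square, df, ASSUMED_N), 2))
```

Under this assumed sample size, the formula gives .11, .10, and .12, matching the reported values to two decimals.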
Relation to Having a Calling
Participants were asked a simple, face valid, yes/no question concerning having a calling: “People sometimes describe having a calling in life, often to a specific job or career. Do you have a calling?” Of our sample, 443 (49.4%) answered no to this item and 454 (50.6%) answered yes. We used logistic regression to examine the relation of each of the five instruments to having a calling and to determine how accurately each instrument predicted the presence of a calling. All predictor variables were standardized to allow for comparisons across the regression equations. Table 2 reports the χ², Nagelkerke R², B, and odds ratio for each model. All models were significant according to the χ². The BCS and CVQ were the best predictors of having a calling, followed by the MCM, CS, and Calling Paragraph. The odds ratios provided additional evidence of the link between these variables. Because all variables were standardized, each odds ratio indicates the factor by which the odds of having a calling are multiplied for a one standard deviation increase in the instrument. For example, with a one standard deviation increase in the BCS, the odds that a participant has a calling are multiplied by 8.08.
Table 2. Binary Logistic Regression Examining Relation of Calling Instrument to Having a Calling.
Note. CI = confidence interval. Scale scores are standardized.
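To make the odds ratio interpretation concrete, the following Python sketch converts between a logistic regression coefficient and an odds ratio and shows what a one standard deviation increase would do to a respondent's odds. Only the BCS odds ratio of 8.08 comes from the text; the baseline of even odds (a probability of .50 at the mean of the predictor) and all function names are assumptions made for illustration.

```python
import math

def odds_ratio_from_b(b):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(b)

def probability_from_odds(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

def odds_after_shift(baseline_odds, odds_ratio, sd_units=1.0):
    """Odds after moving the standardized predictor by `sd_units` standard deviations."""
    return baseline_odds * odds_ratio ** sd_units

bcs_odds_ratio = 8.08                 # reported in the text for the BCS
implied_b = math.log(bcs_odds_ratio)  # coefficient implied by that odds ratio
baseline_odds = 1.0                   # assumption: even odds at the predictor mean

shifted_odds = odds_after_shift(baseline_odds, bcs_odds_ratio)
print("implied B:", round(implied_b, 2))
print("probability at +1 SD:", round(probability_from_odds(shifted_odds), 2))
```

Under the assumed even baseline, a one standard deviation increase in the BCS moves the odds from 1.0 to 8.08, which corresponds to a probability of about .89.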
Relation to Work Outcomes
Finally, the correlations of each of the five scale scores with work meaning, career commitment, and job satisfaction were examined at baseline and 3 months later. As seen in Table 3, moderate to strong correlations were found between each calling instrument and each of these work outcomes at baseline. Specifically, the Calling Paragraph, the CS, and the MCM each strongly correlated with all three work outcomes. Table 3 also shows the correlations of each instrument with these work outcomes 3 months later. Although the correlations between the instruments and work outcomes were weaker than at baseline, moderate to strong correlations were still found between scores on each instrument and Time 2 work meaning, career commitment, and job satisfaction. The only exception was a weak correlation between baseline BCS score and Time 2 job satisfaction (r = .25).
Table 3. Descriptive Information and Correlations of Calling With Work and Well-Being Outcomes.
Note. All correlations significant at the p < .01 level.
Discussion
Research on work as a calling has grown dramatically over the last 10 years, and it is important to explore the complexities of assessing this construct. The goals of the current study were to examine (1) the normality, reliability, and temporal stability of calling as measured by each instrument, (2) the degree to which each of these instruments measured one underlying construct, (3) the ability of each instrument to accurately predict having a calling (yes or no), and (4) the relation of each instrument to work meaning, career commitment, and job satisfaction at two time points.
First, we examined the normality, reliability, and temporal stability of each instrument. The high estimated internal consistency/correlations (for the 4 multi-item instruments) and test–retest reliability for each instrument suggest that all five measures are reliable and relatively stable over time. However, the BCS and Calling Paragraph demonstrated negative kurtosis, indicating that participants disproportionately answered at the low end of each scale. Because only half of our total sample noted having a calling, these disproportionate responses may be capturing participants who simply do not feel called to their work at any level. The BCS, in particular, was the instrument most correlated with the face valid having a calling item, perhaps suggesting that this instrument is getting directly at the presence of a calling. The large group of participants who do not feel a calling likely answered not at all true of me for both items on the BCS.
To begin assessing construct validity, a confirmatory factor analysis was used to determine the extent to which all five instruments were measuring one underlying construct. The higher order calling model approached acceptable fit and each calling measure loaded onto the higher order calling construct. This result supports the notion that diverse research on calling using different measures is likely tapping into the same basic construct. However, the correlational model, which did not include a higher order factor, fit slightly better. This occurred despite the very high correlations between some of the constructs and the redundancy in some subscales of the different measures. Therefore, although the different measures of calling appear to be measuring aspects of an underlying construct, different conceptualizations and operationalizations of calling have resulted in different components (e.g., prosocial impact, passion) being included in some measures but not others. This likely has led these scales to measure slightly different constructs that may better operate independently. Despite these slightly differing conceptualizations of calling, the evidence of each generally reflecting the same higher order construct provides moderate assurance that studies using different instruments may be synthesized with caution.
Given the evidence that each is potentially measuring independent constructs, however, it is important to understand which of these best reflects actually having a calling. This was assessed through logistic regression.
The best predictor of answering yes to the face-valid calling question was the BCS, followed by the CVQ, MCM, CS, and Calling Paragraph. Interpretation of this finding may lie in the item content of each of the scales.
As noted previously, although each of the scales is likely measuring one higher order factor, they reflect different theoretical explanations and definitions of calling. This may be important in explaining why the BCS was more effective than other measures in predicting having a calling. When examining the item content of these measures, the BCS is the broadest and most face valid of all the measures. It does not measure particular facets of having a calling but simply the extent to which one feels called in general. Although a brief conceptualization of calling is explained in the instructions for the BCS, it does not include any theory-specific definitions of calling. Thus, participants are able to answer the question in a way that satisfies their individual conceptualization of a calling and reflects the higher order factor that all of the calling instruments attempt to measure.
The CVQ was the next best predictor of having a calling. Similar to the BCS, the CVQ contains several items that demonstrate face validity, such as “I am pursuing my current line of work because I believe I have been called to do so.” This inclusion of items with greater face validity may increase the utility of measuring the calling construct as opposed to correlates of calling. It may also be that the theoretical basis for the CVQ is a more sound reflection of the underlying construct. Although the MCM, CS, and Calling Paragraph all significantly predicted having a calling, they were weaker predictors, indicating that they may actually be measuring correlates of the higher order factor. For example, the item content of the CS largely targets one’s affective associations with their work (e.g., I enjoy my work more than anything else) and work salience (e.g., My work is always on my mind in some way). Although one who is called to a particular job may very well enjoy it immensely and think of it often, this experiential description may not be indicative of calling itself but of something closely related. The stronger predictive quality of the CVQ may indicate that its subscales come closer than the other measures to reflecting the true facets of the higher order calling construct. Although the BCS may be useful for predicting simply having a calling, it is important to measure the specific aspects of the higher order construct in order to understand the development of calling as well as how it operates across groups. These results indicate that, of the extant instruments, the CVQ is the closest to accurately representing the components that contribute to the lived experience of having a calling.
Finally, we examined the relations between the five calling instruments and work meaning, career commitment, and job satisfaction at baseline and 3 months later. Matching results from previous studies (Duffy et al., 2012b, 2013), apart from the BCS–job satisfaction correlation, each calling instrument moderately to strongly correlated with each work outcome at both time points. This suggests that all of the measures may be useful in predicting positive work outcomes. Some instruments, however, had stronger correlations than others. This may be related to the previously stated speculation that the MCM, CS, and Calling Paragraph may be better measures of outcomes and correlates of calling than of the calling construct itself. For example, the MCM correlates more strongly with work meaning than it does with having a calling, and it is the strongest of all the measures in predicting work meaning. Thus, perhaps the MCM is more effectively detecting work meaning rather than calling. Likewise, although the CS was one of the weakest predictors of having a calling, it correlated strongly with career commitment. This indicates that, although the CS does predict calling to an extent, it may be more effective in measuring career commitment. Conversely, whereas the BCS and CVQ were the best predictors of having a calling, they had weaker correlations with work outcomes than the other instruments. This may imply that the BCS and CVQ best capture the pure presence of a calling but are weaker predictors of work outcomes.
Taken together, these results provide insight into the validity of existing calling measures. Although each measure generally reflected a unified higher order construct, the evidence suggests that the theoretical underpinnings of each instrument were distinct enough to merit evaluation of which maintained the strongest construct validity. The best measurement of having a calling was the largely atheoretical BCS, though it is limited in its ability to provide information on specific facets of calling. The best multidimensional predictor of having a calling was the CVQ. Results indicated that the remaining instruments may be better utilized in predicting outcomes of calling rather than calling itself. These results provide direction regarding which instruments to use in future research as well as demonstrate the need to further identify factors comprising the unexplained variance in the higher order construct.
Practice Recommendations
The findings from this study may have important implications for how future researchers and practitioners assess work as a calling. Results suggest that all of the measures included in this study are more similar than different and come close to reflecting one underlying construct. Apart from the moderate correlations of the BCS with the Calling Paragraph and CS, all of the scales were strongly correlated with one another. Each also significantly predicted having a calling and moderately to strongly predicted work meaning, career commitment, and job satisfaction at baseline and 3 months later. Thus, all of the examined instruments may be helpful in assessing calling and closely related constructs, such as meaning.
The results from this study, however, also indicate that there are important distinctions among these measures, particularly in regard to their potential utility in assessing calling versus predicting positive work outcomes of calling. Results suggest that the BCS and CVQ may be the most useful instruments for assessing actually having a calling. For practitioners who simply want to know whether or not their client feels called, the BCS might be the most appropriate option. As noted previously, this scale was most reflective of the overarching construct of calling and allows flexibility in how calling is conceptualized. For clinicians seeking to learn more about specific aspects of calling in a client’s life, results indicate that the CVQ might be the best option. With this scale, clinicians can get a sense of what areas may be targeted to increase the client’s sense of calling. For example, the client may have a clear presence of a Transcendent Summons but may still be searching for opportunities to help others through their work. This information might guide practitioners in creating an intervention (in this case finding opportunities for prosocial behavior) to promote the client’s sense of calling.
Limitations
A number of limitations should be taken into consideration when interpreting the results of this study. First, although a strength of the study is that it examined calling longitudinally, there were only two time points, spaced only 3 months apart. In the future, it will be important to study calling over a longer period of time in order to better inform the predictive validity of the calling instruments. Second, it is important to take into consideration that about 80% of our participants identified as White, and over half had obtained at least an undergraduate degree and reported a middle- or upper-middle-class background. Thus, our sample was relatively privileged and, in the future, it will be important to gather more data on ethnic minorities and those who report a lower socioeconomic background. Third, participation in this study was limited to working adults, which resulted in the exclusion of unemployed individuals and may have contributed to the relatively privileged nature of the sample. Future studies that look at more diverse populations, such as the unemployed, will help scholars to better understand how the construct of calling functions across groups.
Fourth, our sample was collected exclusively from the United States. Future studies might explore the implications of culture for the measurement of calling and its outcomes; this might include expanding the examination of calling measurement to outside the United States as well as exploring calling across cultures within the United States. Fifth, although this study provided further insight into the facets that comprise calling, more work is needed to tease apart correlates of calling from the construct itself. In addition to the quantitative methods utilized in this study, future examination of calling might draw on qualitative data, especially with regard to establishing a more crystallized theoretical definition of calling. Sixth, the data for this study were collected via the online data collection service MTurk, which poses problems in how we understand response rates and selection bias as a result of who was able to participate in the study. Specifically, since geographic location was not assessed, we were unable to examine potential geographic differences among the constructs in this study. Finally, our attrition rate was relatively high. Only 41% of participants from Time 1 followed up at Time 2. Although there were no differences in calling scores among those who completed one versus two waves, certain personality characteristics not measured in this study may be more prevalent in those who completed both waves.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
