Abstract
Knowing yourself requires knowing not only what you are like in general (trait self-knowledge) but also how your personality fluctuates from moment to moment (state self-knowledge). We examined this latter form of self-knowledge. Participants (248 people; 2,938 observations) wore the Electronically Activated Recorder (EAR), an unobtrusive audio recorder, and completed experience-sampling self-reports of their personality states four times each day for 1 week. We estimated state self-knowledge by comparing self-reported personality states with consensual observer ratings of personality states coded from the EAR files, which formed the criterion for what participants were “actually” like in the moment. People had self-insight into their momentary extraversion, conscientiousness, and likely neuroticism, suggesting that people can accurately detect fluctuations in some aspects of their personality. However, the evidence for self-insight was weaker for agreeableness. This apparent self-ignorance may be partly responsible for interpersonal problems and for blind spots in trait self-knowledge.
Imagine that one frazzled moment before your morning coffee, you contemptuously order your romantic partner to get out of your face when he or she attempts to talk to you while you are rushing to leave for work. What happens next might depend on whether you are aware of how you just acted. If you realize in the moment that you just acted like a jerk, you can apologize and try to make amends right away. Alternatively, if you continue with your day, oblivious to how you acted and failing to do anything about it, this might cause resentment to brew over time. Here, we tackle a critical piece of the self-knowledge puzzle: Do people know how they are acting in the moment?
Self-knowledge is defined as the degree to which a person’s self-views reflect what they are really like (Vazire & Carlson, in press; T. D. Wilson, 2009). Most previous self-knowledge research has focused on how well people know what they are typically like (trait self-knowledge), showing that there are bright spots and blind spots in people’s trait self-knowledge (Vazire, 2010; Vazire & Mehl, 2008). A related form of self-knowledge—and the focus of this article—is whether people know what they are like from one moment to the next (state self-knowledge). In other words, do people have insight into their personality states (i.e., thoughts, feelings, and behaviors over shorter periods of time; Fleeson, 2001)?
Although state self-knowledge has been a largely neglected phenomenon (cf. Gosling, John, Craik, & Robins, 1998), there are several reasons why it is important to examine whether people know what they are like from moment to moment. First, state self-knowledge may help people understand what they are like in general (trait self-knowledge). If the first step to understanding your general pattern of behaviors is the ability to recognize the instances that form the pattern, trait-level blind spots (e.g., not knowing that you are a jerk) may arise in part from a lack of awareness of one’s behavior in the moment (e.g., not realizing when you are being rude). Thus, identifying blind spots in state self-knowledge can help us understand how to improve both state and trait self-knowledge.
Second, state self-knowledge may pave the way for more comprehensive and contextualized forms of self-knowledge. Knowing only a person’s traits has been described as the “psychology of the stranger” (McAdams, 1995, p. 380), with deeper knowledge of a person coming from an understanding of the dynamic, contextual influences on fluctuations in the person’s personality states (McAdams, 1995; Vazire & Carlson, in press; R. E. Wilson & Vazire, 2015). When the person you seek to know is yourself, this implies that trait self-knowledge is a relatively rudimentary form of self-knowledge. To really know yourself, it is also important to understand the idiosyncratic influences (e.g., goals, social roles, biological states) that cause you to think, feel, and act differently from one moment to the next. State self-knowledge—the ability to accurately detect these personality fluctuations in the first place—is likely an important part of the process of forming and revising your ideas about what these influences might be.
State self-knowledge might also have immediate practical consequences. Although some evidence suggests that trait self-knowledge may predict better social relationships (Tenney, Vazire, & Mehl, 2013), being aware of and able to do something about your disagreeableness in the moment might be more useful than knowing that you are generally a disagreeable person.
Finally, assessing the accuracy of state self-reports can help researchers decide when to trust these measures. Many studies now use the experience-sampling method (ESM), which asks people to report on their momentary experiences and behaviors (Mehl & Conner, 2012). Because the ESM minimizes retrospective reporting biases, it is often touted as the gold standard for assessing in-the-moment experiences (Schwarz, Kahneman, & Xu, 2009). However, as with any self-report measure, ESM reports are valid only to the extent that people have self-insight into their momentary states. Thus, the validity of ESM reports for any given behavior or experience should be tested, rather than assumed.
The aim of the present research was to examine whether people (specifically, college students in North America) have self-knowledge about their personality states in their everyday lives. The biggest challenge in self-knowledge research is the criterion problem—how do we establish what people are actually like (i.e., the “ground truth”)? To assess whether people have accurate self-views, we must compare self-reports with an independent measure (i.e., an accuracy criterion). The ideal criterion would repeatedly capture a broad range of people’s behaviors in their natural environments using consensual observer ratings. We used the only method we know of that can do this in an unobtrusive way: the Electronically Activated Recorder (EAR; Mehl, 2017), a wearable device that repeatedly captures short audio recordings of people’s observable behaviors and environments. These recordings are coded by multiple observers, providing a reliable outside perspective on people’s real-world behaviors and allowing us to compare repeated self-perceptions of personality states with observer ratings in the same moments.
Method
We used data from the first wave of the longitudinal Personality and Interpersonal Roles Study (PAIRS). Other manuscripts have used the ESM personality-state variables (Beck & Jackson, 2018; Breil et al., in press; Finnigan & Vazire, 2018; R. E. Wilson, Harris, & Vazire, 2015; R. E. Wilson, Thompson, & Vazire, 2016) and other variables from this data set (Colman, Vineyard, & Letzring, 2018; Edwards & Holtzman, 2017; Solomon & Vazire, 2016; Weidman et al., in press), but this is the first article that examines within-person associations between self-reported and EAR-coded behavior.
Participants
The study involved 434 students at Washington University in St. Louis, who were recruited in 2012 and 2013 via flyers and classroom announcements across the campus. Participants were paid $20 for the initial laboratory-based assessment, were entered into a lottery with the opportunity to win $100 for completing ESM surveys (with a 1 in 10 chance of winning if all ESM surveys were completed), earned an additional $20 for wearing the EAR, and received a “time capsule” that contained feedback on how their personality had changed across the seven waves of the study.
The sample size of the original study was determined by the stopping rule of ending data collection when we reached the end of a semester and had recruited at least 400 participants. The number of ESM observations per participant was determined by our subjective impression of how many repeated measures we could obtain from each participant without compromising the quality of the data or the participants’ goodwill. After exclusions (described in the Data Exclusions section), the final subset of 248 participants (173 women, 74 men, 1 gender not reported) used in the current analyses ranged in age from 18 to 29 years (M = 19.17 years, SD = 1.8) and identified as Caucasian (n = 141), Asian (n = 57), Black (n = 24), American Indian or Alaska Native (n = 1), or other or multiple races (n = 18) or did not disclose their race (n = 7).
Procedure
Here, we describe the measures and procedures relevant to the current article. Codebooks for all measures in the larger study are available at https://osf.io/akbfj.
ESM measures
Four times per day (at 12 p.m., 3 p.m., 6 p.m., and 9 p.m.) for 15 days, participants received a text message notification and were e-mailed a link to a survey that contained ESM measures of Big Five personality states in the target hour (11 a.m.–12 p.m., 2 p.m.–3 p.m., 5 p.m.–6 p.m., and 8 p.m.–9 p.m.). Using the nine items that we adapted from the Big Five Inventory (John, Naumann, & Soto, 2008), participants reported their state extraversion (“quiet” [reverse-scored]; “outgoing, sociable”), agreeableness (“considerate, kind”; “rude” [reverse-scored]), conscientiousness (“reliable”; “lazy” [reverse-scored]), and neuroticism (“worried”; “relaxed” [reverse-scored]; “depressed, blue”) in each target hour (e.g., “From 11am–noon, how [outgoing, sociable] were you?”). Responses were made on a 5-point scale (1 = not at all, 5 = very). Participants completed the agreeableness items only if they reported that they were around other people during the target hour. We did not include ESM measures of Big Five openness in this wave of the study because we previously believed that the openness items on the Big Five Inventory would not translate well to a state measure (an issue we changed our mind about in later waves of data collection).
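As an illustration of how the reverse-scored items combine with the others, the following minimal sketch computes an observed scale score per personality state from one ESM report. It is not the authors' scoring code: the key analyses model the items as indicators of latent variables, and the item keys, the `scale_scores` helper, and the handling of missing items (the scale is left unscored) are assumptions made for this example.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5  # 5-point response scale (1 = not at all, 5 = very)

# Hypothetical item keys; the wording follows the items quoted above.
# Each entry is (item key, whether the item is reverse-scored).
SCALES = {
    "extraversion": [("quiet", True), ("outgoing_sociable", False)],
    "agreeableness": [("considerate_kind", False), ("rude", True)],
    "conscientiousness": [("reliable", False), ("lazy", True)],
    "neuroticism": [("worried", False), ("relaxed", True), ("depressed_blue", False)],
}


def reverse(score):
    """Reverse a response on the 1-5 scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return SCALE_MIN + SCALE_MAX - score


def scale_scores(report):
    """Average the items of each scale after reverse-scoring.

    `report` maps item keys to responses. A scale with any missing item is
    returned as None (an assumption; e.g., agreeableness when alone).
    """
    scores = {}
    for scale, items in SCALES.items():
        values = []
        for key, is_reversed in items:
            raw = report.get(key)
            if raw is None:
                values = None
                break
            values.append(reverse(raw) if is_reversed else raw)
        scores[scale] = mean(values) if values else None
    return scores
```

For example, a report of `quiet = 2` and `outgoing_sociable = 4` yields a state extraversion score of 4, because the reversed "quiet" response is also 4.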
EAR protocol
During the first week (6–8 days) of the ESM protocol, 311 participants wore the EAR, implemented through the iEAR app using an iPod Touch device. The EAR component of the study was optional, was offered only during nonsummer months of the study, and was not an option when all of the researchers’ iPod Touches were in use by other participants. The EAR was programmed to record 30-s audio snippets of participants’ ambient sounds every 9.5 min from 7 a.m. to 2 a.m. Participants were encouraged to wear the EAR as much as possible and to wear it clipped to a waistband or the outside of their pockets (i.e., not inside a bag or pocket). Although participants had no way to tell when the device was recording, they were told that they could decide to not wear the EAR at any time for any reason.
After 3 to 4 days, participants returned to the lab to upload their data (because of device memory limitations) and then continued wearing the device before returning it after another 3 to 4 days. After returning the device, participants received a compact disc with their recordings so that they could listen to and erase any files they did not want the researchers to hear. Only a few participants (n = 15) chose to erase a total of 99 files. After these files were deleted, along with files from 6 participants who withdrew and 1 participant who had only silent recordings (suggesting that the microphone malfunctioned), 152,592 usable recordings from 304 participants remained.
EAR codings
From September 2013 until February 2018, research assistants from Washington University in St. Louis (n = 8) and the University of California, Davis (n = 100), listened to the audio files recorded during the same hours as the ESM reports (11 a.m.–12 p.m., 2 p.m.–3 p.m., 5 p.m.–6 p.m., and 8 p.m.–9 p.m.). For each of their assigned participants, coders listened to the six or seven 30-s files from each ESM-matched hour (3–3.5 min total); rated participants’ levels of state extraversion, agreeableness, conscientiousness, and neuroticism during that hour (as part of a larger survey); and then moved on to the next ESM-matched hour for that participant.
Coders’ ratings were made using the same items and 5-point scale (1 = not at all, 5 = very) that participants used in their ESM self-reports, with a few minor differences: (a) The items were worded in terms of how the participant seemed (e.g., “In this hour, the participant seemed [quiet]”), (b) coders completed the agreeableness items only if they believed that the participant was interacting with other people (not just around others) during the target hour, and (c) coders had the option to select “No way to tell” (rather than a number on the scale).
Because research assistants joined and left the lab at different times, each participant was coded by a different set of coders. Initially, we aimed to have each participant coded by three coders. However, as the interrater reliabilities based on three coders were low, we decided to add three more coders, so that each participant was coded by at least six coders. Between the two sets of codings, we made minor changes to the coding protocol (see the Supplemental Material available online), in hopes of increasing interrater reliability.
Transcripts
After seeing the results of the key quantitative analyses, we decided to supplement these analyses with qualitative data from transcripts of the EAR files. These transcripts were obtained through a separate coding task, in which participants’ utterances were transcribed by a different research assistant from the one who provided their observer ratings. Transcribers were trained to recognize the participant’s voice; to handle ambiguities such as repetitions, filler words, nonfluencies, and slang; and to use special characters to indicate when participants were singing or acting (see “Transcription Guide” at https://osf.io/kd8b3).
Data exclusions
ESM exclusions
In line with exclusion criteria applied in previous articles that used the PAIRS ESM data (Finnigan & Vazire, 2018; R. E. Wilson et al., 2015; R. E. Wilson et al., 2016), we excluded ESM reports (a) if they were completed more than 3 hr after the notification was sent, (b) if participants completed fewer than 75% of the items, (c) if participants used the same response option for at least 70% of the items, and (d) if participants indicated that they were sleeping during the target hour. We also excluded practice ESM surveys that were completed during each participant’s initial laboratory session. After these exclusions, 10,949 reports from 406 participants remained.
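The exclusion rules above can be sketched as a single predicate. This is an illustration under stated assumptions rather than the authors' script: the function name and arguments are hypothetical, the 75% completion rule is computed against the number of items presented, and the 70% same-response rule is computed against the items actually answered (the text does not specify either denominator).

```python
from collections import Counter
from datetime import datetime, timedelta


def keep_esm_report(notified_at, completed_at, responses, n_items,
                    was_sleeping, is_practice=False):
    """Return True if an ESM report survives the exclusion criteria.

    responses: list of answered item values (1-5)
    n_items:   number of items presented in the survey
    """
    # Practice surveys and hours spent asleep are excluded outright.
    if is_practice or was_sleeping:
        return False
    # (a) completed more than 3 hr after the notification was sent
    if completed_at - notified_at > timedelta(hours=3):
        return False
    # (b) fewer than 75% of the items completed
    if not responses or len(responses) < 0.75 * n_items:
        return False
    # (c) same response option used for at least 70% of the answered items
    most_common_count = Counter(responses).most_common(1)[0][1]
    if most_common_count >= 0.70 * len(responses):
        return False
    return True
```

A report answered one hour after the notification with varied responses is kept, whereas one that gives the same answer to seven of nine items is dropped.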
EAR exclusions
Coders rated participants’ personality states only in hours that contained sufficient acoustic information. We kept only the hours that at least three coders rated as being informative (for details, see the Supplemental Material). On the basis of these criteria, 807 out of 5,222 hr (15.45%) were uninformative (and excluded from further analyses).
Minimum number of matched observations
Of the remaining 4,415 EAR observations, 3,050 observations had a corresponding ESM report (from 289 participants). We excluded 112 observations from 41 participants who had fewer than five matched observations (i.e., time points that contained both ESM and EAR data), resulting in 2,938 observations from 248 participants. Because there were some missing data (especially for agreeableness, as responses to these items were conditional on either being around other people [ESM reports] or interacting with other people [EAR observations]), we applied the five-matched-observations inclusion criterion separately for each personality state. This ensured that each analysis included only participants who had at least five time points containing both ESM and EAR data for the focal personality state. Beyond this minimum, we retained time points that had either ESM or EAR data to allow Mplus to use all available information. This left final sample sizes of 2,938 observations for the extraversion, conscientiousness, and neuroticism analyses and 2,519 observations for the agreeableness analyses.
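The inclusion rule can be sketched as follows, applied once per personality state. The record layout (`pid`, `esm`, `ear`, with `None` marking missing data) and the function name are assumptions made for this example; the logic follows the description above: participants qualify through time points that have both sources, and qualifying participants keep all of their time points.

```python
from collections import defaultdict

MIN_MATCHED = 5  # minimum number of time points with both ESM and EAR data


def filter_min_matched(observations, min_matched=MIN_MATCHED):
    """Keep all time points of participants with enough matched observations.

    observations: list of dicts with keys 'pid', 'esm', 'ear'
                  (value None when that source is missing for the time point)
    """
    observations = list(observations)
    matched = defaultdict(int)
    for obs in observations:
        if obs["esm"] is not None and obs["ear"] is not None:
            matched[obs["pid"]] += 1
    # Qualifying participants retain every time point, including those with
    # only ESM or only EAR data.
    return [obs for obs in observations if matched[obs["pid"]] >= min_matched]
```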
Data analyses
Key analyses
The data had a multilevel structure, with observations (Level 1) nested within participants (Level 2). We used B. O. Muthén and Asparouhov’s (2009) general multilevel-structural-equation-modeling (MSEM) framework, implemented in Mplus (Version 8.1; L. K. Muthén & Muthén, 2017), to model the within-person agreement between self-reported and EAR-coded personality states. MSEM enables the modeling of both random effects (to allow for individual differences in state self-knowledge) and latent variables (so that variables are not assumed to be measured without error). This means that differences in effect sizes for the four personality states will not be due to differences in measurement reliability. MSEM also allows for Level 1 and Level 2 effects to be simultaneously estimated, so that within-person effects are not conflated with between-person effects. Thus, we estimated latent variables and effects at both the within-person level and the between-person level but focus on the within-person effects in this article (for between-person correlations, see Table S2 in the Supplemental Material). We ran separate models for each of the four personality states (see Fig. S1 in the Supplemental Material).
Measurement models
Each personality state was modeled as a latent variable. For the ESM latent variables, the indicators were the two or three items for the personality state. For the two-item measures (for agreeableness, conscientiousness, and extraversion), in order for the model to be locally identified, we fixed both item loadings to 1 (and allowed all variances to be freely estimated). For the three-item neuroticism measure, we fixed the first item loading to 1 and allowed the other two factor loadings (and all variances) to be freely estimated.
For the EAR latent variables, we used coders as indicators. Some hours were coded by more than six coders, but to reduce model complexity, we included data from at most six coders for the latent variables (for details, see the Supplemental Material). To create the indicators, we computed scale scores (i.e., the average of the two or three items for each scale) for each of the six coders. Then we used these six scale scores as indicators. Thus, every EAR latent variable had six indicators (with each indicator representing a scale score from a given coder for a given participant). For a given participant (e.g., Participant 1), all ratings from Coder 1 were from the same coder (e.g., Research Assistant 1). However, for a different participant (e.g., Participant 2), Coder 1 could have been a different research assistant (e.g., Research Assistant 2). To model the interchangeability of coders, we fixed all loadings for the six indicators to 1, constrained the six residual variances to be equal, and allowed the within- and between-person variances of the latent EAR variables to be freely estimated.
We conducted multilevel confirmatory factor analyses (Geldhof, Preacher, & Zyphur, 2014; Shrout & Lane, 2012) on these measurement models to obtain level-specific omega (ω) reliability estimates. The within-person ω (ωWP; see Table 1) estimates the reliability of change, which is the proportion of within-person variability due to meaningful changes in the personality state from one moment to the next, as assessed by two or three items (for the ESM latent variable) or six coders (for the EAR latent variable). The between-person ω (ωBP; see Table S1 in the Supplemental Material) estimates the proportion of between-person variability due to true between-person differences on participants’ average personality states.
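For reference, the standard level-specific ω for a one-factor model can be written as below. The notation is ours rather than the authors': λ_j are the loadings, ψ is the latent variance, and θ_j are the residual variances, all taken at the level in question (within or between). The second line is the special case implied by the EAR measurement model described above, with k = 6 interchangeable coders, all loadings fixed to 1, and equal residual variances.

```latex
% General level-specific omega for k indicators
\[
\omega = \frac{\left(\sum_{j=1}^{k}\lambda_j\right)^{2}\psi}
              {\left(\sum_{j=1}^{k}\lambda_j\right)^{2}\psi + \sum_{j=1}^{k}\theta_j}
\]

% EAR special case: \lambda_j = 1 and \theta_j = \theta for all j
\[
\omega_{\mathrm{EAR}} = \frac{k^{2}\psi}{k^{2}\psi + k\theta}
                      = \frac{k\psi}{k\psi + \theta}
\]
```

In the special case, reliability rises with the number of coders k for a fixed ratio of latent to residual variance, which is consistent with the decision to move from three to six coders per participant.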
Table 1. Descriptive Statistics
Note: All ratings were made on Likert-type scales ranging from 1 to 5. Values for 1 – intraclass correlation coefficient(1), or ICC(1), reflect the proportion of total variability attributable to within-person variability; ωWP reflects the within-person estimate of the reliability of change. Variance estimates for the average within-person standard deviation (SDWP), between-person standard deviation (SDBP), and 1 – ICC(1) are based on the measurement models (i.e., latent variables). Means were obtained by computing the aggregate mean (from observed scores) for each participant and then computing the mean of these means (such that all participants were weighted equally). ESM = experience-sampling method (i.e., self-reports); EAR = Electronically Activated Recorder (i.e., observer reports).
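As a reading aid, the relation between the quantities named in the note can be written as follows, assuming that the squared SDWP and SDBP are used as the within- and between-person variance components (our notation, not a formula given in the source).

```latex
\[
\mathrm{ICC}(1) = \frac{SD_{BP}^{2}}{SD_{BP}^{2} + SD_{WP}^{2}},
\qquad
1 - \mathrm{ICC}(1) = \frac{SD_{WP}^{2}}{SD_{BP}^{2} + SD_{WP}^{2}}
\]
```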
Structural models
For each personality state, for the within-person models, we regressed the EAR latent variable onto the ESM latent variable, with random slopes and random intercepts for each participant. In other words, this model allowed each participant to have a different mean level on each personality state and a different association between self-reported and EAR-coded states. We also modeled the between-person path from the ESM latent variable to the EAR latent variable, although this is not the focus of this article.
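The structural part of this model can be sketched in equation form. This is a simplified rendering in our own notation that omits the measurement models: each latent variable for participant i at time point t is decomposed into a between-person part (superscript B) and a within-person part (superscript W).

```latex
% Latent decomposition into between- and within-person parts
\[
\mathrm{ESM}_{ti} = \mathrm{ESM}^{B}_{i} + \mathrm{ESM}^{W}_{ti},
\qquad
\mathrm{EAR}_{ti} = \mathrm{EAR}^{B}_{i} + \mathrm{EAR}^{W}_{ti}
\]

% Within-person level: person-specific (random) slope
\[
\mathrm{EAR}^{W}_{ti} = \beta_{1i}\,\mathrm{ESM}^{W}_{ti} + e_{ti}
\]

% Between-person level: random intercept and random slope
\[
\mathrm{EAR}^{B}_{i} = \gamma_{00} + \gamma_{01}\,\mathrm{ESM}^{B}_{i} + u_{0i},
\qquad
\beta_{1i} = \gamma_{10} + u_{1i}
\]
```

Here γ10 is the average within-person slope (the focus of this article), u1i captures individual differences in that slope, and γ01 is the between-person path.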
Estimation and inference criteria
Because of the computational demands of these models, we used the Bayes estimator in Mplus (B. O. Muthén & Asparouhov, 2012), with the default set of diffuse (i.e., noninformative) priors. We used the 95% credibility interval (CI) around the standardized effects (βs) as inference criteria for the range of plausible population values of the effect sizes.
Qualitative analyses
To provide a sense of what was happening when participants and observers disagreed about participants’ personality states, we report the transcripts that correspond to the largest discrepancies between ESM self-reports and EAR observer ratings. We conducted this supplemental analysis for each Big Five domain for which we judged self–observer agreement to be low.
To do this, we standardized the ESM self-reports and EAR observer ratings within each person across the same time points included in the key analyses (but using observed variables instead of the latent variables that were used in the key analyses). We then matched up the ESM and EAR data with the hour-level transcripts (i.e., all of the decipherable words across the six or seven 30-s recordings in each hour) from the 121 participants who gave permission to publish their transcripts, retaining only the hours that contained transcripts (i.e., excluding hours in which participants did not speak or had no decipherable speech). Next, we subtracted the EAR observer ratings from the ESM self-reports and selected the 50 target hours associated with the largest discrepancies between ESM self-reports and EAR observer ratings (25 in each direction, for each personality state). The transcripts from these target hours are shown in Tables S3 to S6 in the Supplemental Material.
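The selection procedure above can be sketched as follows. This is an illustration, not the authors' script: the record layout (`pid`, `esm`, `ear`), the function names, and the use of the sample standard deviation for within-person standardization are assumptions, and participants with no variability on a measure are simply skipped.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_within_person(rows, key):
    """Return z-scores of rows[key], standardized within each participant."""
    by_pid = defaultdict(list)
    for row in rows:
        by_pid[row["pid"]].append(row[key])
    z_scores = []
    for row in rows:
        values = by_pid[row["pid"]]
        sd = stdev(values) if len(values) > 1 else 0.0
        # None marks hours that cannot be standardized (no variability).
        z_scores.append((row[key] - mean(values)) / sd if sd > 0 else None)
    return z_scores


def largest_discrepancies(rows, n_each=25):
    """Select the hours with the largest ESM-minus-EAR discrepancies.

    Returns (self_higher, self_lower): lists of (discrepancy, row) pairs,
    each ordered from the most extreme discrepancy inward.
    """
    esm_z = standardize_within_person(rows, "esm")
    ear_z = standardize_within_person(rows, "ear")
    scored = [
        (e - o, row)
        for e, o, row in zip(esm_z, ear_z, rows)
        if e is not None and o is not None
    ]
    scored.sort(key=lambda pair: pair[0])
    self_lower = scored[:n_each]           # self-report far below observers
    self_higher = scored[-n_each:][::-1]   # self-report far above observers
    return self_higher, self_lower
```

With `n_each=25`, this yields the 50 target hours per personality state (25 in each direction), provided at least 50 scorable hours are available.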
Results
Descriptive statistics
Descriptive statistics for all variables are reported in Table 1. The 1 – ICC(1) values show that there was substantial (≥ 45%) within-person variability for each of the personality states, as captured by both the ESM self-reports and EAR codings. Note that the within-person reliabilities for state agreeableness (both EAR and ESM) and ESM conscientiousness were relatively low but that MSEM corrects for attenuation due to measurement error.
State self-knowledge analyses
Do people have self-knowledge of their personality states in everyday life? To test this, we examined the correspondence between self-views (ESM reports) and observed behavior (EAR codings) using the models described above. Positive slopes reflect agreement between self-reports and observed behavior, which we interpret as evidence of self-knowledge, whereas weak or flat slopes may or may not reflect lack of self-knowledge (for more details regarding interpretation, see the Discussion section). Figure 1 shows the individual slopes and the average slope (bold line) for each Big Five domain (for all unstandardized and standardized estimates, see Table S1).

Fig. 1. Spaghetti plots depicting the within-person associations between self-views (x-axes) and observed behavior (y-axes), separately for each of the four personality states. Self-views were obtained using the experience-sampling method (ESM), and observed behavior was coded from Electronically Activated Recorder (EAR) recordings. Each colored line represents the slope for a different participant, and the black line shows the average within-person effect. Strong positive slopes reflect self–observer agreement, which we interpret as evidence of self-knowledge. Weak or flat slopes may or may not reflect lack of self-knowledge. The x-axes show deviations from each person’s mean self-reported personality state, whereas the y-axes show the uncentered range of EAR codings from 1 to 5.
The average slopes were positive and nonzero for all four domains. However, the effects were quite a bit larger for extraversion (β = 0.63, 95% CI = [0.60, 0.66]) and conscientiousness (β = 0.47, 95% CI = [0.40, 0.55]) than for neuroticism (β = 0.27, 95% CI = [0.21, 0.32]) and agreeableness (β = 0.20, 95% CI = [0.11, 0.32]). When participants rated themselves as being more extraverted or conscientious than they usually were, the EAR coders also rated them as more extraverted or conscientious than their typical levels. This suggests that participants had self-insight into their momentary fluctuations in extraversion and conscientiousness.
In contrast, when participants rated themselves as more neurotic or agreeable than usual, this only weakly corresponded to how the EAR coders rated them. In addition, the 95% CIs for self–observer agreement on neuroticism and agreeableness, while excluding zero, did not overlap with those for extraversion and conscientiousness. In short, agreement was substantially weaker for neuroticism and agreeableness than for extraversion and conscientiousness, even though the models accounted for differences in measurement reliability across constructs. These results are more complicated to interpret than those for extraversion and conscientiousness and may or may not imply that participants lacked self-insight into how neurotic and agreeable they were in the moment. We will return to this challenge of interpretation in the Discussion section.
In the meantime, to allow readers to gain a better sense of what was happening when participants and observers disagreed about participants’ momentary agreeableness and neuroticism, we provide the transcripts from the hours with the largest self–observer discrepancies in these states in Tables S3 through S6 in the Supplemental Material (for those participants who gave consent to share their EAR recordings). In the Discussion section, we share a few of our own observations from reading these transcripts, but we encourage readers to explore the transcripts in Tables S3 through S6, along with the full set of shareable transcripts and their corresponding self–observer discrepancy scores (posted on our Open Science Framework page, osf.io/kd8b3, the password for which is available on request).
Discussion
Our goal was to test whether people know what they are like in the moment. We found high levels of self–observer agreement for state extraversion and conscientiousness but lower levels of agreement for neuroticism and agreeableness. These results can be interpreted as accuracy estimates only if we assume that observers can detect true fluctuations in personality states through brief audio recordings of participants’ everyday behaviors and environments. We believe that this assumption holds more strongly for momentary extraversion, conscientiousness, and agreeableness than for neuroticism. Thus, we interpret our results as showing that people have self-insight into their momentary extraversion and conscientiousness, that momentary neuroticism is difficult (but not impossible) for observers to judge, and that people have poor self-knowledge of their momentary agreeableness.
The findings for extraversion are consistent with a large body of literature demonstrating high self–observer agreement on trait extraversion across a wide range of conditions (for a review, see Vazire & Solomon, 2015). These findings provide new evidence that self-perceptions of state extraversion are accurate—that is, people know when they are being more or less extraverted than usual. Likewise, the substantial self–observer agreement for conscientiousness suggests that people are willing and able to report when they are acting lazy versus acting reliable.
Because the EAR is not a perfect criterion, however, the lower self–observer agreement for state neuroticism and agreeableness could suggest (a) that EAR coders could not accurately detect these states or (b) that people’s self-reports are inaccurate. Although both explanations are probably partially correct, we suspect that the first explanation largely accounts for low self–observer agreement for neuroticism. Previous studies have suggested that neuroticism is quite hard to observe (John & Robins, 1993) and that people are the best judges of their own trait neuroticism (Vazire, 2010). We suspect that it was difficult for EAR coders to detect states such as being worried on the basis of only audible behaviors. Thus, for state neuroticism, the weaker self–observer agreement may not imply low self-insight.
To explore whether this interpretation is consistent with the transcript data, we looked at the transcripts from the time points with the greatest discrepancies between self-reports and observers’ ratings of state neuroticism (see Tables S5 and S6). As the content of these transcripts did not seem particularly informative to us, we explored another potentially relevant indicator—quantity of speech. After looking at the ESM–EAR discrepancies across all time points (including those with no speech), we observed that many of the time points in which self-reports of neuroticism were much higher than observer reports contained no speech. This suggests that people sometimes feel quite worried or depressed without expressing it verbally, which is consistent with our interpretation that state neuroticism is difficult to pick up from acoustic information alone.
However, we believe that it is plausible that people have less self-insight into their momentary agreeableness. Kindness and rudeness (the agreeableness states measured here) are defined more by behaviors than by thoughts and feelings. Thus, fluctuations in these states should be observable in naturalistic interactions with friends, roommates, and classmates, which the EAR is optimized to capture (Mehl, 2017). Indeed, our findings for extraversion show that EAR coders can detect interpersonal behaviors. We therefore believe that the weak self–observer agreement for agreeableness casts doubt on people’s self-insight into how agreeable or disagreeable they are in the moment. This is consistent with the only other study we know of that examined people’s awareness of their agreeableness-related behavior (during one laboratory-based group task; Gosling et al., 1998).
Tables S3 and S4, which report transcripts for the time points with the largest self–other discrepancies for agreeableness, may help shed light on the plausibility of our interpretation. For example, we agreed with the EAR coders that the participant who said “her twin brother did not have her in his wedding, which is such bullshit” was acting disagreeable (contrary to her self-rating) and that the participant who said, “Trust me, breaking up helps. And you have a good support system here. . . . You can come into me and Mel’s room, just have a glass of wine,” was acting quite agreeable (again, contrary to her self-rating). Of course, it is easy to cherry-pick examples that fit our interpretation (and readers may not agree that even these cherry-picked examples support our interpretation), so we encourage readers to read the transcripts and come to their own conclusions.
Our results should be interpreted with the following limitations in mind. First, the observers had only 3 to 3.5 min of recordings spread across each hour they rated and had access to only acoustic information. Second, personality states comprise more than just observable behavior and more than just the content captured by the two to three items we used per domain. Third, there was less within-person variability in both self-reports and EAR codings of agreeableness states relative to the other personality states. Thus, it was likely more difficult for participants to detect the relatively narrow fluctuations in their own agreeableness states compared with, for example, the larger fluctuations in their own extraversion states.
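To make the third limitation concrete, within-person variability can be summarized as the average of each person's standard deviation across his or her own time points. The sketch below is one simple descriptive way to compute that quantity and is not presented as the authors' method; the function name, the `(person_id, rating)` layout, and the toy ratings are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def mean_within_person_sd(observations):
    """Average the per-person sample SDs of state ratings.

    observations: iterable of (person_id, rating) pairs (hypothetical layout).
    People with fewer than two observations are skipped because a sample SD
    is undefined for them.
    """
    by_person = defaultdict(list)
    for person_id, rating in observations:
        by_person[person_id].append(rating)
    sds = [stdev(ratings) for ratings in by_person.values() if len(ratings) >= 2]
    if not sds:
        raise ValueError("need at least one person with two or more observations")
    return mean(sds)


# Toy ratings for illustration only; these are not data from the study.
extraversion = [("p1", 1), ("p1", 3), ("p1", 5), ("p2", 2), ("p2", 4), ("p2", 6)]
agreeableness = [("p1", 4), ("p1", 4), ("p1", 5), ("p2", 5), ("p2", 5), ("p2", 5)]

print(mean_within_person_sd(extraversion))
print(mean_within_person_sd(agreeableness))
```

A smaller value for one domain than for another means that its states move within a narrower band for the typical person, which leaves less fluctuation for either participants or observers to detect.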
However, given the challenges of studying self-knowledge, we believe that our methodology stands out in several ways: (a) high realism (we measured behavior across many situations in people’s everyday lives), (b) moderate to high consensus on what participants were like from one moment to the next (we had each observation coded by six coders), and (c) high precision of estimates (we had large numbers of people and observations). Thus, although these results should not be the final word about state self-knowledge, they provide a strong test of college students’ self-knowledge of what they are like during everyday moments.
Self-knowledge is central to our lives (and to the way that social scientists study our lives, often relying on self-reports). Our findings show that we can probably trust what people say about their momentary levels of extraversion, conscientiousness, and likely, neuroticism. However, our findings also call into question people’s awareness of when they are being considerate versus rude. This is consistent with theoretical propositions and empirical evidence that people are poor judges of their trait agreeableness (John & Robins, 1993; Paulhus & John, 1998; Vazire, 2010; Vazire & Mehl, 2008) and suggests that people may not know how agreeable they generally are in part because they lack awareness of how rude or considerate they are in everyday moments.
Being aware of one’s behavior in the moment also has benefits beyond its implications for trait self-knowledge. Practically, a momentary lapse in kindness could have dire consequences that could be avoided if you quickly realize that you just acted like a jerk. Recognizing instances of behavior while they happen is also a precursor to a deeper form of self-insight that involves not only knowing that you can sometimes be contemptuous but also knowing when (and ultimately, why) that happens (e.g., being caffeine deprived and in a rush). If it is true that, as Calvin told Hobbes, “we don’t devote nearly enough scientific research to finding a cure for jerks” (Watterson, 1992, p. 58; see also Sutton, 2007), perhaps a good place to start is with people’s blind spots about their behavior in the moment.
Supplemental Material
Supplemental material, SunOpenPracticesDisclosure for Do People Know What They’re Like in the Moment? by Jessie Sun and Simine Vazire in Psychological Science
Supplemental material, SunSupplementalMaterial for Do People Know What They’re Like in the Moment? by Jessie Sun and Simine Vazire in Psychological Science
Footnotes
Acknowledgements
We are grateful to Julia Rohrer, Chris Hopwood, Rick Robins, Olivia Atherton, Rich Lucas, and Luke Smillie for comments on an earlier draft of this article; to Mijke Rhemtulla for advice on the measurement models; to the many research assistants who helped run the study and code the Electronically Activated Recorder recordings; and to Brittany Solomon, Kelci Harris, Kathryn Bollich, Robert Wilson, and Katie Finnigan for supervising data collection and coding.
Action Editor
Brent W. Roberts served as action editor for this article.
Author Contributions
S. Vazire conceived the study, acquired funding, and supervised data collection. J. Sun supervised Electronically Activated Recorder coding and curated and analyzed the data under the supervision of S. Vazire. Both authors contributed equally to drafting and editing the manuscript, and both authors approved the final manuscript for submission.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Funding
Data collection was supported by National Science Foundation Grant BCS-1125553 (to S. Vazire).
Open Practices
Design and analysis plans for this study were not preregistered. Codebooks for all measures in the larger study are available via the Open Science Framework (OSF) and can be accessed at osf.io/akbfj. Although ethical considerations prevent us from making the audio files and complete set of transcripts publicly available, the quantitative data, R scripts, and Mplus input and output files required to reproduce the analyses reported in this article are available at osf.io/kd8b3. This OSF repository also contains a password-protected file that contains transcripts (for the time points included in the key analyses) from participants who consented to have their Electronically Activated Recorder recordings shared. Interested researchers can request the password from the first author. The complete Open Practices Disclosure for this article can be found at https://journals.sagepub.com/doi/suppl/10.1177/0956797618818476. This article has received the badges for Open Data and Open Materials. More information about the Open Practices badges can be found at
.