Abstract
BACKGROUND:
Self-report tests are one of the main psychosocial risk assessment tools. However, they are susceptible to certain sources of error, including social desirability. Because psychosocial risks are an emerging concern, there are few studies on their assessment.
OBJECTIVE:
The aim of this work is to analyze the impact of social desirability on the short version of the CoPsoQ-ISTAS 21 assessment tool.
METHOD:
A total of 563 workers (45.10% women and 54.90% men) participated in this study. The short version of the CoPsoQ-ISTAS 21 questionnaire was used, together with four Likert-scale questions as markers, which correspond to the Eysenck Personality Lie Scale Questionnaire Revised (EPQ-r). The sample was divided into two subsamples, and both an exploratory and a confirmatory analysis were carried out to determine the factorial structure of the scale and then apply the bias filtering method.
RESULTS:
The results indicate that 10% of the scale items are biased by social desirability, and that there are significant differences between the group with bias-free scores and the group with scores without bias control.
CONCLUSIONS:
The effects of social desirability on the scale are verified. It is therefore concluded that, in a psychosocial risk assessment, it is not enough to apply a self-report test and interpret its results; the sources of error must also be minimized.
Introduction
The current health crisis caused by the Covid-19 virus has considerably aggravated exposure to psychosocial risk factors, which has led to dismissals or fear of losing a job, among other effects [1, 2]. Consequently, we are facing one of the most worrying challenges in occupational risk prevention, since the effects of stress are so harmful that it has already been described as a “silent killer” [3]. Because stress is a risk to the health of workers, it is mandatory to correctly identify and evaluate occupational stressors [4, 5]. However, as there are no specific regulations or official guidelines on how to evaluate psychosocial risks, the most widely used evaluation tool is currently the self-report test.
These questionnaires have to offer confidence in their results (art. 5.2) [6], that is, psychometric guarantees that confirm their quality in terms of theoretical foundation, validity and reliability. However, although bias is a threat to validity, few studies control for its effects [7, 8].
In this sense, tests have to be fair and unbiased [9]. There are therefore numerous warnings about the possibility of obtaining distorted results when using self-report tests, due to response biases [10], including social desirability (hereinafter SD) [11, 12]. SD refers to the tendency to respond in a socially acceptable way [13]. Depending on the impression the respondent wishes to give, the distortion can be positive or negative, affecting criterion validity [14]. SD is a construct made up of two factors: self-deception and impression management [15, 16].
In recent decades, the term faking has gained strength to refer to the intentional manipulation of responses. Faking is considered the most threatening bias in studies in psychology and the social sciences, especially regarding non-cognitive assessment tools [17–19]. Regarding psychosocial risk assessment tools, no studies on the impact of SD have been found. It is therefore extremely important to know the effect of this bias on the tests that evaluate work-related stressors, given their relevance to occupational health prevention; this is of special interest to the prevention technicians who assess psychosocial risks.
Consequently, the following hypotheses are formulated:
H1: The results obtained with the short version of the CoPsoQ-ISTAS 21 questionnaire show effects due to SD.
H2: There are significant differences in scores between the group with bias-free scores and the group without bias control.
Materials and methods
Sampling and participants
Given the difficulty of drawing a random sample, an incidental quota sample balanced by sex was chosen. For the quotas, the distribution of employed persons by sex was used as a reference, taken from the Active Population Survey of Spain (EPA) published by the National Institute of Statistics of the same country [20]. The figures are shown in Table 1, expressed in thousands.
Sampling based on the number of employed by gender group
*Data from the third quarter of the 2016 EPA survey.
Regarding the unit of analysis, 563 employed workers were observed. They belonged to different sectors of activity across the national territory, which implies that the sample is sufficiently heterogeneous. The invitation to participate voluntarily and anonymously in this research was made through the professional network “LinkedIn”, using non-random sex-balanced quota sampling, chosen for its similarity to stratified random sampling, and the questionnaires were administered online through Google Forms. As for the exclusion criteria, 39 unemployed workers and 40 self-employed workers were eliminated. A total of 86.70% of the participants were of Spanish nationality and 9.20% of other nationalities. Women made up 45.10% of the sample and men 54.90%, which is indicative of a balanced sample, since those rates are broadly consistent with the data from the last EPA survey [20]. The participants were between 16 and 67 years old; the most represented age range was 36 to 45 years (45.30%) and the least represented was under 26 years (3%). Regarding employment situation, a large majority had an indefinite contract (67.10%), followed by workers with a temporary contract (27%).
CoPsoQ-ISTAS 21 (version 2)
The psychosocial risk assessment tool used for this study is CoPsoQ-ISTAS 21 [21], a methodology that is widely distributed worldwide and periodically updated, with an extensive scientific literature in which its psychometric properties are tested [22–24].
The short version of the CoPsoQ-ISTAS 21 version 2 questionnaire, made up of 30 items, was chosen because it has fewer items than version 1 of the CoPsoQ questionnaire. This is intended to minimize fatigue and invariable response biases, both related to long surveys [10].
The response format of this scale is a five-point Likert type (0 to 4). This questionnaire measures 6 main dimensions, divided into several subdimensions [21], specifically the following:
Psychological demands at work (subdimensions: quantitative demands, work pace, emotional demands) (6 items, α = .77).
Work-family conflict (double presence) (2 items, α = .73).
Control over work (subdimensions: influence, development possibilities, meaning of work) (6 items, α = .83).
Social support and leadership quality (subdimensions: predictability, role clarity, leadership quality, role conflict) (8 items, α = .87).
Work compensation (subdimensions: employment insecurity, working conditions insecurity) (4 items, α = .73).
Social capital (subdimensions: justice, vertical trust) (4 items, α = .88).
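As an illustration of how a reliability coefficient of this kind can be computed, the following minimal sketch assumes that the reported α values are Cronbach's alpha coefficients. The function name `cronbach_alpha` and the toy responses are hypothetical and are not taken from the study data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the subscale
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical answers of five workers to a two-item subscale scored 0 to 4.
toy_scores = np.array([[0, 1], [1, 1], [2, 3], [3, 3], [4, 4]])
alpha = cronbach_alpha(toy_scores)
```

With real data, each dimension would be passed as its own block of columns, so that a separate coefficient is obtained for each of the six dimensions.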
Markers of social desirability
The bias filtering method proposed by Ferrando et al. [25] requires the inclusion of four Likert-style items (Table 2) as markers, which correspond to the Eysenck Personality Lie Scale Questionnaire Revised (EPQ-r):
Markers of SD
To obtain the data, a 44-item online self-report test was designed using the Google Forms application. The questionnaire was divided into the following sections:
Sociodemographic and socio-labour issues (10 items).
Variables of interest: CoPsoQ-ISTAS 21 v.2 scale (30 items).
Items used as markers of social desirability (4 control items).
The methodological guarantees of research through data extracted online are similar to the face-to-face procedure [26] and encourage anonymity, which reduces socially desirable responses [27].
The original data matrix consisted of 642 participants, of whom 79 were eliminated according to the above exclusion criteria. The resulting data underwent a filtering process that is summarized in Table 3. Missing and out-of-range values in the variables of interest are coded as zero, due to the mandatory-response design of the data collection.
Data cleansing
To validate the hypotheses, the statistical packages MATLAB, Excel, Factor, G*Power, Statistical Package for Social Sciences (IBM SPSS) version 24 for Windows and AMOS have been used.
First, descriptive statistics were calculated to characterize the sample demographically. To determine the influence of SD on the questionnaire, the method based on factor analysis proposed by Ferrando et al. [25] was used. The advantage of this technique is that it extracts the SD loadings of the content items, yielding individual scores free of bias. In this way it is possible to avoid the limitations of other SD control methods, in which there is a risk of eliminating true scores, as occurs with specific SD scales [28].
This bias filtering method is based on factor analysis, and using it requires knowing the empirical factor structure of the scale, in this case the short version of the CoPsoQ-ISTAS 21. Because this information was not available, exploratory and confirmatory factor analyses were conducted to determine the dimensionality of the scale. To obtain evidence about the predictive capacity of the proposed factorial model (cross-validation), the original sample was divided into two random subsamples.
Subsample 1 = 60% approx. of individuals to perform an Exploratory Factor Analysis (EFA), n1= 325.
Subsample 2 = 40% approx. of individuals to perform a Confirmatory Factor Analysis (CFA), n2= 238.
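A random split of this kind can be sketched as follows. This is only an illustration: the function name `split_sample` and the fixed seed are assumptions, and the source does not state how the random assignment was actually generated.

```python
import numpy as np

def split_sample(n_total, n_first, seed=0):
    """Randomly split row indices 0..n_total-1 into two disjoint subsamples."""
    rng = np.random.default_rng(seed)        # seed fixed only for reproducibility (assumption)
    shuffled = rng.permutation(n_total)      # random ordering of all participants
    first = np.sort(shuffled[:n_first])      # indices for the exploratory analysis
    second = np.sort(shuffled[n_first:])     # remaining indices for the confirmatory analysis
    return first, second

# Subsample sizes reported in the text: n1 = 325 and n2 = 238 out of N = 563.
efa_idx, cfa_idx = split_sample(563, 325)
```

Splitting by index rather than by copying rows keeps the two subsamples disjoint by construction and makes it easy to recombine them later for the analysis on the whole sample.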
Once the scale factors were extracted, the SD control method was applied to the entire sample N = 563, assuming that there is no correlation between the content factors and the SD factor.
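The SD bias filtering procedure consisted of two stages [12, 25]. The first was based on an initial inter-item correlation matrix that includes the influence of SD, whose model structure is represented in the following equation:
$$X_j = \lambda_{j1}\theta_1 + \lambda_{j2}\theta_2 + \varepsilon_j$$
where $X_j$ is the score on item $j$, $\theta_1$ is the content factor, $\theta_2$ is the Social Desirability factor, $\lambda_{j1}$ and $\lambda_{j2}$ are the factor loadings of item $j$ on each factor, and $\varepsilon_j$ is the residual.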
Once the inter-marker correlation matrix was obtained, a factor analysis of the polychoric correlations was conducted to obtain the loading of each marker on the SD factor. With these loadings, the factorial weights of the content items on the SD factor were calculated using the instrumental variable technique. Finally, the variance explained by the SD factor was removed from the inter-item correlation matrix, yielding a matrix that is free of SD bias (residual matrix).
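The logic of this first stage can be illustrated with a simplified sketch. It assumes that the markers load only on the SD factor and that the SD factor is uncorrelated with the content factors, so that the correlation between a content item and a marker equals the product of their SD loadings. The least-squares estimator and all function names and numbers below are illustrative assumptions; they are not necessarily the exact estimator of Ferrando et al. [25].

```python
import numpy as np

def sd_loadings_of_content_items(r_content_marker, marker_loadings):
    """Estimate each content item's SD loading from its correlations with the markers.

    Under the stated assumptions, corr(item j, marker k) = lambda_j * lambda_k,
    so lambda_j is the least-squares slope across the markers.
    """
    r = np.asarray(r_content_marker, dtype=float)   # shape: (content items, markers)
    lam_m = np.asarray(marker_loadings, dtype=float)
    return r @ lam_m / (lam_m @ lam_m)

def remove_sd_variance(r_content, sd_loadings):
    """Subtract the part of the inter-item matrix reproduced by the SD factor."""
    lam = np.asarray(sd_loadings, dtype=float)
    return np.asarray(r_content, dtype=float) - np.outer(lam, lam)

# Hypothetical population values used to build a model-consistent example.
content_load = np.array([0.7, 0.6, 0.5])      # loadings on one content factor
sd_load = np.array([0.3, 0.0, 0.2])           # loadings of content items on SD
marker_load = np.array([0.6, 0.5, 0.7, 0.4])  # loadings of the four markers on SD

r_items = np.outer(content_load, content_load) + np.outer(sd_load, sd_load)
np.fill_diagonal(r_items, 1.0)
r_item_marker = np.outer(sd_load, marker_load)

estimated_sd = sd_loadings_of_content_items(r_item_marker, marker_load)
residual = remove_sd_variance(r_items, estimated_sd)
```

In this noise-free example the off-diagonal entries of `residual` contain only the content-factor part of the correlations, which is what the second stage then factor-analyzes.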
In the second stage, an exploratory factor analysis was performed on the residual matrix without SD, using Minimum Rank Factor Analysis (MRFA) as the factor extraction method. The factorial structure obtained was rotated using the Promin oblique rotation technique [29] to allow correlation between factors. The results show, for each item, a loading on the content factor and another, preferably secondary, on the orthogonal SD factor, thus yielding scores free of SD bias. As an additional check, once the bias had been filtered, the results were compared with the original scores using multivariate analysis of variance (MANOVA).
Results
Descriptive analysis
To obtain an overview of the quality of the items of the CoPsoQ scale and the SD markers, Table 4 shows the main descriptive statistics of subsample 1 (n1= 325). Regarding the normality analysis, the univariate skewness values were between -1.05 and 1.61, and the univariate kurtosis values were between -1.40 and 2.39, with an average skewness of 0.28 and an average kurtosis of -0.60. According to the criteria of several authors, the distribution is within the parameters of normality, that is, it is neither asymmetric nor excessively kurtotic [30–32].
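Per-item skewness and kurtosis of the kind reported here can be computed as in the sketch below. It uses the simple moment-based definitions with excess kurtosis (a normal distribution gives 0); statistical packages often apply small-sample corrections, so their values may differ slightly. The function names are hypothetical.

```python
import numpy as np

def skewness(values):
    """Moment-based skewness: third central moment over the variance to the power 1.5."""
    values = np.asarray(values, dtype=float)
    deviations = values - values.mean()
    m2 = np.mean(deviations ** 2)
    return float(np.mean(deviations ** 3) / m2 ** 1.5)

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so that a normal distribution scores 0."""
    values = np.asarray(values, dtype=float)
    deviations = values - values.mean()
    m2 = np.mean(deviations ** 2)
    return float(np.mean(deviations ** 4) / m2 ** 2 - 3.0)

def describe_items(responses):
    """Return (skewness, excess kurtosis) for every column of a response matrix."""
    responses = np.asarray(responses, dtype=float)
    return [(skewness(column), excess_kurtosis(column)) for column in responses.T]
```

The ranges and averages quoted in the text would then be obtained by taking the minimum, maximum and mean of these per-item values.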
Descriptive statistics items of CoPsoQ-ISTAS21 and SD (n1= 325)
To determine the number of underlying dimensions of the short version of CoPsoQ-ISTAS 21, an EFA (Exploratory Factor Analysis) was performed on the first random subsample (n1= 325). As a preliminary step, the Kaiser-Meyer-Olkin (KMO) statistic was calculated to check whether the matrix met the conditions for factoring. The result of .86 implies that the responses were not random and that the factoring results are stable. Horn’s parallel analysis suggested five dimensions underlying the data. These results point to one dimension related to response style (SD) and four content dimensions concerning the CoPsoQ-ISTAS 21 (Fig. 1).

Visual representation of an EFA model.
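The idea behind Horn's parallel analysis can be sketched as follows. This is the classic variant based on eigenvalues of Pearson correlation matrices of random normal data; the study may have used a different implementation (for example, one based on polychoric correlations), so the function, its parameters and the simulated data are illustrative assumptions only.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those expected from random data."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    rng = np.random.default_rng(seed)
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))  # uncorrelated data of the same shape
        simulated[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    exceeds = observed > threshold
    # Retain factors up to the first eigenvalue that fails to beat its threshold.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold

# Hypothetical data: 400 respondents, 8 items, two blocks of 4 items per factor.
demo_rng = np.random.default_rng(1)
demo_loadings = np.zeros((8, 2))
demo_loadings[:4, 0] = 0.8
demo_loadings[4:, 1] = 0.8
demo_data = (demo_rng.standard_normal((400, 2)) @ demo_loadings.T
             + 0.6 * demo_rng.standard_normal((400, 8)))
demo_factors, demo_observed, demo_threshold = parallel_analysis(demo_data)
```

For the simulated two-factor data, only the first two observed eigenvalues exceed their random-data thresholds, which is the same decision rule that suggested five dimensions in the study.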
To verify that the scale follows the suggested factorial structure, an EFA was applied, retaining four content factors, which were labelled as follows:
(F1) Control over work: It is made up of items 7, 8 and 11-16, with a factorial reliability of .91. In general, it is related to the degree of autonomy that workers have over how to carry out their functions. Following the original version of CoPsoQ II [33], it encompasses the subdimensions of influence (items 7 and 8), development possibilities (items 11 and 12) and sense of work (items 13 and 14). As a result of the EFA, it also includes the role clarity subdimension (items 15 and 16).
(F2) Psychological demands at work: Composed of items 1-6, 9 and 10, with a factorial reliability of .87. It refers to job demands, both quantitative and emotional. It is in line with the starting theoretical framework, which is made up of the subdimensions of quantitative demands (items 1 and 2), emotional demands (items 5 and 9) and work rhythm (items 6 and 10). It also includes items 3 and 4, which correspond to the work-family conflict dimension. This same factorial structure was observed in the EFA performed for the validation of the French version of CoPsoQ [34].
(F3) Work compensation: It is made up of items 21–24 and shows a factorial reliability of .80. The solution for this third factor coincides entirely with the original theoretical factor and with the adaptation for Spain in its short version [21, 33], measuring insecurity about working conditions (items 21 and 22) and job insecurity (items 23 and 24).
(F4) Social support and leadership quality: Its factorial reliability is .94 and it is made up of items 17-20 and 25-30. This dimension refers to the help, both material and emotional, that workers need to receive from the other members of the organization [35]. It partially coincides with the starting theoretical model, measuring role conflict (items 17 and 18), predictability (items 19 and 20) and leadership quality (items 29 and 30). To this factor (F4) is added a new dimension, named social capital, which is made up of the subdimensions of vertical trust (items 25 and 26) and justice (items 27 and 28).
To evaluate the simplicity of this factorial solution, three indices were calculated: the Bentler simplicity index (S = .99), the loading simplicity index (LS = .59) and the residuals analysis index, which indicated a root mean square residual (RMSR) of .05. All these analyses indicated adequate values.
To confirm the factorial structure found in the EFA of the first subsample, its replicability in the second subsample was studied using a CFA. A combination of indicators was applied to test the degree of fit to the underlying four-factor model. From these goodness-of-fit indices, the suitability of the proposed model can be deduced from different perspectives (RMSEA = .01, GFI = .98 and CFI = .99).
The last analysis carried out was the Tucker congruence coefficient, which shows the degree of congruence or discrepancy between the patterns of factor loadings obtained in the EFA and the CFA. Values of .88, .90, .90 and .92 were reached for each factor, respectively, and the overall congruence was .90. These data indicate that the two factorial structures extracted during cross-validation are statistically analogous [36].
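For reference, Tucker's congruence coefficient between two columns of factor loadings is the sum of their cross-products divided by the square root of the product of their sums of squares. A minimal sketch follows; the loading vectors are hypothetical and not taken from the study.

```python
import numpy as np

def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    a = np.asarray(loadings_a, dtype=float)
    b = np.asarray(loadings_b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical loadings of the same three items in two analyses.
efa_loadings = [0.70, 0.60, 0.50]
cfa_loadings = [0.65, 0.62, 0.40]
phi = tucker_congruence(efa_loadings, cfa_loadings)
```

Because the coefficient is insensitive to a common rescaling of either vector, it compares the pattern of loadings rather than their absolute size.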
The results indicate that 10% of the CoPsoQ items appear to be affected by SD. The specific items affected are items 3, 5 and 18, and item 4 is close to the limit of .20 (Table 5).
Items with greater sensitivity to SD
Items 3, 4 and 5 belong to the factor labelled F2. Specifically, the first two make up the double presence subdimension, according to the original theoretical model. Item 18, in turn, is part of the factor conceptualized as F4 and is one of the items that evaluate role conflict, according to the theoretical model of the original factorization.
A possible explanation for the SD found in items 3 and 4 points in the same direction as other investigations, in which responses about participation in housework were distorted to show a better social image [37, 38]. This could explain the high rate of exposure to the psychosocial factor of double presence in the research by Louzán [39], where 45% of workers indicated being exposed to this stressor, above the risk factor of “Insecurity” (31.36%).
Given that the data indicates that both samples are representative of the same population, it was decided to use the entire sample (N = 563) to apply the method of Ferrando et al. [25] and estimate the influence of SD on the scale. In this way, more precise estimations will be obtained, since the larger the sample size, the greater the confidence in the stability of the proposed model.
Table 6 shows the loadings on the SD factor and on the four content factors after oblique rotation for the whole sample. The results indicate that most of the items are relatively pure indicators, as indicated by the LS index. The only exception is item 17 of the CoPsoQ, with a loading higher than .30 on more than one factor.
Matrix of factor loadings EFA controlling SD (n = 563)
Items affected by SD have been shaded in gray.
Following the criteria suggested by Vigil-Colet et al. [12], factor loadings greater than .20 indicate possible bias due to SD, and loadings greater than .30 indicate a substantial effect of SD on the item. Excluding the items used as markers, the results indicate that 10% of the items show a possible SD effect, specifically items 3, 5 and 18.
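The decision rule above can be written as a small classification step. In this sketch the loadings are hypothetical placeholders (only the item numbers 3, 5 and 18 and the two cut-offs come from the text), and taking the absolute value of the loading is an assumption.

```python
def classify_sd_loading(loading, possible=0.20, substantial=0.30):
    """Label an item's loading on the SD factor using the two cut-offs."""
    size = abs(loading)  # assumption: the sign of the loading is ignored
    if size > substantial:
        return "substantial"
    if size > possible:
        return "possible"
    return "negligible"

# Hypothetical SD loadings for the 30 content items (keys are item numbers).
sd_loadings = {item: 0.05 for item in range(1, 31)}
sd_loadings.update({3: 0.25, 5: 0.24, 18: 0.22})  # placeholder values, not the study's

flagged = sorted(item for item, value in sd_loadings.items()
                 if classify_sd_loading(value) != "negligible")
share_flagged = len(flagged) / len(sd_loadings)
```

With three of the 30 content items flagged, the share is 3/30, which corresponds to the 10% reported.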
To examine whether the presence of SD can generate differences between groups in the CoPsoQ items, the factorial scores of the participants on the SD factor were calculated and the participants were divided into two groups, using the median as the cut-off point. Next, the means of the SD saturations of each factor in both groups (with SD control and without SD control) were compared. Table 7 shows the means for all the factors expressed in T scores (mean 50 and standard deviation 10).
Factorial scores for high and low SD groups (n = 563)
**p < .001 *p < .05.
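The median split and the T-score scaling mentioned above can be sketched as follows. The function names are hypothetical, and two details are assumptions not stated in the source: the population standard deviation is used for standardization, and participants exactly at the median are assigned to the low SD group.

```python
import numpy as np

def to_t_scores(scores):
    """Rescale scores to a mean of 50 and a standard deviation of 10."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean()) / scores.std(ddof=0)  # assumption: population SD
    return 50.0 + 10.0 * z

def median_split(sd_scores):
    """Return True for participants above the median SD score (high SD group)."""
    sd_scores = np.asarray(sd_scores, dtype=float)
    return sd_scores > np.median(sd_scores)  # ties at the median count as low SD

# Hypothetical SD factor scores and content factor scores for six participants.
sd_factor = np.array([-1.2, -0.4, 0.1, 0.3, 0.9, 1.5])
content_factor = np.array([0.2, -0.5, 0.0, 0.7, 1.1, 0.4])

high_sd = median_split(sd_factor)
t_scores = to_t_scores(content_factor)
group_means = {"high SD": t_scores[high_sd].mean(), "low SD": t_scores[~high_sd].mean()}
```

Expressing every factor on the same T-score metric makes the group means directly comparable across factors.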
The multivariate analysis of variance (MANOVA) on the factorial scores without eliminating the effect of SD shows significant differences between the groups (F (4,558) = 13.606; p < .001). Specifically, the univariate analysis was significant (p < .001) for factors F2 (Psychological demands at work) and F4 (Social support and leadership quality). In the first case, the group with high SD scores obtained higher scores on the Psychological Demands at Work variable than the low SD group, that is, it showed greater exposure to risk. In the measure of social support and leadership quality, the group high in SD showed lower scores than the group low in SD, which again implies greater exposure to risk, since in this factor lower scores indicate higher risk.
On the other hand, when the effects of SD are removed, the MANOVA remains significant but the effect is much smaller (F (4,558) = 2.470; p < .05). In fact, the only difference at the univariate level was found in the factor related to Psychological Demands at Work. Participants with higher SD showed higher scores on Psychological Demands at Work than participants with lower SD.
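With only two groups, a MANOVA is equivalent to Hotelling's two-sample T² test, whose exact F statistic has p and N - p - 1 degrees of freedom. With four factors and N = 563 this gives (4, 558), which matches the degrees of freedom reported above. The sketch below illustrates that computation on simulated scores; the source does not say which multivariate statistic was reported, and the data and function name are hypothetical.

```python
import numpy as np

def two_group_manova_f(group_a, group_b):
    """Exact F statistic for a two-group MANOVA via Hotelling's T-squared."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    n_a, p = a.shape
    n_b = b.shape[0]
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    # Pooled within-group covariance matrix of the p dependent variables.
    pooled = ((n_a - 1) * np.cov(a, rowvar=False)
              + (n_b - 1) * np.cov(b, rowvar=False)) / (n_a + n_b - 2)
    t_squared = (n_a * n_b) / (n_a + n_b) * mean_diff @ np.linalg.solve(pooled, mean_diff)
    df1, df2 = p, n_a + n_b - p - 1
    f_stat = df2 / (p * (n_a + n_b - 2)) * t_squared
    return float(f_stat), df1, df2

# Simulated factor scores: 563 participants, 4 factors, split into two groups.
sim_rng = np.random.default_rng(0)
sim_scores = sim_rng.standard_normal((563, 4))
f_stat, df1, df2 = two_group_manova_f(sim_scores[:282], sim_scores[282:])
```

A p value could then be obtained from the F distribution with these degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df1, df2)` if SciPy is available.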
Regarding the explanation of SD in item 5, it would be socially desirable to answer affirmatively, since respondents would be adding value to their work by pointing out that their functions go beyond the tasks of their job, showing that they are valuable members of the organization, as noted in Gamero [40] and Gamero-Burón and González [41].
In the comparison analysis between groups, the results are very revealing, since there are statistically significant differences. The group in which the SD bias is not eliminated presents greater exposure to the risk dimensions relative to F2 and F4, which cannot be considered a result of chance. This is an important finding, because if the results of the questionnaires are considered without controlling for SD, their interpretation would entail implementing corrective preventive measures of a greater magnitude than if the resulting data is considered once the bias has been filtered.
The results obtained indicate that the most sensitive SD factors were F2 (psychological demands) and F4 (social support and leadership quality), compared to the rest of the factors. As an example, scoring high on the item of F2 “you have to work very fast” and scoring low on the item of F4 “it can be said that your immediate boss plans tasks well” would form part of the social stereotypes from the point of view of the employee. Thus, working quickly or under pressure would be the most desirable response due to its positive connotations, since it would imply greater productivity by being able to perform more tasks in less time, a quality that is known to be highly valued by employers [42].
Therefore, the interpretations and inferences made without controlling this response tendency should be taken with caution, especially regarding the variables of double presence and psychological demands. Otherwise, the results could lead to the implementation of preventive interventions that do not correspond to the real needs.
It should be noted that, as indicated by Meliá, a problem with psychosocial assessment instruments is “the lack of specialized training by many professionals dedicated to occupational health and safety, in such disciplines as psychological assessment, psychometrics and work psychology” (paragraph 4).
In short, all self-report scales, including those for psychosocial risk assessment, are subjective measures that have in common the measurement of reality as perceived by the participants, and it is this perception that can be modulated by various elements such as SD or self-deception. That is, it is not enough to administer a questionnaire and interpret the information coming out of the corresponding software; it is also necessary to minimize unwanted variance such as that due to SD.
Also, it can be stated that SD influences the results of the scale by distorting the scores, since the results show that some workers tend to mark the response options that best identify them as valuable subjects within the organization and within their family environment (double presence, psychological demands).
The findings of this study are considered particularly useful for professional practice, as they provide evidence on the influence of SD on the scales that evaluate work stressors. In addition, the study explains how to analyze and correct this bias using the technique of Ferrando et al. [25], which makes it possible to quantify the degree of distortion of the results and improve the validity of the data obtained. These achievements represent progress towards correct risk assessment, since, on the one hand, they help to reduce subjectivity when determining risk levels and, on the other hand, they help to quantify and control the degree of error inherent in SD.
This study has made it clear that identifying, controlling and minimizing possible sources of error in an investigation requires knowledge of test theory and more sophisticated data analysis techniques, since they help to increase the significance of the results and would therefore allow greater progress in this field.
Therefore, it would be necessary to transfer these notions in greater depth to the area of the psychosocial evaluation of work origin, both theoretically and practically.
The practical implications derived from this study point to the need to raise awareness in the scientific and professional community about the importance of SD biases in self-report items that measure working conditions and about the consequences of undervaluing them. In this sense, this study can help to improve the interpretations derived from psychosocial risk assessment self-reports and thereby improve their quality. Likewise, given the results obtained from this research, the institutions in charge of developing evaluation instruments could redesign or promote new tools, taking special care in the wording of the items that measure factors sensitive to SD, such as double presence and psychological demands, which would constitute a methodological advance in psychosocial risk assessments.
Finally, it is necessary to point out that, although the research has offered information of great importance and utility for the prevention of psychosocial risks of occupational origin, it is not exempt from limitations, which it would be convenient to contemplate in future studies that follow the research lines presented here. One of the limitations lies in the very subjective nature of the data collected, a limitation common to all research carried out exclusively with self-reports. Regarding the sample used, as it proceeds from a non-probabilistic sampling, it prevents the results from being safely generalized to the population. Future works that intend to replicate or improve this study should use random sampling to dispel doubts about the representativeness of the sample. Another limitation, referring to the sample and derived from the recruitment system, has to do with the fact that the respondents belong to different and varied organizations, and therefore it cannot be guaranteed that there are no other variables that can explain the found data.
In conclusion, despite the limitations mentioned, this research serves as support to confirm the role of response biases in the validity of the results obtained with self-reports in psychosocial risk assessments. Considering the lack of scientific literature on this topic, studies that explore SD in measures with the same purpose, such as FPSICO or DECORE, would be welcome; they would help to broaden this line of research, both refuting and improving the methodological solutions provided here.
Ethical approval
Not applicable.
Informed consent
Informed consent was obtained from all participants included in the study.
Conflict of interest
The authors declare that they have no conflicts of interest.
Acknowledgments
Not applicable.
Funding
Not applicable.
