Abstract
Intimate partner violence (IPV) is prevalent worldwide, including in Latinx populations. Reported rates of IPV in Latinx populations vary widely, indicating that measurement errors may be impeding researchers’ and clinicians’ understanding of IPV in these populations. We conducted a systematic review across a range of social science databases to evaluate psychometric properties and translation methodologies of Spanish-language IPV measures. Records were included if they contained Spanish-language measures assessing IPV victimization. We identified 91 records with a total of 70 measures and evaluated the measures’ extant psychometric evidence using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN). For the measures translated from English to Spanish, we evaluated the translation methodology based on best-practice recommendations for achieving translations that are psychometrically equivalent to their original versions. We found that validation information about measures was sparse and that few translations adhered to best-practice recommendations. Based on our a priori criteria, we recommend the Plazaola-Castaño translation of the Index of Spouse Abuse. In closing, we discuss the validity evidence of translated measures independent of the original language version and best-practice recommendations in translating psychological measures.
Intimate partner violence (IPV) can be defined as physical, sexual, or psychological violence directed at an individual by a spouse, dating, or sexual partner (Breiding et al., 2015). About one in three women and one in five men experience IPV in their lifetimes, both within the United States and worldwide (Breiding et al., 2015; WHO, 2021). Experiencing IPV has been associated with various negative physical, psychological, and economic outcomes for survivors, including injuries from physical violence (Sheridan & Nash, 2007), increased risk for internalizing disorders and serious mental illness (Chandan et al., 2020), and reduced lifetime earning potential (Greeson et al., 2011). There is no reason to believe that IPV in Latinx communities would be distinct from that of the rest of the domestic population, but reported prevalence rates vary widely (Gonzalez et al., 2020).
In this article, we systematically review the validity of Spanish-language measures of IPV, including the translation methods for measures translated into Spanish, and provide recommended measures based on a priori criteria. The validity problems of many English-language IPV measures (Alexander et al., 2022), including significant limitations in detecting IPV victimization using self-report and clinician-administered IPV assessments (Follingstad & Rogers, 2013; Kataoka et al., 2010), suggest that there may be similar problems with Spanish-language IPV measures. While there is no inherent reason the validity of measures developed in Spanish would be better or worse than those developed in English, the act of translating IPV assessment instruments from one language to another adds another potential source of measurement error. In any case, given that 36% of Latinx individuals living in the U.S. are not proficient in English (Pew Research Center, 2019), we feel that the time is right for a review of Spanish-language IPV measures.
Challenges in Assessing for IPV
Despite the prevalence and impact of IPV, there are significant challenges associated with its accurate assessment. Although screening for IPV in healthcare settings is beneficial (Curry et al., 2018), several barriers exist that prevent detecting IPV victimization. We outline these challenges below.
Challenges Associated with Measurement Format
A primary issue associated with accurate assessment of IPV is measurement format (e.g., face-to-face interview, self-report checklist, etc.), and results are mixed as to which formats are best suited to assess IPV experiences. Indeed, reported rates of IPV may vary based on the format of assessment, from 19.4% through interviews to 29.4% through self-administered questionnaires (Kataoka et al., 2010). Some studies suggest that patients prefer or are more likely to disclose abuse in a self-report format (MacMillan et al., 2006; Rhodes et al., 2006). However, another study comparing reporting in interviews to self-report formats found that participants were more likely to disclose physical abuse in the interview than emotional abuse in the self-report checklist (Svavarsdottir, 2010). These findings indicate that there are limitations to assessing IPV across formats.
There are challenges associated with both clinician-administered and self-report formats for assessing IPV. Commonly cited barriers for clinician-administered measures include discomfort among healthcare workers, lack of time, and lack of knowledge of appropriate screening methods (Sprague et al., 2012). Across both assessment formats, issues may include respondent underreporting, differing definitions of what constitutes psychological or emotional abuse, and lack of context in assessments leading to lack of consideration for self-defense (Follingstad & Rogers, 2013). Furthermore, IPV reporting may not be consistent over time, indicating that assessments may be subject to problems with memory or item interpretation and should be paired with complementary assessment methods (Abramsky et al., 2022; Loxton et al., 2019). Thus, IPV assessments are subject to response inconsistency associated not only with format but also with characteristics of administrators and respondents.
Challenges Associated with Validation Evidence
While many assessment measures exist in both clinician-administered and self-report formats, very few are validated. Alexander et al. (2022) conducted a systematic review evaluating the validity evidence of all available English measures of IPV. They identified over 80 measures of IPV but found that fewer than a quarter of these measures met established standards of psychometric quality. The use of measures with insufficient evidence of validity may lead to overlooking individuals experiencing IPV, who will then be unable to access appropriate services. Thus, lack of sufficient validation information about IPV measures represents an additional challenge to assessing for IPV broadly.
IPV in Latinx Communities
The prevalence of IPV in Latinx communities in the U.S. has been found to be similar to that of the overall population (e.g., Sabina et al., 2015), although rates have also been found to range widely from 4% (Latta et al., 2016) to 80% (Cavanaugh et al., 2014). A systematic review by Gonzalez et al. (2020) found that 41% of studies examining prevalence of IPV in Latinx populations had prevalence rates that were higher than the national average. These widely varied rates of reported IPV may be due to a number of factors, including reporting hesitance due to immigration status and other sources of mistrust of the criminal justice system (Messing et al., 2015). While one source of variation in prevalence rates may be group differences, given the challenges associated with assessing IPV in Latinx communities, we hypothesize that differences in prevalence rates are due in part to measurement error.
Contributing to difficulties in assessing IPV in Latinx communities are challenges associated with participants’ English-language proficiency (Bauer et al., 2000). About 13% of the total U.S. population speaks Spanish at home (United States Census Bureau, 2020) and over a third of Latinx individuals in the U.S. are not proficient in English (Pew Research Center, 2019), indicating a need for Spanish-language measures of IPV. Although there are a number of IPV measures developed in Spanish (e.g., Inventario de Abuso Psicológico en las Relaciones de Pareja; Calvete Zumalde et al., 2005; la Escala Multidimensional de Violencia en el Noviazgo; García-Carpintero et al., 2018), the most commonly used measures tend to be translations of measures developed in English (e.g., Abuse Assessment Screen; Escribà-Agüir et al., 2016; Modified Conflict Tactics Scale; Muñoz-Rivas et al., 2007). Translated measures may not always share the same psychometric properties as their original language versions (Budruk, 2010) and translated measures must be validated separately to ensure that they are adequately assessing IPV in Spanish-speaking populations. Despite the importance of validating translations, translations are often produced and used before rigorous validity testing occurs, indicating that results from these translations must be interpreted cautiously. These problems compound the aforementioned issues associated with assessing IPV more broadly and further diminish the potential validity of IPV assessments among those whose first language is Spanish.
Best Practices in Translating Measures
Best practices for translation of psychological measures emphasize multi-phase, multi-method processes for achieving final translations. Processes such as Translation, Review, Adjudication, Pre-testing, and Documentation (Harkness et al., 2003) and the International Test Commission Guidelines for Translating and Adapting Tests (Bartram et al., 2018) exemplify this approach. These recommendations include using team-based approaches to developing initial translations, which can prevent individual biases, lack of cultural familiarity, or lack of knowledge about psychological test construction from negatively influencing translation quality. They also recommend thorough pre-testing, or gathering data on the performance of a translated measure, prior to its formal use. Pre-testing is most informative when it is done with members of the target population for a particular measure (e.g., Goerman, 2006), and often takes the form of small focus groups targeting comprehensibility of the translation (e.g., Daniel et al., 2011). These methods also advocate for use of both qualitative and quantitative methods in pre-testing translations, as quantitative methods allow for more precise examination of language-group differences in response patterns and qualitative methods provide insight into how participants are thinking of and understanding items or response options (Boateng et al., 2018).
Commonly Used Translation Methods
Despite consensus regarding which methods are preferable to achieve conceptually and psychometrically equivalent translations of existing measures, these methods are not consistently used. Frequently used translation methods that are known to produce nonequivalent translations when used in isolation include back-translation (Bolaños-Medina & González-Ruiz, 2012) and translation from single individuals, even if those individuals are trained in translation (Wild et al., 2005). Additionally, few studies conduct formal pre-tests with their target population prior to using newly translated measures (Rios & Sireci, 2014). Measures that are not developed using thorough translation methods may lack conceptual or psychometric equivalence to their original language versions (Bolaños-Medina & González-Ruiz, 2012). A lack of equivalence between language versions of a measure may lead to inaccurate conclusions regarding the presence of the constructs purported to be measured by a translation.
Aims
In the current article, we address a clear gap in the literature by systematically evaluating the psychometric evidence supporting Spanish-language measures of IPV and the methods used in developing Spanish translations of IPV measures. Our aims for this review were: (a) to evaluate the available reliability and validity evidence for Spanish-language measures of IPV, (b) to evaluate the methods used in translating measures of IPV into Spanish and compare them with best-practice recommendations, and (c) to provide a list of recommended measures for use. We also discuss next steps for the field with regard to translating and validating Spanish-language IPV measures.
Method
Search
We conducted a systematic search during February of 2022 through PsycINFO, PsycARTICLES, MEDLINE, Health and Psychosocial Instruments, Chicano Database, Mental Measurements Yearbook with Tests in Print, Social Work Abstracts, Social Sciences Full Text, Sociology Source Ultimate, and PubMed. We included all items published prior to the date of the search. Our Boolean operator search terms were as follows: (IPV OR “intimate partner violence” OR “domestic violence” OR “partner violence” OR “domestic abuse” OR “partner abuse” OR “couple violence” OR “marital abuse” OR “marital violence” OR “couple abuse”) AND (Spanish OR Latin* OR Hispanic OR Mexic*) AND (measure* OR assess* OR inventory OR questionnaire OR instrument OR index OR scale). Because two years elapsed between the initial search and the publication of this review, a second, identical search was conducted prior to publication to capture any articles published since the initial search. The numbers presented below represent both searches.
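The search string above has a three-block structure: IPV terms, language or population terms, and measurement terms, joined with OR inside each block and AND across blocks. A minimal Python sketch of assembling such a string follows. The helper names (`or_block`, `build_query`) are illustrative assumptions and not part of any database interface, and individual databases may expect different quoting or truncation syntax.

```python
# Illustrative sketch: rebuilds the Boolean string reported in the Search section.
# The term lists are copied from the text; the helper functions are hypothetical.
IPV_TERMS = [
    "IPV", "intimate partner violence", "domestic violence", "partner violence",
    "domestic abuse", "partner abuse", "couple violence", "marital abuse",
    "marital violence", "couple abuse",
]
POPULATION_TERMS = ["Spanish", "Latin*", "Hispanic", "Mexic*"]
MEASURE_TERMS = [
    "measure*", "assess*", "inventory", "questionnaire",
    "instrument", "index", "scale",
]


def or_block(terms):
    """Join terms with OR inside parentheses, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*blocks):
    """AND together one OR-block per concept."""
    return " AND ".join(or_block(block) for block in blocks)


QUERY = build_query(IPV_TERMS, POPULATION_TERMS, MEASURE_TERMS)
print(QUERY)
```

Keeping each concept in its own list makes it easy to rerun an identical search later, as was done for the second search before publication.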
Our search yielded 4,608 results, including articles, books, chapters, theses, and dissertations. Fifty-one additional records were found through the reference sections of items that reported their Spanish measures as having been translated or validated in a previous study. Four additional records were located through a search informed by a review of IPV measures by Gómez Fernández et al. (2019). These records were included in the following evaluation of reliability and validity evidence.
Article Selection Criteria
After duplicates were removed, 3,199 records remained for evaluation. Records were excluded if they were not available in English or Spanish. Records were screened by title and abstract and were excluded if they were not related to IPV in Spanish-speaking women. After screening records by title for relevance, 2,470 remained. Of these, 1,056 were removed through abstract screening, leaving 1,414 for full-text evaluation. Full-text exclusion criteria included records that included only measures of violence against children, elders, or family, measures of sexual violence outside the context of an intimate relationship, measures of individuals’ perceptions of violence, measures of IPV perpetration, or outcome measures. After full-text evaluation, 470 records were identified as having Spanish measures of IPV. These records included measures that had been developed in or validated on participants in a range of Spanish-speaking countries, including, but not limited to, Spain, Mexico, Guatemala, and Colombia. Of these records, 91 reported relevant validity evidence. All inclusion and exclusion criteria were determined a priori. The results of the search are presented in Figure 1 according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.

Figure 1. Identification and sorting of records according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.
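The record counts reported above can be checked with simple arithmetic. The sketch below is a bookkeeping aid only, using the counts stated in the text with illustrative variable names. The text does not state whether the 55 hand-located records entered the flow before or after duplicate removal, so the total identified is computed simply as the sum of the three sources and is not used to derive a duplicate count.

```python
# Counts as reported in the Method section; variable names are illustrative.
database_hits = 4608
from_reference_lists = 51
from_prior_review = 4

# Screening stages in the order described in the text.
stages = [
    ("after duplicate removal", 3199),
    ("after title screening", 2470),
    ("after abstract screening (full text assessed)", 2470 - 1056),
    ("records with Spanish IPV measures", 470),
    ("records reporting validity evidence", 91),
]


def excluded_per_stage(stage_list):
    """Return (stage label, number of records dropped on the way into it)."""
    return [
        (label, previous - current)
        for (_, previous), (label, current) in zip(stage_list, stage_list[1:])
    ]


total_identified = database_hits + from_reference_lists + from_prior_review
exclusions = excluded_per_stage(stages)

print(f"records identified across all sources: {total_identified}")
for label, count in stages:
    print(f"{label}: {count}")
for label, dropped in exclusions:
    print(f"dropped before '{label}': {dropped}")
```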
Data Extraction Process
After finalizing the list of relevant records, we identified the IPV measures used in each record, whether measures were translated from English or developed in Spanish, and the psychometric evidence for each Spanish-language measure. We then evaluated the reported reliability and validity evidence for each measure in an article according to the standards established by COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN; Prinsen et al., 2018). COSMIN standards provide a standardized structure for evaluating psychometric properties of psychological instruments. Measures were identified as having sufficient (+), indeterminate (?), or insufficient (−) evidence supporting structural validity, internal consistency, reliability, measurement error, hypotheses testing for construct validity, cross-cultural validity/measurement invariance, criterion validity, and responsiveness (Prinsen et al., 2018, p. 28). Reliability and validity evidence were then aggregated across all articles that reported relevant evidence for a particular measure, providing an overall evaluation of psychometric evidence for each measure.
For measures that were translated from English into Spanish, we also evaluated the rigor of translation methodology. We considered measures to be well-translated if they: (a) described using a well-supported method of initial translation, (b) pre-tested translations prior to their use, and (c) made subsequent revisions to translations after pre-testing. These criteria are based on existing recommendations for translation best practices (Harkness et al., 2003; Hendershot et al., under review).
Recommendation Criteria
We identified a priori the criteria needed in order to recommend a measure for use, based in part on criteria identified in a review conducted by Alexander et al. (2022). First, each measure must be evaluated in at least two validation studies. Second, if translated, the original English measures utilized must show sufficient validity evidence, as evaluated by Alexander et al. (2022). Third, recommended measures must display at least half positive (+) COSMIN ratings across the identified records utilizing the measure, indicating that evidence supporting reliability and validity of a measure is stronger than evidence against it. Finally, translated measures must be developed through recommended methods for achieving psychometrically equivalent translations.
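The four criteria can be read as a sequential filter, which the following Python sketch illustrates. It rests on several labeled assumptions: the `Measure` fields and function names are hypothetical, the "at least half positive" rule is implemented as sufficient (+) ratings being at least as numerous as insufficient (−) ratings, with indeterminate (?) ratings ignored, and the translation check uses the three translation-quality conditions described in the Data Extraction Process section. The example measures are invented for illustration and are not data from this review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measure:
    """Hypothetical summary record for one Spanish-language IPV measure."""
    name: str
    n_validation_studies: int
    translated: bool
    original_recommended: Optional[bool]  # None if developed in Spanish
    n_sufficient: int                     # count of "+" COSMIN ratings
    n_insufficient: int                   # count of "-" COSMIN ratings
    described_initial_method: bool = False
    pretested: bool = False
    revised_after_pretest: bool = False


def first_failed_criterion(m: Measure) -> Optional[str]:
    """Apply the criteria in order; return the first one failed, or None."""
    if m.n_validation_studies < 2:
        return "fewer than two validation studies"
    # Measures developed in Spanish are exempt from the original-version check.
    if m.translated and not m.original_recommended:
        return "original English version lacks sufficient validity evidence"
    # Assumed reading: "+" ratings must at least match "-" ratings.
    if m.n_sufficient == 0 or m.n_sufficient < m.n_insufficient:
        return "insufficient ratings outweigh sufficient ratings"
    if m.translated and not (
        m.described_initial_method and m.pretested and m.revised_after_pretest
    ):
        return "translation methodology"
    return None


# Invented examples, not measures from this review.
measure_a = Measure("Hypothetical A", 4, True, True, 5, 1, True, True, True)
measure_b = Measure("Hypothetical B", 4, True, True, 5, 1, True, True, False)
measure_c = Measure("Hypothetical C", 1, False, None, 3, 0)

for m in (measure_a, measure_b, measure_c):
    print(m.name, "->", first_failed_criterion(m) or "recommended")
```

Returning the first failed criterion, rather than a simple yes or no, mirrors how the Results report the number of measures remaining after each step.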
Results
Our search resulted in a total of N = 305 Spanish-language IPV measures being located, including informal measures and alternate, adapted, or shortened forms of currently existing measures. Of the N = 305 measures located, n = 148 (n = 74 of which were adapted or shortened forms) were originally developed in English and translated into Spanish, n = 50 (n = 4 of which were adapted or shortened forms) were originally developed in Spanish, and n = 107 were informal measures (e.g., measures that were unnamed or described as items assessing IPV, etc.). The majority of measures identified were excluded due to not having validity evidence reported in the records located in our search (n = 235). The remaining measures (n = 70) had validity evidence reported in the records located in our search and were eligible to be assessed by our evaluation criteria (Figure 2).

Figure 2. Measures meeting criteria for recommendation.
Aim 1: Evaluating Psychometric Properties of Spanish IPV Measures
We evaluated psychometric evidence across all measures with available validity information. We found that the most common rating received across all areas of COSMIN criteria was indeterminate (?; 86.6%). This indicates that studies largely lacked sufficient information to accurately assess the validity of the measures being used. The second most common rating was sufficient (+; 8.7%), which indicates that the psychometric evidence presented meets COSMIN criteria. The least common rating was insufficient (−; 4.6%), which indicates that the psychometric evidence presented does not meet COSMIN criteria.
Aim 2: Evaluating Translation Methodology of Spanish IPV Measures
Across all translated measures with psychometric evidence (N = 48), n = 22 were reported as translated without pre-testing, n = 14 were reported as translated with pre-testing, and n = 12 had no reported information regarding translation methods. Of the measures whose translation methods included pre-testing, only n = 8 were reported to have been revised after pre-testing.
In addressing this aim, we found that some English IPV measures were translated by multiple research teams independently. For example, the Index of Spouse Abuse (ISA; Hudson & McIntosh, 1981) was translated independently by Cáceres Carrasco (2002) and Plazaola-Castaño et al. (2009). As these independent translations varied considerably in phrasing and evidence of validity, we treat independent translations as unique measures in our assessment of which measures met criteria for recommendation (see Table 1 for all translation versions for each measure). Thus, some measures had independent translations that met all or most of our criteria while other independent translations of the same measure did not.
Critical Findings: Spanish Measures of IPV with 2+ Articles with Validity Evidence with Recommended Measures Identified.
Bolded measures indicate that a measure met all our criteria for recommendation.
Asterisks indicate measures that met all our criteria for recommendation except translation methodology.
Furthermore, we found that some articles reported using a particular independent translation of a measure but presented items that differed significantly from the items in the cited translation. For example, Santos-Iglesias et al. (2013) reported using a translation of the ISA that was developed by Cáceres Carrasco (2002). However, when we compared the items reported by Santos-Iglesias et al. with those reported by Cáceres Carrasco, we found significant differences in item phrasing and length. As described above, we treated versions whose item phrasing differed from that of the cited translation as independent translations of the original English measure.
Aim 3: Measures Meeting Criteria for Recommendation
Measures Assessed in at Least Two Studies
Of the N = 305 measures identified, n = 70 had available validity information. Of these, n = 12 met our criteria for having at least two studies with validity information. Eleven (n = 11) of these measures were developed in English and translated into Spanish and one (n = 1) was developed in Spanish (see Figure 2).
Measures Translated from Well-Validated Original Versions
We found that n = 4 of the remaining measures met our criteria for being developed from well-validated original English versions. Measures met this criterion if they were developed from measures that were recommended in Alexander et al.’s (2022) evaluation of English-language IPV measures. The measure developed in Spanish that met previous criteria was exempt from this criterion and was retained, for a total of n = 5 measures meeting criteria up to this point in our evaluation (see Figure 2).
Measures with More Positive Psychometric Evidence than Negative
Of the n = 5 measures retained, n = 3 met criteria for having more positive (+; sufficient) ratings regarding psychometric properties than negative (−; insufficient), according to COSMIN criteria (Prinsen et al., 2018). The n = 3 measures remaining after this added criterion were all translations of English measures.
Translation Quality
Of the remaining n = 3 measures, only n = 1 met our threshold for translation methodology.
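The stepwise counts reported in the Results can be laid out as a funnel. The sketch below only restates numbers given in the text and checks that they are mutually consistent; the stage labels and variable names are illustrative.

```python
# Origin of the 305 located measures, as reported in the Results.
origin_counts = {
    "translated from English": 148,
    "developed in Spanish": 50,
    "informal": 107,
}

# Measures remaining after each recommendation step, as reported in the text.
funnel = [
    ("Spanish-language IPV measures located", 305),
    ("had validity evidence", 70),
    ("assessed in at least two studies", 12),
    ("well-validated original, or developed in Spanish", 5),
    ("more sufficient than insufficient ratings", 3),
    ("met translation-methodology threshold", 1),
]

# Number of measures dropped on the way into each later stage.
dropped = [
    previous - current
    for (_, previous), (_, current) in zip(funnel, funnel[1:])
]

print(f"total located: {sum(origin_counts.values())}")
for (label, remaining), lost in zip(funnel[1:], dropped):
    print(f"{label}: {remaining} remaining ({lost} dropped)")
```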
Excluded at this Step
The Conflict in Adolescent Dating Relationships Inventory (CADRI; Wolfe et al., 2001) is a 50-item measure assessing IPV perpetration (25 items) and victimization (25 items). The CADRI was developed to assess IPV in high school-age populations and assesses a range of abusive behaviors including sexual, physical, and psychological abuse. The Spanish version of the CADRI assessed here was developed by Fernández-Fuentes et al. (2006). Although it meets our criterion for displaying good psychometric properties, the developers do not provide any information about how the CADRI was translated. Thus, we are not able to evaluate whether this measure meets our criterion for translation methodology. Given the good psychometric properties of the Spanish translation of the CADRI, and despite the lack of information about its translation methodology, we cautiously recommend this measure for use.
The Woman Abuse Screening Tool (WAST; Brown et al., 1996) is a brief measure of physical and emotional abuse, designed to be administered by family physicians. The Spanish version of the WAST was developed by Fogarty and Brown (2002). We identified four articles that evaluated the validity of this translation of the WAST. Although the WAST met all psychometric criteria for recommendation, we did not find evidence that developers made changes to translations after pre-testing the WAST. Thus, it did not meet our criteria for translation methodology. However, given its positive psychometric evidence, we cautiously recommend this measure for use.
Recommended Measure: Plazaola-Castaño et al. Translation of the ISA
Figure 2 displays how many measures were eliminated at each threshold for recommending a measure. Only one measure met all criteria for recommendation: the Plazaola-Castaño et al. (2009) translation of the ISA (see Supplemental Material Appendix). The ISA (Hudson & McIntosh, 1981) is a 30-item self-report measure that assesses both presence and severity of physical and psychological IPV. We found four articles with psychometric evidence for this translation. Collectively, they indicated that this translation of the ISA has evidence of both reliability and validity and that its development followed best-practice recommendations for translation, including back-translation to achieve an initial translation, expert content analysis, and pre-testing with a small group of women.
Discussion
In this review, we sought to provide better insight into the psychometric evidence available for Spanish measures of IPV and the methods being used to develop Spanish translations of IPV measures. We identified that the majority of measures located lacked any sort of information that could be used to assess psychometric properties through COSMIN criteria. Of measures that did have relevant reliability and validity information, most information led to “indeterminate” ratings, rather than providing evidence for or against adequate psychometric properties. We similarly found that most studies did not report using best practices in developing translations nor provide any information regarding translation methodology. Our evaluation of located measures resulted in the identification of a single measure that met all of our a priori criteria for recommendation as well as two additional measures that met all criteria except that of translation methodology. With this review, we were able to address several gaps in the literature, namely, a lack of systematic evaluations of translation methodology and a lack of evaluations of psychometric evidence of Spanish measures of IPV. We were also able to provide a measure to recommend for use.
This review builds on prior reviews by assessing a wide range of measures without limiting our search to specific categories or formats for measures, evaluating measures not just on their psychometric properties but also the psychometric properties of their original versions if they were translated, and adding additional criteria for recommendation based on previous reviews. Gómez-Fernández et al. (2019) provide a broad overview of available measures of IPV victimization, including a description of measures and an overview of reliability and validity evidence, but did not apply strict COSMIN criteria. Similar to our review, Martínez Soto and Ibabe (2022) use COSMIN criteria to evaluate measures specifically assessing cyber dating violence victimization and perpetration. We hope our review builds on these reviews and highlights the availability of validated measures of IPV available in Spanish.
This review also identifies a critical concern that, to our knowledge, has not been outlined elsewhere. We found that users of translated measures often made changes to the measures they used without reporting that revisions to the measure had been made. This presents a barrier to developing a thorough understanding of the psychometrics of translated measures, as even small changes to item phrasing can change their functioning (see Hendershot et al., under review), and researchers or clinicians seeking to identify valid and reliable measures for use may not recognize that articles may be referring to using a particular translation while presenting psychometric information on a significantly altered version of that translation. It is possible that researchers and practitioners feel comfortable revising translated measures in ways that they would not with measures that had not been translated. It is worth noting that there may be legitimate reasons for these edits, such as nuances in regional language use or variations in the meaning of translated phrases (Lopez et al., 2008). Nevertheless, as with measures developed in their target language, changing item phrasing can result in significant changes in item- or measure-level psychometric functioning (Hambleton, 2006). Although this review focuses on IPV, it is likely that this issue is present in other areas where researchers and practitioners are using translated measures.
Another area for concern is an overall lack of cross-cultural validation of IPV measures. As noted in the method, one of the COSMIN criteria is that of cross-cultural validity. Although we do not discuss measures’ ratings within each COSMIN criterion in the results, we note that few, if any, measures were tested for cross-cultural validity, either across language groups or across Spanish-speaking regions. As cross-regional and cross-linguistic differences can change measurement functioning (Van de Vijver & Tanzer, 2004), it is critical to evaluate the cross-cultural functioning of measures across countries or regions prior to their use for research or clinical purposes. It is especially important to consider the measures’ cross-cultural relevance because many of the measures evaluated were translations of measures originally developed in English. Thus, the lack of evaluation of measures’ psychometric properties across Spanish-speaking regions represents a limitation in the research in this area.
Limitations
We identified a number of limitations in this review. First, we encountered some challenges in identifying relevant articles. Validation and measures’ psychometric properties were not the focus of the majority of articles that were collected. This meant that we had to cast a fairly wide net, both in terms of the search terms utilized and the databases in which we searched, to identify relevant articles. Through these methods, we were able to access a sufficient number of measures to overcome this barrier. Second, similar to the first limitation, the focus of most of the articles we accessed was not on translation methodology. Translation is a cross-disciplinary topic, with articles on translation methodology appearing across different databases, including those focused on linguistics, cross-cultural social science, and medicine. This presented a barrier to our accessing of relevant articles. We believe that the varied search terms used and databases accessed minimized the impact of this limitation. Third, information about translation processes themselves can be difficult to label and categorize, given the range of terminologies and definitions for similar processes (Bardaji, 2009; Wild et al., 2005). However, given that most studies located discussed translation as a byproduct of their aims, rather than the focus, we believe that this limitation did not result in any substantial overlooking of relevant articles or measures. Finally, we used fairly blunt categorization methods in assessing the original language of measures and the quality of their translation methodology. If the language of a measure was not made explicit (e.g., participants spoke Spanish but the language of the measure was not stated), the measure was excluded. 
Similarly, if methods used for translation were not explicit (e.g., the study stated that translation methods were used but did not specifically describe procedures), then the translation method was labeled as “unclear” and would not meet our translation criteria. However, we believe that adhering to these strict methods was critical in maintaining rigor in our evaluation of measures both developed in and translated into Spanish.
Strengths
Despite these limitations, our review had a number of notable strengths. First, as we described, we made an effort to locate and evaluate any articles that indicated that they had Spanish-language IPV measures, whether they were translated into or developed in Spanish. As we recognized that neither translation methodology nor validation of measures was likely to be the focus of many articles that discussed these elements, we chose not to focus on them exclusively in our search terms, resulting in a broad range of results. Second, we developed rigorous a priori criteria for evaluating all measures located. In addition to using an established method of evaluating psychometric properties, we also had a set of criteria for measures both translated and developed in Spanish, allowing us to be more confident in the psychometric properties of the measures recommended for use. Finally, we provide recommendations for which measures display adequate evidence of reliability and validity and follow best practices in their translation methods, which has not been frequently done in similar reviews of IPV measures.
Next Steps
There are several directions for future research that follow from this review. First, we found that best-practice recommendations for translation are not always followed or are followed only in part (Rios & Sireci, 2014). It is understandable that researchers seeking to conduct translations may not have the resources to follow these recommendations in their entirety, which may be contributing to the number of translations that are developed without pre-testing or revision. However, this practice limits the cross-cultural equivalence of the translations being developed (Bolaños-Medina & González-Ruiz, 2012). One potential solution is for researchers to seek, revise, and pre-test existing translations prior to using them, and to document this process as well as associated validity information of the revised translation (see Hendershot et al., under review, for a review of this method). In this way, translations can be improved across multiple research teams, improving the quality and validity of translations over time. Second, we aim to develop a new Spanish measure of IPV based on the items in existing measures. To do this, we plan to access all items from each measure identified in this review, both formal and informal, and use item response theory to identify the items that are most predictive of IPV victimization across all existing measures (see Funk & Rogge, 2007, for a description of this method). Through this method, we aim to develop and validate a measure capitalizing on the strengths of items across existing Spanish measures of IPV. In this process, we will evaluate items for cross-cultural appropriateness. As regional differences in Spanish-language terms and phrases may contribute to lack of measurement validity (Van de Vijver & Tanzer, 2004), emphasizing comprehensibility across Spanish-speaking regions will improve the cross-cultural applicability of the measure we develop. Given the sparse evidence of validity and reliability for existing measures and the concerns outlined above about undocumented revisions to measures, this work is timely and much needed.
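The item-selection idea described above can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) item response model and ranks items by the information they provide across a range of latent trait levels, which is one common way to operationalize how useful an item is. The model choice, the function names, and the item parameters below are all hypothetical and for illustration only; they are not drawn from any measure in this review and do not represent the planned analysis.

```python
import math


def item_information(theta, a, b):
    """Fisher information of a 2PL item at latent trait level theta.

    a is the item's discrimination and b its difficulty (location).
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def rank_items(items, thetas):
    """Rank item names by information summed over a grid of theta values."""
    totals = {
        name: sum(item_information(t, a, b) for t in thetas)
        for name, (a, b) in items.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical (discrimination, difficulty) pairs; not estimated from real data.
ITEMS = {
    "item_1": (0.6, 0.0),
    "item_2": (1.8, 0.5),
    "item_3": (1.2, -1.0),
}

# Grid of latent trait levels from -3.0 to 3.0 in steps of 0.5.
THETAS = [x / 2 for x in range(-6, 7)]

RANKED = rank_items(ITEMS, THETAS)
```

In practice, item parameters would be estimated from response data with dedicated item response theory software, and items would also be screened for cross-cultural comprehensibility before selection.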
Conclusion
In this review, we present an evaluation of the psychometric properties of Spanish IPV measures and of the methods used in translating measures into Spanish. We identified an overall lack of sufficient validity evidence for most measures being used. We also found that methods used to develop translations rarely met translation standards for assessment instruments. We were able to recommend one measure for use based on a priori criteria. Based on these findings, we present recommendations for researchers’ and practitioners’ use of Spanish-language IPV measures. This review represents an important step in providing rigorous evaluation not just of psychometric evidence but also of translation methodology.
Implications for Practice, Policy, and Research
Based on the results of this review, we can provide several recommendations for researchers and practitioners seeking to use translated measures or develop their own translations. To assess IPV in Spanish, we recommend using the Plazaola-Castaño translation of the Index of Spouse Abuse, the one measure that met our a priori criteria. If using a measure without validity evidence, especially to evaluate individuals in a particular demographic group (e.g., pregnant women, adolescents), we recommend interpreting results with caution and describing the results as qualified by the lack of validity evidence.
For researchers and practitioners using translated assessment instruments more broadly, we recommend ensuring that the original versions of translated measures are themselves valid and separately investigating validity evidence for translations. When investigating validity evidence, we recommend confirming that items are consistent across articles purporting to be using the same version of a translation.
For researchers and practitioners seeking to develop their own translations, we recommend following the iterative best-practice recommendations for achieving conceptually equivalent translations (see Bartram et al., 2018). After translating measures, we strongly recommend validating them in the target population to ensure that the translations are functioning as intended. Before endeavoring to develop a new translation, we also recommend that researchers identify whether there are any existing translations that may be used. We came across several measures that had multiple translations (e.g., Hurt, Insult, Threaten, and Scream, translated independently by both Caldentey et al., 2017, and Chen et al., 2005). Each version had insufficient psychometric evidence to meet our criteria for recommendation, although the evidence combined across versions would have met those criteria. Furthermore, we recommend that all revisions of a translation be carefully documented and clearly stated in published works. It is possible that some translations may benefit from revision, but efforts should be made to follow best practices for revising measures (see Hendershot et al., under review).
This review aligns with the priority areas of study of the U.S. Department of Justice Office on Violence Against Women (OVW). As our review demonstrated, the use of valid Spanish-language measures of IPV would facilitate coping, healing, and achieving safety and justice. The use of the recommended measure also aligns with OVW’s effort to diminish the cultural differences in responding to IPV victimization and social inequalities associated with language barriers. By focusing on the most prevalent non-English language in the U.S., we hope to improve access to justice and services that are useful to victims of IPV crimes.
Supplemental Material
Supplemental material, sj-docx-1-tva-10.1177_15248380241259999 for Spanish-Language Measures of Intimate Partner Violence: A Systematic Review of Psychometric Evidence and Translation Methodology by Quinn E. Hendershot, Erin F. Reto, Alberto D. Torres-Aragón and Matthew D. Johnson in Trauma, Violence, & Abuse
Footnotes
Acknowledgements
We thank Jesse Smith and Saul Padilla for their help in data extraction for this project.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.