Abstract
Researchers give credit to peer-reviewed, and thus credible, publications through citations. Despite a rigorous reviewing process, certain articles undergo retraction once their ethical or scientific deficiencies are disclosed. It is, therefore, important to understand how society and academia react to erroneous or deceitful claims and purge science of their unreliable results. Applying a matched-pairs research design, this study examined a sample of medicine-related retracted and non-retracted articles matched by their content similarity. The regression analysis revealed similarities in the obsolescence trends of the retracted and non-retracted groups. The Generalized Estimating Equations showed that citations are affected by the retraction status, life after retraction, life cycle and the journals’ previous reputation, with the former two being the strongest positive predictors of citations. The retracted papers obtain fewer citations either before or after retraction, implying academia’s watchful reaction to low-quality papers even before the official announcement of their fallibility. They exhibit an equal or higher social recognition level regarding Tweets and Blog Mentions, but a lower status regarding Mendeley Readership. This could signify social users’ sensibility regarding scientific quality, since they probably publicise the retraction and warn against the retracted items in their tweets or blogs, while avoiding recording them in their Mendeley profiles. Further scrutiny is required to gain insight into the sensibility, if any, about scientific quality. The study’s originality lies in matching the retracted and non-retracted papers by their topics and neutralising variations in their citation potentials. It is also the first study comparing the groups’ social impacts.
1. Introduction
Peer review is a pillar of science guaranteeing the trustworthiness of scholarly outputs. However, research is always fallible, and even after rigorous pre-publication peer review, some articles are found, after publication, to contain seriously flawed or erroneous data or to have violated professional ethics through plagiarism, redundancy or a lack of transparency regarding conflicts of interest [1]. Retraction is, thus, a self-correcting, expeditious and democratic mechanism [2] through which scholarly publishers and journals warn researchers about the low quality, and therefore the untrustworthiness, of research [3]. Citing retracted publications has consequences such as drawing false and harmful conclusions [4] and, hence, increasing the risk of distorting science. As the articles citing the retracted contents do not witness a significant decrease in their scientific impact [5], it seems that the risk would live on through the citation chain [6].
The scientific community is supposed to detect and vigilantly react to misleading claims which are of low quality, wrong or worthless. Its members are expected to sceptically evaluate the claims and resist accepting and, hence, citing them. However, the retracted papers have been found to keep receiving citations [6–8]. Consequently, it would seem that the scientific community is not successful in divesting its knowledge stock of invalid findings. However, it is not realistic to expect the citations to the retracted articles to be absolutely zero from the outset or to cease abruptly after retraction, as the withdrawal of papers is most frequently due to misconduct (e.g. duplicated publication, plagiarism and fraud) rather than scientific weaknesses [9] or genuine errors [1]. Obviously, such misconduct is not necessarily detectable by the citing authors. Moreover, the motivations for citing articles, including the retracted ones, are not necessarily always scientific. There exist perfunctory citations which imply a rather superficial, if any, impact on the citing publications [10–12]. Moreover, citations to the retracted articles might be caused by citing authors’ lack of information literacy, low visibility of the retraction notes, the accessibility of different versions of articles not linked to the related retracted papers or to the retraction notes, the use of second-hand citations without referring to the original cited articles and so on [6,13]. Consequently, a more realistic expectation would be to observe a superficial impact, marked by a relatively lower number of citations from the outset, due to the intrinsically poor quality of the retracted papers. This would be expected to be followed by a more rapid pace of obsolescence after the withdrawal of the paper. Therefore, in order to fairly judge the reaction of the audience towards the retracted items, it is necessary to test them against some benchmarks (i.e. some still-valid papers dealing with the same subjects and having the same publication properties).
As far as our review of the literature revealed, there are few studies comparing the citation performance of retracted articles to that of a control group. For example, Furman et al. [2], Pfeifer and Snodgrass [14] and Rubbo et al. [15] compared the retracted and non-retracted articles controlled in terms of their publication years and publishing journals. As another example, Lu et al. [16] used the retracted papers or the prior publications by the same authors as the treatment group, and the papers with citation patterns similar to those of the treated papers prior to the date of retraction as the control group. Although papers’ topics are believed to be a determinant of their citation potential, no studies were found to compare the citation performance of a set of retracted and non-retracted articles matched in terms of their topics. Even within a certain discipline, there are citation-intensive areas and topics which are more likely to attract citations, namely exciting and popular topics [17,18], hot topics [19], controversial topics [20] and fundamental subjects [21]. As a result, important subject matters seemingly attract fewer citations, while popular or trivial topics are more likely to gain citations [22]. Moreover, in the previous studies, retracted and non-retracted articles were usually collectively compared in groups, while a matched-pair design would take any probable individual differences between the papers into consideration. Furthermore, no studies were found to deal with the retracted papers’ societal impact compared with a non-retracted group.
Scholarly articles do not exclusively affect readers-as-authors, but they also affect different social classes such as clinicians, students, practitioners, lawyers and professionals. Altmetrics (i.e. those metrics derived from social networks based on such user activities as downloading, viewing, bookmarking, liking and following) have recently been used widely to measure the societal impact of scholarly literature [23–25]. They are believed to measure the impact of scholarly outputs at a wider level [26], in different types [27–32] and on different audiences [25, 31, 33–35]. Therefore, they may help us make a more realistic and insightful judgement on the merits of papers, especially those under-evaluated by traditional metrics.
However, having been derived from a set of heterogeneous social media and networks [26], the altmetrics differ in their functions and potentials. For instance, while Mendeley and Twitter have a high number of academic users all around the world, the latter has also attracted users beyond academia [29,34,36]. Consequently, Twitter may reflect the social impact of papers [28] while Mendeley may provide a picture of their educational or applied impacts [29]. Moreover, social users are driven by different motivations when discussing, mentioning, bookmarking or recording papers in their social profiles or posts. Consequently, high altmetric counts cannot necessarily be considered as a quality indicator [33,37]. However, the metrics differ in their affinity to quality. Although both Mendeley Readership and tweets are able to predict early citations of papers [38,39], the former shows a high correlation with paper quality [26,40,41], while the latter shows a low correlation [26]. This may have roots in the differences in their users’ motivations. Mendeley users are motivated by papers’ academic and educational values [29] and, hence, they bookmark and record them for future reading or citing [42]. However, (micro)blogging may serve other goals, namely, criticising papers’ features [43–45], informing people, debating science-related events [45] as well as disseminating scientific findings [46]. (Micro)blogs are sometimes used for mere undigested dissemination with almost no sign of debate, contestation or collective reflection [47], but they are sometimes strictly devoted to criticising papers, especially fraud reports or retractions [48].
To investigate how society and academia react to the retracted articles, the present study examines their traditional and social mention quantities in comparison to those of a controlled sample of similar non-retracted papers in the field of medicine. It also studies the citation obsolescence trend of the retracted papers and their peers before and after retraction. The citations begin to decline over time, after reaching a peak. The peak occurs a few years after publication, depending on the disciplines. For instance, it appears after 2 years for medical papers, 2–3 years for life sciences [49] and 10 years for social sciences [50]. Although scientific articles become obsolete, it does not imply that their contents are revealed to be necessarily wrong or worthless; they perhaps are gradually integrated into the body of knowledge and become common knowledge [51].
The rationale underlying the selection of the field of medicine lies in its direct and indirect impact on people’s health and well-being, which requires accuracy of research data and findings and, hence, a complete and rapid cancellation or neutralisation of invalid and unreliable research findings. The contribution of the present study lies in the fact that it concentrates on a set of retracted and non-retracted papers published on the same topics in order to neutralise topics’ variations in their citation potentials. It also investigates the impact of the retracted articles on the public beyond the scientific community.
2. Literature review
2.1. The characteristics of the retracted articles
There have been extensive studies on the characteristics of the retracted articles in a specific discipline [52], database [53], country [54] and the reasons for retraction [55]. For example, Grieneisen and Zhang [56] surveyed the retracted articles across the full spectrum of scholarly disciplines in 42 of the largest bibliographic databases for major scholarly fields. They found that retractions spread across author affiliation countries and disciplines (e.g. Maths, Physics, Engineering, Social Sciences, Medicine, Life Science and Chemistry); yet, they represented only small fractions of a per cent among all publications for any given field, country, journal or year. Their results showed that limited proportions of articles were retracted due to alleged research misconducts (20%) or loss of faith in the data or interpretations as published (43%). The results of the research conducted by Lu et al. [16] confirmed that the retraction rate is the highest in the hard sciences, especially in biomedical and multidisciplinary journals, and the lowest in social science and arts and humanities. Singh, et al. [57] conducted a comparative analysis of articles retracted between 2004 and 2013 from biomedical literature. Their results showed that a total of 2343 articles were retracted during that time, and the original articles (1056) followed by case reports (783) constituted a major part of it. They also showed that time interval between submission and retraction of articles reduced, and journal impact factor (JIF) and retraction did not have any significant correlation. Moylan and Kowalczuk [58] studied retraction notices of 134 articles, and they found that the three most important reasons for retraction were some kind of misconduct, plagiarism and unreliable data. In another study, Nogueira et al. [59] surveyed retracted articles in dentistry published in journals covered in SCImago Journal Rank. 
They found that retractions were mostly due to the authors’ malpractice and were more frequently related to journals with less impact. Wang et al. [60] examined the reasons for retractions of articles from open access (OA) journals in biomedical research. Searching through PubMed, they found that the quantity and the rate of retractions increased since 2010. The most common reasons for retraction were errors, plagiarism, duplicated publication, fraud/suspected fraud and invalid peer review. The majority of the retracted articles were from journals with low JIFs, and they were authored by researchers from China, India, Iran and the USA.
2.2. Retraction and social and academic impact
Several studies have concentrated on the citation performance of the retracted articles. For instance, Pfeifer and Snodgrass [14] reported a 35% decrease in citations after the retraction of papers. Budd et al. [61] showed that the retracted papers in biomedical sciences continued to be cited. The citations explicitly or implicitly confirmed the validity of the papers. In another study, Budd et al. [7] examined the retracted articles in biomedical journals, the reasons for these retractions and the citations to the articles subsequent to the retraction. The results indicated that the citations to the retracted papers were confirmatory either implicitly or explicitly. The retraction of a publication, even though the retraction may have been visible in the journal and was clearly noted in the MEDLINE database, did not ensure that all subsequent researchers would be alerted to the retraction and would cease making reference to the retracted work. The cause of retraction seemed to matter little; citations might have continued to any retracted article. Most of the citations were positive. Shuai et al. [5] found a significant decrease in the impacts of articles after their retraction. Lu et al. [16] found a decrease in citations not only to the retracted articles, but also to the authors’ prior works when not self-reporting the mistakes. However, Neale et al. [62] found no significant diminution in the retracted articles’ citations. According to Korpela’s [63] findings, the retracted articles continued to receive negative and confirmatory citations. It took 24 years after retraction for the latter to cease. In contrast, Furman et al. [2] observed a severe decline in the citations to a set of retracted articles matched to a non-retracted article in terms of their publication year and the publishing journals. Peterson [64] studied whether OA and fee-for-access works differed in terms of the practice and the effectiveness of retraction.
By examining citations to the retracted articles indexed by the National Library of Medicine (NLM), he found that post-retraction citation diminished at approximately the same rate across the retracted biomedical literature regardless of the accessibility or the publishing model. He also found that OA literature did not differ from fee-for-access literature in terms of JIF and detection of an error. Bar-Ilan and Halevi [4] studied the retracted articles with reference to the rest of the literature and how their citations were influenced by their retraction. They analysed the context of citation as positive, negative or neutral. Their results showed that the majority of citations to the retracted articles were positive. They found that positive citations were done to the articles which were retracted due to ethical misconduct, data fabrication and false reports.
Teixeira da Silva and Bornemann-Cimenti [6] studied the reasons why retracted articles continued to be cited. They listed the reasons as authors not being aware of the retraction status of a paper, databases not linking retracted articles with the notice of retraction and many papers being deposited in their ‘original’ (i.e. pre-retraction) version on personal or institutional websites or online repositories.
On social platforms, higher altmetric attention scores are found to be associated with retraction due to misconducts [65]. Tweets are also believed to increase with the detection of misconducts [26]. Social discussions on the retracted articles are found to take place at an earlier stage than traditional media [66], and continue even after their retraction [67]. Mendeley reader counts of the retracted articles are also found to grow after retraction [68].
As the review of the literature indicates, the withdrawal of papers is of a global nature, involving different countries, disciplines, journals and access models. Although fraud and fallibility are among the major reasons for retraction, the retracted articles keep receiving credit, even after retraction. The citations are not necessarily refutational, but rather confirmatory, signifying the longevity of the undesirable impact of the untrustworthy or erroneous knowledge. The societal impact of the retracted articles is not only associated with but also triggered by retraction. No studies were found to control for subjects and topics when verifying and comparing the impacts of the retracted and non-retracted articles, even though topics vary in terms of their citation potentials.
3. Research questions
In order to reveal academia and society’s reactions to the retracted papers, the present study attempted to answer the following questions:
Do citations to the retracted papers decrease significantly in the after-retraction phase compared with the before-retraction phase?
Does the obsolescence trend of the retracted papers accelerate in the after-retraction phase compared with the before-retraction phase?
Are the retracted papers significantly lower in their citation count compared with their non-retracted peers of the same topics either before or after retraction?
Are the retracted papers significantly lower in their Tweets and Blog Mentions compared with their non-retracted peers of the same topics?
Are the retracted papers significantly lower in their Mendeley Readership compared with their non-retracted peers of the same topics?
4. Method
Using a matched-pairs research design, the present study concentrated on a sample of retracted papers in medicine matched to non-retracted papers similar in their subjects and controlled for their publication date, JIF and access models; their academic and societal impacts were studied. No control was imposed on the citation counts.
The research design has some advantages. First, the matched-pair design would take any probable differences between the papers into consideration. Second, the matching of the papers by their topics neutralises any differences in the impacts probably brought about by topics with different citation potentials. Third, by controlling for the publication date, JIF and access models, the effects of paper recency, journal reputation and open accessibility would also be taken into consideration. Fourth, the obsolescence trends of the retracted and non-retracted papers would be compared in before- and after-retraction phases.
However, the emphasis of the present study on matching the papers in terms of their topic similarity has some drawbacks. It could be difficult, if not impossible, to have a sample of retracted and non-retracted papers of adequate size, within which all the pairs similar in their contents are also published in the same years, journals and access models. Consequently, the papers within each pair in the studied sample do not necessarily share the same features. To address this, the regression model used to analyse the citation performance of the papers was run at three levels (i.e. the whole sample, as well as two subsamples including 241 pairs published in the same years and 215 pairs published in the same journals). Moreover, Generalized Estimating Equations (GEE) was used to analyse all of the features in a single model (see “Data analysis” section for further details).
4.1. The identification of the collection
To identify the retracted articles, a Web of Science (WoS) search was conducted on 25 September 2018; it was restricted to article type ‘Retracted Publications’ across disciplines related to medicine through field tag WC (i.e. WoS Categories). Before that, we needed to accurately delineate the boundaries of the field. The SCI-indexed journals were categorised into 172 subject categories further classified into 22 broader categories in Essential Science Indicators (ESI). As no single medicine field was explicitly delineated, we used the medicine-related categories identified by Leydesdorff and Rafols [69]. They are claimed to be finer-grained and therefore less error-prone than other categorizations. Using a factor analysis method, science was mapped into 14 factors, 5 of which were related to medicine (i.e. Biomedical sciences, Neurosciences, Infectious diseases, Clinical medicine, and General medicine and health). In total, we extracted 3388 records. Of these, 675 items that were published in book format, duplicated or not retrieved were omitted from the sample, leaving 2713 retracted items.
In the next step, by searching each of the retracted papers in PubMed by their titles and PubMed identifier (PMID), a sample of similar non-retracted papers was identified. Based on a topic-based content similarity model called PubMed related articles (pmra), PubMed offered a related article search feature that was effective in ranking the related articles [70]. It listed and ranked the papers similar to the retrieved ones based on their similarities in titles, abstracts and Medical Subject Headings (MeSH). The algorithm took advantage of both natural and controlled words to improve the suggestions because natural language, though effective in information retrieval [71,72], fails to overcome some intrinsic semantic ambiguity [73]. The deficiencies were believed to be remedied by standard semantic tools [74–77]. As research evidence was inconsistent (i.e. sometimes in favour of the controlled vocabulary [78–80] and sometimes in favour of the natural language [81]), a mixed-methods design was prescribed to improve the results [76,82].
The top-ranked similar documents identified for each of the 2713 retracted items were verified to ensure that they were not retracted. The retracted papers published in 1970–1999 were found to lack either non-retracted peers or any social metrics. Consequently, the final collection decreased to 1071 paper pairs (i.e. 1071 retracted papers and 1071 non-retracted peers) published between 2000 and 2018.
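The peer-selection step above can be illustrated with a short sketch. It is not the authors' procedure: the function name, the PMIDs and, in particular, the fallback to the next-ranked candidate when the top one is itself retracted are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Set


def pick_non_retracted_peer(
    ranked_pmids: Sequence[str], retracted_pmids: Set[str]
) -> Optional[str]:
    """Return the highest-ranked similar paper that is not retracted.

    ranked_pmids is assumed to be ordered from most to least similar,
    as a related-article ranking would provide. Falling back to the
    next candidate is an assumption, not something the text states.
    """
    for pmid in ranked_pmids:
        if pmid not in retracted_pmids:
            return pmid
    return None  # no usable peer: the retracted paper drops out of the sample


# Hypothetical identifiers, for illustration only.
peer = pick_non_retracted_peer(["111", "222", "333"], {"111"})
print(peer)
```

A retracted paper for which the function returns None would have no pair, which mirrors how papers lacking non-retracted peers were excluded from the final collection.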
Given the dependence of citations on such factors as the JIFs of the publishing journals [83], publication date [84] and OA models [85,86], it was necessary to take the factors into account in the analyses. To do so, during the PubMed searches, the papers were also verified in terms of their access models, and they were categorised into two categories of OA and non-open access (NOA). In addition, the JIFs of the publishing journals were extracted from JCR 2002–2008 and were averaged to control for their variations over time. The publication and retraction dates were also recorded to be controlled in the data analyses. In the next step, the identified papers were searched in WoS for their year-by-year citation data in March 2019.
The citation ages were calculated at two levels relative to publication and retraction dates. A citation age relative to the publication date signified the number of years elapsed between the original publication of the cited paper and the citation. As the retraction date was likely to act as a crucial point in the inclination of scientific communities towards the retracted paper, the citation age relative to the retraction date was also calculated; it reflected the time elapsed between the retraction cutoff point and the citation.
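As a minimal sketch of the two citation ages, assuming year-level dates (the function and argument names are hypothetical and not taken from the study):

```python
from typing import Tuple


def citation_ages(
    citation_year: int, publication_year: int, retraction_year: int
) -> Tuple[int, int]:
    """Return (age relative to publication, age relative to retraction).

    A negative second value means the citation was made before the
    retraction; zero corresponds to the retraction year itself.
    """
    age_from_publication = citation_year - publication_year
    age_from_retraction = citation_year - retraction_year
    return age_from_publication, age_from_retraction


# A paper published in 2005 and retracted in 2010, cited in 2012 and in 2008.
print(citation_ages(2012, 2005, 2010))
print(citation_ages(2008, 2005, 2010))
```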
The social indicators of the papers were gathered by online access courtesy of Altmetric.com. Given the variety of social metrics and their different nature and functions, it was necessary to study a wide range of metrics. However, there were a lot of metrics with zero values for the papers in the sample. Consequently, the present research concentrated on Tweets and Blog Mentions as social mentions as well as Mendeley Readership which represented records in the social reference sharing site.
5. Data analysis
The best fits for the obsolescence of the papers were investigated using regression analysis. Given the skewness of the citation data, their geometric mean (Geo-Mean) was calculated to plot the citation age of the papers against it. To handle the zero values in the citations, the standard approach proposed by Thelwall [87] was applied; so 1 was added to the citation counts and then was subtracted after the calculation of the Geo-Mean values. For the overall obsolescence trend, the citation age was calculated based on the time span between the publication and the citation dates. For the obsolescence trend in the before- and after-retraction phases, the citation age was calculated relative to the retraction date.
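The zero-handling step described above (add 1, take the geometric mean, subtract 1) can be sketched as follows. This is an illustration only; the function name and the sample counts are assumptions.

```python
import math
from typing import Sequence


def offset_geometric_mean(counts: Sequence[int]) -> float:
    """Geometric mean of (count + 1), minus 1.

    Working on the log scale avoids overflow for long lists, and the
    +1 offset keeps zero citation counts from collapsing the mean to 0.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    mean_log = sum(math.log1p(c) for c in counts) / len(counts)
    return math.expm1(mean_log)


# Hypothetical citation counts for papers at one citation age.
print(offset_geometric_mean([0, 3]))
```

Unlike the arithmetic mean, this measure is far less sensitive to a few very highly cited papers, which is why it suits skewed citation data.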
As mentioned above, given the emphasis of the present study on the paper pairs with the same topics, it was not possible to have a sample of adequate size consisting of pairs similar in all their features, including publication year, publishing journals and access models. Consequently, to explore the effects of these features, GEE was used to analyse the differences between the retracted and non-retracted papers in terms of their citations. GEE is an extended version of the generalised linear model that allows the analysis of correlated observations such as repeated measurements without assuming the normality of the residuals’ distribution. It is suitable for longitudinal/clustered data analyses that apply repeated measurements at different points in time to investigate any possible effect of the covariates and factors on the response variable. As the responses from the same individual tend to be more similar in this kind of analysis, within-subject and between-subject variations are incorporated into the model to improve the efficiency of the estimation and the power of the model. As Wang [88] and Williamson et al. [89] stated, unlike mixed-effects models, GEE does not require a normal distribution of the responses, but it requires the correct specification of the marginal mean and variance as well as the link function. The link function is used to connect the covariates and marginal means. Within-subject correlations among the repeated measures are incorporated into the estimation as the responses’ working correlation structure or matrix. The aim is to increase the efficiency relative to some naive estimators, such as those which assume that repeated observations from a subject are independent of one another. GEE is used to study the population-average pattern or trend over time for longitudinal data. The estimates of the parameters are, hence, averaged at population level. In addition, the estimates are consistent and asymptotically normally distributed even when the working correlation structure is not appropriately specified. Moreover, fitting is easier due to treating the variance–covariance matrix of responses as nuisance parameters.
Table 1 summarises the characteristics of the GEE conducted in the present study. As seen, paper pairs served as subjects measured in terms of their citation counts at three levels (i.e. before and after retraction for the retracted and non-retracted papers in different citation ages, within-subject variable). Paper groups (the retracted and non-retracted categories) and access models (OA and NOA) were entered as predictors. Moreover, citation ages relative to the publication and retraction dates were added to the analysis as covariates to control for their effects. The ‘Citation cutoff point’ refers to the point in the citation life cycle where a paper is retracted. As the non-retracted papers have no retraction dates, the withdrawal point of their retracted counterparts was taken as their cutoff point in order to investigate the citation behaviour of the non-retracted papers during the same period as their peers’. Given the non-normality of citation counts, Natural Log (LN) transformation of the metrics plus 1 was used as proposed by Thelwall and Wilson [90].
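The two data-preparation choices in this paragraph, the borrowed cutoff point and the LN(count + 1) transformation, can be sketched as below. The Paper class and all names are hypothetical, introduced only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    pmid: str                       # hypothetical identifier
    retraction_year: Optional[int]  # None for a non-retracted paper


def cutoff_year(paper: Paper, partner: Paper) -> Optional[int]:
    """A non-retracted paper borrows the retraction year of its retracted partner."""
    if paper.retraction_year is not None:
        return paper.retraction_year
    return partner.retraction_year


def ln_plus_one(count: int) -> float:
    """Natural log of (count + 1); zero counts map to 0.0."""
    return math.log1p(count)


retracted = Paper("A", 2010)
peer = Paper("B", None)
print(cutoff_year(peer, retracted), ln_plus_one(0))
```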
Table 1. The specifications of the GEE procedure.
Note: AR(1) is selected when the repeated measurements have a first-order autoregressive relationship.
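For readers unfamiliar with the AR(1) working correlation structure, a minimal sketch follows. Under AR(1), the correlation between two repeated measurements falls off as a power of their distance in time: corr(t_i, t_j) = rho^|i − j|. The value of rho used here is arbitrary and purely illustrative; in a GEE it is estimated from the data.

```python
from typing import List


def ar1_working_correlation(n_times: int, rho: float) -> List[List[float]]:
    """Build an n_times x n_times AR(1) correlation matrix.

    Entry (i, j) equals rho ** |i - j|, so adjacent measurements are
    the most strongly correlated and the diagonal is always 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


for row in ar1_working_correlation(3, 0.5):
    print(row)
```

Setting rho to 0 yields the identity matrix, which corresponds to the 'independent' working correlation structure.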
It should be mentioned that the dates of social metrics were not recorded by Altmetric.com. Therefore, they could not be divided into before–after retraction parts. Thus, in the model tested for social mentions, the within-subject variable was set to just ‘retraction group’ with the average of JIF values as the covariate, and the working correlation matrix was set to ‘independent’.
6. Results
6.1. The obsolescence trends of the retracted and non-retracted papers
Figure 1 illustrates the scatter plot of the Citation Geo-Means versus citation ages of the retracted and non-retracted papers depicting their obsolescence trends. As seen, after a sharp initial rise from age one to three, an exponential decay occurred for both groups. The non-retracted group’s citation peak (Citation Geo-Mean = 4.443) was higher than the retracted one’s (Citation Geo-Mean = 3.081). Moreover, the retracted group was generally lower in its year-by-year Citation Geo-Mean.

Figure 1. Year-by-year distribution of the Citation Geo-Means of the retracted and non-retracted papers.
In order to understand the pattern of the citation changes over time, different models were tested using regression analysis. It should be mentioned that the data points before the citation peaks that seemingly acted as outliers (Figure 1) were omitted to yield a model with the highest strength. The results showed that an exponential model best fitted the data related to the retracted ($y = 5.2921e^{-0.144x}$, $R^2 = 0.89$) and non-retracted papers ($y = 7.4402e^{-0.096x}$, $R^2 = 0.89$). Accordingly, the annual citations to both groups decreased exponentially. According to the model, the retracted papers became obsolete at a relatively faster pace (n = −0.144) compared with the non-retracted ones (n = −0.096).
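To make the exponential model concrete, the sketch below fits y = a·e^(bx) by ordinary least squares on ln(y) and converts a decay rate into a half-life. The log-linear fitting method is an assumption (the text does not state how the curves were fitted), the data are synthetic values generated from the reported retracted-group coefficients, and the half-life is an illustrative derived quantity that the study itself does not report.

```python
import math
from typing import Sequence, Tuple


def fit_exponential(ages: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Fit y = a * exp(b * x) via least squares on ln(y); return (a, b)."""
    if len(ages) != len(values) or len(ages) < 2:
        raise ValueError("need at least two (age, value) pairs of equal length")
    logs = [math.log(v) for v in values]  # requires strictly positive values
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (ly - mean_y) for x, ly in zip(ages, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def half_life(b: float) -> float:
    """Years for the fitted curve to halve, given a negative decay rate b."""
    return math.log(2) / -b


# Synthetic points generated from the reported retracted-group model.
ages = list(range(3, 15))
values = [5.2921 * math.exp(-0.144 * x) for x in ages]
a, b = fit_exponential(ages, values)
print(round(a, 4), round(b, 3))
print(round(half_life(-0.144), 2), round(half_life(-0.096), 2))
```

Under these coefficients the retracted group's fitted curve halves in just under five years, against just over seven for the non-retracted group.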
In the next step, we divided the citation age of each paper into two phases of before and after retraction. We, then, counted the citations of the papers in each of the phases and calculated their Citation Geo-Means. Figure 2 visualises the obsolescence trends of the paper groups in each of the phases. Here, too, the data points before the citation peaks were not included in the regression analyses in order to avoid the effect of seemingly outlier data. As observed, the retracted group reached its peak at a lower point (3.65) compared with its rival group (4.05). It exhibited a slower obsolescence pace during the before-retraction period (n = −0.023) compared with its after-retraction life cycle (n = −0.132), while the non-retracted group experienced almost equal obsolescence speeds during its before-retraction (n = −0.042) and after-retraction (n = −0.041) periods. Furthermore, at the retraction point (i.e. the zero point on the X-axis), the retracted group showed an abrupt increase (Citation Geo-Mean = 3.59) close to its initial peak (Citation Geo-Mean = 3.65). It could mark the citations notifying the retraction of the papers.

Figure 2. Year-by-year distribution of the Citation Geo-Means of the retracted and non-retracted papers before and after retraction.
The obsolescence model illustrated in Figure 2 analyses the citation performance of the whole sample. Since the papers in each of the pairs were not necessarily of the same publication year, in the next step, we analysed a subsample of 241 pairs within which the papers were published in the same year to control for any probable time effects. The result is displayed in Figure 3. As seen, the two groups were almost similar in their citation peaks in the before-retraction phase. A broadly similar picture of the obsolescence trend emerged for the subsample: both groups began to become obsolete during the before-retraction phase at nearly the same pace (n = −0.1 for the non-retracted vs n = −0.104 for the retracted group). However, after a sudden increase in their citations at the retraction point (Citation Geo-Mean = 3.45), the retracted papers experienced a faster decay trend (n = −0.143), while the papers in the non-retracted group experienced a much slower decline (n = −0.017).

Figure 3. Year-by-year distribution of the Citation Geo-Means of the retracted and non-retracted papers with the same publication dates before and after retraction.
According to Figure 4, which illustrates the obsolescence trends of the subsample consisting of 215 retracted and non-retracted papers published in the same journals, the retracted group exhibited a faster obsolescence pace during its after-retraction life cycle (n = −0.101) than during its before-retraction one (n = −0.0541). However, the non-retracted group showed a negligible gap in its obsolescence speed between its after-retraction (n = −0.037) and before-retraction (n = −0.04) periods. Furthermore, after retraction, the retracted items experienced a much faster obsolescence pace than their peers (n = −0.101 vs n = −0.037, respectively). Here, too, they witnessed a sudden rise in their citations at the retraction point (Citation Geo-Mean = 3.96), higher than their citation peak before retraction (Citation Geo-Mean = 3.30).

Figure 4. Year-by-year distribution of the Citation Geo-Means of the retracted and non-retracted papers published in the same journals before and after retraction.
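To give the rates reported for the same-journal subsample a more tangible reading, they can be converted into half-lives. The sketch below assumes that each n is the exponent of a fitted curve a · exp(n · t) with t measured in years, in which case the time needed for the Geo-Mean to halve is ln(2) / |n|. The helper name half_life and the dictionary labels are hypothetical.

```python
import math


def half_life(n):
    """Years for a curve a * exp(n * t) to fall to half its value; requires n < 0."""
    if n >= 0:
        raise ValueError("half-life is defined only for a decaying curve (n < 0)")
    return math.log(2) / -n


# Obsolescence rates quoted in the text for the same-journal subsample.
rates = {
    "retracted, before retraction": -0.0541,
    "retracted, after retraction": -0.101,
    "non-retracted, before retraction": -0.04,
    "non-retracted, after retraction": -0.037,
}

for label, n in rates.items():
    print(f"{label}: {half_life(n):.1f} years")
```

Under this assumption, the after-retraction rate of the retracted group corresponds to a half-life of roughly 6.9 years, against roughly 18.7 years for the non-retracted group in the same phase.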
5.2. The comparison of citations to the retracted and non-retracted papers
As mentioned, GEE was used to study the differences between the retracted and non-retracted papers in terms of their citations before and after retraction, controlling for the citation age relative to the retraction and publication dates as well as the average IF of the related journals. The analyses were carried out at two levels: one on the year-by-year citations and the other on the sum of citations. For brevity, only the results of the former, which is more detailed, are reported here. The results for the sum of citations are presented in Supplemental Appendices 1–3.
The GEE parameter estimates summarised in Table 2 show that all of the factors and covariates contributed significantly to the model, except for the access model. Non-retracted (B = 0.303) and after-retraction groups (B = 0.170) were the strongest factors and positively explained the year-by-year citations. The covariates citation age (relative to publication date) (B = 0.019) and citation age (relative to retraction date) (B = 0.066) positively contributed to the model, signifying that the more time passed after publication and also after retraction of a paper, the more citations it received. Moreover, the average JIF (B = 0.052) positively explained the citation counts. The NOA factor (B = −0.052) negatively, though insignificantly, explained the citation counts.
Table 2. GEE parameter estimates for year-by-year citations.
NOA: non-open access; JIF: journal impact factor.
Table 3 summarises the pairwise comparisons of the factors. As observed, the citation counts were significantly lower before retraction than after retraction (Mean Difference = 0.170, Sig. = 0.000). Moreover, the non-retracted papers were significantly higher in their citation counts compared with the similar retracted papers (Mean Difference = 0.303, Sig. = 0.000). NOA papers experienced a citation disadvantage, though insignificant, compared with the OA group (Mean Difference = −0.052, Sig. = 0.177).
Table 3. The pairwise comparisons of the factors in terms of their year-by-year citations.
NOA: non-open access.
Table 4 illustrates the pairwise comparison of citations to the papers across the interacting factors. As observed, the after-retraction phase exhibited a citation advantage over the before-retraction phase for both the non-retracted (Mean Difference = 0.170, Sig. = 0.000) and the retracted group (Mean Difference = 0.170, Sig. = 0.000). However, the citation gap between the before and after phases was much larger for the non-retracted group than for the retracted category. The non-retracted group was higher in its citation counts than the retracted one both before (Mean Difference = 0.303, Sig. = 0.000) and after retraction (Mean Difference = 0.303, Sig. = 0.000). In the after-retraction phase, where the citations to the papers were at their highest level, the citations to the retracted articles were still lower than those of the non-retracted group in the before-retraction phase (Mean Difference = 0.133, Sig. = 0.005).
Table 4. The pairwise comparison of year-by-year citations to the papers concerning the interacting factors.
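The mean differences quoted for Table 4 can be reproduced from the two main-effect coefficients of Table 2. The sketch below assumes that the model contains no group-by-phase interaction term, an assumption that is consistent with the identical differences reported for both groups (0.170) and both phases (0.303). It works on the scale of the linear predictor and ignores the covariates and the intercept, which cancel out in every contrast. The names cell and contrast are hypothetical.

```python
# Main-effect coefficients on the linear-predictor scale, as reported in Table 2.
B_NON_RETRACTED = 0.303  # non-retracted vs retracted
B_AFTER = 0.170          # after retraction vs before retraction

RETRACTED, NON_RETRACTED = 0, 1
BEFORE, AFTER = 0, 1


def cell(non_retracted, after):
    """Linear predictor of one group/phase cell, up to a constant shared by all cells."""
    return B_NON_RETRACTED * non_retracted + B_AFTER * after


def contrast(cell_a, cell_b):
    """Difference between two cells, rounded to the precision used in the tables."""
    return round(cell(*cell_a) - cell(*cell_b), 3)


# After vs before retraction within the retracted group.
print(contrast((RETRACTED, AFTER), (RETRACTED, BEFORE)))          # 0.17
# Non-retracted vs retracted within the after-retraction phase.
print(contrast((NON_RETRACTED, AFTER), (RETRACTED, AFTER)))       # 0.303
# Non-retracted before retraction vs retracted after retraction.
print(contrast((NON_RETRACTED, BEFORE), (RETRACTED, AFTER)))      # 0.133
```

The last contrast shows why the retracted papers remain behind even at their best: the after-retraction gain (0.170) is smaller than the group gap (0.303), leaving a difference of 0.133 in favour of the non-retracted group.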
According to the analyses conducted on the sum of citations, summarised in Supplemental Appendix 3, the after-retraction phase showed no superiority in terms of citations, either in the non-retracted (Mean Difference = 0.139, Sig. = 0.071) or in the retracted group (Mean Difference = 0.139, Sig. = 0.071), compared with the before-retraction phase. However, the retracted group was lower than the non-retracted one in all of the comparisons, even in the after-retraction phase. This signifies that the increase in the citations of the retracted group was not large enough to raise it even to the relatively weaker status of its rival group.
5.3. GEE for social metrics
The paper pairs were then analysed using GEE at two levels (retracted and non-retracted) in terms of their social metrics, including Tweets, Blog Mentions and Mendeley Readership. Table 5 summarises the parameter estimates of the resulting models. As seen, the non-retracted group negatively explained the Blog Mentions (B = −0.267), implying that non-retracted papers were less likely to be mentioned in blogs. Furthermore, as in the model for the citations (Table 2), the Blog Mentions were not associated with the access models of the papers. Moreover, the publication age was not found to be effective in the model, despite the fact that younger papers are more likely to attract social mentions [34,91]. The retraction age negatively contributed to the model (B = −0.022), implying that the sooner a paper was retracted, the lower the probability that it would be mentioned in blogs.
Table 5. GEE parameter estimates for social metrics.
NOA: non-open access.
Tweets showed an almost similar picture. The non-retracted group negatively, though insignificantly, explained the Tweets. Furthermore, the Tweet count was not associated with the access models of the papers. The retraction age also negatively contributed to the model (B = −0.026), implying that the sooner a paper was retracted, the lower its Tweet count would be. However, unlike in the Blog Mentions model, the publication age had a negative effect in the model (B = −0.075).
In the model for the Mendeley Readership, no significant association was found for the retraction and publication ages. However, the readership counts were positively explained by the non-retracted category (B = 0.475) and the average IF (B = 0.066) and negatively predicted by the NOA model (B = −0.135). The pairwise comparisons of the categories revealed that non-retracted and OA papers attracted significantly higher readership than their retracted (Mean Difference = 0.475, Sig. = 0.000) and NOA peers (Mean Difference = 0.135, Sig. = 0.035), respectively (Table 6). Although Mendeley Readership depended on the access model, this did not lead to a significant difference between the OA and NOA papers within each group, according to the pairwise comparisons presented in Table 7. However, the non-retracted group was higher in its Mendeley Readership than the retracted one in both the OA and NOA models (Table 7).
Table 6. The pairwise comparison of Mendeley Readership.
OA: open access; NOA: non-open access.
Table 7. The pairwise comparison of Mendeley Readership of the papers concerning the interacting factors.
7. Discussion and conclusion
Although scholars have to pursue the most valid and reliable research design to achieve the most truthful knowledge, they may intentionally or unintentionally report fallible or deceptive knowledge. The misconducts or mistakes can be positioned on a severity continuum from innocuous (e.g. gift authorship, duplicated publication and salami effect) to crucial (e.g. data making and fabrication) [92]. Recently, the scientific community has been experiencing an increase in scientific fraud, which is believed to have roots in the competitive atmosphere of science characterised by ‘publish or perish pressure’ [93], researchers’ ambitions and financial needs, scientific hubris [92,94], pressure to publish in ‘high impact’ journals [58], lack of research funds and the proliferation of predatory OA journals [95]. Fraud and deceit in medicine are tightly related and directly detrimental to human well-being and health and are, therefore, considered an ‘evolving type of crime’ [96]. Consequently, any mistakes, whether intentional or inadvertent, ought to be cancelled out as soon as they are detected. Retraction is a mechanism expected to offset the negative consequences of scientific misconduct and mistakes. This gives rise to the question of how successful the mechanism has been in achieving its ultimate goals. According to research findings, withdrawn papers continue to receive citations [94,97]. However, no studies were found that compare the retracted papers with their non-retracted peers dealing with the same topics. To re-investigate the phenomenon, the present study used a matched-pairs research design to compare the obsolescence trends and citation counts of the retracted and non-retracted papers.
The results of the regression analyses showed that the retracted and non-retracted groups of articles are similar in their obsolescence trends, in that both reach their peak points at the same age (third year of publication) and adhere to an exponential model in their annual trends after the peaks (Figure 1). The former group achieves its peak at a considerably lower point, although it shows a somewhat increasing trend in its Citation Geo-Mean before retraction. However, the group starts to become obsolete after retraction at an even faster pace than its non-retracted rival group (Figure 2).
The results of the GEE analyses revealed that the traditional citation counts of the papers in the sample are affected by their retraction status, life cycle, life after retraction and average JIF, which is in line with the existing knowledge [2,14,98]. The positive effect of the citation age relative to the retraction date reveals that the later a paper is retracted, the higher its citation counts are. However, it is not a strong element. Instead, according to the B coefficients of the model, the ‘non-retracted’ and ‘after retraction’ categories are the strongest factors which positively predict the citation quantity. The pairwise comparison of the factors revealed that the non-retracted papers are higher in their citation counts compared with their retracted peers either before or after their retraction. Although both groups experience an increase in their citations in the second phase of their lives (i.e. after retraction), the retracted papers are lower in their citation quantity even in this phase.
Overall, given that the obsolescence trend continues at a faster pace after retraction for the retracted papers, one may conclude that the scientific community begins to withdraw recognition from the papers after the public announcement of the fallibility of their claims. In other words, the retraction mechanism is relatively successful in preventing the spread of fallible information. However, the fact that the citations to the retracted papers are higher in number after retraction than in the before-retraction phase reveals that the retraction mechanism does not completely eradicate, but only attenuates, the negative consequences of erroneous and worthless outputs. The situation seems to be the same as that reported in 1990 by Pfeifer and Snodgrass [60], who observed that the citations to the retracted papers were reduced, but not ‘effectively purged’.
The low quality of papers cannot be completely hidden from the sharp eyes of judicious scholars. The relatively fewer citations received by the retracted papers before retraction compared with their non-retracted peers – either in the same period or in the after-retraction phase – may indicate that some potential citations had already been withheld. Consequently, the failure of withdrawal to orient the scientific community towards complete eradication may stem from somewhat inevitable and shallow citations caused by such factors as coincidence and negligence. On the one hand, some of the citing papers may be accepted or published at the same time that their cited articles are announced to be retracted. Therefore, the citation would inevitably be released before or at the time the authors become aware of the official announcement of the retraction. On the other hand, the ongoing citation of the retracted papers can be attributed to some kind of superficial impact. As scholars are not necessarily scrupulous and conscious in their citation habits, they may choose and cite easily accessible items, rely on the most visible and the shortest representation of an item (e.g. abstracts, second-hand citations and snippets) without digging deep into its details, cite tactically (e.g. perfunctory citations given out of politeness, policy or piety), and cite to provide background and introductory information. It is obvious that these types of citations cannot signify a profound impact. Moreover, the free (or low-cost) and widespread online availability of a wide range of materials puts all contents with different degrees of validity and authority on the same level of accessibility and hence on the same level of credibility in the minds of Internet users [99]. This may give rise to a kind of passive impact characterised by users’ loss of control (or of the need for control) over their information-seeking behaviour. It may be reflected in users’ lack of critical evaluation knowledge and skills [99], their unwillingness to undertake extensive efforts to verify the credibility of online information and their rare and occasional use of information quality criteria [100,101].
According to the findings related to the social metrics, non-retraction positively predicts the social impact as measured by Mendeley Readership, while it negatively explains it when measured by Blog Mentions or Tweets, though the effect is not significant for the latter. This is in line with previous findings confirming the high correlation of papers’ quality with their Mendeley Readership [26,40,41] and its low correlation with Tweet counts [34]. The retracted papers are significantly lower in their Mendeley Readership than their non-retracted peers. The positive association of retraction with the Tweets and Blog Mentions and its negative association with Mendeley Readership may seem paradoxical at first glance. However, the situation is clarified when the differences between the social networks in terms of their nature and functions are taken into consideration. In fact, Mendeley is a reference manager. It is probable that Mendeley users added the retracted articles before retraction and then did not verify the records to delete the retracted ones. It would, thus, be interesting to conduct further investigations to test how the retracted articles are added to Mendeley libraries before and after retraction. Moreover, as Mendeley is an online scholarly social network devoted to scientific research and reference management [102], it is more scientific in nature. It is, therefore, not unexpected that its users are more prudent when confronting poor-quality papers. However, Twitter and Blogs are relatively more public and popular in nature [103], with the potential to attract lay audiences [104]. Consequently, there could be a contamination risk of the retracted papers being disseminated among the public by non-expert users with low information literacy and evaluation skills.
On the other hand, social users may use the microblogging and sharing facilities of such social networks as Twitter or Blogs to broadcast, discuss and probably warn about a new retraction. The Retraction Watch blog [48], devoted to the discussion of retracted papers, is an obvious instance. As a result, a social post that links to a retracted paper and announces its retraction can gain momentum, go viral and lead to a high social impact for the retracted article. From this angle, the increase in tweeting retracted articles is not harmful, but constructive in the sense that it helps readers to distinguish between valid and invalid papers. This gives rise to the question of how social networking functions regarding fake and fallible scientific claims: does it leverage their diffusion or help to promote public watchfulness? Is it possible that social users publicise the retraction and warn against the retracted articles in their Tweets or blogs while avoiding recording them in their Mendeley profiles? The various and mixed motivations of social mentioning require further studies to shed light on the real societal impact of the retracted papers.
On this basis, the results of the present study call for enhancing information and media literacy, especially training in credibility assessment [99]. They also highlight the necessity of a more watchful and reliable reviewing system to detect and weed out poor-quality manuscripts before publication, as well as the need for a highly visible and transparent system of alerting and awareness raising about the retracted items, as also proposed by Korpela [63].
The present research has some limitations. Given the relatively small size of the sample, the results of the present study are not generalizable and should be interpreted with caution. Moreover, retraction reasons, which are not taken into consideration here, are of different importance. For example, fraud jeopardises scientific authenticity and ethics more seriously than accidental mistakes or authorship conflicts. Accordingly, the citations to the retracted articles are not of the same importance. It is, therefore, necessary to repeat the research by taking the withdrawal reasons into account and comparing the impacts of the retracted papers categorised by the gravity of their retraction reasons. Furthermore, the retracted articles were equal or higher in their societal impact regarding Tweets and Blog Mentions compared with their non-retracted counterparts, while they had a lower status in Mendeley Readership. This requires careful opinion mining to elucidate the motivations of social users in mentioning them.
Supplemental Material
Appendix_1, Appendix_2 and Appendix_3 – Supplemental material for How do academia and society react to erroneous or deceitful claims? The case of retracted articles’ recognition by Hajar Sotudeh, Nilofar Barahmand, Zahra Yousefi and Maryam Yaghtin in Journal of Information Science
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
Notes
Supplemental material
Supplemental material for this article is available online.
