Abstract
Cell and gene therapy (CGT) innovations have provided several significant breakthroughs in recent years. However, CGTs often come with a high upfront cost, raising questions about patient access, affordability, and long-term value. This study reviewed cost-effectiveness analysis (CEA) studies that have attempted to assess the long-term value of Food and Drug Administration (FDA)-approved CGTs. Two reviewers independently searched the Tufts Medical Center CEA Registry to identify all studies for FDA-approved CGTs as of January 2023. A data extraction template was used to summarize the evidence in terms of the incremental cost-effectiveness ratio expressed as the cost per quality-adjusted life year (QALY) and essential modeling assumptions, combined with a template to extract the adherence to the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) checklist. The review identified 26 CEA studies for seven CGTs. Around half of the base-case cost-effectiveness results indicated that the cost per QALY was below $100,000–$150,000, often used as a threshold for reasonable cost-effectiveness in the United States. However, the results varied substantially across studies for the same treatment, ranging from being considered very cost-effective to far from cost-effective. Most models were based on data from single-arm trials with relatively short follow-ups, and different long-term extrapolations between studies caused large differences in the modeled cost-effectiveness results. In sum, this review showed that, despite the high upfront costs, many CGTs have cost-effectiveness evidence that can support long-term value. Nonetheless, substantial uncertainty regarding long-term value exists because the modeling results are largely driven by uncertain extrapolations beyond the clinical trial data.
INTRODUCTION
Cell and Gene Therapy (CGT) is an emerging field with new technologies beginning to transform treatment opportunities in several therapeutic areas. The Food and Drug Administration (FDA) describes CGTs as therapeutic agents that modify gene expression, change the biological characteristics of cells, or reform/substitute damaged tissues. 1 These agents can potentially cure, control, prevent, or diagnose a disease. In the United States, CGTs are regulated by the Center for Biologics Evaluation and Research (CBER). As of January 2023, there were 27 FDA-approved CGTs. 2
Most of these therapies target severe and rare diseases, and several have been approved through the accelerated approval program. This regulatory pathway facilitates earlier approval of drugs targeting serious conditions, and the demonstrated clinical benefit can be based on single-arm trial designs and surrogate endpoints, typically biomarkers that function as proxies for the clinical outcomes of interest. 3 However, approvals based on surrogate endpoints and single-arm trials, often with a short follow-up duration, lead to ambiguity regarding the efficacy, safety, and long-term value of CGTs. 4,5
Many CGTs enter the market with high treatment costs, and given the nature of one-time single-dose treatments, there are substantial affordability concerns for payers. 6,7 Globally, the reimbursement and coverage decisions of CGTs vary substantially, partly due to variations in value assessments based on long-term clinical and cost-effectiveness analysis (CEA) modeling. 8,9 CEA is a form of economic evaluation that compares the differences in costs to the differences in health outcomes between two or more treatments (e.g., a new CGT vs. the current standard of care). The CEA result, often in the form of the incremental cost-effectiveness ratio (ICER), is then compared with various thresholds to determine if the treatment of interest can be considered a cost-effective use of resources.
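As a minimal illustration of the ratio described above, the following sketch computes an ICER from the total costs and QALYs of two strategies. The function name `icer` and all numeric inputs are hypothetical and are not taken from any of the reviewed studies.

```python
def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


# Hypothetical inputs: a one-time therapy vs. the standard of care.
ratio = icer(cost_new=500_000, qaly_new=8.0, cost_old=200_000, qaly_old=5.0)
print(f"ICER: ${ratio:,.0f} per QALY gained")  # ICER: $100,000 per QALY gained
```

A negative ratio cannot be read on its own: it may mean that the new treatment is dominant (lower cost, better outcomes) or dominated (higher cost, worse outcomes), so the signs of the two differences should be inspected separately.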
Several countries, such as England, Australia, and Canada, use CEA to inform which new pharmaceutical treatments, including CGTs, to reimburse and cover in the national health care systems. In other countries, such as the United States, CEA evidence may, if anything, have more informal impacts on managed care processes. 10,11 However, modeling the long-term clinical and cost-effectiveness evidence of CGTs faces several challenges. 5,12 As described, a primary challenge pertains to modeling the long-term effectiveness and durability, given that the evidentiary basis almost exclusively comes from single-arm trials and surrogate endpoint data. 5,12,13
A recent review of CEA studies of advanced therapy medicinal products, including gene therapies, somatic-cell therapy medicines, and tissue-engineered medicines, identified 23 articles that predominantly covered somatic cell and tissue-engineered medicines. 14 Despite the high upfront costs associated with CGTs, in contrast to continuing treatments where the costs are spread out over a longer time period, results indicated that the cost-effectiveness expressed as the cost per gained quality-adjusted life year (QALY) was often below what is considered a reasonable cost per QALY in the United States (i.e., $100,000–$150,000/QALY). 14,15
Two other recent reviews explicitly focused on challenges and recommendations for conducting health technology assessments and CEAs for CGTs. They showed that the estimated cost-effectiveness varied substantially across studies, even for the same treatment. 16,17 The primary challenges highlighted as a cause for uncertainty and variability across studies included evidence limitations regarding the durability of treatment effects and how to link unvalidated surrogate endpoints to clinical outcomes such as health-related quality of life and survival. 14
This study adds to the recent reviews of evidence of cost-effectiveness for CGTs in several ways. This review is focused on the cost-effectiveness evidence for FDA-approved CGTs for the U.S. health care market. In addition, considering the importance of modeling the long-term durability of treatments, this review also extracted data and analyzed modeling assumptions related explicitly to long-term patient benefits. Furthermore, this review assessed other modeling assumptions that may drive the uncertainty around the estimated cost-effectiveness, such as the assumed CGT costs, model selection, and the choice of relevant comparator treatments. Finally, the review assessed if studies complied with good reporting standards and procedures and how future work could improve upon the currently available cost-effectiveness evidence.
An ongoing debate revolves around the substantial upfront costs linked with many CGTs and their value proposition as potentially curative treatments. 18,19 If discussions within this field are to be informed by cost-effectiveness evidence on long-term value, there must be a thorough understanding of the validity and uncertainty in such evidence, which this study aims to facilitate.
MATERIALS AND METHODS
Study selection and data extraction
This review used the Tufts Medical Center CEA Registry 20 to collect data on 27 FDA-approved CGTs based on the list of Approved Cellular and Gene Therapy Products as of January 2023. 2 The CEA Registry is a database containing over 10,000 original CEA studies. Two co-authors independently searched the database using the generic name and trade name for each FDA-approved CGT and extracted the full text of each identified publication. This study was deemed nonhuman research exempt from formal approval by the University of Florida Institutional Review Board.
Basic information on each identified study and its main results, together with details on modeling and analytical assumptions of the CEA model, economic costs, health outcomes, and uncertainty assessments, were extracted based on a data extraction template (Supplementary Table S1). Information on the CEA model focused on the type of model used (e.g., partitioned survival model or Markov cohort models), the model’s time horizon, and whether a societal or health care perspective was used, which determines which types of costs to include.
The extraction of economic cost information included, for example, information on which gene therapy drug price was assumed, whether the price was assumed to change in future years, and how discounting was applied. Extraction of health outcome data included the type of outcome metric used (e.g., life-years or QALYs gained), how long-term health outcomes were extrapolated/modeled, and, when applicable, how the quality of life (“utility score”) was determined. Finally, extraction of uncertainty assessments focused on what type of sensitivity analysis was used and how the results from such analyses were presented. Selected key indicators from the data extraction template were summarized in a descriptive table. Regarding the cost-effectiveness results, when expressed as the cost per gained QALY, we compared findings to the $100,000 to $150,000 per QALY cost-effectiveness threshold, which is often used to indicate if a drug is cost-effective in the U.S. health care setting. 15
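Discounting, one of the extracted items, converts future costs and QALYs into present values. The sketch below shows the standard calculation; the function name `present_value`, the 3% annual rate, and the QALY stream are illustrative assumptions rather than values taken from the reviewed studies.

```python
def present_value(yearly_values, annual_rate=0.03):
    """Sum a stream of yearly values, discounting year t by (1 + rate)**t.

    Year 0 is left undiscounted. The 3% default is an illustrative assumption.
    """
    return sum(v / (1 + annual_rate) ** t for t, v in enumerate(yearly_values))


# Hypothetical stream: 0.8 QALYs per year for 10 years.
qalys = [0.8] * 10
print(round(sum(qalys), 2))            # undiscounted total
print(round(present_value(qalys), 2))  # discounted total is smaller
```

Because a one-time therapy incurs most of its cost in year 0 while its health benefits accrue over decades, the chosen discount rate can have a large effect on the resulting cost per QALY.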
Assessing the quality of reporting
Reporting standards and modeling assumptions were assessed using the 2022 version of the Consolidated Health Economic Evaluation Reporting Standards (CHEERS). 21,22 The purpose of the CHEERS checklist has been described as ensuring “identifiable, interpretable, and useful,” (Husereau et al., 21 p. 10) economic evaluations as input for decision-making. Many journals require or recommend the submission of the CHEERS checklist alongside economic evaluations of health interventions.
The checklist contains 28 items, many of which can be described as facilitating sound reporting regarding methods and assumptions of the specific modeling or statistical analysis. Other items capture reporting that may be related to transparency around the context of the work, such as conflicts of interest, study funding, and stakeholder engagement. At least two independent co-authors assessed each study, and any disagreement was resolved in group meetings that included all co-authors. In line with recommendations, summary scores were not produced and analyzed based on the CHEERS checklist. 21 The assessment based on the CHEERS checklist was summarized in a table with information on whether studies fully adhered, partly adhered, or did not adhere to the checklist items.
RESULTS
Summary of identified studies
The review identified 26 CEA studies that evaluated seven FDA-approved CGTs covering 36 base-case CEA assessments (a few identified studies reported results on multiple CGTs). Apart from original journal publications, two Institute for Clinical and Economic Review (ICER) reports were also identified. 23,24 The studies covered sipuleucel-T (Provenge) for prostate cancer (approved 2010), talimogene laherparepvec (Imlygic) for melanoma (approved 2015), axicabtagene ciloleucel (Yescarta) for B-cell lymphoma (approved 2017), tisagenlecleucel (Kymriah) for lymphoma (approved 2017), voretigene neparvovec-rzyl (Luxturna) for inherited blindness (approved 2017), onasemnogene abeparvovec (Zolgensma) for spinal muscular atrophy (approved 2019), and betibeglogene autotemcel (Zynteglo) for β-thalassemia (approved 2022).
Table 1 summarizes the main findings from the 26 studies. More than half were conducted from a U.S. payer perspective; the remainder took the payer perspective of the United Kingdom, Canada, Spain, Singapore, Japan, Germany, or the Netherlands. Only four of 26 studies included a societal perspective.
Table 1. Summary of Identified CGT Studies
CGT, cell and gene therapy; ICER, incremental cost-effectiveness ratio; QALY, quality-adjusted life year; HSCT, hematopoietic stem cell transplantation; SOC, standard of care; RCT, randomized controlled trial.
(a) If not documented in U.S. dollars (USD), a conversion to USD was made based on the following assumptions: 1 € = 1.09 USD, 1 Singapore dollar = 0.74 USD, 1 ¥ = 0.0069 USD, 1 £ = 1.25 USD.
(b) Based on the base-case analysis in each study.
(c) Assumed price represents the assumed cost for each cell and gene therapy that was incorporated in the cost-effectiveness models, and it could be equal to the upfront cost if it is a single-dose treatment (e.g., Zynteglo) or the cost associated with multiple doses (e.g., Imlygic).
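The currency conversion described in the table footnote can be sketched as a fixed-rate lookup. The rates below are those stated in the footnote; the currency codes, the function name `to_usd`, and the example amount are hypothetical.

```python
# Conversion rates to USD as stated in the table footnote.
USD_PER_UNIT = {"USD": 1.0, "EUR": 1.09, "SGD": 0.74, "JPY": 0.0069, "GBP": 1.25}


def to_usd(amount, currency):
    """Convert an amount in the given currency to USD using the fixed rates."""
    try:
        return amount * USD_PER_UNIT[currency]
    except KeyError:
        raise ValueError(f"No conversion rate for currency: {currency}") from None


# Hypothetical ICER reported as 50,000 GBP per QALY.
print(f"${to_usd(50_000, 'GBP'):,.0f} per QALY")  # $62,500 per QALY
```

Fixed rates of this kind do not adjust for the price year or for purchasing power, which is one reason why converted ICERs from different countries are only approximately comparable.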
Across the 36 base-case assessments, 19 showed that the evaluated CGT had an ICER below $100,000/QALY, 2 were mixed, and 15 had an ICER above $100,000/QALY. With the $150,000/QALY threshold, the interpretation was similar, with 20/36 assessments showing that the CGT had an ICER below $150,000/QALY (3/36 mixed and 13/36 with an ICER above the threshold).
Although a majority of studies and assessments indicated ICERs below the $100,000–$150,000/QALY thresholds, the results on incremental QALY gains and ICERs varied widely between studies for the same treatment, influenced by the assumptions of long-term treatment effects, the comparator, and the price of treatments. Figure 1 shows the base-case cost per QALY from the identified studies and highlights the variability in results across studies for the same CGT. For CGTs with multiple studies identified, the results differed by a factor of 36 (tisagenlecleucel), 8 (voretigene neparvovec-rzyl), 7.7 (onasemnogene abeparvovec), 5 (axicabtagene ciloleucel), and 2 (sipuleucel-T).
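The comparison against the two thresholds can be sketched as a simple classification. The function name `classify`, the category labels, and the treatment of values exactly equal to a threshold are assumptions made for illustration; the example inputs are ICERs reported in this review, and the sketch assumes positive incremental costs and QALYs (dominant or dominated results would need separate handling).

```python
def classify(icer_usd_per_qaly, lower=100_000, upper=150_000):
    """Place a positive base-case ICER relative to the two U.S. thresholds."""
    if icer_usd_per_qaly < lower:
        return "below both thresholds"
    if icer_usd_per_qaly < upper:
        return "between thresholds"
    return "above both thresholds"


# ICERs reported in this review (USD per QALY gained).
for value in (58_146, 289_000, 643_813):
    print(f"${value:,}: {classify(value)}")

# Consistency check of the tallies reported above (36 assessments each).
assert 19 + 2 + 15 == 36
assert 20 + 3 + 13 == 36
```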

Figure 1. Plot of the incremental cost per QALY gained from identified studies. Each dot represents the base-case cost per QALY from each of the included studies. Results where the gene therapy was dominant or dominated are excluded for legibility. QALY, quality-adjusted life year.
Sipuleucel-T and talimogene laherparepvec had modest QALY gains (0.16–0.37), and the cost per QALY estimates (ICERs) were all above what is typically considered a cost-effective use of resources (i.e., $100,000–$150,000/QALY). 15 The results for voretigene neparvovec-rzyl varied widely, showing the substantial impact that varying modeling assumptions can have on the results. QALY gains versus standard of care were modeled from a lower bound of 1.3 to an upper bound of 9.4, with resulting ICERs varying from being dominant (lower-cost and better health outcomes) up to $643,813 per QALY. There was substantial variation in modeling results for axicabtagene ciloleucel as well, with incremental QALY gains versus standard of care varying between 1.52 and 6.54 and associated ICERs varying between $58,146 and $289,000 per QALY.
One study compared axicabtagene ciloleucel with tisagenlecleucel and showed lower costs and better health outcomes for axicabtagene ciloleucel. The treatment with the highest number of identified studies was tisagenlecleucel, and cost-effectiveness results were mostly favorable (ICERs below typical U.S. thresholds) but varied substantially (Fig. 1). The modeled incremental QALY gains also varied widely between studies, even between studies using the same comparator. Results for onasemnogene abeparvovec indicated substantial incremental QALY gains (10.36 to 12.18). In comparison with best supportive care, the ICERs for onasemnogene abeparvovec were still above the referenced threshold levels, whereas the comparisons with nusinersen gave lower ICERs or even dominant results (onasemnogene abeparvovec leading to lower costs and better health outcomes). Finally, only one study was identified for betibeglogene autotemcel, which indicated a substantial incremental QALY gain and a favorable ICER.
The trial efficacy data underlying most modeling assumptions were generally based on single-arm studies, except for data for talimogene laherparepvec, sipuleucel-T, and voretigene neparvovec-rzyl. All models used a lifetime time horizon and were based on direct survival curve extrapolations in partitioned survival models or long-term extrapolations using transition probabilities in Markov cohort models.
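The spread of results for a single therapy can be summarized as the ratio of the highest to the lowest base-case ICER. The text does not state how the reported factors were calculated, so this definition, like the function name `spread_factor`, is an assumption; dominant results (non-positive ICERs) are left out, as in Figure 1.

```python
def spread_factor(icers):
    """Ratio of the highest to the lowest positive ICER in a list."""
    positive = [x for x in icers if x > 0]
    if len(positive) < 2:
        raise ValueError("Need at least two positive ICERs")
    return max(positive) / min(positive)


# Lowest and highest ICERs reported above for axicabtagene ciloleucel.
print(round(spread_factor([58_146, 289_000]), 2))  # 4.97
```

Under this assumed definition, the axicabtagene ciloleucel range corresponds to a roughly fivefold difference between studies.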
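To make the partitioned survival approach concrete, the sketch below reads yearly state shares directly off an overall survival (OS) curve and a progression-free survival (PFS) curve and sums the discounted QALYs. The exponential curves, hazards, utilities, 60-year horizon, and 3% discount rate are all hypothetical choices made for illustration and do not come from any reviewed model.

```python
import math


def exp_survival(t, hazard):
    """Exponential survival function S(t) = exp(-hazard * t)."""
    return math.exp(-hazard * t)


def partitioned_survival_qalys(os_hazard, pfs_hazard, u_pf, u_pd,
                               years=60, rate=0.03):
    """Discounted QALYs from yearly state shares read off two curves.

    Progression-free share = PFS(t); progressed share = OS(t) - PFS(t).
    Exponential curves and all inputs are illustrative assumptions.
    """
    if pfs_hazard < os_hazard:
        raise ValueError("The PFS curve must not lie above the OS curve")
    total = 0.0
    for t in range(years):
        os_t = exp_survival(t, os_hazard)
        pfs_t = exp_survival(t, pfs_hazard)
        total += (pfs_t * u_pf + (os_t - pfs_t) * u_pd) / (1 + rate) ** t
    return total


# Hypothetical therapy with lower hazards than its comparator.
therapy = partitioned_survival_qalys(0.05, 0.08, u_pf=0.80, u_pd=0.50)
comparator = partitioned_survival_qalys(0.15, 0.25, u_pf=0.80, u_pd=0.50)
print(round(therapy - comparator, 2))  # incremental QALYs
```

In such a model, the curves beyond the trial follow-up are entirely determined by the chosen parametric form, which is why the extrapolation method matters so much for the modeled QALY gain.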
Some studies explicitly assumed that some of the patients would be cured. In some cases, such curative assumptions were strongly related to modeled incremental QALY gains and, thus, the associated ICERs. For example, some of the models for axicabtagene ciloleucel assumed that patients who had not progressed at a particular time were cured (and could never progress in the future). In contrast, other models assumed some positive progression risk over the entire lifetime. The modeled incremental QALY gains were the largest in models with the most optimistic cure assumptions. In a few cases where the same therapy was assessed in models using payer and societal perspectives, the societal perspective was associated with slightly more favorable cost-effectiveness results (lower cost per QALY).
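The effect of a cure assumption can be illustrated with a simple three-state Markov cohort model (progression-free, progressed, dead). In the sketch below, patients who are still progression-free at the cure point face no further progression risk. The function name, the transition probabilities, the utilities, and the cure point at year 5 are hypothetical and chosen only to show the direction of the effect.

```python
def qalys_with_cure(p_progress, cure_after=None, cycles=60,
                    p_die_pf=0.02, p_die_pd=0.30, u_pf=0.80, u_pd=0.50):
    """Undiscounted QALYs per patient from a yearly 3-state cohort model.

    If cure_after is set, patients still progression-free at that year
    face no further progression risk. All inputs are hypothetical.
    """
    pf, pd, total = 1.0, 0.0, 0.0  # the whole cohort starts progression-free
    for t in range(cycles):
        # Credit QALYs for the state occupancy at the start of the cycle.
        total += pf * u_pf + pd * u_pd
        cured = cure_after is not None and t >= cure_after
        risk = 0.0 if cured else p_progress
        # Move the cohort forward by one yearly cycle.
        pf, pd = pf * (1 - risk - p_die_pf), pf * risk + pd * (1 - p_die_pd)
    return total


no_cure = qalys_with_cure(p_progress=0.10)
cure_at_5 = qalys_with_cure(p_progress=0.10, cure_after=5)
print(round(no_cure, 2), round(cure_at_5, 2))
```

With identical short-term inputs, the cure variant accumulates more QALYs, which mirrors the pattern reported above: the most optimistic cure assumptions produce the largest modeled gains.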
Reporting standards
The assessment of adherence to good reporting standards based on the CHEERS checklist showed some items that were generally lacking in the identified studies (Supplementary Table S2). For example, only four 26,30,31,43 out of 26 identified studies followed the CHEERS recommendation for appropriate title selection (Item 1). None of the studies referenced the use of a health economic analysis plan (Item 4).
Furthermore, most studies failed to specify and address distributional concerns for vulnerable populations (Item 19), discuss approaches or the effect of involving patients, payers, or communities in the study design (Items 21 and 25), address concerns regarding heterogeneity, and explain the methods used to evaluate the results based on different subgroup analyses (Item 18). On the other hand, all the studies successfully provided a structured abstract (Item 2), relevant background and objectives (Item 3), comparators (Item 7), time horizon (Item 9), discount rate (Item 10), description of cost resources (Item 14), summary of the main results (Item 23), and description of uncertainty (Item 24).
DISCUSSION
The growing number of new CGTs provides promising treatments for a wide variety of severe conditions that also impose a considerable economic burden on the health care system and society. 49 Considering the high upfront costs associated with most CGTs, there are considerable affordability concerns for payers, and questions have been raised regarding the long-term value of these therapies. However, if the significant health improvements assumed with several CGTs are realized and potentially offset the cost of other high-cost care, treatments could still provide beneficial long-term value. 18 In this review, we identified studies that have assessed and modeled the long-term value of FDA-approved CGTs based on cost-effectiveness analyses. Such analyses directly impact reimbursement and coverage decisions in several European and Asian countries and may informally impact managed care processes in the United States. 10,11
The CEA studies identified in this review showed a wide range of modeled incremental QALY gains and associated ICERs. Two treatments had no study indicating favorable cost-effectiveness results defined as ICERs below $100,000–$150,000 per QALY 15 (sipuleucel-T and talimogene laherparepvec). In contrast, the other five treatments had varied results depending on the comparator and modeling assumptions. A treatment with a particularly large variation across studies was voretigene neparvovec-rzyl, where the modeled QALY gains and ICERs indicated everything from a very favorable cost-effectiveness (dominant) 39 to a very unfavorable cost-effectiveness ($643,813/QALY). 40 The studies also indicated that the upfront cost of treatment does not necessarily determine the long-term value, given that it may also be related to substantial health gains and cost offsets.
However, the review also highlights many challenges and limitations when relying on evidence of cost-effectiveness to assess the long-term value of CGTs. The results indicated that for therapies where several studies were identified, the cost per QALY varied by up to a factor of 36. Variations in modeling assumptions caused large differences in modeled QALY gains and cost-effectiveness results within the same treatment. This variation in modeling results highlights that the assumptions on the long-term durability of treatment effects drive the cost-effectiveness results and that such assumptions cannot be informed by data (at least not yet). For example, two studies evaluating axicabtagene ciloleucel assumed a life-long cure for a fixed part of the patient population (if no progression was observed at five years), whereas two other studies assumed some continuous risk of progression during the entire lifetime.
Not surprisingly, the studies including curative assumptions led to models with larger QALY gains and more favorable cost-effectiveness results. Another review aimed to identify recommendations for conducting economic evaluations of CGTs and found that many economic evaluations did not adhere reasonably well to these recommendations, causing additional concern regarding the validity and transparency of these models. 17
Almost all the included studies used short-term clinical trial data to extrapolate long-term survival or progression of disease. Generally, theory offers little guidance on the most appropriate statistical assumptions for survival curve extrapolation, and the identified studies often fitted multiple survival models that differed in distribution, shape, and hazard function. This implies that the researcher’s decisions on the model assumptions will have a substantial impact on estimated benefits and the cost-effectiveness of the treatment, sometimes referred to as the researcher’s degrees of freedom. 50,51 For example, review studies have indicated that cost-effectiveness models with funding from the pharmaceutical industry are more likely to report favorable results (i.e., lower costs per QALY), perhaps because more optimistic assumptions will be made regarding long-term clinical benefits. 52
Furthermore, this review assessed the reporting standards of the included studies using the CHEERS checklist. The results show that many items were adequately reported in the reviewed studies but that several items were regularly not reported or discussed. All included studies lacked a predefined health economic analysis plan, an item that was recently added to the CHEERS checklist to reduce the risk of various biases contaminating the modeling. 53 Furthermore, two items that were addressed in only two of the included studies 23,24 concerned the involvement of, and effects on, patients, clinicians, and other stakeholders. These two items were recently added to the CHEERS checklist and have not been a significant topic in the CEA literature, as documented in other therapeutic areas. 54 In addition, most identified CEA studies did not address subgroup analysis and potential treatment heterogeneity.
However, it should be noted that, because many CGTs are focused on rare diseases, trials with small sample sizes imply substantial limitations in addressing heterogeneity or subgroup analyses. The same issue was observed regarding reporting distributional effects across varying demographic and economic patient subgroups, which was not addressed in most studies. Almost all the studies reported data on study parameters (modeling assumptions); however, assessing whether all data are reported without attempting full replication is challenging. Furthermore, several studies 23,26,38,40,41,43,47 using Markov cohort models did not report transition probabilities, which limits transparency and hinders replication attempts. Finally, identifying the source and motivation for the assumed utility parameters is important for judging the model’s validity. It was mostly unclear what led the studies to assume specific utility values, that is, whether the values were based on the best available evidence or on a convenience selection of sources.
This review has limitations. First, it is based on a single database (the CEA Registry), so CEA studies of CGTs not covered by that database may have been missed. Second, CGTs represent a fast-growing field, and during this review period, the FDA approved three new CGTs. Finally, quality assessments based on the CHEERS checklist can only capture the quality of reporting, not the actual modeling quality; assessing the latter would require a more extensive analysis of each model and replication attempts, which was outside the scope of this study.
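The sensitivity of extrapolation to the chosen survival distribution can be illustrated with Weibull curves that all pass through the same observed point but differ in shape. In the sketch below, the observed survival of 60% at 2 years, the three shape values, and the 40-year horizon are hypothetical; a shape of 1.0 corresponds to the exponential model.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def scale_matching(t_obs, s_obs, shape):
    """Scale that makes the curve pass through S(t_obs) = s_obs."""
    return t_obs / (-math.log(s_obs)) ** (1 / shape)


def mean_survival(shape, scale, horizon=40.0, step=0.01):
    """Restricted mean survival time (years) by trapezoidal integration."""
    n = int(round(horizon / step))
    area = 0.0
    for i in range(n):
        a = weibull_survival(i * step, shape, scale)
        b = weibull_survival((i + 1) * step, shape, scale)
        area += 0.5 * (a + b) * step
    return area


# Hypothetical trial: 60% of patients alive at the 2-year data cut.
for shape in (0.7, 1.0, 1.3):  # decreasing, constant, increasing hazard
    scale = scale_matching(2.0, 0.60, shape)
    print(shape, round(mean_survival(shape, scale), 1))
```

All three curves agree at the end of the hypothetical follow-up, yet the extrapolated mean survival differs markedly, with the decreasing-hazard curve giving the longest survival. Short-term data alone cannot discriminate between such curves, which is the source of the analyst discretion described above.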
CONCLUSIONS
This review of CEA studies of FDA-approved CGTs showed that, despite the often substantially high treatment costs, many of these products were considered cost-effective using typical referenced CEA thresholds in the United States. However, several challenges and limitations were identified when relying on CEA studies to assess the long-term value of CGTs. Modeling assumptions regarding treatment effect extrapolation beyond the trial data varied significantly across studies for the same treatment and caused substantial variation in modeled QALY gains and ICERs. Our review also identified room for improvement regarding reporting standards to facilitate increased transparency, primarily concerning data inputs, such as assumed transition probabilities.
Footnotes
AUTHORS’ CONTRIBUTION
S.A. conducted data collection, data interpretation, and drafted and revised the article. S.N. and D.A. assisted with data collection, data interpretation, and revised the article. B.H. assisted with data interpretation and revisions of the article. M.S. was responsible for the concept and design, conducted data collection and interpretation, and drafted and revised the article.
AUTHOR DISCLOSURE
The authors declare no conflict of interest.
FUNDING INFORMATION
No funding was received for this article.
SUPPLEMENTARY MATERIAL
Supplementary Table S1
Supplementary Table S2
References
