Abstract
Human biospecimens are used in 40% of cancer research publications. Tumor biobanks are an important source for these biospecimens and support both prospective and retrospective research studies. Supporting retrospective research requires tumor tissue biobanks to accrue an adequate inventory, or stock, of cases comprising tumor biospecimens and associated treatment and outcomes data. We propose a model to establish appropriate targets for stocks of frozen tissue biospecimens in tumor biobanks, sufficient to support cancer research needs. Our model considers national levels of investment in academic cancer research relative to research use of cases described in publication output, and scales this to the local context of the BC Cancer Agency Tumour Tissue Repository (TTR) as an example. Adjustment factors are then applied to correct for the primary intended user base of the biobank, as well as variables intrinsic to all biobanking operations and case collection. On this basis we estimate a current target stock for the TTR of approximately 4500 cases. Local research demand derived from case release data can then be applied to fine-tune accrual targets and refine the biobank's relative portfolio of cases from different tumor sites. We recognize that current targets will need regular remodeling as research demands change over time. We also recognize that our initial model has some limitations: it extrapolates from available research and biobank utilization data, and it does not incorporate biospecimen/case contributions within the context of a network. However, we believe the lack of models to estimate inventory targets for tumor biobanks and to better balance research demand with biospecimen supply contributes to the hesitation of funders to provide support, and also to the problems of sustainability faced by many biobanks. Creating tangible inventory targets will improve biobank efficiency and sustainability, and may also encourage increased and stable funding.
Introduction
In Canadian cancer research, the majority of biobanking occurs within the context of small ‘mono-user’ type biobanks,4–6 initiated to address a specific research question in a single investigator driven research study. The biospecimen inventory targets required for such mono-user biobanks are often empirical (e.g., collections associated with basic discovery research) or pre-determined by statistical calculations (e.g., collections associated with validation studies and population cohort studies) so as to be adequately powered to answer a specific research question and/or be representative of a study population. While these mono-user biobanks are numerous, their function and user base allows for pre-defined case collection targets, and sustainability and ongoing accrual are secondary issues that arise only if a biospecimen collection remains following study completion. At the opposite end of the spectrum, while the minority of biobanking is conducted by ‘poly-user’ biobanks,4–6 it is associated with considerable capital and infrastructure expenditures.
As previously defined (Ref. 5), poly-user biobanks are intended at inception to be resources that support multiple research questions and research users, and are typically large, institution-driven, and offer dedicated expertise and infrastructure to conduct biobanking. While historically defined by the ‘stock’ or collection of cases that these biobanks hold, many biobanks have expanded their scope of operations to be able to support both ‘prospective’ and ‘retrospective’ research collection protocols. The former involves specialized and/or dedicated collection of materials by agreement with and for individual research studies.
But an integral part of the mandate of most poly-user biobanks remains to create preassembled collections of biospecimens associated with historical annotating data (e.g., patient treatment and outcomes data). These ‘retrospective protocols’ thus allow for research that depends on retrospective selection of cases from an existing stock on the basis of specific criteria and where the patient outcomes are retrospectively known. The availability of existing stocks in biobanks enables rapid and efficient pursuit of discovery and translational research to establish and validate hypotheses that can then be confirmed in prospective studies and trials, and is a critical part of research infrastructure. However, the operational design of poly-user biobanks can lead to unfocused ongoing case accrual, and to the possibility that inventories of biospecimens are created that do not match the requirements of local research demand.
Conversely, slowing case accrual and inadequate biospecimen stock availability can hamper research output. Given that research resources are finite, assessment of the existence, quality and scale of biobank inventories, held across different types of repositories, needs to be addressed. A strategy to better coordinate biospecimen supply with research demand will greatly improve efficiency in the biobanking system.
Recognizing the diversity of biobank types and the increasing importance of biobanking for research, generating data on the extent of research utilization of biospecimens, and dialogue around current and future targets for biobanks are all now urgently needed. Lack of discussion around the targets for biobanks has likely contributed to the hesitation of funders to provide adequate support, and the problems of sustainability faced in particular by many poly-user biobanks.
We have previously attempted to address part of the knowledge gap underlying this issue by assessing research trends in utilization of biospecimens in cancer research.1,7–9 Our approach through a series of studies has been to examine the use of biospecimens in published articles to determine the research demand for biospecimen cases, both in terms of quantity and format, and to predict future trends.
To complement this approach, we have also compiled our experience of case utilization and demand across our national Canadian Tissue (recently changed from ‘Tumour’) Repository Network (CTRNet).10,11 In the present study, we have extended our work to propose a model for establishing initial, and refining ongoing, accrual targets for individual biobanks. We consider the elements that influence the inventory targets held by one example tumor biobank, the British Columbia Cancer Agency's Tumour Tissue Repository (TTR). 12
Model Assumptions, Calculations, and Results
Background to the TTR
The details of the TTR have been previously published. 12 As a tumor biobank with an institutional and provincial mandate, the TTR's first priority is to support a broad spectrum of cancer research that meets the criteria for research excellence conducted in British Columbia (BC). The major focus of the TTR is to support both prospective and retrospective research questions, and for the latter to provide access to cohorts of frozen tissue biospecimens that are linked to other biospecimens (e.g., blood or FFPE format tissues) and outcomes data (with a growing focus on providing services in support of prospective research studies).
The major product type used in this analysis of stocks is the ‘case’. A ‘case’ refers to a single frozen tumor tissue specimen associated with patient consent, together with other companion tissues (e.g., normal tissue, blood, etc.) and the aliquots and derivative products from these samples. The primary focus, and base case currency, of the TTR is frozen tissue, and so all modeling discussed from here on focuses on this biospecimen format. Further, since the collection and maintenance of frozen stock represents a large fraction of ongoing operational expenditures for most biobanks, considering the case of frozen tissue is relevant across biobanks and research specialties.
Major factors used in models to estimate initial and refined stock targets
The factors considered in determining the targets for current stock and for future stock of a tumor biobank are listed in Tables 1 and 2.
The ‘research use’ factors (Table 1) support a calculation of the past research use from the total number of biospecimens used by researchers to generate publishable data. They also provide some indication of the optimal format for processing and storage of these biospecimens. We have estimated the values of these factors from the data obtained from previous studies of patterns of biospecimen use and from our own experience in research user requests to the TTR over the past 5 years.
The ‘research funding’ factors (Table 1) support a calculation of the past research investment in different tumor sites and the numbers of investigators who are supported by academic research funding.
The ‘biospecimen qualification’ factors (Table 1) support a calculation of the total number of cases for each tumor site that a biobank needs to accrue in order to build a sufficient stock to support research, based on knowledge of the recent demand and investment as calculated above.
The total number of cases needed in Canadian academic cancer research is calculated from ‘research use’ factors as follows. In a recent literature survey of a subset of Canadian academic cancer researchers funded in 2010, 35 investigators generated 500 publications over a period of 4 years. 1 Of these articles, 88% contained primary data, 38% of these used human biospecimen cases, and within this subset of articles 31% accessed biobanks (as opposed to other sources such as directly from patients), making a total of 0.37 articles/year/investigator utilizing cases from biobanks.
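The chain of proportions above can be checked with a few lines of Python. This is a minimal sketch that uses only the figures quoted in the paragraph; the variable names are ours and are not part of the original model.

```python
# Figures quoted in the text (survey of investigators funded in 2010).
investigators = 35
publications = 500
years = 4

frac_primary_data = 0.88   # articles containing primary data
frac_human_cases = 0.38    # of those, articles using human biospecimen cases
frac_from_biobank = 0.31   # of those, articles that accessed a biobank

# All articles per investigator per year.
articles_per_investigator_year = publications / investigators / years

# Articles per investigator per year that used cases obtained from a biobank.
biobank_articles_per_investigator_year = (
    articles_per_investigator_year
    * frac_primary_data
    * frac_human_cases
    * frac_from_biobank
)

print(round(biobank_articles_per_investigator_year, 2))  # 0.37
```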
In most instances, biobanks were accessed for frozen tumor cases, but occasionally this was for support in obtaining fresh tumor biospecimens. The former are usually collected fresh under a general protocol and then frozen and stored to compile outcomes data (see ‘retrospective protocol’ above). The latter are collected fresh as part of a ‘prospective protocol’ and used directly in a study. Fresh biospecimens are therefore not included in the target stock estimate but are relevant to consideration of future accrual. Although we did not enumerate the numbers and formats of biospecimens used in detail in our most recent study, 1 in previous studies7–9 we have shown that while the frozen:fresh biospecimen usage ratio in articles has changed over time, a current ratio of ∼5:1 can be estimated, and this is consistent with the experience of the TTR in its activities supporting research projects over the past 5 years.
Furthermore the average size of frozen biospecimen case cohorts used can be estimated at 150 cases per study.7–9
Therefore we can estimate that, on average, the number of frozen cases required from biobanks by each cancer research investigator per year is approximately 44.4.
Based on the statistics published by the Canadian Cancer Research Alliance (CCRA report 2011), 13 ∼1500 cancer researchers were funded in Canada in 2011, and therefore their total estimated annual frozen case needs from biobanks were 66,600 cases (i.e., 44.4 × 1500).
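The per-investigator and national figures can be reproduced as follows. This is a sketch under one stated assumption: the value of 44.4 cases is recovered when the frozen share of biobank-sourced studies is taken as 0.8. That share is our inference from the quoted result, not a number given in the text; a strict 5:1 frozen:fresh ratio would give 5/6 and about 46 cases per investigator.

```python
# Values quoted in the text.
biobank_articles_per_investigator_year = 0.37
mean_cohort_size = 150          # frozen cases per study
funded_researchers = 1500       # CCRA estimate for 2011

# ASSUMPTION (inferred, not stated in the source): share of biobank-sourced
# studies that used frozen rather than fresh biospecimens.
frozen_fraction = 0.8

cases_per_investigator_year = (
    biobank_articles_per_investigator_year * mean_cohort_size * frozen_fraction
)
national_cases_per_year = cases_per_investigator_year * funded_researchers

print(round(cases_per_investigator_year, 1))  # 44.4
print(round(national_cases_per_year))         # 66600
```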
Model to determine initial case stock targets
The estimated total number of biospecimens required to support the publication output generated by cancer researchers in Canada can then be used to determine strategic biobank stock targets (Table 3).
i. Tumor sites: categories of tumor research areas by tumor site. As described in text, Others (+) are all minor categories that have been and can potentially be collected by the TTR. Others (−) are all minor categories that cannot be collected by the TTR. No frozen prostate cases are collected.
ii. National investment: is derived from data extracted from the Canadian Cancer Research Alliance. 13
iii. National stock (estimated): an estimate of the total numbers of cases used in publications by Canadian academic researchers (see methods).
iv. BC stock (estimated): an estimate of the contribution of studies conducted by British Columbia based researchers to the total national stock used, obtained by adjusting national Canadian data for the % population in BC, based on demographic data from Statistics Canada (see methods).
v. Biospecimen qualification factors: (defined in Table 1)
a. Quality factor: is a subjective value derived from accumulated experience of the TTR and varies from 50%–90% for different tumor sites.
b. Reuse factor: is a subjective value derived from accumulated experience of the TTR and varies from 5–10 studies supported by each case for different tumor sites.
c. Selection factor: is derived from analysis of all applications for biospecimen access to the TTR where major criteria could be clearly determined. The mean (±SD) number of selection criteria per biospecimen request was 1.39 (±1.13). Each criterion (tumor type, sex, outcome, etc.) discriminated an average of 35% (±28%) of the stock, thus eliminating on average 65% of biobank cases from study inclusion. Therefore the selection factor applied in our model was derived from the formula:
$\text{selection factor} = (\text{mean fraction of stock retained per criterion})^{\text{mean number of criteria per request}}$
which results in a selection factor of: $(0.35)^{1.39} = 0.232$.
The table provides details of national research use and local biobank release experience data to calculate an estimate of the case stock targets for the BC Cancer Agency Tumour Tissue Repository (TTR).
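The selection factor described in the footnotes is a simple power calculation. The sketch below reproduces the quoted value and shows how quickly the eligible fraction of stock shrinks as criteria are added; the function name is ours, and treating each criterion as retaining the same independent fraction of stock is the simplification implied by the formula.

```python
def selection_factor(fraction_retained: float, n_criteria: float) -> float:
    """Fraction of stock expected to remain eligible after applying
    n_criteria selection criteria, each retaining fraction_retained."""
    return fraction_retained ** n_criteria

# Values quoted in the text: 35% retained per criterion, 1.39 criteria/request.
ttr_selection_factor = selection_factor(0.35, 1.39)
print(round(ttr_selection_factor, 3))  # 0.232

# Illustration only: effect of whole numbers of criteria.
for n in (1, 2, 3):
    print(n, round(selection_factor(0.35, n), 3))
```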
First the numbers are adjusted for relative research funding in terms of investment (proportion of funding allocated for each cancer type) and region (based on provincial population as a fraction of total population), and then by biospecimen qualification factors to determine the appropriate relative stock targets for the TTR.
The relative investment in cancer research within different tumor sites and the proportional contribution of BC researchers are calculated from publicly available data (CCRA annual reports 2005–2011 13 and Statistics Canada) to determine the number of cases used that is relevant to BC. The tumor sites used in this analysis are: breast, colorectal, lung, prostate, other (+) (grouping for several minor frequency sites previously and potentially collectable in the future by the TTR), other (−) (grouping for several minor frequency sites not collectable by the TTR).
Within each tumor site the target number for a stock that can support the number of cases used is adjusted by the typical specimen size and yield in terms of adequate tumor content (quality factor), the number of studies that a typical case can be used to support (reuse factor), and the influence of selection criteria that are used to select specific cases from the stock (selection factor). For example, in approximately 30% of cases of breast tumors, the grossly selected tissue samples will show no or minimal tumor cell content; for colorectal cases 25% will contain mostly adenomatous components or marked necrosis. Such cases are of little value for many research projects, and this is reflected in the ‘quality factor’.
The TTR has a design such that only sections and aliquots of biospecimens are released from each case and so each case can support multiple studies and this is reflected in the ‘reuse factor’. Each study supported involves selecting subsets of 1%–70% of cases from the stock on the basis of one or more selection criteria (such as age, sex, tumor type, survival, etc.) and this is reflected in the ‘selection factor’.
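The adjustments described above can be sketched in code. The text does not state the exact rule for combining the three qualification factors, so the division used here is one plausible reading and should be treated as an assumption. The investment share, population share, and reuse value below are hypothetical placeholders; only the national total, the breast quality factor (about 70% usable), and the selection factor come from the text.

```python
def regional_cases_used(national_cases: float,
                        investment_share: float,
                        population_share: float) -> float:
    """Scale national case use to one tumor site in one region."""
    return national_cases * investment_share * population_share

def target_stock(cases_used: float, quality: float,
                 reuse: float, selection: float) -> float:
    """ASSUMED combination rule: stock needed so that, after losses to
    quality and selection and allowing for reuse, demand can be met."""
    return cases_used / (quality * reuse * selection)

national_cases = 66_600      # from the text
investment_share = 0.25      # hypothetical share of funding for one site
population_share = 0.13      # hypothetical regional population fraction

used = regional_cases_used(national_cases, investment_share, population_share)
stock = target_stock(used,
                     quality=0.70,     # ~30% of breast cases unusable (text)
                     reuse=7.5,        # hypothetical, within the 5-10 range
                     selection=0.232)  # from the text
print(round(used), round(stock))
```

Under this reading, a lower quality or selection factor raises the stock target, while a higher reuse factor lowers it, which matches the qualitative description in the text.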
Revision and updating of case stock targets
While our model offers an approach to generate an overall forecast for the accrual target and specific subgroup targets for different tumor areas (Table 3), these targets are not static. Most tumor biobanks will require many years to accrue the case numbers suggested by the initial target calculation and several relevant factors change and evolve during this time. These include a number of extrinsic ‘research trend’ factors (see Table 2). Many of these factors can only be estimated from assessment of trends in biospecimen use once the biobank is operational, and so ongoing monitoring in this area is needed to ensure the targets set are revisited to be responsive to research needs. In other words, the research that biobanks are intended to support is itself a ‘moving target’.
There are also several intrinsic ‘biobank trend’ factors that change over time and these are easier to assess but also require annual reappraisal of stock and demand pressures to adjust the specific accrual targets in different tumor areas. Application of local ‘biobank trend’ factors (Table 2) to the predicted inventory targets defined by Table 1 allows a given biobank to revise existing stock targets. Table 4 considers the numbers of cases released by a biobank relative to existing stock and predicted targets to model how many new cases are needed to reach the existing target, to replenish existing stock, and to refine relative case numbers to better align predicted case inventory targets with local research demand.
i. Tumor sites: categories of tumor research areas by tumor site. As described in text, Others (+) are all minor categories that have been and can potentially be collected by the TTR.
ii. Cases released: the number of cases released from stock for use by academic researchers, limited to those that included a frozen tissue biospecimen as of January 2014. This number reflects the cumulative release activity over an approximately 4.5 year period since an access mechanism was first initiated. The number is smaller than the total release activity because it excludes releases of cases where other types of biospecimens only were released and also excludes all cases collected and then released as part of prospective protocols.
iii. Stock status: the total number of cases held in stock by the TTR as of January 2014.
iv. TTR Pressure Factors: Demand pressure is defined as the number of cases released (from each tumor site) relative to the average number of cases (released from all tumor sites). Stock pressure is defined as the number of cases released as a percentage of all cases of that type held in inventory.
v. Local Pressure Factor Impact on Accrual:
a. Demand Pressure: Calculating how demand for a given tumor type relates to the average case request received by a biobank can help identify relative demand for the portfolio of cases in that biobank's inventory. If a certain tumor type is in higher than average demand (>1.0), the biobank should consider increasing accrual. Conversely, if a case type is in lower than average demand (<1.0), accrual of that type could be slowed to help rebalance the portfolio and align with local research demand.
b. Stock Pressure: Determining the proportion of inventory release allows a biobank to identify whether stocks are being significantly depleted and require replenishment at above or below its current accrual rate. Importantly, if the stock used is higher than the selection factor (see Table 3) for a given case type, it is likely that requests are being limited by low inventory holdings and increased accrual is indicated to better serve local demand.
vi. Overall Accrual Recommendation: Combining both local pressure factors can generate an overall accrual recommendation. If both factors suggest increased or decreased accrual, a biobank can plan to adapt its collection strategies to align with local demand. If the overall recommendation is to maintain current accrual targets, the biobank's initial target predictions are well aligned with local research needs.
vii. Demand/Stock Pressure Accrual Recommendation Ranges: Empirical ranges to guide changes in accrual rates based on the levels of case inventory and release experienced by the biobank.
The table provides details of factors within individual cancer site categories that are relevant to the revision of current stock targets for the BC Cancer Agency Tumour Tissue Repository (TTR).
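The two pressure factors defined in the footnotes can be computed directly from release and stock counts. The sketch below uses hypothetical counts, not TTR data. The combination rule is also an assumption: the empirical recommendation ranges of Table 4 are not reproduced here, so the thresholds are simply 1.0 for demand pressure and the selection factor (0.232) for stock pressure, and treating stock pressure at or below that threshold as a signal to slow accrual is our simplification.

```python
SELECTION_FACTOR = 0.232  # value derived in the text

def pressure_factors(released: dict, stock: dict) -> dict:
    """Return {site: (demand_pressure, stock_pressure)}.

    demand_pressure: releases for the site / mean releases across sites.
    stock_pressure: releases for the site / cases of that site in stock
    (a fraction; multiply by 100 for a percentage).
    """
    mean_released = sum(released.values()) / len(released)
    return {
        site: (released[site] / mean_released, released[site] / stock[site])
        for site in released
    }

def recommend(demand: float, stock_pressure: float,
              threshold: float = SELECTION_FACTOR) -> str:
    """ASSUMED rule: change accrual only when both factors agree."""
    if demand > 1.0 and stock_pressure > threshold:
        return "increase"
    if demand < 1.0 and stock_pressure <= threshold:
        return "decrease"
    return "maintain"

# Hypothetical counts for illustration only.
released = {"breast": 300, "colorectal": 100, "lung": 200}
stock = {"breast": 1000, "colorectal": 800, "lung": 500}

for site, (d, s) in pressure_factors(released, stock).items():
    print(f"{site}: demand {d:.2f}, stock {s:.1%}, {recommend(d, s)}")
```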
Discussion
We have proposed a model for defining targets for an individual provincial tumor biobank. To establish appropriate targets for stocks for this biobank to support retrospective research, we have considered the levels of investment in academic cancer research over the past 10 years relative to research use of cases obtained from tumor biobanks over a recent 3-year period. We have then applied factors to adjust for the primary intended user base of the biobank and variables intrinsic to biobanking operations and collection of cases. On this basis, we estimate that the TTR current stock target (for 2014) is approximately 4500 cases. Then with consideration of actual local demand for types of cases released from the biobank, we estimate a refined target that adjusts the relative portfolio of cases in different tumor sites.
The first generation of poly-user tumor biobanks4–6 emerged in Canada in the 1990s, to meet simple demands for increased numbers of biospecimens and improved oversight. Second generation biobanks then emerged in the early 2000s to address a desire for improved biospecimen quality through implementation of standardized processes and annotation. 10 The CTRNet reflects a strategy initiated in 2004 to build on these prior investments to functionally link biobanks that are funded and driven by local priorities, and to address the need for harmonization across these regional tumor biobanks. 14 This is essential to enable national and international collaboration and research15–17 on larger cohorts and to respond to customer demands (reviewed in Ref. 18 ).
But even those CTRNet biorepositories initiated more recently have lacked a good rationale for targets for their stock beyond the setup period. This is partly because there are few published estimates of research demand (need and use) for biobanks. However, the sustainability and current funding status of even leading CTRNet tumor biobanks are uncertain, short term, and fragmented, and prior assumptions of self-sustainability outlined in the research and business plans underlying these biobanks are flawed. Both existing and potential new funders are hesitant to address this critical issue for cancer research infrastructure in the face of undefined targets.
To some extent the model we propose is oversimplified and has several limitations, but to our knowledge this is the first attempt in the literature to address the critical question of biospecimen supply versus demand. The dominant driver for our model is the annual use of cases by academic researchers, termed research use factor. We have based this calculation on the Canadian cancer research landscape; however this calculation would be specific to jurisdiction and research focus, and would need to be adjusted if applied to other settings.
We have derived the biospecimen research use figure from several studies of case use in the literature.1,7–9 But these studies only assess the output of academic researchers, are relatively small in scope, and are indirect in terms of the method of data acquisition and retrospective analysis. Therefore these studies do not include specific details of biospecimen sources for specific types of biospecimens. Prospective direct data collection from large numbers of investigators or from project applications and reports would improve data accuracy.
Unpublished academic and industry case use is also not captured in these previous studies, but academic and industry investigators use biospecimens in pilot studies, assay development, and in studies that fail or were never intended to be published. Consequently the numbers used here are very likely an underestimate of actual use. Also, the data obtained on research funding and trends are based on investment data reports that are not contemporary. Furthermore, the researcher numbers obtained from these reports, and the extrapolations used to determine the proportion accessing cases with frozen tumor biospecimens from biobanks, are only approximations. Any attempt to place fixed target accrual limits will also face the drawback of leaving a biobank unable to fulfil large biospecimen case requests without significantly depleting holdings, thus potentially reducing future research capacity.
Accrual estimates assume maintenance of overall current capability to accrue. Revision of targets is necessary at annual intervals to reassess the effects of research and biobank trends on the initial targets. The emergence for example of a specific research focus among the investigators in the region (e.g., increased demand for small subsets or new treatment associated cases) or new demand to support projects by prospective accrual are not encompassed. Determination of future accrual will also be influenced by new data on national investment priorities and case use data.
We propose our initial target calculation, derived from Table 3, be updated based on ongoing biobank experience with case release and local research demand changes, calculated in Table 4. Ongoing accrual activities can then be planned to mirror the fluctuations in demand experienced on an annual basis. Part of this accrual to sustain retrospective stocks may be accomplished through engaging increasingly with prospective research projects to provide accrual services for fresh tissues and at the same time retaining a portion of the biospecimen on each case to add to frozen stocks.
We have introduced several concepts such as ‘Biospecimen qualification factors’ that we believe are important to consider when defining inventory targets; however, the default values we have used are only estimates and may deserve more detailed evaluation and study to refine them in future. There are also other factors that, if quantifiable, would improve our initial and refined case target calculations. For example, the TTR inventory target estimates do not take into account the existence of other tumor-site specific biobanks that operate within the same geographical region. These oligo-user biobanks are typically developed by dedicated research groups to support studies on a common research question and disease site (e.g., ovarian, prostate, breast). While researcher access to such collections may be limited by the terms of consent and the governance, priorities and tissue format decisions of these biobanks driven by specialized research questions, their contribution to predicted supply and demand remains poorly defined.
Perhaps the most important reservation around our model is that it assumes the biobank's primary role is limited to the local geographical region. The increasing need for biobanks to be able to support larger pan-regional, -national, and -international studies means that a broader horizon needs to be considered in future modeling.15–17 Also the evolving capability of harmonized biobank networks to supply biospecimen requests by drawing on multiple sites should encourage regional biobanks to limit stock accrual in certain areas while increasing resources in areas of demand. This would also allow a shift in operational focus to maintaining existing stocks at set target levels to support retrospective research studies, and then harnessing resources to support specific prospective studies which capitalize on the established assets and capabilities of biobanks.
In conclusion, development of models to establish stock targets for tumor biobanks will allow funding envelopes to be developed for national tumor biobanking. This would establish finite funding requirements and targets for poly-user biobanks, allowing these to focus on quality and standardization within a known budget. It could also result in partial re-allocation of current funding distributed to support mono- and oligo-user biobanking efforts that, in some instances, might be better conducted by established poly-user biobanks.
Acknowledgments
This work was supported by the Tumour Tissue Repository Program at the BC Cancer Agency (a part of the Canadian Tissue Repository Network that is funded by a grant from the Institute of Cancer Research, Canadian Institutes of Health Research) and the Office of Biobank Education and Research, University of British Columbia (that is supported by the Department of Pathology and Laboratory Medicine, University of British Columbia).
Author Disclosure Statement
No conflicting financial interests exist.
