Abstract
Clinical exome sequencing is a genetic technology making the transition from a laboratory research tool to a routine clinical technique used to diagnose patients. Standards help make this transition by offering authoritative shortcuts for time-intensive tasks, but each shortcut means that something is lost during abstraction. In clinical exome sequencing, reliance on standards may obscure the match between a patient’s phenotype and genotype. Based on three years of observations, I show how a clinical exome sequencing team decides when to trust standards and when to develop workarounds. I argue that the match between phenotype and genotype is circumscribed by the team’s reliance on specific standards and that trusting in standards means trusting in experts’ appropriate use of standards, generating a workflow of reflexive standardization.
Since the discovery of the DNA sequence and especially since the successful completion of the Human Genome Project in 2003 (see Reardon, 2005), scientists have promised a genomic revolution in which advances in genetics would lead to a radical rethinking of scientific and biomedical practice (Nelkin and Lindee, 2004). Yet, despite massive research funding by both private and public agencies for several decades, clinical applications of genetic science have remained sparse (Khoury et al., 2007). The next step in making genomics clinically relevant is the adoption of exome sequencing, which constitutes a targeted approach to sequencing the coding regions of the human genome to identify genes associated with disorders. The exome contains all the known exons in the human genome. Exons are short, functionally critical sequences of the DNA that are preserved in mature RNA. The exome constitutes only about 1 percent of the human genome but is involved in an estimated 85 percent of disease mutations. Proof-of-concept studies and successful applications have appeared in leading science and medical journals (Biesecker, 2010; Tiacci et al., 2011), and some commercial and academic organizations have introduced clinical exome sequencing (CES). The purpose of bringing exome sequencing to the clinic is to provide a genetic diagnosis to patients. Rather than moving through a series of single genes or even a gene panel, CES offers the opportunity to examine most clinically relevant parts of the genome at once. Exome sequencing is situated within a history of genetic screening and testing technologies, complementing and partially supplanting karyotype and microarray testing (De Chadarevian, 2002; Keller, 2000; Lindee, 2005).
Because exome sequencing is a technological platform making the transition from ‘bench to bedside’ (McBride et al., 2010), the technology, which was previously used only for research purposes, now needs to meet the demanding federal quality requirements of clinical laboratories. Social scientists have noted that the period of introduction of innovative technologies into work settings is a particularly critical time in a technology’s life cycle because implementation requires the articulation of assumptions that will quickly slide back into a taken-for-granted infrastructure once a technology is routinized (Berg, 1997; De Laet and Mol, 2000; Suchman, 2007). Such introductions constitute a handover from the people designing, packaging, and marketing a technology to the users of the technology – although the distinction between user and designer remains blurry (Oudshoorn and Pinch, 2003). The technology for sequencing has been available for research purposes, and the question is how to make the technology commercially and regulatorily applicable for a clinical diagnosis.
Standards and standardization matter greatly in making this clinical transition (Thevenot, 1984). The technology needs to be aligned with databases, validated measures, calibrated instruments, and regulatory-approved laboratory practices. These standards offer stable and mobile shortcuts for taking exome sequencing from a research to a clinical setting in the sense that they offer a standardized way of accomplishing a time-intensive task. Each of these standards, however, has its own contexts of origin, memory practices (Bowker, 2008), and blend of limitations and opportunities, which circumscribe the work that can be accomplished.
In his book Trust in Numbers, the historian Theodore Porter (1996) situates the increasing dependence on standards in science and medicine in a larger historical context. He argues that over the course of the 20th century, personal trust among scientists gave way to credibility based on standards and quantification. The turn to ‘mechanical objectivity’ occurred in an attempt to unify a weak scientific community against outside political pressures and internal disagreement (see also, Daston and Galison, 2007). The loss of personal judgment in science undermined the ‘opportunity for others to doubt the analysis’ (Porter, 1996: 228). Porter’s contrast between trust circulating within small scientific communities and trust imbued in numbers and standards suggests that these forms of trusting are exclusive alternatives. In a world saturated with standards (Timmermans and Epstein, 2010), however, more hybrid forms of expertise will emerge. Underappreciated in Porter’s work is the realization that the turn to mechanical objectivity is not only a political defense mechanism used by scientists under siege but also privileges particular kinds of scientific knowledge. As Busch (2011) put it, the power of standards resides in their ability to set the rules that others must follow or the range of categories from which they may choose. Besides making exome sequencing work, the issues for implementing CES through standards are, then, the relationship between trust in standards and traditional interpersonal forms of trust, and what genetic knowledge opportunities are opened and foreclosed due to standards.
Next, I explain what it means to trust in standards compared with other sources of trust in scientific work. I argue that trust in standards presumes socio-political pressures, a process of making the standard work, and particular outcomes. I examine how reliance on three specific standards in exome sequencing is a response to pressures of commercialization in anticipation of insurance reimbursement, industrial competitors, and fear of litigation. I show that reliance on specific standards calls for verification and repair work but that in some situations, especially when there is too much data to consider and sequencing needs to be scaled up, the sequencing team has little choice but to rely on imperfect standards. At each point, the standards limit the possible matches between phenotype and genotype and influence how genes inform diagnosis and treatment. I argue that in an era of big data and countless regulatory mandates, trust never resides solely in standards but also in experts’ appropriate use of standards. Exome sequencing then signals a form of reflexive standardization in which the ability to trust standards, correcting for standards’ weaknesses, and calculating the consequences of using standards emerges as a form of scientific expertise.
Trust in standards
I define standards as means to construct uniformities across time and space through the generation of agreed-upon technical rules (Bowker and Star, 1999; Timmermans and Epstein, 2010). This definition is inevitably broad (see also Busch, 2011) and covers technologies of infrastructural design, terminology, performance, and procedures (Timmermans and Berg, 2003: 24–27). Although standards may be situated in local settings, they involve more than one ‘community of practice’ or activity site; they make things work together over distance. Authoritative external bodies of some sort, such as technical experts or professional organizations, manufacturers’ associations, or the state, usually back up standards. Lampland and Star (2009) note that standards often are nested within other standards and are distributed unevenly across the social landscape.
In the spectrum that spans from infinite flexibility to stable and immutable uses, standards tend to err on the side of stability, although a trajectory of improvability is needed for the standard to remain relevant (Stinchcombe, 2001). A standard implies a ‘script’ (Akrich, 1992) or world narrative that specifies the various roles of users, as well as their skills, motivations, requirements, tools, and final outcomes. These assumptions become inscribed to varying degrees in operating protocols and the material software and hardware of the standards. Such assumptions also extend to a physical, legal, and economic infrastructure that enables the standard to do its work (Epstein, 2009; Oudshoorn and Pinch, 2003). Yet, these assumptions are necessarily incomplete, idealistic, and simplistic. Under-specification of users and infrastructures is necessary for the standard to reach a broad constituency and adapt to diverse settings (Timmermans and Berg, 1997). Although users are constrained by the terms of engagement embedded in the design of a standard, they may appropriate standards for uses that were not even imagined when the standards were created (Casper and Clarke, 1998; De Laet and Mol, 2000; Lakoff, 2005). Tinkering, repairing, subverting, or circumventing prescriptions are necessary to make standards work (Lampland and Star, 2009; Star, 1995). Users often need to work deliberately to save the standard from falling apart under changing circumstances (Alder, 1998; De Laet and Mol, 2000; Hogle, 1995; Jordan and Lynch, 1998).
What does it mean to trust in standards to conduct clinical and scientific tasks? Trust in standards refers to confidence in the attributes of standards, either by choice or by necessity, to get a job done. Alternative sources of trust include professional expertise, managerial hierarchy, religious authority, majority opinion, tradition, or magic. Porter (1996) contrasts trust in standards with trust in professionals. In Porter’s conception, trust is largely an interpersonal quality typical of face-to-face scientific communities able to police themselves internally. Yet, as the title of his book suggests, trust can also be imbued in standards, generating a form of mechanical objectivity based on technical rule-following. The advantage of working with standards when implementing new technologies is that users should not have to worry about the make-up of the standard. The authoritative nature of standards with implied reliable content and user-friendly characteristics should give confidence that the work is valid and reliable.
Trust in standards, however, is not an a priori given, but instead results from users’ work with them. In light of necessary repair work and misfits between standards and the tasks-at-hand, the standard may fit its original purposes but may not be appropriate for a new application. A lack of trust will require retracing and duplicating some of the work that a standard was supposed to do, or modifying, ignoring, or dismissing the standard. Such verification work, however, still takes place along the route the standard has created. Because the patch-up or renewal reacts against a standard, it will be influenced by standards.
In science, trust in standards also matters for the added substantive knowledge that working with standards produces. Stinchcombe (2001) argues that the substantive rationality of a standard depends on its formal characteristics. Thus, whether or not a standard can be plugged seamlessly into a setting to produce added substantive value or is experienced as fraudulent and alienating depends on whether the standard is cognitively adequate for the situation it describes, is communicable to the people who need to use or implement the standard, and is improvable and improving over time. A standard stands in a double relationship to a purpose: it abstracts relevant elements from a broader universe of possibilities by deliberately ignoring other elements, and it has authority to govern social action, regardless of what was left out in the process of abstraction. The key issue of interest for social scientists is how, in processes of creating, improving, or using standards, this balance between abstraction and exclusion matters for the substantive rationality of the work. Social scientists have hinted at this link between form and function by stating that standards embed choices, values, priorities, power, or politics (e.g. Busch, 2011; Lampland and Star, 2009; Timmermans and Epstein, 2010).
In sum, trust in standards draws attention to the socio-political contexts in which standards emerge as a preferred source of trust in light of alternatives, the processes of making standards trustworthy, and outcomes with particular consequences. How do standards help exome sequencing make the transition to the clinic? The purpose of exome sequencing is to link genetic variants found in the exome to patient characteristics in order to diagnose the patient. In light of the countless different patient symptoms and the quickly changing genetic knowledge base, standards are attractive tools to achieve clinical payoff: they suggest stability, authoritativeness, and, above all, functionality. They promise to do time-consuming and resource-intensive jobs and allow genetic sequencing to be scaled up qualitatively and quantitatively. They assure technical expediency and interpretative focus. Trusting in standards, however, inevitably devalues other sources of trust, requires work to make the standard fit the situation at hand, and, importantly in the context of exome sequencing, influences not only how genes inform diagnosis but also what the link is between genotype and phenotype.
Methods and setting
For the past 3 years, I have been observing and audio-recording weekly data board meetings of a CES team at one of the first US academic centers to offer exome sequencing. My observations cover the first meetings in which the team discussed how to set up exome sequencing and the discussion from the first exome case through to the more than 1000th case. The team invited me to attend their meetings because they initially had plans to apply for a research grant that required a social science component. 1 A professional service transcribed the recordings, which I analyzed following the guidelines of abductive analysis (Timmermans and Tavory, 2012). This analytical approach depends on coding and conceptualizing data in close relationship with the relevant social science literature, in my case the Science and Technology Studies (STS) literature on standards, in order to theorize surprising findings. The research received institutional review board approval.
The meetings take place in a small conference room with an oval table with about 9 seats, surrounded by a second row of about 20 seats. At each 2-hour meeting, the team discusses the sequencing data of between 8 and 14 patients by projecting the data in the form of an Excel document on a screen at the front of the room. The data analyst, or if present the ordering clinician, then goes over a short description of phenotype, followed by a detailed review of the genetic results in which promising alleles are highlighted. Usually about 15 people – consisting of bioinformaticians, laboratory analysts, laboratory directors, geneticists, genetic counselors, clinicians, and support personnel – attend the meetings. Besides geneticists, some of the clinicians are specialists who referred a patient for CES. They bring in-depth knowledge about the patient. In other cases, however, the team deals with referrals and has limited knowledge about the patient. Most participants bring smart phones, tablets, and laptop computers to the meeting, looking up information (or distracting themselves) during the presentation.
The team developed the following procedures. A clinician decides either at the local hospital or in a remote location that a patient can benefit from CES and contacts the exome team. The clinician often takes this step after using other genetic testing technologies (e.g. microarray, single-gene tests, or gene panels), but exome sequencing is increasingly used as a first-line genetic test. The team requires payment information, informed consent, and a requisition form to be filled out. A phlebotomist takes a patient’s blood and sends it to the laboratory for DNA extraction and exome sequencing. Based on the patient’s chief complaint, the analyst prepares a library of genes that may be relevant to the symptoms (the primary gene list). The team analyzes the exome sequencing results in light of this gene library, although in the case of trios (simultaneous sequencing of a child and both biological parents) the team will look at all de novo changes to assess whether they relate to the phenotype. The analyst applies filters to the gene list to pick out the likely pathological candidates and annotates the resulting allelic variants with their indications. Then, the results are discussed at the data board meeting. At the meeting, the team decides which results to report out as pathogenic, likely pathogenic, variants of uncertain significance, or incidental findings. Finally, the patient and referring clinician are informed of the results.
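The selection rule described above, restricting review to the primary gene list but also examining every de novo change in trios, can be illustrated with a minimal sketch. It is not the team's actual software: the `Variant` record, the `candidate_variants` function, and the gene names are hypothetical placeholders, and a real pipeline would compare genotypes and quality metrics rather than exact labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str    # gene symbol (placeholder names in this sketch)
    change: str  # free-form label for the allelic change


def candidate_variants(child, gene_list, mother=None, father=None):
    """Return the child's variants that would be brought to review."""
    # Proband-only case: keep variants in genes on the primary gene list.
    selected = [v for v in child if v.gene in gene_list]
    # Trio case: also keep de novo changes (absent from both parents),
    # whether or not their gene is on the primary gene list.
    if mother is not None and father is not None:
        inherited = set(mother) | set(father)
        for v in child:
            if v not in inherited and v not in selected:
                selected.append(v)
    return selected


# Hypothetical data: GENE_C carries a change seen in neither parent.
child = [Variant("GENE_A", "c.1"), Variant("GENE_B", "c.2"), Variant("GENE_C", "c.3")]
mother = [Variant("GENE_A", "c.1")]
father = [Variant("GENE_B", "c.2")]
gene_list = {"GENE_A"}

proband_only = candidate_variants(child, gene_list)
trio = candidate_variants(child, gene_list, mother, father)
```

In this toy case the proband-only analysis surfaces only the `GENE_A` variant, whereas the trio analysis also surfaces the de novo change in `GENE_C`, which lies outside the gene list.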
In light of the discussion at the data board meeting, there are three critical moments where standards enter the exome sequencing process: first, during the process of turning patient symptoms into clinical indications on a requisition form; second, during the transformation of the requisition form into a list of genes associated with the phenotype (the gene library); and third, during the translation of the sequencing results based on this gene library into reportable results.
From patient to requisition form
The logic of exome sequencing as a clinical test is to match the phenotype to the genotype. The requisition form aims to standardize phenotypes as a reference point for diagnosis. The form was made in-house from pre-existing forms and adapted to the task at hand (see Figure 1). This form is a standard that aims to render hospital-specific electronic and paper patient records from all over the globe, patient trajectories of different durations, and countless medical specialties accessible and compatible with the laboratory work. The form has institutional authority, scripts a patient’s salient characteristics, and nests other standards (see below).

Figure 1. Requisition form with identifiers removed.
The CES team spent several meetings creating and discussing this form. The form is a pragmatic hybrid that combines multiple aims: identifying patients along various clinically relevant demographic dimensions, tracking a blood sample, facilitating payment, defining the kind of test required (testing one person – called a proband – or a trio), offering a checklist of the paperwork required for the test, and providing a summary of clinical indications. The form is at some points very specific and at other points surprisingly vague. The form reflects the expertise of the team and where they think exome sequencing will be useful.
The requisition form embeds fascinating narratives. Gender, for example, is divided into ‘male’, ‘female’, and ‘unknown’, because some team members specialize in disorders of sex development. Trios take children and parents as their model, although other hereditary combinations may be useful for sequencing. The category of race has, in addition to ‘European Caucasian’, checkboxes for ‘Ashkenazi Jewish’ and ‘Other Jewish’, while ‘Asian’ and ‘Hispanic’ are only generally indicated. This information is important to assess the population frequency of polymorphisms (see below), and the categories reflect the uneven knowledge base of genetic sequencing.
We can also observe how the form builds upon other standards. If clinicians want to list suspected genes, they are encouraged to use gene symbols from the Human Genome Organization (HUGO) Gene Nomenclature Committee (HGNC). If the clinician submits extracted DNA, the team specifies specimen preparation following Clinical Laboratory Improvement Amendments (CLIA) guidelines. 2 For billing purposes, the team also requests International Classification of Diseases, Ninth Revision (ICD-9) codes that will justify the medical necessity for the test and are used for insurance reimbursement. ICD-9 codes need to be precise because if the code does not match the patient’s condition, the insurance company will refuse payment. Indicative of a separation between reimbursement and clinical work, however, the approximately 18,000 ICD-9 codes are insufficiently granular to signify the patient’s medical condition.
The clinical indications have their own list of checkboxes, and these are used to create the gene library. The form should facilitate creating a gene library that is broad enough to capture all the relevant alleles that may explain the phenotype but narrow enough to interpret the results. The form structures exome sequencing to answer a specific clinical question – that is, the patient has these symptoms, what is the genetic cause? – rather than to explore fully all potential genetic variants in the patient. 3
The indications rubric on the form constitutes a compromise between the expertise present in the CES team, the information needs of the team, promising markets for CES, and similar lists created by competitors. For example, the category of ‘multiple miscarriages’ under family history and ‘sudden infant death’ under clinical indications are put together because of a clinician’s interest in finding out why some families struggling with infertility may have a genetic disorder leading to miscarriages and infant mortality. The reasoning for paying attention to them is that the team assumes that people going for infertility treatment constitute an attractive market for CES. These people are already spending extensive resources on infertility treatment, and a genetic test may provide a reason for infertility or fetal mortality and help focus treatments.
The requisition form embodies a tension between an administrative and a clinical logic. From an administrative perspective, CES can proceed if payment information, requisition form, and informed consent are present and the specimen sample is correctly identified. In that sense, the form works administratively as a standard. It brings together temporally and geographically dispersed processes and has institutional authority because exome sequencing cannot proceed in the laboratory without the form. It also coordinates in one page various activities and ties together couriers, complex payment systems, CLIA-required laboratory documentation, and a broad range of medical specialties. In fact, we can go further and state that the requisition form is a critical element in the constitution of the dispersed network of people, specimens, technologies, and experts and not simply a representation of that network (Berg and Bowker, 1997).
Insurance reimbursement is an important socio-political driver for relying on this specific standard. Each test costs several thousand dollars, and a requisition form that records a mode of payment should safeguard against financial losses. Due to the complexity of billing procedures, however, the team is unable to find out whether the academic center where they are based is being reimbursed for individual tests. As long as the patient provides insurance and payment information, this does not seem to matter much for the ability to sequence. The broader goal is to use appropriate billing codes and a clinical rationale to compel insurance companies to pay for this new test. The external pressure for reimbursement thus renders this kind of standard preferable.
The form regularly fails for the clinical goal of having a sufficiently specific set of symptoms to initiate CES. More precisely, the team does not trust clinicians to fill out the forms completely and correctly. The clinical indications list on the requisition form includes broad categories such as cardiomyopathy and neurological disorders and more specific conditions, such as autism. Since the indications are only an intermediate step to creating a gene list, even the diseases are insufficiently precise to narrow down candidate genes. Therefore, the CES team has text boxes for additional description and differential diagnosis. Still, even these free-text inputs are not problem-free. Differential diagnosis rests on a step-by-step elimination process of likely diagnoses, and the clinician needs to think ahead about what might come up in the full pathway of the differential diagnosis. Otherwise, the CES would be performed for too narrow a purpose and the team would have to reanalyze the data whenever the clinician entertains a new possibility in the differential diagnosis pathway. The broader issue with free-text entries is that the data are indeed open-ended; the medical profession lacks a standard way of describing symptoms. Even more, standardization runs counter to the professional character of medicine, which gives great autonomy to clinicians to name and describe phenotypes (Mol, 2002).
The genetic counselors and laboratory personnel have much experience with clinicians’ inability to fill out requisition forms completely and correctly. One sighed during a meeting, ‘you can’t train doctors’. They have been correcting incomplete forms for a broad variety of genetic tests. Correction does not only mean filling in blank entries but prioritizing symptoms into a coherent phenotype. Is osteopenia (brittle bones) part of a cancer patient’s symptoms or is it actually a side effect of chemotherapy? As a side effect, it is unlikely to be inherited and have a genetic cause. If it is part of the phenotype, the clinicians will be looking for genes that are involved both in cancer and in osteopenia. Corrections and elaborations provide a deeper temporality to the synchronic picture of a list of equally valued boxes, ranking symptoms in a temporal narrative of likely cause, effect, and incidentals. Elaborating also means deciphering the intent behind checked diagnoses. The team suspects that an autism diagnosis is often given incorrectly when developmental delay may be more appropriate (see Eyal et al., 2010). Clinicians may opt for autism, however, because the diagnosis gives access to various educational and developmental services. Any patient labeled autistic is therefore automatically checked for developmental delay, even if the clinician leaves that latter box blank.
The CES team has created a workaround to manage unreliable clinicians. If the patient is local, the team asks the requesting physician to send the patient to the genetics clinic for a referral. While obtaining informed consent, the team then offers genetic counseling, which includes obtaining an extensive medical history. In these cases, the clinicians thus conduct their own medical examination and create the clinical indications list based on the patient visit. In essence, they duplicate a medical examination in order to fill out the form correctly. If the patient is from a remote institution, such double-checking is impossible, but the team designates a genetic counselor to telephone the requesting clinician, ask questions about symptoms, and request a number of recent clinical progress notes. In these cases, the team culls their own keywords out of a close reading of the patient’s file. This arrangement requires a costly, time-consuming, and staff-intensive investigation of symptoms. Without it, however, the technology would likely fail to produce a match between phenotype and genotype. As a result, CES might not reveal anything clinically relevant, due to the test’s scope being too narrow.
Standardized phenotypes have substantive consequences for the possible genetic matches. The phenotype draws salient characteristics from patient complaints, observed symptoms, and test results as represented in a medical file in a handful of fixed keywords. One consequence is that a technology that promises a comprehensive screen of most clinically relevant mutations is turned into a focused test to answer specific clinical questions about likely diagnoses. Standards render the technology more focused, and it is much easier to standardize precise diagnostic questions than open-ended screens. By using the list of keywords as a measure of what really counts in sequencing, the team avoids the most ethically hairy discussions of reporting incidental findings because they make it difficult for such findings to be identified. Incidental findings are likely pathogenic findings that are not associated with the clinical phenotype, for instance, finding a TP53 mutation associated with cancer in a child tested for developmental delay. If the team does not systematically examine the cancer genes, they are unlikely to come into view (in this example, they could only appear if a gene was associated both with developmental delay and with cancer). 4
Taken in isolation, each keyword may fit several disease mechanisms and several configurations of symptoms that may or may not be required for the disease label to apply. The search term ‘dysmorphi*’, for example, may capture a broad range of morphological constellations related to various birth defects and congenital, genetic, or isolated diseases. Even if the feature is specified in a subsequent keyword (such as ‘microcephaly’), the sequencing of this exome will target all genes associated with dysmorphic features and the patient is made equivalent with countless others with whom there may be little commonality. Each keyword, then, will inevitably set sequencing on the path of misleading associations that will need to be ruled out later.
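Why a more specific keyword does not narrow the search can be shown with a small sketch, assuming that the gene list is the union of all genes retrieved for each keyword. The phenotype index, its terms, and the gene names below are invented placeholders, not entries from HGMD or OMIM, and the `gene_library` function is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical phenotype-to-gene index; terms and gene names are placeholders.
PHENOTYPE_INDEX = {
    "dysmorphic facial features": {"GENE_A", "GENE_B"},
    "dysmorphism, limb": {"GENE_C"},
    "microcephaly": {"GENE_B", "GENE_D"},
    "cardiomyopathy": {"GENE_E"},
}


def gene_library(keywords):
    """Union of the genes for every index term matched by any keyword pattern."""
    genes = set()
    for pattern in keywords:
        for term, term_genes in PHENOTYPE_INDEX.items():
            # A trailing '*' works as a wildcard, as in 'dysmorphi*'.
            if fnmatchcase(term, pattern.lower()):
                genes |= term_genes
    return genes


broad = gene_library(["dysmorphi*"])
refined = gene_library(["dysmorphi*", "microcephaly"])
```

Under this union assumption, adding ‘microcephaly’ enlarges the gene list instead of restricting it: every gene matched by the wildcard term stays in the library, and the genes tied to the second keyword are added on top.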
Trust in the requisition form to provide the financial and phenotypical information needed to conduct exome sequencing reflects socio-political pressures under which geneticists are motivated to show that the technology ‘works’ and will thus become part of diagnostic repertoires and reimbursements. Typically, a laboratory executes a test and does not worry about the results or their implications. Laboratories function on what Dorothy Smith (1998) called ‘document time’, in which the requisition form is ‘what really happened’. In clinical laboratories, this means that you receive what you ask for. If the test does not diagnose a patient, it is not the laboratory’s problem as long as the laboratory processed the test according to CLIA standards. With CES, the laboratory and clinicians care about the outcome because the team still needs to prove the clinical relevance of the test to ordering clinicians and insurance companies. The team therefore tries to create favorable conditions for the test to succeed. The requisition form captures multiple aims, but the goal that matters most to the team – proving that exome sequencing can diagnose patients – defies standardization. The team therefore enacts time-consuming repairs of the form’s limitations by manually retrieving what should be the intent behind requesting the test. Inevitably, however, the form and the workaround reduce the patient’s phenotype to a limited number of keywords based on what clinicians anticipate as clinically relevant. The universe of possible genetic matches is already circumscribed by the culling of the list of clinical indications into a list of between 3 and 15 keywords.
From requisition form to gene list
The laboratory analyst next associates the keywords with all relevant possible genes and their constituent allelic variants, again underscoring the privileged position of the clinical indications. The resulting list of retrieved genes will range from about 50 to several thousand genes. The field of genetics relies upon two commonly used general databases to determine which genes are associated with phenotypes: the Human Gene Mutation Database (HGMD) and Online Mendelian Inheritance in Man (OMIM). The need for the databases comes from the realization that the literature on human genetic variants is rapidly expanding and some kind of condensed, searchable form is needed to keep up with changing understandings of genetic functionality. These databases are again standards because they intend to bring together and interpret a vast interdisciplinary literature spread over countless journals from diverse disciplines and countries in the same standardized user-friendly format. The categories of the database prescribe certain uses and users, and the databases build on other standards such as nomenclatures or publication guidelines to report variants.
The CES team uses these databases to create their primary gene library for the conditions under consideration and to interpret the results. 5 The databases should provide insight into clinically relevant genes associated with symptoms. Databases are critical for answering ‘has this been seen before?’ and ‘is this sequence variation pathogenic?’ By using these databases, however, the team also buys into a series of complex assumptions built over time into the database infrastructure (Bowker and Star, 1999). The databases are curated, meaning that an editorial staff makes decisions about what should be included in the database, and it is those decisions that circumscribe the information to be retrieved.
A biologist and mathematician teamed up in 1987 to create HGMD to examine underlying molecular mechanisms of mutagenesis. When they realized the utility of the database for genetic counselors, clinicians, and researchers, they continuously updated the database. HGMD catalogs single base-pair substitutions in coding, regulatory, and splicing-relevant regions; micro-deletions and micro-insertions; indels and triplet repeat expansions; as well as gross gene deletions, insertions, duplications, and complex rearrangements. By June 2013, the database contained more than 141,000 different lesions in 5714 genes, with new entries accumulating at a rate of more than 10,000 per year, drawn from 1950 journals (Stenson et al., 2014). Despite the rapidly growing database, the curators made several decisions that limited the utility of the database for clinical use.
The major limitation of HGMD is that the database aims to be comprehensive and list every allelic variant with clinical relevance, but initially only referenced a single instance of any mutation, usually the first published report. This synchronic logic limited the clinical relevance of the database because subsequent published manifestations may signal different phenotypes. The curators noted that their decision to reference only the original article was made ‘to avoid confusion between recurrent and identical-by-descent lesions’ and ‘because any unselective collection of mutation reports would have resulted in an inflation of references with little or no practical or scientific use’ (Krawczak et al., 2000: 46–47). The practical use of a more comprehensive phenotype from a clinical perspective may, however, make the difference to whether a gene is included in a gene list. In 2009, HGMD curators started adding references to some selected polymorphisms, but they do not seem to have done this systematically for all the previously published literature.
HGMD’s curation policy of offering only minimal reference to the literature stands in contrast with the policy used in the OMIM database. Johns Hopkins geneticist Victor McKusick initiated Mendelian Inheritance in Man in the early 1960s as a printed volume that aimed to list human genes and genetic disorders. After 12 printed volumes, the database went online in 1987 as OMIM. This database does not aim to be comprehensive but focuses on variants that are relatively common, represent a novel mechanism of mutation, or have historic significance (McKusick, 2007). Unlike HGMD, OMIM provides a compendium of bibliographic material and observations on inherited disorders and genes. The database is organized by gene locus but focuses on medical relevance. Curating consists of creating entries for each distinct gene or genetic disorder for which sufficient information exists and providing distinctive characteristics of given clinical disorders, including variations from usual cases. The staff reviews several leading journals that publish major articles in clinical and molecular genetics. The issue with OMIM is both the selection of the allelic variants to include and the annotation of the genes. Even more than HGMD, OMIM’s inclusion criteria constitute a judgment call in which the utility of the database depends on the impact of decisions that may only become clear in the future. The other problem is that the database is actually not very user-friendly. The diachronic annotation system creates too much text to read through for clinicians looking for a quick answer.
Based on the review of the OMIM and HGMD curating philosophies, we would expect that HGMD is the more comprehensive database, while OMIM provides depth for selected allelic variants. Several studies comparing the two full databases, however, find that both databases miss genes and variants and that there is a lag time between publication and inclusion in the database (George et al., 2008; Peterson et al., 2013). The same studies also show that OMIM included unique genes and variants that HGMD does not have. The judgment calls inherent in curating lead to different inclusion patterns.
Other problems facing HGMD, OMIM, and other databases are that over time a publication bias developed against publishing clinical mutations (Krawczak et al., 2000) and that the published literature is error prone. The first single base-pair substitution in a human gene underlying a genetic disorder was published in 1979 (Chang and Kan, 1979). Further germline mutations underlying human inherited diseases characterized at the molecular level were published in the major biology and medical journals. Over time, the prestige of publishing single-gene mutations became less rewarding (measured as decline in the article’s impact factor). A broader variety of journals now publish variants, and more variants are published in one article, requiring broader searches to pick up relevant journals and articles. At the same time, with the value of publishing variants diminished, many variants no longer make it to journals and fewer manuscripts focus explicitly on variants. The extent of publication bias becomes apparent if we take a look at gene-specific databases created and curated, often on the side, by a small group of specialist researchers. About 50 percent of the content of these databases consists of unpublished reports (Patrinos and Brookes, 2005). To further complicate the issue, mutation reports contain errors of inconsistent location data, confusion of strand orientation, and typing errors. Studies suggest that up to a quarter of the disease-causing entries in HGMD consisted of common polymorphisms or sequencing errors (Bell et al., 2011; Xue et al., 2012). HGMD contacts original authors when published information is unclear or incorrect, but only half of such requests are satisfactorily answered. HGMD therefore keeps a ‘Bad Bank’ of inadequately described mutations (Stenson et al., 2014).
The CES team uses the databases at the critical juncture of creating an initial gene list of potentially relevant genes. The team quickly learned that the keywords determine the number and kinds of genes that are included in the gene library. For example, a very common diagnostic sign is not meeting developmental milestones in language development and interaction. Clinicians may write down ‘developmental delay’, ‘mental retardation’, or ‘intellectual disability’ to capture these symptoms on the requisition form. Each one of these terms produces an only partly overlapping gene list. 6 If you run the analysis based on only one of these terms, you may thus omit critical genes. Similarly, the search terms ‘muscle’ and ‘muscular’ provide different results.
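The effect of synonymous keywords on the gene library can be illustrated with a minimal sketch. The lookup table, the placeholder gene names, and the function name below are hypothetical and stand in for neither HGMD or OMIM content nor the team’s actual tooling; the sketch only shows that the union across synonyms can contain genes that any single term misses.

```python
from typing import Dict, Iterable, Set

# Hypothetical keyword-to-gene lookup. Gene names are placeholders,
# not real database associations.
KEYWORD_TO_GENES: Dict[str, Set[str]] = {
    "developmental delay": {"GENE_A", "GENE_B", "GENE_C"},
    "intellectual disability": {"GENE_B", "GENE_C", "GENE_D"},
    "mental retardation": {"GENE_C", "GENE_E"},
}


def gene_list(keywords: Iterable[str], lookup: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the gene sets retrieved for each keyword."""
    genes: Set[str] = set()
    for keyword in keywords:
        # Normalize case and whitespace; unknown keywords contribute nothing.
        genes |= lookup.get(keyword.lower().strip(), set())
    return genes


# A search on one term versus a search on all three synonyms.
single = gene_list(["developmental delay"], KEYWORD_TO_GENES)
merged = gene_list(KEYWORD_TO_GENES.keys(), KEYWORD_TO_GENES)

# Genes that the single-term search would have omitted.
missed = merged - single
print(sorted(missed))
```

In this toy table, searching on ‘developmental delay’ alone omits two of the five placeholder genes, which is the kind of gap the staff suspected when a primary gene list looked too short.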
In cases where the primary gene list was regarded as being too short based on the geneticists’ familiarity with genes or when synonymous terms produced different results, the staff became skeptical that working with the databases was a reliable way of creating the gene library. They suspected, for example, that ‘OMIM likes Pompe [disease] a lot’. And they knew that HGMD was unreliable and overly inclusive for de novo autism variants. 7 Their options for correction, however, are limited. They cannot go out and do a full literature search for each clinical indication to create a customized gene list, especially since many potentially pathological mutations are not even published. For certain conditions, the staff therefore checks gene libraries from commercial panel gene tests in the expectation that the gene lists are more precise and up-to-date. For other conditions with which the research team has local expertise, the staff created their own gene library. For example, the staff member working on disorders of sex development helped create a list of all genes involved in sex development. Still, these areas of expertise are spotty. For many symptoms and clinical complaints, the staff remains dependent on the content of the databases.
Basically, the team’s strategy here is to hope that the weaknesses and strengths of each database cancel each other out and give the best possible analysis of the current state of knowledge as represented in the database. The qualifier of ‘current state of knowledge’ is critical because the OMIM database, for example, is updated daily. An analysis done on a gene list from last week may be slightly changed by the week the results are available. In fact, the team has reanalyzed previously performed exome sequences and found additional results based on newly published journal articles. The team thus qualifies its results temporally and spatially: the results are the best we can do at this point in time and based on what is included in the main databases.
External pressure from insurance companies does not prompt the use of these standards. Instead, the pressure is logistical and market-driven. In order to scale up exome sequencing and meet a turnaround time that distinguishes the clinical laboratory from commercial competitors, the team feels no choice but to trust the databases. A full, customized review of the literature for every indication would be unfeasible and unreliable. Working with databases also suggests a new set of skills required from the exome sequencing team: understanding the logics, theories, philosophies, and politics of an entire database rather than the biological pathways of individual genes and variants. Senior team members convey these skills to novices through cautionary tales of almost missed phenotype–genotype matches during the meetings. The team may have a sense that the gene list seems too short, but in an era of diffused and dispersed data, the knowledge in the databases surpasses the collective knowledge from the team. As Bowker (2000) pointed out with respect to the convergence between biodiversity databases and the imagined and real biodiverse world, ‘the database itself will ultimately shape the world in its image’ (p. 675). Despite their inaccuracies and contradictions, OMIM and HGMD entries circumscribe the universe of relevant genotypes.
Filtering and matching
Considering that a single individual will have about 25,000 variants, 250–300 loss-of-function variants, and 50–100 variants associated with inherited disease (1000 Genomes Project Consortium et al., 2010), the challenge of interpreting exome findings lies in identifying the variants responsible for disease. ‘In theory, they could all be communicated to the patient and ordering clinicians, but doing this’, a laboratory director explained to visitors, ‘would spam them and render the results useless’. The team aims to reduce these data to the one or two genetic variants most likely implicated in the phenotype. The requisition form sets the parameters for the phenotype of interest, and the genetic databases demarcate the outer bounds of the range of genes of interest. The remaining issue now is to separate the signal from the noise, that is, the likely causal variants from the non-pathogenic polymorphisms. Even with the limited search criteria and the constraints of the databases, many potential allelic variants could cause the phenotype. The laboratory analyst will apply several filters to the findings to exclude variants from consideration. Each of these filters is its own standard, used in research and clinical settings for the purpose of determining the pathogenic potential of allelic variants. They again come with specific limitations and built-in assumptions about use and users. These standards derive their authority from research groups, funding agencies, and their widespread implementation.
The analyst filters the results based on the calculated allele frequency of the possible candidate variants. The rule is that if the variant is too common in a population, it is unlikely to explain any rare disease. In the case of dominant variants, a variant is disqualified as likely causative if it is found in more than 0.1 percent of the population. If too prevalent, the allele will not be available for final interpretation. The 0.1 percent rule is a blunt, consensus-based standard that presumes that all rare diseases are equally rare. The information about population frequency comes from the Exome Variant Server. This database contains the exomes of 2203 unrelated African Americans and 4300 unrelated European Americans. Here, ethnicity matters. Thus, in the case of a child adopted from China, the team was at a loss in interpreting the allele frequency. If the allele is highly prevalent in any community covered by the Exome Variant Server, the staff will downgrade the results. However, for people from ethnic groups not covered by this database, a population frequency is not revelatory. Population frequency and mutational burden thus insert an additional qualifier in the interpretation of the sequencing results: filtering is possible if the patient is from one of the ethnic groups represented in the databases. As long as the team subscribes to the ethnicity–genetics correlation, this is a limitation with few workarounds.
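The logic of the frequency filter, including its blind spot for patients outside the reference populations, can be sketched as follows. This is an illustration under stated assumptions, not the laboratory’s pipeline: the population labels ‘AA’ and ‘EA’, the function name, and the three return values are hypothetical, and only the 0.1 percent cutoff is taken from the description above.

```python
from typing import Dict, Optional, Tuple

# The 0.1 percent rule described above, expressed as a proportion.
MAX_POPULATION_FREQUENCY = 0.001

# Hypothetical labels for the two populations in the reference database.
COVERED_POPULATIONS: Tuple[str, ...] = ("AA", "EA")


def frequency_call(
    frequencies: Dict[str, float],
    patient_population: Optional[str],
) -> str:
    """Classify a variant as 'exclude', 'keep', or 'uninformative'.

    frequencies maps a covered population label to the allele frequency
    observed there; a population missing from the dict is treated as
    having no observations of the variant.
    """
    # Too common in any covered population: unlikely to explain a rare disease.
    if any(freq > MAX_POPULATION_FREQUENCY for freq in frequencies.values()):
        return "exclude"
    # Rare in the covered populations, but the patient belongs to none of
    # them, so the apparent rarity says little about this patient.
    if patient_population not in COVERED_POPULATIONS:
        return "uninformative"
    return "keep"


print(frequency_call({"AA": 0.02, "EA": 0.0}, "EA"))
print(frequency_call({"AA": 0.0002, "EA": 0.0005}, "EA"))
print(frequency_call({"AA": 0.0002, "EA": 0.0005}, None))
```

The third call corresponds to the adopted child in the example above: the variant passes the numerical cutoff, yet the cutoff carries no evidential weight because the reference data do not represent the patient.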
Various statistical algorithms will further reduce the number of genetic matches on technical aspects of exome sequencing: poor coverage of the sequence, low quality score, location of the variant in the gene, and repetitive sequences that are hard to map. Not meeting any of these technical criteria will render these alleles unavailable for matching phenotype and genotype. Very occasionally, someone will ask during a meeting why a certain gene was not shown in the final list of candidate genes, and the analyst will try to retrieve the reason for filtering. As in their use of databases, the staff simply trust that the filters work as intended, although they know that there are blind spots that may create false-negatives. For example, for trios, a filter will exclude any variant that is present in the proband and one of the unaffected parents. The staff realizes, however, that in genes with variable penetrance, a parent may be unaffected, but the child may show symptoms. Still, once they are filtered out, these genes can no longer be considered for matching phenotype and genotype. The final results are then contingent on assumptions embedded in the filter algorithms.
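A minimal sketch of the trio filter shows how the variable-penetrance blind spot arises and why a recorded reason matters when someone later asks why a gene is missing. The data structure, field names, and function below are hypothetical simplifications; an actual pipeline would also take zygosity and inheritance models into account.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical record of one variant's presence across a trio."""
    variant_id: str
    gene: str
    in_proband: bool
    in_mother: bool
    in_father: bool


def trio_filter(
    variants: Iterable[Variant],
    mother_affected: bool = False,
    father_affected: bool = False,
) -> Tuple[List[Variant], Dict[str, str]]:
    """Drop proband variants shared with an unaffected parent.

    Returns the retained variants and a log mapping each dropped
    variant_id to the reason it was filtered out.
    """
    kept: List[Variant] = []
    dropped: Dict[str, str] = {}
    for v in variants:
        shared_with_unaffected = (v.in_mother and not mother_affected) or (
            v.in_father and not father_affected
        )
        if v.in_proband and shared_with_unaffected:
            # Blind spot: with variable penetrance, an unaffected parent
            # can carry a variant that still causes symptoms in the child.
            dropped[v.variant_id] = f"{v.gene}: shared with an unaffected parent"
        else:
            kept.append(v)
    return kept, dropped


trio = [
    Variant("v1", "GENE_A", in_proband=True, in_mother=False, in_father=False),
    Variant("v2", "GENE_B", in_proband=True, in_mother=True, in_father=False),
]
kept, dropped = trio_filter(trio)
print([v.variant_id for v in kept], dropped)
```

Here the de novo variant v1 survives while v2 is removed because the unaffected mother carries it; once removed, v2 is unavailable for matching unless someone consults the log.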
Allele frequency and technical scores do not answer the real question under consideration: does the genetic variant cause pathology? If a particular variant has not previously been associated with disease, then there is no filter to answer this question authoritatively, but several measures offer some indication of pathology. With high-throughput sequencing methods generating countless single-nucleotide variants compared with the reference sequence, bioinformaticians have developed a number of measures that predict whether a missense mutation (amino acid substitution at the protein level) is pathological or not. The prediction models determine a likelihood that a given missense mutation affects the protein structure or function and then calculate whether the variant is pathological or neutral. Different measures vary in the properties of the variant they take into account, the nature of the classification method used for decision-making, and the calibration of the measure based on particular databases.
The analyst checks the pathology of the data with three predictive measures, SIFT, PolyPhen2, and CONDEL. 8 Although the issue of a variant’s pathology is critical for interpretation, the analyst does not actually filter the results based on these predictions but makes the information available to the team. The results are viewed as indicative rather than reliable. The team worries about false-positives and false-negatives.
These three predictive pathology filters are ignored because there is a viable workaround for their function: the collective judgment of the data board to match genotype and phenotype. Filtering by standards reduces the pool of candidate matches from thousands to a handful. At this point, customized interpretation takes front stage to create a causal match but again is done in dialogue with standards. Due to the broad phenotypical criteria, the analyst double-checks the clinical description of genetic variants as provided in HGMD, OMIM, and UniProt (a database of protein sequences) against the patient’s phenotype. The Excel document has columns for the short clinical text information drawn from HGMD and OMIM. If these columns are blank, the analyst manually checks these online databases for any recently added information. If no clear association with a human disease is identified, the gene is dismissed as ‘not a clinical gene’. If there is text in the column, the analyst might pull up the short clinical description. She does this either by projecting the clinical summary retrieved from the databases or by going into OMIM and projecting the full clinical description. At this point, the analyst turns to the room and asks whether they think this description matches the phenotype. In some exceptional ‘slam dunk’ cases, the genes are well known and the team feels comfortable reporting the match as pathogenic. For others in which the text seems to match the phenotype, the team does not take any further shortcuts but goes directly to the biomedical literature to retrieve the original articles that made the link between phenotype and genotype for this particular variant. They project the article on the screen and determine whether the molecular data fit the clinical picture, skimming the article for the phenotypical description of the patients and the presence of the specific variant.
Occasionally, the team is skeptical of the quality of the articles, but because there is a written record that makes a plausible link between the variant seen in this patient and the published literature, they tend to err on the side of reporting the results. 9
From the perspective of the literature on standards, we see that the staff attempts to overcome the limits of the HGMD and OMIM databases with customization. They do not customize prior to filtering but only take this extra step when the list of potential matches has been reduced to a few likely candidates. The team wants to be 100 percent certain that the results they report are backed up in the literature because the final report becomes a legal, self-sustained document. The team thus double-checks that the short clinical summary of the OMIM and HGMD databases reflects the intent of the article that formed the basis for the entry. Even then, customization is constrained by the databases. Although the staff remarked at one point that HGMD is a ‘crap shoot’ and that OMIM is neither ‘correct nor even accurate’, they still regularly typify mutations that come up through sequencing as ‘not an HGMD gene’ or a ‘known HGMD variant’. In the former case, the fact that the variant has not been reported in HGMD is sufficient to dismiss the variant and in the latter inclusion in HGMD may elevate the variant as a likely pathogenic case after the staff checks the original article. They remain dependent on the databases to bring the genes to their attention in the first place.
The filtering and matching process thus reveals another socio-political pressure point that makes the turn to standards preferable: a concern about liability. The staff is caught between the conflicting goals of making as many matches as possible across their cases and making the correct matches. The technology will fail as a diagnostic tool if its ability to clarify a diagnosis remains low, but the team will be in even greater trouble if they send out reports to clinicians with a match that will not hold up to legal scrutiny. Since the final report will at best only highlight a couple of allelic variants, the team anticipates that clinicians will check the literature to find out what is known about the gene. They therefore want to be as confident as possible of their own report. 10
Conclusion
Employing CES to elucidate a patient’s diagnosis depends on creating a workflow that links the exome sequencing technologies with standards (Petty and Heimer, 2011). The standards promise shortcuts to cumbersome tasks such as summarizing an entire patient file and conducting a physical examination to come up with a description of a phenotype, reviewing the full genetic literature to list all disease-relevant genes and variants, or calculating pathology predictions and robustness measures.
The staff realizes that taking shortcuts by relying on standards means that something may become lost in translation or that the genotype–phenotype link may become obscured. The dilemma facing users of standards is whether to follow the shortcut that a standard promises, to elaborate the standard, or, where it is preferable and possible, to replicate the work. The answer depends on the kind of standard and the alternative routes of action. Because the staff does not have faith in the trustworthiness of clinicians’ descriptions of symptoms on the requisition form, they decided that their best alternative is to duplicate the work of describing symptoms. Otherwise, incomplete description of clinical symptoms dooms exome sequencing from the start. Confronted with the inaccuracies of the HGMD and OMIM databases, however, the staff cannot afford to conduct a literature search for each possible phenotype. With an international literature spanning hundreds of journals and evolving daily, they agree that their results are limited by the current knowledge as represented in the databases and that sequencing at a later time may produce different results. The filters used by the analyst are considered proximate enough to exclude the least likely pathological candidates, although some candidate genes may be inappropriately excluded. At this point, when the list of candidate genes is greatly reduced, further matching depends on retrieving the primary literature and thus elaborating the standard databases. The substantive match between phenotype and genotype that exome sequencing achieves is then circumscribed by how the team makes the different standards work.
Porter (1996) presents trust in standards in scientific and technical contexts as an alternative to trust in face-to-face relationships typical of small communities; the switch to standards and quantification happened when objectivity was called into question due to internal divisions and external political pressures. Standards in exome sequencing are not only tools to create shortcuts but also convey a sense of objectivity during this technology’s introduction in the clinic. When exome sequencing results might determine a patient’s diagnosis, treatment plan, and prognosis, when sequencing might not only reflect back on a patient but also on all genetically related relatives, and when it might inform reproductive decision-making, the potential for litigation is real if the results are found wanting. At the same time, while rushing a potentially lucrative technology to the clinic in a competitive commercial environment, adherence to standards conveys credibility.
However, the opposition between trust in standards and trust in scientific communities is overly simplistic. One of the distinguishing features of this CES team is exactly the data board meeting where experts weigh in on the appropriateness of the standards, decide on workarounds, consider the representativeness of population-based databases, and interpret the biomedical literature. Indeed, standards set the parameters of the genotype–phenotype link, but they do not determine what will be reported to patients. At each step during sequence analysis and interpretation, there are countless interpretive decisions that can influence the final results, such as expanding the list of symptoms in ways that reflect the clinician’s intent, restricting or expanding the gene list, checking and applying technical measures in particular circumstances, and dismissing variants based on heritability or on the link between the HGMD and OMIM clinical description and this particular patient. Even more, much of the expertise required in exome sequencing resides in knowing whether the standard or database is trustworthy in any given instance.
In an era of big data and external governmental laboratory regulations, trust in standards is no longer an alternative to trust in face-to-face relationships; trust resides in how experts manage the limits of standards. This hybrid form of using standards can be called reflexive standardization. By taking us back to the time when standards first changed scientific practice, Porter’s analysis helps us to defamiliarize what is now ubiquitous. It would be inconceivable to do anything in contemporary genetics without engaging countless standards (see Busch, 2011). The result of the widespread diffusion of standards is that scientific expertise now involves the reflexive evaluation of the appropriateness of standards. Clinicians order exome sequencing from this team instead of its competitors because they trust that this team of academic researchers will bring the best available knowledge to bear on sequencing. Trust in standards presumes adherence to collectively created regulatory standards (Cambrosio et al., 2006, 2009), professional guidelines (Timmermans and Berg, 2003), and the current knowledge standards of the field. When challenged on why they did not make a match or did not look at other possible diagnoses, the team can pull out the requisition form and show what was ordered. They can also point to the databases and show what the given state of knowledge was at the time of their analysis. Trust in standards, then, should be more precisely understood as trust in experts’ appropriate use of standards.
The reflexive nature of working with standards becomes apparent when standards produce anomalous or unexpected results, raise questions, prompt a search of solutions, and, generally, recursively generate conversations, practices, and phenomena that feed back in the process of standardization. Reflexively using standards is particularly apparent during the early implementation stages of new technologies. With aspirations for a high-volume clinical service with a quicker turnaround time, the team intends to progressively rely on standards. When they reach the higher volume, customizing the work performed by standards for exome sequencing will no longer be possible and the staff will have to depend on the requisition forms, databases, and filter measures as self-sufficient standards. Indeed, even over the 3 years of observations reported on in this article, the requisition form and its repair work became more routinized. Discussions about the accuracy of the phenotype prevailed at the beginning of the observations, but second-guessing of the clinician’s intent became less frequent, although the staff continued to consider the descriptive variability of phenotypes a vexing problem. If the clinical information was deemed insufficient, the team did not sequence the exome until the clinician provided the requested information. In contrast, gaps in HGMD and OMIM have become more apparent as time passes, but repair possibilities remain limited.
In Stinchcombe’s (2001) conceptual framework, HGMD and OMIM are flawed standards because of their cognitive inadequacy for the task at hand: the databases are noisy, inaccurate, and do not cover the essential spectrum of required information. Stinchcombe points out that full accuracy is not always necessary, as long as others know how to fill in the details. With HGMD and OMIM, however, geneticists do not even know what they don’t know. They just realize that the databases are not completely reliable. However, a flawed standard is still more efficient than the alternative of canvassing the field on a case-by-case basis.
Exome sequencing is anticipated to be a transition technology to full genome sequencing, and thus, how exome sequencing makes the transition from the laboratory to the clinic through standards matters. The implementation of exome sequencing will likely constitute a platform onto which genome sequencing will be grafted, provoking even more concerns related to the treatment of big data. In exome sequencing, standards come not only with prescriptions for action but also with assumptions about what counts as a phenotype and a genotype. The phenotype, in the form of keywords, emerges as an abstraction out of a patient’s file and renders the patient equivalent to everyone else to whom those keywords may apply. The genotype used in sequencing reflects all the contingencies inherent in research, publishing, and curating at the thousands of different sites worked into the databases and filters. Making a match, then, requires evaluating whether the standardized phenotype and standardized genotype apply to the patient in question. Even in this final customization, there is no way outside standards: each match triangulates different standards. Within the context of CES, standards then constitute our phenotype, our genotype, and the match between them.
Footnotes
Acknowledgements
I thank Adam Hedgecoe, Daniel Navon, Sergio Sismondo, Mark Vardy, the reviewers, and audience members at the annual 4S conference in San Diego for helpful comments on this article.
Funding
The research was funded by National Science Foundation (NSF) grant SES-1256874 ‘Next Generation DNA Sequencing Technologies: The Communication of Test Results’.
