Mechanisms and Minimization of False Discovery of Metabolic Bioorthogonal Noncanonical Amino Acid Proteomics

Abstract

Metabolic proteomics has been widely used to characterize dynamic protein networks in many areas of biomedicine, including in the arena of tissue aging and rejuvenation. Bioorthogonal noncanonical amino acid tagging (BONCAT) is based on mutant methionine-tRNA synthases (MetRS) that incorporates metabolic tags, for example, azidonorleucine [ANL], into newly synthesized proteins. BONCAT revolutionizes metabolic proteomics, because mutant MetRS transgene allows one to identify cell type-specific proteomes in mixed biological environments. This is not possible with other methods, such as stable isotope labeling with amino acids in cell culture, isobaric tags for relative and absolute quantitation and tandem mass tags. At the same time, an inherent weakness of BONCAT is that after click chemistry-based enrichment, all identified proteins are assumed to have been metabolically tagged, but there is no confirmation in mass spectrometry data that only tagged proteins are detected. As we show here, such assumption is incorrect and accurate negative controls uncover a surprisingly high degree of false positives in BONCAT proteomics. We show not only how to reveal the false discovery and thus improve the accuracy of the analyses and conclusions but also approaches for avoiding it through minimizing nonspecific detection of biotin, biotin-independent direct detection of metabolic tags, and improvement of signal to noise ratio through machine learning algorithms.

Introduction

Metabolic proteomics revolutionized biomedicine, by enabling the shift from static analyses of proteomes to monitoring their dynamics. Stable isotope labeling combined with mass spectrometry (MS) analysis is a powerful technology, which allows one to precisely measure changes in protein levels over time and between different treatments.¹ However, they do not reveal the proteins that are produced just by one cell type in an organism, or the proteins that are produced by young animal in heterochronic parabiosis because the entire proteome is metabolically labeled.

Bioorthogonal noncanonical amino acid tagging (BONCAT) enables such cell-fate or age-specific detection of proteomes in mixed biological environments and at a specific experimental time. Moreover, coupling the downstream technologies, BONCAT MS and BONCAT antibody arrays have demonstrated the potential for generating important data on cell-specific and age-specific changes in de-novo synthesized proteomes.^2,3

BONCAT revolutionizes metabolic proteomics, because mutant methionine-tRNA synthases (MetRS) transgene allows one to identify cell-type-specific proteomes in mixed biological environments.⁴ This is not possible with other methods, such as stable isotope labeling with amino acids in cell culture, isobaric tags for relative and absolute quantitation, and tandem mass tags.^5,6 In BONCAT-MS, the tagged proteome is purified with clickable dibenzocyclooctyne (DBCO) beads or by biotin-affinity, after azide-alkyne ligation (“click chemistry”) between biotin and the amino acid analog's (azidonorleucine [ANL]/azidohomoalanine [AHA]/homopropargylglycine) azide moiety.^3,7,8

BONCAT-MS has been used to identify the cell-type-specific proteins, exemplified by the proteome of hippocampal excitatory neuron, as well as protein determinants of neural regeneration, spatial memory formation and age-specific proteomes in settings of heterochronic parabiosis.^9

–14 Alternative to MS methods developed by us and recently adapted by others are antibody arrays that use a biotin Click instead of protein biotinylation.^2,15,16

It has been assumed that all proteins after affinity purification are tagged with the noncanonical amino acids. However, there is no confirmation in the BONCAT-MS data that only tagged proteins are detected, and it is not feasible to profile ANL/AHA tagged proteins directly, because the salt-adducts of methionine make its molecular weight very similar to the tag (Supplementary Fig. S1a).¹⁴ In MS, the negative controls of ANL administered to wild-type (WT) mice or cells are lacking and the results do not clearly discern the degree of nonspecific binding of unlabeled proteins to the DBCO, or avidin, etc. affinity columns.^9,17,18

An unknown degree of false positives complicates the analysis, reducing reproducibility and introducing potentiating erroneous conclusions and interpretations. The approaches to enhance the accuracy of BONCAT-MS results start from statistical analysis and algorithmic improvement that assume that the data themselves are on the metabolically tagged proteins, for example, ignoring the possibility that the results are already contaminated by the noise of untagged proteins.^19
–21

In this work, we focused on revealing and minimizing the false discovery of BONCAT-MS and BONCAT-antibody array. Our results uncover and characterize problematic false positives of BONCAT-MS, for example, broad classes of proteins could be misinterpreted as being tagged and show that this false discovery rate is not affected by protein molecular weight or methionine content, but that protein abundance might increase it. Our data show that there are also false positives in BONCAT-antibody arrays, which are due to physiologically biotinylated proteins and can be reduced with biochemical approaches.

Moreover, we describe the antibody array application of biotin-independent direct detection of ANL/AHA-tagged proteome (termed Cy3-direct BONCAT), which eliminates the false positives and greatly increases the resolution, and dynamic range of the metabolic proteomics. Finally, we developed a novel computational image segmentation platform, which reduces the artifacts and increases the sensitivity of comparative proteomics. Together these technologies enable accurate studies with capacity to distinguish even subtle proteome changes, as needed for early diagnosis of pathology, the detection of age-imposed, genetically, or environmentally induced alterations, and for monitoring responses to treatments in clinic or to various experimental conditions in the laboratory.

Materials and Methods

Mouse genotyping

All procedures were performed in accordance with the administrative panel of the Office of Laboratory Animal Care. The protocol was approved by the UC Berkeley Animal Care and Use Committee. Genotyping was performed as previously described.² Genomic DNA from ear clips was extracted by using the digestion buffer (100 mM Tris-HCl (pH 8.5), 5 mM ethylenediaminetetraacetic acid (EDTA), 0.2% sodium dodecyl sulphate (SDS), 200 mM NaCl, 100 μg/mL proteinase K) for 1 hour at 95°C. Then DNA samples were precipitated with isopropanol and then dissolved in Tris-EDTA buffer.

PCR conditions consisted of denaturing at 95°C for 5 minutes; 30 cycles at 95°C for 30 seconds, 60°C for 30 seconds, and 72°C for 30 seconds; with a final extension of 72°C for 5 minutes. PCR products and DNA Ladder (N0467S; New England BioLabs, Ipswich, MA) were separated on an 1.5% agarose gel stained with ethidium bromide for 50 minutes at 100 V. Gel images were acquired using a ChemiDoc™ XRS imaging system (Bio-Rad Laboratories, Hercules, CA). Primer synthesis was by Elim Biopharmaceuticals Inc. (Hayward, CA).

Click-Western blot

To identify the ANL-labeled proteins, Click-western blotting was performed. ANL was purchased from Jena Biosciences (Cat. No. CLK-AA009). ANL was intraperitoneal (IP) injected into mice at the indicated dose. Blood samples were collected by heart puncture. The blood was allowed to clot for 30 minutes at room temperature before centrifugation at 5000 g for 5 minutes, then blood serum was aliquoted and stored at −80°C. For in vitro experiment, Primary mouse tail fibroblasts were derived from MetRS^L274G mice or C57/B6 mice. In brief, tail snips were incubated for 0.5 minutes in 70% ethanol, then minced and digested with collagenase for 2 hours at 37°C. The tissue was filtered through a 100 μm cell strainer (BD Falcon™ 352360) and washed with Dulbecco's modified Eagle's medium (DMEM) (D5796; Sigma-Aldrich).

The cell pellets were resuspended, cultured, and expanded in DMEM containing 10% Bovine Growth Serum (BGS), 1% penicillin-streptomycin (PS) at 37°C with 5% carbon dioxide (CO₂), and cells were passaged with 0.05% trypsin when culture reaches around 70%–80% confluency (days 6–7 of culture). The second passage of fibroblasts was used in this experiment. The cells were treated with 2 mM ANL (6-zaido-L-lycine HCL; Cat. No. CLK-AA009-500; Jena Bioscience GmbH) for 2 days, and the medium was refreshed daily.

Serum and fibroblasts were clicked with biotin labeled alkyne according to manufacturer's protocol (Click-iT^® Protein Reaction Buffer Kit, Molecular Probes C10276, Thermo Fisher Scientific). The Click-iT reaction was carried out for 20 minutes at room temperature and then the proteins were precipitated from the reaction mixture by methanol/chloroform following the manufacturer's instruction.

After air dry, the resulting pellet was solubilized in Laemmli buffer and boiled at 100°C for 10 minutes, then separated through SDS-polyacrylamide gel electrophoresis (PAGE) on 4%–20% Mini-PROTEAN TGX Precast Gels (Cat. No. 456-1095; Bio-Rad). Transferred polyvinylidene difluoride membranes were blocked with AdvanBlock™-PF Protein-Free Blocking solution (R-03023-D20; Advansta, San Jose, CA) and Peroxidase Labeled Streptavidin (474-3000; KPL, Gaithersburg, MD) were applied, bands were visualized using the WesternBright™ ECL (Cat. No. K-12045-D50; Advansta) following manufacturer's protocols.

BONCAT-mass spectrometry

For in vivo experiment, ANL was I.P. injected into MetRS mice (6 months old) or C57/B6 mice (6 months old) for 7 days at 0.2 mmol/kg. Protein extraction was carried out by homogenizing liver samples in cell lysis buffer (Cat. No. 90408; Thermo Scientific) with Protease Inhibitor (Halt™ Protease and Phosphatase Inhibitor Cocktail), then different methods (Methods 1–4, Supplementary Data) were used for protein purification, reduction, alkylation, and digestion.

For in vitro experiments, primary mouse tail fibroblasts were derived from MetRS^L274G mice or C57/B6 mice. In brief, tail snips were incubated for 0.5 minutes in 70% ethanol, then minced and digested with collagenase for 2 hours at 37°C. The tissue was filtered through a 100 μm cell strainer (BD Falcon 352360) and washed with DMEM (D5796; Sigma-Aldrich). The cell pellets were resuspended, cultured, and expanded in DMEM containing 10% BGS, 1% PS at 37°C with 5% CO₂, and cells passaged at 70%–80% confluency with 0.05% trypsin. The cells were treated with 2 mM ANL (6-zaido-L-lycine HCL; Cat. No. CLK-AA009-500; Jena Bioscience GmbH) for 3 days, and the medium was refreshed daily.

Protein extraction was carried out by homogenizing cells in cell lysis buffer (Cat. No. 1863073; Thermo Scientific) with Protease Inhibitor (Halt Protease and Phosphatase Inhibitor Cocktail), then 10 mM dithiothreitol (DTT; SC-29089; Santa Cruz Biotechnology) was added to 1 mL homogenates and kept at room temperature for 20 minutes, following with 100 mM chloroacetamide (Cat. No. C0267; Sigma-Aldrich) at room temperature for 30 minutes in the dark. Twenty-four micromolar sulfo-DBCO-biotin (Cat. No. 760706; Sigma-Aldrich) was added to the lysates with 15 minutes shaking at room temperature. Then 100 mM ANL was added to quench the reaction for 30 minutes at room temperature.

The biotinylated protein samples were incubated with Streptavidin Magnetic Beads (Cat. No. S1420S; New England BioLabs) at room temperature for 1 hour with mixing. Beads were collected with a magnetic stand, then beads were washed five times with IP-MS wash buffer (Cat. Nos. 1863056 and 1863058; Thermo Scientific), after elution, the samples were dried in a speed vacuum concentrator. Then 100 μL 6 M freshly made urea solution was added to the sample and vortex to denature protein, followed by adding 2 μL 500 mM DTT for 30 minutes at 60°C. Solution was cooled before adding 6 μL 500 mM iodoacetamide (IAA) for 45 minutes in the dark at room temperature. For digestion, 100 ng trypsin/LysC (V5073; Promega) was added and incubated at 37°C overnight.

Mass spectral data were acquired by the QB3/Chemistry Mass Spectrometry Facility at the University of California, Berkeley.

Data analysis of BONCAT-MS

Five different previously published methods of BONCAT MS were tested: Methods 1–4 compared proteins found in vivo in ANL-treated B6 mice with those found in ANL-treated MetRS mice and Method 5 compared proteins found in vitro in ANL-treated B6 fibroblasts with those found in ANL-treated MetRS fibroblasts. The ANL-treated C57/B6 mice and fibroblasts are referred to as the negative control and the ANL-treated MetRS mice and fibroblasts are referred to as the experimental group. Every protein was analyzed in six replicates. The Normal Spectral Count (NSC) values of each replicate were given. The p-value of the statistical difference between the NSC values of the experimental group and the negative control group were calculated using a two-tailed heteroscedastic t-test.

For each protein analyzed, the mean was taken of the NSC values of the experimental group to yield an Average Spectral Count Value (ASCV) of the experimental group. The same was done with the negative control's NSC values. Proteins with a negative control ASCV equal to 0, an experimental group ASCV >0, and p < 0.05 (and separately, p < 0.01) were assigned to the experimental group in Venn diagrams. Proteins with a negative control ASCV >0, an experimental group ASCV equal to 0, and p < 0.05 (and separately, p < 0.01) were assigned to the false positive group in Venn diagrams. Proteins that were found in both the negative control and the experimental groups with above determinations of p < 0.05 (and separately, p < 0.01) were shown in the Venn diagrams’ overlap.

The overlapping proteins (between the negative control and the experimental BONCAT MS) were subsequently analyzed, as described below. It is important to note that the magnitude of false-positive discovery in BONCAT MS supersedes just these overlapping protein pools, for example, there were hundreds to thousands false positive putatively ANL-tagged proteins in the negative control C57/B6 tissues and cells.

In the false-positive group with Venn diagrams overlays between the MetRS^L274G and C57.B6, we found 162 peptides that correspond to 118 genes; these contributed to the false discovery in all methods of BONCAT tag-enrichment before MS. Gene Ontology (GO) open-source Enrichment Tool was used to determine the Biological Processes ascribed to these 118 genes, with the 10 largest Fold Enrichment values being generated using the Python data science library. The Molecular Function and the Cellular Component were also found for these common false positives.

To examine whether there is a preference in false-positive discovery for larger proteins and/or those with more methionine residues, 162 peptides were profiled through the UniProt database and by FASTA for their molecular weights and methionine content (as a percent of the amino acid chain). The FASTA sequences of these proteins were inputted into BlastKOALA for Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. The molecular weight and methionine content were also compared between the 162 common false-positive proteins and all BONCAT MS-yielded proteins by UniProtKB, using an iterative algorithm. The molecular weight and methionine content data for all proteins were aggregated and compared to the molecular weight and methionine content data of the common false-positive 162 proteins, using histograms (Molecular weight—Fig. 1d, Methionine content—Fig. 1e).

FIG. 1.

False-positive discovery of BONCAT MS. (a) The schematics of the negative control of ANL-tagging of the proteome in vivo, and detection by BONCAT MS in C57.B6 tissues that do not express MetRS^L274G. (b) Hundreds to thousands of false positives are revealed (p < 0.05 MS threshold), using various published methods of affinity-purification of metabolically tagged proteins and subsequent BONCAT MS. (c) Click western on MetRS^L274G and C57.B6 fibroblasts that were tagged with ANL. Clear signal to noise is detected, with some expected background in the C57.B6 negative control cells. (d) Histograms of molecular weight and the methionine content for the 162 false-positive proteins that are shared between the published Methods 1–5, and all false positives (p < 0.01). (e) Stacked bar chart of false-positive proteins in each method and the 162 shared false-positive proteins. Percent of the top 460 most abundant mouse proteins—yellow, percent of the top 460 most abundant mouse liver proteins—orange, other proteins—blue (***p < 0.01). ANL, azidonorleucine; BONCAT, bioorthogonal noncanonical amino acid tagging; DBCO, dibenzocyclooctyne; HRP, horseradish peroxidase; MetRS, methionine-tRNA synthases; MS, mass spectrometry.

Finally, we analyzed the degree of false-positive discovery that comes from tissue abundant proteins. The liver protein database from PAXdb lists the 460 proteins most abundant in the liver. For Methods 1–5, we determined how many of the false-positive proteins of each method were present in the top 460 most abundant liver proteins; and this process was repeated with the 162 commonly found false-positive proteins (Fig. 1f).

BONCAT antibody array

Antibody arrays (Mouse L308 Array; RayBiotech AAM-BLG-1-2) were used to profile the proteins circulating from mouse samples. The samples were run on three different arrays. The experiment was performed by following by manufacturer's protocol with modifications below. The mean values for each of the 308 proteins were compared between the mutant and the WT results for all proteins; and those found to be elevated by twofold in the mutant with p < 0.05 were considered to be significant.

Heart tissue was clashed using grinder using liquid Nitrogen for 3 minutes. Then tissue was homogenized from heart tissues with Lysis buffer (50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 1% Triton X-100, 0.5% Sodium Deoxycholate, with protease inhibitor) using grinder. The protein extracted was homogenized using homogenizer for 2 minutes. Proteins (100 μg) were clicked with biotin labeled alkyne as indicated above with fresh Click reagents.

After the methanol precipitation, precipitated protein was dissolved in Lysis buffer with 0.2% TritonX-100 and 3M Urea (Protein solution 1). After centrifuge, the protein Ppt was again dissolved with Formic acid and this solution was mixed with protein solution 1. The total protein solution was dissolved again with sonication completely. The bicinchoninic acid assay analysis was used to quantify the clicked product. These protein solutions were dialyzed with 1 × phosphate buffered saline with 0.2% Triton X-100 and 0.1% Glycerol for overnight. After dialysis, the protein solution was again solubilized using sonication and also pH was adjusted to pH 8.5 with 2M Tris to dissolve the precipitation.

The protein solution was analyzed using Click western blot and after confirmation of the Click reaction, the arrays were blocked with blocking solution overnight. Array was incubated with clicked protein (33 μg), washed and detected with Cy3 labeled streptavidin (SA), as recommended by the manufacturer. Arrays were scanned using genePix 4000B scanner (Molecular Devices, Sunnyvale, CA) at 532 nm.

Feature extraction was done using genePixpro6.1 software (Molecular Devices). The normalization method Opening of GenePix software was used to subtract the fluorescence background, after which, for each protein, duplicate spots on each array were averaged, local background fluorescence was subtracted, and resulting fluorescence signal was normalized by the internal positive controls on the arrays. Average signals from all 4 arrays (4 experiments for positive and negative samples) were set to 1.

Molecular weight and methionine content analysis

Text files containing the primary data were processed using Microsoft Excel. Each of the 644 spots on the array slide represents an antibody ligand site, and each antibody has a duplicate, summing up to a total of 308 antibody sites that bind uniquely to a protein. First, each protein value was divided by the geometric mean of the Total Intensity of Array-built negative controls. Then, these values were averaged by their array duplicates and further normalized, as geometric mean of Array-built Positive 2 values. These normalized by assay internal negative and positive controls data were then used for comparing the protein expression levels between the cohorts.

We selected the proteins with a normalized fold change >2 and a p-value of <0.01 from the two-tailed Student's t-test, and looked up their molecular weights and amino acid sequences on UniProt database. With this information, we used GraphPad Prism (version 8.4.3, San Diego), normalized the molecular weight, and calculated methionine content level separately by setting the maximum value as 100% and the minimum value as 0%, and presented in frequency histograms and pie diagrams.

Streptavidin magnetic beads treatment

Streptavidin magnetic beads (S1420S; New England Biolabs) were washed by lysis buffer for three times and then after the removal of buffer, protein lysis solution was mixed together and rotated for 1 hour at room temperature. After purification of biotin labeled proteins, using magnetic stands, the SA beads were removed. For isolation of biotin-labeled proteins, beads were boiled with 2 × SDS-sample buffer for 5 min for 95°C.

In-gel detection of newly synthesized cell proteome

To verify the feasibility of Cy3-direct method, we detected ANL-labeled proteins by conjugation to Cy3-Alkyne and subsequent SDS/PAGE–in-gel fluorescence scanning. MetRS^L274G or C57BL/6 mouse myoblast were treated with 2 mM ANL for 4 days. Cells were washed with PBS for three times and lysed with radioimmunoprecipitation assay (RIPA) buffer and protease inhibitor cocktail (Thermo Fisher Scientific). Lysates collected were analyzed for protein concentration by BCA assay. Then, a click chemistry reaction was performed using the Click-iT Protein Reaction Buffer Kit (C10276; Thermo Fisher Scientific) to bind the protein with Cy3-alkyne following manufacturer's instructions. After separation by SDS-PAGE, gel images were acquired using Typhoon FLA 9500 (GE Healthcare).

Cy3-direct antibody array

Myoblasts were treated/untreated with 2 mM ANL for 4 days. The cells were lysed with RIPA buffer. Then, cell lysate was reacted with biotin-alkyne using the Click-iT Protein Reaction Buffer Kit (Cat. No. C10276; Thermo Fisher Scientific) following the manufacturer's protocol, and dialysis against PBS buffer were performed twice afterward. The array slides were blocked in blocking buffer at room temperature for 1 hour. After aspirating the blocking buffer, the sample is placed on each subarray and incubated at 4°C for 16 hours with gentle shaking. The array slides were then washed with washing buffer.

After washing, the slides were dried in the hood for 1 hour and then the signals were detected using genePix 4000B scanner (Molecular Devices) at 532 nm. To show the positive spots in the array slides after washing, 1× Cy3-conjugated SA supplied in the Antibody Array kit was added to each subarray and incubated at room temperature for 2 hours with gentle shaking. After washing, the slides were dried in the hood for 1 hour and then the signals were detected using genePix 4000B scanner (Molecular Devices) at 532 nm.

K-means multilevel thresholding/Otsu thresholding

Otsu's method for image thresholding has the objective of minimizing intraclass variance. This can be expressed mathematically as minimizing the variance contained within the filters (cluster centers). $σ^{2} \cdot ω (t) = ω_{0} (t) \cdot σ_{0}^{2} (t) + ω_{1} (t) \cdot σ_{1}^{2} (t)$

ω₀ and ω₁ are the probabilities of 2 classes separated by threshold t. σ₀ and σ₁ are the variances of these 2 classes.

Objective of k-means (minimize J):

Supplementary Figure S8c shows the raw images parsed with just the green channel, and Supplementary Figure S8d shows the sample output of the k-means intensity threshold segmentation.²²

Machine learning applied to image segmentation filters (trainable segmentation)

First, we will discuss the procedure used to process the raw antibody array image.

Top and left padding is automatically computed based on the distance from the top and bottom to the first significant well (which is defined after k-means quantization).

Around 644 40 × 40 patches are obtained by applying a 40 × 40 identity kernel with a stride of 40.

For each 40 × 40 patch, apply standard image segmentation filters, such as scale-invariant feature transform, to compute feature the image.

Split the entire dataset into 80/20 for training/test (with random shuffling).

If doing trainable segmentation: Flatten each 40 × 40 patch into a 1600-size 1-dimension vector, and apply a classification algorithm (e.g., random forest) to classify whether each pixel is considered signal or background.

Using Python-skimage, pixelwise features are computed, such as Gaussian blurring, Sobel filters, Hessian, and Difference of Gaussians.

These pixelwise features are then used to train a vanilla classifier, such as random forest (50 trees, depth of 2). In this random forest classifier, we seek to minimize the gini impurity between our labels (from the ground truth mask) and projected inference in each tree of the forest. The sample output of the trainable segmentation pipeline is shown in Supplementary Figure S8e.

Deep convolutional network (U-Net architecture inspired)

Please refer to Supplementary Figure S8b and e. A deep convolutional network with residual connections was used to generate a pixel-by-pixel mask.

Results

Well-controlled BONCAT MS uncovers a large number of false positives

The most popular methods for profiling ANL, AHA, and so on, methionine analog-tagged proteins are based on biotin-SA (or similar high affinity) interactions following click chemistry tagging of biotin to ANL, AHA, and other noncanonical amino acids in polypeptides.^11,23 After this purification, the protein pool is digested and subjected to MS, assuming that all the recovered peptides are metabolically labeled, as there is no independent confirmation in the MS step that the profiling is performed only on the ANL- or AHA-tagged proteome.

To test if there could be unlabeled proteome detected by this method, we used the appropriate negative control: ANL that was administered to C57.B6 mice or their derived cells, where there is no MetRS^L274G and no capacity to incorporate ANL, and all tested current methods of affinity purification (DBCO-based and biotin-based) yielded high levels of false positives by MS (Fig. 1a, b). Click western blotting comparing homozygous transgenic MetRS^L274G and C57.B6 cells (Fig. 1c) and mice (Supplementary Fig. S1b–d) that were treated with ANL confirmed a good signal to noise ratio (MetRS^L274G to C57.B6).

Thus, the problem of false-positive discovery does not stem from the efficiency of ANL-tagging, but from the enrichment and downstream MS steps, for example, affinity enrichment and an erroneous assumption that all column-eluted proteins are metabolically tagged. In this regard, a nonspecific background in Click westerns is typically detected (Fig. 1c; Supplementary Fig. S1) in agreement with the recorded false positives from biotin-clicked enriched MS analysis.^2,12

We next performed bioinformatics and computational analysis of the false-positive proteins using five reported BONCAT MS methods. We identified several hundreds to several thousands of false-positive proteins that were present in all five methods. While the false-positive proteins varied, we found 162 specific proteins that were determined to be false positive in all the tagged proteome purification methods. GO and KEGG analysis demonstrated that the false-positive proteins for each method tended to be involved in genetic information processing, cellular processes, environmental information processes, and carbohydrate metabolism (Supplementary Fig. S2), in other words, a variety of functions.

To examine whether there is a preference in false-positive discovery for larger or smaller proteins or those with more or less methionine residues, the molecular weights and methionine content (Fig. 1d) of the false positives were analyzed. It was found that the average molecular weights of the 162 shared false-positive proteins were lower than the molecular weights of all false-positive proteins, with a significance of p < 0.01. There was no statistically significant difference between the methionine content of the 162 shared and all false-positive proteins.

Finally, we looked for a correlation between false-positive discovery and protein abundance in a given tissue. Namely, for Methods 1–5, we determined how many of the false positives (total and the shared 162) were in the top 460 most abundant overall and most abundant liver proteins (PAXdb) (Fig. 1e). Of the 162 shared false-positive proteins, 20.34% were in the top 460 most abundant overall mouse proteins compared to 5.65%, 6.22%, 9.10%, 8.77%, and 12.68% for Methods 1, 2, 3, 4, and 5, respectively, with a statistical significance of p < 0.01. Of the 162 shared false-positive proteins, 37.29% were in the top 460 most abundant mouse liver proteins compared to 10.14%, 10.52%, 18.09%, 25.23%, and 22.46% for Methods 1, 2, 3, 4, and 5, respectively, with a statistical significance of p < 0.01.

The 162 proteins that are false positives for all 5 methods are significantly more abundant than the overall false positives, both in terms of the whole mouse proteome and the mouse liver proteome. In general, therefore, and perhaps not surprisingly, the more abundant proteins are more likely to be seen as false positive.

Summarily, biotin purification-based MS yields a large fraction of false-positive proteins of variable molecular weight and methionine content, and which have a variety of protein functions in different cell processes.

False positives of BONCAT antibody arrays

The detection of specific proteins-of-interest is more straightforward and needs much less starting sample with antibody arrays than MS. We were the first to adopt antibody arrays for metabolic proteomics in the ANL-MetRS^L274G system, and reported statistically significant differences between ANL-tagging the C57.B6 strain (the negative control that reveals the false positives), and ANL-tagging the MetRS^L274G transgenic strain (the experimental data).² Here, we rigorously investigated the false discovery of metabolic antibody arrays in both AHA and ANL labeling set-ups.

Several negative controls were applied to the BONCAT arrays: Hank's balanced salt solution (HBSS), FPLC-purified mouse serum albumin, lysates of C57.B6 primary myoblasts treated with ANL, different protein concentrations, and Click and no-Click detection. Interestingly, each of these controls demonstrated false-positive detection, even though C57.B6 does not express MetRS^L274G and is not capable to incorporate ANL during translation; some of the controls had no tag at all (Fig. 2).

FIG. 2.

False-positive discovery of the BONCAT antibody arrays. (a) Representative images of BONCAT antibody array for the negative controls: HBSS, Mouse Albumin, C57.B6 primary myoblasts treated with ANL (with and without Click reaction). False positives become more visible with more protein lysate being added to the arrays (30 μg) and with increased brightness/contrast. Array-built negative controls (ellipses); array built-in positive controls are shown in rectangles. (b) Frequency histograms of the molecular weight and methionine content of the false positives that are defined as <2-fold over the array-built negative controls. The computational comparison established that proteins with a molecular weight 10–50 kDa are 66.67% of the total 18 false positives, 60–100 kDa—27.78%, and 110 kDa and above are 5.55%. The methionine content frequency distribution was found to be roughly symmetric, with a peak centered at 1.5% methionine in the encoded amino acid sequences of the false positives: 0.0%–1.5% to be 61.11%, 2.0%–3.0% to be 33.33%, and above 3.0% to be 5.56%. Protein information was compiled from UniProt database (The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res 2021;49:D480–D489). HBSS, Hank's balanced salt solution.

The false positives were defined as twofold SA Cy-3 signal intensity over the array-built negative control, the same cutoff for the data in the antibody array experiments.^24
–26 Eighteen false positives out of 308 array proteins were found in 7 independent control experiments (no-tag and C57.B6+ANL). The number of false positives varied between individual control experiments (Fig. 2; Supplementary Fig. S3).

There were false positives that appeared in all negative controls, including HBSS (FS-7-associated surface protein; Matrix metalloproteinase-24 protein; Fig. 2) apparently array-specific artifacts. Not only the prevalence but also the identity of the false positives was variable between the experiments, and as expected, more was seen with increased photomultiplier tube (PMT) settings of the laser slide scanner and or increasing the brightness of acquired images (Fig. 2; Supplementary Fig. S3). Proteins with a molecular weight between 10 and 50 kDa contributed to 66.67% of the false positive, 60–100 kDa to 27.78%, and 110 kDa and above to only 5.55% (Fig. 2b). The methionine content frequency distribution was found to be roughly symmetric, with peak centered at 1.5% (Fig. 2b).

These results demonstrate a ∼5% false-positive discovery rate of antibody arrays, which is similar to typical optical protein profiling methods, fluorescence-activated cell sorting and immunofluorescence microscopy, for example.

Mechanism of the false positives of BONCAT antibody arrays

The false positivity of the antibody arrays is surprising, because the samples are applied to the arrays directly (no biotin enrichment columns), and only AHA or ANL residues are capable of linking biotin through Click chemistry (total protein biotinylation is omitted in the BONCAT array), and only Clicked biotin can be detected by SA-Cy3. So how was the false-positive signal generated?

We postulated that antibody arrays detect physiologic naturally biotinylated proteins, which reacts with the SA-Cy3 and are misinterpreted as experimentally biotinylated or AHA/ANL tagged.²⁷ The variable presence of these proteins in different samples might account for the variability of the false-positive detection (Fig. 3a). Removal of the naturally biotinylated proteins with SA-beads before the Click of biotin to AHA or ANL residues should reduce this false-positive background (Fig. 3a).

FIG. 3.

False positives of the biotin-based BONCAT are largely due to the presence of naturally biotinylated proteins. (a) A schematic of naturally biotinylated protein contribution to false positives, and removal. (b) Western blot images and quantification of the pixel densities that are normalized by protein loading, of natural biotinylated proteins from brain, heart, liver, and blood serum. Variable from sample-to-sample levels of naturally biotinylated proteins are detected. (c) Click westerns on C57.B6 and MetRS^L274G cells treated with ANL; the nonspecific signal is reduced by removal of natural biotinylated proteins with SA beads, and reduction of nonspecific click reactions with IAA. (d) Representative BONCAT antibody array images of heart tissue lysates from C57.B6 and MetRS^L274G mice that were treated (or not) with AHA/ANL in vivo, as indicated. Treatment of the lysates with SA beads before the Click visibly reduced the false positives. (e) Heat maps of the 308 proteins that were profiled in heart tissue in 15–16 independent BONCAT antibody array experiments (performed in duplicates for each protein). Removal of the false positives, naturally biotinylated proteins (SA), increases the signal to noise and gives more robust detection of BONCAT labeled proteins, p < 0.05 for each protein in each comparison (tagged vs. untagged, ±SA). AHA, azidohomoalanine; IAA, iodoacetamide; SA, streptavidin.

To characterize the naturally biotinylated proteome, we used Streptavidin magnetic beads to enrich for these proteins in tissue lysates that were derived from C57.B6 brain, heart, liver, and blood serum. Moreover, young and old animals were compared in their naturally biotinylated proteomes, considering our long-standing scientific interest in comparative age-specific protein analyses.^28
–30

Western blotting on the purified naturally biotinylated proteins were performed, using horseradish peroxidase-SA detection, without biotinylation or Click chemistry (Fig. 3b). The physiologic biotinylated proteome was detected in all tissues studied; the levels of naturally biotinylated proteins were variable from sample-to-sample but were elevated in the brains of old mice, compared to young (Fig. 3b).

Click westerns on ANL tagged proteins have notorious nonspecific band(s), the identity of which was not previously explained.^2,9 In agreement with our hypothesis, such nonspecific bands were diminished after removal of naturally biotinylated proteins with streptavidin magnetic beads, and even more so by also adding IAA, which blocks nonspecific Click reactions at cysteine thiols (Fig. 3c).^31,32

Moreover, removal of the naturally biotinylated proteins with SA beads significantly reduced the false positives in the in vivo tagged proteome of MetRS^L274G and C57.B6 mice. Multiple independent experiments with AHA/ANL tagging of MetRS^L274G and C57.B6 mice and profiling the de novo synthesized heart tissue proteome demonstrated that removal of naturally biotinylated proteins before Click greatly reduces false positives and background noise in the no-tag and C57.B6+ANL negative controls (Fig. 3d, e). As expected, out of the 308 profiled proteins, only some were expressed and tagged during the AHA/ANL treatment.

In this regard, we focused BONCAT arrays and the characterization of false positives on the heart, because of the highly efficient metabolic labeling of this tissue, compared with blood, brain, skeletal muscle, and liver (Supplementary Fig. S3 and previous publication).²

As expected, removal of naturally biotinylated proteins resulted in a clearer pattern of proteins that are resolved by the BONCAT arrays and reduced experimental noise (Fig. 3a, d, e). These results also confirmed that AHA incorporation into the newly synthesized proteins by the WT MetRS in C57.B6 mice is more robust than ANL tagging by the MetRS^L274G (Fig. 3d, e).

Interestingly, while in the more robust AHA method, the removal of naturally biotinylated proteome reduced the number of hits, in the weaker orthogonal MetRS^L274G+ANL tagging system, the number of de novo synthesized proteins that are detected became increased when the background was reduced, due to a better signal to noise resolution (Fig. 3e).

in addition, the resolution of the antibody arrays was improved and artificial variability between the samples was reduced, through improving the methods of sample preparation, such as enhancing the solubility of proteins with formic acid, urea, pH adjustment, and sonication (Supplementary Fig. S4). It was found to be prudent to test the starting sample by Click westerns for protein quantity, quality, and tagging efficiency before applying to the arrays (Supplementary Fig. S4).

These data provide an explanation for false-positive discovery in the antibody arrays, suggest that varying sample levels, tissue levels, and age-specific abundance of a naturally biotinylated proteome might introduce variation in the data, and provide chemical and biochemical steps for improving the quality of the starting samples. Finally, these results compare the side-by-side BONCAT efficiency between the AHA-C57.B6 and ANL-MetRS^L274G.

Cy3-direct BONCAT arrays eliminate false positives and improve resolution of comparative proteomics

In published approaches, biotin-tagged proteins are purified on their SA-affinity, or bind to cognate antibodies on the arrays, which is followed by their detection with MS and/or SA-conjugated fluorescent dye on the Arrays.^{2,4,9,16,33,34} As established above, these methods have inherent false positives stemming from natural biotinylation, which is not completely solved by removing the physiologically biotinylated proteins, because this also discards the proteins, which are both experimentally tagged and naturally biotinylated.

To eliminate the biotin-reliance of comparative proteomics, we replaced the biotin-alkyne click chemistry (Biotin-SA) with directly clicked to ANL/AHA Cy3-alkyne (termed, Cy3-Direct). The effects of this approach on the minimization of false positives and on the resolution of metabolic proteomics were examined with Click westerns and BONCAT arrays (Fig. 4).

FIG. 4.

Cy3-direct eliminates false positives and enhances the resolution of comparative proteomics (a) Schematic of Cy3-direct method, compared to the biotin-SA method. (b) In-gel fluorescence scanning and SDS/PAGE detection of copper-catalyzed conjugation of Cy3-Alkyne to ANL-labeled proteins derived from the MetRS^L274G and C57BL/6 myoblast lysates (all cells were treated with ANL). (c) Representative images of BONCAT antibody arrays with Cy3-direct and biotin-SA methods. Less false positives (C57.B6) and higher signal detection (MetRS^L274G) are visible. (d) Heat map of the proteome that was resolved in multiple Cy3-direct and biotin-SA BONCAT antibody arrays demonstrate significant decrease in the false positives (MetRS^L274G vs. C57BL/6 myoblast treated with ANL), which is accompanied by the significant increase in the detection of the studied 308 proteins. N = 4–6, p < 0.05 MetRS^L274G versus C57BL/6. (e) Heat map of the proteome comparing application of SA-Cy3 before versus after hybridization of Cy3-clicked protein lysates to the arrays. SDS/PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis.

Primary myoblasts were derived from MetRS^L274G and C57BL/6 mice and all the cells were metabolically labeled with ANL in culture (4 days, 2 mM ANL). Protein lysates from these MetRS^L274G and the negative control, C57BL/6, myoblasts were clicked with either Cy3-Alkyne or biotin-Alkyne, as described in Materials and Methods section. The tagged proteome was resolved with in-gel fluorescence (Cy3-Alkyne click) or chemiluminescence (biotin-Alkyne click) and with BONCAT arrays (Fig. 4a schematically depicts this study).

Cy3-Direct robustly detected the azido-labeled proteins with low-to-no background, when comparing MetRS^L274G cell lysate to the negative control, for example, C57/B6 cell lysate, after identical administration of ANL to all the cells (Fig. 4b). The nonspecific bands and other background noise that are typical for the biotin-based Click westerns and are present in the biotin-SA Chemiluminescence blotting were eliminated with the Cy3-Direct (Fig. 4b).^{2,4,11,34
–36} Similar results were seen for the AHA labeling system (Supplementary Fig. S5).

Next, we compared the Cy3-Direct and biotin-SA in multiple independent experiments that culminated with the BONCAT antibody arrays (Figs. 4c, d). These studies demonstrated that in BONCAT arrays, Cy3-Direct resolves ninefold more proteins than the biotin-SA (Fig. 4c, d). This much improved resolution was simultaneous with the greatly diminished false positives (Fig. 4c, d; Supplementary Fig. S6).

Specifically, 197 proteins were resolved by the Cy3-Direct BONCAT arrays, compared to only 18 by the biotin-SA BONCAT arrays. Thirty-five false positives (C57/B6+ANL proteins) were detected with the biotin-SA BONCAT arrays, compared to four false positives in the Cy3-Direct method.

Confirming that naturally biotinylated proteins are a major source of false positives, SA-Cy3 array reagent that is added after the Cy3-Direct protein lysates are incubated with the arrays, increases the false positives of the C57/B6 samples, which become at the same level, as in the biotin-SA method (Fig. 4e; Supplementary Fig. 7). This is self-explanatory, because SA-Cy3 binds to all biotin-containing proteins, be they naturally biotinylated or clicked. In contrast, when this SA-Cy3 array reagent is added first to highlight the arrays-built positive controls; for example, before any proteins are present, and the Cy3-Direct protein lysates are added after the SA-Cy3 is washed away, the false positives are greatly minimized (Fig. 4c–e; Supplementary Figs. S6 and S7).

These data establish the Cy3-Drect BONCAT array method, which eliminates the reliance on biotin, consequentially increasing the accuracy of signal detection and minimizing the false positives, while also decreasing the experimental time and simplifying the assay.

Semantic image segmentation for noise reduction in optical outputs

Another persistent problem with the optical detection of antibody arrays’ data is the noise of nonspecific fluorescence that at times significantly masks specific signals. To address this problem and thus increase the reliable interpretation of the antibody array experiments, we used computational analysis and computer vision approaches.

The problem of denoising antibody array images can be reduced to a semantic image segmentation problem. There are many approaches available for image segmentation, including Otsu's intensity thresholding and pixel-by-pixel or segment feature extraction.^22,37,38 Here, we applied different data segmentation methods (Supplementary Fig. S8) and compared their efficiencies in clarifying the results of antibody arrays, using the Jaccard overlap metric score.³⁹

Moreover, several computer vision approaches were used, including utilizing baseline models, to compare efficacy.^40,41 One unsupervised learning approach was k-means intensity thresholding, an extension of Otsu's method.²² The threshold was varied based on the relative intensities of all other pixels in a 40 × 40 patch.

All other approaches involved pixel masks, for example, the manual content of signals—dots, that were generated as input labels for weakly supervised learning. Traditional machine learning methods, such as a random-forest classifier, were used to learn the mask from the input image pixel-by-pixel. The results were compared against deep learning methods to learn the same problem. In self-supervised learning, Gaussian noise was introduced into the input image, and the algorithm objective was for an autoencoder to reconstruct the original image from the deliberately noisy one. Figure 5a shows the supervised training pipeline.

FIG. 5.

Antibody array image segmentation. (a) Overview of the supervised learning pipeline. After splicing the original image into 40 × 40 patches, the patches (which are displayed as a stitched image) are fed as input to the downstream machine learning algorithms. The output is an image segmentation mask where every pixel is computed as a probability distribution of being signal (foreground) or background. (b) Denoised image segmentation output of K-means, random forest, and residual CNN algorithms. (c) Jaccard Score metric results for all the computer vision methods applied. The deep learning approaches excel compared to the more traditional machine learning algorithms. (d) Saliency heatmaps display important regions in the original input image, which contribute most to the output of signal or background for each pixel. (e) Compressed representations of an antibody array image based on the average across every 40 × 40 patch. CNN, Convolutional Neural Network.

In Figure 5b, we see a representative image of a noisy antibody array where nonspecific fluorescence interferes with the positive and negative array-built controls and with the experimental signals. The image segmentation and machine learning approaches visibly reduced this nonspecific fluorescence.

The Jaccard scores are computed for all the methods and are compared between training and validation data for detecting the signal and the background (Fig. 5c). The Convolutional Neural Network (CNN) architecture is described in Supplementary Figure S8, which also provides the overall workflow, additional examples of arrays with nonspecific fluorescence and their denosing through our machine learning approaches.

In Figure 5d, saliency maps were computed by taking the absolute value of the gradient of our CNN model scores with respect to the input for every pixel. As can be seen from this figure, the model is the region of pixels, which estimates the location of the signal, in agreement with the published work.⁴² These results, as well as the compressed representations of antibody arrays (Fig. 5e; Supplementary Fig. S8c), demonstrate that our residual CNN and Deep Convolutional Generative Adversarial Network (DCGAN) are the best models for this task. The residual CNN architecture is outlined in Supplementary Figure S8b. The main structure of this network is developed from U-Net, with modifications made to the number of hidden layers for a faster and more lightweight model, which is suitable for our problem.

Our trained CNN model has a validation Jaccard accuracy score of 71% and DCGAN of 66.5% in coverage of identifying positive signals. This vastly improves on a simple k-means thresholding and other traditional computational and machine learning approaches. Note that k-means thresholding is additionally used as a baseline (30% accuracy against the user-defined mask), which can be valuable in evaluating the accuracy of the user-defined mask.

Summarily, these novel computer vision approaches denoise the raw optical data of the antibody arrays, enabling more accurate interpretation of the results of comparative proteomics.

Discussion

MS and antibody arrays are two analytical tools that are commonly used to profile proteomes, but as this study shows, in application to BONCAT, both have inherent false-positive tendencies. The false discovery is more obvious in the antibody arrays, which have internal positive and negative controls and where the signal of the external negative controls (HBSS, no tag, etc.) can be compared with the experimental outcome, setting up a detection threshold, for instance, twofold higher with p < 0.05. In contrast, the false discovery of BONCAT MS is not as obvious for AHA tagging, and while some error is revealed through comparing with the MetRS^274G ANL labeling, there is no easy way to objectively distinguish the binding artifacts of the biotin affinity steps from the experimental outcomes, for example, the true metabolically tagged proteome.

Indeed, our results uncover that BONCAT MS generates hundreds to thousands of cryptic false positives using published methods, which we found by applying appropriate negative control of tagging C57.B6 mice with ANL (Fig. 1). Proteins with diverse biological functions, sizes, and methionine contents were revealed as false positive, cautioning the interpretations in various fields of research. As expected, the higher abundance proteins are more likely to contaminate the enrichment columns.

The main sources of data contamination appear to be the naturally biotinylated proteome and the nonspecific binding of untagged proteins to the biotin affinity purification, DBCO beads, etc. columns. In support of this conclusion, a clear signal to noise increase, yet, still some background, are typically detected after the biotin Click step in Click western (Fig. 1, Supplementary Figs. S1 and S3, and in published work).²

In this regard, another proteomics method using aptamers also relies on biotin and affinity-based enrichment for specific protein-aptamer complexes. Thus, controls for nonspecific binding might be prudent, particularly because even low false-positive discovery in aptamer proteomics will be amplified in the NextGen sequencing, and there are no negative controls like nonspecific IgG, which even in case of antibody-antigen interactions that have much higher affinity, typically yield some false-positive nonspecific binding.^43,44

BONCAT antibody arrays offer a feature of increasing PMT exposure during the scans and/or increasing the image brightness/contrast, which better reveals the false positives, that otherwise might appear as faint dots (Fig. 2 and previous publication).¹⁶ Such no-tag signal is a given consequence of the naturally biotinylated proteome and our data demonstrate that preclearing with SA beads reduce these false positives (Fig. 3).²⁷

Importantly, even though some proteins that were naturally biotinylated and metabolically tagged are expected to be diminished, the number of ANL tagged experimentally detected proteins were increased when the nonspecific noise and artificial variability between the samples were reduced. Rigorous profiling of proteins in comparative proteomics relies on the signal-to-noise sensitivity, thus approaches which increase the distinction between the true signal and nonspecific signal allow one to identify more proteins (Fig. 3; Supplementary Fig. S4).

In another interesting observation, a side-by-side comparison between the AHA and ANL tagging demonstrated that the bioorthogonal transgenic MetRS^L274G does not perform as effectively as the native MetRS (Fig. 3). Thus, while the bioorthogonal transgenic metabolic labeling enables many more experimental possibilities for specific cell-fate, cell age, and so on, proteome labeling might miss some of the newly synthesized proteins, compared to the WT-based AHA system.^{9,23,31,45,46}

At the same time, MetRS^L274G ANL tagging helps to avoid a potential caveat of comparative proteomics that follows blood heterochronicity or administration of proteins into host animals and tracing these to different organs. In contrast to MetRS^L274G approach, where WT animals are incapable of integrating ANL into their proteome, biotinylation, or AHA/biotin click, etc., similar methods do not distinguish between primary and secondary (after protein degradation and amino acid recycling) incorporation of the labels into the host proteome.^47

–50

Strengthening the conclusion of misinterpreting naturally biotinylated proteome for the metabolically labeled proteins, we demonstrate that Cy3-Direct BONCAT eliminates the false positives of the biotin-reliant Click Chemistry proteomics. Cy3-Direct BONCAT technology greatly improves the accuracy and resolution of protein detection, enhances the sensitivity and dynamic range of comparative proteomics (Fig. 4; Supplementary Figs. S5–S7), and in addition, makes the assays simpler and faster. In confirmation of the paradigm, SA-Cy3 reagent that highlights the naturally biotinylated proteins on the arrays and negates the biotin-independence, increases the nonspecific noise.

To further improve the signal to noise resolution, our computational analysis (computer vision) uses machine learning and data segmentation to significantly reduce the nonspecific fluorescence, thereby increasing the resolution and accuracy of BONCAT arrays (Fig. 5; Supplementary Fig. S8). This computational paradigm can be applied to any optical data, such as conventional antibody arrays, quantitative immunofluorescence, in situ sequencing, and so on.

Conclusions

Summarily, this work provides comprehensive characterization of false discovery of the BONCAT metabolic proteomics, and it establishes the technologies for minimizing and avoiding these undesirable outcomes.

Footnotes

Authors’ Contributions

C.L. and E.W. planned and performed the experiments with results shown in Figures 1 –4 and 5b, Supplementary Figures S1–S7 and cowrote the article. N.W. planned and performed the studies that are shown in Figure 5 and Supplementary Figure S8 and cowrote the article. W.H. performed the antibody array experiments, provided Figure 2b, and formatted the references. L.B. provided Figure 1b, d, e, and Supplementary Figure S2. K.A. and J.D. provided and discussion of proteomics’ methods. M.M. provided Supplementary Figure S1b and c. M.J.C. conceived and planned the work on the naturally biotinylated proteins and cowrote the article. I.M.C. designed and directed the study, interpreted, and integrated the results and wrote the article.

Acknowledgments

We thank Leran Mao for Click-Westerns. We also thank Katheryn Zhou, Natalie Celt, Crystal Gong, Alina Su, and Alexandra Benoni for providing technical help with these studies. We thank Dr. Lori Kohlstaedt at QB3/Chemistry Mass Spectrometry Facility at UC Berkeley for the assistance in MS analysis.

Author Disclosure Statement

The authors declare no conflict of interests.

Funding Information

This work was supported by the Open Philanthropy and NIH RO1 grants to I.M.C.

Supplementary Material

Supplementary Figure S1

Supplementary Figure S2

Supplementary Figure S3

Supplementary Figure S4

Supplementary Figure S5

Supplementary Figure S6

Supplementary Figure S7

Supplementary Figure S8

References

Heuillet

, Bellvert

, Cahoreau

, et al. Methodology for the validation of isotopic analyses by mass spectrometry in stable-isotope labeling experiments. Anal Chem, 2018; 90:1852–1860.

Liu

, Conboy

, Mehdipour

, et al. Application of bio-orthogonal proteome labeling to cell transplantation and heterochronic parabiosis. Nat Commun, 2017; 8:643.

Glenn

, Stone

, Ho

, et al. Bioorthogonal noncanonical amino acid tagging (BONCAT) enables time-resolved analysis of protein synthesis in native plant tissue. Plant Physiol, 2017; 173:1543–1553.

Alvarez-Castelao

, Schanzenbächer

, Langer

, Schuman

. Cell-type-specific metabolic labeling, detection and identification of nascent proteomes in vivo. Nat Protoc, 2019; 14:556–575.

Snider

, Wang

, Bogenhagen

, Haley

. Pulse SILAC approaches to the measurement of cellular dynamics. Adv Exp Med Biol, 2019; 1140:575–583.

Vinaiphat

, Low

, Yeoh

, Chng

, Sze

. Application of advanced mass spectrometry-based proteomics to study hypoxia driven cancer progression. Front Oncol, 2021; 11:559822.

Koren

, Gillett

, D'alton

, et al. Proteomic techniques to examine neuronal translational dynamics. Int J Mol Sci, 2019; 20:3524.

Landgraf

, Antileo

, Schuman

, Dieterich

. BONCAT: Metabolic labeling, click chemistry, and affinity purification of newly synthesized proteomes. In: Gautier

, Hinner M (eds): Site-Specific Protein Labeling. Methods in Molecular Biology, Vol. 1266. New York, NY: Humana Press, 2015, pp. 199–215.

Alvarez-Castelao

, Schanzenbächer

, Hanus

, et al. Cell-type-specific metabolic labeling of nascent proteomes in vivo. Nat Biotechnol, 2017; 35:1196–1201.

10.

Di Paolo

, Farias

, Garat

, et al. Rat sciatic nerve axoplasm proteome is enriched with ribosomal proteins during regeneration processes. J Proteome Res, 2021; 20:2506–2520.

11.

Saleh

, Jacobson

, Kinzer-Ursem

, Calve

. Dynamics of non-canonical amino acid-labeled intra- and extracellular proteins in the developing mouse. Cell Mol Bioeng, 2019; 12:495–509.

12.

Evans

, Bodea

, Götz

. Cell-specific non-canonical amino acid labelling identifies changes in the de novo proteome during memory formation. Elife, 2020; 9:e52990.

13.

Denninger

, Chen

, Turkoglu

, et al. Defining the adult hippocampal neural stem cell secretome: In vivo versus in vitro transcriptomic differences and their correlation to secreted protein levels. Brain Res, 2020; 1735:146717.

14.

Sadlowski

, Balderston

, Sandhu

, et al. Graphene-based biosensor for on-chip detection of bio-orthogonally labeled proteins to identify the circulating biomarkers of aging during heterochronic parabiosis. Lab Chip, 2018; 18:3230–3238.

15.

Dadfar

SMM

, Sekula-Neuner

, Trouillet

, et al. Evaluation of click chemistry microarrays for immunosensing of alpha-fetoprotein (AFP). Beilstein J Nanotechnol, 2019; 10:2505–2515.

16.

Yang

, Stevens

, Chen

, et al. Physiological blood–brain transport is impaired with age by a shift in transcytosis. Nature, 2020; 583:425–430.

17.

Schiapparelli

, McClatchey

, Liu

H-H

, et al. Direct detection of biotinylated proteins by mass spectrometry. J Proteome Res, 2014; 13:3966–3978.

18.

Y-F

, Bao

H-M

, Xiao

X-F

, et al. Biotin tagging coupled with amino acid-coded mass tagging for efficient and precise screening of interaction proteome in mammalian cells. Proteomics, 2009; 9:5414–5424.

19.

Huttlin

, Hegeman

, Harm

, Sussman

. Prediction of error associated with false-positive rate determination for peptide identification in large-scale proteomics experiments using a combined reverse and forward peptide sequence database strategy. J Proteome Res, 2007; 6:392–398.

20.

Shen

, Sheng

, Dai

, et al. On the estimation of false positives in peptide identifications using decoy search strategy. Proteomics, 2009; 9:194–204.

21.

Bogdanow

, Zauber

, Selbach

. Systematic errors in peptide and protein identification and quantification by modified peptides. Mol Cell Proteomics, 2016; 15:2791–801.

22.

Liu

, Yu

. Otsu Method and K-means. In: 2009 Ninth International Conference on Hybrid Intelligent Systems, Shenyang, China, August 12–14. IEEE, 2009, pp. 344–349.

23.

Mahdavi

, Hamblin

, Jindal

, et al. Engineered aminoacyl-tRNA synthetase for cell-selective analysis of mammalian protein synthesis. J Am Chem Soc, 2016; 138:4278–4281.

24.

Mehdipour

, Mehdipour

, Skinner

, et al. Plasma dilution improves cognition and attenuates neuroinflammation in old mice. Geroscience, 2020; 43:1–18.

25.

Lin

Q-M

, Tang

X-H

, Lin

S-R

, et al. Bone marrow-derived mesenchymal stem cell transplantation attenuates overexpression of inflammatory mediators in rat brain after cardiopulmonary resuscitation. Neural Regen Res, 2020; 15:324.

26.

, Zhang

, Zhu

, et al. The profile of angiogenic factors in vitreous humor of the patients with proliferative diabetic retinopathy. Curr Mol Med, 2017; 17:280–286.

27.

Yagi

, Terada

, Baba

, et al. Localization of endogenous biotin-containing proteins in mouse Bergmann glial cells. Histochem J, 2002; 34:567–572.

28.

Lehallier

, Gate

, Schaum

, et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat Med, 2019; 25:1843–1850.

29.

Schaum

, Lehallier

, Hahn

, et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature, 2020; 583:596–602.

30.

Ori

, Toyama

, Harris

, et al. Integrated transcriptome and proteome analyses reveal organ-specific proteome deterioration in old rats. Cell Syst, 2015; 1:224–237.

31.

Ngo

, Schuman

, Tirrell

. Mutant methionyl-tRNA synthetase from bacteria enables site-selective N-terminal labeling of proteins expressed in mammalian cells. Proc Natl Acad Sci USA, 2013; 110:4992–4997.

32.

van Geel

, Pruijn

GJM

, van Delft

, et al. Preventing Thiol-Yne addition improves the specificity of strain-promoted azide–alkyne cycloaddition. Bioconjug Chem, 2012; 23:392–398.

33.

Cefaliello

, Penna

, Barbato

, et al. Deregulated local protein synthesis in the brain synaptosomes of a mouse model for Alzheimer's disease. Mol Neurobiol, 2020; 57:1529–1541.

34.

Craig

, Huyghues-Despointes

, Yu

, et al. Structurally optimized analogs of the retrograde trafficking inhibitor Retro-2cycl limit Leishmania infections. PLoS Negl Trop Dis, 2017; 11:e0005556.

35.

Rothenberg

, Taliaferri

, Huber

, et al. A proteomics approach to profiling the temporal translational response to stress and growth. iScience, 2018; 9:367–381.

36.

Evans

, Benetatos

, van Roijen

, et al. Decreased synthesis of ribosomal proteins in tauopathy revealed by non-canonical amino acid labelling. EMBO J, 2019; 38:e101174.

37.

Yang

, Stafford

, Kim

. Segmentation and intensity estimation for microarray images with saturated pixels. BMC Bioinformatics, 2011; 12:462.

38.

Ensink

, Sinha

, et al. Segment and fit thresholding: A new method for image analysis applied to microarray and immunofluorescence data. Anal Chem, 2015; 87:9715–9721.

39.

Bertels

, Eelbode

, Berman

, et al. Optimizing the dice score and Jaccard index for medical image segmentation: Theory and practice. In Lecture Notes in Computer Science. Springer International Publishing, 2019, pp. 92–100. https://doi.org/10.1007/978-3-030-32245-8_11.

40.

Ronneberger

, Fischer

, Brox

U-Net: Convolutional networks for biomedical image segmentation. In Lecture Notes in Computer Science. Springer International Publishing, 2015, pp. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28.

41.

Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015. In: Navab N, Hornegger J, Wells WM, & Frangi A, (eds.): Lecture Notes in Computer Science. Springer International Publishing, 2015. https://doi.org/10.1007/978-3-319-24553-9.

42.

Simonyan

, Vedaldi

, Zisserman

Deep inside convolutional networks: Visualising image classification models and saliency maps. ArXiv:1312.6034 [Cs], 2014. http://arxiv.org/abs/1312.6034.

43.

Ngo

, Sinha

, Shen

, et al. Aptamer-based proteomic profiling reveals novel candidate biomarkers and pathways in cardiovascular disease. Circulation, 2016; 134:270–285.

44.

Gold

, Ayers

, Bertino

, et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS One, 2010; 5:e15004.

45.

Yuet

, Doma

, Ngo

, et al. Cell-specific proteomic analysis in Caenorhabditis elegans. Proc Natl Acad Sci USA, 2015; 112:2705–2710.

46.

tom Dieck

, Kochen

, Hanus

, et al. Direct visualization of newly synthesized target proteins in situ. Nat Methods, 2015; 12:411–414.

47.

Erdmann

, Marter

, Kobler

, et al. Cell-selective labelling of proteomes in Drosophila melanogaster. Nat Commun, 2015; 6:7521.

48.

Wang

Y-C

, Peterson

, Loring

. Protein post-translational modifications and regulation of pluripotency in human stem cells. Cell Res, 2014; 24:143–160.

49.

Reily

, Stewart

, Renfrow

, et al. Glycosylation in health and disease. Nat Rev Nephrol, 2019; 15:346–366.

50.

Borrmann

, van Hest

JCM

. Bioorthogonal chemistry in living organisms. Chem Sci, 2014; 5:2123.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.

0.00 MB

23.76 MB