Rise of Clinical Microbial Proteogenomics: A Multiomics Approach to Nontuberculous Mycobacterium

Abstract

Nontuberculous mycobacterial (NTM) species present a major challenge for global health, with serious clinical manifestations ranging from pulmonary to skin infections. Multiomics research and its application to clinical microbial proteogenomics offer considerable potential in this context. For example, Mycobacterium abscessus, a highly pathogenic NTM, causes bronchopulmonary infection and chronic pulmonary disease. The rough variant of the M. abscessus UC22 strain is extremely virulent and causes lung upper lobe fibrocavitary disease. Although several whole-genome next-generation sequencing studies have characterized the genes in the smooth variant of M. abscessus, a reference genome sequence for the rough variant was generated only recently and calls for further clinical applications. We carried out whole-genome sequencing and proteomic analysis of a clinical isolate of the M. abscessus UC22 strain obtained from a pulmonary tuberculosis patient. We identified 5506 single-nucleotide variations (SNVs), 63 insertions, and 76 deletions compared with the reference genome. Using a high-resolution liquid chromatography tandem mass spectrometry (LC-MS/MS)-based approach, we obtained protein-coding evidence for 3601 proteins, representing 71% of the total predicted genes in this genome. Application of a proteogenomic approach further revealed seven novel protein-coding genes and enabled refinement of six computationally derived gene models. We also identified 30 variant peptides corresponding to 16 SNVs known to be associated with drug resistance. These new observations offer promise for clinical applications of microbial proteogenomics and next-generation sequencing, and provide a resource for future global health applications for NTM species.

Introduction

Nontuberculous mycobacterial (NTM) species are responsible for a wide range of disease manifestations ranging from pulmonary to skin infections (Katoch, 2004). Among the NTMs, Mycobacterium abscessus, a rapidly growing mycobacterium, is known to be responsible for a wide spectrum of soft tissue diseases and disseminated infections in immunocompromised patients (Griffith et al., 2007; Moore and Frerichs, 1953). Several hospital-based outbreaks of M. abscessus infections have been reported, suggesting the importance of this organism in healthcare-associated infections (Koh et al., 2010; Viana-Niero et al., 2008). M. abscessus is also considered one of the prominent NTMs and its increased prevalence is an issue of concern because of its high virulence and antibiotic resistance (Esther et al., 2010; Griffith et al., 1993).

Studies have shown that M. abscessus has two major morphotypes, rough (R) and smooth (S) (Byrd and Lyons, 1999; Ripoll et al., 2009). The rough (R) form lacks the surface polyketide compounds known as glycopeptidolipids and is associated with more severe infection (Catherinot et al., 2007; Howard et al., 2006; Pawlik et al., 2013). The M. abscessus UC22 strain has the rough (R) morphological form and is one of the causative organisms of the upper lobe fibrocavitary (UC) type of pulmonary disease (Kim et al., 2017).

Virulence in mycobacteria is related to the ability of the bacterial cells to survive in the host (Smith, 2003). In mycobacterial genomes, most of these virulence genes encode enzymes for lipid and fatty acid metabolism, cell envelope proteins, gene expression regulators, and proteins of signal transduction systems (Forrellad et al., 2013). Virulence factors such as PE/PPE family, mammalian cell entry (Mce) protein family, ESX (type VII secretion systems) export, and twin-arginine translocase (Tat) export systems have been well characterized in Mycobacterium tuberculosis but have not been studied in M. abscessus (Bottai and Brosch, 2009; Bottai et al., 2014; Tsirigotaki et al., 2017; Zhang and Xie, 2011).

A recently published study has shown that the rough morphotype of M. abscessus is generally more virulent than the smooth morphotype, which suggests an interdependence of morphology and virulence (Kim et al., 2017). The M. abscessus UC22 strain has a rough morphotype, yet no study so far has applied integrated molecular profiling to this strain to identify genomic determinants of virulence and drug resistance that may be associated with the UC form of pulmonary disease.

In addition, the clinical features of pulmonary tuberculosis and of pulmonary infection due to M. abscessus are reported to be similar (Ishiekwene et al., 2017; Wankhade and Bhore, 2017). As a result, a clear distinction between them is currently difficult. Furthermore, M. abscessus is known to be inherently resistant to several chemotherapeutic agents used to treat pulmonary tuberculosis (Griffith, 2014). Therefore, lack of early detection results in inappropriate treatment of M. abscessus infection. This calls for detailed molecular profiling of M. abscessus to provide a platform for investigating differentiating molecular patterns.

The advancement in high-throughput mass spectrometry has not only enabled the in-depth exploration of proteomes but also helped in refining the gene models of several organisms from bacteria to human (Gallien et al., 2009; Heunis et al., 2017; Kim et al., 2014; Venter et al., 2011). It is reasonable to believe that protein-level evidence as provided by LC-MS/MS analysis is more confirmatory than the computational prediction of protein-coding genes annotated against any genome sequence (Ruggles et al., 2017). The multidisciplinary protocols of integrating proteomic data into the genome annotation process are collectively termed proteogenomics (Venter et al., 2011).

Several proteogenomic studies on mycobacterium species have highlighted the limitations of gene prediction programs, including genome assembly errors, start site, and gene boundaries (Gallien et al., 2009; Kelkar et al., 2011; Pinto et al., 2018; Potgieter et al., 2016; Zheng et al., 2017).

In this study, we carried out next-generation sequencing and LC-MS/MS analysis of an M. abscessus UC22 clinical isolate, and applied a proteogenomic approach to improve the annotation of the genome. We identified several single-nucleotide variations (SNVs) in genes known to be associated with drug resistance, and validated them further by identifying the corresponding variant peptides in the proteomic data. Use of a proteogenomic approach in the current study led to the discovery of novel protein-coding regions in the M. abscessus UC22 genome that were missed in the previous annotation. We believe that the integrated multiomic approach presented in this study enhances our knowledge of the M. abscessus UC22 strain, and that its results can provide insight into the molecular mechanisms of virulence and drug resistance in this strain.

Materials and Methods

M. abscessus culture and whole-genome sequencing

A clinical isolate from a pulmonary tuberculosis patient (Patient ID: JAL-14421) was obtained from the Department of Microbiology, National JALMA Institute of Leprosy and other Mycobacterial Diseases, Agra, India. The sputum was smear positive for acid-fast bacilli. Culturing and biochemical tests revealed the isolate to be an NTM. Drug susceptibility testing revealed the isolate to be resistant to rifampicin, isoniazid, ethambutol, and streptomycin. The isolate was processed as per the standard protocol, inoculated on Lowenstein-Jensen (LJ) slants, and then used for DNA extraction. DNA was isolated as described by Sharma et al. (2017), using the cetyltrimethylammonium bromide method on the bacterial culture grown on LJ slants (van Embden et al., 1993).

The quality of DNA was assessed on 1% agarose gel and the DNA was quantified on the NanoDrop spectrophotometer (ND-1000 UV-Vis). Paired-end DNA libraries were prepared according to the manufacturer's instructions and were sequenced with a read length of 2 × 100 nucleotides using an Illumina HiSeq2500 instrument (Illumina, Inc., San Diego). Whole-genome sequencing identified the isolate as M. abscessus. Metagenomic analysis performed on the raw reads using the Kraken tool confirmed that the isolate belonged to the M. abscessus UC22 strain (Wood and Salzberg, 2014). The study was approved by the National JALMA Institute of Leprosy and other Mycobacterial Diseases Institutional Ethics Committee.

M. abscessus culture and protein extraction

The M. abscessus UC22 clinical isolate was grown in Middlebrook 7H9 medium (supplemented with OADC [oleic acid, albumin, dextrose, and catalase]). The cultures were incubated at 37°C in a shaker incubator. The culture was harvested at stationary phase and centrifuged for proteomic analysis. The cell pellets were washed thrice with chilled phosphate-buffered saline. Mechanical lysis was carried out with 0.1 mm Zirconia beads in a Precellys tissue homogenizer for 3 min (6500 rpm) twice under continuous cooling at or below 4°C in 2% SDS lysis buffer (2% SDS in 50 mM triethylammonium bicarbonate (TEABC), 1 mM sodium fluoride, 1 mM sodium orthovanadate, 2.5 mM sodium pyrophosphate, and 1 mM β-glycerophosphate).

Lysates were resuspended in twice the volume of lysis buffer with phosphatase inhibitors, and the lysates were passed through 0.22 μm filters. Bicinchoninic acid assay was used to estimate the concentration of proteins.

Trypsin digestion and fractionation

In-solution digestion was carried out as described previously (Nagarajha Selvan et al., 2014). Briefly, 300 μg of protein was reduced and alkylated with 10 mM dithiothreitol and 20 mM iodoacetamide, respectively. The proteins were precipitated using chilled acetone kept at −20°C overnight. The protein pellets were reconstituted in 100 mM TEABC buffer and digested using trypsin (1:20) (modified sequencing grade; Promega, Madison, WI) at 37°C overnight. After confirming the digestion efficiency, 100 μg peptide digest was fractionated using StageTip-based strong cation-exchange (SCX) chromatography.

The remaining peptide digest was subjected to basic pH reverse-phase chromatography (bRPLC). For bRPLC fractionation, the peptide digest was loaded onto Waters XBridge column (Waters Corporation, Milford, MA; 130 Å, 5 μm, 250 × 4.6 mm) using a manual injector attached to the Hitachi HiChrom HPLC system. The peptide digest was fractionated at a flow rate of 0.5 mL/min using a linear gradient of 5–100% of 10 mM TEABC, 90% acetonitrile over 120 min. A total of 96 fractions were collected and concatenated to a final of 12 fractions. The pooled fractions were evaporated to dryness using SpeedVac concentrator (Thermo Scientific) followed by desalting using C18 StageTips.
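
The text does not state how the 96 collected fractions were combined into 12. A common concatenation scheme pools every 12th fraction so that each pool spans the whole gradient; the sketch below assumes that scheme, and the function name is purely illustrative.

```python
def concatenate_fractions(n_collected=96, n_pools=12):
    """Map each pool number to the collected fractions combined into it."""
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_collected + 1):
        # Assumed scheme: fraction 1 -> pool 1, ..., 12 -> pool 12, 13 -> pool 1, ...
        pools[(fraction - 1) % n_pools + 1].append(fraction)
    return pools


pools = concatenate_fractions()
print(pools[1])  # fractions combined into the first pool
```

Under this assumption each of the 12 pools receives 8 collected fractions spaced evenly across the 120 min gradient.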

For SCX chromatography-based fractionation, the peptide digest was evaporated to dryness and reconstituted in 2% trifluoroacetic acid (TFA) solution. The sample was loaded onto StageTip packed with cation-exchange material PolySULFOETHYL A (Empore™ Cation 47 mm extraction disc 66889-U), washed with 0.2% TFA, and fractionated into 6 fractions using increasing concentrations of ammonium acetate (50–300 mM ammonium acetate, 20% acetonitrile, and 0.5% formic acid). The final fraction was obtained by eluting the peptides using 5% ammonium hydroxide and 80% acetonitrile. The eluates were evaporated to dryness using SpeedVac concentrator (Thermo Scientific) at room temperature and stored at −20°C until LC-MS/MS analysis.

LC-MS/MS analysis

The peptides from both bRPLC (12 fractions) and SCX (6 fractions) fractionation methods were analyzed in technical replicates. Two separate fractionation methods were used to increase proteome coverage. The data were acquired on a Thermo Scientific Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) connected to an Easy-nLC-1200 nanoflow liquid chromatography system (Thermo Scientific). The peptides were reconstituted in 0.1% formic acid and loaded onto a 2 cm trap column (nanoViper, 3 μm C18 Aq) (Thermo Fisher Scientific). Peptides were separated using a 15 cm analytical column (nanoViper, 75 μm silica capillary, 2 μm C18 Aq) at a flow rate of 300 nL/min. The solvent gradient was set as a linear gradient of 5–35% solvent B (80% acetonitrile in 0.1% formic acid) for 90 min. The total run time for each fraction was 120 min.

The global MS survey scan was carried out at a scan range of 400–1600 m/z (120,000 mass resolution at 200 m/z) in a data-dependent mode using an Orbitrap mass analyzer. The maximum injection time was 5 msec. Only peptides with charge state 2–6 were considered for analysis, and the dynamic exclusion was set to 30 sec. For MS/MS (tandem mass spectrometry) analysis, data were acquired in top speed mode with 3-sec cycles and subjected to higher-energy collisional dissociation with 34% normalized collision energy. MS/MS scans were carried out at a range of 100–1600 m/z using the Orbitrap mass analyzer at a resolution of 30,000 at 200 m/z. The maximum injection time was 120 msec. Internal calibration was carried out using the lock mass option (m/z 445.1200025) from ambient air.

Data analysis

Mapping and variant detection using M. abscessus UC22 reference genome

The raw reads were checked for bases with a PHRED quality score below Q20 using the FastQC toolkit. Reads were then aligned to the M. abscessus UC22 reference genome (NZ_CP012044.1) using the Burrows–Wheeler Alignment Tool (BWA version-0.7.15) (Li and Durbin, 2009). The alignment files were subjected to local realignment and deduplication using the Genome Analysis Toolkit (GATK version-3.6) (McKenna et al., 2010). Variants (SNVs and insertions/deletions [In/Dels]) were identified from each alignment file using GATK. The variants and In/Dels were annotated using in-house Perl scripts. Variants identified in this study were manually inspected using Integrative Genomics Viewer (Robinson et al., 2011).
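
To make the Q20 threshold concrete, the following minimal sketch decodes a FASTQ quality string and applies the Phred definition Q = -10 log10(P), so Q20 corresponds to a 1% base-calling error probability. It assumes Phred+33 encoding, which is an assumption about the read files rather than something stated in the text, and it is not a substitute for the FastQC report.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(char) - offset for char in quality_string]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P), hence P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def fraction_below(quality_string, threshold=20):
    """Fraction of base calls in one read whose score is below the threshold."""
    scores = phred_scores(quality_string)
    return sum(score < threshold for score in scores) / len(scores)


# Toy three-base read: the characters 'I', '5', and '#' encode Q40, Q20, and Q2.
print(phred_scores("I5#"), fraction_below("I5#"))
```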

Multilocus sequence typing analysis

To confirm the strain, we carried out multilocus sequence typing (MLST) on M. abscessus UC22 isolate using MLST 1.8 tool (Larsen et al., 2012).

Database searches for peptide and protein identification

Raw data files were processed to generate peak list files using Proteome Discoverer software version 2.1 (Thermo Fisher Scientific, Bremen, Germany). The MS/MS spectra were searched using SequestHT and Mascot search algorithms against the protein database (NCBI, updated March 2017), combined with common contaminants (total no. of sequences: 5,211). The genome sequence of M. abscessus UC22 was downloaded from the NCBI FTP site (NCBI Reference Sequence: NZ_CP012044.1). A six-frame translated database was created containing M. abscessus UC22 translated sequences from stop codon to stop codon using an in-house Python script. Studies have shown that apart from ATG, mycobacteria use GTG and TTG as translation initiation codons (Cole et al., 1998; Kelkar et al., 2011).
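
The in-house script is not reproduced in the text. As a minimal sketch of stop-to-stop six-frame translation, the code below translates both strands in all three frames and splits each translation at stop codons. The function names are illustrative, and the seven-residue minimum length is an assumption borrowed from the cutoff the authors apply to their databases.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in standard codon-table order for codons enumerated over BASES.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_orfs(genome, min_length=7):
    """Yield (strand, frame, peptide) for every stop-to-stop segment."""
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) >= min_length:
                    yield strand, frame + 1, segment


segments = list(six_frame_orfs("TAAATGAAACCCGGGTTTAAACATTAG"))
print(segments)
```

A production script would also record genome coordinates for each segment so that peptides can later be mapped back to the genome.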

An alternate start codon database was additionally created in which GTG and TTG codons were translated as initiator methionine in addition to valine and leucine, respectively. Peptide sequences shorter than seven amino acids were not included in the database. The hypothetical N-terminal tryptic peptide database was created by fetching all the peptide sequences that began with methionine and ended with lysine/arginine from the six-frame translated genome database. Peptide sequences with lengths ranging from 7 to 25 amino acids were considered. The raw data files were first searched against the M. abscessus UC22 protein database from NCBI (updated March 2017), combined with common contaminants (total no. of sequences: 5,211).
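
One way to read the construction of the hypothetical N-terminal tryptic peptide database is sketched below: for each translated stop-to-stop segment, collect every substring that begins with methionine, ends with lysine or arginine, and is 7 to 25 residues long. Whether the authors allowed internal lysine or arginine residues (missed cleavages) is not stated, so allowing them here is an assumption, as is the function name.

```python
import re


def n_terminal_tryptic_candidates(orf, min_len=7, max_len=25):
    """Return substrings of a translated segment that start with M and end with K/R."""
    candidates = set()
    for match in re.finditer("M", orf):
        start = match.start()
        # `end` is exclusive; keep peptide length within [min_len, max_len].
        for end in range(start + min_len, min(start + max_len, len(orf)) + 1):
            if orf[end - 1] in "KR":
                candidates.add(orf[start:end])
    return candidates


print(sorted(n_terminal_tryptic_candidates("AAMSTPLLKGGR")))
```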

Following the first-pass search, other searches were performed sequentially in the following order: (i) the M. abscessus UC22 protein database, (ii) the six-frame translated genome database, (iii) the alternate start codon database, and (iv) the hypothetical N-terminal peptide database. The search parameters included trypsin as the proteolytic enzyme with a maximum of two missed cleavages allowed. Oxidation of methionine and protein N-terminal acetylation were set as dynamic modifications, whereas carbamidomethylation of cysteine was set as a static modification. Precursor ion mass tolerance and fragment ion mass tolerance were set to 10 ppm and 0.05 Da, respectively. The false discovery rate (FDR) was calculated using the Percolator node, and a cutoff of 1% peptide-spectrum match (PSM) and 1% peptide-level FDR was used for identifications.
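
Two of these search parameters can be illustrated with a short sketch: an in silico tryptic digest that allows missed cleavages, and a check of a precursor mass against a ppm tolerance. The cleavage rule used here (after K or R, but not before P) and the seven-residue minimum are assumptions; the exact enzyme definition configured in the search engines is not given in the text.

```python
import re


def tryptic_peptides(protein, missed_cleavages=2, min_length=7):
    """In silico tryptic digest allowing up to `missed_cleavages` missed sites."""
    # Assumed rule: cleave C-terminal to K or R unless the next residue is P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, ppm=10.0):
    return abs(ppm_error(observed, theoretical)) <= ppm


print(sorted(tryptic_peptides("GASKLLERPTTKVVV", missed_cleavages=1, min_length=1)))
print(within_tolerance(1000.005, 1000.0))  # roughly 5 ppm off
```

At 10 ppm, a precursor with a theoretical value of 1000 is accepted only if it is observed within about 0.01 of that value, which is why high-resolution precursor measurement narrows the candidate list so effectively.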

Bioinformatic analysis

We carried out in silico analysis with SignalP (V4.1) to identify signal peptides in M. abscessus UC22 proteins (Petersen et al., 2011). Furthermore, the subcellular localization of all the proteins identified in our study was predicted using the PSORTb (V3.0.2) (Yu et al., 2010). We used PRED-LIPO prediction tool to identify lipoproteins in our proteomic data from M. abscessus UC22 clinical isolate (Bagos et al., 2008). To characterize the variant proteins, gene ontology (GO) analysis was performed using Blast2GO tool (Gotz et al., 2008). Functional enrichment of the proteins identified with variant peptides was carried out using FunRich with the Blast2GO annotation as backend database (Pathan et al., 2015).

Generation of customized strain-specific proteome database

A custom variant protein database was created to identify variant peptides containing the nonsynonymous SNVs identified in the genome analysis. Only the nonsynonymous SNVs were incorporated into the protein sequences. The variant peptides having single PSM were manually verified by checking their spectra.
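
As a minimal sketch of how a nonsynonymous SNV can be incorporated into a protein sequence, the function below applies a substitution written in the same notation the paper uses for amino acid changes (for example A25P). The function name and the toy sequence are illustrative and do not come from the study.

```python
def apply_substitution(protein, change):
    """Apply a change such as 'A25P': reference residue, 1-based position, variant."""
    ref, position, alt = change[0], int(change[1:-1]), change[-1]
    found = protein[position - 1]
    if found != ref:
        # Guard against applying a variant to the wrong sequence or coordinate.
        raise ValueError(f"expected {ref} at position {position}, found {found}")
    return protein[:position - 1] + alt + protein[position:]


# Toy sequence (not a real M. abscessus protein).
print(apply_substitution("MKTAYIAK", "A4P"))
```

Checking the reference residue before substituting is a cheap safeguard, because a mismatch usually signals an off-by-one coordinate or the wrong gene model.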

The unmapped reads that did not align to the M. abscessus UC22 reference genome in the genome analysis were assembled with Velvet de novo assembler (Version 1.2.10) (Zerbino and Birney, 2008). VelvetOptimiser (Version 2.2.6) was used to determine the optimal kmer length for de novo assembly of the unmapped reads and was run with default parameters according to the software documentation (Zerbino and Birney, 2008). A kmer length of 75 was used for generation of contigs, further translated in six frames using in-house Python script. The raw data files were searched against the variant protein database and de novo assembled contig database separately with same parameters used in the protein searches.

Workflow for manual genome annotation

The genome search-specific peptides (GSSPs) obtained from searching the MS/MS data against alternate genome translated database were mapped to the M. abscessus UC22 genome to identify the probable coding regions. Genome coordinates of all the peptides were obtained using in-house Perl script. Peptides mapping to multiple locations in the genome were not considered for further analysis.

From the translated genome database search results, GSSPs were obtained by filtering out peptides that mapped to known proteins. FgeneSB and GeneMark (Version 2.5) gene prediction programs were used to generate alternative gene models (Besemer et al., 2001). The novel genes and alternative gene models obtained using peptide evidence and gene prediction programs were checked for conservation among orthologs in Mycobacterium sp. using the protein Basic Local Alignment Search Tool (BLAST) algorithm. MS/MS spectra of all the peptides providing evidence of novel genes or changes in gene structure were manually verified.
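
The two peptide filters in this workflow (discard peptides explained by known proteins, then discard peptides that map to more than one genomic location) can be sketched as follows. This is a simplified illustration using plain string matching on translated frames; the authors used in-house Perl scripts, and details such as isoleucine/leucine ambiguity are ignored here.

```python
def count_occurrences(peptide, sequence):
    """Count possibly overlapping occurrences of peptide in sequence."""
    count, start = 0, sequence.find(peptide)
    while start != -1:
        count += 1
        start = sequence.find(peptide, start + 1)
    return count


def select_gssps(peptides, known_proteins, translated_frames):
    """Keep peptides absent from known proteins that map to exactly one location."""
    selected = []
    for peptide in peptides:
        if any(peptide in protein for protein in known_proteins):
            continue  # already explained by the annotated proteome
        hits = sum(count_occurrences(peptide, frame) for frame in translated_frames)
        if hits == 1:
            selected.append(peptide)
    return selected


# Toy inputs (hypothetical sequences, not from the study).
known = ["MKTAYIAKQR"]
frames = ["AALLNVSEDRGGPEPTIDEK", "PEPTIDEKAAAPEPTIDEK"]
print(select_gssps(["TAYIAK", "LLNVSEDR", "PEPTIDEK"], known, frames))
```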

Analysis of virulence genes

All proteins identified in the proteomic data and genes carrying SNVs in the genomic data were subjected to BLAST analysis against the virulence factor database (Chen et al., 2012). The genes with at least 96% identity and at least 80% sequence coverage that were orthologous to virulence genes were further considered (Choo et al., 2014).
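
The identity and coverage thresholds can be expressed as a simple filter over BLAST hits. In the sketch below, the dictionary keys are hypothetical field names, and coverage is assumed to be alignment length divided by query length; the text does not specify how coverage was computed.

```python
def passes_virulence_filter(hit, min_identity=96.0, min_coverage=80.0):
    """Apply the identity and coverage thresholds to one BLAST hit."""
    # Assumption: coverage = aligned length as a percentage of the query length.
    coverage = 100.0 * hit["alignment_length"] / hit["query_length"]
    return hit["percent_identity"] >= min_identity and coverage >= min_coverage


# Hypothetical hits for illustration only.
hits = [
    {"query": "protein_1", "percent_identity": 98.2, "alignment_length": 450, "query_length": 500},
    {"query": "protein_2", "percent_identity": 97.0, "alignment_length": 300, "query_length": 500},
    {"query": "protein_3", "percent_identity": 90.0, "alignment_length": 500, "query_length": 500},
]
kept = [hit["query"] for hit in hits if passes_virulence_filter(hit)]
print(kept)
```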

Data availability

Whole-genome sequencing data were deposited in the NCBI SRA database with accession SRP136682. The mass spectrometry data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the dataset identifier PXD009405.

Results and Discussion

Strain information and genomic characterization of M. abscessus UC22 isolate

The M. abscessus UC22 genome comprises 5.2 million base pairs with a guanine-cytosine (GC) content of 64% and encodes 5104 genes. We identified a total of 5506 nonsynonymous SNVs (Fig. 1) (Supplementary Table S1A) corresponding to 2080 protein-coding genes. In addition, we also identified 63 insertions and 76 deletions in the M. abscessus UC22 clinical isolate (Fig. 2A) (Supplementary Tables S1B and S1C).

FIG. 1.

Integrated experimental workflow used to analyze the genome, proteome, and proteogenomics of Mycobacterium abscessus UC22 isolate.

FIG. 2.

(A) Circos plot depicting GC (guanine-cytosine) content, SNVs, insertions, deletions, and positional distribution of the seven housekeeping genes of M. abscessus UC22 isolate using MLST. (B) Circos plot depicting mutation frequency in genes known to confer drug resistance. GC, guanine-cytosine; MLST, multilocus sequence typing; SNV, single-nucleotide variation.

We carried out MLST on the M. abscessus UC22 isolate and compared our results with the PubMLST database (M. abscessus complex MLST databases) to identify the allelic profile of the UC22 isolate (Jolley and Maiden, 2010). We found that our isolate shared six identical allelic types of the seven housekeeping genes (argH, cya, glpK, gnd, murC, pta, and purH) used in MLST for M. abscessus. Our isolate was found to belong to the sequence type (ST)-12 profile (Fig. 2A). The MLST approach is used to identify, classify, and assess the spread of epidemic clones throughout the world. The combination of alleles obtained at each locus defines its allelic profile or ST. The allele numbers of the housekeeping genes identified in the Mycobacterium abscessus UC22 isolate are shown in Table 1.

Table 1.

The Allelic Profile of Mycobacterium abscessus UC22 Isolate Identified Using Multilocus Sequence Typing

Locus	Allele_Number	Length	Start position	End position
argH	1	510	2391940	2392449
cya	2	540	538388	538927
gnd	1	506	2466272	2466777
murC	2	445	2061539	2061983
pta	10	520	4608632	4609151
purH	5	497	1215153	1215649
rpoB	1	503	4156459	4156961
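
As a quick consistency check on Table 1, the reported length of each locus should equal end position minus start position plus one when coordinates are 1-based and inclusive. The sketch below encodes the table rows as given; the inclusive-coordinate convention is an assumption that the numbers themselves appear to support.

```python
# Locus: (allele number, length, start position, end position), copied from Table 1.
ALLELIC_PROFILE = {
    "argH": (1, 510, 2391940, 2392449),
    "cya": (2, 540, 538388, 538927),
    "gnd": (1, 506, 2466272, 2466777),
    "murC": (2, 445, 2061539, 2061983),
    "pta": (10, 520, 4608632, 4609151),
    "purH": (5, 497, 1215153, 1215649),
    "rpoB": (1, 503, 4156459, 4156961),
}


def coordinates_consistent(length, start, end):
    """True when 1-based inclusive coordinates span exactly `length` bases."""
    return end - start + 1 == length


inconsistent = [locus for locus, (_, length, start, end) in ALLELIC_PROFILE.items()
                if not coordinates_consistent(length, start, end)]
print(inconsistent)  # an empty list means every row is self-consistent
```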

Studies have revealed that M. abscessus has an open pan-genome, which is rapidly evolving, in contrast to M. tuberculosis. Horizontal gene transfer and homologous recombination are known to be rare in Mycobacterium sp. However, genome sequencing studies on M. abscessus have shown acquisition of virulence genes from other mycobacterial and nonmycobacterial species. Studies using MLST have also shown that some M. abscessus clinical isolates contain a genetic pattern displaying an amalgamation of alleles from the housekeeping genes corresponding to different subspecies (Macheras et al., 2014). This suggests that M. abscessus undergoes homologous recombination and horizontal gene transfer.

As few studies have been published on M. abscessus clinical isolates from the Indian subcontinent, there is a lack of information on the prevalence and geographical distribution of the UC22 allelic profile. This warrants multiomics-based studies on larger numbers of clinical isolates of the M. abscessus UC22 strain.

Identification of variants associated with natural and acquired resistance

To identify the genetic factors that may be associated with natural resistance, we compared SNVs identified in M. abscessus UC22 isolate with the genomes of publicly available M. abscessus isolates (Bryant et al., 2016). We observed 227 SNVs corresponding to 98 genes associated with natural resistance in M. abscessus UC22 isolate (Supplementary Table S2A). When we compared our data with publicly available data from M. abscessus and M. abscessus subspecies clinical isolates, we found 180/227 SNVs common among all the isolates. We found 19/227 SNVs common between our isolate and M. abscessus clinical isolates and 4/227 SNVs common between our isolate and M. abscessus subspecies clinical isolates.
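
The counts in this comparison can be tallied directly. The short calculation below uses only the numbers stated above and assumes the three shared categories do not overlap; the remainder appears consistent with the 24 SNVs reported as uniquely identified in this isolate.

```python
total_snvs = 227
shared_with_all_isolates = 180
shared_with_abscessus_only = 19
shared_with_subspecies_only = 4

# Assumption: the three shared categories are mutually exclusive.
shared = shared_with_all_isolates + shared_with_abscessus_only + shared_with_subspecies_only
unique_to_isolate = total_snvs - shared
print(shared, unique_to_isolate)
```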

We also uniquely identified 24 SNVs in genes known to confer drug resistance. It has been previously shown that M. abscessus monooxygenase orthologs may be potentially involved in resistance to rifampicin (Ripoll et al., 2009). We identified 26 SNVs corresponding to 13 genes that belong to the monooxygenase family in M. abscessus UC22 isolate. These SNVs have earlier been reported in several strains of M. abscessus. In addition, we also identified two SNVs (MAUC22_RS20735 and MAUC22_RS20740) in DNA-directed RNA polymerase subunit beta genes, which are known for conferring resistance to rifampicin. M. abscessus genome also encodes an Ambler class A β-lactamase gene, which shows homology with β-lactamases from the gram-negative bacteria (Nguyen and Thompson, 2006).

We observed three SNVs in the class A β-lactamase gene (MAUC22_RS14295) (Supplementary Table S2A). Constant exposure of various mycobacterial strains to a large variety of β-lactams gives rise to mutations in their β-lactamase genes, thus expanding their resistance spectrum even against newly developed β-lactam antibiotics (Shaikh et al., 2015).

We observed 21 and 22 SNVs corresponding to 3 and 6 genes in aminoglycoside phosphotransferase proteins and N-acetyltransferase family of proteins, respectively (Supplementary Table S2A). In addition, we observed 114 SNVs in ABC transporter family and 17 SNVs in mmpL transporter family of genes (Fig. 2B; Supplementary Table S2A). M. abscessus also contains enzymes such as 2-N-acetyltransferase and aminoglycoside phosphotransferases that could alter aminoglycoside drugs by transferring acetyl or phosphate residues on key positions within the antibiotic and making them inactive (Nessar et al., 2012; Ripoll et al., 2009).

The ABC-type multidrug transporters act as importers or exporters by utilizing ATP energy to pump drugs out of the cell to the external environment (Kerr, 2002; Kerr et al., 2005). The mmpL transporter family belongs to resistance, nodulation, and cell division proteins, which are involved in lipid transport to the membrane with the help of proton motive force of the transmembrane electrochemical proton gradient (Goldberg et al., 1999).

We also compared SNVs associated with acquired resistance in our data with M. abscessus and M. abscessus subspecies isolates. In total, we identified 10 SNVs corresponding to 4 genes, of which the SNV in the 16S rRNA methyltransferase G gene (MAUC22_RS26020 A25P) was common to all the isolates (Supplementary Table S2B). We did not identify any SNVs in the 23S rRNA methyltransferase G gene (MAUC22_RS11405) associated with clarithromycin resistance, and thus, we expect the clinical isolate to be sensitive to clarithromycin (Mougari et al., 2017; Sohn et al., 2009).

Proteomic characterization of the M. abscessus UC22 strain

Following whole-genome sequencing analysis, we carried out LC-MS/MS-based proteomic profiling of M. abscessus UC22 isolate. Detailed workflow for proteomic analysis is illustrated in Figure 1. We identified 32,020 peptides corresponding to 3601 proteins, which is equivalent to ∼71% of the reference M. abscessus UC22 proteome. Of the total proteins, 686 and 35 proteins were uniquely identified in bRPLC and SCX fractions, respectively (Supplementary Fig. S1). The complete list of proteins and peptides identified in the current analysis is provided in Supplementary Table S3.
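
The coverage figure and the contribution of each fractionation method follow from the reported counts. The sketch below assumes that the denominator is the 5104 genes reported for the reference genome and that the 3601 proteins are the union of the bRPLC and SCX identifications; both are assumptions about how the figures were derived.

```python
identified_proteins = 3601
predicted_genes = 5104  # assumption: gene count reported for the reference genome
coverage_percent = 100 * identified_proteins / predicted_genes

unique_brplc = 686
unique_scx = 35
# Assumption: every identified protein came from bRPLC, SCX, or both.
shared = identified_proteins - unique_brplc - unique_scx
brplc_total = shared + unique_brplc
scx_total = shared + unique_scx

print(round(coverage_percent, 1), shared, brplc_total, scx_total)
```

Under these assumptions, most identifications are shared between the two methods, with bRPLC contributing the larger share of method-specific proteins.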

We further compared our data with the previously published proteomic study on the M. abscessus ATCC 19977 strain (Miranda-CasoLuengo et al., 2016). Of the 2697 proteins reported in that study, 1732 proteins were identified in the current analysis. Among these 1732, 1266 proteins have peptide coverage in both studies. However, we provide additional evidence for 466 proteins that were not previously reported, including TetR/AcrR family transcriptional regulator, WhiB family transcriptional regulator, and Tat subunit TatB (Supplementary Table S3).

TetR/AcrR family proteins are known to regulate a large number of activities, such as biosynthesis of antibiotics, multidrug resistance, efflux pumps, virulence, and pathogenicity of bacteria (Deng et al., 2013). Because TetR/AcrR family proteins regulate multidrug efflux pumps, they might also serve as broad-spectrum drug targets. The M. abscessus genome contains six whiB genes, which are known to be involved in conferring resistance to streptomycin, erythromycin, and tetracycline (Morris et al., 2005). The whiB family proteins are putative transcription factors involved in the regulation of cellular processes such as pathogenesis, cell division, and responses to oxidative stress (Nessar et al., 2012).

Additional features of the UC22 strain proteome

Bacterial secretory proteins secreted through the general secretory (Sec) pathway are often found to have classical amino-terminal signal peptide sequences (Desvaux et al., 2010). SignalP analysis led to the identification of 293 proteins with signal peptides (Supplementary Table S3). These include hemophore-related proteins, iron transporters, and sensor domain-containing proteins. The subcellular localization analysis predicted 1991 (55%) proteins as cytoplasmic proteins, 790 (21%) localized to cytoplasmic membrane, 44 (1.2%) extracellular, and 14 (0.3%) localized to cell wall (Supplementary Table S3). Examples of proteins localized to the cytoplasm include TetR/AcrR family transcriptional regulator, acyl-CoA dehydrogenase, and α/β-hydrolase proteins.

Lipoproteins are a varied class of secreted bacterial proteins that account for 1–3% of the proteins encoded by bacterial genomes (Babu et al., 2006). The PRED-LIPO tool predicted 121 lipoproteins in our proteomic data from the M. abscessus UC22 clinical isolate (Bagos et al., 2008; Supplementary Table S3). These include sensor domain-containing proteins, Mce family proteins, and ABC transporter substrate-binding proteins. Overexpression of ABC transporters in M. tuberculosis has been reported to contribute to resistance to several antibiotics (Wang et al., 2013). Similarly, ABC transporters are known to confer natural resistance to several antibiotics in M. abscessus (Nessar et al., 2012).

Proteogenomic analysis

In the current study, using an integrated data analysis workflow, we carried out proteogenomic analysis of the M. abscessus UC22 strain.

Identification of novel genes

It is likely that computational pipelines miss a few bona fide protein-coding genes during genome annotation. Using the custom genome database, we identified seven potentially novel peptides of M. abscessus UC22, corresponding to seven novel protein coding genes (Supplementary Table S4A). We identified peptide evidence for a novel protein coding gene: IOB_NG_005, which encodes for a protein with 896 amino acids (Fig. 3A). An ortholog for the novel gene was identified with BLAST algorithm in M. abscessus subsp. abscessus strain 876, suggesting the conservation of this gene in M. abscessus.

FIG. 3.

Refinement of the M. abscessus UC22 genome annotation based on proteomic evidence. (A) Illustration of a novel gene identified using proteogenomic analysis. (B) Illustration of N-terminal extension of MAUC22_RS08685 gene model.

We further performed SMART analysis, which revealed a mycobacterial membrane protein large (mmpL)-associated domain in this novel gene. The mmpL transporter family genes are distributed throughout the M. abscessus genome, and these proteins mediate transport across the cytoplasmic membrane using the proton motive force of the transmembrane electrochemical proton gradient (Nessar et al., 2012). mmpLs are also known to play a role in the export of lipid components across the cell envelope, are involved in the physiology of mycobacteria, and/or can act as virulence factors (Ripoll et al., 2009; Viljoen et al., 2017).

Correction of existing gene models

We identified six N-terminal extensions, which were also supported by open reading frames (ORFs) predicted by FgeneSB and GeneMark (Supplementary Table S4B). The N-terminal extension of the MAUC22_RS08685 protein, with one peptide mapping upstream of the gene, is depicted in Figure 3B.

Confirmation of translational start sites

In this study, we used a mass spectrometry-based proteomic approach to identify modified N-terminal peptides. Using protein N-terminal acetylation as a dynamic modification, we obtained 411 unique N-terminal peptides for 355 proteins reported in the M. abscessus UC22 protein database. Of these proteins, 325 were confirmed by N-terminal peptides with the initiator methionine cleaved, and 30 were confirmed by N-terminal peptides retaining the initiator methionine (Supplementary Table S3). Correct assignment of the translation start site is important for deriving the correct protein sequence (Rison et al., 2007). The translational start site (TSS) of a protein can be annotated by identifying N-terminal acetylated peptides in mass spectrometry data (Zheng et al., 2017).

Earlier studies have reported that a nontryptic N-terminus of a peptide, as seen in N-terminal peptides with the initiator methionine either retained or cleaved, can indicate the N-terminus of the protein (Kelkar et al., 2011).
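The two classes of start-site evidence described above (initiator methionine retained or cleaved) can be distinguished by comparing each peptide with the annotated protein sequence. The sketch below is illustrative only: the function classify_n_terminal_peptide, its labels, and the toy protein are hypothetical, and modifications such as acetylation are assumed to have been stripped from the peptide string.

```python
from collections import Counter


def classify_n_terminal_peptide(protein, peptide):
    """Classify a peptide relative to the annotated N-terminus of a protein.

    Returns one of:
      'methionine retained' -- peptide starts at residue 1 (initiator Met present)
      'methionine cleaved'  -- peptide starts at residue 2 (initiator Met removed)
      'not N-terminal'      -- peptide does not support the annotated start
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    if not protein.startswith("M"):
        raise ValueError("annotated protein is expected to start with methionine")
    if protein.startswith(peptide):
        return "methionine retained"
    if protein[1:].startswith(peptide):
        return "methionine cleaved"
    return "not N-terminal"


# Toy example (hypothetical sequences)
toy_protein = "MSTKLAGRVDE"
toy_peptides = ["MSTK", "STK", "LAGR"]
summary = Counter(classify_n_terminal_peptide(toy_protein, p) for p in toy_peptides)
print(summary)
```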

Proteogenomic approaches provide indispensable information for improving genome annotation and have previously been used to refine different mycobacterial genomes such as Mycobacterium smegmatis, M. tuberculosis H37Rv, and Mycobacterium vaccae. In a previous study, Gallien et al. (2009) implemented an ortho-proteogenomic approach that resulted in the identification of 29 novel genes and confirmation of the annotated start site for 342 proteins in M. smegmatis (Table 2). Another study by Potgieter et al. (2016), which focused on the proteogenomic analysis of the M. smegmatis mc²155 strain, identified 63 novel genes and 81 N-terminal extensions and confirmed the annotated start site for 558 proteins.

Table 2.

List of Previous Proteogenomic Studies on Mycobacterium Species

Category	Gallien et al. (2009)	Kelkar et al. (2011)	Potgieter et al. (2016)	Zheng et al. (2017)	Pinto et al. (2018)
Strain	Mycobacterium smegmatis	Mycobacterium tuberculosis H37Rv	M. smegmatis mc²155	Mycobacterium vaccae	M. tuberculosis H37Ra
Novel genes	29	41	63	38	63
Confirmation of annotated start site of proteins	342	727	558	445	332
Correction of gene models	—	79	81	98	25
Correction of annotated start site of proteins	—	33	—	—	—

A recent study on an M. vaccae strain by Zheng et al. (2017) identified 38 novel genes, corrected 98 gene models, and confirmed the annotated start site for 445 proteins. In addition, a previous study published by our group revealed 41 novel genes, correction of 79 gene models, confirmation of the annotated start site for 727 proteins, and correction of the TSS for 33 proteins in the M. tuberculosis H37Rv strain (Kelkar et al., 2011). Another recent study published by our group revealed 63 novel genes, correction of 25 gene models, and confirmation of the annotated start site for 332 proteins in the M. tuberculosis H37Ra strain (Pinto et al., 2018).

Identification of proteins encoded by the de novo assembled contigs

We identified 77 peptides corresponding to 24 contigs that were not reported in the M. abscessus UC22 reference proteome database. Five of these peptides mapped to a protein encoded in the first forward-strand frame of the NODE_96 contig of our de novo assembled unmapped reads (Supplementary Table S5). Domain analysis with SMART showed that this protein contains a p-aminobenzoate N-oxygenase (AurF) domain. Using the BLAST algorithm, we also confirmed that orthologs of this protein are present in other strains of M. abscessus (Supplementary Fig. S2).

Similarly, we identified four different peptides that mapped to a protein encoded in the first forward-strand frame of the NODE_113 contig. Domain analysis revealed that this protein contains a pyridoxamine 5′-phosphate oxidase domain; this enzyme is an FMN flavoprotein that catalyzes the oxidation of pyridoxamine 5′-phosphate (PMP) and pyridoxine 5′-phosphate (PNP) to pyridoxal 5′-phosphate (PLP). Earlier studies on M. tuberculosis have shown that pyridoxal 5′-phosphate biosynthesis is essential for survival and virulence (Ankisettypalli et al., 2016).

Using this approach, we provide the first direct protein-level evidence for these coding sequences, through identification of peptides mapping to contigs translated from the de novo assembled unmapped reads. These coding sequences have probably resulted from genomic rearrangements or gene duplications and are either absent from the M. abscessus UC22 reference genome or lie in regions where the coding sequence differs in the M. abscessus UC22 clinical isolate.

Identification of variant peptides in M. abscessus UC22 proteome

A total of 553 variant peptides corresponding to 356 proteins were identified, of which 540 variant peptides were supported by more than 2 PSMs (Supplementary Table S6). We further mapped the variants to SNVs in genes previously reported to be associated with drug resistance (Nessar et al., 2012). We identified 30 variant peptides corresponding to 16 genes carrying SNVs associated with resistance to rifampicin, isoniazid, aminoglycosides, and β-lactam antibiotics (Table 3). Our data provide, for the first time, peptide-level evidence for SNVs conferring drug resistance in the M. abscessus UC22 isolate (Fig. 4). It is interesting to note that of the 16 genes with variant peptides, 9 belonged to the ABC transporter gene family, which is involved in transporting multiple drugs across the membrane (Danilchanka et al., 2008).
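How a nonsynonymous SNV gives rise to a detectable variant peptide can be sketched with an in-silico tryptic digest: the substitution is applied to the reference protein, both sequences are digested, and peptides unique to the variant are reported. This is a simplified illustration, not the database construction used in this study; the helper names (tryptic_digest, apply_substitution, variant_peptides), the cleavage rule (after K or R, not before P, no missed cleavages), and the toy protein are all assumptions.

```python
def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def apply_substitution(protein, position, ref, alt):
    """Apply a single amino acid substitution at a 1-based position."""
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[:position - 1] + alt + protein[position:]


def variant_peptides(protein, position, ref, alt):
    """Return tryptic peptides present in the variant protein but not in the reference."""
    reference = set(tryptic_digest(protein))
    variant = apply_substitution(protein, position, ref, alt)
    return [p for p in tryptic_digest(variant) if p not in reference]


# Toy example (hypothetical protein, not from the UC22 proteome)
toy_protein = "MAGKSLDERVTK"
print(variant_peptides(toy_protein, 6, "L", "P"))  # substitution within a peptide
print(variant_peptides(toy_protein, 7, "D", "K"))  # substitution creating a cleavage site
```

The second call shows why variant databases matter: a substitution to K or R changes the cleavage pattern, so the variant is represented by peptides that a reference-only search could not match.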

FIG. 4.

Representation of variant peptide evidence for SNVs identified in the M. abscessus UC22 clinical isolate.

Table 3.

List of Variant Peptide Evidences for Single-Nucleotide Variations Conferring Drug Resistance in Mycobacterium abscessus UC22 Isolate

Gene symbol	Description	Protein accession	Peptide	No. of PSMs
MAUC22_RS01425	MULTISPECIES: Proline/glycine betaine ABC transporter ATP-binding protein	WP_005083757.1	GGVRGDDVLTTLDR	2
MAUC22_RS05060	3-oxoacyl-ACP reductase	WP_074244529.1	TAAVELAPR	8
MAUC22_RS06060	Multidrug ABC transporter ATP-binding protein	WP_005092824.1	IAVTDADAALTLTDVGVR	8
MAUC22_RS06060	Multidrug ABC transporter ATP-binding protein	WP_005092824.1	ETVVYPLPVEEVSDEELKQALR	2
MAUC22_RS09310	Cobalt ABC transporter ATP-binding protein	WP_049232335.1	VADDVVWGLPADHKVDIDGLLR	6
MAUC22_RS09320	Sugar ABC transporter substrate-binding protein	WP_049232337.1	SGSAIPAVLSAR	2
MAUC22_RS11855	APH(3'') family Aminoglycoside O-phosphotransferase	WP_074244662.1	GLAAMATLAR	2
MAUC22_RS11855	APH(3'') family Aminoglycoside O-phosphotransferase	WP_074244662.1	DAVNPDFLTDEDR	4
MAUC22_RS11885	ABC transporter	WP_049232641.1	ATLKDGTAVVVK	1
MAUC22_RS12275	Catalase-peroxidase	WP_049232685.1	DFVAAWVK	44
MAUC22_RS12275	Catalase-peroxidase	WP_049232685.1	FVKDFVAAWVK	16
MAUC22_RS12275	Catalase-peroxidase	WP_049232685.1	GAQIPAEYK	77
MAUC22_RS12275	Catalase-peroxidase	WP_049232685.1	GAQIPAEYKLIDR	2
MAUC22_RS14295	Class A β-lactamase	WP_019164608.1	APIVMAVLTVPEDPTSTK	12
MAUC22_RS16715	N-acetyltransferase	WP_016892376.1	TVLAVVVTPNAASEK	2
MAUC22_RS17240	Cobalt ABC transporter ATP-binding protein	WP_005111617.1	VVALGGIAVRPPAEPPSPAESR	8
MAUC22_RS20735	DNA-directed RNA polymerase subunit beta'	WP_049233863.1	AQKLEADLAELEAEGAK	58
MAUC22_RS20735	DNA-directed RNA polymerase subunit beta'	WP_049233863.1	AQKLEADLAELEAEGAKSDVR	137
MAUC22_RS20735	DNA-directed RNA polymerase subunit beta'	WP_049233863.1	LEADLAELEAEGAK	4
MAUC22_RS20735	DNA-directed RNA polymerase subunit beta'	WP_049233863.1	LEADLAELEAEGAKSDVR	2
MAUC22_RS21995	ABC transporter ATP-binding protein	WP_074292324.1	FADDLVAVAGSMK	4
MAUC22_RS22700	N-acetyltransferase	WP_049234144.1	ANSAVPLDHSASSR	12
MAUC22_RS22700	N-acetyltransferase	WP_049234144.1	RANSAVPLDHSASSR	4
MAUC22_RS22700	N-acetyltransferase	WP_049234144.1	LIHPPDDLATSDETVVMVR	2
MAUC22_RS22700	N-acetyltransferase	WP_049234144.1	GAVTENWLGISAVR	4
MAUC22_RS23000	ABC transporter substrate-binding protein	WP_016888003.1	AVLDKIDTSCDPTASLRPTSDR	14
MAUC22_RS23715	Alkyl hydroperoxide reductase	WP_005064030.1	KGDPTIDAGELMSAGV	4

PSM, peptide spectral match.

GO analysis of the mutated proteins in the M. abscessus UC22 strain revealed them to be involved in catalytic activity (22%), metabolic process (25%), and DNA repair (4.6%) (Supplementary Table S6). These include methylated-DNA-protein-cysteine methyltransferase, peptide synthetase, acetoacetyl-CoA synthetase, isopropylmalate isomerase, and 1-pyrroline-5-carboxylate dehydrogenase. The isopropylmalate isomerase enzyme is involved in the leucine biosynthesis pathway.

An earlier study has shown that the leucine biosynthesis pathway is necessary for survival and virulence in M. tuberculosis (Manikandan et al., 2011). A previously published study on the M. abscessus ATCC 19977 strain has shown that 1-pyrroline-5-carboxylate dehydrogenase (pruA) is involved in the oxidation of proline to glutamate, which allows proline to be used as a carbon and nitrogen source for metabolism (Miranda-CasoLuengo et al., 2016). The GO-based functional annotation of identified proteins is depicted in Figure 5.

FIG. 5.

Gene ontology-based functional annotation of proteins carrying variant peptides.

Cell envelope proteins and virulence-associated genes

The M. abscessus genome contains a large set of genes encoding surface-exported Mce proteins (Ripoll et al., 2009). We found nine proteins belonging to the Mce family in our proteomic data (Supplementary Table S7). In the genome data, we also found 14 SNVs corresponding to these 9 Mce proteins, with variant peptide evidence for 3 of these SNVs (Supplementary Table S7). These proteins were also identified in our lipoprotein prediction analysis. Mce family proteins of mycobacteria are known to be involved in the invasion of host macrophages and nonphagocytic mammalian cells (Zhang and Xie, 2011). In M. tuberculosis, Mce proteins have been shown to have a specific role in lipid transport (Cantrell et al., 2013; Forrellad et al., 2014).

Virulence factors play a very important role in promoting the growth of bacteria in the host environment, and they also help in binding to the host cell for invasion (Choo et al., 2014). We identified 115 virulence genes in our proteomic data, which include ABC transporters, type VII secretion family proteins, and RNA polymerase sigma factor family proteins (Supplementary Table S7). These 115 virulence genes were distributed across different Mycobacterium species (Supplementary Fig. S3). The ability to persist inside phagocytes, leading to relapse of prolonged infections, is the major virulence mechanism known in M. abscessus (Pang et al., 2013). Our analysis reveals that virulence factors known to be involved in M. tuberculosis virulence have orthologs in the genomes of the M. abscessus UC22 and M. abscessus ATCC 19977 strains (Table 4).

Table 4.

Partial List of Putative Virulence Genes Identified in Mycobacterium abscessus UC22 Isolate

Gene symbol	Description	Protein accession	No. of peptides	No. of PSMs	MAB UC22	MAB ATCC 19977	MTB H37Rv
MAUC22_RS03840	MULTISPECIES: DNA-binding response regulator	WP_005064128.1	3	10	√	√	√
MAUC22_RS03715	MULTISPECIES: molecular chaperone GroEL	WP_005064866.1	62	7885	√	X	√
MAUC22_RS14300	GTP pyrophosphokinase	WP_005068903.1	23	122	√	X	√
MAUC22_RS22235	MULTISPECIES: isocitrate lyase	WP_005085937.1	26	905	√	√	√
MAUC22_RS05315	Mammalian cell entry protein	WP_049231996.1	8	47	√	√	X
MAUC22_RS16660	Dihydrofolate reductase	WP_049233433.1	16	495	√	X	√

We identified integral membrane proteins (EccB, EccC, EccD, and EccE), which are present in most of the five ESX systems (Houben et al., 2012). We also identified the EccA protein, which belongs to the AAA+ (ATPases associated with various cellular activities) protein family and is important for the type VII secretion system (Houben et al., 2014). We identified a similar set of integral transmembrane proteins in a comparative proteomic study of M. tuberculosis H37Ra and H37Rv strains published recently by our group (Verma et al., 2017). The M. abscessus genome encodes sigma factors known to be involved in M. tuberculosis virulence (Ripoll et al., 2009). We identified four of these sigma factors in our proteomic data set (SigE, SigF, SigL, and SigM).

We observed three SNVs in the accessory Sec system translocase secA2 (MAUC22_RS11910), which is also part of the virulence gene list and is supported by protein and variant peptide evidence in the proteome data. The secA2 gene is essential for preventing phagosome maturation and encodes a preprotein translocase ATPase that translocates superoxide dismutase (sodA). Studies have also shown that a secA2 mutant of M. tuberculosis is attenuated for growth in macrophages and in a mouse model, which validates the importance of secA2 to M. tuberculosis virulence (Braunstein et al., 2003; Kurtz et al., 2006). We also identified sodA (MAUC22_RS21555) and nuoG (MAUC22_RS10600) in our proteome data; both are virulence factors in M. abscessus and have also been found to be antiapoptotic factors (Choo et al., 2014).

Our analysis revealed many putative virulence factors from M. abscessus UC22 strain. At present, M. abscessus species is rapidly evolving and has already acquired antibiotic resistance. To understand the distribution of M. abscessus species, a large number of clinical isolates from the Indian subcontinent might have to be sequenced and examined. The candidate molecules identified in the current study could provide novel insights into biological mechanisms and pathogenesis of M. abscessus UC22 strain, when studied in the context of a large number of clinical isolates.

Conclusions

Clinical microbial proteogenomics provides an opportunity to identify universal candidates for therapeutic interventions by investigating the pace of changes incorporated in the genome and proteome of a pathogen across clinical isolates. This study integrates the genomic and proteomic data obtained from M. abscessus UC22 clinical isolate and provides insights on the strain-specific genomic and proteomic signatures. The proteogenomic approach along with our whole-genome sequencing analysis of UC22 strain helped map several novel genes and correct existing protein annotations.

Among other interesting findings are identification of putative virulence genes associated with the UC22 strain and peptide-level evidence for genomic variants. Our study shows the importance of integration of multiomics platforms for a deeper and better understanding of clinically significant pathogenic organisms.

Footnotes

Acknowledgments

The authors thank the Karnataka Biotechnology and Information Technology Services (KBITS), Government of Karnataka, for the support to the Center for Systems Biology and Molecular Medicine at Yenepoya (Deemed to be University) under the Biotechnology Skill Enhancement Programme in Multiomics Technology (BiSEP GO ITD 02 MDA 2017). The authors acknowledge Yenepoya (Deemed to be University) for access to mass spectrometry instrumentation facility. They thank the Department of Biotechnology (DBT), Government of India, for research support to the Institute of Bioinformatics.

J.A. is a recipient of the Senior Research Fellowship from the Council of Scientific and Industrial Research (CSIR), Government of India. R.V. is a recipient of the Senior Research Fellowship from the University Grants Commission (UGC), Government of India. O.C. is a recipient of the INSPIRE Fellowship from the Department of Science and Technology (DST). M.A.N. is a recipient of the Junior Research Fellowship from the University Grants Commission (UGC), Government of India. S.M.P. is a recipient of the INSPIRE Faculty Award from DST, Government of India.

Author Disclosure Statement

The authors declare that no conflicting financial interests exist.

Abbreviations Used

References

Ankisettypalli, Cheng, Baker, and Bashiri. (2016). PdxH proteins of mycobacteria are typical members of the classical pyridoxine/pyridoxamine 5'-phosphate oxidase family. FEBS Lett, 590, 453–460.

Babu, Priya, Selvan, et al. (2006). A database of bacterial lipoproteins (DOLOP) with functional assignments to predicted lipoproteins. J Bacteriol, 188, 2761–2773.

Bagos, Tsirigos, Liakopoulos, and Hamodrakas. (2008). Prediction of lipoprotein signal peptides in Gram-positive bacteria with a Hidden Markov Model. J Proteome Res, 7, 5082–5093.

Besemer, Lomsadze, and Borodovsky. (2001). GeneMarkS: A self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res, 29, 2607–2618.

Bottai and Brosch. (2009). Mycobacterial PE, PPE and ESX clusters: Novel insights into the secretion of these most unusual protein families. Mol Microbiol, 73, 325–328.

Bottai, Stinear, Supply, and Brosch. (2014). Mycobacterial pathogenomics and evolution. Microbiol Spectr, 2, MGM2-0025-2013.

Braunstein, Espinosa, Chan, Belisle, and Jacobs Jr. (2003). SecA2 functions in the secretion of superoxide dismutase A and in the virulence of Mycobacterium tuberculosis. Mol Microbiol, 48, 453–464.

Bryant, Grogono, Rodriguez-Rincon, et al. (2016). Emergence and spread of a human-transmissible multidrug-resistant nontuberculous mycobacterium. Science, 354, 751–757.

Byrd and Lyons. (1999). Preliminary characterization of a Mycobacterium abscessus mutant in human and murine models of infection. Infect Immun, 67, 4700–4707.

Cantrell, Leavell, Marjanovic, Iavarone, Leary, and Riley. (2013). Free mycolic acid accumulation in the cell wall of the mce1 operon mutant strain of Mycobacterium tuberculosis. J Microbiol, 51, 619–626.

Catherinot, Clarissou, Etienne, et al. (2007). Hypervirulence of a rough variant of the Mycobacterium abscessus type strain. Infect Immun, 75, 1055–1058.

Chen, Xiong, Sun, Yang, and Jin. (2012). VFDB 2012 update: Toward the genetic diversity and molecular evolution of bacterial virulence factors. Nucleic Acids Res, 40, D641–D645.

Choo, Wee, Ngeow, et al. (2014). Genomic reconnaissance of clinical isolates of emerging human pathogen Mycobacterium abscessus reveals high evolutionary potential. Sci Rep, 4, 4061.

Cole, Brosch, Parkhill, et al. (1998). Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature, 393, 537–544.

Danilchanka, Mailaender, and Niederweis. (2008). Identification of a novel multidrug efflux pump of Mycobacterium tuberculosis. Antimicrob Agents Chemother, 52, 2503–2511.

Deng, Li, and Xie. (2013). The underling mechanism of bacterial TetR/AcrR family transcriptional repressors. Cell Signal, 25, 1608–1613.

Desvaux, Dumas, Chafsey, Chambon, and Hebraud. (2010). Comprehensive appraisal of the extracellular proteins from a monoderm bacterium: Theoretical and empirical exoproteomes of Listeria monocytogenes EGD-e by secretomics. J Proteome Res, 9, 5076–5092.

Esther Jr., Esserman, Gilligan, Kerr, and Noone PG. (2010). Chronic Mycobacterium abscessus infection and lung function decline in cystic fibrosis. J Cyst Fibros, 9, 117–123.

Forrellad, Klepp, Gioffre, et al. (2013). Virulence factors of the Mycobacterium tuberculosis complex. Virulence, 4, 3–66.

Forrellad, McNeil, Santangelo Mde, et al. (2014). Role of the Mce1 transporter in the lipid homeostasis of Mycobacterium tuberculosis. Tuberculosis (Edinb), 94, 170–177.

Gallien, Perrodou, Carapito, et al. (2009). Ortho-proteogenomics: Multiple proteomes investigation through orthology and a new MS-based protocol. Genome Res, 19, 128–135.

Goldberg, Pribyl, Juhnke, and Nies. (1999). Energetics and topology of CzcA, a cation/proton antiporter of the resistance-nodulation-cell division protein family. J Biol Chem, 274, 26065–26070.

Gotz, Garcia-Gomez, Terol, et al. (2008). High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res, 36, 3420–3435.

Griffith. (2014). Mycobacterium abscessus subsp abscessus lung disease: ‘Trouble ahead, trouble behind…’. F1000Prime Rep, 6, 107.

Griffith, Aksamit, Brown-Elliott, et al. (2007). An official ATS/IDSA statement: Diagnosis, treatment, and prevention of nontuberculous mycobacterial diseases. Am J Respir Crit Care Med, 175, 367–416.

Griffith, Girard, and Wallace Jr. (1993). Clinical features of pulmonary disease caused by rapidly growing mycobacteria. An analysis of 154 patients. Am Rev Respir Dis, 147, 1271–1278.

Heunis, Dippenaar, Warren, et al. (2017). Proteogenomic investigation of strain variation in clinical Mycobacterium tuberculosis isolates. J Proteome Res, 16, 3841–3851.

Houben, Bestebroer, Ummels, et al. (2012). Composition of the type VII secretion system membrane complex. Mol Microbiol, 86, 472–484.

Houben, Korotkov, and Bitter. (2014). Take five—Type VII secretion systems of Mycobacteria. Biochim Biophys Acta, 1843, 1707–1716.

Howard, Rhoades, Recht, et al. (2006). Spontaneous reversion of Mycobacterium abscessus from a smooth to a rough morphotype is associated with reduced expression of glycopeptidolipid and reacquisition of an invasive phenotype. Microbiology, 152, 1581–1590.

Ishiekwene, Subran, Ghitan, Kuhn-Basti, Chapnick, and Lin. (2017). Case report on pulmonary disease due to coinfection of Mycobacterium tuberculosis and Mycobacterium abscessus: Difficulty in diagnosis. Respir Med Case Rep, 20, 123–124.

Jolley and Maiden. (2010). BIGSdb: Scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics, 11, 595.

Katoch VM. (2004). Infections due to non-tuberculous mycobacteria (NTM). Indian J Med Res, 120, 290–304.

Kelkar, Kumar, et al. (2011). Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry. Mol Cell Proteomics, 10, M111.011627.

Kerr ID. (2002). Structure and association of ATP-binding cassette transporter nucleotide-binding domains. Biochim Biophys Acta, 1561, 47–64.

Kerr, Reynolds, and Cove. (2005). ABC proteins and antibiotic drug resistance: Is it all about transport? Biochem Soc Trans, 33, 1000–1002.

Kim, Pinto, Getnet, et al. (2014). A draft map of the human proteome. Nature, 509, 575–581.

Kim, Subhadra, Whang, et al. (2017). Clinical Mycobacterium abscessus strain inhibits autophagy flux and promotes its growth in murine macrophages. Pathog Dis, 75.

Koh, Song, Kang, et al. (2010). An outbreak of skin and soft tissue infection caused by Mycobacterium abscessus following acupuncture. Clin Microbiol Infect, 16, 895–901.

Kurtz, McKinnon, Runge, Ting, and Braunstein. (2006). The SecA2 secretion factor of Mycobacterium tuberculosis promotes growth in macrophages and inhibits the host immune response. Infect Immun, 74, 6855–6864.

Larsen, Cosentino, Rasmussen, et al. (2012). Multilocus sequence typing of total-genome-sequenced bacteria. J Clin Microbiol, 50, 1355–1361.

Li and Durbin. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.

Macheras, Konjek, Roux, et al. (2014). Multilocus sequence typing scheme for the Mycobacterium abscessus complex. Res Microbiol, 165, 82–90.

Manikandan, Geerlof, Zozulya, Svergun, and Weiss. (2011). Structural studies on the enzyme complex isopropylmalate isomerase (LeuCD) from Mycobacterium tuberculosis. Proteins, 79, 35–49.

McKenna, Hanna, Banks, et al. (2010). The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res, 20, 1297–1303.

Miranda-CasoLuengo, Staunton, Dinan, Lohan, and Loftus. (2016). Functional characterization of the Mycobacterium abscessus genome coupled with condition specific transcriptomics reveals conserved molecular strategies for host adaptation and persistence. BMC Genomics, 17, 553.

Moore and Frerichs. (1953). An unusual acid-fast infection of the knee with subcutaneous, abscess-like lesions of the gluteal region; report of a case with a study of the organism, Mycobacterium abscessus, n. sp. J Invest Dermatol, 20, 133–169.

Morris, Nguyen, Gatfield, et al. (2005). Ancestral antibiotic resistance in Mycobacterium tuberculosis. Proc Natl Acad Sci U S A, 102, 12200–12205.

Mougari, Bouziane, Crockett, et al. (2017). Selection of resistance to clarithromycin in Mycobacterium abscessus subspecies. Antimicrob Agents Chemother, 61, e00943–16.

Nagarajha Selvan, Kaviyil, Nirujogi, et al. (2014). Proteogenomic analysis of pathogenic yeast Cryptococcus neoformans using high resolution mass spectrometry. Clin Proteomics, 11, 5.

Nessar, Cambau, Reyrat, Murray, and Gicquel. (2012). Mycobacterium abscessus: A new antibiotic nightmare. J Antimicrob Chemother, 67, 810–818.

Nguyen and Thompson. (2006). Foundations of antibiotic resistance in bacterial physiology: The mycobacterial paradigm. Trends Microbiol, 14, 304–312.

Pang, Ngeow, Wong, and Liam. (2013). Mycobacterium abscessus—To treat or not to treat. Respirol Case Rep, 1, 31–33.

Pathan, Keerthikumar, Ang, et al. (2015). FunRich: An open access standalone functional enrichment and interaction network analysis tool. Proteomics, 15, 2597–2601.

Pawlik, Garnier, Orgeur, et al. (2013). Identification and characterization of the genetic changes responsible for the characteristic smooth-to-rough morphotype alterations of clinically persistent Mycobacterium abscessus. Mol Microbiol, 90, 612–629.

Petersen, Brunak, von Heijne, and Nielsen. (2011). SignalP 4.0: Discriminating signal peptides from transmembrane regions. Nat Methods, 8, 785–786.

Pinto, Verma, Advani, et al. (2018). Integrated multi-omic analysis of Mycobacterium tuberculosis H37Ra redefines virulence attributes. Front Microbiol, 9, 1314.

Potgieter, Nakedi, Ambler, et al. (2016). Proteogenomic analysis of Mycobacterium smegmatis using high resolution mass spectrometry. Front Microbiol, 7, 427.

Ripoll, Pasek, Schenowitz, et al. (2009). Non mycobacterial virulence genes in the genome of the emerging pathogen Mycobacterium abscessus. PLoS One, 4, e5660.

Rison, Mattow, Jungblut, and Stoker. (2007). Experimental determination of translational starts using peptide mass mapping and tandem mass spectrometry within the proteome of Mycobacterium tuberculosis. Microbiology, 153, 521–528.

Robinson, Thorvaldsdottir, Winckler, et al. (2011). Integrative genomics viewer. Nat Biotechnol, 29, 24–26.

Ruggles, Krug, Wang, et al. (2017). Methods, tools and current perspectives in proteogenomics. Mol Cell Proteomics, 16, 959–981.

Shaikh, Fatima, Shakil, Rizvi, and Kamal. (2015). Antibiotic resistance and extended spectrum beta-lactamases: Types, epidemiology and treatment. Saudi J Biol Sci, 22, 90–101.

Sharma, Verma, Advani, et al. (2017). Whole genome sequencing of Mycobacterium tuberculosis isolates from extrapulmonary sites. OMICS, 21, 413–425.

Smith. (2003). Mycobacterium tuberculosis pathogenesis and molecular determinants of virulence. Clin Microbiol Rev, 16, 463–496.

Sohn, Kim, Jung Kwon, Koh, and Shin. (2009). High virulent clinical isolates of Mycobacterium abscessus from patients with the upper lobe fibrocavitary form of pulmonary disease. Microb Pathog, 47, 321–328.

Tsirigotaki, De Geyter, Sostaric, Economou, and Karamanou. (2017). Protein export through the bacterial Sec pathway. Nat Rev Microbiol, 15, 21–36.

van Embden, Cave, Crawford, et al. (1993). Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: Recommendations for a standardized methodology. J Clin Microbiol, 31, 406–409.

Venter, Smith, and Payne. (2011). Proteogenomic analysis of bacteria and archaea: A 46 organism case study. PLoS One, 6, e27587.

Verma, Pinto, Patil, et al. (2017). Quantitative Proteomic and Phosphoproteomic Analysis of H37Ra and H37Rv Strains of Mycobacterium tuberculosis. J Proteome Res, 16, 1632–1645.

Viana-Niero, Lima, Lopes, et al. (2008). Molecular characterization of Mycobacterium massiliense and Mycobacterium bolletii in isolates collected from outbreaks of infections after laparoscopic surgeries and cosmetic procedures. J Clin Microbiol, 46, 850–855.

Viljoen, Dubois, Girard-Misguich, Blaise, Herrmann, and Kremer. (2017). The diverse family of MmpL transporters in mycobacteria: From regulation to antimicrobial developments. Mol Microbiol, 104, 889–904.

Wang, Pei, Huang, et al. (2013). The expression of ABC efflux pump, Rv1217c-Rv1218c, and its association with multidrug resistance of Mycobacterium tuberculosis in China. Curr Microbiol, 66, 222–226.

Wankhade, Ghadage, and Bhore. (2017). Breast abscess due to Mycobacterium abscessus: A rare case. Ann Trop Med Public Health, 10, 447–449.

Wood and Salzberg. (2014). Kraken: Ultrafast metagenomic sequence classification using exact alignments. Genome Biol, 15, R46.

Yu, Wagner, Laird, et al. (2010). PSORTb 3.0: Improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics, 26, 1608–1615.

Zerbino and Birney. (2008). Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res, 18, 821–829.

Zhang and Xie. (2011). Mammalian cell entry gene family of Mycobacterium tuberculosis. Mol Cell Biochem, 352, 1–10.

Zheng, Chen, Liu, et al. (2017). Proteogenomic analysis and discovery of immune antigens in Mycobacterium vaccae. Mol Cell Proteomics, 16, 1578–1590.

