Abstract
Objective:
The metabolic syndrome (MetS) is a description of a clustering of cardiometabolic risk factors in the same individual. This study searched for genetic loci associated with all five prespecified components of MetS to find a common pathophysiological link for this risk factor clustering.
Methods:
Using data from 291,107 individuals in the U.K. biobank, a genome-wide association study (GWAS) was performed versus each of the five components of the syndrome as continuous variables (glucose, systolic blood pressure, triglycerides, waist circumference, and high-density lipoprotein-cholesterol).
Results:
Using a false discovery rate <0.05, three loci were related to all five MetS components (rs7575523, nearest gene LINC01122; rs3936511, intron of C5orf67; and rs111970447, intron of GIP). Of those, C5orf67 seems the most interesting candidate for clustering of risk factors, since previous GWASs in other samples have identified this locus as being related to all five risk factors. Genetic loci related to the different combinations of four or three MetS components are also presented. Generally, each MetS component combination was related to a unique genetic profile, and the genetic overlap between these combinations was low.
Conclusion:
A genetic locus related to each of the five MetS components was discovered, making it a candidate for a common pathophysiological link for risk factor clustering. In addition, genetic loci related to different combinations of four or three MetS components are presented, and the genetic overlap between those combinations of MetS was low.
Introduction
It has been well recognized since the late 1980s that cardiovascular risk factors tend to cluster in certain individuals. This phenomenon was named “syndrome X” 1 or “the metabolic risk factor syndrome,” 2 but the name “metabolic syndrome” (MetS) later came into wide use. A number of pathologies have been suggested as the underlying mechanism for this clustering, such as insulin resistance, 1 visceral adiposity, 3 accumulation of liver fat, 4 –6 and a reduced capillary blood flow, 7 but the details of the underlying mechanism(s) are not well known.
This issue was addressed in a recent publication in which it was postulated that a common unifying mechanism of MetS would be related to each of the five individual components of MetS: glucose, high-density lipoprotein (HDL), blood pressure, waist circumference, and triglycerides. With this assumption, 249 proteins were tested for association with all five components, and 20 of the proteins were found to be associated with all of them. 8 However, since this was performed in a cross-sectional study, these relationships might be due to reverse causation, or to confounding by some other factor(s). From that perspective, it would be better to search for genetic associations, since relationships between genetic loci and traits are not subject to reverse causation or confounding to the same degree. Therefore, in the present study, the aim was to search for genetic loci that were related to each of the five MetS components, with the hypothesis that such genes could be the basis for the phenomenon of clustering of cardiometabolic risk factors.
Since the most commonly used definitions of MetS, the National Cholesterol Education Program (NCEP) criteria or the consensus criteria based on the NCEP definition, demand that at least three out of five criteria should be fulfilled for MetS, several facets of MetS could exist. One extreme could be the rather lean individual with diabetes, hypertension, and low HDL, and another extreme facet of MetS could be the obese subject with dyslipidemia only. If it is believed that genes are involved in the clustering of risk factors, it is likely that the genetic basis for these two extreme cases of MetS is not the same. It would therefore be of value to also investigate the genetic links to different facets of MetS.
Five genome-wide association studies (GWASs) have searched for genes linked to MetS as a binary trait, and 109 such genetic loci have been identified. 9 –12 The latest of these studies used the U.K. biobank (UKBB) in a similar manner as in the present study, and 93 independent loci were found in that particular study. 13 However, as pointed out in a review on genetics of MetS by Brown and Walker, 14 the top hits in those studies were seen for genetic variants in or close to lipid-regulating genes, such as APOB, LPL, CETP, APOA5, GCKR, and ZNF259. This might well be due to the fact that two of the five traits used in the definition of MetS are lipids (HDL and triglycerides). It is not obvious that those lipid genes are related to all five of the MetS components.
In the present study, data from the UKBB were used to search for genetic determinants of clustering of risk factors. First, separate GWASs were performed for the five different components of MetS. This has previously been performed for these different five traits, but in different samples, 15 –18 not in the same individuals. This novel approach will allow searching for loci linked to all five MetS components within the same individuals, an approach that will, in combination with the large sample size, increase the ability to find genetic determinants of clustering of risk factors. Second, it was investigated if different combinations of four or three MetS components were associated with different genetic loci. This has not previously been performed in a standardized manner.
Methods
Sample
UKBB is a large, prospective cohort study conducted in a multicenter across the United Kingdom (
Genetic analysis
Genome-wide genetic data were available for 488,377 UKBB individuals. Genotyping was performed with the U.K. BiLEVE and UKBB Axiom arrays, and genotypes were imputed with IMPUTE2 using both HRC and 1000 Genomes Phase 3 merged with the UK10K haplotype reference panels. Quality control of genetic markers consisted of tests for batch effects, plate effects, departures from Hardy–Weinberg equilibrium, sex effects, array effects, and discordance. 19
The March 2018 release of the imputed genetic marker data was used (including corrected imputations from the UK10K and 1000 Genomes Phase 3 reference panel). The sample was restricted to unrelated individuals with self-reported British descent and European ethnicity based on principal component analysis (n = 327,616). Furthermore, genetic markers with minor allele count <5822 (corresponding to a minor allele frequency of about 1% in the analyzed sample) or imputation quality <0.8 were excluded. For chromosomal positions, hg19 was used.
The study was conducted in 291,107 individuals with nonmissing data on outcomes, covariates, and genotype data.
Clinical and biochemical data
Serum glucose, HDL-cholesterol, and triglycerides were measured on a Beckman Coulter AU5800 by standard methods. Since samples were taken after different durations of fasting, glucose levels were adjusted down by 1.5 mmol/L if the reported fasting time was 0 hr, by 3.0 mmol/L if it was 1 hr, by 1.0 mmol/L if it was 2 hrs, and by 0.3 mmol/L if it was 3 hrs, with no correction if the fasting time was >3 hrs. For triglycerides, the levels were adjusted down by 0.1 mmol/L if the reported fasting time was 1 hr, and by 0.2, 0.4, 0.6, 0.65, 0.4, and 0.1 mmol/L for fasting times of 2 to 7 hrs, respectively. These adjustments are based on a literature search, as well as the authors' own experience of the response to a mixed meal. Blood pressure was measured twice in the sitting position with the automated Omron device.
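The fasting-time corrections described above can be written as a simple lookup. The following is a minimal sketch, not the authors' code: the function and table names are hypothetical, fasting time is assumed to be recorded in whole hours, and triglyceride values at fasting times not listed in the text (0 hr and >7 hrs) are assumed to receive no correction.

```python
# Offsets (mmol/L) subtracted from the measured value, keyed by fasting hours.
# Values are taken from the text; names are hypothetical.
GLUCOSE_OFFSETS = {0: 1.5, 1: 3.0, 2: 1.0, 3: 0.3}
TRIGLYCERIDE_OFFSETS = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.6, 5: 0.65, 6: 0.4, 7: 0.1}


def adjust_glucose(value_mmol_l, fasting_hours):
    """Subtract the fasting-time offset; no correction beyond 3 hrs."""
    return value_mmol_l - GLUCOSE_OFFSETS.get(fasting_hours, 0.0)


def adjust_triglycerides(value_mmol_l, fasting_hours):
    """Subtract the fasting-time offset; unlisted times are left unchanged
    (an assumption, since the text does not cover them)."""
    return value_mmol_l - TRIGLYCERIDE_OFFSETS.get(fasting_hours, 0.0)
```

For example, a glucose reading of 7.0 mmol/L taken 1 hr after a meal would be analyzed as 4.0 mmol/L under this scheme.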
The metabolic syndrome
The harmonized NCEP criteria for MetS were used to define the five components of the syndrome and prevalent MetS (binary). 20 At least three of the following five criteria should be fulfilled: blood pressure ≥130/85 mmHg or antihypertensive treatment, serum glucose ≥6.1 mmol/L or antidiabetic treatment, serum triglycerides ≥1.7 mmol/L, waist circumference >102 cm in men and >88 cm in women, and HDL-cholesterol <1.0 mmol/L in men and <1.3 mmol/L in women.
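As an illustration of how the binary definition could be applied to one participant, the sketch below evaluates the five criteria and counts them. All names are hypothetical, and the blood pressure criterion is assumed to mean systolic ≥130 mmHg or diastolic ≥85 mmHg, which is the usual reading of "≥130/85" but is not spelled out in the text.

```python
def mets_components(sbp, dbp, on_bp_treatment, glucose, on_glucose_treatment,
                    triglycerides, waist_cm, hdl, is_male):
    """Return which of the five MetS criteria are fulfilled (units as in the text)."""
    return {
        # Assumption: either systolic or diastolic elevation fulfils the criterion.
        "SBP": sbp >= 130 or dbp >= 85 or on_bp_treatment,
        "GLU": glucose >= 6.1 or on_glucose_treatment,
        "TG": triglycerides >= 1.7,
        "WC": waist_cm > (102 if is_male else 88),
        "HDL": hdl < (1.0 if is_male else 1.3),
    }


def has_mets(components):
    """MetS is present when at least three of the five criteria are fulfilled."""
    return sum(components.values()) >= 3
```

A man with blood pressure 135/80 mmHg, triglycerides 1.8 mmol/L, and waist circumference 104 cm would fulfil three criteria and be classified as having MetS, whatever his glucose and HDL values.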
Statistical methods
Five GWASs were performed using 9,463,307 genetic variants in linear regression models versus the five MetS components as continuous traits: systolic blood pressure, plasma glucose, HDL-cholesterol, serum triglycerides, and waist circumference. Ln-transformed values for glucose and triglycerides were used due to the skewed nature of the original distributions. These analyses were adjusted for age, sex, genetic analysis batch, and 20 principal components of population genetics. PLINK 2.0 was used for analysis. In those analyses, 10 mmHg was added to measured systolic blood pressure if on antihypertensive treatment and 1.5 mmol/L was added to glucose if on antidiabetic treatment to achieve adjustment for treatment.
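The phenotype preparation described here can be sketched as follows. This is an illustrative reading rather than the authors' pipeline: the function names are hypothetical, and the treatment offset for glucose is assumed to be applied on the original mmol/L scale before the ln-transformation, an ordering the text does not state.

```python
import math


def sbp_for_gwas(sbp_mmhg, on_antihypertensive):
    """Add 10 mmHg to measured systolic blood pressure when treated."""
    return sbp_mmhg + 10.0 if on_antihypertensive else sbp_mmhg


def glucose_for_gwas(glucose_mmol_l, on_antidiabetic):
    """Add 1.5 mmol/L when treated, then ln-transform (ordering is an assumption)."""
    adjusted = glucose_mmol_l + 1.5 if on_antidiabetic else glucose_mmol_l
    return math.log(adjusted)


def triglycerides_for_gwas(tg_mmol_l):
    """Ln-transform triglycerides to reduce the skew of the distribution."""
    return math.log(tg_mmol_l)
```

The resulting values would then enter the linear regression as outcomes, with age, sex, genotyping batch, and the 20 principal components as covariates.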
A power analysis showed that when performing a GWAS, the given sample size will result in a 90% power to detect an association with beta = 0.007 (for an increase in one allele, with the trait on a standard deviation [SD] scale) using a P value of 5 × 10−8.
The next step searched for genetic loci that showed a false discovery rate (FDR) <0.05 for all five MetS traits. The linkage disequilibrium (LD) independency of those genetic loci was evaluated by the clumping function in MRbase in R 3.5.
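The selection step can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which the text does not name, applied to each trait's GWAS separately, followed by an intersection across traits; the function names and input layout are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def variants_passing_all(pvalues_by_trait, variant_ids, threshold=0.05):
    """Variants whose adjusted P value is below the threshold for every trait.

    pvalues_by_trait maps a trait name to a list of P values aligned with
    variant_ids; the adjustment is done within each trait separately.
    """
    passing = set(variant_ids)
    for pvalues in pvalues_by_trait.values():
        qvalues = bh_qvalues(pvalues)
        passing &= {v for v, q in zip(variant_ids, qvalues) if q < threshold}
    return passing
```

Variants returned by `variants_passing_all` would then be clumped for LD independence, a step this sketch leaves out.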
As a secondary analysis, also genetic loci that showed an FDR <0.05 versus four components of the MetS, and P < 0.05 versus the fifth component (five such combinations) were investigated. In that context, also genetic loci that showed an FDR <0.05 versus three components of the MetS, and P < 0.05 versus the fourth and fifth component (10 such combinations) were investigated. These analyses were performed in STATA14.
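The five combinations of four components and the ten combinations of three components can be enumerated directly, and the stated rule checked per variant. The sketch below follows the Methods wording only (FDR <0.05 for the components in the combination, nominal P < 0.05 for the remaining ones); it does not add any exclusion of variants that also pass FDR for the remaining components, which the text does not specify. Names and the dictionary layout are hypothetical.

```python
from itertools import combinations

TRAITS = ("GLU", "SBP", "HDL", "TG", "WC")


def component_combinations(k):
    """All combinations of k MetS components."""
    return list(combinations(TRAITS, k))


def matches_combination(qvalues, pvalues, combo, fdr=0.05, alpha=0.05):
    """Check one variant against one combination.

    qvalues and pvalues map each trait name to that variant's FDR-adjusted
    and unadjusted P value, respectively.
    """
    rest = [t for t in TRAITS if t not in combo]
    return (all(qvalues[t] < fdr for t in combo)
            and all(pvalues[t] < alpha for t in rest))
```

Calling `component_combinations(4)` yields the 5 four-component combinations and `component_combinations(3)` the 10 three-component combinations mentioned in the text.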
Using the PhenoScanner homepage (
Results
Basic characteristics of the sample are given in Table 1. The genome wide-association analyses for the five MetS components are shown as Manhattan plots in Supplementary Figure S1.
Table 1. Basic Characteristics of the Sample (n = 291,107). Means and Standard Deviations (in Parentheses) or Proportions Are Given
BMI, body mass index; HDL, high-density lipoprotein; MetS, metabolic syndrome; SD, standard deviation.
Genes versus all five MetS components
Three independent loci were found to be related to each of the five MetS components with an FDR <0.05 (Table 2). None of these loci showed P < 5 × 10−8 versus all five MetS components.
Table 2. Genetic Loci Showing False Discovery Rate <0.05 for Relationships Versus All Five Metabolic Syndrome Components (High-Density Lipoprotein, Glucose, Waist Circumference, Triglycerides, and Systolic Blood Pressure as Continuous Variables)
Beta is listed first, standard error (in parenthesis listed second), and for each trait the P value is listed third. hg19 was used for positions. The betas (and standard errors) are given on an SD scale for each MetS component.
EAF, effective allele frequency; GLU, glucose; SBP, systolic blood pressure; TG, triglycerides.
One such locus was located in the intron region of C5orf67 (rs3936511). This protein-coding locus was most closely related to the HDL and TG components, and has previously been linked to triglyceride and HDL levels, 21 type 2 diabetes, 22 waist circumference, 23 fasting insulin, 24 trunk fat, and systolic blood pressure in other analyses of UKBB (
The second locus was intergenic with LINC01122 as nearest gene (rs7575523). This locus was most closely related to the waist circumference component. According to PhenoScanner, this locus has previously been linked to waist circumference, 23 and to a number of obesity-related traits in other analyses in UKBB. No eQTL was found for this locus in GTEx.
The third locus was in the intron region of GIP (rs111970447). This locus was most closely related to the glucose and SBP components. This locus was not found in PhenoScanner, GWAS catalog (
A search in the Ensembl database (
Regarding the locus in the GIP gene, reported in Table 2, no SNPs in close LD were reported in the Ensembl database.
Generally, the regression coefficients for the three major SNPs were in the 0.01 to 0.03 range, meaning that a one allele increase would correspond to a change in each trait by 1%–3% of the SD of that trait.
Genes versus combinations of four components
Most loci were found for the combination of high systolic blood pressure + high waist circumference + low HDL-cholesterol + high triglycerides (SBP_WC_HDL_TG) (n = 36). For the other four combinations of four components, 3–5 loci were identified (see Table 3 for the nearest gene, further details are found in Supplementary Table S1). There was no overlap between the loci found for the five different combinations of four MetS components.
Table 3. Nearest Genes for Loci Identified to Be Related to Four of the MetS Components, But Not the Fifth
The combinations are given in the first row.
FTO, fat mass and obesity-associated protein; WC, waist circumference.
Genes versus combinations of three components
As seen in Table 4, the highest number of identified loci for the combination of three MetS components was seen for WC_HDL_TG (n = 137), while for GLU_HDL_WC and GLU_SBP_HDL, only one locus was identified for each of these two combinations. Overlap of nine loci between combinations of three MetS components was found. These loci are given as the nearest gene in Table 4 (further details are found in Supplementary Table S1).
Table 4. Number of Loci Identified to Be Related to Three of the MetS Components, But Not the Other Two
Discussion
Three genetic loci were identified as being related to all five components of MetS at an FDR <0.05. Of those, a locus in the intron of the protein-coding gene C5orf67 was identified as a promising candidate for future studies on the molecular basis of clustering of cardiometabolic risk factors. Furthermore, distinct genetic associations were found for combinations of four or three MetS components.
Genetic loci and clustering of five risk factors
The primary aim of this study was to search for genetic loci linked to all components of MetS. In the past, GWASs have been performed for these different five traits, but in different samples, 15 –18 not in the same individuals. The novel approach of using a large sample, in which the genetic associations with these five traits were investigated in the same individuals, increases the probability of identifying such common genetic variants. No loci were found that showed P < 5 × 10−8 versus all five components, but three loci showed an FDR <0.05 versus all five components. Of those, a locus in the intron of the protein-coding gene C5orf67 was the most interesting finding, since previous GWASs have shown this locus to be related to all the components of MetS, as well as to coronary heart disease, although in different samples. This gene is linked to expression in several tissues of interest, but no further information is found on the function or structure of the protein. However, based on the present findings, it remains an interesting locus to investigate further as a candidate for the clustering of cardiometabolic risk factors in the same individual. Furthermore, in a search of this locus in previous GWASs of the three major cardiovascular diseases, the P values were 0.00046 versus coronary heart disease in CARDIOGRAMplusC4D, 25 0.014 versus heart failure in the HERMES consortium (T. Lumbers and A. Henry, pers. comm.), and 0.094 versus ischemic stroke in the METASTROKE consortium, 26 making this an interesting locus also from the cardiovascular disease perspective.
It should, however, be recognized that the degree of variation in the five MetS components explained by the C5orf67 locus is small. It is therefore not likely that any single gene would explain the clustering of risk factors; rather, this phenomenon is probably due to interactions between several genes. Nevertheless, the C5orf67 locus seems to be a promising locus to investigate further.
There are five previous GWASs on MetS as a binary trait. 9 –13 The latest of those used UKBB in a similar manner as in this study. Of the three loci identified as being related to all five MetS components in the present study, only C5orf67 was identified as a GWAS hit in the previous studies using MetS as a binary trait.
The locus identified on chromosome 2 with nearest gene LINC01122 has previously been linked to abdominal obesity. The locus identified in the intron part of the GIP gene has previously been associated with several MetS components. Therefore, these genetic regions are of interest to study further regarding the clustering of cardiometabolic risk factors.
Genetic loci and different facets of MetS
As a secondary aim, it was also investigated which genetic loci were linked to different presentations of MetS, since 5 combinations of 4 risk factors and 10 combinations of 3 risk factors exist, and all of those are defined as MetS. This analysis further emphasized that MetS has different genetic facets according to the components present, since no overlap existed between the loci identified for the five different variants of having four out of the five MetS components, and only a limited overlap existed between the loci identified for triplets of MetS components.
Adjustment for multiple tests
In the present study, no single genetic variant showed P < 5 × 10−8 versus all five MetS components. The somewhat more relaxed FDR was therefore used to search for genetic loci linked to all components of MetS. Since the FDR-based approach to compensate for multiple testing would result in more significant findings compared with the Bonferroni-based approach, and since there also is some controversy regarding the validity of FDR in the presence of nonindependent tests, the results in the present study have to be replicated in another sample to be regarded as valid.
The FDR approach used to adjust for multiple testing was used for each of the five MetS components separately. No further adjustment for the fact that five traits were investigated was performed since an overlap between certain genetic loci for pairs of MetS components, for example, the fat mass and obesity-associated protein (FTO) loci for waist circumference and diabetes (in GIANT and DIAGRAM, according to PhenoScanner), is known and therefore further adjustment might well result in an overadjustment.
Strengths and limitations
The major strength of the present study is the large sample size that allows for a search of genetic loci related to all components of MetS with a threshold of FDR <0.05 within the same individuals.
A limitation is that glucose and triglycerides were not measured in the fasting state in all individuals (HDL is not affected by fasting to a major degree). However, a correction based on the fasting time was used to account for that issue. It is therefore reassuring that the most interesting locus, C5orf67, has previously been linked to diabetes and high triglycerides 21 in studies with fasting subjects. Thus, misclassification due to the different times of fasting does not seem to be a major issue, and could, if anything, only drive the hypothesis testing toward the null and produce false negative findings.
Another limitation is that no large independent sample was used for replication of the findings, since no other large sample has performed GWASs of all five components of MetS in the same individuals. Therefore, the findings need replication in another large sample to be conclusive.
Around 90% of the participants with self-reported British descent and European ethnicity in UKBB had data on the five components of MetS. Since the proportion of missing data on the risk factors is quite small and since no specific cause of why risk factor data are missing in some individuals is known, it is assumed that the present sample is representative of the UKBB sample with self-reported British descent and European ethnicity, and that no selection bias is present from that perspective.
In conclusion, a genetic locus in an intron of C5orf67 was found to be related to each of the five MetS components, making it a candidate for a common pathophysiological link for risk factor clustering. In addition, genetic loci related to the different combinations of four or three MetS components were presented, and the genetic overlap between those combinations of MetS was low.
Footnotes
Acknowledgment
This research has been conducted using the U.K. Biobank Resource under application no. 13721.
Author Disclosure Statement
No conflicting financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Figure S1
Supplementary Table S1
