Abstract
Background: Lactase nonpersistence (LNP) is characterized by the decrease in lactase expression in the small intestine. Studies have shown that −13910 C>T and −22018 G>A single-nucleotide polymorphisms (SNPs) located upstream of the lactase gene are associated with an LNP/lactase persistence (LP) trait. Objective: The current study evaluated the LP allelic frequency in 227 healthy Indian subjects consisting of North Indians, Maharashtrians, Gujaratis, Parsis, and South Indians, and for the first time assessed its relation with milk consumption pattern in Indian subjects. Methods: The two SNPs were genotyped using the polymerase chain reaction and restriction fragment length polymorphism methods. The milk consumption pattern for the studied subjects was noted by questionnaire. Results: The two SNPs were present in a strong linkage disequilibrium. LP prevalence varied in these Indian regional groups. The LP frequency was highest for North Indians and lowest for Parsis (p=0.03 CC vs. CT+TT, p=0.008 GG vs. GA+AA). South Indians had a lower LP frequency compared to North Indians (p=0.07 for each SNP). The milk consumption pattern varied in these Indian subgroups, with the Gujaratis exhibiting the highest milk intake and Parsis the lowest (p=0.04). Conclusion: Our study indicates that the milk intake in Indians might be influenced by their dietary habits in addition to their ancestral history. An overall correlation, however, between milk consumption and LP genotypes was not observed.
Introduction
In LNP, unabsorbed lactose is fermented by colonic bacteria (Swallow, 2003; Heyman, 2006), producing symptoms of lactose intolerance such as diarrhea, bloating, flatulence, nausea, and abdominal pain (Vesa et al., 2000). Lactose intolerance may be due to primary or secondary hypolactasia. Milk is an important component of the daily diet, yet lactose-intolerant individuals need to avoid milk and other dairy products. Because milk is an excellent source of calcium, such individuals become more susceptible to reduced bone mineral density and an increased risk of bone fractures (Newcomer et al., 1978; Obermayer-Pietsch et al., 2004). However, when lactose intolerance results from secondary hypolactasia (celiac disease, Crohn's disease, or giardiasis), lactase deficiency can be controlled or reversed by appropriate diagnosis and management of the underlying disorder, and unnecessary avoidance of milk may be ill advised. Hence, differentiating primary from secondary hypolactasia is important to ensure that the nutritional benefits of milk and dairy products are not unduly compromised.
LNP is genetically determined and is inherited as an autosomal recessive trait (Sahi et al., 1973). The minor alleles of two single-nucleotide polymorphisms (SNPs) −13910 C>T and −22018 G>A, upstream of the initiation codon of the LCT gene, are associated with the LP trait (Enattah et al., 2002). Functional studies have shown that −13910 C>T is involved in transcriptional regulation of the LCT gene expression (Kuokkanen et al., 2003). In vitro studies have shown that the −13910*T allele has increased enhancer activity (Olds and Sibley, 2003; Troelsen et al., 2003) and facilitates binding of the transcription factor OCT-1, thereby enhancing lactase promoter activity (Lewinsky et al., 2005). The −13910CT and −13910TT genotypes are associated with an LP with high lactase activity (>10 U/g protein), whereas the −13910CC genotype is associated with an LNP with low lactase activity (<10 U/g protein) (Enattah et al., 2007a; Kuchay et al., 2011). Functional studies on −22018 G>A have demonstrated that it does not play a significant role in enhancing LCT gene expression (Olds and Sibley, 2003; Troelsen et al., 2003), but is in linkage disequilibrium (LD) with −13910 C>T in another population (Rasinperä et al., 2004). However, recent studies in Northern Chinese (Xu et al., 2010) and Japanese-Brazilian populations (Mattar et al., 2010) have shown that −22018 G>A is a better predictor of the LP/LNP phenotype than −13910 C>T. Population studies have also revealed a correlation between −13910 C>T and phenotypic determinants of adult-type hypolactasia such as breath hydrogen (Büning et al., 2005; Krawczyk et al., 2008; Mattar et al., 2008; Mottes et al., 2008) and lactose tolerance tests (Ridefelt and Håkansson, 2005). Genetic studies have reported a variation in the frequencies of the −13910*T and −22018*A alleles in populations across the globe (Bersaglieri et al., 2004; Enattah et al., 2007b; Mattar et al., 2009). 
However, other SNPs such as −13907 C>G, −13915 T>G, and −14010 G>C were also reported to be associated with the LP trait in African populations, where −13910 C>T could not predict the LP phenotype (Mulcare et al., 2004; Tishkoff et al., 2007).
Analysis using mitochondrial and chromosomal markers has shown that regional subgroups in India exhibit a large genetic diversity besides substantial cultural diversity (Majumder, 1998; Basu et al., 2003; Vishwanathan et al., 2004). A recent study on the Indian population reported a low frequency of the −13910*T allele in South relative to North Indians (Babu et al., 2010). Another study showed that the distribution of the −13910*T allele exhibits a Northwest-to-Southeast decline (Romero et al., 2012). Although a trend in the LP allelic distribution is observed in the Indian subpopulation, the influence of LP genotypes on the milk-drinking pattern is unknown.
In view of this, the objective of the present study was to determine the frequency of milk consumption in five Indian regional groups, and for the first time assess the relationship of the milk consumption pattern with the LP/LNP genotypes in an Indian cohort. We also estimated the allelic frequencies and LD of the −13910 C>T and −22018 G>A SNPs in these Indian regional groups.
Materials and Methods
Enrollment of healthy subjects
Two hundred and twenty-seven healthy asymptomatic adult subjects belonging to five Indian regional groups, namely Maharashtrian (n=61), Gujarati (n=56), Parsi (n=23), North Indian (n=39), and South Indian (n=48), enrolled at the Cumballa Hill Hospital and Heart Institute, Mumbai, were included with their written informed consent for genetic analysis. The study group comprised subjects between 19 and 78 years of age (mean age±standard deviation 45.2±15.3 years): 158 men (mean age: 44.3±15.9 years) and 69 women (mean age: 47.4±13.7 years). The milk consumption pattern was recorded for Maharashtrians (n=41), Gujaratis (n=121), Parsis (n=25), North Indians (n=32), and South Indians (n=40). The study participants were interviewed, and their demographic data, personal medical history, current medication, and milk-drinking habits were recorded. Those who consumed 200 mL or more of milk at least once a week were considered milk drinkers. Those who never took milk were grouped as nonmilk drinkers. None of the study subjects had a history of celiac disease, Crohn's disease, giardiasis, or other conditions associated with secondary hypolactasia. A sample of venous blood was collected in EDTA-anticoagulant tubes (Vacuette®; Greiner, Bio-One GmbH) for the extraction of genomic DNA. The study was approved by the Institutional Ethics Committee at the Cumballa Hill Hospital and Heart Institute.
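The questionnaire rule above can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name and its two inputs are hypothetical, the 200 mL threshold is read here as a per-serving volume, and the text does not say how subjects with intermediate intake were handled, so the sketch leaves them unclassified.

```python
from typing import Optional


def classify_milk_habit(volume_ml: float, servings_per_week: float) -> Optional[str]:
    """Apply the stated rule: >=200 mL at least once a week, or never.

    Hypothetical helper; the input names and the per-serving reading of
    the 200 mL threshold are assumptions, not taken from the study.
    """
    if servings_per_week == 0:
        # "Those who never took milk" are nonmilk drinkers.
        return "nonmilk drinker"
    if volume_ml >= 200 and servings_per_week >= 1:
        # 200 mL or more, at least once a week.
        return "milk drinker"
    # Intermediate intake is not covered by the stated rule.
    return None


print(classify_milk_habit(250, 7))
print(classify_milk_habit(0, 0))
print(classify_milk_habit(100, 3))
```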
Genotyping of −13910 C>T and −22018 G>A
Genomic DNA was extracted by a standard salting-out procedure, and its yield and purity were determined spectrophotometrically. The DNA fragments flanking the SNPs of interest were amplified by polymerase chain reaction (PCR) as previously described (Matthews et al., 2005). For genotyping of the −13910 C>T and −22018 G>A polymorphisms, the PCR products of 420 and 402 bp were digested overnight at 37°C with the restriction enzymes FaqI(BsmFI) (2 U) and HhaI (5 U) (Fermentas Life Sciences), respectively. The digested products were analyzed by electrophoresis on ethidium bromide-stained 3% agarose gels.
The −13907 C>G polymorphism reported in an African population falls in the restriction site recognized by the restriction enzyme FaqI(BsmFI). A substitution at this position can thus produce erroneous results while genotyping the −13910 C>T SNP using the PCR-restriction fragment length polymorphism method. Earlier studies have shown that −13910 C>T and −22018 G>A are in strong LD in some populations. Therefore, in the current study, DNA sequencing of PCR products of the 12 discordant samples, that is, 11 CC/GA and 1 CT/AA genotypes, was carried out to rule out the presence of −13907 C>G downstream of −13910 C>T that might interfere with the allele call at this locus, thereby affecting the level of LD.
Statistical analysis
The allelic frequencies were calculated by direct counting. Differences in allelic frequency among regional groups and in the milk consumption pattern among genotype groups were analyzed by the chi-square test. LD for the two diallelic loci was determined by r² LD statistics (Hill, 1974).
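Direct counting and a correlation-based LD measure can be illustrated with a short sketch. The genotype counts and dosage lists below are hypothetical, not study data. The correlation of minor-allele dosages is used only as a simple stand-in; it is not the estimator of Hill (1974).

```python
from math import sqrt


def allele_frequencies(n_hom_major, n_het, n_hom_minor):
    """Return (major, minor) allele frequencies by direct counting."""
    total_alleles = 2 * (n_hom_major + n_het + n_hom_minor)
    # Each minor homozygote carries two minor alleles, each heterozygote one.
    minor = (2 * n_hom_minor + n_het) / total_alleles
    return 1.0 - minor, minor


def dosage_correlation(dosages_a, dosages_b):
    """Pearson correlation between minor-allele dosages (0, 1, 2) at two loci.

    Assumes both loci are polymorphic in the sample (nonzero variance).
    """
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    return cov / sqrt(var_a * var_b)


# Hypothetical counts: 80 CC, 18 CT, 2 TT.
major, minor = allele_frequencies(80, 18, 2)
print(f"C = {major:.2f}, T = {minor:.2f}")

# Hypothetical per-subject dosages of the T and A alleles.
t_dosage = [0, 0, 0, 0, 1, 1, 2, 0]
a_dosage = [0, 0, 0, 1, 1, 1, 2, 0]
r = dosage_correlation(t_dosage, a_dosage)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

Squaring the dosage correlation gives a value on the same scale as r², but it should be read as a rough indicator only.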
Results
The genotype distributions and allele frequencies for the −13910 C>T and −22018 G>A polymorphisms in the different regional groups are given in Tables 1 and 2, respectively. The genotype distribution for the −22018 G>A SNP was in Hardy-Weinberg equilibrium (p=0.1), whereas that for the −13910 C>T polymorphism deviated from it (p=0.02). The correspondence between genotypes at the two loci was 94.7%, and the r-value for LD (Pearson's correlation coefficient) was 0.94, indicating strong LD between the two diallelic loci in the Asian Indian population. The frequency of LNP was comparable in men (81%) and women (81.2%), with no statistically significant difference (p=0.97). Sequence analysis of the 12 discordant samples did not reveal the −13907 C>G, −13915 T>G, or −14010 G>C polymorphisms, which were reported in an African population. The sequencing results also confirmed the genotyping results for the 12 discordant samples.
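A Hardy-Weinberg check of the kind reported above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test. The study does not state which test was used, so this is an assumption, and the genotype counts in the example are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Assumes both alleles are observed, so no expected count is zero.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A by direct counting
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts showing a heterozygote deficit.
chi2, p_value = hwe_chi_square(30, 40, 30)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

With small expected counts for the rare homozygote, an exact test may be preferable to this approximation.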
Notes to Tables 1 and 2: CC and GG, lactase nonpersistence; CT, TT and GA, AA, lactase persistence; -, genotype not detected in the regional group; SNP, single-nucleotide polymorphism.
The overall prevalence of LNP, defined by the −13910CC genotype, in the study population was 81.1%. The difference in the proportion of LNP genotypes (CC versus CT+TT and GG versus GA+AA) between North and South Indians was borderline significant (p=0.07 for each SNP). The frequency of LP was highest for North Indians and lowest for Parsis (p=0.03 for CC versus CT+TT, p=0.008 for GG versus GA+AA). The −13910TT genotype occurred in North Indians, Maharashtrians, and Gujaratis, but not in South Indians or Parsis. The genotype distribution at the two loci was comparable for Maharashtrians and Gujaratis, both originating in the western region of India. The proportion of GG versus GA+AA genotypes was higher for Parsis, who also belong to the western part of the country, than for Maharashtrians (p=0.05).
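The group comparisons above (for example, CC versus CT+TT in two regional groups) reduce to a 2×2 contingency table. A minimal sketch follows, assuming the uncorrected Pearson chi-square; the text does not say whether a continuity correction was applied. The counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]; returns (chi2, p_value) with 1 degree of freedom.

    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows are two regional groups,
# columns are CC and CT+TT genotypes.
chi2, p_value = chi_square_2x2(20, 10, 10, 20)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

For small groups, where expected cell counts fall below about 5, Fisher's exact test is a common alternative to this approximation.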
Forty-seven percent of the study subjects were milk drinkers (Fig. 1). Milk drinking was more common among Gujaratis than Parsis (p=0.04), both ethnic groups hailing from Western India. Across the whole study group, more individuals with LP genotypes (CT and TT) reported drinking milk (Table 3, Fig. 2), but the difference did not reach statistical significance.

Fig. 1. Milk consumption pattern (%) in Indian regional groups. The milk consumption frequency varied between regional groups, but the difference was statistically significant only between the Gujarati and Parsi groups (p=0.04).

Fig. 2. Comparison of the distribution of milk drinkers and nonmilk drinkers with lactase persistence (LP) and lactase nonpersistence (LNP) genotypes.
a, lactase nonpersistence genotype; b, lactase-persistent genotype.
Discussion
India has a genetically heterogeneous population owing to limited inter-regional admixture, and its regions also differ in cuisine, among other variations. Our study randomly sampled the regional populations on the basis of a linguistic classification, irrespective of caste or tribe within the community. The overall frequency of LNP in the present study was 81.1%. Previous Indian studies based on genetic testing analyzed only −13910 C>T, whereas the present study also examined −22018 G>A, which has been reported to be associated with LP (Mattar et al., 2010; Xu et al., 2010). The two SNPs, −13910 C>T and −22018 G>A, were, however, in strong LD. No association between gender and LNP status was observed in the current study. The sequence analysis did not reveal the −13907 C>G, −13915 T>G, or −14010 G>C SNPs, which were earlier reported in Africa (Tishkoff et al., 2007). This indicates that these variants may not be responsible for the LP/LNP trait in the studied Indian groups.
The current study is the first, to our knowledge, to analyze whether the milk intake by Indian subjects was influenced by the LP/LNP genotypes. Also, this is the first study to highlight the frequency of LP alleles in Indian Parsis.
The overall milk consumption pattern varied among the studied Indian regional groups. The Gujaratis exhibited the highest milk intake frequency, followed by North Indians, Maharashtrians, South Indians, and Parsis. The frequency of the LP alleles −13910*T and −22018*A was also higher in the Indo-European groups (North Indians, Gujaratis, and Maharashtrians; Basu et al., 2003) than in South Indians and Parsis. The low milk intake and LP allelic frequencies of South Indians may be due to their Dravidian origin, with its nonmilk-drinking history. The difference in allelic frequencies may reflect the high prevalence of endogamy in the Indian population, which contributes to the accumulation of mutations in the gene pool of each group and thereby enables each group to maintain its own genetic identity. The declining trend from North to South India is in accordance with previous findings (Babu et al., 2010; Romero et al., 2012). The −22018*A allele exhibited the same trend. However, Parsis, who belong to the western part of India, displayed the lowest LP allele frequency and were an exception to this declining trend in our study. A possible explanation is that their ancestors were of Iranian origin and migrated to Western India over 1000 years ago (Singh et al., 2004), and that the community has since maintained its own genetic and ethnic identity. It should be noted that the Parsi and North Indian samples in the present study were small, and hence a similar study with a larger sample size is needed to confirm the results.
The difference in the observed milk consumption pattern can be attributed to the ancestral history of the regional groups and the regional cuisine they have adopted. Milk is a major source of protein for vegetarians, whereas nonvegetarians can also derive protein from meat. This supposition may explain the high frequency of milk consumption in Gujaratis, who belong to the Indo-European linguistic group (Reich et al., 2009) and follow a lactovegetarian diet. Parsis, on the other hand, who are of Iranian origin, follow a nonvegetarian dietary pattern and thus may not depend on milk as their protein source. This may explain why the milk consumption frequency was lowest in this group. It should be noted that the difference in milk consumption between Gujaratis and Parsis reached statistical significance only after the sample size of the two groups was increased. However, genotype data were not recorded for the additional subjects, for whom only milk consumption data were available. Hence, a similar study with a larger sample size is warranted to confirm the above hypothesis.
No correlation between the milk consumption pattern and −13910 C>T was observed. There are two possible explanations for this finding. First, individuals with LNP can consume small quantities of milk without developing symptoms (Vesa et al., 1996). Similar findings were reported in studies carried out in Estonia and Northwest Russia (Lember et al., 2006; Khabarova et al., 2009). Second, the lack of correlation could be due to a low frequency of lactose-intolerant individuals within the LNP group, which was not accounted for in this study. Lactose intolerance develops in LNP individuals primarily when unabsorbed lactose is fermented by colonic bacteria. However, LNP individuals in the current study group may not develop gastrointestinal symptoms after milk ingestion. Hence, the lactose intolerance status should be determined along with the genetic make-up of an individual to decide whether that individual can consume milk. A study along similar lines to validate the relationship of the frequency and quantity of milk intake with the genotype and phenotype of lactose intolerance is underway in our laboratory.
Clinically, −13910 C>T detection has an advantage over the currently used diagnostic tools such as breath hydrogen test, lactose tolerance test, and lactase activity measurement in biopsy samples of small intestinal mucosa due to its high sensitivity and specificity to detect LNP (Rasinperä et al., 2004). A build-up of the information base using the genetic test for the Indian population along with their milk and dairy product intake data would facilitate planning of appropriate dietary intervention strategies to prevent lactose intolerance by identifying those at risk. The genetic test will also facilitate differential diagnosis of patients with primary and secondary hypolactasia to enable their better management. Thus, the −13910 C>T polymorphism can be used as a robust marker for determining the LNP status in the Indian population. LNP has also been reported to be associated with an increased risk of colorectal cancer (Rasinperä et al., 2005) and bone loss in young adults in some populations (Laaksonen et al., 2009). It also remains to be investigated if these findings hold true in the Indian population.
In summary, the current study shows a declining trend in the LP allelic frequency from North to South India, with the exception of the Parsi group, who belong to Western India. The pattern of milk intake varied among the studied regional groups, reflecting the influence of regional cuisine along with ancestral history. However, a correlation between LP genotypes and the milk consumption pattern was not observed in this study.
Acknowledgments
The authors would like to thank the Baun Foundation Trust at the Cumballa Hill Hospital and Heart Institute, Mumbai, and University Grants Commission, India, for their financial assistance for the research work.
Author Disclosure Statement
No competing financial interests exist.
