Cluster analysis of milk fat yield trait in dairy cows using meta-analysis of the genome-wide association studies

Document Type : Research Paper

Authors

1 Department of Animal Sciences, Faculty of Agriculture, Ferdowsi University of Mashhad, Mashhad, Iran.

2 Department of Animal Science, Faculty of Agriculture, Ferdowsi University of Mashhad, Mashhad, Iran.

Abstract

Introduction: Milk is one of the most important economic products for any country, and milk fat strongly affects the taste of milk and dairy products. In ruminants, including dairy cows, the liver plays an important role in the metabolism of carbohydrates, fats, vitamins, and hormones. Nutrients absorbed from the gastrointestinal tract pass through the liver, enter the blood circulation, and eventually reach the mammary glands. The liver therefore plays an essential role in lactation and milk fat production. All components that determine milk quality can be considered quantitative traits controlled by many genes and influenced by environmental factors. Genetic markers that explain a significant part of the variation are ideal candidates for genomic selection. Previously, microsatellite markers were used to identify quantitative trait loci (QTL). Now, single-nucleotide polymorphisms (SNPs) are used in genome-wide association studies (GWAS) to identify QTL. In dairy cows, some major genes with significant effects on milk fat have been identified in previous GWAS. Given the large number of GWAS in dairy cows, their results can be combined using meta-analysis to achieve higher statistical power. Meta-analysis is a statistical analysis that combines the results of multiple scientific studies. These studies contribute to our current understanding of the genetic regulation of milk fat yield, and combining them provides a better understanding of the genetic architecture of complex traits. Network clustering and cluster identification are important tools in the structural analysis of networks. Many clustering algorithms of different types are used to analyze protein-protein interaction (PPI) networks. In this study, we used the MCODE algorithm to identify dense regions in the PPI network.
The overall purpose of PPI network clustering is to group genes or proteins that are related according to some measure. A PPI network contains proteins that play roles in different pathways, and these genes or proteins are clustered according to a similarity metric, expressed as a distance matrix. Predicting molecular assemblies from protein interaction data is also important because it provides another level of functional annotation. Overall, the purpose of this study was to combine a meta-analysis of GWAS with cluster analysis to identify genes that affect milk fat yield in dairy cows.
Material and methods: The data used in this study were GWAS summary data collected from 19 studies published from 2010 to 2019. The sources included research papers and dissertations (only valid dissertations with published papers). All available genes were combined, synthesized, and evaluated using a meta-analysis method. Cytoscape v3.7.2 was used to analyze and visualize the genes with the STRING v1.5.0 plugin and to extract clusters with the MCODE v1.5.1 plugin. In this way, the GWAS summary results were integrated into molecular PPI networks, which play a significant role in increasing the power of association studies to identify genes affecting milk fat yield. In addition, the DAVID server was used to identify gene ontology (GO) term enrichment, in order to detect enriched biological terms associated with genomic regions, and to identify gene networks using functional annotation clustering tools based on enriched pathway analysis.
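The idea behind GO term enrichment can be illustrated with a plain hypergeometric test. The sketch below is only an illustration under stated assumptions: it is not the statistic computed by the DAVID server, and the function name and the example counts are hypothetical rather than taken from this study.

```python
from math import comb


def enrichment_p_value(list_hits, list_size, background_hits, background_size):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    list_hits:        genes in the study list annotated to the GO term
    list_size:        total genes in the study list
    background_hits:  genes in the background annotated to the GO term
    background_size:  total genes in the background
    """
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not results from this study).
p = enrichment_p_value(list_hits=8, list_size=213,
                       background_hits=150, background_size=20000)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value indicates that the term is annotated to more genes in the list than would be expected by chance, which is the sense in which a term is called enriched.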
Results and discussion: In total, 223 genes were analyzed using the STRING plugin in Cytoscape. These genes were associated with at least one other gene and had a direct and partial correlation. The gene network created for milk fat yield included 213 genes (nodes) and 219 edges (gene connections). The P-value calculated for the STRING network was statistically significant for enriched pathways in the PPI network. The set of important and prominent genes was evaluated using the MCODE plugin. Seven clusters were identified and grouped in this network. For instance, cluster 1 included the proteins encoded by the ARHGAP39, CPSF1, CYHR1, PPP1R16A, GRINA, MROH1, and SMPD5 genes. As shown in Table 2, cluster 1 (score = 7) contained 7 nodes connected by 21 edges. The proteins in this cluster play important roles in the internal space of the endoplasmic reticulum (cellular component), metal ion binding (molecular function), and integral to the membrane (cellular component). CPSF1, CYHR1, and GRINA are the major genes involved in the internal space of the endoplasmic reticulum, metal ion binding, and integral to the membrane, respectively. Clusters 1 and 2 had the highest scores among all reported clusters.
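The score reported for cluster 1 can be checked with a short calculation. The sketch below assumes that the score follows the definition in the original MCODE publication (subgraph density multiplied by the number of nodes, with self-loops excluded) and that the figures for cluster 1 correspond to 7 nodes and 21 edges. The function names are hypothetical.

```python
def cluster_density(num_nodes, num_edges):
    """Fraction of possible edges present in a subgraph without self-loops."""
    if num_nodes < 2:
        raise ValueError("density needs at least two nodes")
    max_edges = num_nodes * (num_nodes - 1) // 2
    return num_edges / max_edges


def mcode_score(num_nodes, num_edges):
    """Cluster score, assumed here to be density times the number of nodes."""
    return cluster_density(num_nodes, num_edges) * num_nodes


# Cluster 1, assuming 7 nodes and 21 edges: every pair of nodes is connected.
print(mcode_score(7, 21))  # 7.0
```

Under these assumptions, a cluster of 7 nodes in which all 21 possible pairs are connected has density 1 and score 7, which agrees with the reported score.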
Conclusion: Graph-based protein clusters were extracted from the PPI network using the MCODE algorithm, and the enriched pathways were obtained with the DAVID tool. This method characterizes the proteins involved in fat yield and helps to understand their molecular structure. Built on existing biological knowledge, these clusters can help data mining and systems models capture network interactions and pathways. These protein clusters provide deep insight into how genes interact with each other in network analysis for fat yield. In addition, meta-analysis of GWAS summary data can play an important role in the broader understanding of network visualization and cluster analysis of the genes identified in enriched pathways. Therefore, cluster analysis can improve the power to identify genes for economically important traits such as milk fat yield in a population of dairy cows, and it can be used in future genomic evaluations and breeding programs.
