Academic Editor: Li Xia, Department of Medicine, Stanford University School of Medicine
Checked for plagiarism: Yes
Review by: Single-blind
Bioinformatic Resources for Diabetic Nephropathy
The number of individuals with diabetes is increasing worldwide and a large subset of those affected will develop diabetic nephropathy. Diabetic nephropathy is the leading cause of end-stage renal disease, has serious health consequences for affected individuals, and represents a major monetary cost to healthcare providers.
Technological and analytical developments have enabled large-scale, collaborative studies that are revealing risk factors associated with diabetic nephropathy. However, much of the inherited predisposition and many of the biological mechanisms underpinning risk of this disease remain to be identified. Meta-analyses and integrated pathway studies are becoming an increasingly important part of research for diabetic nephropathy, including genetic, epigenetic, transcriptomic and proteomic research, clinical observations and the development of animal models.
This report highlights current bioinformatic resources and standards of reporting to maximise interdisciplinary research for diabetic nephropathy. The identification of an -Omics profile that can lead to earlier diagnosis and/or offer improved clinical evaluation of individuals with diabetes would not only provide significant health benefits to affected individuals, but may also have major utility for the efficient use of healthcare resources.
Diabetes is a major public health concern, with rates increasing globally and approximately 40% of affected individuals developing diabetic nephropathy 1, 2, 3, 4. Diabetic nephropathy is the leading cause of end-stage renal disease and represents a substantial cost to healthcare providers 5, 6. Strategies that can help predict those individuals at higher risk of developing diabetic nephropathy, improve understanding of the pathogenesis of this disease, or suggest novel targets for optimised therapies are urgently required. With the increasing size and complexity of research studies, bioinformatics has become an essential discipline to help unravel the biological mechanisms that lead to diabetic nephropathy and end-stage renal disease in individuals with diabetes. Clinically-based resources such as routine laboratory measurements, hospitalisation records, treatment regimens and patient outcomes may help inform strategic planning, change healthcare policy, and contribute to ‘basic’ research discoveries 7, 8. Epidemiological studies confirm that inherited risk factors influence the development and progression of diabetic nephropathy; however, identifying clinically useful biomarkers and effective therapies is proving to be considerably more challenging. Recent technological advances enable cost-effective investigations of functional risk factors for diabetic nephropathy, including genetic, epigenetic, transcriptomic, proteomic and metabolomic pathways, coupled with data from clinical observations and animal models of diabetic kidney disease. Analysing integrated networks and pathways from rich and diverse data sources, often using systems biology-based approaches, is becoming an important component of diabetic nephropathy research.
Genetic epidemiology is moving away from single SNP studies, towards an emphasis on the comprehensive analysis of candidate genes 9,10 or systematic literature reviews and meta-analyses 11,12,13,14,15. Several genome-wide association studies (GWAS) have been performed for diabetic nephropathy 16,17,18,19,20,21, but only two independent GWAS datasets are publicly available via dbGaP: (i) the GoKinD US 19 and (ii) the All Ireland-Warren3-GoKinD UK 21 collections. Recently, the GENIE consortium completed the first meta-analysis of GWAS for diabetic nephropathy, with subsequent replication in more than 12,500 individuals 21. Ongoing projects involve more comprehensive association studies in larger discovery cohorts, together with deep next-generation resequencing to identify more elusive rare variants that may contribute to diabetic nephropathy. These types of studies maximise the chance of finding true genetic signals that influence risk of diabetic kidney disease, or the more extreme end-stage renal disease in individuals, but pose substantial challenges in terms of archiving the data so that they are usefully accessible to other researchers. ‘Raw’ datasets that are available to bona fide researchers are ideal in that they facilitate downstream analyses by whichever methods are most appropriate for individual applications (Table 1). Several older resources such as T2D-db 22 (http://t2ddb.ibab.ac.in, last updated for type 2 diabetes in 2009) and corgi 23 (http://go.qub.ac.uk/kidney-corgi, last updated for kidney genes in 2011) also contain useful data that support and promote interdisciplinary research.
Table 1. Web-based resources
|GUDMAP GenitoUrinary Development Molecular Anatomy Project||Curated gene expression datasets in development; transgenic mice||www.gudmap.org|
|KUPKB The Kidney and Urinary Pathway Knowledge Base||-Omics datasets from scientific publications and other renal databases||http://www.kupkb.org|
|Nephromine||Comprises renal gene expression profiles||www.nephromine.org|
|T1DBASE Type 1 Diabetes Database||Curated, integrated datasets informing genetics across species||www.t1dbase.org|
|DiaComp Diabetic Complications Consortium||Data on animal models for diabetic complications, including nephropathy||www.diacomp.org|
|dkCOIN National Institute of Diabetes, Digestive and Kidney Diseases (NIDDK) Consortium Interconnectivity Network||Toolkit of interconnected resources (datasets, reagents, and protocols) generated from individual consortia||www.dkcoin.org|
|dbGaP: Genotype-phenotype association studies||http://www.ncbi.nlm.nih.gov/projects/gap|
|phs000088.v1.p1|
|phs000018.v1.p1|
|Susceptibility Genes for Diabetic Nephropathy in Type 1 Diabetes (GoKinD study participants and parents), NIDDK||Case-control study for nephropathy in type 1 diabetes with 1825 participants using Affymetrix 500K set|
|phs000302.v1.p1 Genetic Study on Nephropathy in Type-2 Diabetes||Case-control study for nephropathy in type 2 diabetes with 350 participants using Illumina 370CNV array|
|phs000333.v1.p1 Family Investigation of Nephropathy and Diabetes (FIND) Study||Case-control study for nephropathy in type 2 diabetes with 2622 participants using Affymetrix 6.0|
|phs000389.v1.p1 GENIE UK-ROI Diabetic Nephropathy GWAS||Case-control study for nephropathy in type 1 diabetes with 1801 participants using Illumina Omni1-quad|
|GEO: Gene Expression Omnibus||http://www.ncbi.nlm.nih.gov/geo/|
|GSE20067||Case-control approach on 192 individuals using Illumina’s Infinium 27k methylation beadchip|
|GSE1009||Expression profiling on 6 kidney samples using Affymetrix Human Genome U95 Version 2|
|GDS3649||Analysis of HK2 proximal tubular cells using Illumina HumanWG-6 v3.0 expression beadchip|
|GDS961||Case-control comparison of glomeruli|
Comprehensive clinical and demographic information is very important when researchers combine data from multiple studies 24. Knowing the precise phenotype, the inclusion/exclusion criteria, and potential confounding factors such as duration of diabetes and ancestry is critical to deriving robust findings from meta-analyses.
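As an illustration of how per-study results might be combined, the following is a minimal sketch of a fixed-effect, inverse-variance weighted meta-analysis for a single variant. It is not the method used by any study cited here; the function name `fixed_effect_meta` and all input values are hypothetical, and the sketch assumes each study reports an effect estimate (for example a log odds ratio) with its standard error for the same allele and a comparable phenotype definition.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios).
    standard_errors: matching per-study standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its variance,
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from the standard normal distribution.
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from three cohorts;
    # these numbers are invented for illustration only.
    example_betas = [0.18, 0.25, 0.10]
    example_ses = [0.08, 0.12, 0.10]
    beta, se, z, p = fixed_effect_meta(example_betas, example_ses)
    print(f"pooled beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.3g}")
```

A fixed-effect model assumes the studies estimate the same underlying effect. Where phenotype definitions, duration of diabetes or ancestry differ between cohorts, a random-effects model or a prior test of heterogeneity may be more appropriate.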
Quality control is another essential element of all genetic studies, particularly in larger-scale studies where systematic bias may substantially affect the results; stringent quality control was highlighted as very important for a diabetic nephropathy GWAS 25. Standardised guidelines have been suggested to help evaluate published genetic association studies and improve transparency of reporting: STrengthening the REporting of Genetic Association Studies (STREGA), an extension of the STROBE statement 26.
Multiple studies have been reported that suggest transcriptomic differences between individuals with and without diabetic nephropathy. Traditionally, larger-scale studies of the transcriptome were conducted using DNA microarrays that comply with reporting standards designed to improve reliability and confidence in outcomes, such as MIAME 27 and MAQC 28. Several transcriptomic studies are publicly available in the Gene Expression Omnibus 29, Nephromine 30, GUDMAP 31, and KUPKB 32. RNA-Seq is a powerful sequence-based method that enables researchers to discover RNA biomarkers and novel isoforms, and to profile and quantify entire RNA transcripts across the transcriptome. RNA-Seq may also provide insights into the potential functional impact of epigenetic modification to DNA and histones 33,34. RNA-Seq will provide more reliable, precise and informative measurements of the transcriptome; however, challenges remain in terms of the sheer quantity of data generated and researchers’ unfamiliarity with this rapidly developing technique. Nonetheless, RNA-Seq is generating novel insights for the kidney transcriptome that are relevant for diabetic nephropathy 35.
Epigenetic modifications of the genome contribute to disease susceptibility; however, much of the “inherited” epigenetic architecture remains unexplained. Emerging evidence for epigenetic phenomena has transformed investigations of heritable influences on disease and, complementary to genome-wide association studies (GWAS), it is now cost-effective to perform population-based studies of the epigenome 36,37. Epigenetic modifications modulate gene expression without changing the DNA sequence; these may be either stably inherited or dynamic epigenetic marks. Methylation is a key epigenetic feature that plays an important role in chromosomal integrity and regulation of gene expression, with different methylation profiles now being associated with many complex diseases, including diabetes 38,39. Initial studies support an important role for differential methylation in diabetic nephropathy 40,41; however, as yet only one dataset is publicly available via the Gene Expression Omnibus 29. It is feasible that methylation profiles may lead to clinically useful biomarkers or direct researchers to novel therapeutic targets in individuals with diabetes. The identification of a genetic-epigenetic profile that can lead to earlier diagnosis and/or offer improved clinical evaluation would not only provide significant healthcare benefits to affected individuals, but may also have major utility for the efficient use of monetary resources.
Other epigenetic features include chromatin regulation and RNA interference. Histone modifications play a role in diabetic nephropathy 42, but large-scale studies are not yet available. MicroRNAs have been an area of intense interest in recent years, with several markers highlighted as functionally important in modulating diabetic nephropathy 43,44,45. Non-protein coding RNAs are attractive targets for therapeutic intervention and as clinically useful biomarkers in the development of diabetic nephropathy. It is possible that epigenetic regulation of gene expression may represent a major contribution to diabetic nephropathy. An epigenomics resource at the National Center for Biotechnology Information (NCBI) has been created to serve as a comprehensive public repository for whole-genome epigenetic data sets 46.
Proteome Studies: Diabetic nephropathy involves a complex interaction of biological processes, and proteomic analysis represents a potentially powerful approach to identify clinically relevant biomarkers. Centralised repositories exist for proteomic data, such as PRIDE (PRoteomics IDEntifications database; www.ebi.ac.uk/pride), and the Human Metabolome Database 47 has been developed for metabolomic data, but broadly accepted experimental and reporting standards for large-scale studies are still under development 48,49,50. Promising biomarkers for diabetic nephropathy are being suggested from multicentre collaborations and the integration of experimental and clinical data 51,52.
Efficient bioinformatic tools are becoming increasingly important to maximise the outcomes from individual and collaborative multi-centre research programmes. Web-based resources that store, organise and present complex information from diverse datasets enhance effective research. One such example that facilitates access to multidisciplinary information is dkCOIN 53; this collaborative resource was recently launched to share information from the Beta Cell Biology Consortium, the Nuclear Receptor Signalling Atlas, the Diabetic Complications Consortium, and Mouse Metabolic Phenotyping Centres. A systematic, multidisciplinary approach that combines clinical insight with basic biological research is not yet publicly available for diabetic nephropathy, but the use of integrated datasets is increasing (Figure 1). SysKid (systems biology towards novel chronic kidney disease diagnosis and treatment) is a consortium-driven effort that aims to define a comprehensive picture of the consequences of diabetes on kidney function (www.syskid.eu), although its data are not publicly available. Systems biology is providing novel insights for diabetes 54,55,56 and for diabetic nephropathy in particular 57,58,59.
With the development of population-based registries and biobank information, it is possible that clinical and research-oriented databases will be integrated to form a rich, linked information resource; however, multiple ethical and legal challenges need to be overcome before this becomes practical 60,61,62,63. The identification of an -Omics profile that can lead to earlier diagnosis and/or offer improved clinical evaluation would not only provide significant health benefits to affected individuals, but may also have major utility for the efficient use of healthcare resources. Bioinformatics is a key discipline that can aid our understanding of the initiation and progression of diabetic nephropathy. Relevant education of healthcare providers is also important to ensure clinically relevant outcomes from -Omics projects that will help patient evaluation and management.