Abstract
Purpose
The endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) gene encodes hypoxia-inducible factor 2 alpha (HIF2a), a transcription factor involved in the response to hypoxia. Four EPAS1 single nucleotide polymorphisms (SNPs; rs56721780, rs6756667, rs7589621 and rs1868092) have been associated with human disease. These SNPs were examined computationally for changes in potential transcriptional factor binding sites (TFBS), and the changes are discussed in relation to disease and to alterations in high-altitude adaptation in humans.
Methods
The JASPAR CORE and ConSite databases were used to identify the TFBS. The Vector NTI Advance 11.5 computer program was employed to locate all the TFBS in the EPAS1 gene from 1.6 kb upstream of the transcriptional start site to 539 bp past the 3’UTR. The JASPAR CORE database was also used to compute the occurrence (%) of each nucleotide within the TFBS.
Results
The EPAS1 SNPs in the promoter, intron two and the 3’UTR regions have previously been found to be significantly associated with disease and with different levels of high-altitude hypoxia among native Tibetans. The SNP alleles were found to alter the DNA landscape available for potential transcriptional factors (TFs) to attach, resulting in changes in TFBS and thereby altering which TFs potentially regulate the EPAS1 gene. For example, the rs7589621 rSNP EPAS1-G allele creates binding sites for the glucocorticoid and mineralocorticoid nuclear receptors. These receptors regulate carbohydrate, protein and fat metabolism. In addition, the minor rs7589621 rSNP EPAS1-A allele creates a putative TFBS for the FOXC TF, which is an important regulator of cell viability and resistance to oxidative stress. These EPAS1 SNPs should be considered regulatory (r) SNPs.
Conclusion
The alleles of each rSNP were found to generate unique TFBS, resulting in potential changes in TF regulation of EPAS1. The putative changes in TFBS created by the four rSNPs could influence the significant cline in allele frequencies seen in Tibetans with increasing altitude, or the haplotype association with high-altitude polycythemia in male Han Chinese. These regulatory changes are discussed with respect to changes in human health that result in disease and sickness.
Author Contributions
Academic Editor: Leonid Tarassishin, Associate Department of Pathology Albert Einstein College of Medicine United States.
Checked for plagiarism: Yes
Review by: Single-blind
Copyright © 2016 Norman E Buroker, et al.
Competing interests
The authors have declared that no competing interests exist.
Introduction
The endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) gene encodes hypoxia-inducible factor 2 alpha (HIF2a), a transcription factor involved in the response to hypoxia. Hypoxia is a major geographical condition associated with high-altitude environments 1. Hypoxia-inducible factors (HIFs) are heterodimers consisting of an oxygen-labile HIFa subunit and a stable HIFb subunit 2. Under hypoxic conditions, one of three alpha isoforms (HIF1a, HIF2a or HIF3a) together with HIF1b is activated and functions as a transcriptional regulator of genes involved in the hypoxia response 3, 4, 5. Genome-wide association studies (GWAS) on high-altitude adaptation have implicated several single nucleotide polymorphisms (SNPs) in the regulatory region of the EPAS1 gene that are responsible for the genetic adaptation to high-altitude hypoxia in Tibetans 6, 7, 8. Genetic variation in the regulatory region of the EPAS1 gene may influence gene expression and contribute to changes in biological functions 9. EPAS1 is expressed in organs that are involved in oxygen transport and metabolism, such as the lung, placenta and vascular endothelium 10, and is associated with many biological processes and diseases related to metabolism 11, angiogenesis 12, 13, inflammation 14, 15 and cancer 16, 17, 18.
The EPAS1 gene maps to human chromosome 2p21 and is about 120 kb in size with a coding region consisting of 15 exons 19. Four HIF2a SNPs (rs56721780, rs6756667, rs7589621 and rs1868092) have been significantly associated with different levels of high-altitude hypoxia among native Tibetans 20. The rs56721780 SNP in the HIF2a promoter region has also been significantly associated with high-altitude adaptation of Tibetans 9 while the rs6756667 SNP from intron two has been significantly associated with susceptibility to acute mountain sickness in individuals unaccustomed to high altitude environments 21. The rs1868092 SNP near the HIF2a 3’UTR has been associated with high altitude polycythemia in male Han Chinese at the Qinghai-Tibetan plateau 22.
Single nucleotide changes that affect gene expression by impacting gene regulatory sequences such as promoters, enhancers and silencers are known as regulatory SNPs (rSNPs) 23, 24, 25, 26. An rSNP within a transcriptional factor binding site (TFBS) can change a transcriptional factor’s (TF) ability to bind its TFBS 27, 28, 29, 30, in which case the TF would be unable to effectively regulate its target gene 31, 32, 33, 34, 35. This concept is examined for the above four HIF2a rSNPs and their allelic association with TFBS, where computational analyses 36, 37, 38, 39 are used to identify TFBS alterations created by the HIF2a rSNPs. In this report, the rSNP associations with changes in potential TFBS are discussed along with their possible relationship to disease or sickness in humans.
Methods
The JASPAR CORE database 40, 41 and ConSite 42 were used to identify the potential TFBS in the EPAS1 gene in this study. JASPAR is a database of transcription factor DNA-binding preferences used for scanning genomic sequences, whereas ConSite is a web-based tool for finding cis-regulatory elements in genomic sequences. The TFBS and the rSNP locations within the binding sites have previously been discussed 43. The Vector NTI Advance 11.5 computer program (Invitrogen, Life Technologies) was used to locate the TFBS in the EPAS1 gene (NCBI Ref Seq NM_001430) from 1.6 kb upstream of the transcriptional start site to 539 bp past the 3’UTR, which represents a total of 91 kb. The JASPAR CORE database was also used to calculate the occurrence (%) of each nucleotide within the TFBS, where upper case lettering indicates that the nucleotide occurs 90% of the time or more and lower case indicates less than 90%. The occurrence of each SNP allele in the TFBS is also computed from the database (Table 3).
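The per-position occurrence calculation and the 90% upper/lower case convention described above can be sketched as follows. This is only an illustration: the count matrix `EXAMPLE_PFM` and the function names are hypothetical and are not taken from JASPAR or from the analysis reported here.

```python
# Hypothetical position frequency matrix (PFM): the number of times each
# nucleotide was observed at every position of a binding site.
# These counts are invented for illustration and are NOT JASPAR data.
EXAMPLE_PFM = {
    "A": [18, 0, 1, 2],
    "C": [1, 0, 19, 6],
    "G": [0, 20, 0, 5],
    "T": [1, 0, 0, 7],
}


def occurrence_percent(pfm):
    """Return one dict per position mapping nucleotide -> occurrence (%)."""
    length = len(next(iter(pfm.values())))
    columns = []
    for i in range(length):
        total = sum(pfm[nt][i] for nt in "ACGT")
        columns.append({nt: 100.0 * pfm[nt][i] / total for nt in "ACGT"})
    return columns


def consensus(pfm, threshold=90.0):
    """Most frequent nucleotide per position.

    Upper case when its occurrence is >= threshold (%), lower case otherwise,
    mirroring the lettering convention described in the Methods.
    """
    letters = []
    for column in occurrence_percent(pfm):
        nt, pct = max(column.items(), key=lambda item: item[1])
        letters.append(nt if pct >= threshold else nt.lower())
    return "".join(letters)


def allele_occurrence(pfm, position, allele):
    """Occurrence (%) of a given SNP allele at a 0-based site position."""
    return occurrence_percent(pfm)[position][allele]


if __name__ == "__main__":
    print(consensus(EXAMPLE_PFM))
    print(allele_occurrence(EXAMPLE_PFM, 3, "C"))
```

With real data, the counts would come from the JASPAR matrix of the TF in question, and `position` would be the location of the rSNP within the predicted site.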
Results
EPAS1 rSNPs and TFBS
The allele frequencies of the four EPAS1 SNPs (rs56721780, rs6756667, rs7589621 and rs1868092) significantly associated with different levels of high-altitude hypoxia among native Tibetans 20 are presented in Table 1, along with those of low-altitude Han Chinese and Japanese populations. The common rs56721780 SNP EPAS1-G allele creates nine unique putative TFBS for the REL, RELA, RUNX1, TFAP2A, TFAP2A(var.2), TFAP2B, TFAP2B(var.2), TFAP2C and TFAP2C(var.2) TFs, which are involved with inflammation, immunity, differentiation, cell growth, tumorigenesis, apoptosis, hematopoiesis, and transcriptional activation and repression (Table 2 & Table 3). The minor EPAS1-C allele creates two unique putative TFBS for the FOXP3 and HOXA5 TFs, which are involved with the homeostasis of the immune system and with specific identities on the anterior-posterior axis during development, respectively (Table 2 & Table 3). There are also four conserved TFBS for the HLTF, HNF4G, NFAT5 and SOX10 TFs, which are involved in altering chromatin structure, transcription, regulation of osmoprotective and inflammatory genes, and embryonic development, respectively (Table 2 & Table 3).
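The distinction between allele-unique and conserved TFBS drawn above amounts to a set comparison of the sites predicted for each allele. A minimal sketch follows, using a subset of the TF names reported above for rs56721780; the function name `classify_tfbs` and the input layout are assumptions made for illustration and do not reproduce the Vector NTI workflow.

```python
def classify_tfbs(sites_by_allele):
    """Split predicted TFBS into allele-unique and conserved sets.

    sites_by_allele maps an allele label to the set of TF names whose
    binding sites are predicted in the sequence carrying that allele.
    """
    conserved = set.intersection(*sites_by_allele.values())
    unique = {}
    for allele, sites in sites_by_allele.items():
        others = set()
        for other_allele, other_sites in sites_by_allele.items():
            if other_allele != allele:
                others |= other_sites
        # A site is unique to an allele if no other allele predicts it.
        unique[allele] = sites - others
    return unique, conserved


# Subset of the TFs named in the text for rs56721780 (illustrative input).
CONSERVED_IN_TEXT = {"HLTF", "HNF4G", "NFAT5", "SOX10"}
RS56721780 = {
    "G": {"REL", "RELA", "RUNX1"} | CONSERVED_IN_TEXT,
    "C": {"FOXP3", "HOXA5"} | CONSERVED_IN_TEXT,
}

if __name__ == "__main__":
    unique, conserved = classify_tfbs(RS56721780)
    print("unique to G:", sorted(unique["G"]))
    print("unique to C:", sorted(unique["C"]))
    print("conserved:", sorted(conserved))
```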
Table 1. EPAS1 (HIF2a) SNPs and high altitude hypoxia among native Tibetans. These SNPs have been found to be significantly associated with hypoxia in Tibetan populations. The SNPs are located in the EPAS1 gene. MAF is the minor allele frequency. Allele frequency data from reference 20.
Gene | Gene Position | SNP | Chr Pos | Alleles | Tibetan MAF, Bomi (2700 m) | Tibetan MAF, Qamdo (3200 m) | Tibetan MAF, Lhasa (3700 m) | Tibetan MAF, Amdo (4700 m) | Han Chinese MAF, CHB/Beijing | Japanese MAF, JPT/Tokyo |
---|---|---|---|---|---|---|---|---|---|---|
EPAS1 (HIF2a) | promoter | rs56721780 | 2:46523655 | G/C | C=0.184 | C=0.159 | C=0.328 | C=0.311 | C=0.01 | C=0.022 |
EPAS1 (HIF2a) | intron 2 | rs6756667 | 2:46579409 | A/G | G=0.313 | G=0.266 | G=0.201 | G=0.104 | G=0.944 | G=0.889 |
EPAS1 (HIF2a) | intron 2 | rs7589621 | 2:46582382 | G/A | A=0.254 | A=0.221 | A=0.163 | A=0.079 | A=0.789 | A=0.767 |
EPAS1 (HIF2a) | past 3'UTR | rs1868092 | 2:46614202 | A/G | G=0.359 | G=0.327 | G=0.249 | G=0.156 | G=0.919 | G=0.924 |
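The direction of the allele-frequency change with altitude in the Tibetan columns of Table 1 can be summarized with a simple least-squares slope. The sketch below is purely descriptive, is not the statistical test used in reference 20, and uses function names chosen here for illustration.

```python
# Altitudes (m) and minor allele frequencies copied from Table 1
# (Bomi, Qamdo, Lhasa, Amdo).
ALTITUDES_M = [2700, 3200, 3700, 4700]
MAF = {
    "rs56721780 (C)": [0.184, 0.159, 0.328, 0.311],
    "rs6756667 (G)": [0.313, 0.266, 0.201, 0.104],
    "rs7589621 (A)": [0.254, 0.221, 0.163, 0.079],
    "rs1868092 (G)": [0.359, 0.327, 0.249, 0.156],
}


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def maf_trend_per_km(maf_by_snp, altitudes_m):
    """Change in MAF per 1000 m of altitude for each SNP."""
    return {
        snp: 1000.0 * slope(altitudes_m, freqs)
        for snp, freqs in maf_by_snp.items()
    }


if __name__ == "__main__":
    for snp, trend in maf_trend_per_km(MAF, ALTITUDES_M).items():
        direction = "rises" if trend > 0 else "falls"
        print(f"{snp}: MAF {direction} by about {abs(trend):.3f} per 1000 m")
```

With only four populations per SNP, the slope indicates direction rather than statistical significance.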
TFs | Protein name | TF description |
---|---|---|
ATF4 | Activating Transcription Factor 4 | The protein encoded by this gene belongs to a family of DNA-binding proteins that includes the AP-1 family of transcription factors, cAMP-response element binding proteins (CREBs) and CREB-like proteins. |
ATF7 | Activating Transcription Factor 7 | Plays important functions in early cell signaling. Has no intrinsic transcriptional activity, but activates transcription on formation of JUN or FOS heterodimers. |
BARX1 | BARX Homeobox 1 | Transcription factor, which is involved in craniofacial development, in odontogenesis and in stomach organogenesis. |
BSX | Brain-Specific Homeobox | DNA binding protein that functions as a transcriptional activator. Is essential for normal postnatal growth and nursing. Is an essential factor for neuronal neuropeptide Y and agouti-related peptide function and locomotory behavior in the control of energy balance. |
CEBPa | CCAAT/enhancer binding protein (C/EBP), alpha | C/EBP is a DNA-binding protein that recognizes two different motifs: the CCAAT homology common to many promoters and the enhanced core homology common to many enhancers |
CEBPb | CCAAT/enhancer binding protein (C/EBP), beta | Important transcriptional activator regulating the expression of genes involved in immune and inflammatory responses. Binds to regulatory regions of several acute-phase and cytokine genes and probably plays a role in the regulation of acute-phase reaction, inflammation and hemopoiesis. |
CEBPg | CCAAT/enhancer binding protein (C/EBP), delta | The encoded protein is important in the regulation of genes involved in immune and inflammatory responses, and may be involved in the regulation of genes associated with activation and/or differentiation of macrophages. |
CEBPe | CCAAT/enhancer binding protein (C/EBP), epsilon | The encoded protein may be essential for terminal differentiation and functional maturation of committed granulocyte progenitor cells. Mutations in this gene have been associated with Specific Granule Deficiency, a rare congenital disorder. |
CREB1 | CAMP Responsive Element Binding Protein 1 | This gene encodes a transcription factor that is a member of the leucine zipper family of DNA binding proteins |
DBP | D Site Of Albumin Promoter (Albumin D-Box) Binding Protein | The encoded protein can bind DNA as a homo- or heterodimer and is involved in the regulation of some circadian rhythm genes. |
DLX6 | Distal-Less Homeobox 6 | This gene encodes a member of a homeobox transcription factor gene family similar to the Drosophila distal-less gene. This family is comprised of at least 6 different members that encode proteins with roles in forebrain and craniofacial development. |
E2F6 | E2F transcription factor 6 | The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses. |
EN1 | Engrailed homeobox 1 | Homeobox-containing genes are thought to have a role in controlling development. |
EN2 | Engrailed homeobox 2 | The human engrailed homologs 1 and 2 encode homeodomain-containing proteins and have been implicated in the control of pattern formation during development of the central nervous system. |
ESX1 | ESX Homeobox 1 | This gene likely plays a role in placental development and spermatogenesis. |
EVX1 | Even-Skipped Homeobox 1 | May play a role in the specification of neuronal cell types |
EVX2 | Even-Skipped Homeobox 2 | The encoded protein is a homeobox transcription factor that is related to the protein encoded by the Drosophila even-skipped (eve) gene, a member of the pair-rule class of segmentation genes. |
FIGLA | Folliculogenesis Specific Basic Helix-Loop-Helix | The protein is a basic helix-loop-helix transcription factor that regulates multiple oocyte-specific genes, including genes involved in folliculogenesis and those that encode the zona pellucida. |
FOS | FBJ Murine Osteosarcoma Viral Oncogene Homolog | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. The FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Controls osteoclast survival and size. As a dimer with JUN, activates LIF transcription. |
FOS::JUN | FBJ Murine Osteosarcoma Viral Oncogene Homolog; Jun Proto-Oncogene | Promotes activity of NR5A1 when phosphorylated by HIPK3 leading to increased steroidogenic gene expression upon cAMP signaling pathway stimulation. Has a critical function in regulating the development of cells destined to form and maintain the skeleton. It is thought to have an important role in signal transduction, cell proliferation and differentiation. |
FOXA1 | Forkhead Box A1 | Transcription factor that is involved in embryonic development, establishment of tissue-specific gene expression and regulation of gene expression in differentiated tissues. |
FOXC1 | Forkhead box C1 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. An important regulator of cell viability and resistance to oxidative stress. |
FOXH1 | Forkhead Box H1 | Transcriptional activator. Recognizes and binds to the DNA sequence 5'-TGTGTGTATT-3'. Required for induction of the goosecoid (GSC) promoter by TGF-beta or activin signaling. |
FOXP3 | Forkhead Box P3 | Transcriptional regulator which is crucial for the development and inhibitory function of regulatory T-cells (Treg). Plays an essential role in maintaining homeostasis of the immune system by allowing the acquisition of full suppressive function and stability of the Treg lineage, and by directly modulating the expansion and function of conventional T-cells. |
GBX1 | Gastrulation Brain Homeobox 1 | Sequence-specific DNA binding transcription factor activity and sequence-specific DNA binding. An important paralog of this gene is DLX5. |
GBX2 | Gastrulation Brain Homeobox 2 | May act as a transcription factor for cell pluripotency and differentiation in the embryo |
GMEB2 | Glucocorticoid Modulatory Element Binding Protein 2 | This gene is a member of KDWK gene family. The product of this gene associates with GMEB1 protein, and the complex is essential for parvovirus DNA replication. |
GSX1 | GS Homeobox 1 | Activates the transcription of the GHRH gene. Plays an important role in pituitary development. |
HIC1 | Hypermethylated In Cancer 1 | This gene functions as a growth regulatory and tumor repressor gene. |
HIC2 | Hypermethylated In Cancer 2 | Transcriptional repressor |
HLF | Hepatic Leukemia Factor | The encoded protein forms homodimers or heterodimers with other PAR family members and binds sequence-specific promoter elements to activate transcription. |
HLTF | Helicase-like transcription factor | This gene encodes a member of the SWI/SNF family. Members of this family have helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. |
HMBOX1 | Homeobox Containing 1 | Transcription factor. Isoform 1 acts as a transcriptional repressor. |
HNF4g | Hepatocyte Nuclear Factor 4, Gamma | Transcription factor. Has a lower transcription activation potential than HNF4-alpha |
HOXA2 | Homeobox A2 | Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis. |
HOXA5 | Hoxa5 | Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis. |
HOXB2 | Homeobox B2 | Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis. |
HOXB3 | Homeobox B3 | The encoded protein functions as a sequence-specific transcription factor that is involved in development. |
INSM1 | Insulinoma-Associated 1 | This gene is a sensitive marker for neuroendocrine differentiation of human lung tumors. |
ISL2 | ISL LIM Homeobox 2 | Transcriptional factor that defines subclasses of motoneurons that segregate into columns in the spinal cord and select distinct axon pathways. |
ISX | Intestine-Specific Homeobox | Transcription factor that regulates gene expression in intestine. May participate in vitamin A metabolism most likely by regulating BCO1 expression in the intestine. |
JDP2(var.2) | Jun Dimerization Protein 2 | Component of the AP-1 transcription factor that represses transactivation mediated by the Jun family of proteins. Involved in a variety of transcriptional responses associated with AP-1 such as UV-induced apoptosis, cell differentiation, tumorigenesis and antitumorigenesis. |
JUN | Jun Proto-Oncogene | Transcription factor that recognizes and binds to the enhancer heptamer motif 5'-TGACGTCA-3'. Promotes activity of NR5A1 when phosphorylated by HIPK3 leading to increased steroidogenic gene expression upon cAMP signaling pathway stimulation. |
JUND(var.2) | Jun D Proto-Oncogene | Transcription factor that recognizes and binds to the enhancer heptamer motif 5'-TGACGTCA-3'. |
KLF5 | Kruppel-like factor 5 (intestinal) | Transcription factor that binds to GC box promoter elements. Activates transcription of genes. |
LBX2 | Ladybird Homeobox 2 | Putative transcription factor. |
LIN54 | Lin-54 DREAM MuvB Core Complex Component | Is a component of the LIN, or DREAM, complex, an essential regulator of cell cycle genes |
MAX | MYC Associated Factor X | The protein encoded by this gene is a member of the basic helix-loop-helix leucine zipper (bHLHZ) family of transcription factors. |
MEIS1 | Meis Homeobox 1 | Homeobox genes, of which the most well-characterized category is represented by the HOX genes, play a crucial role in normal development. |
MEIS3 | Meis Homeobox 3 | Sequence-specific DNA binding and RNA polymerase II core promoter proximal region sequence-specific DNA binding transcription factor activity involved in positive regulation of transcription. |
MEOX1 | Mesenchyme Homeobox 1 | Mesodermal transcription factor that plays a key role in somitogenesis and is specifically required for sclerotome development. |
MEOX2 | Mesenchyme Homeobox 2 | The encoded protein may play a role in the regulation of vertebrate limb myogenesis. Mutations in the related mouse protein may be associated with craniofacial and/or skeletal abnormalities, in addition to neurovascular dysfunction observed in Alzheimer's disease. |
MGA | MGA, MAX Dimerization Protein | Functions as a dual-specificity transcription factor, regulating the expression of both MAX-network and T-box family target genes. Functions as a repressor or an activator. |
MIXL1 | Mix Paired-Like Homeobox | Regulates cell fate during development. |
MSX1 | Msh Homeobox 1 | Acts as a transcriptional repressor. May play a role in limb-pattern formation. Acts in craniofacial development and specifically in odontogenesis. |
MZF1 | Myeloid Zinc Finger 1 | Binds to target promoter DNA and functions as a transcription regulator. May be one regulator of transcriptional events during hemopoietic development. Isoforms of this protein have been shown to exist at protein level. |
NEUROD2 | Neuronal Differentiation 2 | Transcriptional regulator implicated in neuronal determination. Mediates calcium-dependent transcription activation by binding to E box-containing promoter. Critical factor essential for the repression of the genetic program for neuronal differentiation; prevents the formation of synaptic vesicle clustering at active zone to the presynaptic membrane in postmitotic neurons. |
NFAT5 | Nuclear Factor Of Activated T-Cells 5, Tonicity-Responsive | Transcription factor involved in the transcriptional regulation of osmoprotective and inflammatory genes. Regulates hypertonicity-induced cellular accumulation of osmolytes. |
NFATC3 | Nuclear Factor Of Activated T-Cells, Cytoplasmic, Calcineurin-Dependent 3 | Acts as a regulator of transcriptional activation. Plays a role in the inducible expression of cytokine genes in T-cells, especially in the induction of the IL-2. |
NFE2L1:MAFG | Nuclear Factor, Erythroid 2-Like 1; V-Maf Avian Musculoaponeurotic Fibrosarcoma Oncogene Homolog G | Nuclear factor erythroid 2-related factor (Nrf2) coordinates the up-regulation of cytoprotective genes via the antioxidant response element (ARE). MafG is a ubiquitously expressed small maf protein that is involved in cell differentiation of erythrocytes. It dimerizes with P45 NF-E2 protein and activates expression of a and b-globin. |
NFIA | Nuclear Factor I/A | Recognizes and binds the palindromic sequence 5'-TTGGCNNNNNGCCAA-3' present in viral and cellular promoters and in the origin of replication of adenovirus type 2. These proteins are individually capable of activating transcription and replication. |
NFIC | Nuclear Factor I/C (CCAAT-Binding Transcription Factor) | Recognizes and binds the palindromic sequence 5'-TTGGCNNNNNGCCAA-3' present in viral and cellular promoters and in the origin of replication of adenovirus type 2. These proteins are individually capable of activating transcription and replication. |
NFIL3 | Nuclear factor, interleukin 3 regulated | Expression of interleukin-3 (IL3; MIM 147740) is restricted to activated T cells, natural killer (NK) cells, and mast cell lines. |
NFIX | Nuclear Factor I/X (CCAAT-Binding Transcription Factor) | Sequence-specific DNA binding transcription factor activity and RNA polymerase II distal enhancer sequence-specific DNA binding transcription factor activity. |
NKX2-3 | NK2 Homeobox 3 | This gene encodes a homeodomain-containing transcription factor. The encoded protein is a member of the NKX family of homeodomain transcription factors. |
NKX2-8 | NK2 Homeobox 8 | Transcriptional factor. Diseases associated with NKX2-8 include esophageal cancer. |
NKX3-1 | NK3 Homeobox 1 | This gene encodes a homeobox-containing transcription factor. This transcription factor functions as a negative regulator of epithelial cell growth in prostate tissue. |
NKX3-2 | NK3 Homeobox 2 | This gene encodes a member of the NK family of homeobox-containing proteins. Transcriptional repressor that acts as a negative regulator of chondrocyte maturation. |
NKX6-1 | NK6 Homeobox 1 | Transcription factor which binds to specific A/T-rich DNA sequences in the promoter regions of a number of genes. Involved in transcriptional regulation in islet beta cells. Binds to the insulin promoter and is involved in regulation of the insulin gene. |
NR2C2 | Nuclear Receptor Subfamily 2, Group C, Member 2 | Orphan nuclear receptor that can act as a repressor or activator of transcription. An important repressor of nuclear receptor signaling pathways such as retinoic acid receptor, retinoid X, vitamin D3 receptor, thyroid hormone receptor and estrogen receptor pathways. |
NR3C1 | Nuclear Receptor Subfamily 3, Group C, Member 1 (Glucocorticoid Receptor) | Glucocorticoids regulate carbohydrate, protein and fat metabolism, modulate immune responses through suppression of chemokine and cytokine production and have critical roles in constitutive activity of the CNS, digestive, hematopoietic, renal and reproductive systems. The protein encoded by this gene plays a role in protecting cells from oxidative stress and damage induced by ionizing radiation. |
NR3C2 | Nuclear Receptor Subfamily 3, Group C, Member 2 | This gene encodes the mineralocorticoid receptor, which mediates aldosterone actions on salt and water balance within restricted target cells. |
NR4A2 | Nuclear Receptor Subfamily 4, Group A, Member 2 | Transcriptional regulator which is important for the differentiation and maintenance of meso-diencephalic dopaminergic (mdDA) neurons during development. |
NRL | Neural Retina Leucine Zipper | This gene encodes a basic motif-leucine zipper transcription factor of the Maf subfamily. The encoded protein is conserved among vertebrates and is a critical intrinsic regulator of photoreceptor development and function. |
PDX1 | Pancreatic and duodenal homeobox 1 | Activates insulin, somatostatin, glucokinase, islet amyloid polypeptide and glucose transporter type 2 gene transcription. Particularly involved in glucose-dependent regulation of insulin gene transcription. |
PHOX2A | Paired-Like Homeobox 2a | May be involved in regulating the specificity of expression of the catecholamine biosynthetic genes. Acts as a transcription activator/factor. |
POU2F1 | POU Class 2 Homeobox 1 | Transcription factor that binds to the octamer motif (5'-ATTTGCAT-3') and activates the promoters of the genes for some small nuclear RNAs (snRNA) and of genes such as those for histone H2B and immunoglobulins. Modulates transcription transactivation by NR3C1, AR and PGR. |
POU3F1 | POU Class 3 Homeobox 1 | Transcription factor that binds to the octamer motif (5'-ATTTGCAT-3'). Thought to be involved in early embryogenesis and neurogenesis. |
POU3F2 | POU Class 3 Homeobox 2 | This gene encodes a member of the POU-III class of neural transcription factors. The encoded protein is involved in neuronal differentiation and enhances the activation of corticotropin-releasing hormone regulated genes. |
POU3F3 | POU Class 3 Homeobox 3 | This gene encodes a POU-domain containing protein that functions as a transcription factor. The encoded protein recognizes an octamer sequence in the DNA of target genes. This protein may play a role in development of the nervous system. |
POU3F4 | POU Class 3 Homeobox 4 | This gene encodes a member of the POU-III class of neural transcription factors. This family member plays a role in inner ear development. The protein is thought to be involved in the mediation of epigenetic signals which induce striatal neuron-precursor differentiation. |
POU5F1 | POU Class 5 Homeobox 1 | This gene encodes a transcription factor containing a POU homeodomain that plays a key role in embryonic development and stem cell pluripotency. Aberrant expression of this gene in adult tissues is associated with tumorigenesis. Forms a trimeric complex with SOX2 on DNA and controls the expression of a number of genes involved in embryonic development such as YES1, FGF4, UTF1 and ZFP206. |
POU5F1B | POU Class 5 Homeobox 1B | This intronless gene was thought to be a transcribed pseudogene of POU class 5 homeobox 1; however, it has been reported that this gene can encode a functional protein. The protein has been shown to be a weak transcriptional activator and may play a role in carcinogenesis and eye development. |
REL | V-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog | Proto-oncogene that may play a role in differentiation and lymphopoiesis. NF-kappa-B is a pleiotropic transcription factor which is present in almost all cell types and is involved in many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. |
RELA | V-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog A | RELA is a Protein Coding gene. NF-kappa-B is composed of NFKB1 or NFKB2 bound to either REL, RELA, or RELB. The most abundant form of NF-kappa-B is NFKB1 complexed with the product of this gene, RELA. Among its related pathways are PI3K-Akt signaling pathway and PI-3K cascade. |
RHOXF1 | Rhox Homeobox Family, Member 1 | This gene is a member of the PEPP subfamily of paired-like homeobox genes. The gene may be regulated by androgens and epigenetic mechanisms. The encoded nuclear protein is likely a transcription factor that may play a role in human reproduction. |
RUNX1 | Runt-Related Transcription Factor 1 | Core binding factor (CBF) is a heterodimeric transcription factor that binds to the core element of many enhancers and promoters. The protein encoded by this gene represents the alpha subunit of CBF and is thought to be involved in the development of normal hematopoiesis. |
RUNX3 | Runt-Related Transcription Factor 3 | This gene encodes a member of the runt domain-containing family of transcription factors. Found in a number of enhancers and promoters, and can either activate or suppress transcription. It also interacts with other transcription factors. It functions as a tumor suppressor, and the gene is frequently deleted or transcriptionally silenced in cancer. |
SOX10 | SRY (sex determining region Y)-box 10 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. |
SREBF1 | Sterol regulatory element binding transcription factor 1 | Transcriptional activator required for lipid homeostasis. Regulates transcription of the LDL receptor gene as well as the fatty acid and to a lesser degree the cholesterol synthesis pathway. |
SREBF2 | Sterol regulatory element binding transcription factor 2 | This gene encodes a ubiquitously expressed transcription factor that controls cholesterol homeostasis by regulating transcription of sterol-regulated genes. The encoded protein contains a basic helix-loop-helix-leucine zipper (bHLH-Zip) domain and binds the sterol regulatory element 1 motif. |
SRY | Sex Determining Region Y | Transcriptional regulator that controls a genetic switch in male development. It is necessary and sufficient for initiating male sex determination by directing the development of supporting cell precursors. |
TBP | TATA Box Binding Protein | General transcription factor that functions at the core of the DNA-binding multiprotein factor TFIID. Binding of TFIID to the TATA box is the initial transcriptional step of the pre-initiation complex (PIC), playing a role in the activation of eukaryotic genes transcribed by RNA polymerase II. |
TBX4 | T-Box 4 | Involved in the transcriptional regulation of genes required for mesoderm differentiation. |
TBX5 | T-Box 5 | This gene is a member of a phylogenetically conserved family of genes that share a common DNA-binding domain, the T-box. T-box genes encode transcription factors involved in the regulation of developmental processes. |
TEAD1 | TEA Domain Family Member 1 | This gene encodes a ubiquitous transcriptional enhancer factor that is a member of the TEA/ATTS domain family. This protein directs the transactivation of a wide variety of genes and, in placental cells, also acts as a transcriptional repressor. |
TEAD3 | TEA Domain Family Member 3 | This gene product is a member of the transcriptional enhancer factor (TEF) family of transcription factors, which contain the TEA/ATTS DNA-binding domain. It is predominantly expressed in the placenta and is involved in the transactivation of the chorionic somatomammotropin-B gene enhancer. |
TEAD4 | TEA Domain Family Member 4 | It is preferentially expressed in the skeletal muscle, and binds to the M-CAT regulatory element found in promoters of muscle-specific genes to direct their gene expression. |
TFAP2A | Transcription Factor AP-2 Alpha (Activating Enhancer Binding Protein 2 Alpha) | The protein encoded by this gene is a transcription factor that binds the consensus sequence 5'-GCCNNNGGC-3' and activates the transcription of some genes while inhibiting the transcription of others. |
TFAP2B | Transcription Factor AP-2 Beta (Activating Enhancer Binding Protein 2 Beta) | This gene encodes a member of the AP-2 family of transcription factors. AP-2 proteins form homo- or hetero-dimers with other AP-2 family members and bind specific DNA sequences. This protein functions as both a transcriptional activator and repressor. |
TFAP2C | Transcription Factor AP-2 Gamma (Activating Enhancer Binding Protein 2 Gamma) | Sequence-specific DNA-binding protein that interacts with inducible viral and cellular enhancer elements to regulate transcription of selected genes. AP-2 factors bind to the consensus sequence 5'-GCCNNNGGC-3' and activate genes involved in a large spectrum of important biological functions including proper eye, face, body wall, limb and neural tube development. |
TFEC | Transcription Factor EC | This gene encodes a member of the microphthalmia (MiT) family of basic helix-loop-helix leucine zipper transcription factors. MiT transcription factors regulate the expression of target genes by binding to E-box recognition sequences as homo- or heterodimers, and play roles in multiple cellular processes including survival, growth and differentiation. |
THAP1 | THAP domain containing, apoptosis associated protein 1 | DNA-binding transcription regulator that regulates endothelial cell proliferation and G1/S cell-cycle progression. |
USF1 | Upstream Transcription Factor 1 | This gene encodes a member of the basic helix-loop-helix leucine zipper family, and can function as a cellular transcription factor. |
USF2 | Upstream Transcription Factor 2, C-Fos Interacting | Transcription factor that binds to a symmetrical DNA sequence (E-boxes) (5'-CACGTG-3') that is found in a variety of viral and cellular promoters. |
YY1 | YY1 Transcription Factor | Multifunctional transcription factor that exhibits positive and negative control on a large number of cellular and viral genes by binding to sites overlapping the transcription start site |
YY2 | YY2 Transcription Factor | The protein encoded by this gene is a transcription factor that includes several Kruppel-like zinc fingers in its C-terminal region. It possesses both activation and repression domains, and it can therefore have both positive and negative effects on the transcription of target genes. |
ZNF354C | Zinc finger protein 354C | May function as a transcription repressor. |
EPAS1 (HIF2a) | ||||||
SNP | Allele | TFs | # of Sites | TFBS | Strand | |
---|---|---|---|---|---|---|
rs56721780 | G | HLTF | 1 | agcCtTtggg | plus | |
g=14% | ||||||
HNF4G | 1 | gaaaccCAaAGgcta | minus | |||
c=30% | ||||||
NFAT5 | 1 | ggTTtCccag | plus | |||
g=21% | ||||||
REL | 1 | tgggttTccC | plus | |||
g=6% | ||||||
REL | 1 | ttgggtTtcC | plus | |||
g=53% | ||||||
RELA | 1 | ttGggtTtCC | plus | |||
g=39% | ||||||
RELA | 1 | tgGgttTcCC | plus | |||
G=100% | ||||||
RUNX1 | 2 | gccTttGGgtt | plus | |||
G=92% | ||||||
SOX10 | 76 | cttTgg | plus | |||
g=5% | ||||||
TFAP2A | 1 | acCCaaagGct | minus | |||
C=99% | ||||||
TFAP2A | 1 | agCCtttgGgt | plus | |||
G=99% | ||||||
TFAP2A(var.2) | 1 | aaCCcaaaGgct | minus | |||
C=99% | ||||||
TFAP2A(var.2) | 1 | agCCtttgGgtt | plus | |||
G=94% | ||||||
TFAP2B | 1 | aaCCcaaaGgct | minus | |||
C=99% | ||||||
TFAP2B | 1 | agCCtttgGgtt | plus | |||
G=97% | ||||||
TFAP2B(var.2) | 1 | acCCaaaGGCt | minus | |||
C=100% | ||||||
TFAP2C | 1 | aacCcaaaGgct | minus | |||
C=97% | ||||||
TFAP2C | 1 | agcCtttgGgtt | plus | |||
G=97% | ||||||
TFAP2C(var.2) | 1 | acCCaaagGCt | minus | |||
C=98% | ||||||
TFAP2C(var.2) | 1 | agCCtttgGgt | plus | |||
G=99% | ||||||
C | FOXP3 | 20 | gcaaAgg | minus | ||
g=65% | ||||||
HLTF | 1 | agcCtTtgcg | plus | |||
c=24% | ||||||
HNF4G | 1 | gaaacgCAaAGgcta | minus | |||
g=24% | ||||||
HOXA5 | 2 | cgcaaagg | minus | |||
g=31% | ||||||
NFAT5 | 1 | cgTTtCccag | plus | |||
c=21% | ||||||
SOX10 | 75 | cttTgc | plus | |||
c=0% | ||||||
rs6756667 | A | ATF4 | 1 | aggTGAtGccAca | minus | |
t=48% | ||||||
CEBP | 1 | gTggCatcAcc | plus | |||
a=78% | ||||||
CEBP | 3 | gTgatgccAc | minus | |||
t=10% | ||||||
CEBP | 2 | aTgccacAAt | minus | |||
T=100% | ||||||
CEBP | 3 | gTgatgccAc | minus | |||
t=9% | ||||||
CEBP | 2 | aTgccacAAt | minus | |||
T=100% | ||||||
CEBP | 3 | gTgatgccAc | minus | |||
t=18% | ||||||
CEBP | 2 | aTgccacAAt | minus | |||
T=98% | ||||||
CREB1 | 6 | tGAtGcca | minus | |||
t=0% | ||||||
DBP | 1 | ggTgAtgccAca | minus | |||
t=38% | ||||||
DBP | 1 | tgTggcatcAcc | plus | |||
a=45% | ||||||
FIGLA | 1 | atCAcctTac | plus | |||
a=50% | ||||||
FOS::JUN | 34 | TgccacA | minus | |||
T=94% | ||||||
FOXH1 | 3 | gtgAtgccACa | minus | |||
t=5% | ||||||
HIC1 | 2 | aTgCCacaa | minus | |||
T=95% | ||||||
HIC2 | 2 | aTgCCacaa | minus | |||
T=98% | ||||||
HLF | 1 | ggTgatgccaca | minus | |||
t=11% | ||||||
HLF | 1 | tgTggcatcacc | plus | |||
a=33% | ||||||
HOXA2 | 2 | tggcATcAcc | plus | |||
A=90% | ||||||
JUN | 1 | aaggTGAtGccAc | minus | |||
t=62% | ||||||
JUND (var.2) | 1 | taaggTGAtgccAca | minus | |||
t=44% | ||||||
JUND (var.2) | 1 | ggtgaTGccacaAtc | minus | |||
T=100% | ||||||
MEIS1 | 15 | atGcCac | minus | |||
t=77% | ||||||
MEIS1 | 15 | gtGgCat | plus | |||
a=83% | ||||||
MEIS3 | 6 | gtGgCAtc | plus | |||
A=91% | ||||||
NFE2L1::MafG | 40 | caTcAc | plus | |||
a=85% | ||||||
NFIA | 1 | gaTGCCAcaa | minus | |||
T=100% | ||||||
NFIC | 84 | gTGGca | plus | |||
a=48% | ||||||
NFIX | 4 | gatGCCAca | minus | |||
t=18% | ||||||
NFIX | 4 | tgtGgCAtc | plus | |||
A=92% | ||||||
NRL | 1 | aaggtGatgcc | minus | |||
t=86% | ||||||
RHOXF1 | 3 | gtgAtgcc | minus | |||
t=68% | ||||||
RUNX1 | 1 | gatTgtGGcat | plus | |||
a=7% | ||||||
RUNX3 | 2 | atgCCaCAat | minus | |||
t=6% | ||||||
SREBF1 | 1 | aTCAccttac | plus | |||
a=48% | ||||||
SREBF2 | 2 | gTaaggTGAt | minus | |||
t=57% | ||||||
SREBF2(var.2) | 1 | gtaAgGTGAt | minus | |||
t=47% | ||||||
SREBF2(var.2) | 1 | atCACcTtAc | plus | |||
a=73% | ||||||
TBX4 | 1 | agGTGatg | minus | |||
t=45% | ||||||
TBX5 | 1 | agGtGatg | minus | |||
t=51% | ||||||
TEAD3 | 6 | tgATgCCa | minus | |||
T=100% | ||||||
TFEC | 1 | gtaAggtGat | minus | |||
t=49% | ||||||
THAP1 | 2 | atgCCacaa | minus | |||
t=63% | ||||||
G | ATF4 | 1 | aggTGAcGccAca | minus | ||
c=38% | ||||||
ATF7 | 1 | aggTGACGccAcaa | minus | |||
C=99% | ||||||
ATF7 | 1 | ttgTGgCGTcAcct | plus | |||
G=99% | ||||||
CEBP | 1 | gTgacgccAc | minus | |||
c=0% | ||||||
CEBP | 1 | gtggcgtcAc | plus | |||
g=80% | ||||||
CEBP | 1 | gTgacgccAc | minus | |||
c=88% | ||||||
CEBP | 1 | gTggcgtcAc | plus | |||
g=86% | ||||||
CEBP | 1 | gTgacgccAc | minus | |||
c=81% | ||||||
CEBP | 1 | gTggcgtcAc | plus | |||
g=73% | ||||||
CREB1 | 1 | tGAcGcca | minus | |||
c=18% | ||||||
CREB1 | 1 | tGgcGtca | plus | |||
G=91% | ||||||
DBP | 1 | ggTgAcgccAca | minus | |||
c=62% | ||||||
DBP | 1 | tgTggcgtcAcc | plus | |||
g=55% | ||||||
GMEB2 | 1 | tgACGcca | minus | |||
C=100% | ||||||
HLF | 1 | ggTgacgccaca | minus | |||
c=83% | ||||||
HLF | 1 | tgTggcgtcacc | plus | |||
g=67% | ||||||
INSM1 | 1 | tgtaaGGtGacg | minus | |||
c=8% | ||||||
JDP2(var.2) | 1 | ggTGACGcCAca | minus | |||
C=99% | ||||||
JDP2(var.2) | 1 | tgTGgCGTCAcc | plus | |||
G=98% | ||||||
JUN | 1 | aaggTGAcGccAc | minus | |||
c=16% | ||||||
JUN | 1 | attgTGgcGtcAc | plus | |||
G=97% | ||||||
JUND (var.2) | 1 | taaggTGAcgccAca | minus | |||
c=31% | ||||||
MEIS1 | 2 | gtGACgc | minus | |||
C=99% | ||||||
MEIS3 | 1 | gtGACgcc | minus | |||
C=97% | ||||||
MGA | 1 | tgGcGtcA | plus | |||
G=100% | ||||||
NFE2L1::MafG | 56 | ggTGAc | minus | |||
c=76% | ||||||
NFIX | 2 | gacGCCAca | minus | |||
c=30% | ||||||
NR4A2 | 7 | aAGgtgAc | minus | |||
c=57% | ||||||
RUNX1 | 1 | gatTgtGGcgt | plus | |||
g=4% | ||||||
RUNX3 | 1 | acgCCaCAat | minus | |||
c=9% | ||||||
SREBF1 | 1 | gTgAcgccac | minus | |||
c=88% | ||||||
SREBF1 | 1 | gTCAccttac | plus | |||
g=28% | ||||||
SREBF2 | 1 | gTaaggTGAc | minus | |||
c=34% | ||||||
SREBF2 | 1 | gTGgcgTcAc | plus | |||
g=77% | ||||||
SREBF2(var.2) | 1 | gtaAgGTGAc | minus | |||
c=53% | ||||||
SREBF2(var.2) | 1 | gtCACcTtAc | plus | |||
g=27% | ||||||
TBX4 | 4 | agGTGacg | minus | |||
c=29% | ||||||
TBX4 | 1 | tgGcGtca | plus | |||
G=100% | ||||||
TBX5 | 4 | agGtGacg | minus | |||
c=34% | ||||||
TFEC | 1 | gtaAggtGac | minus | |||
c=50% | ||||||
USF1 | 1 | gtaAggTGacg | minus | |||
c=78% | ||||||
USF2 | 1 | gtaAgGTGacg | minus | |||
c=5% | ||||||
ZNF354C | 21 | cgCCAC | minus | |||
c=38% | ||||||
rs7589621 | G | BARX1 | 1 | gtacTTAt | plus | |
g=33% | ||||||
BSX | 1 | gtacTTAt | plus | |||
g=26% | ||||||
DBP | 1 | aagTAcgTAAag | minus | |||
c=62% | ||||||
DBP | 1 | ctTTAcgTActt | plus | |||
g=55% | ||||||
DLX6 | 1 | gtacTTAt | plus | |||
g=26% | ||||||
EN1 | 1 | gtAcTtAt | plus | |||
g=20% | ||||||
EN2 | 1 | cgtacTtAtc | plus | |||
g=25% | ||||||
ESX1 | 1 | cgtacTtAtc | plus | |||
g=7% | ||||||
EVX1 | 1 | gataAgTAcg | minus | |||
c=27% | ||||||
EVX1 | 1 | cgtacTTAtc | plus | |||
g=33% | ||||||
EVX2 | 1 | gataAgTAcg | minus | |||
c=24% | ||||||
EVX2 | 1 | cgtacTTAtc | plus | |||
g=34% | ||||||
FOXA1 | 1 | acttTacgtaCttat | plus | |||
g=6% | ||||||
GBX1 | 1 | cgtAcTTAtc | plus | |||
g=25% | ||||||
GBX2 | 1 | cgtAcTTAtc | plus | |||
g=31% | ||||||
GMEB2 | 1 | gtACGTaa | minus | |||
C=98% | ||||||
GMEB2 | 1 | ttACGTac | plus | |||
G=98% | ||||||
GSX1 | 1 | cgtacttAtc | plus | |||
g=22% | ||||||
HLF | 1 | aagtacgtaaag | minus | |||
c=83% | ||||||
HLF | 1 | ctTtacgtactt | plus | |||
g=67% | ||||||
HLTF | 1 | gtaCtTatcc | plus | |||
g=20% | ||||||
HMBOX1 | 1 | cgtacTtAtc | plus | |||
g=15% | ||||||
HOXA2 | 1 | cgtacTtAtc | plus | |||
g=30% | ||||||
HOXB2 | 1 | cgtacTtAtc | plus | |||
g=33% | ||||||
HOXB3 | 1 | cgtacTTAtc | plus | |||
g=31% | ||||||
ISL2 | 1 | gtAcTtat | plus | |||
g=6% | ||||||
ISL2 | 1 | gtAcgtaa | minus | |||
c=34% | ||||||
ISX | 1 | gtAcTTAt | plus | |||
g=18% | ||||||
LBX2 | 1 | cgtacTtAtc | plus | |||
g=16% | ||||||
MEOX1 | 1 | cgtacTtAtc | plus | |||
g=45% | ||||||
MEOX2 | 1 | cgtacTtatc | plus | |||
g=63% | ||||||
MIXL1 | 1 | cgtacTtatc | plus | |||
g=20% | ||||||
MSX1 | 1 | gtacTtAt | plus | |||
g=26% | ||||||
NFATC3 | 1 | actTTaCgta | plus | |||
g=34% | ||||||
NFIL3 | 1 | TTAcGTActta | plus | |||
G=91% | ||||||
NKX2-3 | 1 | cgtACTTatc | plus | |||
g=48% | ||||||
NKX2-8 | 1 | gtaCTtatc | plus | |||
g=33% | ||||||
NKX3-2 | 1 | cgtACTTat | plus | |||
g=50% | ||||||
NKX6-1 | 1 | gtacTTAt | plus | |||
g=33% | ||||||
NR3C1 | 1 | aaGtACgtaaaGTgCct | minus | |||
C=100% | ||||||
NR3C1 | 1 | agGcACtttacGTaCtt | plus | |||
G=100% | ||||||
NR3C2 | 1 | aaGtACgtaaaGTgCct | minus | |||
C=100% | ||||||
NR3C2 | 1 | agGcACtttacGTaCtt | plus | |||
G=100% | ||||||
PDX1 | 1 | gtacTTAt | plus | |||
g=34% | ||||||
PHOX2A | 1 | ttAcgtacTta | plus | |||
g=20% | ||||||
POU2F1 | 1 | agtAcgtaaAgt | minus | |||
c=5% | ||||||
POU5F1B | 1 | tAcgtaaAg | minus | |||
c=4% | ||||||
RORA(var.2) | 1 | gatAagTacGTaAa | minus | |||
c=0% | ||||||
A | BARX1 | 4 | atacTTAt | plus | ||
a=28% | ||||||
BSX | 4 | atacTTAt | plus | |||
a=16% | ||||||
DBP | 1 | aagTAtgTAAag | minus | |||
t=38% | ||||||
DLX6 | 4 | atacTTAt | plus | |||
a=23% | ||||||
EN2 | 1 | catacTtAtc | plus | |||
a=13% | ||||||
ESX1 | 1 | catacTtAtc | plus | |||
a=18% | ||||||
EVX1 | 1 | gataAgTAtg | minus | |||
t=22% | ||||||
EVX2 | 1 | gataAgTAtg | minus | |||
t=18% | ||||||
EVX2 | 1 | catacTTAtc | plus | |||
a=25% | ||||||
FOXC1 | 2 | aggataAgtAt | minus | |||
t=64% | ||||||
GMEB2 | 2 | gtAtGTaa | minus | |||
t=0% | ||||||
GMEB2 | 2 | ttACaTac | plus | |||
a=0% | ||||||
GSX1 | 1 | catacttAtc | plus | |||
a=22% | ||||||
HLF | 1 | ctTtacatactt | plus | |||
a=33% | ||||||
HLTF | 2 | ataCtTatcc | plus | |||
a=27% | ||||||
HNF4G | 1 | agtatgtAaAGtgCc | minus | |||
t=37% | ||||||
HMBOX1 | 1 | catacTtAtc | plus | |||
a=10% | ||||||
HOXA2 | 1 | catacTtAtc | plus | |||
a=18% | ||||||
HOXB2 | 1 | catacTtAtc | plus | |||
a=21% | ||||||
HOXB3 | 1 | catacTTAtc | plus | |||
a=20% | ||||||
ISL2 | 4 | atAcTtat | plus | |||
a=28% | ||||||
ISX | 4 | atAcTTAt | plus | |||
a=15% | ||||||
LBX2 | 1 | catacTtAtc | plus | |||
a=12% | ||||||
LIN54 | 2 | cTTtacAta | plus | |||
A=100% | ||||||
NFATC3 | 1 | actTTaCata | plus | |||
a=55% | ||||||
NFIL3 | 1 | gTAtGTAAagt | minus | |||
t=65% | ||||||
NEUROD2 | 3 | taCaTActta | plus | |||
a=77% | ||||||
NKX2-3 | 1 | cacACTTatc | plus | |||
a=48% | ||||||
NKX2-8 | 6 | ataCTttc | plus | |||
a=23% | ||||||
NKX3-1 | 1 | catACTTat | plus | |||
a=18% | ||||||
NKX3-2 | 1 | catACTTat | plus | |||
a=30% | ||||||
NKX6-1 | 4 | atacTTAt | plus | |||
a=18% | ||||||
PDX1 | 4 | atacTTAt | plus | |||
a=21% | ||||||
POU2F1 | 1 | agtATgtaaAgt | minus | |||
T=92% | ||||||
POU3F1 | 1 | gtATgtaaAgtg | minus | |||
T=95% | ||||||
POU3F2 | 1 | gtATGtaaAgtg | minus | |||
T=95% | ||||||
POU3F3 | 1 | agtATGtaaAgtg | minus | |||
T=90% | ||||||
POU3F4 | 6 | tATGaaAT | minus | |||
T=99% | ||||||
POU5F1B | 2 | tATgtaaAg | minus | |||
T=92% | ||||||
RORA(var.2) | 1 | gatAagTatGTaAa | minus | |||
t=0% | ||||||
TBP | 1 | gtATgtAaagtgcct | minus | |||
T=97% | ||||||
TEAD1 | 3 | tacATaCtta | plus | |||
A=92% | ||||||
TEAD3 | 9 | acATaCtt | plus | |||
A=100% | ||||||
TEAD4 | 3 | tacATaCtta | plus | |||
A=94% | ||||||
rs1868092 | A | E2F6 | 2 | gaGatGGAggt | plus | |
a=16% | ||||||
HLTF | 2 | ctcCaTctca | minus | |||
t=63% | ||||||
HLTF | 3 | gcaCtTtgag | plus | |||
a=25% | ||||||
HNF4G | 1 | ccatctCAaAGtgca | minus | |||
t=30% | ||||||
HOXA5 | 10 | ctcaaagt | minus | |||
t=25% | ||||||
NKX2-3 | 1 | tgCACTTtga | plus | |||
a=34% | ||||||
NR2C2 | 1 | ccatctcaaaGtgca | minus | |||
t=81% | ||||||
NKX2-8 | 3 | gcaCTttga | plus | |||
a=37% | ||||||
SOX10 | 91 | cttTga | plus | |||
a=0% | ||||||
THAP1 | 5 | cctCCatct | minus | |||
t=16% | ||||||
YY1 | 1 | tgAgATGGaggt | plus | |||
A=94% | ||||||
YY2 | 1 | gacCtCCATct | minus | |||
t=40% | ||||||
G | E2F6 | 1 | ggGatGGAggt | plus | ||
g=69% | ||||||
HIC2 | 1 | aTcCCacaa | minus | |||
C=99% | ||||||
HLTF | 5 | gcaCtTtggg | plus | |||
g=25% | ||||||
HNF4G | 1 | ccatccCAaAGtgca | minus | |||
c=30% | ||||||
KLF5 | 1 | ctccatCCCa | minus | |||
C=100% | ||||||
KLF5 | 2 | cctcCatCCc | minus | |||
C=97% | ||||||
MZF1 | 85 | ttGGGA | plus | |||
G=95% | ||||||
NFIA | 1 | caTcCCAAag | minus | |||
C=100% | ||||||
NFIC | 85 | tTGGga | plus | |||
G=96% | ||||||
NFIX | 2 | catcCCAaa | minus | |||
C=90% | ||||||
NKX2-3 | 1 | tgCACTTtgg | plus | |||
g=30% | ||||||
NKX2-8 | 7 | gcaCTttgg | plus | |||
g=30% | ||||||
SOX10 | 76 | cttTgg | plus | |||
g=5% | ||||||
TEAD1 | 1 | tccATcCcaa | minus | |||
C=95% | ||||||
THAP1 | 2 | catCCcaaa | minus | |||
C=98% | ||||||
THAP1 | 3 | cctCCatcc | minus | |||
c=17% |
The common rs6756667 SNP EPAS1-A allele creates thirteen unique putative TFBS for the CEBPa, FIGLA, FOS::JUN, FOXH1, HIC1 & 2, HOXA2, NFIA, NFIC, NRL, RHOXF1, TEAD3 and THAP1 TFs, which are involved with enhancers, folliculogenesis, signal transduction, tumor suppression, cell-specific positional identities, transcription and replication, photoreceptor development and function, transcriptional enhancer and transcription regulation, respectively (Table 2 & Table 3). The minor EPAS1-G allele creates nine unique putative TFBS for the ATF7, GMEB2, INSM1, JDP2 (var.2), MGA, NR4A2, USF1 & 2, and ZNF354C TFs, which are involved with early cell signaling, DNA replication, neuroendocrine differentiation of human lung tumors, tumorigenesis and anti-tumorigenesis, transcription activator or repressor, transcription regulator, cellular transcriptional factor and transcription repression, respectively (Table 2 & Table 3). There are also twenty-one conserved TFBS for the ATF4, CEBPb, CEBPd, CEBPe, CREB1, DBP, HLF, JUN, JUND (var.2), MEIS1 & 3, NFE2L1::MafG, NFIX, RUNX1 & 3, SREBF1 & 2, SREBF2 (var.2), TBX4 & 5 and TFEC TFs, which are cAMP-response element binding proteins, DNA-binding proteins, or are involved with immune and inflammatory responses, circadian rhythm, transcription activation, signaling pathway stimulator and enhancer, normal development, up-regulation of cytoprotective genes via the antioxidant response element, enhancer sequence-specific DNA binding TF, development of normal hematopoiesis, tumor suppressor, lipid homeostasis, cholesterol homeostasis, mesoderm differentiation and cellular processes, respectively (Table 2 & Table 3).
The common rs7589621 SNP EPAS1-G allele creates ten unique putative TFBS for the EN1, FOXA1, GBX2, MEOX1 & 2, MIXL1, MSX1, NR3C1, NR3C2 and PHOX2A TFs, which are involved with controlling development, embryonic development, cell pluripotency, sclerotome development, transcriptional repressor, regulation of carbohydrate, protein and fat metabolism, mediation of aldosterone actions on salt and water balance, and catecholamine biosynthetic genes, respectively (Table 2 & Table 3). The minor rs7589621 SNP EPAS1-A allele creates eleven unique putative TFBS for the FOXC1, HNF4G, LIN54, NEUROD2, POU3F1-4, TEAD1, and TEAD3 & 4 TFs, which are involved with cell viability and resistance to oxidative stress, transcriptional repression, regulation of cell cycle genes, neuronal determination, early embryogenesis and neurogenesis, and enhancer for transcription, respectively (Table 2 & Table 3). There are also thirty-one conserved TFBS for the BARX1, BSX, DBP, DLX6, EN2, ESX1, EVX1 & 2, GMEB2, GSX1, HLF, HLTF, HMBOX1, HOXA2, HOXB2, HOXB3, ISL2, ISX, LBX2, NFATC3, NKX2-3, RORA and TBP TFs, which are involved with craniofacial development, transcriptional activation, circadian rhythm, roles in forebrain, central nervous system, placental development, specification of neuronal cell types, DNA replication, pituitary development, transcription activation, altering chromatin structure, transcriptional repressor, regulation of development, axon pathways, regulation of gene expression in the intestine, induction of cytokine gene expression in T-cells, homeodomain, nuclear hormone receptors, and the pre-initiation complex, respectively (Table 2 & Table 3).
The common rs1868092 SNP EPAS1-A allele creates four unique TFBS for the HOXA5, NR2C2, and YY1 & 2 TFs, which are involved with the development regulatory system, repression or activation of transcription, and positive and negative control of transcription at the transcription start site, respectively (Table 2 & Table 3). The minor EPAS1-G allele creates seven unique TFBS for the HIC2, KLF5, MZF1, NFIA, NFIC, NFIX and TEAD1 TFs, which are involved with transcriptional repression, transcription, hemopoietic development, transcription and replication, and enhancer of transcription, respectively (Table 2 & Table 3). There are also seven conserved putative TFBS for the E2F6, HLTF, HNF4G, NKX2-3, NKX2-8, SOX10 and THAP1 TFs, which are involved in tumor suppression, chromatin structure, transcriptional activation, homeodomain, regulatory functions, and regulation of endothelial cell proliferation, respectively (Table 2 & Table 3).
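The split into unique and conserved TFBS described above can be reproduced with simple set operations. The following is a minimal sketch, not the procedure used in this study: the TF names are transcribed by hand from the rs1868092 rows of Table 3, and the function name `compare_alleles` is a hypothetical helper introduced only for illustration.

```python
# TF names transcribed from the rs1868092 rows of Table 3.
# Each set holds the TFs with at least one predicted site on that allele.
allele_a = {
    "E2F6", "HLTF", "HNF4G", "HOXA5", "NKX2-3", "NR2C2",
    "NKX2-8", "SOX10", "THAP1", "YY1", "YY2",
}
allele_g = {
    "E2F6", "HIC2", "HLTF", "HNF4G", "KLF5", "MZF1", "NFIA", "NFIC",
    "NFIX", "NKX2-3", "NKX2-8", "SOX10", "TEAD1", "THAP1",
}


def compare_alleles(first, second):
    """Hypothetical helper: split TF names into unique and conserved groups."""
    return {
        "unique_first": sorted(first - second),   # only on the first allele
        "unique_second": sorted(second - first),  # only on the second allele
        "conserved": sorted(first & second),      # present on both alleles
    }


result = compare_alleles(allele_a, allele_g)
for group, names in result.items():
    print(group, len(names), ", ".join(names))
```

Comparing by TF name alone treats a TF as conserved even when its binding sequence or site count differs between alleles, which is the convention the text above appears to follow.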
Discussion
Genome-wide association studies (GWAS) over the last decade have identified nearly 6,500 disease or trait-predisposing SNPs, of which only 7% are located in protein-coding regions of the genome 44, 45 and the remaining 93% are located within non-coding areas 46, 47 such as regulatory or intergenic regions. SNPs which occur in the putative regulatory region of a gene, where a single base change in the DNA sequence of a potential TFBS may affect the process of gene expression, are drawing more attention 23, 25, 48. A SNP in a TFBS can have multiple consequences. Often the SNP does not change the TFBS interaction, nor does it alter gene expression, since a transcriptional factor (TF) will usually recognize a number of different binding sites in the gene. In some cases the SNP may increase or decrease the TF binding, which results in allele-specific gene expression. In rare cases, a SNP may eliminate the natural binding site or generate a new binding site, in which case the gene is no longer regulated by the original TF. Therefore, functional rSNPs in TFBS may result in differences in gene expression, phenotypes and susceptibility to environmental exposure 48. Examples of rSNPs associated with disease susceptibility are numerous and several reviews have been published 48, 49, 50, 51.
The rs56721780 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique RELA, RUNX1, and TFAP2A, B & C TFBS has a 100%, 92% and 94-100% occurrence, respectively, in humans (Table 3). Since these binding sites (BS) occur only once in the gene, except for the RUNX1 TFBS which occurs twice, the rSNP G allele should have a tremendous impact on gene regulation by these TFs (Table 3). The minor rs56721780 rSNP EPAS1-C allele [C (+ strand) or G (- strand)] located in the unique FOXP3 and HOXA5 TFBS has a 65% and 31% occurrence, respectively, in humans. Since these TFBS have a low occurrence in humans and occur more than once in the gene, the respective TFs would not be expected to have much impact on the regulation of the EPAS1 gene (Figure 1, Table 3).
Figure 1. Double stranded DNA from the EPAS1 gene showing the potential TFBS for twenty different TFs which can bind their respective DNA sequence either above (+) or below (-) the duplex (cf. Table 3). The rs56721780 rSNP common EPAS1-G allele is found in each of these TFBS. As shown, this rSNP is located in the promoter region of the EPAS1 gene. Also included with the potential TFBS is their % sequence homology to the duplex.
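Because Table 3 reports each TFBS on either the plus or the minus strand, the SNP allele appears as its complement in minus-strand sites (for example, G on the plus strand is read as C on the minus strand). The sketch below illustrates this relationship with the two TFAP2A sites listed for the rs56721780 G allele. The function names are hypothetical and introduced only for illustration. The sequences are lower-cased because the capitalization in Table 3 marks matrix positions and is not mirrored between strands.

```python
# Translation table for complementing DNA while preserving letter case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the sequence read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_on_strand(allele, strand):
    """Hypothetical helper: the base seen at the SNP on the given strand."""
    return allele if strand == "plus" else allele.translate(COMPLEMENT)


# TFAP2A sites for the rs56721780 G allele (Table 3), lower-cased.
plus_site = "agCCtttgGgt".lower()
minus_site = "acCCaaagGct".lower()

print(reverse_complement(plus_site) == minus_site)
print(allele_on_strand("G", "minus"))
```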
The rs6756667 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique CEBPa, FOS::JUN, HIC1 & 2, HOXA2, NFIA and NRL TFBS has a 78, 94, 95, 98, 90, 100, 86 and 100% occurrence, respectively, in humans (Figure 2, Table 3); however, only the CEBPa, NFIA and NRL TFBS occur once in the gene. Consequently, the corresponding three TFs should impact the regulation of the EPAS1 gene. The minor rs6756667 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique ATF7, GMEB2, JDP2, and MGA TFBS has a 99, 100, 99 and 100% occurrence, respectively, in humans, and each of these TFBS occurs only once in the EPAS1 gene. Consequently, the TFs for these TFBS could have some impact on the regulation of the EPAS1 gene (Table 2 & Table 3).
Figure 2. Double stranded DNA from the EPAS1 gene showing the potential TFBS for forty eight different TFs which can bind their respective DNA sequence either above (+) or below (-) the duplex (cf. Table 3). The rs7589621 rSNP common EPAS1-G allele is found in each of these TFBS. As shown, this rSNP is located in intron two of the EPAS1 gene. Also included with the potential TFBS is their % sequence homology to the duplex.
The rs7589621 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique NR3C1 & 2 TFBS has a 100% occurrence in humans, and these TFBS are found only once in the EPAS1 gene (Table 3). Consequently, the corresponding glucocorticoid and mineralocorticoid nuclear receptors which bind their respective BS should have a major impact on the regulation of the EPAS1 gene (Table 2 & Table 3). The minor rs7589621 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique LIN54, POU3F1-4, and TEAD1, 3 & 4 TFBS has a 100, 95, 95, 90, 99, 92, 100 and 94% occurrence, respectively, in humans (Table 3). However, only the POU3F1-3 TFBS occur once in the EPAS1 gene; consequently, their corresponding TFs should have an impact on EPAS1 gene regulation. The remaining TFBS occur multiple times in the gene and would not be expected to have much impact on gene regulation (Figure 2, Table 2 & Table 3).
The rs1868092 rSNP EPAS1-A allele [A (+ strand) or T (- strand)] located in the unique NR2C2 and YY1 TFBS has an 81 and 94% occurrence, respectively, in humans, and these TFBS occur only once in the EPAS1 gene (Tables 2 & 3). The NR2C2 orphan nuclear receptor, which can act as a repressor or activator of transcription, and the YY1 TF, which exhibits both positive and negative control of transcription, should have an impact on the regulation of the EPAS1 gene (Table 2). The minor rs1868092 rSNP EPAS1-G allele [G (+ strand) or C (- strand)] located in the unique HIC2, KLF5, NFIA, and TEAD1 TFBS has a 99, 100, 100 and 95% occurrence, respectively, in humans, and these TFBS occur only once in the EPAS1 gene (Table 3). Since the corresponding TFs function as activators, enhancers and repressors, the occurrence of these TFBS should impact the regulation of the EPAS1 gene (Table 2 & Table 3).
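The reasoning in the preceding paragraphs singles out TFBS that occur once in the gene and in which the SNP nucleotide has a high occurrence. A minimal sketch of that filter is given below, using the rs1868092 G-allele rows of Table 3. The 90% cut-off (`MIN_OCCURRENCE`) and the function name are assumptions made for illustration; the text does not state a numeric threshold.

```python
# (TF, number of sites in the gene, SNP nucleotide occurrence in %)
# transcribed from the rs1868092 G-allele rows of Table 3.
records = [
    ("E2F6", 1, 69), ("HIC2", 1, 99), ("HLTF", 5, 25), ("HNF4G", 1, 30),
    ("KLF5", 1, 100), ("KLF5", 2, 97), ("MZF1", 85, 95), ("NFIA", 1, 100),
    ("NFIC", 85, 96), ("NFIX", 2, 90), ("NKX2-3", 1, 30), ("NKX2-8", 7, 30),
    ("SOX10", 76, 5), ("TEAD1", 1, 95), ("THAP1", 2, 98), ("THAP1", 3, 17),
]

MIN_OCCURRENCE = 90  # assumed cut-off, not a value taken from the study


def single_site_high_occurrence(rows, min_occurrence=MIN_OCCURRENCE):
    """Hypothetical helper: keep TFs whose site occurs once in the gene
    and whose SNP nucleotide occurrence meets the cut-off."""
    return [
        tf for tf, n_sites, pct in rows
        if n_sites == 1 and pct >= min_occurrence
    ]


candidates = single_site_high_occurrence(records)
print(candidates)
```

Under this assumed cut-off the filter returns the same four TFs named in the text for the G allele; a different threshold would change the selection.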
Human diseases or conditions can be associated with rSNPs of the EPAS1 gene as illustrated above. A change in the rSNP alleles alters the DNA landscape around the SNP where potential TFs attach and regulate a gene. As an example, the putative TFBS associated with the rs56721780 common rSNP EPAS1-G allele from Table 3 are illustrated in Figure 1, and those associated with the rs7589621 common rSNP EPAS1-G allele are illustrated in Figure 2. As can be seen in Table 3, these potential TFBS change when an individual carries the alternate allele. The importance of this is illustrated in Figure 1 with the putative TFAP2A, B & C TFBS, where the common G allele has binding sites for these TFs and the minor C allele does not. The TFAP2A, B & C TFs act as activators and repressors and are involved in a large spectrum of biological functions such as proper eye, face, body wall, limb and neural tube development (cf. Table 2). Another example would be the putative NR3C1 & 2 TFBS, where the common rs7589621 rSNP G allele has created these binding sites for the glucocorticoid and mineralocorticoid nuclear receptors, which regulate carbohydrate, protein and fat metabolism, while the minor A allele has eliminated these TFBS (Figure 2, Table 2 & Table 3). Other examples can be found in Table 3.
Conclusions
SNPs that alter the TFBS are found not only in the promoter regions but also in the introns, exons and UTRs of a gene. The nucleus of the cell is where epigenetic alterations occur and TFs operate to convert chromosomes into single stranded DNA for mRNA transcription, while it is in the cytoplasm where mRNA is processed by separating exons and introns for protein translation. Consequently, it doesn’t matter where TFs bind the DNA in the nucleus because it is only there that TFs function. The SNPs outlined in this report should be considered as rSNPs since they change the DNA landscape for TF binding and have been associated with disease. In this report, examples have been described to illustrate that a change in rSNP alleles in the EPAS1 gene can provide different TFBS, which in turn are also associated with human disease or alterations in human health such as adaptations to high altitude. The putative changes in TFBS created by the four rSNPs could very well influence the significant cline in allele frequencies seen in Tibetans with increasing altitude 20 or the haplotype association with high altitude polycythemia in male Han Chinese 22. As an example, the minor rs7589621 SNP EPAS1-A allele creates a potential TFBS for the FOXC1 TF, which is an important regulator of cell viability and resistance to oxidative stress. Oxidative stress is linked to oxygen, hypoxia, heart failure and the hypoxia-inducible factor transcriptional factors 52. The potential alterations in TFBS obtained by computational analyses need to be verified by future protein/DNA electrophoretic mobility gel shift assays and gene expression studies.