Abstract
Purpose
Signal Transducer and Activator of Transcription 4 (STAT4) is important for signaling by interleukins (IL-12 and IL-23) and type 1 interferons, and several single nucleotide polymorphisms (SNPs) in the gene have been associated with human disease. STAT4 SNPs were examined computationally for changes in potential transcriptional factor binding sites (TFBS), and these changes are discussed in relation to human disease.
Methods
The JASPAR CORE and ConSite databases were used to identify the TFBS. The Vector NTI Advance 11.5 computer program was used to locate all the TFBS in the STAT4 gene from 4 kb upstream of the transcriptional start site to 8.3 kb past the 3’UTR. The JASPAR CORE database was also used to compute the occurrence (%) of each nucleotide within the TFBS.
Results
The STAT4 SNPs in the 70 kb intron between exons 2 and 3 are in linkage disequilibrium and have previously been found to be significantly associated with several vasculitis diseases as well as diabetes. The SNP alleles were found to alter the DNA landscape available for potential transcriptional factors (TFs) to bind, changing the TFBS and thereby altering which TFs potentially regulate the STAT4 gene. These STAT4 SNPs should be considered regulatory (r) SNPs.
Conclusion
The alleles of each rSNP were found to generate unique TFBS, resulting in potential changes in TF regulation of STAT4. These regulatory changes are discussed with respect to changes in human health that result in disease.
Author Contributions
Academic Editor: Norman E. Buroker,
Checked for plagiarism: Yes
Review by: Single-blind
Academic Editor: Liang Liu, Wake Forest School of Medicine
Copyright © 2016 Norman E.
Competing interests
The authors have declared that no competing interests exist.
Introduction
The Janus Kinase-Signal Transducers and Activators of Transcription (JAK-STAT) pathways play a critical role in immune, neuronal, hematopoietic and hepatic systems 1. JAK-STAT is a principal signal transduction pathway in cytokine and growth factor signaling and regulates various cellular processes such as cell proliferation, differentiation, migration and survival 2. JAK-STAT provides the principal intracellular signaling mechanism required for a wide array of cytokines 3, 4. The STAT portion of the signaling cascade has seven mammalian family members: STAT1, 2, 3, 4, 5a, 5b and 6 3, 4. These STATs bind thousands of transcriptional factor binding sites (TFBS) in the genome and regulate the transcription of many protein-coding genes, miRNAs and long noncoding RNAs 4. The STAT4 gene, which is important for signaling by interleukins (IL-12 and IL-23) and type 1 interferons 4, has been found to have several single nucleotide polymorphisms (SNPs) associated with human disease 5, 6, 7, 8, 9, 10, 11, 12. STAT4 transduces IL-12, IL-23 and type 1 interferon-mediated signals into helper T (Th) cell (Th1 and Th17) differentiation, monocyte activation, and interferon-gamma production 12, 13. STAT4-dependent cytokine regulation is involved in the pathogenesis of autoimmune diseases 14, 15 such as systemic lupus erythematosus (SLE), rheumatoid arthritis (RA) and inflammatory bowel disease (IBD) 16, 17.
The STAT4 gene maps to human chromosome 2q32.3 and is about 143 kb in size. The coding region consists of 22 exons with a large 70 kb intron between exons 2 and 3. Several SNPs in the gene have been significantly associated with Behcet’s disease 18, diabetes risk 11, hepatitis B virus-related hepatocellular carcinoma 6, 10, 19, 20, inflammatory bowel disease 21, juvenile idiopathic arthritis 22, primary biliary cirrhosis and Crohn's disease 23, severe renal insufficiency in lupus nephritis 8, systemic lupus erythematosus 5 and ulcerative colitis 24 (Table 1). The rs7574865 STAT4 SNP has been found to be significantly associated with diabetes 11, hepatitis B virus-related hepatocellular carcinoma 6, 10, 19, 20, inflammatory bowel disease 21, juvenile idiopathic arthritis 22, primary biliary cirrhosis and Crohn's disease 23, severe renal insufficiency in lupus nephritis 8, systemic lupus erythematosus 5 and ulcerative colitis 24. The rs11889341 STAT4 SNP has been found to be significantly associated with diabetes 11, hepatitis B virus (HBV) infection, HBV-related cirrhosis and hepatocellular carcinoma 23, severe renal insufficiency in lupus nephritis 8, and systemic lupus erythematosus 5. The rs8179673 STAT4 SNP has been found to be significantly associated with diabetes 11, HBV infection, HBV-related cirrhosis and hepatocellular carcinoma 23 and systemic lupus erythematosus 5. The rs7582694 STAT4 SNP has been found to be significantly associated with HBV infection, HBV-related cirrhosis and hepatocellular carcinoma 23 and severe renal insufficiency in lupus nephritis 8. The rs7574070 and rs7572482 STAT4 SNPs have been found to be significantly associated with Behcet’s disease 18. The rs7572482 STAT4 SNP is located in the promoter region while the remaining SNPs are located in the large 70 kb intron between exons 2 and 3. The reports listed above indicate that these SNPs are in strong linkage disequilibrium (LD) with each other.
Table 1. STAT4 SNPs and disease. The SNPs have been found to be significantly associated with these diseases. The SNPs are located in STAT4 intron 3. MAF is the minor allele frequency. LD is linkage disequilibrium.

| Disease | SNP | Chr 2 Pos | Alleles | MAF | Risk Allele | LD | Study Group | Reference |
|---|---|---|---|---|---|---|---|---|
| Behcet's disease | rs7574070 | 191145762 | A/C | C=0.47 | | Yes | Chinese | 7 |
| | rs7572482 | 191150346 | A/G | A=0.47 | | Yes | | |
| Diabetes | rs11889341 | 191079016 | C/T | T=0.34 | | Yes | Asian, Caucasian | 11 |
| | rs7574865 | 191099907 | G/T | T=0.25 | | Yes | | |
| | rs8179673 | 191104615 | T/C | C=0.26 | | Yes | | |
| | rs10181656 | 191105153 | C/G | G=0.26 | | Yes | | |
| Hepatitis B virus-related hepatocellular carcinoma | rs7574865 | 191099907 | G/T | T=0.25 | G | | Chinese | 6 |
| HBV infection, HBV-related cirrhosis and hepatocellular carcinoma | rs7574865 | 191099907 | G/T | T=0.25 | G | Yes | Chinese | 10 |
| | rs7582694 | 191105394 | G/C | C=0.33 | G | Yes | | |
| | rs11889341 | 191079016 | C/T | T=0.34 | C | Yes | | |
| | rs8179673 | 191104615 | T/C | C=0.26 | T | Yes | | |
| HBV viral clearance | rs7574865 | 191099907 | G/T | T=0.25 | G | | Tibetan, Uygur | 19 |
| Hepatocellular carcinoma | rs7574865 | 191099907 | G/T | T=0.25 | G | | Korean | 20 |
| Inflammatory bowel disease | rs7574865 | 191099907 | G/T | T=0.25 | G | | Chinese, Caucasians | 21 |
| Juvenile idiopathic arthritis | rs7574865 | 191099907 | G/T | T=0.25 | T | | Han Chinese | 22 |
| Primary biliary cirrhosis and Crohn's disease | rs7574865 | 191099907 | G/T | T=0.25 | T | | Japanese | 23 |
| Severe renal insufficiency in lupus nephritis | rs11889341 | 191079016 | C/T | T=0.28 | | Yes | Swedish | 8 |
| | rs7574865 | 191099907 | G/T | T=0.23 | | Yes | | |
| | rs7568275 | 191101726 | C/G | G=0.28 | | Yes | | |
| | rs7582694 | 191105394 | G/C | C=0.22 | | Yes | | |
| Systemic lupus erythematosus | rs11889341 | 191079016 | C/T | T=0.28 | | Yes | European descent | 5 |
| | rs7574865 | 191099907 | G/T | T=0.23 | | Yes | | |
| | rs8179673 | 191104615 | T/C | C=0.26 | | Yes | | |
| | rs10181656 | 191105153 | C/G | G=0.26 | | Yes | | |
| Ulcerative colitis | rs7574865 | 191099907 | G/T | T=0.25 | T | | European descent | 24 |
Single nucleotide changes that affect gene expression by impacting gene regulatory sequences such as promoters, enhancers, and silencers are known as regulatory SNPs (rSNPs) 25, 26, 27, 28. An rSNP within a transcriptional factor binding site (TFBS) can change a transcriptional factor’s (TF) ability to bind its TFBS 29, 30, 31, 32, in which case the TF would be unable to effectively regulate its target gene 33, 34, 35, 36, 37. This concept is examined for the above STAT4 rSNPs and their allelic association with TFBS, where computational analyses 38, 39, 40, 41 were used to identify TFBS alterations created by the STAT4 rSNPs. Recent reports have also introduced the concept of modeling epigenetic modifications to transcriptional factor binding sites in the control of gene expression 42, 43. In this report, the rSNP associations with changes in potential TFBS are discussed with their possible relationship to these diseases in humans.
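To make the idea of an rSNP altering a TFBS concrete, the following minimal Python sketch shows how swapping one allele changes the candidate site sequence on both DNA strands. It is an illustration only, not the procedure used in this study. The 21-nucleotide plus-strand context for rs7574865 is an assumption pieced together from overlapping site sequences listed in Table 2 rather than taken from a reference genome, and the helper names (`reverse_complement`, `apply_allele`, `windows_over_snp`) are hypothetical.

```python
# Minimal sketch: how one SNP allele changes candidate binding-site sequences
# on the plus and minus strands. Helper names are hypothetical.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement (the minus-strand reading) of seq."""
    return seq.translate(COMPLEMENT)[::-1]


def apply_allele(context: str, snp_index: int, allele: str) -> str:
    """Return the context with the nucleotide at snp_index replaced by allele."""
    return context[:snp_index] + allele.lower() + context[snp_index + 1:]


def windows_over_snp(context: str, snp_index: int, width: int) -> list:
    """List every plus-strand window of the given width that covers the SNP."""
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(context) - width)
    return [context[start:start + width] for start in range(first, last + 1)]


# ASSUMPTION: plus-strand context for the rs7574865 G allele, reconstructed
# from overlapping Table 2 entries (not from a reference genome).
CONTEXT_G = "caaaatgtgaatagtggttat"
SNP_INDEX = 8  # 0-based position of the G/T polymorphism in CONTEXT_G

context_t = apply_allele(CONTEXT_G, SNP_INDEX, "T")

# The 11-nucleotide plus-strand window starting at the SNP differs by allele.
print(CONTEXT_G[SNP_INDEX:SNP_INDEX + 11])   # G allele window
print(context_t[SNP_INDEX:SNP_INDEX + 11])   # T allele window

# The same change seen from the minus strand (15-nucleotide window).
print(reverse_complement(CONTEXT_G[2:17]))
print(reverse_complement(context_t[2:17]))
```

Each window returned by `windows_over_snp` is a sequence that a matrix-based scanner would need to score for both alleles; the scoring itself is outside the scope of this sketch.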
Methods
The JASPAR CORE database 44, 45 and ConSite 46 were used to identify the potential STAT4 TFBS in this study. JASPAR is a database of transcription factor DNA-binding preferences used for scanning genomic sequences, whereas ConSite is a web-based tool for finding cis-regulatory elements in genomic sequences. The TFBS and rSNP locations within the binding sites have previously been discussed 47. The Vector NTI Advance 11.5 computer program (Invitrogen, Life Technologies) was used to locate the TFBS in the STAT4 gene (NCBI Ref Seq NM_003151) from 4 kb upstream of the transcriptional start site to 8.3 kb past the 3’UTR, which represents a total of 130.9 kb. The JASPAR CORE database was also used to calculate the occurrence (%) of each nucleotide within the TFBS, where upper case lettering indicates that the nucleotide occurs at 90% or greater and lower case at less than 90%. The occurrence of each SNP allele in the TFBS was also computed from the database (Table 2 & Appendix).
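The nucleotide occurrence calculation and the upper/lower case convention described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in this study: the count matrix `TOY_COUNTS` is invented for demonstration and does not correspond to any JASPAR profile, and the function names are hypothetical.

```python
# Minimal sketch: per-position nucleotide occurrence (%) from a count matrix,
# and the 90% upper/lower case convention. All names and numbers here are
# illustrative assumptions, not values from JASPAR.
from typing import Dict, List

BASES = "ACGT"


def percent_occurrence(counts: Dict[str, List[int]]) -> List[Dict[str, float]]:
    """Convert per-position base counts into per-position percentages."""
    length = len(counts["A"])
    percents = []
    for i in range(length):
        total = sum(counts[base][i] for base in BASES)
        percents.append({base: 100.0 * counts[base][i] / total for base in BASES})
    return percents


def format_site(site: str, percents: List[Dict[str, float]],
                threshold: float = 90.0) -> str:
    """Write a site in upper case where the base occurs at >= threshold percent."""
    formatted = []
    for base, column in zip(site.upper(), percents):
        formatted.append(base if column[base] >= threshold else base.lower())
    return "".join(formatted)


# Invented 6-position count matrix (20 observed sites per position).
TOY_COUNTS = {
    "A": [10, 0, 2, 0, 20, 1],
    "C": [5, 1, 0, 19, 0, 18],
    "G": [2, 0, 1, 1, 0, 0],
    "T": [3, 19, 17, 0, 0, 1],
}

percents = percent_occurrence(TOY_COUNTS)
print(format_site("ATTCAC", percents))  # prints aTtCAC
print(percents[2]["T"])                 # prints 85.0
```

With this toy matrix, the third position shows T at 85%, so it is written in lower case, while positions at or above 90% are written in upper case. The same lookup applied at the SNP position gives the per-allele occurrence values of the kind reported beneath each TFBS in Table 2.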
Table 2. The STAT4 SNPs that were examined in this study, where the minor allele is in red. Also listed are the transcriptional factors (TFs), their potential binding sites (TFBS) containing these SNPs, and the DNA strand orientation. TFs in red differ between the SNP alleles. Upper case nucleotides designate the 90% conserved binding site region, and red marks the SNP location of the alleles in the TFBS. Below each TFBS is the nucleotide occurrence (%) obtained from the JASPAR CORE database. Also listed is the number (#) of binding sites in the gene for the given TF. Note: TFs can bind to more than one nucleotide sequence.

SNP | Allele | TFs | Protein name | # of Sites | TFBS | Strand | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
rs7574865 | G | EN1 | Engrailed homeobox 1 | 1 | gaatagtggtt | plus | |||||||||||||||
g=20% | |||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | ccacTaTTcaCattt | minus | |||||||||||||||||
c=0% | |||||||||||||||||||||
FOXA2 | Forkhead box A2 | 1 | TaTTcACatttt | minus | |||||||||||||||||
c=0.1% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 8 | tgtgaATA | plus | |||||||||||||||||
g=17% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | tcaCaTtttg | minus | |||||||||||||||||
c=20% | |||||||||||||||||||||
MAX | MYC Associated Factor X | 3 | attCACaTtt | minus | |||||||||||||||||
C=100% | |||||||||||||||||||||
RUNX2 | Runt-related transcription factor 2 | 1 | gtgaataGTGGttat | plus | |||||||||||||||||
g=40% | |||||||||||||||||||||
SOX17 | SRY (sex determining region Y)-box 17 | 2 | cacATTtTg | minus | |||||||||||||||||
c=29% | |||||||||||||||||||||
ZNF354C | Zinc finger protein 354C | 81 | attCAC | minus | |||||||||||||||||
C=100% | |||||||||||||||||||||
T | ARID3A | AT rich interactive domain 3A | 81 | ATtAAc | minus | ||||||||||||||||
0.25 | (BRIGHT-like) | A=100% | |||||||||||||||||||
EN1 | Engrailed homeobox 1 | 1 | taatagtggtt | plus | |||||||||||||||||
t=30% | |||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | ccacTaTTaaCattt | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | taaCaTtttg | minus | |||||||||||||||||
a=34% | |||||||||||||||||||||
NKX2-5 | Natural killer 3 homeobox 2 | 23 | ttAAtag | plus | |||||||||||||||||
t=76% | |||||||||||||||||||||
NR1H3::RXRa | Nuclear Receptor Subfamily 1, Group H, Member 3 | 1 | TaaccactatTaacatttt | minus | |||||||||||||||||
Retinoid X receptor, alpha | a=44% | ||||||||||||||||||||
PDX1 | Pancreatic and duodenal homeobox 1 | 158 | tTAATa | plus | |||||||||||||||||
T=100% | |||||||||||||||||||||
PRRX2 | Paired related homeobox 2 | 518 | tATTA | minus | |||||||||||||||||
A=98% | |||||||||||||||||||||
RUNX2 | Runt-related transcription factor 2 | 1 | gttaataGTGGttat | plus | |||||||||||||||||
t=38% | |||||||||||||||||||||
SOX5 | SRY (sex determining region Y)-box 5 | 42 | AaTGTTa | plus | |||||||||||||||||
T=91% | |||||||||||||||||||||
SOX17 | SRY (sex determining region Y)-box 17 | 4 | aacATTtTg | minus | |||||||||||||||||
a=23% | |||||||||||||||||||||
SRY | Sex determining region Y | 4 | attaACAtt | minus | |||||||||||||||||
a=64% | |||||||||||||||||||||
rs11889341 | C | AR | Androgen Receptor | 1 | aaGaAtAagatGttc | minus | |||||||||||||||
G=100% | |||||||||||||||||||||
CEBPb | CCAAT/enhancer binding protein (C/EBP), beta | 1 | tcTTttaccAc | plus | |||||||||||||||||
c=6% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 24 | aaagaATA | minus | |||||||||||||||||
g=26% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 2 | attCtTttac | plus | |||||||||||||||||
C=100% | |||||||||||||||||||||
NR3C1 | Nuclear Receptor Subfamily 3, Group C, Member 1 | 1 | agaataagatGTtCa | minus | |||||||||||||||||
(Glucocorticoid Receptor) | g=80% | ||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 1 | ttaTTcTttt | plus | |||||||||||||||||
c=7% | |||||||||||||||||||||
SOX5 | SRY (sex determining region Y)-box 5 | 73 | aTTcTTt | plus | |||||||||||||||||
c=0% | |||||||||||||||||||||
SOX6 | SRY (sex determining region Y)-box 6 | 6 | ttaTTcTttt | plus | |||||||||||||||||
c=0% | |||||||||||||||||||||
SRY | Sex determining region Y | 9 | taaaAgAAa | minus | |||||||||||||||||
g=0% | |||||||||||||||||||||
ZNF263 | Zinc finger protein 263 | 1 | agaGcAgtggtaaaagaataa | minus | |||||||||||||||||
g=75% | |||||||||||||||||||||
T | CDX2 | Caudal type homeobox 2 | 13 | tggtaAaaAAa | minus | ||||||||||||||||
0.34 | A=100% | ||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | atctTaTTtttttac | plus | |||||||||||||||||
t=0% | |||||||||||||||||||||
FOXD1 | Forkhead Box D1 | 14 | gTAAAaAa | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
FOXD3 | Forkhead box D3 | 29 | tctTaTTttttt | plus | |||||||||||||||||
t=68% | |||||||||||||||||||||
FOXI1 | Forkhead box I1 | 29 | tctTaTTTtttt | plus | |||||||||||||||||
T=100% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 59 | aaaaaATA | minus | |||||||||||||||||
a=57% | |||||||||||||||||||||
FOXO1 | Forkhead Box O1 | 13 | attTtTTtacc | plus | |||||||||||||||||
T=100% | |||||||||||||||||||||
FOXO1 | Forkhead Box O1 | 2 | tctTaTTtttt | plus | |||||||||||||||||
t=88% | |||||||||||||||||||||
FOXO3 | Forkhead Box O3 | 9 | ggtAAAaA | plus | |||||||||||||||||
A=92% | |||||||||||||||||||||
FOXP1 | Forkhead box P1 | 1 | cagtggtAAAaAaat | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
FOXP2 | Forkhead box P2 | 13 | tggTAAAaAaa | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
GATA1 | GATA binding protein 1 | 2 | atcTTATtttt | plus | |||||||||||||||||
t=88% | |||||||||||||||||||||
GATA1 | GATA binding protein 1 | 10 | tttTTAccact | plus | |||||||||||||||||
t=51% | |||||||||||||||||||||
GATA2 | GATA binding protein 2 | 2 | atttttTTAcCact | plus | |||||||||||||||||
t=54% | |||||||||||||||||||||
GATA2 | GATA binding protein 2 | 1 | aacatcTTATtttt | plus | |||||||||||||||||
t=72% | |||||||||||||||||||||
GATA3 | GATA Binding Protein 3 | 24 | AaATAAga | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
GATA4 | GATA binding protein 4 | 2 | tcTTATttttt | plus | |||||||||||||||||
t=79% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | catCtTattt | plus | |||||||||||||||||
t=25% | |||||||||||||||||||||
MEF2C | Myocyte Enhancer Factor 2C | 1 | ggtaaaaAAATAaga | minus | |||||||||||||||||
A=95% | |||||||||||||||||||||
MEF2C | Myocyte Enhancer Factor 2C | 1 | gtggtaaAAAaAtaa | minus | |||||||||||||||||
A=97% | |||||||||||||||||||||
NR3C1 | Nuclear Receptor Subfamily 3, Group C, Member 1 | 1 | aaaataagatGTtCa | minus | |||||||||||||||||
(Glucocorticoid Receptor) | a=15% | ||||||||||||||||||||
SOX5 | SRY (sex determining region Y)-box 5 | 164 | aTTtTTt | plus | |||||||||||||||||
t=0% | |||||||||||||||||||||
SOX6 | SRY (sex determining region Y)-box 6 | 8 | ttaTTtTttt | plus | |||||||||||||||||
t=0% | |||||||||||||||||||||
SRY | Sex determining region Y | 19 | taaaAaAAt | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
SRY | Sex determining region Y | 4 | gtaaAaAAa | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
ZNF263 | Zinc finger protein 263 | 1 | agaGcAgtggtaaaaaaataa | minus | |||||||||||||||||
a=19% | |||||||||||||||||||||
rs8179673 | T | ARID3A | AT rich interactive domain 3A | 397 | AatAAa | plus | |||||||||||||||
(BRIGHT-like) | t=63% | ||||||||||||||||||||
ARID3A | AT rich interactive domain 3A | 227 | ATttAa | minus | |||||||||||||||||
(BRIGHT-like) | A=100% | ||||||||||||||||||||
EN1 | Engrailed homeobox 1 | 1 | aaataaaggtc | plus | |||||||||||||||||
t=80% | |||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | ccTTTaTTtaatata | minus | |||||||||||||||||
a=27% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 25 | attaaATA | plus | |||||||||||||||||
T=91% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 23 | atttaATA | minus | |||||||||||||||||
a=30% | |||||||||||||||||||||
GATA3 | GATA Binding Protein 3 | 27 | AaATAAag | plus | |||||||||||||||||
T=100% | |||||||||||||||||||||
HOXA5 | Hoxa5 | 27 | ctttatTt | minus | |||||||||||||||||
a=88% | |||||||||||||||||||||
LHX3 | LIM homeobox 3 | 3 | atATTAAaTaaag | minus | |||||||||||||||||
T=95% | |||||||||||||||||||||
NFIL3 | Nuclear factor, interleukin 3 regulated | 2 | TTAttTAAtat | minus | |||||||||||||||||
A=96% | |||||||||||||||||||||
NKX3-1 | NK3 homeobox 1 | 72 | tTAtTTA | minus | |||||||||||||||||
A=100% | |||||||||||||||||||||
RORA_1 | RAR-related orphan receptor A | 11 | ataaaGGTCc | plus | |||||||||||||||||
t=60% | |||||||||||||||||||||
RXRa | Retinoid X receptor, alpha | 5 | taaAGgtCcat | plus | |||||||||||||||||
t=5% | |||||||||||||||||||||
SOX2 | SRY (sex determining region Y)-box 2 | 11 | CCtTTaTt | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 1 | cctTTaTtta | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
SOX6 | SRY (sex determining region Y)-box 6 | 1 | cctTTaTtta | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
SOX10 | SRY (sex determining region Y)-box 10 | 142 | cttTaT | minus | |||||||||||||||||
a=0% | |||||||||||||||||||||
SRY | Sex determining region Y | 9 | ttaaAtAAa | plus | |||||||||||||||||
t=7% | |||||||||||||||||||||
TBP | TATA Box Binding Protein | 1 | gtATAtAttaaataa | plus | |||||||||||||||||
t=16% | |||||||||||||||||||||
C | BRCA1 | breast cancer 1, early onset | 39 | acAaagg | plus | ||||||||||||||||
0.26 | c=81% | ||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | ccTTTgTTtaatata | minus | |||||||||||||||||
g=9% | |||||||||||||||||||||
FOXA2 | Forkhead box A2 | 1 | TgTTtaatatat | minus | |||||||||||||||||
g=72% | |||||||||||||||||||||
FOXA2 | Forkhead box A2 | 1 | TaTggACctttg | minus | |||||||||||||||||
g=36% | |||||||||||||||||||||
FOXD1 | Forkhead Box D1 | 6 | tTAAACAa | plus | |||||||||||||||||
C=90% | |||||||||||||||||||||
FOXH1 | Forkhead Box H1 | 1 | tatAtTaaACa | plus | |||||||||||||||||
C=100% | |||||||||||||||||||||
FOXO1 | Forkhead Box O1 | 1 | cttTGTTtaat | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
FOXP1 | Forkhead box P1 | 1 | atatattAAAcAaag | plus | |||||||||||||||||
c=89% | |||||||||||||||||||||
FOXP2 | Forkhead box P2 | 1 | tatTAAACAaa | plus | |||||||||||||||||
C=99% | |||||||||||||||||||||
FOXQ1 | Forkhead box Q1 | 1 | ctttGTTTAat | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | gacCtTtgtt | minus | |||||||||||||||||
g=17% | |||||||||||||||||||||
HNF1A | Hepatocyte Nuclear Factor 1 homeobox A | 1 | gtTTAaTatatact | minus | |||||||||||||||||
g=67% | |||||||||||||||||||||
HNF4A | Hepatocyte Nuclear Factor 4, Alpha | 1 | tggacctttgtttaa | minus | |||||||||||||||||
g=12% | |||||||||||||||||||||
HNF4G | Hepatocyte Nuclear Factor 4, Gamma | 1 | attaaaCAaAGgtcc | plus | |||||||||||||||||
C=93% | |||||||||||||||||||||
JUN::FOS | Jun Proto-Oncogene | 48 | TtAaacA | plus | |||||||||||||||||
FBJ Murine Osteosarcoma Viral Oncogene Homolog | c=83% | ||||||||||||||||||||
RXRa | Retinoid X receptor, alpha | 1 | caaAGgtCcat | plus | |||||||||||||||||
c=85% | |||||||||||||||||||||
SOX2 | SRY (sex determining region Y)-box 2 | 2 | CCtTgGTc | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 1 | cctTTGTttA | minus | |||||||||||||||||
G=93% | |||||||||||||||||||||
SOX5 | SRY (sex determining region Y)-box 5 | 134 | tTTGTTt | minus | |||||||||||||||||
G=96% | |||||||||||||||||||||
SOX6 | SRY (sex determining region Y)-box 6 | 1 | cCtTTGTtta | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
SOX9 | SRY (sex determining region Y)-box 9 | 4 | cctTtGttt | minus | |||||||||||||||||
G=95% | |||||||||||||||||||||
SOX10 | SRY (sex determining region Y)-box 10 | 141 | cttTgT | minus | |||||||||||||||||
g=86% | |||||||||||||||||||||
SRY | Sex determining region Y | 4 | ttaaACAAa | plus | |||||||||||||||||
C=93% | |||||||||||||||||||||
TBP | TATA Box Binding Protein | 1 | gtATAtAttaaacaa | plus | |||||||||||||||||
c=30% | |||||||||||||||||||||
rs7582694 | G | BATF::JUN | Basic leucine zipper transcription factor, | 1 | tctaTGtgTcA | minus | |||||||||||||||
ATF-like Jun proto-oncogene | c=5% | ||||||||||||||||||||
CEBPa | CCAAT/enhancer binding protein (C/EBP), alpha | 1 | gTTgCatactc | minus | |||||||||||||||||
c=32% | |||||||||||||||||||||
FOSL2 | FOS-Like Antigen 2 | 1 | ctaTGtgTCAt | minus | |||||||||||||||||
c=19% | |||||||||||||||||||||
FOXC1 | Forkhead box C1 | 3 | atagaGTA | plus | |||||||||||||||||
g=25% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | acaCaTagag | plus | |||||||||||||||||
g=37% | |||||||||||||||||||||
HLF | Hepatic Leukemia Factor | 1 | ggTtgcatactc | minus | |||||||||||||||||
c=28% | |||||||||||||||||||||
JUN(var.2) | Jun Proto-Oncogene | 2 | actctaTGtgTCAt | minus | |||||||||||||||||
c=10% | |||||||||||||||||||||
JUNB | Jun B Proto-Oncogene | 1 | ctaTGtgTCAt | minus | |||||||||||||||||
c=20% | |||||||||||||||||||||
MAX | MYC Associated Factor X | 1 | tgaCACaTag | plus | |||||||||||||||||
g=29% | |||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 2 | tctaTGTgtc | minus | |||||||||||||||||
c=75% | |||||||||||||||||||||
SOX10 | SRY (sex determining region Y)-box 10 | 77 | ctaTgT | minus | |||||||||||||||||
c=86% | |||||||||||||||||||||
T | T, Brachyury Homolog | 1 | cTAtGTGTcAt | minus | |||||||||||||||||
c=70% | |||||||||||||||||||||
C | BATF::JUN | Basic leucine zipper transcription factor, | 2 | tgtaTGtgTcA | minus | ||||||||||||||||
0.33 | ATF-like Jun proto-oncogene | g=27% | |||||||||||||||||||
BHLHE40 | Basic Helix-Loop-Helix Family, Member E40 | 2 | gaCACaTacag | plus | |||||||||||||||||
c=75% | |||||||||||||||||||||
CEBPa | CCAAT/enhancer binding protein (C/EBP), alpha | 1 | gTTgCatactg | minus | |||||||||||||||||
g=6% | |||||||||||||||||||||
FOSL2 | FOS-Like Antigen 2 | 1 | gtaTGtgTCAt | minus | |||||||||||||||||
g=39% | |||||||||||||||||||||
FOXC1 | Forkhead box C1 | 7 | atacaGTA | plus | |||||||||||||||||
c=25% | |||||||||||||||||||||
FOXC1 | Forkhead box C1 | 10 | atactGTA | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
HLF | Hepatic Leukemia Factor | 1 | ggTtgcatactg | minus | |||||||||||||||||
g=17% | |||||||||||||||||||||
HIF1a::ARNT | Hypoxia Inducible Factor 1, Alpha Subunit | 11 | gtatGTGt | minus | |||||||||||||||||
Aryl Hydrocarbon Receptor Nuclear Translocator | g=47% | ||||||||||||||||||||
JUN(var.2) | Jun Proto-Oncogene | 1 | actgtaTGtgTCAt | minus | |||||||||||||||||
g=32% | |||||||||||||||||||||
JUNB | Jun B Proto-Oncogene | 1 | gtaTGtgTCAt | minus | |||||||||||||||||
g=24% | |||||||||||||||||||||
rs7574070 | A | CEBPB | CCAAT/enhancer binding protein (C/EBP), beta | 1 | aaTgtCtccAt | plus | |||||||||||||||
A=100% | |||||||||||||||||||||
MZF1_1-4 | Myeloid Zinc Finger 1 | 112 | tGGaGA | minus | |||||||||||||||||
t=40% | |||||||||||||||||||||
NFE2L1::MafG | Nuclear Factor, Erythroid 2-Like 1 | 123 | caTGAa | plus | |||||||||||||||||
V-Maf Avian Musculoaponeurotic | a=85% | ||||||||||||||||||||
Fibrosarcoma Oncogene Homolog G | |||||||||||||||||||||
PAX2 | Paired box gene 2 | 3 | cttCatgg | minus | |||||||||||||||||
t=35% | |||||||||||||||||||||
RFX1 | Regulatory Factor X, 1 | 1 | ttcttCatgGagAC | minus | |||||||||||||||||
(Influences HLA Class II Expression) | t=84% | ||||||||||||||||||||
RFX5 | Regulatory factor X, 5 (influences HLA | 1 | cttCatgGagACatt | minus | |||||||||||||||||
class II expression) | t=78% | ||||||||||||||||||||
STAT3 | Signal transducer and activator of | 1 | cTtCatGgAga | minus | |||||||||||||||||
transcription 3 (acute-phase response factor) | t=30% | ||||||||||||||||||||
STAT5A::STAT5B | Signal transducer and activator of | 1 | gtcTCcatGAA | plus | |||||||||||||||||
transcription 5A and transcription 5B | a=43% | ||||||||||||||||||||
THAP1 | THAP domain containing, | 3 | tctCCatga | plus | |||||||||||||||||
apoptosis associated protein 1 | a=29% | ||||||||||||||||||||
ZNF354C | Zinc finger protein 354C | 87 | ctCCAt | plus | |||||||||||||||||
A=100% | |||||||||||||||||||||
C | EBF1 | Early B-cell factor 1 | 7 | ttCttcaGgGa | minus | ||||||||||||||||
0.47 | G=97% | ||||||||||||||||||||
ELK1 | ELK1, member of ETS oncogene family | 2 | ctccctGAag | plus | |||||||||||||||||
c=86% | |||||||||||||||||||||
STAT1 | Signal Transducer And Activator | 6 | cTTCagGGAga | minus | |||||||||||||||||
Of Transcription 1, 91kDa | G=96% | ||||||||||||||||||||
STAT3 | Signal transducer and activator of | 3 | cTtCagGgAga | minus | |||||||||||||||||
transcription 3 (acute-phase response factor) | g=45% | ||||||||||||||||||||
STAT4 | Signal Transducer And Activator Of Transcription 4 | 1 | cTtcaggGAgacat | minus | |||||||||||||||||
g=43% | |||||||||||||||||||||
STAT5A::STAT5B | Signal transducer and activator of | 4 | gtcTCcctGAA | plus | |||||||||||||||||
transcription 5A and transcription 5B | c=37% | ||||||||||||||||||||
TFAP2C | Transcription factor AP-2 gamma | 1 | ccattCttcAGggag | minus | |||||||||||||||||
(activating enhancer binding protein 2 gamma) | G=99% | ||||||||||||||||||||
THAP1 | THAP domain containing, | 3 | tctCCctga | plus | |||||||||||||||||
apoptosis associated protein 1 | c=68% | ||||||||||||||||||||
rs7572482 | A | ARID3A | AT rich interactive domain 3A | 237 | ATgAAa | plus | |||||||||||||||
(BRIGHT-like) | A=100% | ||||||||||||||||||||
EBF1 | Early B-cell factor 1 | 2 | ttCtCatGaaa | plus | |||||||||||||||||
a=27% | |||||||||||||||||||||
EBF1 | Early B-cell factor 1 | 1 | ttttCatGaGa | minus | |||||||||||||||||
t=37% | |||||||||||||||||||||
FOS | FBJ Murine Osteosarcoma Viral Oncogene Homolog | 1 | tcTttcTCAtg | plus | |||||||||||||||||
A=100% | |||||||||||||||||||||
NFE2L1::MafG | Nuclear Factor, Erythroid 2-Like 1 | 79 | caTGAg | minus | |||||||||||||||||
V-Maf Avian Musculoaponeurotic | T=100% | ||||||||||||||||||||
Fibrosarcoma Oncogene Homolog G | |||||||||||||||||||||
NFE2L1::MafG | Nuclear Factor, Erythroid 2-Like 1 | 123 | caTGAa | plus | |||||||||||||||||
V-Maf Avian Musculoaponeurotic | a=85% | ||||||||||||||||||||
Fibrosarcoma Oncogene Homolog G | |||||||||||||||||||||
POU5F1::SOX2 | POU Class 5 Homeobox 1 | 1 | ctTTctcATGaaaac | plus | |||||||||||||||||
SRY (Sex Determining Region Y)-Box 2 | A=90% | ||||||||||||||||||||
RFX1 | Regulatory Factor X, 1 | 1 | tttctCatgaaaAC | plus | |||||||||||||||||
(Influences HLA Class II Expression) | a=59% | ||||||||||||||||||||
SPIB | Spi-B transcription factor (Spi-1/PU.1 related) | 52 | tgaGaAA | minus | |||||||||||||||||
t=37% | |||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 1 | tctTTcTcat | plus | |||||||||||||||||
a=1% | |||||||||||||||||||||
STAT3 | Signal transducer and activator of | 1 | tTcatgagAAa | minus | |||||||||||||||||
transcription 3 (acute-phase response factor) | t=60% | ||||||||||||||||||||
STAT4 | Signal Transducer And Activator Of Transcription 4 | 1 | tTcatgaGAAagaa | minus | |||||||||||||||||
t=35% | |||||||||||||||||||||
STAT6 | Signal Transducer And Activator Of Transcription 6 | 1 | tctTTCtcaaGAAaa | plus | |||||||||||||||||
Interleukin-4 Induced | a=18% | ||||||||||||||||||||
STAT6 | Signal Transducer And Activator Of Transcription 6 | 1 | gttTTCatgaGAAag | minus | |||||||||||||||||
Interleukin-4 Induced | t=51% | ||||||||||||||||||||
STAT5A::STAT5B | Signal transducer and activator of | 1 | ttTctcatGAA | plus | |||||||||||||||||
transcription 5A and transcription 5B | a=43% | ||||||||||||||||||||
STAT5A::STAT5B | Signal transducer and activator of | 1 | gtTTtcatGAg | minus | |||||||||||||||||
transcription 5A and transcription 5B | t=5% | ||||||||||||||||||||
G | ARNT | Aryl Hydrocarbon Receptor Nuclear Translocator | 14 | cACGaG | minus | ||||||||||||||||
0.47 | C=100% | ||||||||||||||||||||
ARNT | Aryl Hydrocarbon Receptor Nuclear Translocator | 14 | ctCGTG | plus | |||||||||||||||||
G=100% | |||||||||||||||||||||
ARNT::AHR | Aryl Hydrocarbon Receptor Nuclear Translocator | 14 | ctCGTG | plus | |||||||||||||||||
Aryl Hydrocarbon Receptor | G=96% | ||||||||||||||||||||
SOX3 | SRY (sex determining region Y)-box 3 | 10 | tctTTcTcgt | plus | |||||||||||||||||
g=1% | |||||||||||||||||||||
SPI1 | Spleen focus forming virus (SFFV) | 1 | acgagaaaGAAgtag | minus | |||||||||||||||||
proviral integration oncogene spi1 | c=12% | ||||||||||||||||||||
STAT3 | Signal transducer and activator of | 3 | tTcacgagAAa | minus | |||||||||||||||||
transcription 3 (acute-phase response factor) | c=36% | ||||||||||||||||||||
STAT4 | Signal Transducer And Activator Of Transcription 4 | 1 | tTcacgaGAAagaa | minus | |||||||||||||||||
c=52% | |||||||||||||||||||||
STAT6 | Signal Transducer And Activator Of Transcription 6 | 3 | tctTTCtcgaGAAaa | plus | |||||||||||||||||
Interleukin-4 Induced | g=65% | ||||||||||||||||||||
STAT6 | Signal Transducer And Activator Of Transcription 6 | 1 | gttTTCacgaGAAag | minus | |||||||||||||||||
Interleukin-4 Induced | c=37% | ||||||||||||||||||||
ZBTB33 | Zinc Finger And BTB Domain Containing 33 | 1 | ttCtCGtGaaaactg | plus | |||||||||||||||||
G=100% | |||||||||||||||||||||
rs7568275 | C | FOXA1 | Forkhead box A1 | 1 | gtaaTaTTaactgaa | minus | |||||||||||||||
g=6% | |||||||||||||||||||||
FOXI1 | Forkhead box I1 | 3 | ataTgTTcagtt | plus | |||||||||||||||||
c=0% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 10 | tgaacATA | minus | |||||||||||||||||
g=9% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 8 | gaaCaTataa | minus | |||||||||||||||||
g=20% | |||||||||||||||||||||
NKX2-5 | NK2 Homeobox 5 | 19 | ttAActg | minus | |||||||||||||||||
g=65% | |||||||||||||||||||||
SRF | Serum Response Factor | 1 | actgaaCatAtaaaGtaa | minus | |||||||||||||||||
(C-Fos Serum Response Element-Binding | g=0.44 | ||||||||||||||||||||
Transcription Factor) | |||||||||||||||||||||
G | BATF::JUN | Basic leucine zipper transcription factor, | 2 | atgtTGAgTtA | plus | ||||||||||||||||
0.28 | ATF-like Jun proto-oncogene | G=99% | |||||||||||||||||||
BATF::JUN | Basic leucine zipper transcription factor, | 1 | atatTaAcTcA | minus | |||||||||||||||||
ATF-like Jun proto-oncogene | c=82% | ||||||||||||||||||||
BRCA1 | Breast Cancer 1, Early Onset | 43 | tcAacat | minus | |||||||||||||||||
c=81% | |||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | gtaaTaTTaactcaa | minus | |||||||||||||||||
c=33% | |||||||||||||||||||||
FOXA1 | Forkhead box A1 | 1 | tataTgTTgagttaa | plus | |||||||||||||||||
g=13% | |||||||||||||||||||||
FOXA2 | Forkhead box A2 | 1 | TgTTgAgttaat | plus | |||||||||||||||||
g=22% | |||||||||||||||||||||
FOXA2 | Forkhead box A2 | 1 | TaTTaActcaac | minus | |||||||||||||||||
c=33% | |||||||||||||||||||||
FOXD3 | Forkhead box D3 | 1 | ataTgTTgagtt | plus | |||||||||||||||||
g=21% | |||||||||||||||||||||
FOXD3 | Forkhead box D3 | 1 | taaTaTTaactc | minus | |||||||||||||||||
c=26% | |||||||||||||||||||||
FOXH1 | Forkhead Box H1 | 2 | ttaAcTcaACa | minus | |||||||||||||||||
c=70% | |||||||||||||||||||||
FOXI1 | Forkhead box I1 | 1 | ataTgTTgagtt | plus | |||||||||||||||||
g=0% | |||||||||||||||||||||
FOXL1 | Forkhead box L1 | 16 | tcaacATA | minus | |||||||||||||||||
c=17% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | caaCaTataa | minus | |||||||||||||||||
c=17% | |||||||||||||||||||||
JUN::FOS | Jun Proto-Oncogene | 17 | TaActcA | minus | |||||||||||||||||
FBJ Murine Osteosarcoma Viral Oncogene Homolog | c=83% | ||||||||||||||||||||
MAFF | V-Maf Avian Musculoaponeurotic | 1 | attaacTCAacAtataaa | minus | |||||||||||||||||
Fibrosarcoma Oncogene Homolog F | C=94% | ||||||||||||||||||||
MAFK | V-Maf Avian Musculoaponeurotic | 1 | ttaacTCAaCAtata | minus | |||||||||||||||||
Fibrosarcoma Oncogene Homolog K | C=96% | ||||||||||||||||||||
ZNF354C | Zinc finger protein 354C | 58 | ctCCAc | minus | |||||||||||||||||
C=100% | |||||||||||||||||||||
rs10181656 | C | AR | Androgen Receptor | 1 | tgGtACAagggGtga | minus | |||||||||||||||
G=95% | |||||||||||||||||||||
E2F6 | E2F transcription factor 6 | 3 | ggGtGaGAaga | minus | |||||||||||||||||
G=100% | |||||||||||||||||||||
GATA1 | GATA binding protein 1 | 4 | ctcTTcTCacc | plus | |||||||||||||||||
c=22% | |||||||||||||||||||||
GATA2 | GATA binding protein 2 | 1 | caactcTTcTCacc | plus | |||||||||||||||||
c=40% | |||||||||||||||||||||
GATA4 | GATA binding protein 4 | 3 | tcTTcTCaccc | plus | |||||||||||||||||
c=45% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 7 | cccCtTgtac | plus | |||||||||||||||||
c=17% | |||||||||||||||||||||
KLF5 | Kruppel-like factor 5 (intestinal) | 1 | ttctCaCCCc | plus | |||||||||||||||||
C=100% | |||||||||||||||||||||
MZF1_1-4 | Myeloid Zinc Finger 1 | 58 | gGGtGA | minus | |||||||||||||||||
G=90% | |||||||||||||||||||||
MZF1_5-13 | Myeloid Zinc Finger 1 | 1 | caAgGgtga | minus | |||||||||||||||||
g=88% | |||||||||||||||||||||
NR1H3::RXRa | Nuclear Receptor Subfamily 1, Group H, Member 3 | 1 | TcaccccttgTaccactac | plus | |||||||||||||||||
Retinoid X receptor, alpha | c=77% | ||||||||||||||||||||
SP1 | Specificity Protein 1 | 1 | ttCtcacCcct | plus | |||||||||||||||||
C=80% | |||||||||||||||||||||
SREBF1 | Sterol regulatory element binding | 2 | cTCAcccctt | plus | |||||||||||||||||
transcription factor 1 | c=88% | ||||||||||||||||||||
SREBF2 | Sterol regulatory element binding | 2 | aaGgggTGAg | minus | |||||||||||||||||
transcription factor 2 | g=77% | ||||||||||||||||||||
ZNF263 | Zinc finger protein 263 | 1 | agtGgtacaaggggtgagaag | minus | |||||||||||||||||
g=59% | |||||||||||||||||||||
G | BRCA1 | Breast cancer 1, early onset | 12 | tcAgccc | plus | ||||||||||||||||
0.26 | g=9% | ||||||||||||||||||||
GATA1 | GATA binding protein 1 | 1 | ctcTTcTCagc | plus | |||||||||||||||||
g=34% | |||||||||||||||||||||
GATA2 | GATA binding protein 2 | 1 | caactcTTcTCagc | plus | |||||||||||||||||
g=45% | |||||||||||||||||||||
GATA4 | GATA binding protein 4 | 1 | tcTTcTCagcc | plus | |||||||||||||||||
g=34% | |||||||||||||||||||||
HLTF | Helicase-like transcription factor | 1 | gccCtTgtac | plus | |||||||||||||||||
g=20% | |||||||||||||||||||||
HNF4G | Hepatocyte Nuclear Factor 4, Gamma | 1 | gtggtaCAagGgctg | minus | |||||||||||||||||
c=46% | |||||||||||||||||||||
KLF5 | Kruppel-like factor 5 (intestinal) | 1 | tctCagCCCt | plus | |||||||||||||||||
g=38% | |||||||||||||||||||||
MAFB | v-maf musculoaponeurotic fibrosarcoma | 15 | Gctgagaa | minus | |||||||||||||||||
oncogene homolog B (avian) | c=80% | ||||||||||||||||||||
SP1 | Specificity Protein 1 | 1 | tctcagCcctt | plus | |||||||||||||||||
g=43% | |||||||||||||||||||||
SREBF1 | Sterol regulatory element binding | 1 | cTCAgccctt | plus | |||||||||||||||||
transcription factor 1 | g=12% | ||||||||||||||||||||
STAT3 | Signal transducer and activator of | 1 | gggCtgagAAg | minus | |||||||||||||||||
transcription 3 (acute-phase response factor) | C=91% | ||||||||||||||||||||
THAP1 | THAP domain containing, | 2 | cagCCcttg | plus | |||||||||||||||||
apoptosis associated protein 1 | g=6% |
Results
STAT4 rSNPs and TFBS
The STAT4 gene encodes a transcription factor (TF) that belongs to the STAT family of TFs, which act as transcriptional activators in response to cytokines and growth factors. This protein is essential for mediating responses to IL-12 in lymphocytes and for regulating the differentiation of T helper cells. Because of the importance of this gene in signal transduction and activation of transcription, STAT4 SNPs associated with disease were computationally evaluated with regard to TFBS. The rs7574865 STAT4 SNP, located in the large 70 kb intron, has been found to have the most significant association with human disease (Table 1).
The common rs7574865 SNP STAT4-G allele creates three unique TFBS for the FOXL1, MAX and ZNF354C TFs, which are involved with the regulation of metabolism, cell proliferation and gene expression during ontogenesis, transcription regulation and repression, respectively (Table 2, Appendix). The minor STAT4-T allele creates eight unique TFBS for the ARID3A, FOXQ1, NKX2-5, NR1H3::RXRa, PDX1, PRRX2, SOX5 and SRY TFs, which are involved with the control of cell cycle progression, differentiation of lung epithelium, negative regulation of chondrocyte maturation, regulation of cholesterol homeostasis, glucose-dependent regulation of insulin gene transcription, proliferating fetal fibroblasts and the developing dermal layer, embryonic development and male development, respectively (Figure 1, Table 2, Appendix). There are also six conserved TFBS for the EN1, FOXA1, FOXA2, HLTF, RUNX2 and SOX17 TFs, which are involved in controlling development, embryonic development, altering chromatin structure, osteoblastic differentiation and transcription regulation, respectively (Table 2, Appendix).
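The unique and conserved categories used in this section amount to simple set operations on the TFs predicted for each allele. The following is a minimal Python sketch of that bookkeeping, not the procedure actually run for this study; the TF names come from the paragraph above (with the heterodimer written as NR1H3::RXRa), while the function name `classify_tfbs` and the way the per-allele sets are rebuilt are illustrative assumptions.

```python
# TF names are taken from the rs7574865 paragraph; each per-allele set is
# rebuilt here as (unique + conserved) purely for illustration.
conserved_listed = {"EN1", "FOXA1", "FOXA2", "HLTF", "RUNX2", "SOX17"}
g_allele = {"FOXL1", "MAX", "ZNF354C"} | conserved_listed
t_allele = {
    "ARID3A", "FOXQ1", "NKX2-5", "NR1H3::RXRa",
    "PDX1", "PRRX2", "SOX5", "SRY",
} | conserved_listed


def classify_tfbs(common, minor):
    """Split predicted TFs into common-only, minor-only and shared sets."""
    return {
        "unique_common": common - minor,  # sites lost with the minor allele
        "unique_minor": minor - common,   # sites gained with the minor allele
        "conserved": common & minor,      # sites present with either allele
    }


result = classify_tfbs(g_allele, t_allele)
for label, tfs in result.items():
    print(label, sorted(tfs))
```

The same function could be applied to any other rSNP in Table 2 once the TF lists for its two alleles are available.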
Figure 1. Double-stranded DNA from the STAT4 gene showing the potential TFBS for fourteen different TFs, which can bind their respective DNA sequences either above (+) or below (-) the duplex (cf. Table 2). The rs7574865 rSNP minor STAT4-T allele is found in each of these TFBS. As shown, this rSNP is located in the 70 kb intron between exons 2 and 3 of the STAT4 gene. Also included with each potential TFBS is its % sequence homology to the duplex.
The common rs11889341 SNP STAT4-C allele creates three unique TFBS for the AR, CEBPb and SOX3 TFs, which are involved with steroid-hormone activation, the regulation of acute-phase reaction, inflammation and hemopoiesis and the formation of the hypothalamo-pituitary axis, respectively (Figure 2, Table 2, Appendix). The minor STAT4-T allele creates thirteen unique TFBS for the CDX2, FOXD1, FOXD3, FOXI1, FOXO1, FOXO3, FOXP1, FOXP2, GATA1-4 and MEF2C TFs, which are involved with the regulation of intestine-specific genes, kidney development, transcriptional activation and repression, kidney function, metabolic homeostasis in response to oxidative stress, a trigger for apoptosis, differentiation of lung epithelium, differentiation of lung epithelium, switching from fetal to adult hemoglobin, development and proliferation of hematopoietic and endocrine cell lineages, endothelial cell biology, cardiac myocyte enlargement and vascular development, respectively (Table 2, Appendix). There are also seven conserved TFBS for the FOXL1, HLTF, NR3C1, SOX5, SOX6, SRY and ZNF263 TFs, which are involved in the specification and differentiation of lung epithelium, altering chromatin structure, modulation of immune responses through suppression of chemokine and cytokine production, regulation of embryonic development, development of the central nervous system, a genetic switch in male development and transcription repression, respectively (Table 2, Appendix).
The common rs8179673 SNP STAT4-T allele creates nine unique TFBS for the ARID3A, EN1, FOXL1, GATA3, HOXA5, LHX3, NFIL3, NKX3-1 and RORa_1 TFs, which are involved with the control of cell cycle progression, the specification and differentiation of lung epithelium, endothelial cell biology, developmental regulatory system, pituitary development and motor neuron specification, expression of interleukin-3, regulation of embryonic development, cellular differentiation, immunity, circadian rhythm as well as lipid, steroid, xenobiotics and glucose metabolism, respectively (Table 2, Appendix). The minor rs8179673 SNP STAT4-C allele creates fifteen unique TFBS for the BRCA1, FOXA2, FOXD1, FOXO1, FOXP1 & 2, FOXQ1, HLTF, HNF1a, HNF4a, HNF4g, JUN::FOS, SOX5 and SOX6 TFs, which are involved with tumor suppression, embryonic development, kidney development, insulin signaling, differentiation of lung epithelium, hair follicle differentiation, altering chromatin structure, regulation of the tissue specific expression of pancreatic islet cells and liver, regulation of several hepatic genes, cell proliferation and differentiation, transcriptional regulation and activation, respectively (Table 2, Appendix). There are also eight conserved TFBS for the FOXA1, RXRa, SOX2, SOX3, SOX6, SOX10, SRY and TBP TFs, which are involved with embryonic development, retinoic acid-mediated gene activation, regulation of embryonic development, the formation of the hypothalamo-pituitary axis, development of the central nervous system, a genetic switch in male development and binding of TFIID to the TATA box, respectively (Table 2, Appendix).
The common rs7582694 SNP STAT4-G allele creates four unique TFBS for the MAX, SOX3, SOX10 and T TFs, which are involved with transcription regulation, the formation of the hypothalamo-pituitary axis, regulation of embryonic development and mesoderm formation and differentiation, respectively (Table 2, Appendix). The minor STAT4-C allele creates three unique TFBS for the BHLHE40, HIF1a::ARNT and SRY TFs, which are involved with the regulation of circadian rhythm, cellular and systemic responses to hypoxia, and a genetic switch in male development, respectively (Table 2, Appendix). There are also eight conserved TFBS for the BATF::JUN, CEBPa, FOSL2, FOXC1, HLTF, JUN (var.2) and JUNB TFs, which are involved in negative regulation of AP-1/ATF transcriptional events, cell cycle regulation and body weight homeostasis, regulation of cell proliferation, differentiation, and transformation, cell viability and resistance to oxidative stress, altering chromatin structure, regulation of gene expression and gene activity, respectively (Table 2, Appendix).
The common rs7574070 SNP STAT4-A allele creates seven unique TFBS for the CEBPb, MZF1_1-4, NFE2L1::MAFG, PAX2, RFX1 and RFX5 TFs, which are involved with the regulation of acute-phase reaction, inflammation and hemopoiesis, hemopoietic development, up-regulation of cytoprotective genes, kidney cell differentiation and the activation of transcription from class II MHC promoters, respectively (Table 2, Appendix). The minor rs7574070 SNP STAT4-C allele creates five unique TFBS for the EBF1, ELK1, STAT1, STAT4 and TFAP2C TFs, which are involved with transcriptional activation, the ras-raf-MAPK signaling cascade, transcriptional activation for cell viability in response to different cell stimuli and pathogens and activation of genes involved in a large spectrum of biological developmental functions, respectively (Table 2, Appendix). There are also three conserved TFBS for the STAT3, STAT5A::STAT5B and THAP1 TFs, which are involved with signal transduction and transcriptional activation as well as the regulation of endothelial cell proliferation and the G1/S cell-cycle, respectively (Table 2, Appendix).
Figure 2. Double-stranded DNA from the STAT4 gene showing the potential TFBS for nine different TFs, which can bind their respective DNA sequences either above (+) or below (-) the duplex (cf. Table 2). The rs11889341 rSNP common STAT4-C allele is found in each of these TFBS. As shown, this rSNP is located in the 70 kb intron between exons 2 and 3 of the STAT4 gene. Also included with each potential TFBS is its % sequence homology to the duplex.
The common rs7572482 SNP STAT4-A allele creates eight unique TFBS for the ARID3A, EBF1, FOS, NFE2L1::MAFG, POU5F1::SOX2, RFX1, SPIB and STAT5A::STAT5B TFs, which are involved with the control of cell cycle progression, transcriptional activation, regulation of cell proliferation, differentiation, and transformation, up-regulation of cytoprotective genes, embryonic development and stem cell pluripotency, regulation of MHC class II gene expression, lymphoid-specific enhancer activity, and signal transduction and activation of transcription, respectively (Table 2, Appendix). The minor rs7572482 SNP STAT4-G allele creates four unique TFBS for the ARNT, ARNT::AHR, SPI1 and ZBTB33 TFs, which are involved with xenobiotic metabolism, activation of gene expression during myeloid and B-lymphoid cell development, and transcriptional regulation with bimodal DNA-binding specificity, respectively (Table 2, Appendix). There are also four conserved TFBS for the SOX3, STAT3, STAT4 and STAT6 TFs, which are involved with the formation of the hypothalamo-pituitary axis, signal transduction and transcriptional activation, mediating responses to IL12 in lymphocytes and regulating the differentiation of T helper cells, and exerting IL4-mediated biological responses, respectively (Table 2, Appendix).
The remaining two STAT4 SNPs (rs7568275 and rs10181656) that have been found to be significantly associated with human disease (Table 1) can be analyzed in a similar fashion as the SNPs above (Table 2, Appendix).
Discussion
Genome-wide association studies (GWAS) over the last decade have identified nearly 6,500 disease or trait-predisposing SNPs, of which only 7% are located in protein-coding regions of the genome 48, 49; the remaining 93% are located within non-coding areas 50, 51 such as regulatory or intergenic regions. SNPs that occur in the putative regulatory region of a gene, where a single base change in the DNA sequence of a potential TFBS may affect gene expression, are drawing more attention 25, 27, 52. A SNP in a TFBS can have multiple consequences. Often the SNP neither changes the TFBS interaction nor alters gene expression, since a transcriptional factor (TF) will usually recognize a number of different binding sites in the gene. In some cases the SNP may increase or decrease TF binding, which results in allele-specific gene expression. In rare cases, a SNP may eliminate the natural binding site or generate a new binding site, in which case the gene is no longer regulated by the original TF. Therefore, functional rSNPs in TFBS may result in differences in gene expression, phenotypes and susceptibility to environmental exposure 52. Examples of rSNPs associated with disease susceptibility are numerous and several reviews have been published 52, 53, 54, 55.
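How a single base change can weaken or remove a predicted binding site can be illustrated with a small position weight matrix scan. The Python sketch below is illustrative only and is not the Vector NTI/JASPAR workflow used in this study: the 6-bp count matrix `TOY_PFM`, the two flanking sequences, the pseudocount and the uniform background are all hypothetical, and only the given strand is scanned.

```python
import math

# Hypothetical 6-bp position frequency matrix (counts of A, C, G, T at each
# position). It is a toy example, not a real JASPAR profile.
TOY_PFM = {
    "A": [2, 0, 18, 0, 1, 3],
    "C": [1, 19, 0, 0, 17, 2],
    "G": [16, 0, 1, 20, 1, 13],
    "T": [1, 1, 1, 0, 1, 2],
}


def log_odds_matrix(pfm, pseudocount=0.5, background=0.25):
    """Convert raw counts into per-position log2 odds scores."""
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        total = sum(pfm[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def best_hit(seq, pwm):
    """Return (score, offset) of the best-scoring window on the given strand."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((score, start))
    return max(hits)


PWM = log_odds_matrix(TOY_PFM)

# Hypothetical flanking sequence; the two alleles differ only at index 5.
common_seq = "TTGCAGCGTT"
minor_seq = "TTGCATCGTT"

common_score, common_offset = best_hit(common_seq, PWM)
minor_score, minor_offset = best_hit(minor_seq, PWM)
print(f"common allele: best score {common_score:.2f} at offset {common_offset}")
print(f"minor allele:  best score {minor_score:.2f} at offset {minor_offset}")
```

In this toy case the common-allele sequence contains the matrix consensus, so its best window scores higher than any window in the minor-allele sequence. A real analysis would apply a score threshold and scan both strands.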
The rs7574865 rSNP STAT4-G allele [G (+ strand) or C (- strand)] located in the unique MAX and ZNF354C TFBS has a 100% occurrence in humans, while the allele in the unique FOXL1 TFBS has a 17% occurrence (Table 2). Since these binding sites (BS) occur multiple times in the gene, the rSNP G allele should not have much of an impact on gene regulation (Table 2). The minor rs7574865 rSNP STAT4-T allele [T (+ strand) or A (- strand)] located in the unique ARID3A, FOXQ1 and PDX1 TFBS has a 100% occurrence in humans, while the allele in the NR1H3::RXRa TFBS has a 44% occurrence (Figure 1, Table 2). Since nearly all the unique BS for this allele occur multiple times in this gene, these TFBS would not be expected to have much of an effect on STAT4 regulation. The exception is the NR1H3::RXRa BS, which occurs only once in the gene (Table 2) and whose TF is a key regulator of macrophage function, controlling transcriptional programs involved in lipid homeostasis and inflammation (Appendix). Since the NR1H3::RXRa protein duplex is part of the NR1 subfamily of the retinoid nuclear receptor superfamily, the presence of its TFBS created by the minor T allele could in part be responsible for the diseases listed in Table 1 that are significantly associated with this rSNP.
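The percent occurrence values quoted in this section describe how often a given nucleotide is observed at the SNP position within the set of sequences behind a TFBS matrix. A minimal Python sketch of that calculation follows; the four-position count matrix `TOY_PFM` and the function name `nucleotide_occurrence` are hypothetical and do not correspond to any JASPAR profile or to the exact procedure used here.

```python
# Hypothetical counts of A, C, G, T at four positions of a toy TFBS matrix.
TOY_PFM = {
    "A": [5, 0, 12, 3],
    "C": [5, 0, 4, 3],
    "G": [5, 20, 2, 11],
    "T": [5, 0, 2, 3],
}


def nucleotide_occurrence(pfm, position, base):
    """Percent of matrix sequences carrying `base` at 0-based `position`."""
    column_total = sum(pfm[b][position] for b in "ACGT")
    return 100.0 * pfm[base.upper()][position] / column_total


# A G allele falling on position 1 is invariant in the toy matrix, whereas
# the same allele on position 3 or a T allele on position 2 is less frequent.
print(nucleotide_occurrence(TOY_PFM, 1, "G"))
print(nucleotide_occurrence(TOY_PFM, 3, "G"))
print(nucleotide_occurrence(TOY_PFM, 2, "T"))
```

Under this reading, an allele with a 100% occurrence sits at a position where the matrix admits no other base, so the alternate allele would be expected to disrupt the site.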
The rs11889341 rSNP STAT4-C allele [C (+ strand) or G (- strand)] located in the unique AR TFBS has a 100% occurrence in humans, and this TFBS occurs only once in the gene (Figure 2, Table 2). The androgen receptor is a steroid-hormone activated transcription factor which stimulates transcription of androgen responsive genes that are expressed in bone marrow, mammary gland, prostate, testicular and muscle tissues. The absence of this TFBS with the minor STAT4-T allele should have a major effect relating to the diseases listed in Table 1. The minor rs11889341 rSNP STAT4-T allele [T (+ strand) or A (- strand)] located in the unique CDX2, FOXD1, FOXI1, FOXO1, FOXP1, FOXP2 and GATA3 TFBS has a 100% occurrence in humans. These TFBS occur multiple times in the gene, so they would not be expected to have much impact on the regulation of the gene, except for the FOXP1 TFBS, which occurs only once in the gene (Table 2). Although FOXP1 is a member of subfamily P of the forkhead box (FOX) transcription factors, which play important roles in the regulation of tissue- and cell type-specific gene transcription during both development and adulthood, it is doubtful that the presence of this TFBS only with the minor T allele would have much impact on the regulation of the gene, since TFBS for other family members are represented with the minor allele (Table 2). The SNP T allele is also located in the two unique MEF2C TFBS, where it has a 95 and 97% occurrence in humans, and each of these TFBS occurs only once in the gene (Table 2). The MEF2C TF controls cardiac morphogenesis and myogenesis and is also involved in vascular development; consequently, the presence or absence of this TFBS should have an impact on the diseases listed in Table 1.
The rs8179673 rSNP STAT4-T allele [T (+ strand) or A (- strand)] located in the unique ARID3A, GATA3 and NKX3-1 TFBS has a 100% occurrence in humans, while in the LHX3 and NFIL3 TFBS it has a 95 and 96% occurrence, respectively (Table 2). Since these TFBS occur more than once in the gene, it is doubtful that these BS would have much impact on the regulation of the gene. The minor rs8179673 rSNP STAT4-C allele [C (+ strand) or G (- strand)] located in the unique FOXH1, FOXO1, FOXQ1, SOX2 and SOX6 TFBS has a 100% occurrence in humans, while in the HNF1a and HNF4g TFBS it has a 67 and 93% occurrence, respectively (Table 2). All of these TFBS occur only once in the gene (Table 2). However, the FOXH1, FOXO1, FOXQ1, SOX2 and SOX6 TFBS are involved with transcription machinery and are represented in families consisting of multiple members. Therefore, it is unlikely that the presence or absence of one family member would have much impact on gene regulation, especially since TFBS for multiple family members are represented by the minor C allele (Table 2, Appendix). The unique HNF1a and HNF4g TFBS, which also occur only once in the gene, are BS for the HNF1a and HNF4g transcriptional activators that regulate the tissue specific expression of multiple genes, especially in pancreatic islet cells and in liver (Table 2, Appendix).
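Each allele is described by its base on the plus strand and by the complementary base on the minus strand, and TFBS reported on the minus strand are read from the opposite strand of the duplex. A short Python sketch of that strand conversion is given below as an illustration; the helper names are assumptions, and the example sequence is the plus-strand BATF::JUN site listed in Table 2.

```python
# Translation table that preserves the upper/lower case used in Table 2.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_on_minus_strand(allele):
    """Return the minus-strand base that pairs with a plus-strand allele."""
    return allele.translate(COMPLEMENT)


plus_site = "atgtTGAgTtA"  # BATF::JUN site on the plus strand (Table 2)
print(reverse_complement(plus_site))
for plus_allele in "TC":   # the two rs8179673 alleles on the plus strand
    print(plus_allele, "(+ strand) pairs with",
          allele_on_minus_strand(plus_allele), "(- strand)")
```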
The rs7574070 rSNP STAT4-A allele [A (+ strand) or T (- strand)] located in the unique CEBPb and ZNF354C TFBS has a 100% occurrence in humans, while in the RFX1 and RFX5 TFBS it has an 84 and 78% occurrence, respectively (Table 2). The CEBPb, RFX1 and RFX5 TFBS occur only once in the gene while the ZNF354C TFBS occurs 87 times in the gene (Table 2); consequently, only the CEBPb, RFX1 and RFX5 TFBS might have an impact on gene regulation. The CEBPb TF is an important transcriptional activator regulating the expression of genes involved in immune and inflammatory responses; consequently, the loss of the BS with the minor C allele could have an impact on Behcet’s disease (Table 1). The RFX1 & 5 TFs are important regulatory factors essential for MHC class II gene expression, and the loss of these BS with the presence of the minor C allele could also have an impact on Behcet’s disease (Table 1). The rs7574070 rSNP STAT4-C allele [C (+ strand) or G (- strand)] located in the unique STAT4 and TFAP2C TFBS has a 43 and 99% occurrence in humans, respectively, and these TFBS occur only once in the gene (Table 2). The STAT4 TF is important in regulating genes associated with systemic lupus erythematosus and rheumatoid arthritis (Appendix); consequently, the occurrence of this TFBS only with the minor C allele could have an impact on Behcet’s disease (Table 1). The TFAP2C TF is involved with a large spectrum of important biological functions (Appendix) and could also contribute to Behcet’s disease, since its TFBS is represented only with the minor C allele (Table 2, Appendix).
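The reasoning applied to each rSNP, that a unique TFBS is more likely to matter when it occurs only once in the gene, can be expressed as a simple filter. The Python sketch below uses the site counts and occurrence values quoted above for the rs7574070 A allele; the `Site` record and the function `single_copy_sites` are hypothetical names introduced for illustration and are not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    tf: str              # transcription factor name
    sites_in_gene: int   # number of times the TFBS occurs in the gene
    occurrence_pct: int  # occurrence (%) of the allele within the TFBS


# Values quoted in the text for the rs7574070 A allele.
a_allele_sites = [
    Site("CEBPb", 1, 100),
    Site("RFX1", 1, 84),
    Site("RFX5", 1, 78),
    Site("ZNF354C", 87, 100),
]


def single_copy_sites(sites):
    """Keep TFs whose binding site occurs exactly once in the gene."""
    return [site.tf for site in sites if site.sites_in_gene == 1]


print(single_copy_sites(a_allele_sites))
```

The filter only prioritizes candidates; whether a single-copy site actually affects STAT4 regulation still has to be judged from the function of the TF and confirmed experimentally.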
The rs7572482 rSNP STAT4-A allele [A (+ strand) or T (- strand)] located in the unique ARID3A, FOS, NFE2L1::MAFG and POU5F1::SOX2 TFBS has a 100% occurrence in humans, except in the POU5F1::SOX2 TFBS, where it has a 90% occurrence (Table 2). The ARID3A and NFE2L1::MAFG TFBS occur multiple times in the gene while the FOS and POU5F1::SOX2 TFBS each occur only once (Table 2). The FOS TF is a regulator of cell proliferation, differentiation and transformation while the POU5F1::SOX2 TFs play a key role in embryonic development and stem cell pluripotency (Appendix), which could have an impact on Behcet’s disease (Table 1). The rs7572482 rSNP STAT4-G allele [G (+ strand) or C (- strand)] located in the unique ARNT, ARNT::AHR and ZBTB33 TFBS has a 100% occurrence in humans, except in the ARNT::AHR TFBS, where it has a 96% occurrence (Table 2). The ARNT and ARNT::AHR TFBS occur 14 times in the STAT4 gene while the ZBTB33 TFBS occurs only once. The ZBTB33 TF is a transcriptional regulator involved with zinc finger motifs and may contribute to the repression of target genes of the Wnt signaling pathway (Appendix). Similar logic can be used to evaluate the potential TFBS within the other STAT4 rSNPs found in Tables 1 and 2.
Human diseases or conditions can be associated with rSNPs of the STAT4 gene as illustrated above. A change in the rSNP allele alters the DNA landscape around the SNP for potential TFs to attach and regulate a gene. As an example, the potential TFBS associated with the rs7574865 rSNP STAT4-T allele from Table 2 are illustrated in Figure 1, and those associated with the rs11889341 common rSNP STAT4-C allele are illustrated in Figure 2. As can be seen in Table 2, these potential TFBS change when an individual carries the alternate allele. The importance of this is illustrated in Figure 2 with the AR TFBS, which is present with the common C allele but not with the minor T allele. The AR TF is a steroid-hormone activated transcription factor which stimulates transcription of androgen responsive genes. A second example is the HNF1a and HNF4g TFBS, which are present with the minor rs8179673 C allele but not with the common allele. These TFs regulate the tissue specific expression of multiple genes, especially in pancreatic islet and liver cells. A third example, found with the rs7574070 common rSNP STAT4-A allele and not with the minor C allele, is the RFX1 & 5 TFBS, whose TFs are important regulatory factors essential for MHC class II gene expression. Other examples can be found in Table 2.
Conclusions
SNPs that alter TFBS are found not only in promoter regions but also in the introns, exons and UTRs of a gene. TFs operate in the nucleus, where epigenetic alterations occur, DNA is transcribed and introns are spliced out of the primary transcript, whereas the mature mRNA is translated into protein in the cytoplasm. Consequently, it does not matter where in the gene TFs bind the DNA, because TFs function only in the nucleus. The SNPs outlined in this report should be considered as rSNPs since they change the DNA landscape for TF binding and have been associated with disease. In this report, examples have been described to illustrate that a change in rSNP alleles in the STAT4 gene can provide different TFBS, which in turn are also associated with disease in humans. The potential alterations in TFBS obtained by computational analyses need to be verified by future protein/DNA electrophoretic mobility shift assays and gene expression studies.
Appendix. Transcriptional factor (TF) descriptions. | |||
TFs | TF description | ||
AR | The protein functions as a steroid-hormone activated transcription factor. Upon binding the hormone ligand, the receptor dissociates from accessory proteins, translocates into the nucleus, dimerizes, and then stimulates transcription of androgen responsive genes. They are expressed in bone marrow, mammary gland, prostate, testicular and muscle tissues where they exist as dimers coupled to Hsp90 and HMGB proteins. | ||
ARID3A | Transcription factor which may be involved in the control of cell cycle progression by the RB1/E2F1 pathway and in B-cell differentiation | ||
ARNT | This gene encodes a protein containing a basic helix-loop-helix domain and two characteristic PAS domains along with a PAC domain. The encoded protein binds to ligand-bound aryl hydrocarbon receptor and aids in the movement of this complex to the nucleus, where it promotes the expression of genes involved in xenobiotic metabolism. | ||
ARNT::AHR | The dimer alters transcription of target genes. Involved in the induction of several enzymes that participate in xenobiotic metabolism. | ||
BATF::JUN | The protein encoded by this gene is a nuclear basic leucine zipper protein that belongs to the AP-1/ATF superfamily of transcription factors. The leucine zipper of this protein mediates dimerization with members of the Jun family of proteins. This protein is thought to be a negative regulator of AP-1/ATF transcriptional events. | ||
Bhlhe40 | This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL's transactivation of PER1. Transcriptional repressor involved in the regulation of the circadian rhythm by negatively regulating the activity of the clock genes and clock-controlled genes. | ||
BRCA1 | This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability, and it also acts as a tumor suppressor. | ||
CDX2 | This gene is a member of the caudal-related homeobox transcription factor gene family. The encoded protein is a major regulator of intestine-specific genes involved in cell growth and differentiation. | ||
CEBPA | C/EBP is a DNA-binding protein that recognizes two different motifs: the CCAAT homology common to many promoters and the enhanced core homology common to many enhancers | ||
CEBPB | Important transcriptional activator regulating the expression of genes involved in immune and inflammatory responses. Binds to regulatory regions of several acute-phase and cytokines genes and probably plays a role in the regulation of acute-phase reaction, inflammation and hemopoiesis. | ||
CRX | The protein encoded by this gene is a photoreceptor-specific transcription factor which plays a role in the differentiation of photoreceptor cells. This homeodomain protein is necessary for the maintenance of normal cone and rod function. | ||
E2F6 | The protein encoded by this gene is a member of the E2F family of transcription factors. The E2F family plays a crucial role in the control of cell cycle and action of tumor suppressor proteins and is also a target of the transforming proteins of small DNA tumor viruses. | ||
EBF1 | Transcriptional activator which recognizes variations of the palindromic sequence 5'-ATTCCCNNGGGAATT-3' | ||
ELK1 | The protein encoded by this gene is a nuclear target for the ras-raf-MAPK signaling cascade. | ||
EN1 | Homeobox-containing genes are thought to have a role in controlling development. | ||
FOS | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. The FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Controls osteoclast survival and size. As a dimer with JUN, activates LIF transcription. | ||
FOSL2 | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. The FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Controls osteoclast survival and size. As a dimer with JUN, activates LIF transcription. Activates CEBPB transcription in PGE2-activated osteoblasts. | ||
FOXA1 | Transcription factor that is involved in embryonic development, establishment of tissue-specific gene expression and regulation of gene expression in differentiated tissues. | ||
FOXA2 | Involved in embryonic development, establishment of tissue-specific gene expression and regulation of gene expression in differentiated tissues. | ||
FOXC1 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. An important regulator of cell viability and resistance to oxidative stress. | ||
FOXD1 | This gene belongs to the forkhead family of transcription factors which are characterized by a distinct forkhead domain. Studies of the orthologous mouse protein indicate that it functions in kidney development by promoting nephron progenitor differentiation, and it also functions in the development of the retina and optic chiasm. | ||
FOXD3 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. Acts as a transcriptional activator and repressor. | ||
FOXH1 | Transcriptional activator. Recognizes and binds to the DNA sequence 5'-TGTGTGTATT-3'. Required for induction of the goosecoid (GSC) promoter by TGF-beta or activin signaling. | ||
FOXI1 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. Transcriptional activator required for the development of normal hearing, sense of balance and kidney function. | ||
FOXL1 | FOX transcription factors are characterized by a distinct DNA-binding forkhead domain and play critical roles in the regulation of multiple processes including metabolism, cell proliferation and gene expression during ontogenesis. Transcriptional repressor. It plays an important role in the specification and differentiation of lung epithelium. | ||
FOXO1 | Transcription factor that is the main target of insulin signaling and regulates metabolic homeostasis in response to oxidative stress. | ||
FOXO3 | This gene belongs to the forkhead family of transcription factors which are characterized by a distinct forkhead domain. This gene likely functions as a trigger for apoptosis through expression of genes necessary for cell death. | ||
FOXP1 | This gene belongs to subfamily P of the forkhead box (FOX) transcription factor family. Forkhead box transcription factors play important roles in the regulation of tissue- and cell type-specific gene transcription during both development and adulthood. Transcriptional repressor. Plays an important role in the specification and differentiation of lung epithelium. | ||
FOXQ1 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. Plays a role in hair follicle differentiation. | ||
FOXP2 | Transcriptional repressor that may play a role in the specification and differentiation of lung epithelium. May also play a role in developing neural, gastrointestinal and cardiovascular tissues. | ||
GATA1 | The protein plays an important role in erythroid development by regulating the switch of fetal hemoglobin to adult hemoglobin. | ||
GATA2 | A member of the GATA family of zinc-finger transcription factors that are named for the consensus nucleotide sequence they bind in the promoter regions of target genes and play an essential role in regulating transcription of genes involved in the development and proliferation of hematopoietic and endocrine cell lineages. | ||
GATA3 | Plays an important role in endothelial cell biology. | ||
GATA4 | This protein is thought to regulate genes involved in embryogenesis and in myocardial differentiation and function. Promotes cardiac myocyte enlargement. | ||
HIF1a: ARNT | HIF1 is a heterodimeric basic helix-loop-helix structure composed of HIF1a, the alpha subunit, and the aryl hydrocarbon receptor nuclear translocator (Arnt), the beta subunit. The protein encoded by HIF1 is a Per-Arnt-Sim (PAS) transcription factor found in mammalian cells growing at low oxygen concentrations. It plays an essential role in cellular and systemic responses to hypoxia. | ||
HLTF | This gene encodes a member of the SWI/SNF family. Members of this family have helicase and ATPase activities and are thought to regulate transcription of certain genes by altering the chromatin structure around those genes. | ||
HNF1A | Transcriptional activator that regulates the tissue specific expression of multiple genes, especially in pancreatic islet cells and in liver. | ||
HNF4a | The encoded protein controls the expression of several genes, including hepatocyte nuclear factor 1 alpha, a transcription factor which regulates the expression of several hepatic genes. | ||
HNF4g | Transcription factor. Has a lower transcription activation potential than HNF4-alpha. | ||
HOXA5 | Sequence-specific transcription factor which is part of a developmental regulatory system that provides cells with specific positional identities on the anterior-posterior axis. | ||
JUN (var.2) | This gene is the putative transforming gene of avian sarcoma virus 17. It encodes a protein which is highly similar to the viral protein, and which interacts directly with specific target DNA sequences to regulate gene expression. | ||
JUN::FOS | Promotes activity of NR5A1 when phosphorylated by HIPK3 leading to increased steroidogenic gene expression upon cAMP signaling pathway stimulation. Has a critical function in regulating the development of cells destined to form and maintain the skeleton. It is thought to have an important role in signal transduction, cell proliferation and differentiation. | ||
JUNB | Transcription factor involved in regulating gene activity following the primary growth factor response. Binds to the DNA sequence 5'-TGACGTCA-3'. | ||
KLF5 | Transcription factor that binds to GC box promoter elements. Activates transcription of genes. | ||
LHX3 | This gene encodes a member of a large protein family whose members carry the LIM domain, a unique cysteine-rich zinc-binding domain. The encoded protein is a transcription factor that is required for pituitary development and motor neuron specification. | ||
MAFB | The encoded nuclear protein represses ETS1-mediated transcription of erythroid-specific genes in myeloid cells. This protein plays an essential role in the regulation of hematopoiesis and may play a role in tumorigenesis. | ||
MAFF | The protein encoded by this gene is a basic leucine zipper (bZIP) transcription factor that lacks a transactivation domain. Interacts with the upstream promoter region of the oxytocin receptor gene. May be involved in the cellular stress response. | ||
MAFK | Since they lack a putative transactivation domain, the small Mafs behave as transcriptional repressors when they dimerize among themselves. However, they seem to serve as transcriptional activators by dimerizing with other (usually larger) basic-zipper proteins and recruiting them to specific DNA-binding sites. Small Maf proteins heterodimerize with Fos and may act as competitive repressors of the NF-E2 transcription factor. | ||
MAX | The protein encoded by this gene is a member of the basic helix-loop-helix leucine zipper (bHLHZ) family of transcription factors. | ||
MEF2C | Transcription activator which binds specifically to the MEF2 element present in the regulatory regions of many muscle-specific genes. Controls cardiac morphogenesis and myogenesis, and is also involved in vascular development. | ||
MZF1_1-4 | Binds to target promoter DNA and functions as a transcription regulator. May be one regulator of transcriptional events during hemopoietic development. Isoforms of this protein have been shown to exist at protein level. | ||
MZF1_5-13 | Binds to target promoter DNA and functions as a transcription regulator. May be one regulator of transcriptional events during hemopoietic development. Isoforms of this protein have been shown to exist at protein level. | ||
NFE2L1:MAFG | Nuclear factor erythroid 2-related factor (Nrf2) coordinates the up-regulation of cytoprotective genes via the antioxidant response element (ARE). MafG is a ubiquitously expressed small maf protein that is involved in cell differentiation of erythrocytes. It dimerizes with P45 NF-E2 protein and activates expression of alpha- and beta-globin. | ||
NFIL3 | Expression of interleukin-3 (IL3; MIM 147740) is restricted to activated T cells, natural killer (NK) cells, and mast cell lines. | ||
NKX2-5 | This gene encodes a member of the NK family of homeobox-containing proteins. Transcriptional repressor that acts as a negative regulator of chondrocyte maturation. | ||
NKX3-1 | This gene encodes a homeobox-containing transcription factor. This transcription factor functions as a negative regulator of epithelial cell growth in prostate tissue. | ||
Nr1h3::Rxra | The protein encoded by this gene belongs to the NR1 subfamily of the nuclear receptor superfamily. The NR1 family members are key regulators of macrophage function, controlling transcriptional programs involved in lipid homeostasis and inflammation. This protein is highly expressed in visceral organs, including liver, kidney and intestine. It forms a heterodimer with retinoid X receptor (RXR), and regulates expression of target genes containing retinoid response elements. Studies in mice lacking this gene suggest that it may play an important role in the regulation of cholesterol homeostasis. | ||
NR3C1 | Glucocorticoids regulate carbohydrate, protein and fat metabolism, modulate immune responses through suppression of chemokine and cytokine production and have critical roles in constitutive activity of the CNS, digestive, hematopoietic, renal and reproductive systems. | ||
PAX2 | Probable transcription factor that may have a role in kidney cell differentiation. | ||
PDX1 | Activates insulin, somatostatin, glucokinase, islet amyloid polypeptide and glucose transporter type 2 gene transcription. Particularly involved in glucose-dependent regulation of insulin gene transcription. | ||
POU5F1::SOX2 | This gene encodes a transcription factor containing a POU homeodomain that plays a key role in embryonic development and stem cell pluripotency. Aberrant expression of this gene in adult tissues is associated with tumorigenesis. Forms a trimeric complex with SOX2 on DNA and controls the expression of a number of genes involved in embryonic development such as YES1, FGF4, UTF1 and ZFP206. | ||
PRRX2 | The DNA-associated protein encoded by this gene is a member of the paired family of homeobox proteins. Expression is localized to proliferating fetal fibroblasts and the developing dermal layer, with downregulated expression in adult skin. | ||
RFX1 | This gene is a member of the regulatory factor X gene family, which encodes transcription factors that contain a highly-conserved winged helix DNA binding domain. The protein encoded by this gene is structurally related to regulatory factors X2, X3, X4, and X5. Regulatory factor essential for MHC class II gene expression. Binds to the X boxes of MHC class II genes. | ||
RFX5 | Activates transcription from class II MHC promoters. Recognizes X-boxes. | ||
RORA_1 | The protein encoded by this gene is a member of the NR1 subfamily of nuclear hormone receptors. Orphan nuclear receptor. Binds DNA as a monomer to hormone response elements (HRE) containing a single core motif half-site preceded by a short A-T-rich sequence. | ||
RUNX2 | Transcription factor involved in osteoblastic differentiation and skeletal morphogenesis. Essential for the maturation of osteoblasts and both intramembranous and endochondral ossification. | ||
RXRa | Retinoid X receptors (RXRs) and retinoic acid receptors (RARs), are nuclear receptors that mediate the biological effects of retinoids by their involvement in retinoic acid-mediated gene activation. | ||
SOX2 | This intronless gene encodes a member of the SRY-related HMG-box (SOX) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate. The product of this gene is required for stem-cell maintenance in the central nervous system, and also regulates gene expression in the stomach. | ||
SOX3 | Transcription factor required during the formation of the hypothalamo-pituitary axis. May function as a switch in neuronal development. Keeps neural cells undifferentiated by counteracting the activity of proneural proteins. | ||
SOX5 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins and may play a role in chondrogenesis. | ||
SOX6 | The encoded protein is a transcriptional activator that is required for normal development of the central nervous system, chondrogenesis and maintenance of cardiac and skeletal muscle cells. | ||
SOX9 | Plays an important role in the normal skeletal development. May regulate the expression of other genes involved in chondrogenesis by acting as a transcription factor for these genes. | ||
SOX10 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. | ||
SOX17 | Acts as transcription regulator that binds target promoter DNA and bends the DNA. | ||
SP1 | Can activate or repress transcription in response to physiological and pathological stimuli. Regulates the expression of a large number of genes involved in a variety of processes such as cell growth, apoptosis, differentiation and immune responses. | ||
SPI1 | This gene encodes an ETS-domain transcription factor that activates gene expression during myeloid and B-lymphoid cell development. | ||
SPIB | The protein encoded by this gene is a transcriptional activator that binds to the PU-box (5'-GAGGAA-3') and acts as a lymphoid-specific enhancer. | ||
SRF | This gene encodes a ubiquitous nuclear protein that stimulates both cell proliferation and differentiation. This protein binds to the serum response element (SRE) in the promoter region of target genes. Required for cardiac differentiation and maturation. | ||
SREBF1 | Transcriptional activator required for lipid homeostasis. Regulates transcription of the LDL receptor gene as well as the fatty acid and to a lesser degree the cholesterol synthesis pathway. | ||
SREBF2 | This gene encodes a ubiquitously expressed transcription factor that controls cholesterol homeostasis by regulating transcription of sterol-regulated genes. The encoded protein contains a basic helix-loop-helix-leucine zipper (bHLH-Zip) domain and binds the sterol regulatory element 1 motif. | ||
SRY | Transcriptional regulator that controls a genetic switch in male development. It is necessary and sufficient for initiating male sex determination by directing the development of supporting cell precursors. | ||
STAT1 | The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein can be activated by various ligands including interferon-alpha, interferon-gamma, EGF, PDGF and IL6. This protein mediates the expression of a variety of genes, which is thought to be important for cell viability in response to different cell stimuli and pathogens. | ||
STAT3 | Signal transducer and transcription activator that mediates cellular responses to interleukins, KITLG/SCF and other growth factors. | ||
STAT4 | Carries out a dual function: signal transduction and activation of transcription. Involved in IL12 signaling. This protein is essential for mediating responses to IL12 in lymphocytes and for regulating the differentiation of T helper cells. Mutations in this gene may be associated with systemic lupus erythematosus and rheumatoid arthritis. | ||
STAT5A:STAT5B | Carries out a dual function: signal transduction and activation of transcription. Regulates the expression of milk proteins during lactation. | ||
STAT6 | This protein plays a central role in exerting IL4 mediated biological responses. It is found to induce the expression of BCL2L1/BCL-X(L), which is responsible for the anti-apoptotic activity of IL4. Carries out a dual function: signal transduction and activation of transcription. Involved in IL4/interleukin-4- and IL3/interleukin-3-mediated signaling. | ||
T | The protein encoded by this gene is an embryonic nuclear transcription factor that binds to a specific DNA element, the palindromic T-site. It binds through a region in its N-terminus, called the T-box, and effects transcription of genes required for mesoderm formation and differentiation. | ||
TBP | General transcription factor that functions at the core of the DNA-binding multiprotein factor TFIID. Binding of TFIID to the TATA box is the initial transcriptional step of the pre-initiation complex (PIC), playing a role in the activation of eukaryotic genes transcribed by RNA polymerase II. | ||
TFAP2C | Sequence-specific DNA-binding protein that interacts with inducible viral and cellular enhancer elements to regulate transcription of selected genes. AP-2 factors bind to the consensus sequence 5'-GCCNNNGGC-3' and activate genes involved in a large spectrum of important biological functions including proper eye, face, body wall, limb and neural tube development. | ||
THAP1 | DNA-binding transcription regulator that regulates endothelial cell proliferation and G1/S cell-cycle progression. | ||
ZBTB33 | This gene encodes a transcriptional regulator with bimodal DNA-binding specificity, which binds to methylated CGCG and also to the non-methylated consensus KAISO-binding site TCCTGCNA. The protein contains an N-terminal POZ/BTB domain and 3 C-terminal zinc finger motifs. It recruits the N-CoR repressor complex to promote histone deacetylation and the formation of repressive chromatin structures in target gene promoters. It may contribute to the repression of target genes of the Wnt signaling pathway, and may also activate transcription of a subset of target genes by the recruitment of catenin delta-2 (CTNND2). | ||
ZNF263 | Might play an important role in basic cellular processes as a transcriptional repressor. | ||
ZNF354C | May function as a transcription repressor. |
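
Several entries in the table above quote a consensus binding sequence (FOXH1, JUNB, SPIB, TFAP2C, ZBTB33). As a rough illustration of how a single-nucleotide change can create or remove such a site, the sketch below scans the two alleles of a SNP for exact matches to those consensus strings. It is a simplified stand-in, not the JASPAR/ConSite matrix-based scoring used in this study: it searches only the given strand, treats each consensus as an exact IUPAC pattern, and the function names and example flanking sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Consensus sequences as quoted in the table above.
CONSENSUS = {
    "FOXH1": "TGTGTGTATT",
    "JUNB": "TGACGTCA",
    "SPIB": "GAGGAA",
    "TFAP2C": "GCCNNNGGC",
    "ZBTB33": "TCCTGCNA",
}


def find_sites(sequence, consensus):
    """Return 0-based start positions of all matches (overlaps included)
    of an IUPAC consensus on the given strand of `sequence`."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead lets finditer report overlapping matches.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def allele_site_changes(flank5, ref, alt, flank3):
    """Report factors whose consensus hits differ between two SNP alleles.

    Returns {factor: (hits_with_ref_allele, hits_with_alt_allele)}.
    """
    changes = {}
    for factor, consensus in CONSENSUS.items():
        ref_hits = find_sites(flank5 + ref + flank3, consensus)
        alt_hits = find_sites(flank5 + alt + flank3, consensus)
        if ref_hits != alt_hits:
            changes[factor] = (ref_hits, alt_hits)
    return changes


if __name__ == "__main__":
    # Hypothetical flanking sequence: the A allele completes a PU-box,
    # the G allele does not.
    print(allele_site_changes("TTGAGGA", "A", "G", "CC"))
```

A matrix-based search, as used in the study, scores every position by nucleotide frequency and would also report partial matches that this exact-match sketch misses.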
References
- 1. Wang Y, Qu A, Wang H. Signal transducer and activator of transcription 4 in liver diseases. Int J Biol Sci;11:448-55.
- 2. Khanna P, Chua P J, Bay B H, Baeg G H. The JAK/STAT signaling cascade in gastric carcinoma (Review). Int J Oncol.
- 3. Heneghan A F, Pierre J F, Kudsk K A. JAK-STAT and intestinal mucosal immunology. JAKSTAT;2:25530.
- 4. O'Shea J J, Schwartz D M, Villarino A V, Gadina M, McInnes I B et al. The JAK-STAT pathway: impact on human disease and therapeutic intervention. Annu Rev Med;66:311-28.
- 5. Taylor K E, Remmers E F, Lee A T, Ortmann W A, Plenge R M et al. Specificity of the STAT4 genetic association for severe disease manifestations of systemic lupus erythematosus. PLoS Genet 2008, 1000084.
- 6. Jiang D K, Sun J, Cao G, Liu Y, Lin D et al. Genetic variants in STAT4 and HLA-DQ genes confer risk of hepatitis B virus-related hepatocellular carcinoma. Nat Genet;45:72-5.
- 7. Hou S, Yang Z, Du L, Jiang Z, Shu Q et al. Identification of a susceptibility locus in STAT4 for Behcet's disease in Han Chinese in a genome-wide association study. Arthritis Rheum;64:4104-13.
- 8. Bolin K, Sandling J K, Zickert A, Jonsen A, Sjowall C et al. Association of STAT4 polymorphism with severe renal insufficiency in lupus nephritis. PLoS One;8:e84450.
- 9. Kim E S, Kim S W, Moon C M, Park J J, Kim T I et al. Interactions between IL17A, IL23R, and STAT4 polymorphisms confer susceptibility to intestinal Behcet's disease in Korean population. Life Sci;90:740-6.
- 10. Lu Y, Zhu Y, Peng J, Wang X, Wang F et al. STAT4 genetic polymorphisms association with spontaneous clearance of hepatitis B virus infection. Immunol Res;62:146-52.
- 11. Yi J, Fang X, Wan Y, Wei J, Huang J. STAT4 polymorphisms and diabetes risk: a meta-analysis with 18931 patients and 23833 controls. Int J Clin Exp Med;8:3566-72.
- 12. Kumar A, Das S, Agrawal A, Mukhopadhyay I, Ghosh B. Genetic association of key Th1/Th2 pathway candidate genes, IRF2, IL6, IFNGR2, STAT4 and IL4RA, with atopic asthma in the Indian population. J Hum Genet;60:443-8.
- 13. Mathur A N, Chang H C, Zisoulis D G, Stritesky G L, Yu Q et al. Stat3 and Stat4 direct development of IL-17-secreting Th cells. J Immunol 2007;178:4901-7.
- 14. Chitnis T, Najafian N, Benou C, Salama A D, Grusby M J et al. Effect of targeted disruption of STAT4 and STAT6 on the induction of experimental autoimmune encephalomyelitis. J Clin Invest 2001;108:739-47.
- 15. Mo C, Chearwae W, O'Malley J T, Adams S M, Kanakasabai S et al., Bright J J. Stat4 isoforms differentially regulate inflammation and demyelination in experimental allergic encephalomyelitis. J Immunol 2008;181:5681-90.
- 16. Remmers E F, Plenge R M, Lee A T, Graham R R, Hom G et al. STAT4 and the risk of rheumatoid arthritis and systemic lupus erythematosus. N Engl J Med 2007;357:977-86.
- 17. Mudter J, Weigmann B, Bartsch B, Kiesslich R, Strand D et al. Activation pattern of signal transducers and activators of transcription (STAT) factors in inflammatory bowel diseases. Am J Gastroenterol 2005;100:64-72.
- 18. Hou S, Kijlstra A, Yang P. The genetics of Behcet's disease in a Chinese population. Front Med;6:354-9.
- 19. Liao Y, Cai B, Li Y, Chen J, Ying B et al. Association of HLA-DP/DQ, STAT4 and IL-28B variants with HBV viral clearance in Tibetans and Uygurs in China. Liver Int;35:886-96.
- 20. Kim L H, Cheong H S, Namgoong S, Kim J O, Kim J H et al. Replication of genome wide association studies on hepatocellular carcinoma susceptibility loci of STAT4 and HLA-DQ in a Korean population. Infect Genet Evol;33:72-6.
- 21. Liu Q F, Li Y, Zhao Q H, Wang Z Y, Hu S et al. Association of STAT4 rs7574865 polymorphism with susceptibility to inflammatory bowel disease: A systematic review and meta-analysis. Clin Res Hepatol Gastroenterol.
- 22. Fan Z D, Wang F F, Huang H, Huang N, Ma H H et al. STAT4 rs7574865 G/T and PTPN22 rs2488457 G/C polymorphisms influence the risk of developing juvenile idiopathic arthritis in Han Chinese patients. PLoS One;10:e0117389.
- 23. Aiba Y, Yamazaki K, Nishida N, Kawashima M, Hitomi Y et al. Disease susceptibility genes shared by primary biliary cirrhosis and Crohn's disease in the Japanese population. J Hum Genet.
- 24. Xu L, Dai W Q, Wang F, He L, Zhou Y Q et al. Association of STAT4 gene rs7574865 G > T polymorphism with ulcerative colitis risk: evidence from 1532 cases and 3786 controls. Arch Med Sci;10:419-24.
- 25. Knight J C. Functional implications of genetic variation in non-coding DNA for disease susceptibility and gene regulation. Clin Sci (Lond) 2003;104:493-501.
- 26. Knight J C. Regulatory polymorphisms underlying complex disease traits. Journal of molecular medicine 2005;83:97-109.
- 27. Wang X, Tomso D J, Liu X, Bell D A. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes. Toxicol Appl Pharmacol 2005;207:84-90.
- 28. Wang X, Tomso D J, Chorley B N, Cho H Y, Cheung V G et al. Identification of polymorphic antioxidant response elements in the human genome. Hum Mol Genet 2007;16:1188-200.
- 29. Claessens F, Verrijdt G, Schoenmakers E, Haelens A, Peeters B et al. Selective DNA binding by the androgen receptor as a mechanism for hormone-specific gene regulation. The Journal of steroid biochemistry and molecular biology 2001;76:23-30.
- 30. Hsu M H, Savas U, Griffin K J, Johnson E F. Regulation of human cytochrome P450 4F2 expression by sterol regulatory element-binding protein and lovastatin. J Biol Chem 2007;282:5225-36.
- 31. Takai H, Araki S, Mezawa M, Kim D S, Li X et al. AP1 binding site is another target of FGF2 regulation of bone sialoprotein gene transcription. Gene 2008;410:97-104.
- 32. Buroker N E, Huang J Y, Barboza J, Ledee D R, Eastman R J et al. The adaptor-related protein complex 2, alpha 2 subunit (AP2alpha2) gene is a peroxisome proliferator-activated receptor cardiac target gene. The protein journal 2012;31:75-83.
- 33. Huang C N, Huang S P, Pao J B, Hour T C, Chang T Y et al. Genetic polymorphisms in oestrogen receptor-binding sites affect clinical outcomes in patients with prostate cancer receiving androgen-deprivation therapy. Journal of internal medicine 2012;271:499-509.
- 34. Huang C N, Huang S P, Pao J B, Chang T Y, Lan Y H et al. Genetic polymorphisms in androgen receptor-binding sites predict survival in prostate cancer patients receiving androgen-deprivation therapy. Annals of oncology: official journal of the European Society for Medical Oncology / ESMO 2012;23:707-13.
- 35. Yu B, Lin H, Yang L, Chen K, Luo H et al. Genetic variation in the Nrf2 promoter associates with defective spermatogenesis in humans. Journal of molecular medicine 2012.
- 36. Wu J, Richards M H, Huang J, Al-Harthi L, Xu X et al. Human FasL gene is a target of beta-catenin/T-cell factor pathway and complex FasL haplotypes alter promoter functions. PLoS One 2011;6:26143.
- 37. Alam M, Pravica V, Fryer A A, Hawkins C P, Hutchinson. Novel polymorphism in the promoter region of the human nerve growth-factor gene. International journal of immunogenetics 2005;32:379-82.
- 38. Kumar A, Purohit R. Computational investigation of pathogenic nsSNPs in CEP63 protein. Gene 2012;503:75-82.
- 39. Kamaraj B, Purohit R. Computational screening of disease-associated mutations in OCA2 gene. Cell Biochem Biophys 2014;68:97-109.
- 40. Kumar A, Rajendran V, Sethumadhavan R, Shukla P, Tiwari S et al. Computational SNP analysis: current approaches and future prospects. Cell Biochem Biophys 2014;68:233-9.
- 41. Kumar A, Purohit R. Use of long term molecular dynamics simulation in predicting cancer associated SNPs. PLoS Comput Biol 2014, 1003318.
- 42. Liu L, Zhao W, Zhou X. Modeling co-occupancy of transcription factors using chromatin features. Nucleic Acids Res.
- 43. Liu L, Jin G, Zhou X. Modeling the relationship of epigenetic modifications to transcription factor binding. Nucleic Acids Res;43:3873-85.
- 44. Bryne J C, Valen E, Tang M H, Marstrand T, Winther O et al. JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res 2008;36:102-6.
- 45. Sandelin A, Alkema W, Engstrom P, Wasserman W W, Lenhard B. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 2004;32:91-4.
- 46. Sandelin A, Wasserman W W, Lenhard B. ConSite: web-based prediction of regulatory elements using cross-species comparison. Nucleic Acids Res 2004;32:249-52.
- 47. Buroker N E, Ning X H, Zhou Z N, Li K, Cen W J et al. AKT3, ANGPTL4, eNOS3, and VEGFA associations with high altitude sickness in Han and Tibetan Chinese at the Qinghai-Tibetan Plateau. International journal of hematology;96:200-13.
- 48. Pennisi E. The Biology of Genomes. Disease risk links to gene regulation. Science 2011;332:1031.
- 49. Kumar V, Wijmenga C, Withoff S. From genome-wide association studies to disease mechanisms: celiac disease as a model for autoimmune diseases. Semin Immunopathol 2012;34:567-80.
- 50. Hindorff L A, Sethupathy P, Junkins H A, Ramos E M, Mehta J P et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 2009;106:9362-7.
- 51. Kumar V, Westra H J, Karjalainen J, Zhernakova D V, Esko T et al. Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet 2013;9:1003201.
- 52. Chorley B N, Wang X, Campbell M R, Pittman G S, Noureddine M A et al. Discovery and verification of functional single nucleotide polymorphisms in regulatory genomic regions: current and developing technologies. Mutat Res 2008;659:147-57.
- 53. Prokunina L, Alarcon-Riquelme M E. Regulatory SNPs in complex diseases: their identification and functional validation. Expert Rev Mol Med 2004;6:1-15.