PODKOLODNAYA O.A.1, STEPANENKO I.L.
Laboratory of Theoretical Genetics, Institute of Cytology and Genetics (Siberian Branch of the Russian Academy of Sciences), Lavrentieva 10, Novosibirsk, 630090, Russia;
1Corresponding author e-mail: opodkol@bionet.nsc.ru
Keywords: database, transcription regulation, erythroid cells, differentiation, vertebrate ontogenesis, gene expression, transcription factor
The ESRG-TRRD database, which compiles information on genes whose transcription is specifically regulated in erythroid cells, is described. The genes included in the database fall into several major groups: globins, transcription factors, enzymes of the heme biosynthetic pathway, etc. All the genes contain regulatory regions responsible for erythroid-specific transcription regulation. Alternative promoters are a typical feature of many nonglobin genes described in the database. The transcription factors GATA-1, NF-E2, EKLF, PREIIBF, SSP, BGP1, b-DRF, NF-E3, and NF-E4 determine the tissue- and stage-specificity of gene transcription regulation in erythroid cells.
Introduction
The ESRG-TRRD (Erythroid-Specific Regulated Genes) database is a constituent of the TRRD (Transcription Regulatory Regions Database) [A.E. Kel et al., 1997; N.A. Kolchanov et al., 1998] and was designed to accumulate information on the genes expressed in erythroid cells in the course of their differentiation at different stages of vertebrate ontogenesis. This information forms two blocks: data on gene expression and data on the structure-function organization of the gene regulatory regions. The former block describes the features of gene expression in different cell types, organs, and tissues depending on the developmental stage of the organism and the effect of external factors, along with references to the gene regulatory regions and transcription factor binding sites that realize a particular expression pattern. The latter block describes the gene regulatory regions, the transcription factor binding sites, and the factors binding to these sites. The database also includes a system of cross-referenced hypertexts available to Internet users at http://www.bionet.nsc.ru/trrd/.
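The two-block organization described above can be pictured as a small data model. The following Python sketch is only an illustration under our own assumptions: the class and field names are hypothetical and do not reproduce the actual TRRD record format. It shows how an expression pattern might point to the regulatory units, and through them to the binding sites and factors, that realize it.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BindingSite:
    factor: str  # name of the bound factor; empty string if not identified


@dataclass
class RegulatoryUnit:
    kind: str  # "promoter", "enhancer", or "silencer"
    sites: List[BindingSite] = field(default_factory=list)


@dataclass
class ExpressionPattern:
    cell_type: str
    stage: str
    unit_ids: List[int]  # indices into GeneEntry.units (the cross-reference)


@dataclass
class GeneEntry:
    name: str
    organism: str
    trrd_ac: str
    units: List[RegulatoryUnit] = field(default_factory=list)
    patterns: List[ExpressionPattern] = field(default_factory=list)

    def factors_for(self, pattern: ExpressionPattern) -> Set[str]:
        """Collect identified factors bound in the units a pattern refers to."""
        return {
            site.factor
            for i in pattern.unit_ids
            for site in self.units[i].sites
            if site.factor
        }


# Hypothetical entry: the gene name and accession number come from Table 1,
# but the unit and site contents are invented for illustration only.
gata1 = GeneEntry(
    name="GATA1", organism="mouse", trrd_ac="G000512",
    units=[RegulatoryUnit("promoter", [BindingSite("GATA-1"), BindingSite("")])],
    patterns=[ExpressionPattern("erythroid", "illustrative stage", [0])],
)
print(gata1.factors_for(gata1.patterns[0]))
```

Keeping the expression block and the regulatory block as separate record types linked by references mirrors the cross-referencing described in the text; a real implementation would follow the TRRD field definitions instead.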
Contents of the ESRG-TRRD database
The ESRG-TRRD database describes 42 genes. These include, first, the genes coding for erythroid-specific proteins, such as globins, and, second, genes with a less restricted expression whose transcription is regulated specifically in erythroid cells. The latter group is represented by genes encoding a number of enzymes, transcription factors, and cell surface antigens. In addition, the genes encoding protein products involved in erythroid differentiation and their receptors are included in the database. Table 1 lists the information on the genes contained in the ESRG-TRRD: gene name, organism name, accession number of the gene in the TRRD, number of regulatory regions and transcription factor binding sites described in the database, and EMBL accession number of the gene.
The ESRG-TRRD database contains a formalized description of 92 regulatory units (promoters, silencers, enhancers) and 411 transcription factor binding sites. Analysis of the information contained in the ESRG-TRRD database allowed us to reveal a multilevel organization of the regions involved in transcription regulation of erythroid-specific genes. The locus control region (LCR), a regulatory region common to all the genes of one globin cluster, is the highest regulatory level described in this database. This regulatory region supports the tissue-specific regulation of all the genes of the cluster and their switching in the course of ontogenesis. LCRs are marked with DNase I hypersensitive sites, which may be located in the 5′- and 3′-regions of globin clusters. A minimal functional domain of a beta-globin cluster LCR hypersensitive site includes GATA, AP-1/NF-E2, and CACC binding sites for the erythroid-specific transcription factors GATA-1 and NF-E2 and the transcription factor Sp1. The functional features of alpha- and beta-globin cluster LCRs differ somewhat, possibly depending on the chromatin structure of the alpha- and beta-globin clusters.
The next regulation level is represented by the elements providing gene transcription initiation and modulation of transcription – promoters, silencers, and enhancers. These elements were designated as regulatory units and may have tissue- and stage-specific characteristics.
Table 1. The genes described in the ESRG-TRRD
Gene name | Organism | TRRD AC | Regulatory units | Binding sites | EMBL AC
alphaA-globin | chicken | G000046 | 2 | 13 | X56801
alpha-D-globin | chicken | G000047 | 1 | 3 | V00411
pi-globin | chicken | G001169 | 1 | 7 | V00408
betaA-globin | chicken | G000051 | 2 | 24 | V00409
betaH-globin | chicken | G000053 | 1 | 7 | X13142
rho-globin | chicken | G001184 | 4 | 8 | L17432
alpha1-globin | human | G000195 | 2 | 13 | J00153
alpha2-globin | human | G001202 | 2 | 8 | J00153
zeta2-globin | human | G000419 | 2 | 9 | J00182
beta-globin | human | G000215 | 4 | 21 | U01317
delta-globin | human | G000240 | 1 | 2 | U01317
Agamma-globin | human | G000261 | 3 | 32 | M91037
Ggamma-globin | human | G001095 | 1 | 5 | V00517
epsilon-globin | human | G000253 | 3 | 8 | V00508
alpha-globin | rabbit | G001076 | 5 | 12 | M74142
alpha1-globin | mouse | G000468 | 2 | 15 | J00410
zeta-globin | mouse | G001223 | 1 | 2 | X62302
beta major-globin | mouse | G000484 | 5 | 12 | M18646
porphobilinogen deaminase | human | G000359 | 2 | 10 | M18800, M95623
porphobilinogen deaminase | mouse | G000584 | 4 | 12 | M28663
delta-aminolevulinate dehydratase | human | G001186 | 2 | – | X64467
L-type pyruvate kinase gene | rat | G000763 | 3 | 25 | X85791
glutathione peroxidase | mouse | G001229 | 2 | 11 | X03920
15-lipoxygenase | rabbit | G001274 | 2 | 12 | X90860
carbonic anhydrase I | mouse | G001260 | 2 | 3 | –
carbonic anhydrase I | human | G001261 | 2 | 8 | –
HOXB2 | human | G001216 | 1 | 2 | X78978
GATA1 | mouse | G000512 | 2 | 3 | X56854
GATA1 | chicken | G000060 | 1 | 13 | M59937
TAL1 | human | G001188 | 1 | 6 | S53698
TAL1 | mouse | G001189 | 2 | 7 | U01530
erythroid Kruppel-like factor | mouse | G001081 | 1 | 5 | –
erythropoietin | human | G000254 | 2 | 5 | M11319
erythropoietin | mouse | G001221 | 1 | 2 | M12930
erythropoietin receptor | mouse | G000507 | 3 | 4 | M38133
erythropoietin receptor | human | G001222 | 4 | 11 | M77244
glycophorin A | human | G001082 | 1 | – | M24123
glycophorin B | human | G000275 | 1 | 8 | M57233
glycophorin C | human | G000276 | 1 | 1 | X14242
glycoprotein D | human | G001262 | 1 | 2 | X85785
transferrin receptor | human | G000406 | 1 | 4 | X05339
histone H5 | chicken | G000063 | 4 | 19 | J00870
alpha locus control region | human | G001085 | 1 | 10 | S49899
alpha locus control region | mouse | G001185 | 1 | 12 | U08220
beta-LCR | human | G000328 | 2 | 15 | –
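As an illustration of how the entries of Table 1 can be handled programmatically, the short Python sketch below parses a few of its rows and sums the binding sites for one organism. It assumes that each row is written as six fields separated by "|" and that "–" marks a value not given; the names `Row`, `parse_row`, and `total_sites` are hypothetical helpers introduced here, not part of the TRRD software.

```python
from typing import List, NamedTuple, Optional

MISSING = "–"  # assumed marker for a value not given in Table 1


class Row(NamedTuple):
    gene: str
    organism: str
    trrd_ac: str
    units: Optional[int]
    sites: Optional[int]
    embl_ac: Optional[str]


def _int_or_none(text: str) -> Optional[int]:
    return None if text == MISSING else int(text)


def parse_row(line: str) -> Row:
    """Split one 'a | b | c | d | e | f' row into typed fields."""
    gene, organism, trrd_ac, units, sites, embl = [f.strip() for f in line.split("|")]
    return Row(gene, organism, trrd_ac, _int_or_none(units),
               _int_or_none(sites), None if embl == MISSING else embl)


def total_sites(rows: List[Row], organism: str) -> int:
    """Sum the binding sites for one organism, skipping missing values."""
    return sum(r.sites for r in rows if r.organism == organism and r.sites is not None)


# A few rows copied from Table 1.
SAMPLE = [
    "beta-globin | human | G000215 | 4 | 21 | U01317",
    "GATA1 | mouse | G000512 | 2 | 3 | X56854",
    "GATA1 | chicken | G000060 | 1 | 13 | M59937",
    "glycophorin A | human | G001082 | 1 | – | M24123",
    "beta-LCR | human | G000328 | 2 | 15 | –",
]
rows = [parse_row(line) for line in SAMPLE]
print(total_sites(rows, "human"))
```

Treating "–" as a missing value rather than as zero keeps genes without described binding sites distinguishable from genes with none.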
A minimal promoter of globin genes, which provides a low level of constitutive expression of a reporter construct, comprises, as a rule, CACCC, CCAAT, and TATA boxes. The occurrence of functional GATA sites, which bind the GATA-1 factor, determines the tissue-specificity of transcription from most globin promoters. Each of the globin genes has both positive and negative regulatory elements; together with the LCR, these elements provide a high level of erythroid-specific transcription and gene switching in the course of ontogenesis.
The occurrence of alternative erythroid-specific promoters is a feature of most genes of enzymes and transcription factors (10 of 14) contained in the ESRG-TRRD database. The distances between the alternative promoters differ considerably among the genes described. For example, the erythroid- and colon-specific promoters in the human carbonic anhydrase I gene (G001261) are separated by a 36-kb intron, whereas the distance between the alternative promoters in the mouse TAL-1 gene (G001189) is about 300 bp. Transcription of the genes with erythroid-specific regulation is modulated by positive and negative regulatory elements (enhancers and silencers): positive elements increase transcription in erythroid cells, as, for example, in the mouse porphobilinogen deaminase gene (G000584) [C. Porcher et al., 1991], whereas silencers suppress it in non-erythroid cells, as in the rabbit 15-lipoxygenase gene (G001274) [O’Prery and P.R. Harrison, 1995]. Among the 46 nonpromoter regulatory units described in the database, 22 are positive and 11 are negative.
Interaction of a transcription factor with the corresponding binding site is the elementary event underlying transcription regulation. Over 400 transcription factor binding sites are described in the ESRG-TRRD database. The corresponding transcription factors are identified for 227 of the binding sites (110 for broadly expressed factors and 117 for erythroid factors). Among the broadly expressed factors, the largest number of binding sites was found for the transcription factor Sp1 (38 sites). The group of erythroid factors is represented by GATA-1, NF-E2, EKLF, PREIIBF, SSP, BGP1, b-DRF, NF-E3, and NF-E4. The factors EKLF, PREIIBF, SSP, and NF-E4 are involved in regulation of stage-specific globin expression.
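The proportions implied by these counts can be checked with a few lines of arithmetic. The Python sketch below uses only figures stated in the text (411 binding sites in total, 227 with an identified factor, 110 and 117 in the two factor groups, and 38 Sp1 sites); the variable names are ours.

```python
# Counts stated in the text.
total_sites = 411          # all binding sites described in the ESRG-TRRD
identified = 227           # sites with an identified transcription factor
broadly_expressed = 110    # identified sites bound by broadly expressed factors
erythroid = 117            # identified sites bound by erythroid factors
sp1_sites = 38             # sites bound by Sp1

# The two groups should account for every identified site.
assert broadly_expressed + erythroid == identified

identified_share = identified / total_sites
sp1_share = sp1_sites / broadly_expressed

print(f"identified factor: {identified_share:.1%} of all sites")
print(f"Sp1: {sp1_share:.1%} of sites for broadly expressed factors")
```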
Fig. 1. The role of transcription factor GATA-1 in the terminal erythroid differentiation (rectangle – gene; ellipse – protein; arrow – activation)
Binding of the GATA-1 and NF-E2 transcription factors determines, in most cases, the tissue-specific gene expression. The ESRG-TRRD database describes 94 GATA-1 binding sites. Note that binding sites for this factor were recorded in virtually all genes compiled in the ESRG-TRRD. The occurrence of such sites in the GATA-1 genes themselves makes positive autoregulation of these genes possible (Fig. 1). The second circuit of GATA-1 positive autoregulation is connected with the presence of GATA-1 binding sites in the promoter of the erythropoietin receptor gene (G000507, G001222). Although the details of the EpoR-based mechanism of GATA-1 positive autoregulation are unclear, the process may contribute considerably to the increase in GATA-1 level. The increase in GATA-1 level, in turn, stimulates the transcription of various erythroid-specific genes (Fig. 1) due to the presence of GATA-1 binding sites in their regulatory regions [Podkolodnaya and Stepanenko, 1997]. In addition, GATA-1 stimulates transcription of the genes coding for the transcription factors EKLF, TAL1, RBTN2, and HOXB2 [Crossley et al., 1994; Lecointe et al., 1994; Royer-Pokora et al., 1990; Vieille-Grosjean et al., 1995]. These factors, in turn, further stimulate the erythroid-specific genes through their binding sites located in the regulatory regions of these genes. These processes result in a shift towards terminal erythroid differentiation accompanied by a high expression of late erythroid-specific genes.
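The activation relations described in this paragraph can be written down as a small directed graph, which makes the two positive autoregulation circuits of GATA-1 explicit. The Python sketch below is a simplified illustration, not the gene network itself: the node names, the single node "erythroid-specific genes", and the EpoR-to-GATA1 edge (whose mechanism the text describes as unclear) are our modelling assumptions.

```python
from typing import Dict, List

# Directed edges "activator -> activated", taken from the paragraph above.
# The EpoR -> GATA1 edge is an assumption standing in for the unclear
# EpoR-based mechanism that raises the GATA-1 level.
ACTIVATES: Dict[str, List[str]] = {
    "GATA1": ["GATA1", "EpoR", "EKLF", "TAL1", "RBTN2", "HOXB2",
              "erythroid-specific genes"],
    "EpoR": ["GATA1"],
    "EKLF": ["erythroid-specific genes"],
    "TAL1": ["erythroid-specific genes"],
    "RBTN2": ["erythroid-specific genes"],
    "HOXB2": ["erythroid-specific genes"],
}


def feedback_loops(graph: Dict[str, List[str]], start: str) -> List[List[str]]:
    """Return every simple cycle that leaves `start` and returns to it."""
    loops: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path + [start])   # closed a loop back to start
            elif nxt not in path:
                walk(nxt, path + [nxt])        # extend the path, no revisits

    walk(start, [start])
    return loops


for loop in feedback_loops(ACTIVATES, "GATA1"):
    print(" -> ".join(loop))
```

With these edges, the search reports the direct GATA1 self-loop and the loop through EpoR, corresponding to the two circuits named in the text.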
Conclusion
The ESRG-TRRD database is described. It compiles the available information on the expression and transcription regulation of genes that are expressed at different developmental stages in vertebrate erythroid cells and whose transcription is regulated specifically during erythropoiesis. The genes coding for globins are characterized by a more complex structure-function hierarchy of the regulatory regions providing the tissue- and stage-specific regulation of their transcription. The occurrence of alternative promoters is characteristic of many genes of enzymes and transcription factors expressed in erythroid cells. Further development of this work will involve not only expansion of the database, but also analysis of the relations between the expression patterns and the structure of the regulatory regions of the genes described, as well as construction of the gene network for this functional gene group.
Acknowledgments
This work was partially supported by the Russian State Human Genome Program (12312 GCh-5) and Russian Foundation for Basic Research (grants 97-04-49740 and 96-04-50006).
References
- N.A. Kolchanov, E.V. Ignatieva, O.V. Kel-Margoulis, A.E. Kel, E.A. Ananko, O.A. Podkolodnaya, I.L. Stepanenko, T.I. Merkulova, T.N. Goryachkovskaya, F.A. Kolpakov, N.L. Podkolodny, S.V. Lavryushev, D.A. Grigorovich, A.S. Frolov, and A.G. Romashchenko, “Transcription regulatory regions database (TRRD): new possibilities provided by release 4.0” (this issue).
- A.E. Kel, N.A. Kolchanov, O.V. Kel, A.G. Romashchenko, E.A. Anan’ko, E.V. Ignatieva, T.I. Merkulova, O.A. Podkolodnaya, I.L. Stepanenko, A.V. Kochetov, F.A. Kolpakov, N.L. Podkolodnyi, and A.N. Naumochkin, “TRRD: database on transcription regulatory regions of eukaryotic genes” Mol. Biol. (Msk.) 31, 521-530 (1997).
- C. Porcher, G. Pitiot, M. Plumb, S. Lowe, H. de Verneuil, and B. Grandchamp, “Characterization of hypersensitive sites, protein-binding motifs, and regulatory elements in both promoters of the mouse porphobilinogen deaminase gene” J. Biol. Chem. 266, 10562-10569 (1991).
- O’Prery and P.R. Harrison, “Tissue-specific regulation of the rabbit 15-lipoxygenase gene in erythroid cells by a transcriptional silencer” Nucleic Acids Res. 23, 3664-3672 (1995).
- Crossley, A.P. Tsang, J.J. Bieker, and S.H. Orkin, “Regulation of the erythroid Kruppel-like factor (EKLF) gene promoter by the erythroid transcription factor GATA-1” J. Biol. Chem. 269, 15440-15444 (1994).
- Lecointe, O. Bernard, K. Naert, V. Joulin, C.J. Larsen, P.H. Romej, and D. Mathieu-Mahul, “GATA- and SP1-binding sites are required for the full activity of the tissue-specific promoter of the tal-1 gene” Oncogene 9, 2623-2632 (1994).
- Royer-Pokora, M. Rogers, T.H. Zhu, S. Schneider, U. Loos, and U. Beolitz, “The TTG-2/RBTN2 T cell oncogene encodes two alternative transcripts from two promoters: the distal promoter is removed by most 11p13 translocations in acute T cell leukaemia’s (T-ALL)” Oncogene 10, 1353-1360 (1990).
- Vieille-Grosjean and P. Huber, “Transcription factor GATA-1 regulates human HOXB2 gene expression in erythroid cells” J. Biol. Chem. 270, 4544-4550 (1995).