{"id":601,"date":"2023-03-15T11:46:53","date_gmt":"2023-03-15T04:46:53","guid":{"rendered":"https:\/\/conf.icgbio.ru\/bgrs98\/?page_id=601"},"modified":"2023-09-04T15:23:45","modified_gmt":"2023-09-04T08:23:45","slug":"042_recognition-accuracy-of-dna-functional-sites-can-be-increased-by-averaging-partial-recognitions","status":"publish","type":"page","link":"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/abstract-list\/042_recognition-accuracy-of-dna-functional-sites-can-be-increased-by-averaging-partial-recognitions\/","title":{"rendered":"RECOGNITION ACCURACY OF DNA FUNCTIONAL SITES CAN BE INCREASED BY AVERAGING PARTIAL RECOGNITIONS"},"content":{"rendered":"<p><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#ponomarenko_mp\">PONOMARENKO M.P.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#frolov\">FROLOV A.S.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#ponomarenko_jv\">PONOMARENKO J.V.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#podkolodnaya\">PODKOLODNAYA O.A.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#vorobiev\">VOROBIEV D.G.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#kol\">KOLCHANOV N.A.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#overton\">OVERTON G.C.<\/a><\/p>\n<p>Laboratory of Theoretical Genetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 10 Lavrentiev Ave., Novosibirsk, 630090, Russia, e-mail:\u00a0<a href=\"mailto:pon@bionet.nsc.ru\" target=\"_blank\" rel=\"noopener\">pon@bionet.nsc.ru<\/a><\/p>\n<p>GC Overton 105 111 Center for Bioinformatics, UPenn, Philadelphia, USA; e-mail:\u00a0coverton@cbil.humgen.upenn.edu<\/p>\n<p><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/keywords-index\/\">Keywords<\/a>: site recognition, activated database, central limit 
theorem, program generation<\/p>\n<p>&nbsp;<\/p>\n<p>A Central Limit Theorem-based approach is suggested for increasing the accuracy with which a functional site is recognized in an arbitrary DNA sequence. It averages a large number of partial recognitions of the site into the \u201cmean recognition\u201d of this site. To generate a large number of partial recognitions within the consensus and frequency matrix formalisms, a number of novel oligonucleotide alphabets were used. On this basis, the activated database SAMPLES has been created. SAMPLES contains sets of experimentally identified functional sites that have been aligned and transformed into recognition procedures. The applicability of SAMPLES was tested on GATA-1 and C\/EBP transcription factor binding sites. SAMPLES is available at <a href=\"http:\/\/wwwmgs.bionet.nsc.ru\/\" target=\"_blank\" rel=\"noopener\">http:\/\/wwwmgs.bionet.nsc.ru\/<\/a>.<\/p>\n<p><b>Introduction<\/b><\/p>\n<p>Recognition of functional sites in nucleotide sequences is one of the key steps in genomic DNA annotation [1]. A large number of methods have been developed so far to address the problem (for a review, see [2]). The most widely used are the consensus and matrix methods [3-7], which are based on the evolutionarily conserved nucleotides of functional sites. Recent evaluations of the accuracy of these methods for annotating long genomic DNA fragments [8, 9] have demonstrated, on the one hand, considerable progress in recognizing unknown genes and regulatory regions encoded in genomic DNA and, on the other, the need for a considerable increase in the accuracy of the procedures that recognize functional sites in practical genomic DNA annotation [8, 9].<\/p>\n<p>Based on these findings, we suggest a systemic approach to increasing the accuracy of recognition of a given functional site in the course of genomic DNA annotation. 
It averages a large number of partial recognitions of the analyzed site into the \u201cmean recognition\u201d of this site. Consensuses and frequency matrices in 20 novel computable alphabets have been used, and the activated database SAMPLES has been created. It contains experimentally identified DNA site sequences that have been multiply aligned and transformed into C-code recognition programs. The proposed approach was tested on GATA-1 and C\/EBP transcription factor binding sites. SAMPLES is available at\u00a0<a href=\"http:\/\/wwwmgs.bionet.nsc.ru\/\" target=\"_blank\" rel=\"noopener\">http:\/\/wwwmgs.bionet.nsc.ru\/<\/a>.<\/p>\n<p><b>System and methods<\/b><\/p>\n<p>The scheme of the suggested system is shown in Fig. 1. Its key module is the automated C-generator of programs that recognize a functional site; its input data are the aligned DNA sequences of this site.<\/p>\n<p>For the site consensus, the letters that occur at a position with a frequency higher than the threshold\u00a0<b>f<\/b><sub>0<\/sub>\u00a0are selected. Fig. 2 shows the simplest C-code programs recognizing the GATA-1 site by (a) its consensus and (b) its frequency matrix, as well as (c and d) examples of the simplest C-code programs generated for the large number of consensuses and frequency matrices in the novel alphabets (Table 1). Using the database GibbsAlign of multiply aligned site sequences (Fig. 1), SAMPLES generates all the consensuses and frequency matrices\u00a0<b>{f<\/b><sub>k<\/sub><b>}<\/b>\u00a0recognizing this site in an arbitrary DNA\u00a0<b>(1 \u2264 k \u2264 
K).<\/b>\u00a0The simplest way to use all the partial recognitions simultaneously is to average their values\u00a0<b>{f<\/b><sub>k<\/sub><b>(S)}<\/b>\u00a0over the region\u00a0<b>S<\/b>\u00a0of a DNA sequence:<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"90%\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-602\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image1.gif\" alt=\"\" width=\"142\" height=\"18\" \/><\/td>\n<td width=\"10%\">(1)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>where the recognition values\u00a0<b>f<\/b><sub>k<\/sub><b>(S)<\/b> are normalized as <img loading=\"lazy\" class=\"alignnone size-full wp-image-603\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image2.gif\" alt=\"\" width=\"296\" height=\"18\" \/> and the excision rule is: <img loading=\"lazy\" class=\"alignnone size-full wp-image-604\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image3.gif\" alt=\"\" width=\"278\" height=\"17\" \/>\u00a0The Central Limit Theorem states that this mean recognition\u00a0<b>F<sub>K<\/sub><\/b>\u00a0should be Gaussian, with its standard deviation reduced in proportion to\u00a0<b>K<sup>-1\/2<\/sup><\/b>\u00a0as\u00a0<b>K<\/b>\u00a0increases.<\/p>\n<p>&nbsp;<\/p>\n<p><b>Results and discussion<\/b><\/p>\n<p>The suggested mean recognition was used to process the experimentally identified and aligned DNA sequences of GATA-1 and C\/EBP transcription factor binding sites (the total number of sequences analyzed was 102 and 62, respectively). All the data are available in the databases SAMPLES and GibbsAlign at\u00a0<a href=\"http:\/\/wwwmgs.bionet.nsc.ru\/\" target=\"_blank\" rel=\"noopener\">http:\/\/wwwmgs.bionet.nsc.ru\/<\/a>\u00a0(Fig. 1). Each of these two sequence sets was randomly divided into two non-overlapping halves, a training subset and a control subset. 
From the training sets, all possible consensuses and frequency matrices were generated. The C programs that recognize the sites are stored in the database ConsFreq and are executable in the library MeanRec,\u00a0<a href=\"http:\/\/wwwmgs.bionet.nsc.ru\/\" target=\"_blank\" rel=\"noopener\">http:\/\/wwwmgs.bionet.nsc.ru\/<\/a>\u00a0(Fig. 1).<\/p>\n<p>Each of the partial recognition procedures was tested on the control subsets and on 1000 random DNA sequences. These control results are listed in Table 2. Their mean recognitions were tested in the same way (Table 2). Note that the partial recognitions differ from one another in their means and standard deviations, as well as in their type I and type II error rates, for both the site and the random DNA sequences. Essentially, it is impossible to predict which partial recognition will be the best for an arbitrary site; in contrast, the mean recognition appears the best for each of the three sites tested. Fig. 3 illustrates that the mean recognition can decrease both the type I and type II errors,\u00a0<b>a<\/b><sub>1<\/sub>\u00a0and\u00a0<b>a<\/b><sub>2<\/sub>, with respect to the frequency matrix. This is shown for (a) GATA-1 and (b) C\/EBP transcription factor binding sites.<\/p>\n<p>To study this, we analyzed how the standard deviation of the GATA-1 mean recognition\u00a0<b>F<\/b><sub>K<\/sub>\u00a0changes as the total number\u00a0<b>K<\/b>\u00a0of averaged GATA-1 partial recognition procedures grows. In this test, for each value of\u00a0<b>K<\/b>,\u00a0ten different combinations of the partial recognition procedures\u00a0<b>{f<\/b><sub>k<\/sub><b>}<\/b>\u00a0were randomly chosen, and their standard deviations over the 51 control sequences of the GATA-1 site and over 1000 random DNA sequences were calculated and averaged. Two variants of this analysis were carried out: with and without the GATA-1 consensus and frequency matrix traditionally used for GATA-1 recognition. 
The results are presented in Fig. 4 for (a) the GATA-1 control subset and (b) the random DNA (bold line, with traditional recognitions; broken line, without). For the GATA-1 sites (Fig. 4a), the standard deviation is approximately constant at any\u00a0<b>K<\/b>\u00a0when the traditional recognitions are employed (Fig. 4a, bold line). This suggests that the analyzed GATA-1 sequences were optimized by the preliminary alignment made to create these traditional recognition procedures [11]. When the traditional recognition procedures are not involved (Fig. 4a, broken line), the standard deviation decreases approximately as\u00a0<b>K<\/b><sup>-1\/2<\/sup>\u00a0with increasing\u00a0<b>K<\/b>\u00a0(as stated by the Central Limit Theorem) until its alignment-dependent level is reached. Essentially, the decrease predicted by the Central Limit Theorem is observed for the random DNA sequences in both variants, with and without the traditional recognition procedures (Fig. 4b). These results indicate that the mean recognition\u00a0<b>F<\/b><sub>K<\/sub>\u00a0increases the accuracy of recognition of a given functional site through the\u00a0<b>K<\/b><sup>-1\/2<\/sup>\u00a0decrease in the standard deviation for the non-site sequences, which is responsible for the type II error\u00a0<b>a<\/b><sub>2<\/sub>.<\/p>\n<p><b>Conclusion<\/b><\/p>\n<p>In this work, we introduce the idea of simultaneously involving as many procedures recognizing a functional site in an arbitrary DNA sequence as we can design. The simplest implementation of this idea is to average all the partial recognition procedures available (\u201cmean recognition\u201d). Unexpectedly, the analysis of the mean recognition shows that its statistical properties are described by the Central Limit Theorem. 
Essentially, this theorem establishes that the mean recognition\u00a0<b>F<\/b><sub>K<\/sub>\u00a0should become Gaussian and that its standard deviation should decrease as\u00a0<b>K<\/b><sup>-1\/2<\/sup>\u00a0with the total number\u00a0<b>K<\/b>\u00a0of the partial recognitions averaged. We have actually observed this decrease (Fig. 4b). It follows that the behavior of the mean recognition should be predictable by the Central Limit Theorem even when each of its partial recognitions is heuristic and behaves unpredictably. Thus, the mean recognition is a systemic approach to increasing the accuracy of functional site recognition for genomic DNA annotation.<\/p>\n<p>Further development of the mean recognition approach will focus on increasing the total number of averaged partial recognitions\u00a0<b>{f<\/b><sub>k<\/sub><b>}<\/b>\u00a0to\u00a0<b>K&gt;&gt;100<\/b>\u00a0by involving additional methods, such as Information Content, Perceptron, Neural Network, etc. Various weightings of the partial recognitions within the mean recognition will also be studied. Finally, it would be interesting to apply the Central Limit Theorem to design mean recognitions that increase the accuracy of the coding potentials of exons, the non-coding potentials of introns, and the regulatory potentials of promoters [12-14].<\/p>\n<p>We are grateful to Ms. Galina Chirikova for help with the translation. This work was supported by NIH 2-R01-RR04026-08A2; Russian Human Genome Project; Russian Foundation for Basic Research 97-04-49740, 96-04-50006, 97-07-90309, 98-07-90126; SB RAS IG-97N13 and the Young Scientists Awards\u201997\/98.<\/p>\n<p>&nbsp;<\/p>\n<p><b>References<\/b><\/p>\n<ol>\n<li>J.W. Fickett, Trends Genet.,\u00a0<b>12<\/b>, 316 (1996).<\/li>\n<li>M.S. Gelfand, J. Comput. Biol.,\u00a0<b>2<\/b>, 87 (1995).<\/li>\n<li>P. Bucher, J. Mol. Biol.,\u00a0<b>212<\/b>, 563 (1990).<\/li>\n<li>S. Karlin and V. Brendel, Science,\u00a0<b>257<\/b>, 39 (1992).<\/li>\n<li>K. Quandt, K. 
Frech, et al., Nucleic Acids Res.,\u00a0<b>23<\/b>, 4878 (1995).<\/li>\n<li>E.C. Uberbacher, Y. Xu, and R.J. Mural, Methods Enzymol.,\u00a0<b>266<\/b>, 259 (1996).<\/li>\n<li>Q.K. Chen, G.Z. Hertz, and G.D. Stormo, Comput. Appl. Biosci.,\u00a0<b>13<\/b>, 29 (1997).<\/li>\n<li>J.W. Fickett and A.G. Hatzigeorgiou, Genome Res.,\u00a0<b>7<\/b>, 861 (1997).<\/li>\n<li>M. Burset and R. Guigo, Genomics,\u00a0<b>34<\/b>, 353 (1996).<\/li>\n<li>Y.V. Kondrakhin, V.V. Shamin, and N.A. Kolchanov, Comput. Appl. Biosci.,\u00a0<b>10<\/b>, 597 (1994).<\/li>\n<li>C. Lawrence, Comput. Chem.,\u00a0<b>18<\/b>, 255 (1994).<\/li>\n<li>V.V. Solovyev, A.A. Salamov, and C.B. Lawrence, Nucleic Acids Res.,\u00a0<b>22<\/b>, 5156 (1994).<\/li>\n<li>R. Guigo and J.W. Fickett, J. Mol. Biol.,\u00a0<b>253<\/b>, 51 (1995).<\/li>\n<li>Y.V. Kondrakhin, et al., Comput. Appl. Biosci.,\u00a0<b>11<\/b>, 477 (1995).<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image4.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-605 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image4.gif\" alt=\"\" width=\"325\" height=\"217\" \/><\/a><\/p>\n<p>Figure 1. 
The scheme of the database SAMPLES.<\/p>\n<p>&nbsp;<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image9.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-610 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image9.gif\" alt=\"\" width=\"266\" height=\"242\" \/><\/a><\/p>\n<\/td>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image11.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-612 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image11.gif\" alt=\"\" width=\"266\" height=\"242\" \/><\/a><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" width=\"100%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image10.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-611 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image10.gif\" alt=\"\" width=\"520\" height=\"18\" \/><\/a><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" width=\"100%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image12.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-613 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image12.gif\" alt=\"\" width=\"520\" height=\"265\" \/><\/a><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Figure 2. 
The C programs generated to recognize the GATA-1 site using the traditional (a) consensus and (b) frequency matrix, as well as (c) the novel 5 bp-consensus of the alphabet Nx<sub>16<\/sub>\u00a0and (d) the frequency matrix of the alphabet WSx<sub>4<\/sub>\u00a0from Table 1. The frequency values in (b) that give rise to the GATA-1 consensus (d) are\u00a0<u>underlined<\/u>.<\/p>\n<p>&nbsp;<\/p>\n<p>Table 1. The alphabets used to construct the consensuses and frequency matrices recognizing the site<\/p>\n<table border=\"1\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td valign=\"TOP\" width=\"8%\"><\/td>\n<td valign=\"TOP\" width=\"3%\"><\/td>\n<td valign=\"TOP\" width=\"53%\">\n<p align=\"CENTER\">Alphabet E<sub>n<\/sub>={e<sub>1<\/sub>, &#8230;, e<sub>n<\/sub>} of oligonucleotides of length L<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"5%\"><\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">Consensus<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">Freq-<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">Previous<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">Name<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">L<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">\n<p align=\"CENTER\">(M=A\/C, K=G\/T, R=A\/G, Y=T\/C, W=A\/T, S=G\/C, x=A\/T\/G\/C)<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">n<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">threshold, f<sub>0<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">uency<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">usage<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">N<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">1<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">A, T, G, C<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td 
valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">Traditional<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">N<sub>16<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">2<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">AA, AT, AG, AC, TA, TT, &#8230;., GC, CA, CT, CG, CC<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">16<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.333<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">N<sub>64<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">AAA, AAT, AAG, &#8230;., CGC, CCA, CCT, CCG, CCC<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">64<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.125<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">&#8211;<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">[10]<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">Nx<sub>16<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">AxA, AxT, AxG, AxC, &#8230;., GxC, CxA, CxT, CxG, CxC<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">16<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.333<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">Nx<sub>64<\/sub><\/p>\n<\/td>\n<td 
valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">5<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">AxAxA, AxAxT, AxAxG, &#8230;., CxCxT, CxCxG, CxCxC<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">64<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.125<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">&#8211;<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">MK<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">2<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">MM, MK, KM, KK<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">MK<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">MMM, MMK, MKM, MKK, KMM, KMK, KKM, KKK<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">KM<sub>16<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">MMMM, MMMK, MMKM, MMKK, &#8230;, KKKM, KKKK<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">16<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.333<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td 
valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">MKx<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">MxM, MxK, KxM, KxK<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">MKx<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">5<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">MxMxM, MxMxK, MxKxM, MxKxK, &#8230;, KxKxM, KxKxK<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">RY<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">2<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">RR, RY, YR, YY<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">RY<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">RRR, RRY, RYR, RYY, YRR, YRY, YYR, YYY<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td 
valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">RY<sub>16<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">RRRR, RRRY, RRYR, &#8230;, YYRR, YYRY, YYYR, YYYY<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">16<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.333<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">RYx<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">RxR, RxY, YxR, YxY<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">RYx<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">5<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">RxRxR, RxRxY, RxYxR, RxYxY, &#8230;, YxYxR, YxYxY<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">WS<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p 
align=\"CENTER\">2<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">WW, WS, SW, SS<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">WS<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">WWW, WWS, WSW, WSS, SWW, SWS, SSW, SSS<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">WS<sub>16<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">WWWW, WWWS, WWSW, &#8230;, SSSW, SSSS<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">16<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.333<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">WSx<sub>4<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">3<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">WxW, WxS, SxW, SxS<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">4<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.500<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This 
work<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">WSx<sub>8<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"3%\">\n<p align=\"CENTER\">5<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"53%\">WxWxW, WxWxS, WxSxW, &#8230;, SxSxW, SxSxS<\/td>\n<td valign=\"TOP\" width=\"5%\">\n<p align=\"CENTER\">8<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"14%\">\n<p align=\"CENTER\">0.250<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">+<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">This work<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>Table 2. The partial and mean recognizing procedures generated for GATA-1 and C\/EBP sites<\/p>\n<table border=\"1\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td colspan=\"3\" valign=\"TOP\" width=\"28%\">\n<p align=\"CENTER\">Recognition procedure<\/p>\n<\/td>\n<td colspan=\"4\" valign=\"TOP\" width=\"34%\">\n<p align=\"CENTER\">GATA-1 (51 sequences, control)<\/p>\n<\/td>\n<td colspan=\"4\" valign=\"TOP\" width=\"38%\">\n<p align=\"CENTER\">C\/EBP (99 sequences, control)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\"><\/td>\n<td valign=\"TOP\" width=\"6%\">Name<\/td>\n<td valign=\"TOP\" width=\"6%\">Type<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">site, m \u00b1 d<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">a\u00a0<sub>1<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">rand, m \u00b1 d<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">a\u00a0<sub>2<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">site, m \u00b1 d<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">a\u00a0<sub>1<\/sub><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">\n<p align=\"CENTER\">rand, m \u00b1 d<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">a\u00a0<sub>2<\/sub><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td 
valign=\"TOP\" width=\"16%\">Traditional<\/td>\n<td valign=\"TOP\" width=\"6%\">N<sub>4<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Freq<\/td>\n<td valign=\"TOP\" width=\"11%\">0.80 \u00b1 0.48<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.04<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-1.02 \u00b1 0.52<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.04<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">0.93 \u00b1 0.47<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.05<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.99 \u00b1 0.56<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.05<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">Traditional<\/td>\n<td valign=\"TOP\" width=\"6%\">N<sub>4<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Cons<\/td>\n<td valign=\"TOP\" width=\"11%\">0.80 \u00b1 0.63<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.02<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-1.09 \u00b1 0.63<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.07<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">1.05 \u00b1 0.46<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.02<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.83 \u00b1 0.62<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.11<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">Best for C\/EBP<\/td>\n<td valign=\"TOP\" width=\"6%\">N<sub>16<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Freq<\/td>\n<td valign=\"TOP\" width=\"11%\"><i>0.74 \u00b1 0.48<\/i><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><i>0.08<\/i><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><i>-1.05 \u00b1 0.30<\/i><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><i>0.01<\/i><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>0.90 \u00b1 0.55<\/b><\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\"><b>0.06<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>-1.12 \u00b1 
0.43<\/b><\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\"><b>0.02<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">Best for GATA-1<\/td>\n<td valign=\"TOP\" width=\"6%\"><b>N<sub>16<\/sub><\/b><\/td>\n<td valign=\"TOP\" width=\"6%\"><b>Cons<\/b><\/td>\n<td valign=\"TOP\" width=\"11%\"><b>0.87 \u00b1 0.67<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><b>0.04<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>-1.04 \u00b1 0.35<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><b>0.02<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">1.12 \u00b1 0.75<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.10<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.94 \u00b1 0.49<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.03<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">Introduced in [10]<\/td>\n<td valign=\"TOP\" width=\"6%\">N<sub>64<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Cons<\/td>\n<td valign=\"TOP\" width=\"11%\"><i>0.89 \u00b1 0.84<\/i><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><i>0.08<\/i><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><i>-0.96 \u00b1 0.20<\/i><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><i>0.02<\/i><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">1.16 \u00b1 1.15<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.13<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.96 \u00b1 0.42<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.07<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">Examples of novel<\/td>\n<td valign=\"TOP\" width=\"6%\">WS<sub>4<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Freq<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>0.86 \u00b1 0.48<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><b>0.06<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>-0.92 \u00b1 0.66<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p 
align=\"CENTER\"><b>0.09<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">0.79 \u00b1 0.73<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.13<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.82 \u00b1 0.81<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.16<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"TOP\" width=\"16%\">alphabets usage<\/td>\n<td valign=\"TOP\" width=\"6%\">WSx<sub>4<\/sub><\/td>\n<td valign=\"TOP\" width=\"6%\">Freq<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>0.87 \u00b1 0.55<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><b>0.08<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\"><b>-0.92 \u00b1 0.70<\/b><\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\"><b>0.09<\/b><\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">0.84 \u00b1 0.80<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.13<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.97 \u00b1 0.87<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.14<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"3\" valign=\"TOP\" width=\"28%\">Only novel averaged<\/td>\n<td valign=\"TOP\" width=\"11%\">0.78 \u00b1 0.64<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.08<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-1.00 \u00b1 0.33<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.02<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">1.01 \u00b1 0.53<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.03<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.98 \u00b1 0.41<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.03<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"3\" valign=\"TOP\" width=\"28%\">Mean Recognition<\/td>\n<td valign=\"TOP\" width=\"11%\">0.78 \u00b1 0.63<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.08<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-1.00 \u00b1 0.34<\/td>\n<td valign=\"TOP\" width=\"6%\">\n<p align=\"CENTER\">0.02<\/p>\n<\/td>\n<td valign=\"TOP\" 
width=\"11%\">1.01 \u00b1 0.52<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.03<\/p>\n<\/td>\n<td valign=\"TOP\" width=\"11%\">-0.98 \u00b1 0.42<\/td>\n<td valign=\"TOP\" width=\"8%\">\n<p align=\"CENTER\">0.03<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image5.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-606 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image5.gif\" alt=\"\" width=\"246\" height=\"222\" \/><\/a><\/p>\n<\/td>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image6.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-607 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image6.gif\" alt=\"\" width=\"248\" height=\"224\" \/><\/a><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" width=\"100%\">Figure 3. 
The &#8220;mean recognition&#8221; (&#8220;Mean&#8221;) of (a) GATA-1 and (b) C\/EBP sites decreases both the type I and type II errors,\u00a0<b>a<sub>1<\/sub><\/b>\u00a0and\u00a0<b>a<sub>2<\/sub><\/b>, compared with the traditionally used frequency matrix method (&#8220;Traditional&#8221;).<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image7.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-608 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image7.gif\" alt=\"\" width=\"246\" height=\"222\" \/><\/a><\/p>\n<\/td>\n<td width=\"50%\">\n<p align=\"center\"><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image8.gif\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" class=\"alignnone wp-image-609 size-full\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis42_Image8.gif\" alt=\"\" width=\"206\" height=\"170\" \/><\/a><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td colspan=\"2\" width=\"100%\">Figure 4. The standard deviation of the mean recognition score decreases as the total number of averaged partial recognitions increases, for (a) the site analyzed and (b) random DNA, as predicted by the Central Limit Theorem.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>PONOMARENKO M.P.,\u00a0FROLOV A.S.,\u00a0PONOMARENKO J.V.,\u00a0PODKOLODNAYA O.A.,\u00a0VOROBIEV D.G.,\u00a0KOLCHANOV N.A.,\u00a0OVERTON G.C. 
Laboratory of Theoretical Genetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 10 Lavrentiev Ave., Novosibirsk, 630090, Russia, e-mail:\u00a0pon@bionet.nsc.ru GC Overton 105 111 Center for Bioinformatics, UPenn, &hellip; <a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/abstract-list\/042_recognition-accuracy-of-dna-functional-sites-can-be-increased-by-averaging-partial-recognitions\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":13,"featured_media":0,"parent":97,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/601"}],"collection":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/comments?post=601"}],"version-history":[{"count":9,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/601\/revisions"}],"predecessor-version":[{"id":1508,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/601\/revisions\/1508"}],"up":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/97"}],"wp:attachment":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/media?parent=601"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}