{"id":959,"date":"2023-03-20T15:28:53","date_gmt":"2023-03-20T08:28:53","guid":{"rendered":"https:\/\/conf.icgbio.ru\/bgrs98\/?page_id=959"},"modified":"2023-04-12T13:09:09","modified_gmt":"2023-04-12T06:09:09","slug":"083_rapid-estimates-of-statistical-significance-of-the-pairwise-nucleotide-sequence-alignment","status":"publish","type":"page","link":"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/abstract-list\/083_rapid-estimates-of-statistical-significance-of-the-pairwise-nucleotide-sequence-alignment\/","title":{"rendered":"RAPID ESTIMATES OF STATISTICAL SIGNIFICANCE OF THE PAIRWISE NUCLEOTIDE SEQUENCE ALIGNMENT"},"content":{"rendered":"<p><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#seledtsov\">SELEDTSOV I.A.<\/a>,\u00a0<a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/authors-index\/#kolpakov\">KOLPAKOV F.A.<\/a><sup>+<\/sup><\/p>\n<p>Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 10 Lavrentiev Ave., Novosibirsk, 630090, Russia;<br \/>\n<a href=\"mailto:fedor@bionet.nsc.ru\" target=\"_blank\" rel=\"noopener\">fedor@bionet.nsc.ru<\/a><\/p>\n<p>+Corresponding author<\/p>\n<p><a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/keywords-index\/\">Keywords<\/a>: statistical significance, nucleotide sequence alignment, regression analysis<\/p>\n<p><b>Abstract<\/b><\/p>\n<p>Statistical significance of the similarity observed is the main question while comparing sequences. This problem has not yet been solved mathematically for optimal aligning of the sequences containing insertions and deletions. We have carried out the regression analysis of the observed similarity of random sequences depending on their length and nucleotide composition and are proposing a practical method to estimate the probability of the similarity observed to be statistically significant. 
Once the regression parameters have been determined for a given alignment scheme (similarity matrix and penalties for deletions), the statistical significance of the similarity observed for a pair of nucleotide sequences can be precisely estimated based only on their lengths and nucleotide composition.<\/p>\n<p><b>1. Introduction<\/b><\/p>\n<p>This work deals with estimating the statistical significance of the similarity observed when aligning a pair of nucleotide sequences. The researcher always needs to know whether the alignment obtained actually indicates a functional and\/or evolutionary relationship between the sequences or is merely accidental. Statistical estimation is the tool that answers the question of how frequently an alignment of this type can appear by chance. If the probability of such an event is low, we can consider the event improbable and, consequently, the sequences related to a certain degree.<\/p>\n<p>Numerous works are devoted to estimating the statistical significance of an observed alignment, owing to the importance of this problem. However, the approaches proposed in these works share the following limitations:<\/p>\n<ol type=\"a\">\n<li>They are applicable only to local similarity regions.<\/li>\n<li>They require a negative expectation of a dot-matrix element, which is not always possible within the existing alignment schemes (symbol similarity matrix, penalties for deletions and insertions, etc.).<\/li>\n<li>They assume that the sequences compared have equal or approximately equal lengths.<\/li>\n<li>They do not consider the particular nucleotide composition of the sequences compared, which distorts the estimate considerably.<\/li>\n<\/ol>\n<p>Note that limitations (c) and (d) are partially overcome in the approach proposed by Waterman and Vingron (1994). 
The goal of this work was to develop a method for estimating the statistical significance of the observed alignment of two sequences that is free from the limitations listed above.<\/p>\n<p><b>2. The proposed approach<\/b><\/p>\n<p>Let us consider the weight determined by the following equation as a measure of the quality of the alignment of two sequences:<\/p>\n<p align=\"CENTER\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-960\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image116.gif\" alt=\"\" width=\"149\" height=\"45\" \/>,<\/p>\n<p>where <img loading=\"lazy\" class=\"alignnone size-full wp-image-961\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image117.gif\" alt=\"\" width=\"26\" height=\"24\" \/>\u00a0are the values of the similarity of the symbols\u00a0<i><b>i<\/b><\/i>\u00a0and\u00a0<i><b>j<\/b><\/i>;\u00a0<i><b>p1(i)<\/b><\/i>\u00a0and\u00a0<i><b>p2(i)<\/b><\/i>,\u00a0the symbols located in the\u00a0<i><b>i<\/b><\/i>th alignment positions of the first and second sequences, respectively;\u00a0<i><b>L<\/b><\/i>, the effective alignment length;\u00a0<i><b>K<\/b><\/i>,\u00a0the total number of insertions in the alignment; and d, the penalty for each insertion.\u00a0<i><b>S<\/b><\/i>,\u00a0d, and the selection of\u00a0<i><b>L<\/b><\/i>\u00a0together define the alignment scheme. 
Among all the possible alignments of two sequences, there exists an alignment with the maximal value of the weight,\u00a0<i><b>*W<\/b><\/i>.\u00a0<i><b>*W<\/b><\/i>\u00a0is a random variable with a certain distribution function\u00a0<i><b>F<sub>S, r, d, L1, L2<\/sub>(*W)<\/b><\/i>,\u00a0where r is the set of parameters describing the difference in the nucleotide compositions of the first and second sequences;\u00a0<i><b>L1<\/b><\/i>,\u00a0the length of the first sequence; and\u00a0<i><b>L2<\/b><\/i>, the length of the second sequence. The tail area of the distribution function <img loading=\"lazy\" class=\"alignnone size-full wp-image-962\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image118.gif\" alt=\"\" width=\"66\" height=\"16\" \/>(<sub> <img loading=\"lazy\" class=\"alignnone size-full wp-image-963\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image119.gif\" alt=\"\" width=\"124\" height=\"20\" \/><\/sub>, where\u00a0<i><b>Ds<\/b><\/i>\u00a0is the dispersion (variance) of\u00a0<i><b>*W<\/b><\/i>;\u00a0<i><b>Av<\/b><\/i>, the mathematical expectation of\u00a0<i><b>*W<\/b><\/i>) appears to have the same analytic form for all the possible values of\u00a0<i><b>S, r, d, L1,<\/b><\/i>\u00a0and<i><b>\u00a0L2<\/b><\/i>, namely:<\/p>\n<p align=\"right\"><img loading=\"lazy\" class=\"size-full wp-image-964 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image120.gif\" alt=\"\" width=\"176\" height=\"21\" \/>(1)<\/p>\n<p>where <img loading=\"lazy\" class=\"alignnone size-full wp-image-965\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image121.gif\" alt=\"\" width=\"322\" height=\"62\" \/>\u00a0is the tail area of the normal distribution (Fig. 
1)<\/p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-966\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image122.gif\" alt=\"\" width=\"204\" height=\"209\" \/><\/p>\n<p>Figure 1. Correspondence of the logarithms of the functions <img loading=\"lazy\" class=\"alignnone size-full wp-image-967\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image123.gif\" alt=\"\" width=\"70\" height=\"21\" \/> (bold line) and <img loading=\"lazy\" class=\"alignnone size-full wp-image-968\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image124.gif\" alt=\"\" width=\"91\" height=\"21\" \/>\u00a0(fine line).<\/p>\n<p>Thus, we only need to determine a concrete dependence of the coefficients\u00a0<i><b>A<\/b><\/i>,\u00a0<i><b>B,<\/b><\/i>\u00a0and\u00a0<i><b>C<\/b><\/i>\u00a0on the lengths of the sequences aligned, their composition, and alignment scheme. It is interesting to consider the distribution function of the variable:<\/p>\n<p align=\"right\"><img loading=\"lazy\" class=\"size-full wp-image-969 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image125.gif\" alt=\"\" width=\"153\" height=\"25\" \/>(2)<\/p>\n<p>The corresponding function will be designated <img loading=\"lazy\" class=\"alignnone size-full wp-image-970\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image126.gif\" alt=\"\" width=\"60\" height=\"21\" \/>\u00a0hereinafter. This function appears the same for all the possible values of\u00a0<i><b>S, r , d , L1,\u00a0<\/b><\/i>and<i><b>\u00a0L2<\/b><\/i>, and the corresponding parameters (see equation 1)\u00a0<i><b>A&#8217;<\/b><\/i>,\u00a0<i><b>B&#8217;,<\/b><\/i>\u00a0and\u00a0<i><b>C&#8217;\u00a0<\/b><\/i>can be easily calculated. 
It should be emphasized that the values of these coefficients are calculated only once and are independent of\u00a0<i><b>S, r , d , L1,\u00a0<\/b><\/i>and<i><b>\u00a0L2.<\/b><\/i>\u00a0Thus, the only thing we need to do is to determine the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on\u00a0<i><b>S, r , d , L1,\u00a0<\/b><\/i>and<i><b>\u00a0L2.\u00a0<\/b><\/i>The problem is solved when this dependence is determined.<\/p>\n<p><b>3. Regression analysis<\/b><\/p>\n<p>Described in this section is the determination of the dependences of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on\u00a0<i><b>L1, L2,\u00a0<\/b><\/i>and<i><b>\u00a0r\u00a0<\/b><\/i>, that is, on the parameters independent of the alignment scheme. Determination of their dependence on the scheme is the goal for further studies. Linear regression analysis was employed to determine the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on<i><b>\u00a0L1, L2,\u00a0<\/b><\/i>and<i><b>\u00a0r\u00a0<\/b><\/i>.<i><b><\/b><\/i><\/p>\n<p><i><b>3.1. The dependence on the nucleotide sequence lengths<\/b><\/i><\/p>\n<p>The fixed nucleotide composition provided, the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on the lengths of the sequences aligned is:<\/p>\n<p align=\"right\"><img loading=\"lazy\" class=\"size-full wp-image-971 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image127.gif\" alt=\"\" width=\"361\" height=\"24\" \/>(3)<\/p>\n<p align=\"right\"><img loading=\"lazy\" class=\"size-full wp-image-972 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image128.gif\" alt=\"\" width=\"331\" height=\"30\" \/>(4)<\/p>\n<p>The dependence of the\u00a0<i><b>Av<\/b><\/i>\u00a0value observed on the predicted value at different lengths of the sequences (the lengths varied from 20 to 500 bp) is shown in Fig. 
2. It illustrates that equation 3 provides a good approximation of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0values. Approximation of the\u00a0<i><b>Ds<\/b><\/i>\u00a0values by equation 4 is of the same accuracy (not shown).<i><b><\/b><\/i><\/p>\n<p><i><b>3.2. Dependence on the nucleotide sequence composition<\/b><\/i><\/p>\n<p>However, the nucleotide sequences enriched with one and the same nucleotide will be inclined to display higher values of the alignment weights. Thus, it is necessary to take into account the differences in their nucleotide composition while estimating the statistical significance of the alignment observed.<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"50%\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-981\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image137.gif\" alt=\"\" width=\"610\" height=\"520\" \/><\/td>\n<\/tr>\n<tr>\n<td width=\"50%\">Figure 2. The dependence of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0value on the predicted\u00a0<i><b>Av<\/b><\/i>(<i><b>l<\/b><\/i>) at different lengths and fixed nucleotide composition of the aligned sequences. X axis, predicted\u00a0<i><b>Av<\/b><\/i>(<i><b>l<\/b><\/i>) values; Y axis, observed values.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<table border=\"0\" width=\"100%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"50%\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-973\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image129.gif\" alt=\"\" width=\"600\" height=\"500\" \/><\/td>\n<\/tr>\n<tr>\n<td width=\"50%\">Figure 3. The dependence of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0value on the predicted value at different r and fixed lengths of the sequences. 
X axis, predicted\u00a0<i><b>Av<\/b><\/i>(r ) values; Y axis, observed values.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>We used the following variables to describe the differences in the nucleotide composition:<\/p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-974\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image130.gif\" alt=\"\" width=\"128\" height=\"37\" \/>, that reflects the probability of nucleotide coincidence and<\/p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-975\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image131.gif\" alt=\"\" width=\"157\" height=\"37\" \/>, that reflects the difference in the nucleotide composition,<\/p>\n<p>where\u00a0<i><b>f<sub>k<\/sub>(i)<\/b><\/i>\u00a0is the frequency of the\u00a0<i><b>i<\/b><\/i>th nucleotide in the sequence\u00a0<i><b>k<\/b><\/i>.<\/p>\n<p>The fixed and equal lengths (<i><b>L1=L2<\/b><\/i>) of the nucleotide sequences provided, the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on\u00a0<i><b>p<\/b><\/i>\u00a0and\u00a0<i><b>dP<\/b><\/i>\u00a0can be represented as:<\/p>\n<p align=\"RIGHT\"><img loading=\"lazy\" class=\"size-full wp-image-976 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image132.gif\" alt=\"\" width=\"207\" height=\"24\" \/>(5)<\/p>\n<p align=\"RIGHT\"><img loading=\"lazy\" class=\"size-full wp-image-977 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image133.gif\" alt=\"\" width=\"209\" height=\"26\" \/>(6)<\/p>\n<p>The dependence of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0value on the\u00a0<i><b>Av<\/b><\/i>(r ) predicted at different r (different nucleotide composition) is shown in Fig. 3. 
It illustrates that equation 5 provides a good approximation of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0values. Equation 6 gives a slightly worse approximation (the correlation coefficient is 0.91) for the\u00a0<i><b>Ds<\/b><\/i>\u00a0values (not shown).<\/p>\n<p><i><b>3.3. Dependence on the lengths and nucleotide compositions<\/b><\/i><\/p>\n<p>We proposed the following general form for the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on the lengths and nucleotide compositions:<\/p>\n<p align=\"CENTER\"><b><i>Av(l,<\/i>\u00a0r\u00a0<i>) ~ Av(l)*Av(r )<\/i><\/b><\/p>\n<p align=\"CENTER\"><b><i>Ds(l,<\/i>\u00a0r\u00a0<i>) ~ Ds(l)*Ds(r )<\/i><\/b><\/p>\n<p>Then the dependence of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0on the sequence length (at\u00a0<i><b>L=L1=L2<\/b><\/i>) and on the\u00a0<i><b>p<\/b><\/i>\u00a0and\u00a0<i><b>dP<\/b><\/i>\u00a0values can be represented by the following equations:<\/p>\n<p align=\"RIGHT\"><img loading=\"lazy\" class=\"size-full wp-image-978 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image134.gif\" alt=\"\" width=\"277\" height=\"24\" \/>(7)<\/p>\n<p align=\"RIGHT\"><img loading=\"lazy\" class=\"size-full wp-image-979 alignleft\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image135.gif\" alt=\"\" width=\"261\" height=\"26\" \/>(8)<\/p>\n<p>The dependence of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0value on the\u00a0<i><b>Av<\/b><\/i>(<i><b>l<\/b><\/i>, r ) predicted at different\u00a0<i><b>l<\/b><\/i>\u00a0and r is shown in Fig. 4. It illustrates that equation 7 provides a good approximation of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0values. 
Equation 8 approximates the\u00a0<i><b>Ds<\/b><\/i>\u00a0values with similar accuracy (not shown).<\/p>\n<table border=\"0\" width=\"35%\" cellspacing=\"0\" cellpadding=\"0\">\n<tbody>\n<tr>\n<td width=\"100%\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-980\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_Image136.gif\" alt=\"\" width=\"590\" height=\"480\" \/><\/td>\n<\/tr>\n<tr>\n<td width=\"100%\">Figure 4. The dependence of the observed\u00a0<i><b>Av<\/b><\/i>\u00a0value on the\u00a0<i><b>Av<\/b><\/i>(<i><b>l<\/b><\/i>, r ) predicted at different\u00a0<i><b>l<\/b><\/i>\u00a0and r . X axis, predicted\u00a0<i><b>Av(l,<\/b><\/i>\u00a0r\u00a0<i><b>)<\/b><\/i>\u00a0values; Y axis, observed values.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><b>4. Conclusion<\/b><\/p>\n<p>Given an alignment with the weight\u00a0<i><b>*W<\/b><\/i>, one can estimate the values of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0using equations 3-8, find\u00a0<i><b>*W\u2019<\/b><\/i>\u00a0using equation 2, determine its statistical significance directly using equation 1, and thus have a basis for judging whether the aligned sequences are related.<\/p>\n<p>Study of the behavior of\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0at\u00a0<i><b>L1<img loading=\"lazy\" class=\"alignnone size-full wp-image-982\" src=\"https:\/\/conf.icgbio.ru\/bgrs98\/wp-content\/uploads\/sites\/111\/2023\/03\/Thesis83_noteq.gif\" alt=\"\" width=\"10\" height=\"10\" \/>\u00a0L2\u00a0<\/b><\/i>and an arbitrary nucleotide composition is currently in progress. 
The dependence on the alignment scheme will also be studied.<\/p>\n<p><b>Appendix<\/b><\/p>\n<p>The Needleman-Wunsch algorithm was used for nucleotide sequence alignment (Needleman, Wunsch, 1970).<\/p>\n<p>Pairwise alignment of 1000-10 000 randomly generated sequences was carried out to estimate the observed\u00a0<i><b>Av<\/b><\/i>\u00a0and\u00a0<i><b>Ds<\/b><\/i>\u00a0values.<\/p>\n<p>The following alignment schemes were studied: with and without consideration of the terminal gaps, with various matrices of nucleotide symbol similarity, and with different penalties for gaps and their origination.<\/p>\n<p><b>Acknowledgements<\/b><\/p>\n<p>The work was supported by the State Scientific Program &#8220;The Human Genome&#8221; of the Russian State Committee for Science and Technology and the Russian Foundation for Basic Research (grant No. 96-04-50006).<\/p>\n<p><b>References<\/b><\/p>\n<ol>\n<li>S.B. Needleman and C.D. Wunsch, &#8220;A general method applicable to the search for similarities in the amino acid sequences of two proteins&#8221;, J. Mol. Biol.\u00a0<b>48<\/b>, 443 (1970)<\/li>\n<li>M. Waterman and M. Vingron, &#8220;Rapid and accurate estimates of statistical significance for sequence data base searches&#8221;, Proc. Natl. Acad. Sci. 
USA,\u00a0<b>91<\/b>, 4625 (1994)<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>SELEDTSOV I.A.,\u00a0KOLPAKOV F.A.+ Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 10 Lavrentiev Ave., Novosibirsk, 630090, Russia; fedor@bionet.nsc.ru +Corresponding author Keywords: statistical significance, nucleotide sequence alignment, regression analysis Abstract Statistical significance of the similarity observed &hellip; <a href=\"https:\/\/conf.icgbio.ru\/bgrs98\/abstracts\/abstract-list\/083_rapid-estimates-of-statistical-significance-of-the-pairwise-nucleotide-sequence-alignment\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":13,"featured_media":0,"parent":97,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/959"}],"collection":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/comments?post=959"}],"version-history":[{"count":4,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/959\/revisions"}],"predecessor-version":[{"id":1421,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/959\/revisions\/1421"}],"up":[{"embeddable":true,"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/pages\/97"}],"wp:attachment":[{"href":"https:\/\/conf.icgbio.ru\/bgrs98\/wp-json\/wp\/v2\/media?parent=959"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}