(C) Types of variable sites.

Our analysis is based on the alignment of four LTR sequences, as illustrated in figure 1B. Based on the configuration of nucleotides, variable sites can be classified into several categories (see figure).

A type-C site arises if a mutation that occurs after speciation is involved in a gene conversion event. A mutation that occurred before speciation will result in a type-N site if the site does not experience any gene conversion.

In addition, there are variable sites that do not belong to the three categories, including those with more than two variants (denoted by W in the figure). In this study, we primarily focus on type-C and type-N sites. The extent of gene conversion can be measured by the proportion of type-C sites. The presence of type-C sites can be evidence of gene conversion, but multiple mutations at a single site can also create a type-C site with no gene conversion.
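The classification described above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the site patterns are assumptions inferred from the verbal definitions (type-C: the two LTRs agree within each species but differ between species; type-N: orthologous LTRs agree but the 5' and 3' LTRs differ). Two-variant sites that fit neither pattern are simply labeled "other" here, and columns containing a gap are skipped.

```python
from collections import Counter


def classify_site(a5, a3, b5, b3):
    """Classify one alignment column of the four LTR sequences.

    The patterns used for "C" and "N" are assumptions inferred from the
    verbal definitions in the text, not taken from the original figure.
    """
    variants = {a5, a3, b5, b3}
    if len(variants) == 1:
        return "invariant"
    if len(variants) > 2:
        return "W"  # more than two variants at the site
    # Assumed type-C pattern: LTRs agree within species, differ between species.
    if a5 == a3 and b5 == b3:
        return "C"
    # Assumed type-N pattern: orthologous LTRs agree, 5' and 3' LTRs differ.
    if a5 == b5 and a3 == b3:
        return "N"
    return "other"  # remaining two-variant configurations


def count_site_types(a5_seq, a3_seq, b5_seq, b3_seq, gap="-"):
    """Count site categories over four aligned sequences of equal length."""
    seqs = (a5_seq, a3_seq, b5_seq, b3_seq)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for column in zip(*seqs):
        if gap in column:
            continue  # skip columns containing a gap
        counts[classify_site(*column)] += 1
    return counts
```

For example, `count_site_types("ACGAA-", "ACTCAA", "ATGGAA", "ATTATA")` tallies one site in each of the five categories and ignores the final gapped column.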

Therefore, we used a statistical method (Gao and Innan; Osada and Innan) to test the null hypothesis of no gene conversion, in which the effect of multiple mutations is taken into account. The null model considers the evolutionary history of the four sequences, A5, A3, B5, and B3, since the split of species A and B.

No gene conversion is assumed, so that all four sequences independently accumulate random neutral mutations. It should be noted that gene conversion between paralogous LTRs does not affect the expectation of orthologous divergence between the two species.

To be conservative, we use a simple two-nucleotide model in which only two states, 0 and 1, are allowed and recurrent mutations occur between 0 and 1. For example, consider the most recent common ancestor of a pair of LTR transposons from species A and B, which is denoted by M. Thus, two independent mutations in the two LTRs result in a type-C site with a probability of 0. It should be noted that this probability is smaller if four nucleotides are allowed.
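To make the two-state reasoning concrete, the following sketch computes how often independent recurrent mutation alone would produce a type-C-like pattern. It is a simplified illustration rather than the calculation of the cited method, and it rests on several assumptions not stated in the text: the two LTRs carry the same state at the site at the time of speciation, all four lineages share one hypothetical per-site mutation rate `rate` and divergence time `time`, and a type-C pattern means that both LTRs of one species have changed while both LTRs of the other have not.

```python
import math


def flip_probability(rate, time):
    """Probability that a lineage's state differs from its ancestor's state
    after `time`, under a symmetric two-state (0/1) model with per-site
    mutation rate `rate` in each direction."""
    return 0.5 * (1.0 - math.exp(-2.0 * rate * time))


def type_c_probability_null(rate, time):
    """Per-site probability of the assumed type-C pattern with no gene
    conversion, when four lineages evolve independently from a shared state."""
    p = flip_probability(rate, time)
    # Both LTRs of one species flipped, both LTRs of the other unchanged;
    # the factor 2 covers either species being the one that changed.
    return 2.0 * p**2 * (1.0 - p) ** 2
```

For small `rate * time` this probability is close to `2 * (rate * time) ** 2`, which shows why type-C sites created by multiple mutations alone are expected to be rare when divergence is low.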

The statistical test examines whether the observed number of type-C sites is significantly larger than that expected under the null model with no gene conversion.
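The logic of such a one-sided test can be illustrated with a binomial upper tail. This is a stand-in for illustration only, not the method of Gao and Innan or Osada and Innan: it assumes that sites are independent and that each site becomes type-C under the null with a common probability `q_null`, a hypothetical value the user must supply.

```python
import math


def binomial_upper_tail(k_observed, n_sites, q_null):
    """P(X >= k_observed) for X ~ Binomial(n_sites, q_null).

    A small value suggests more type-C sites than the null model predicts.
    """
    if not 0 <= k_observed <= n_sites:
        raise ValueError("k_observed must lie between 0 and n_sites")
    if not 0.0 <= q_null <= 1.0:
        raise ValueError("q_null must be a probability")
    return sum(
        math.comb(n_sites, k) * q_null**k * (1.0 - q_null) ** (n_sites - k)
        for k in range(k_observed, n_sites + 1)
    )
```

A call such as `binomial_upper_tail(observed_c, n_sites, q_null)` returns a P value for the observed count; the null hypothesis of no gene conversion would be rejected when this value falls below the chosen significance level.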

When we initiated this study (April), there were at least four species pairs that met these criteria. We ignored the human-chimpanzee pair and comparisons of multiple strains within a single species because they are so closely related that they provide insufficient information on nucleotide divergence.

These two software programs are designed to identify full-length LTR retrotransposons that possess a pair of highly homologous regions, but they use different algorithms. For all pairs of LTRs identified, the presence of their orthologous elements was examined in the genome of the second species of each of the four species pairs.

