The following tables were created using a Perl script originally written by Sam Cartinhour, which I modified.
A repeat must appear at least 5 times to be detected. Homopolymer repeats (AAA, TTT, etc.) are not captured.
The number following the sequence ID/name (e.g. TOVAA05TV-1) is the "iteration", i.e. which repeat within that sequence the row describes. Sequences with only one repeat will have -1. Sequences with multiple repeats have -1, -2, etc.
Length = the length of the sequence in base pairs
Motif = the repeat found
Reps = the number of times that repeat occurs in the sequence
Start = the base position of the beginning of the repeat
Flank = the number of bases "to the right of" the repeat (a measure of how centered the repeat is within the sequence)
%GC = the percent GC content of the flanking region (useful for primer design considerations)
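The columns above can be illustrated with a short Python sketch. This is not the Perl script used to build the tables; it is a minimal reconstruction under stated assumptions: motifs are 2 to 6 bases long, Start is counted from 1, and %GC is computed over the bases to the right of the repeat. The function name find_repeats and the example sequence are made up for illustration.

```python
import re

# Assumption: motifs are 2-6 bases long (the original script's range is not stated).
# The backreference \1{4,} requires at least 4 further copies of the motif,
# i.e. 5 or more tandem copies in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_repeats(seq_id, seq):
    """Return one dict per repeat found, using the column names from the tables."""
    seq = seq.upper()
    rows = []
    for match in SSR_PATTERN.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer runs (AAA, TTT, etc.) are not captured
        right_flank = seq[match.end():]
        gc_count = right_flank.count("G") + right_flank.count("C")
        # Assumption: %GC is reported as 0.0 when there is no flank at all.
        gc = round(100.0 * gc_count / len(right_flank), 1) if right_flank else 0.0
        rows.append({
            "id": f"{seq_id}-{len(rows) + 1}",      # -1, -2, ... per sequence
            "length": len(seq),                     # Length
            "motif": motif,                         # Motif
            "reps": len(match.group(0)) // len(motif),  # Reps
            "start": match.start() + 1,             # Start (1-based, an assumption)
            "flank": len(right_flank),              # Flank
            "gc": gc,                               # %GC
        })
    return rows


if __name__ == "__main__":
    # Made-up sequence: 6 bases, then AC repeated 6 times, then 8 bases.
    for row in find_repeats("EXAMPLE", "GGTTGG" + "AC" * 6 + "GGCCTTAA"):
        print(row)
```

For the made-up sequence above, the sketch reports a single row, EXAMPLE-1, with Motif AC, Reps 6, Start 7, Flank 8 and %GC 50.0.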
You may download the sequences as either text files or Excel spreadsheets. Links are at the bottom of the respective pages. If you are using a Macintosh, please click and hold the link, then select "save link as...".
If you have any questions, please feel free to contact me.