i. BLAST results
ii. ORF Finder at NCBI
iii. ORF Finder at BCM
iv. Summary
v. Primer picks
Received chromatograms from Cereon on CD.
Three libraries were present:
1. Lib 3306
2. Lib 3307 a primer
3. Lib 3307 z primer
Ran phred to obtain base calls and quality scores
Ran crossmatch to mask vector sequences
Ran phrap to assemble the sequences
Used consed to visualize the assemblies
Obtained additional sequences from Hsin-Mei
1. 05_t700m13r
2. 06_t680_t3
3. 07_t680_t7
4. 99.bin
5. H19f
6. H19r
Used BLAST similarities to add ESTs from the Cereon and TIGR sequences
1. TOVAE71THB
2. TOVAP70TH
3. TOVCM01TH
4. TOVCO47TH
5. TOVDT19TH
6. cC-esflcLEL13M05a1
7. cC-esflcLEL13M05d1
8. cC-esflcLEL13001a1
9. cC-esflcLEL21J23d1
10. cC-esflcLEL23K03a1
11. cC-esflcLEL23K03d1
12. cC-esle00003e12a1
13. cC-esle00003e12d1
14. tomato010290.t3
Resulting file = BAC19.fasta.screen.ace.5
Why did I leave the vector-contaminated sequences in the contig? They are easier to remove that way. New trick I learned at Cereon…
61 Contigs:
Contig 61: 501 reads, 39,440 bp GC%=32
From bp 1 to bp 578 = H19F; therefore we are able to orient this contig as the "beginning" of the BAC.
Contig 60: 383 reads, 32,272 bp
From bp 23144 to bp 23744 = H19R; therefore we are able to orient this contig as the "end" of the BAC.
Contig 58: 48 reads, 4649 bp
Contig 57: 40 reads, 4953 bp = empty insert (i.e. vector only)
Contig 56: 39 reads, 5388 bp
Contig 55: 19 reads, 1240 bp = empty insert (i.e. vector only)
Contig 54: 7 reads, 1590 bp = only reads present are cC-esflcLEL13001a1, cC-esflcLEL23K03a1, cC-esflcLEL23K03d1, cC-esflcLEL13O01d1, 07_t680_t7.abi, tomato010290.t3, 06_t680_t3.abi. I believe these are all ESTs, so until they contig with the group, I won't be analyzing them.
Contig 53: 4 reads, 709 bp = only EST reads (TOVAP70TH, TOVAE71THB, TOVDT19TH, TOVCM01TH)
Contig 52: 3 reads, 901 bp
Contig 51: 3 reads, 1170 bp
Contig 50: 3 reads, 78 bp
Contig 49: 3 reads, 225 bp
Contig 48: 2 reads, 272 bp
Contig 47: 2 reads, 219 bp
Contig 46: 2 reads, 150 bp
Contig 45: 2 reads, 201 bp
Contig 44: 2 reads, 544 bp
Contig 43: 2 reads, 89 bp
Contig 42: 2 reads, 266 bp
Contig 41: 2 reads, 103 bp
Contig 40: 2 reads, 544 bp
Contig 39: 2 reads, 190 bp
Contig 38: 2 reads, 381 bp
Contig 37: 2 reads, 392 bp
Contig 36: 2 reads, 354 bp
Contig 35: 2 reads, 669 bp
Contig 34: 2 reads, 613 bp
Contig 33: 2 reads, 201 bp
Contig 32: 2 reads, 248 bp
Contig 31: 2 reads, 621 bp
Contig 30: 2 reads, 200 bp
Contig 29: 2 reads, 386 bp
Contig 28: 2 reads, 389 bp
Contig 27: 2 reads, 576 bp
Contig 26: 2 reads, 392 bp
Contig 25: 2 reads, 105 bp
Contig 24: 2 reads, 498 bp
Contig 23: 2 reads, 206 bp
Contig 22: 2 reads, 1023 bp
Contig 21: 2 reads, 167 bp
Contig 20: 2 reads, 282 bp
Contig 19: 2 reads, 125 bp
Contig 18: 2 reads, 125 bp
Contig 17: 2 reads, 148 bp
Contig 16: 2 reads, 73 bp
Contig 15: 2 reads, 300 bp
Contig 14: 2 reads, 74 bp
Contig 13: 2 reads, 112 bp
Contig 12: 2 reads, 428 bp
Contig 11: 2 reads, 277 bp
Contig 10: 2 reads, 465 bp
Contig 9: 2 reads, 465 bp
Contig 8: 2 reads, 216 bp
Contig 7: 2 reads, 52 bp
Contig 6: 2 reads, 217 bp
Contig 5: 2 reads, 217 bp
Contig 4: 2 reads, 763 bp
Contig 3: 2 reads, 658 bp
Contig 2: 2 reads, 823 bp
Contig 1: 1 read, 819 bp
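As a rough aid for tallying a contig list like the one above, the sketch below parses lines of the form "Contig N: X reads, Y bp" and totals reads and bases. It is only an illustration: the regular expression, the function names, and the min_reads cutoff are assumptions of this sketch and are not part of the phrap or consed output.

```python
import re

# Tolerates the punctuation variants seen in the list ("reads,", "reads.", "reads ")
# and thousands separators in the length ("39,440 bp").
CONTIG_RE = re.compile(r"Contig\s+(\d+):\s*(\d+)\s+reads?[.,]?\s+([\d,]+)\s*bp")


def parse_contigs(lines):
    """Return (contig_id, reads, length_bp) for every line that looks like a contig summary."""
    contigs = []
    for line in lines:
        match = CONTIG_RE.search(line)
        if match:
            contig_id = int(match.group(1))
            reads = int(match.group(2))
            length_bp = int(match.group(3).replace(",", ""))
            contigs.append((contig_id, reads, length_bp))
    return contigs


def summarize(contigs, min_reads=3):
    """Total the reads and bases; min_reads is a hypothetical cutoff for 'real' contigs."""
    return {
        "contigs": len(contigs),
        "total_reads": sum(c[1] for c in contigs),
        "total_bp": sum(c[2] for c in contigs),
        "with_min_reads": sum(1 for c in contigs if c[1] >= min_reads),
    }


sample = [
    "Contig 61: 501 reads, 39,440 bp GC%=32",
    "Contig 60: 383 reads, 32,272 bp",
    "Contig 58: 48 reads 4649 bp",
    "Contig 6: 2 reads. 217 bp",
]
contigs = parse_contigs(sample)
summary = summarize(contigs)
print(summary)
```

Lines that do not match the pattern (for example the "From bp ..." orientation notes) are simply skipped.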
| Sequence Name | Start bp | End bp | E-value | Strand |
| cC-esflcLEL21J23d1 | 1 | 803 | 0.0 | Plus |
| H19F | 1 | 560 | 0.0 | Plus |
| TPRCK57THB (cLER16J17) | 38585 | 39100 | 0.0 | Plus |
| H2f-2 from HSK | 22825 | 23308 | 0.0 | Plus |
| H2f-2 from HSK | 23374 | 23440 | 4e-020 | Plus |
| cC-esflcLEL13M05a1 | 12615 | 13555 | 0.0 | Plus |
| cC-esflcLEL13M05d1 | 12686 | 13536 | 0.0 | Plus |
| cC-esflcLEL13M17d1 | 12669 | 13550 | 0.0 | Plus |
| cC-esflcLEL21J23d1 | 38522 | 39441 | 0.0 | Plus |
|  | 14804 | 15061 | e-141 | Plus |
|  | 15224 | 15440 | e-117 | Plus |
|  | 18913 | 19007 | 6e-044 | Plus |
|  | 14678 | 14747 | 5e-029 | Plus |
|  | 1 | 194 | e-103 | Plus/Minus |
|  | 38742 | 39009 | 7e-90 | Plus/Minus |
|  | 38058 | 38122 | 4e-014 | Plus/Minus |
| Arabidopsis cDNA H7D12T7 (W43868) | 26231 | 26425 | 1e-069 | Plus |
| Arabidopsis cDNA G9G1T7 (N95853) | 26235 | 26425 | 2e-065 | Plus |
| Arabidopsis cDNA F4C6T7 (N96461) | 26231 | 26425 | 3e-064 | Plus |
| Maize cDNA ag88e01 (AI065545) | 26238 | 26410 | 3e-052 | Plus |
| Rice cDNA E1273_4Z (AU075650) | 26231 | 26410 | 1e-051 | Plus/Minus |
| Rice cDNA E1273_4Z (AU075650) | 26110 | 26171 | 1e-007 | Plus/Minus |
| Tobacco chloroplast (Z00044) | 26203 | 26425 | 3e-086 | Plus |
| Tobacco chloroplast (Z00044) | 26203 | 26425 | 3e-086 | Plus/Minus |
| Tobacco chloroplast (Z00044) | 26111 | 26184 | 2e-019 | Plus |
| S. nigrum chloroplast (Y18934) | 26203 | 26425 | 3e-086 | Plus/Minus |
| S. nigrum chloroplast (Y18934) | 26111 | 26184 | 2e-031 | Plus/Minus |
| C.reflexa chloroplast (X72584) | 26203 | 26410 | 2e-075 | Plus/Minus |
| C.reflexa chloroplast (X72584) | 26153 | 26181 | 2e-004 | Plus/Minus |
| Brassica napus 30S (AF124376) | 26231 | 26425 | 2e-069 | Plus/Minus |
| Brassica napus 30S (AF124376) | 26110 | 26184 | 1e-017 | Plus/Minus |
| Soybean plastid (X07675) | 26236 | 26425 | 4e-064 | Plus/Minus |
| Soybean plastid (X07675) | 26111 | 26169 | 1e-008 | Plus/Minus |
| Soybean chloroplast (X05013) | 26248 | 26425 | 5e-057 | Plus/Minus |
| Maize chloroplast (X86563) | 26231 | 26410 | 5e-054 | Plus |
| Maize chloroplast (X86563) | 26231 | 26410 | 5e-054 | Plus/Minus |
| Lathraea clandestina (rps7) gene (AF030982) | 26248 | 26425 | 3e-052 | Plus/Minus |
| Rice chloroplast (X15901) | 26231 | 26410 | 1e-051 | Plus/Minus |
| Rice chloroplast (X15901) | 26231 | 26410 | 3e-009 | Plus |
| Rice chloroplast (X15901) | 26111 | 26171 | 3e-009 | Plus |
| Rice chloroplast (X15901) | 26111 | 26171 | 3e-009 | Plus/Minus |
The chloroplast hits are STRANGE. They appear to be perfectly palindromic. Perhaps a retrotransposon?
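One way to look at this more systematically is sketched below, under assumptions of my own. The first helper lists subjects that hit the same contig coordinates on both strands. The second tests whether a stretch of sequence equals its own reverse complement, which is what "perfectly palindromic" would mean for DNA. The function names are hypothetical, and a both-strand hit only suggests an inverted repeat somewhere. It does not show on its own that the contig region is palindromic.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def is_rc_palindrome(seq):
    """True if the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq).upper()


def both_strand_hits(hits):
    """Return (name, start, end) keys reported as both Plus and Plus/Minus."""
    strands = defaultdict(set)
    for name, start, end, strand in hits:
        strands[(name, start, end)].add(strand.replace(" ", ""))
    return sorted(key for key, seen in strands.items()
                  if {"Plus", "Plus/Minus"} <= seen)


# A few rows copied from the BLAST table for this contig.
hits = [
    ("Tobacco chloroplast (Z00044)", 26203, 26425, "Plus"),
    ("Tobacco chloroplast (Z00044)", 26203, 26425, "Plus/Minus"),
    ("Maize chloroplast (X86563)", 26231, 26410, "Plus"),
    ("Maize chloroplast (X86563)", 26231, 26410, "Plus/Minus"),
    ("Soybean chloroplast (X05013)", 26248, 26425, "Plus/Minus"),
]
for key in both_strand_hits(hits):
    print(key)
```

To test the palindrome idea directly, the bases for the region of interest could be cut out of the contig consensus and passed to is_rc_palindrome.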
| Sequence Name | Start bp | End bp | E-value | Strand |
|  | 6105 | 6679 | 0.0 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 14816 | 15264 | 0.0 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 18665 | 18949 | 5e-096 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 13945 | 14099 | 3e-045 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 14583 | 14712 | 7e-040 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 15614 | 15730 | 3e-039 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 16591 | 16653 | 9e-024 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 18268 | 18377 | 3e-023 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 15815 | 15879 | 3e-020 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 14399 | 14470 | 3e-017 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 17540 | 17614 | 3e-014 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 16868 | 16936 | 8e-012 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 16736 | 16783 | 1e-007 | Plus / Minus |
| Nicotiana plumbaginifolia mRNA for U2 snRNP auxiliary factor (Y18351) | 13749 | 13773 | 0.42 | Plus / Minus |
|  | 18665 | 18949 | e-157 | Plus / Minus |
|  | 19623 | 19765 | 1e-072 | Plus / Minus |
|  | 18327 | 18377 | 9e-018 | Plus / Minus |
|  | 13852 | 14099 | e-133 | Plus / Minus |
|  | 13733 | 13826 | 2e-043 | Plus / Minus |
|  | 14399 | 14470 | 3e-030 | Plus / Minus |
|  | 14573 | 14629 | 2e-021 | Plus / Minus |
|  | 25995 | 26441 | e-123 | Plus / Minus |
|  | 24557 | 24703 | 9e-021 | Plus / Minus |
|  | 18665 | 18780 | 1e-056 | Plus / Minus |
|  | 18268 | 18377 | 5e-053 | Plus / Minus |
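The E-value column in these BLAST tables mixes several spellings: "0.0", "4e-020", and the shorthand "e-141", which I read as 1e-141. A small helper like the one below normalizes them so hits can be sorted or filtered numerically. This is a sketch, and the function names and the 1e-10 cutoff are placeholders of mine, not values used in the analysis.

```python
def parse_evalue(text):
    """Convert a BLAST-style E-value string ('0.0', '4e-020', 'e-141') to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # Bare exponent shorthand: "e-141" is taken to mean 1e-141.
        text = "1" + text
    return float(text)


def significant(hits, cutoff=1e-10):
    """Keep hits whose E-value is at or below the cutoff (cutoff is a placeholder)."""
    return [hit for hit in hits if parse_evalue(hit[3]) <= cutoff]


# (name, start, end, e-value) rows copied from the table above.
rows = [
    ("U2 snRNP auxiliary factor (Y18351)", 14816, 15264, "0.0"),
    ("U2 snRNP auxiliary factor (Y18351)", 16736, 16783, "1e-007"),
    ("U2 snRNP auxiliary factor (Y18351)", 13749, 13773, "0.42"),
    ("", 18665, 18949, "e-157"),
]
kept = significant(rows)
for row in sorted(kept, key=lambda r: parse_evalue(r[3])):
    print(row)
```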
| Sequence Name | Start bp | End bp | E-value | Strand |
| H19R | 23146 | 23733 | 0.00 | Plus/Minus |
| BA47f-2 | 16702 | 17063 | e-162 | Plus |
| L.esculentum GBF4 mRNA (X74942) | 11596 | 11861 | e-139 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 15809 | 16016 | 8e-098 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 11822 | 11995 | 5e-087 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 16419 | 16580 | 4e-084 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 12275 | 12405 | 1e-065 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 14921 | 15002 | 2e-036 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 16707 | 16787 | 1e-035 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 16279 | 16336 | 5e-022 | Plus / Minus |
| L.esculentum GBF4 mRNA (X74942) | 16105 | 16146 | 2e-012 | Plus / Minus |
| Sequence Name | Start bp | End bp | E-value | Strand |
| cC-esflcLEL1G15c1 | 4333 | 4600 | e-140 | Plus / Minus |
| cC-esflcLEL6D02a1 & cC-esflcLEL6D02d1 | 4303 | 4600 | e-129 | Plus / Minus |
| cLES15K9 | 4400 | 4600 | e-102 | Plus / Plus |
| BA10r-2 | 3197 | 3418 | 4e-080 | Plus / Plus |
| Sequence Name | Start bp | End bp | E-value | Strand |
| BA10r-2 | 1252 | 1764 | 0.0 | Plus |
| cLER16J17 | 4756 | 5034 | e-156 | Plus / Minus |
| cC-esflcLEL21J23d1 | 4850 | 5034 | 2e-073 | Plus / Minus |
Started with NCBI's ORF finding program (http://www.ncbi.nlm.nih.gov/gorf/gorf.html), examining only ORFs whose length is greater than 210 bp and accepting only those with blastp hits better than 10e-50. N.B. Some of these ORFs have low-quality hits to Arabidopsis sequences, e.g. 20070..20372. N.B. I can't use the NCBI page to BLAST against dbEST, which would probably be more informative. Need to get some genefinder software on the Theory Center Cluster.
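For reference, the kind of scan an ORF finder performs can be sketched in a few lines. This is not NCBI's implementation. It is a minimal six-frame ATG-to-stop search written under my own assumptions: coordinates are 1-based on the forward strand, the stop codon is counted in the ORF length, the 210 bp cutoff is inclusive, and the minus-strand frame labels follow a convention that may differ from the NCBI page.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len):
    """Yield (frame, start, end) as 0-based half-open spans of ATG..stop ORFs."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame ATG opens the ORF
            elif start is not None and codon in STOPS:
                end = i + 3                    # include the stop codon
                if end - start >= min_len:
                    yield frame, start, end
                start = None


def find_orfs(seq, min_len=210):
    """Return (frame_label, from, to) tuples, longest first, in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = []
    for frame, s, e in orfs_in_strand(seq, min_len):
        found.append((f"+{frame + 1}", s + 1, e))
    for frame, s, e in orfs_in_strand(reverse_complement(seq), min_len):
        # Map spans on the reverse strand back to forward-strand positions.
        found.append((f"-{frame + 1}", n - e + 1, n - s))
    return sorted(found, key=lambda orf: orf[2] - orf[1], reverse=True)


# Toy sequence: one 18 bp ORF (ATG + 4 codons + TAA) starting at base 3.
toy = "CC" + "ATG" + "GCA" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_len=18))
```

Each ORF that passes the length filter would then be translated and run through blastp, as described above.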
| Frame | From | To | Hit |
| -3 | 6208 | 7284 | None |
| +3 | 28482 | 29282 | None |
| -1 | 12768 | 13358 | Z99708 - Ap2 |
| -1 | 6798 | 7205 | None |
| +1 | 14809 | 15144 | None |
| -2 | 21143 | 21451 | None |
| -1 | 20070 | 20372 | None |
| -3 | 20743 | 21015 | None |
| -3 | 32647 | 32916 | None |
| +3 | 25848 | 26117 | None |
| -1 | 16293 | 16559 | None |
| +2 | 6971 | 7237 | None |
| -2 | 494 | 5197 | None |
| +2 | 15209 | 15463 | None |
| -3 | 1 | 243 | None |
| +2 | 770 | 1009 | None |
| -2 | 39203 | 39439 | None |
| -2 | 4894 | 5121 | None |
| -2 | 9701 | 9925 | None |
| +1 | 26125 | 26346 | None |
| -1 | 10092 | 10313 | None |
| -1 | 12258 | 12473 | None |
| -1 | 4101 | 4313 | None |
| -1 | 35253 | 35462 | None |
| +2 | 34466 | 34675 | None |
| Frame | From | To | Hit |
| +2 | 683 | 2182 | Putative vicilin storage protein, Arabidopsis AC006135 |
| -1 | 25991 | 26779 | Nothing |
| -1 | 6515 | 7075 | Nothing |
| -1 | 18662 | 19039 | U2 snRNP aux factor Y18351 |
| +3 | 6447 | 6818 | Nothing |
| -3 | 2997 | 3368 | Nothing |
| -1 | 7664 | 8032 | Nothing |
| -1 | 26999 | 27310 | Nothing |
| -2 | 8440 | 8736 | Nothing |
| -3 | 29025 | 29300 | Nothing |
| +2 | 6200 | 6469 | Nothing |
| -3 | 13779 | 14042 | Nothing |
| -2 | 13960 | 14205 | Nothing |
| -3 | 25281 | 25511 | Nothing |
| +3 | 8712 | 8942 | Nothing |
| -3 | 18183 | 18410 | Nothing |
| -3 | 2724 | 2951 | Nothing |
| +1 | 7036 | 7251 | Nothing |
| Frame | From | To | Hit |
| +1 | 9415 | 1105 | Z99708 Arabidopsis SCARECROW-like protein |
| -2 | 248 | 682 | None |
| +2 | 22796 | 23203 | None |
| -1 | 15597 | 15947 | None |
| +2 | 23417 | 23728 | None |
| -3 | 22882 | 23187 | None |
| -3 | 2515 | 2772 | None |
| -1 | 17223 | 17474 | None |
| -1 | 10287 | 10514 | None |
| +3 | 12060 | 12284 | None |
| +3 | 21105 | 21317 | None |
| +2 | 2936 | 3148 | None |
| -3 | 19135 | 19344 | None |
http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html
BESTORF with Plant database = no reliable predictions
FGENEP with Plant database
| Gene # | Strand | ORF# | feature | start | end | BLAST |
| 1 | - | 1 | CDSf | 2470 | 2478 | N/A (3 aa) |
| 2 | - | 1 | CDSo | 6208 | 7284 | AAD30261 |
| 3 | + | 1 | CDSf | 14809 | 15060 |  |
| 3 | + | 2 | CDSi | 15224 | 15439 |  |
| 3 | + | 3 | CDSi | 15794 | 15814 |  |
| 3 | + | 4 | CDSi | 16339 | 16416 |  |
| 3 | + | 5 | CDSi | 17623 | 17652 |  |
| 3 | + | 6 | CDSi | 18115 | 18150 |  |
| 3 | + | 7 | CDSi | 18690 | 18752 |  |
| 3 | + | 8 | CDSl | 23249 | 23266 |  |
| 3 | + |  | Poly-A: | 23393 |  |  |
| 4 | + | 1 | CDSf | 28482 | 29174 | Nothing |
| 4 | + | 2 | CDSi | 29725 | 29856 |  |
| 4 | + | 3 | CDSi | 31670 | 31714 |  |
| 4 | + | 4 | CDSi | 31903 | 31926 |  |
| 4 | + | 5 | CDSi | 32148 | 32237 |  |
| 4 | + | 6 | CDSl | 33502 | 33513 |  |
| 4 | + |  | Poly-A: | 33607 |  |  |
BESTORF with Plant Database =
In the + direction there's one predicted ORF from 668 to 2179
>BESTORF 1 1 fragment (s) 683 - 2179 499 aa, chain +
MKVPEDVIEEVLAGTEVPAIVHGVPKSTKKKKNLWEMEAQFMKTVLGRGSYSFFDNRRNK
KKSSQLFNVFQEKPDFENCNGWSTVINRKKLPALKGSQIGIYVVNLTKGSMMGPHWNPMA
TEIGIAIQGEGMVRVVCSKSGTGCKNMRFKVEEGDVFVVPRFDPMAQMAFNNNSFVFVGF
STTTKKHHPQYLTGKASVLRTLDRQILEASFNVGNTTMHQILEAQGDSVILECTSCAEEE
KRLMEEEMRKEEEEAKKKEEARKAEEERREKEAEEERKRQEEEARKREEEEIRRRQEEEE
ARRRQEEEEEERERQEARKKQEEEEAAQREAEQARREEEEAEKRRQEEEESRREEKARRR
QQEEARRREEEEAAKRQHEEEAEREAEEARRIEEEEAQREAEEARRIQQEEEAERARRRE
EEAETRRKEEEEEESRRQEEESRRSEEEAAREAERERQEEAERQEEARRREEETEERHQQ
EETEEEEPGQPEMNGYSSN
Which has a hit of 2e-55 to gi|4218005 (AC006135) putative vicilin storage protein (globulin-like) [Arabidopsis thaliana]
FGENEP with Plant database
Found four genes:
1 Gene with many exons:
204 - 5177 256 aa Chain: +
ESLSEPYSSIRKMVLGFDKKVLQAAFHVPEDVIEEVLAGTEVPAIVHGVPKSTKKKKNLW
EMEAQFMKTVLGRGSYSFFDNRRNKKKSSQLFNVFQEKPDFENCNGWSTVINRKKLPALK
GSQIGIYVVNLTKGSMMGPHWNPMATEIGIAIQGEGMVRVVCSKSGTGCKNMRFKVEEGD
VFVVPSTTTKKHHPQYLTGKASVLRTLDRQILEASFNVGNTTMHQILEAQEIQSIREKEH
RFPSQTIYRLRQRAST
Blastp vs nr: 1e-52
gi|4218005 (AC006135) putative vicilin storage protein (globulin-like) [Arabidopsis thaliana]
2 Gene with one exon:
8440 - 8736 98 aa Chain: -
MPADPDNSSAMNDSTGSGEASVSSSGNQVVPLKESAKKKRNLPGMPGKSCSVTILDEPNF
TNIQRRFDFLFLVALIRSRCRGYCFVTNNFAGDEQICL
Blastp vs nr: Nothing
3 Gene with many exons:
12547 - 19657 539 aa Chain: -
MVLLLTVTPSLSILLVTPRKKGAHQEAGRRRGRKGVIKTGTGTGIEIEIETGTEVRKGTK
IGIGTERGRRTVIVITEIVIGIGVIGGRGSEIEMKMIFSELEITIGEFKCSSKSEETMTK
IERTDRDTSLALGVDPSIDQGLDHGHHLRAKGSVVLIWHLPPVHCYLALLMLQVHGPSIS
LQFKYLENCHVFPGHSVEVRIEMHGYILKAMLGYLPHCADNGRQVPGTTNPSIPGMFSNM
FPLAAGQFGALPIMPVQAMTQQATRHARRVYVGGLPPTANEQSETICSCRISAGNTAGPG
DAVVNVYINHEKKFAFVEMRSVEEASNAMALDGVIFEGGPVKVRRPSDYNPSLAATLGPS
QPSPNLNLAAVGLTPGSSGGLEGPDRIFVGGLPYYFTESQIRELLESFGQLRGFDLVKDR
ETGNSKALNGIKMGDKTLTVRRANQGTTQPNPEQESVLLHAQQQIALQRFMLQPGALATK
VLCLTEVVSVDELKDDDDYQDILEDMRIECGKFGALLNVVIPRPNPNGEPTPGLGKELD
Blastp vs nr: e-162
emb|CAA77136| (Y18351) U2 snRNP auxiliary factor, large subunit [Nicotiana plumbaginifolia]
4 Gene with many exons:
24260 - 32060 369 aa Chain: -
IVSTKWVPSSTCADFAVGIFIGDKNNIGRLHSQPQISTSILFMQLYSSDHHCSIVRKNKL
PGNYKDIGEHWLHLKYHGTRSDISARHLQKMAMEGGSPVPFSMIPWGWGRGRGRVGCLRN
LLKNVLELGFLREQEAIWESMGKQINDIMHGGRNRPTLKNPLNPKANIPALSSSDKPTPT
TFLRTSLSEEIRGALPKALQNVFRIHKHAETWVKQNFIDGIKKMQRLRKGELCEDLGRNC
PSQDLCCLFSMLYIYFTFLECLLQICELMEKEDPMKAIISPAGISDALRKEIEAVVNQIA
VNIHGVYVLKSSPDNPQYDALRKVVIDLFMAEGSNAKLKKASIVEAAKLQLGRDVTTVEF
QKVVAEVNN
Blastp vs nr: Nothing
BESTORF with Plant database
In the + direction there's one predicted ORF from 9415 to 11022
>BESTORF 1 1 fragment (s) 9415 - 11022 536 aa, chain +
MKVPFSTNDNVSSKPLVNSNNSFTFPAATNGSNLCYEPKSVLELRRSPSPIVDKQIITTN
PDLSALCGGEDPLQLGDHVLSNFEDWDSLMRELGLHDDSASLSKTNPLTHSESLTQFHNL
SEFSAESNQFPSPDFSFSDTNFPQQFPTVNQASFINALDLSGDIHQNWSVGFDYVDELIR
FAECFETNAFQLAHVILARLNQRLRSAAGKPLQRAAFYFKEALQAQLAGSARQTRSSSSS
DVIQTIKSYKILSNISPIPMFSSFTANQAVLEAVDGSMLVHVIDFDIGLGGHWASFMKEL
ADKAECRKANAPILRITALVPEEYAVESRLIRENLTQFARELNIGFEIDFVLIRTFELLS
FKAIKFMEGEKTAVLLSPAIFRRVGSGFVNELRRISPNVVVHVDSEGLMGYGAMSFRQTV
IDGLEFYSTLLESLEAANIGGGNCGDWMRKIENFVLFPKIVDMIGAVGRRGGGGSWRDAM
VDAGFRPVGLSQFADFQADCLLGRVQVRGFHVAKRQAEMLLCWHDRALVATSAWRC
Which has a hit of e-115 to Z99708
SCARECROW-like protein from Arabidopsis
FGENEP with Plant database
Predicts a 23-ORF gene from 5412 - 23660, which has a dual-domain hit to:
G-box-binding protein - tomato (X74942) at an E-value of 2e-118 (blastp vs nr) from 347-598 AA
Protein kinase - Arabidopsis (X92728) at an E-value of 8e-27 (blastp vs nr) from 1 - 91 AA
Gene with many exons:
5412 - 23660 643 aa Chain: -
NLLLTEDKTTIKLADFGLAREDAEAEMTTEAGTYRWMAPEMFSMDPIRVGVKKYYNHKVD
VYSFSMILWELLTNSTPFKGRSNIMVAYATATLYDTKIEWILRSQCFQTQEPRFCFLYGC
SLFRIIFFFLNLFGGGITNLTFWKIRLGTFLFHLRFTTQFDDQFSMFKRFLENWATEWDV
PVPFRFVPFRLHTGRDSLNRYTERDGTVVSYRCTSTGSSQFRSGSDPKNKKNTYFCNPKK
CRPVYPIHPVLFSRFIILSLSKPFLDYSRGYLHIQLIDPMGAGEESTPTKTSKPPLTQET
PTAPSYPDWSSSMQAYYSAGATPPFFASPVASPAPHPYMWGGQHPLMPPYGTPVPYPALY
PPAGVYAHPNIATPAPNSVPANPEADGKGPEGKDRNSSKKLKVCSGGKAGDNGKVTSGSG
NDGATQSDESRSEGASAQNNPAKENHPTSIHGNPVTMPATNLNIGMDVWNASAAGPGAIK
IQQNATGPVIGHEGRMNDQWIQEERELKRQKRKQSNRESARRSRLRKQLELTTMDNHMRE
KFELGNTNERMEVMAECEELQRRVEALSHENHSLKDELQRLSEECEKLTSENNLIKQEKA
SINQAKLSINPPLNIRIIFHVDYKNIYVVGKGKMRGQNTKNQF
BESTORF with Plant database
In the + direction there's one predicted ORF from 2632 to 3228
>BESTORF 1 1 fragment (s) 2632 - 3228 199 aa, chain +
MHKNSQSSDSLSHIVSLTSKISSNLTRIDPKFSSIIRSYGVVGMFDHTLKIYHQMDDLGT
PRPVISFNVLLSACVRSKLYDCVPQLLDEISVKYGFLPDKVSYDILIRSYCEMRSSKMAI
KILKVIEEKSVEITTITFTTILHSFYKKGKNDEAEKVWNEMVNKGCGPDVGAYNVKIINI
QGGDLEGVKALIEEMTMLD
Which has a hit of 5e-46 to emb|CAA05629.1| (AJ002597) membrane-associated salt-inducible protein like [Arabidopsis thaliana]
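A quick sanity check on single-fragment BESTORF predictions is that the span in the header line should be exactly three times the reported protein length. The sketch below reads the header format as it appears in these notes. The regular expression and function name are my own, and the check does not apply to the multi-exon FGENEP genes, whose spans include introns.

```python
import re

# Matches the tail of a header such as
# ">BESTORF 1 1 fragment (s) 683 - 2179 499 aa, chain +"
HEADER_RE = re.compile(r"(\d+)\s*-\s*(\d+)\s+(\d+)\s+aa,\s*chain\s*([+-])")


def check_bestorf_header(header):
    """Parse start, end, protein length and strand; flag whether span == 3 * aa."""
    match = HEADER_RE.search(header)
    if match is None:
        raise ValueError(f"unrecognized header: {header!r}")
    start, end, aa = (int(match.group(i)) for i in (1, 2, 3))
    span = end - start + 1
    return {
        "start": start,
        "end": end,
        "aa": aa,
        "strand": match.group(4),
        "consistent": span == 3 * aa,
    }


headers = [
    ">BESTORF 1 1 fragment (s) 683 - 2179 499 aa, chain +",
    ">BESTORF 1 1 fragment (s) 9415 - 11022 536 aa, chain +",
    ">BESTORF 1 1 fragment (s) 2632 - 3228 199 aa, chain +",
]
results = [check_bestorf_header(h) for h in headers]
for result in results:
    print(result)
```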
FGENEP with Plant database
Predicts one gene with three exons from 534 - 1010
1 Gene with many exons:
534 - 1010 83 aa Chain: -
METDSDQDEVVREVDVWLTPSANEFYVLQYPLRSEWRPYGLDERCQDVRLRPSSAEMEVD
LAIDFDSKNFDRDSVHAATIKKQ
With no similarity to anything in the database.
Last updated on: August 20, 1999.