Mini User's Guide to Using Consed on Syntom

The purpose of this document is to introduce you to using syntom to examine genomic assemblies with the Consed software package.

Obtaining an Account

To obtain an account on syntom, please contact Andreas Matern (alm13@cornell.edu).

Logging On

You can log onto syntom from any computer which has an internet connection. I'll provide examples for logging on using a Windows NT machine. From the Start menu, select Run.

Enter telnet syntom.cit.cornell.edu in the Open field and hit OK. If all goes well, the following screen should appear:

Enter your login name at the prompt and hit enter, and then enter your password and hit enter.

The window will now tell you where your last login was from and then display a friendly message :-)

You are now logged into syntom.

Basic Commands

Command	Function
ls	lists all the files in the directory
cd	puts you in your home directory
cd /path/to/directory	changes your directory
exit	logs out
cp file1 file2	copies file1 to file2
mv file1 file2	moves (renames) file1 to file2
cd ..	moves up one directory in the directory tree
pwd	prints the working directory
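
The commands in the table can be tried out safely in a scratch directory. What follows is a minimal sketch rather than part of the original guide; the directory and file names (practice, notes.txt, and so on) are made up for illustration, and mkdir, echo and mktemp are standard commands that do not appear in the table.

```shell
# Work in a throwaway directory so none of your real files are touched.
# All names below (practice, notes.txt, ...) are hypothetical.
scratch=$(mktemp -d)
cd "$scratch"

mkdir practice            # make a directory to play in
cd practice               # cd /path/to/directory: change into it
pwd                       # print the working directory
echo "hello" > notes.txt  # create a small file
cp notes.txt copy.txt     # cp: copy notes.txt to copy.txt
mv copy.txt renamed.txt   # mv: move (rename) copy.txt to renamed.txt
ls                        # lists notes.txt and renamed.txt
cd ..                     # go up one directory, back to the scratch directory
```

Note that after the mv there is no copy.txt any more: mv renames the file, whereas cp leaves the original in place.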

These are just some basic commands. There are many more -- I have some Linux books on my shelf in Theresa's office, feel free to check them out but please don't steal them....

Starting X Windows

Consed uses a graphical interface, which requires an X Windows emulator. We'll use Exceed (which is installed on most of the machines in G-04). To start Exceed on your machine, go to Start menu -> Programs -> Exceed -> Exceed. A Hummingbird picture should open and then disappear. You now need to tell syntom the address of your machine so that it can draw the windows on the appropriate machine. The easiest way to find your machine's address is to log out of your current telnet session to syntom (type exit) and then log back in (Run -> telnet syntom.cit.cornell.edu). After the Password: field there should be a line which says:

Last login: <date> <time> from genomics8.cit.cornell.edu (<- this will be the address)

That's the address (host name) of the machine you are using.

To tell syntom where to display the windows, simply type:

export DISPLAY=genomics8.cit.cornell.edu:0

Don't forget to append the colon zero (:0) !
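
If you log in from several different machines, a small shell function can add the :0 for you. This is only a sketch, assuming an sh-compatible shell such as bash (which the export syntax above implies); set_display is a made-up helper name, not a command that exists on syntom.

```shell
# set_display HOST: point X programs at display 0 on HOST.
# set_display is a hypothetical helper name, not a syntom command.
set_display() {
    case "$1" in
        "")  echo "usage: set_display hostname" >&2; return 1 ;;
        *:*) DISPLAY=$1 ;;        # a display number was already given
        *)   DISPLAY=$1:0 ;;      # append the :0 that is easy to forget
    esac
    export DISPLAY
    echo "DISPLAY is now $DISPLAY"
}

set_display genomics8.cit.cornell.edu
```

You can check the current setting at any time by typing echo $DISPLAY.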

Getting to the Cereon BAC directory

The BAC19 consed information is here: /home/amatern/sequences/bac19new/

To get to it, type cd /home/amatern/sequences/bac19new

To see what files are there, type ls

You'll see three directories:

chromat_dir	the chromatographs
phd_dir	the phd files generated by phred
edit_dir	this is the directory you'll be working in

To get to edit_dir, simply type cd edit_dir

Typing ls gives you a list of all the files that are there:

To start consed, simply type consed_linux

If all goes well, there should be a consed window on your screen!

Here is the documentation from consed. I've added a little bit of information, but for the most part this is the documentation that comes with the program. A couple of features of consed are not yet implemented on syntom. When you find something that doesn't work, e-mail me and I'll get it working....

CONSED 9.0 DOCUMENTATION

CONTENTS:

WHAT IS NEW IN CONSED 9.0

QUICK TOUR OF CONSED

ADVANCED PHRAP/CONSED USAGE

INSTALLING CONSED

NOTE TO SGI USERS

FOR PROGRAMMERS AND FELLOW TRAVELLERS ONLY

MONITORS AND MICE FOR CONSED

PRIMER PICKING PARAMETERS

AUTOFINISH PARAMETERS

NEW ACE FILE FORMAT

WHAT THE COLORS MEAN

------------------------------------------------------------------------

WHAT IS NEW IN CONSED 9.0

This section is mainly intended for advanced consed users. Novice

users should consult the Quick Tour (below).

---------------------------------------------------------------------

Note to Linux users: The 'scroll all traces' bug is fixed.

Note to Solaris 2.7 users: Consed now works on this version of

Solaris.

---------------------------------------------------------------------

Autofinish Improvements

* Reverses (universal primer reverse reads) are now suggested in order

to close gaps and improve low quality regions in addition to flanking

gaps.

* Autofinish now evaluates itself--after you do the reads it

suggests, you can run it and it will tell you how well the reads

solved the problems they were supposed to solve.

* Oligos are tagged (when you use -doExperiments).

* doNotFinish tags can be used to tell autofinish to not try to

finish particular regions.

* There are many more flags, allowing you a great amount of control

over autofinish. For example, if you wanted the first round of autofinish

to not choose any custom oligo experiments, fine. If you wanted

autofinish to only close gaps and not improve the error rate within

contigs, fine.

* The autofinish output is very detailed and verbose. Thus in

addition there are 3 summary lists of experiments to do (one file for

forward universal primer experiments, one file for reverse universal

primer experiments, and one file for custom oligo walks.) These

summary files are easily imported into Excel. You can use the last

one to email order oligos.

---------------------------------------------------------------------

Consed:

* Consed already had the ability to tear a contig into 2 and join 2

contigs into one. Now it also has the ability to move a single read

to a different location within an assembly. Now you have much better

control in fixing a misassembly.

* You used to be able to compare a contig to one other contig. Now

you can compare a contig to many other contigs.

* For sites with LONG, LONG read names: you can now customize how

much space consed saves for displaying read names. You can also

customize the initial size of the important windows.

* In the Traces Window, you can move left and right with the arrow

keys.

* In the Aligned Reads Window, you can instantly move to the

beginning and/or end of a read. Similarly, you can move instantly

to the beginning and/or end of the consensus.

* The ABI base calls can be hidden, if you like, thus allowing you

to see more traces at once.

* All documentation windows can be searched and printed out.

* In the past you could see all tags of a particular type for a

particular contig. Now there is also a function to see all tags of a

particular type in any contig.

* You can now write all contigs to a file in FASTA format with a

single click.

* You can navigate to multiple locations while staying in the Aligned

Reads Window--you don't have to switch windows with each location.

* For the primers that consed picks, consed will show you the

alignment of the closest false match. This will help you in deciding

if you want to raise consed.primersMaxMatchElsewhereScore

* The template picking part of primer picking has been further improved:

-------------------------------------------------   (template)
            --->   (primer)
            <----distance to end of template---->

This 'distance to end of template' gives the longest read you

could possibly make with this primer and this template. If this

distance is too short, you can now reject the template. The consed

resource to set is:

consed.primersWhenChoosingATemplateMinPotentialReadLength: 500

* If you want to pick templates yourself, you can turn off consed's

template picking. This is particularly useful if you haven't bothered

to customize determineReadTypes.perl

* If you are using an old version of phred or if you haven't

installed it correctly (with all kinds of bad effects), consed will

warn you.

* Previously, consed reported the error rate for a contig. But some

contigs have long tails of low quality bases and you would like to

know the error rate for the contig without that long tail. Now you

can do that: You can get the error rate for a specified region.

* Programmers can now append RT tags to the ace file. (See FOR

PROGRAMMERS AND FELLOW TRAVELLERS in README.txt)

* Programmers can popup a trace by a command from a different

program.

----------------------------------------------------------------------------

QUICK TOUR OF CONSED

Release 9.0

Consed is a program for viewing and editing assemblies assembled with

the phrap assembly program.

If you are already an advanced consed user, you should read through

this and do any of the exercises on features that you are unfamiliar

with. I frequently run across people who are doing something in

consed a hard way month after month, and request a new feature to make

things easier, when that new feature is already in consed.

If you have never used consed before, following this Quick Tour will

take you less than 2 hours. However, it will save you approximately 2

days of agony. If you have 2 extra days to spare, and prefer to waste

them in agony, then do not do this Quick Tour and instead immediately

skip down to 'INSTALLING CONSED' below.

When you do the quick tour, I encourage you to be free about changing

the data set. If you really mess things up (such as changing all a

read's bases to N's), no problem--just delete the data set and start

again with a fresh copy.

The software is already downloaded and your syntom profile should be correctly set so that you don't need to do the following - ALM

1) After downloading the distribution with netscape (see www.phrap.org

and click on 'consed'), copy the distribution to a unix computer (if

it is not already on one), and then unpack the files by typing the

appropriate line below (which one depends on what you named the file

downloaded by netscape):

zcat consed_solaris.tar.Z | tar -xvf -

zcat consed_alpha.tar.Z | tar -xvf -

zcat consed_hp.tar.Z | tar -xvf -

zcat consed_sgi.tar.Z | tar -xvf -

zcat consed_linux.tar.Z | tar -xvf -

Note: You must untar on a UNIX computer--not on an NT computer.

2) The only unix commands you must learn are the following 3:

pwd -- this tells you where you are

ls -- this tells you what files are there (Same as DIR in DOS)

cd -- this moves you (Same as CD in DOS)

That's it--use them a lot!

USING CONSED GRAPHICALLY

3) Type the following:

cd /home/amatern/sequences/standard/edit_dir

4) start consed by typing the appropriate command below:

consed_linux

Two windows will appear. One of these will have the list of .ace

files and say 'select assembly file to open' and

'standard.fasta.screen.ace.1'. Double click on that name. The first

window goes away.

You will now see a list of one contig and a list of reads. This is the

'Main Consed Window'.

Double click on 'Contig1'.

The 'Aligned Reads Window' will appear.

Try scrolling back and forth. Try scrolling by dragging the thumb of

the scrollbar. Also try scrolling by clicking on the 4 << < > >>

buttons for scrolling by small amounts. For scrolling by tiny

amounts, click on the arrows at either end of the scrollbar. For

scrolling by huge amounts, use the middle mouse button and just click

on some location on the scrollbar. For scrolling to the beginning or

end of the contig, use the <<< or >>> buttons.

(Question: why can't you just move the scrollbar to the extreme left

in order to go to the beginning of the contig? Answer: in typical

assemblies, there are reads that protrude beyond the beginning of the

contig and reads that protrude beyond the end of the contig. Moving

the scrollbar to the extreme left will scroll the contig to the

beginning of the leftmost read--typically far to the left of the

beginning of the contig. Thus you should get in the habit of using

the <<< and >>> buttons.)

Notice the colors. The bases that are in red are the ones that

disagree with the consensus.

Notice the different shades of grey background (around the bases).

They have the following meanings, but first, you need to understand

the meaning of the quality values:

A quality value of 10 means 1 error in ten to the 1.0 power

A quality value of 20 means 1 error in ten to the 2.0 power

A quality value of 30 means 1 error in ten to the 3.0 power

A quality value of 40 means 1 error in ten to the 4.0 power

and for quality values in between:

A quality value of 25 means 1 error in ten to the 2.5 power

Get the idea?
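
The rule above amounts to an error probability of 1 / 10^(Q/10) for a quality value Q. As a rough illustration, it can be computed at the shell prompt with awk. This is a sketch only; phred_error is a made-up helper name and is not part of consed or phred.

```shell
# phred_error Q: print the error probability for quality value Q,
# i.e. 1 / 10^(Q/10). phred_error is a hypothetical helper name.
phred_error() {
    LC_ALL=C awk -v q="$1" 'BEGIN { printf "%.6f\n", 1 / (10 ^ (q / 10)) }'
}

for q in 10 20 25 30 40; do
    echo "quality $q -> error probability $(phred_error "$q")"
done
```

For example, quality 20 gives 0.010000 (1 error in 100) and quality 40 gives 0.000100 (1 error in 10,000).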

(These have actually been empirically verified--if you are interested

in the gory details, read the phred papers:

Ewing B, Hillier L, Wendl M, Green P: Basecalling of automated

sequencer traces using phred. I. Accuracy assessment. Genome Research

8, 175-185 (1998).

Ewing B, Green P: Basecalling of automated sequencer traces using

phred. II. Error probabilities. Genome Research 8, 186-194 (1998).

In that same copy of the journal is a paper about consed, as well.)

Also notice the upper and lowercase. This is just a cruder indication

of the quality of the bases.

5) To see the quality value of a particular base, point at it and click

with the left mouse button.

These quality values are shown in grey scales:

Quality 0 through 4 is given by dark grey

Quality 5 through 9 is given by a shade lighter

Quality 10 through 14 is given by a shade still lighter

Quality of 40 through 97 is given by white (the brightest shade)

A quality value of 99 is reserved for bases that have been edited and

the user is absolutely sure of the base ('high quality edited').

A quality value of 98 is reserved for bases that have been edited and

the user is not sure of the base ('low quality edit').

The ends of the reads show bases that are grey and have a black

background. These are the low quality ends of the reads or the

unaligned ends of reads, as determined by phrap.

To see the quality of a base, click on it. You will see the quality

displayed in the Info Box on the Aligned Reads Window.

6) Click on a base on a read. Then hold down the control key and

type 'a'. You will move to the beginning of the read. Hold down the

control key and type 'e'. You will move to the end of the read.

(Emacs users will recognize these commands.)

7) Scroll so that location 490 is about in the middle of the aligned

reads window. Push the left mouse button down on the menu item 'Dim'.

There will be a list of choices that will appear. Drag the cursor

down to 'Dim Nothing' and release. Now look what happened to the

color of the bases. The ends of the reads that used to have a

black background now appear red with a grey background. You are

seeing the clipped-off bases with all the same information as any

other base. Since there are a huge number of red (discrepant) bases,

the screen becomes distracting and busy. Thus by default the low

quality clipped-off bases are made with a black background and a grey

foreground so they don't distract you.

Notice there is a distinction here between 'low quality ends of

reads' and 'unaligned ends of reads'. Unaligned ends of reads can be

low quality as well, or they can be high quality, as in the case of

chimeric reads.

You can play with the dimming options a bit. Then return it to 'Dim

Low Quality' for the rest of this tour.

TRACES AND EDITING

8) Point with the mouse at a base of one of the reads and click with

both mouse buttons simultaneously. It's difficult at first, but you'll quickly get the hang of it. (If you have a 2 button mouse, see MONITORS AND MICE FOR CONSED below.) The Trace Window showing the traces for that stretch of read should pop up.

There are 3 rows of bases in the trace window:

'con' is the consensus

'edt' is where you can edit the base calls of the read

'phd' is the original phred base calls

Notice that a red rectangle blinks (the 'cursor') in the corresponding

positions of the Aligned Reads Window and the Trace Window.

9) Try editing in the Trace Window. You can click the left mouse

button on a base in the 'edt' line to set the cursor (a blinking red

rectangle). You can directly overstrike a base by typing a letter.

Try this. Try undoing it (by clicking on 'undo' ). If you want to

undo more than one edit, you will have to go back to the main consed

window and click on the button labeled 'Undo Edit...'--you will learn

that later.

You can move left and right with the arrow keys.

We believe that the user should change a base call only while

examining the traces. That is why editing is done here--not in the

Aligned Reads Window.

10) You can insert a column of pads by pushing the space bar. Try

this. (You may need to click on a base on the 'edt' line first.)

(For those of you new to editing assemblies, a 'pad', which in consed

and phrap is represented by the '*' character, is used to align

two or more sequences such as these:

gttgacagtaatcta

gttgacataatcta

in which one sequence has an inserted or deleted base with respect to

the other. By inserting the pad character, it is possible to get a

good alignment:

gttgacagtaatcta

gttgaca*taatcta

This is the purpose of the pad character--it is just a placeholder.)

You can then overstrike a pad with a base. In this way you

can insert a base, and still preserve the alignment.
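
Because a pad is only a placeholder, removing the '*' characters from a padded sequence gives back the plain bases. The following sketch shows this for the example alignment above; strip_pads is a made-up helper name and is not part of consed.

```shell
# strip_pads SEQ: remove the '*' pad characters from a padded sequence.
# strip_pads is a hypothetical helper name.
strip_pads() {
    printf '%s\n' "$1" | tr -d '*'
}

padded='gttgaca*taatcta'
echo "padded:   $padded"
echo "unpadded: $(strip_pads "$padded")"
```

The unpadded result is gttgacataatcta, the second sequence as it was before the pad was inserted.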

11) Try highlighting a stretch of a read on the edt line by holding

down both mouse buttons and dragging the cursor over some bases.

They will turn yellow as you drag. Then release the mouse buttons. A

window will pop up giving you some choices of what to do with those

(yellow) bases:

Make High Quality--makes the highlighted bases edited high quality

(99). This tells phrap (when it reassembles) that you are

sure of the sequence here.

Change Consensus--make the highlighted bases edited high quality and

change the consensus to agree with that stretch of the read.

This is a directive to phrap (upon reassembly) to use that

stretch of that read to be the consensus.

Make low quality--makes the highlighted bases edited low quality.

This tells phrap (when it reassembles) that you are not sure

of the bases here and phrap can go ahead and make a join even

if the bases in this region don't match perfectly.

Make Low Quality to Left End--same as above, but all the way to

the left end of the read.

Make Low Quality to Right End--same as above, but all the way to

the right end of the read.

Change to n's--Change the highlighted bases to n's which means

they are unknown bases. This tells phrap (when it

reassembles) to not make any join based on these bases. It is

useful when you believe the bases may be in the chimeric

portion of a read.

Change to n's to left--same as above but to left end.

Change to n's to right--same as above but to right end.

Add Comment Tag--allows user to add a comment to a stretch of read

bases.

Add Tag--allows user to add any tag to a stretch of read bases.

Dismiss--you decided you don't really want to do anything with

this stretch of bases.

This popup is made so that nothing else works until you choose

something. Try each of these choices, except for tags, which you'll

try below.

'Change Consensus' has an additional function--if a read extends out

on the right beyond the end of the consensus, you can extend the

consensus by using this function. You might want to do this, for

example, if crossmatch did not correctly find the cloning site and

thus clipped too much. You can add these bases back to the consensus

by using 'Change Consensus'. (You can't try it with this dataset

since no read extends beyond the end of the consensus, but you may see

this phenomenon with your own data.)

12) To delete a base, overstrike it with a '*' character. (Phrap

ignores '*', so this is the same as deleting the character.) If you

overstrike all bases in a column with * characters so the entire

column consists of *'s (including the consensus base), there is no way

to remove the column. This is OK since when you export the

consensus (try the exercise on EXPORTING THE CONSENSUS), the

*'s are not exported. While you are editing in consed,

we believe there should be a visual indication that a base was

deleted.

SAVING THE ASSEMBLY

13) To save the assembly, pull down the 'File' menu on the Aligned

Reads Window, and release on 'Save assembly'. A box will pop up with

a suggested name. I suggest you always use the one it suggests. The

idea is that the ace files:

(project).fasta.screen.ace.1

(project).fasta.screen.ace.2

(project).fasta.screen.ace.3

(project).fasta.screen.ace.4

(project).fasta.screen.ace.5

are in order of how old they are. If you feel you are taking up too

much disk space, then start deleting the ace files starting at the

oldest. I do not recommend that you overwrite existing ace files.

The version numbers just keep growing, and that is not a problem.
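
When the version numbers reach two digits, a plain ls no longer lists the ace files in age order (ace.10 sorts before ace.2). The sketch below lists them by version number instead, so you can see which ones are oldest before deleting anything. list_ace_by_version is a made-up helper name, and the sketch assumes the file names end in .ace.(number) as shown above.

```shell
# list_ace_by_version DIR: print the ace files in DIR, oldest (lowest
# version number) first. list_ace_by_version is a hypothetical helper name.
list_ace_by_version() {
    ( cd "$1" || exit 1
      for f in *.ace.[0-9]*; do
          [ -e "$f" ] || continue            # no ace files at all
          printf '%s\t%s\n' "${f##*.}" "$f"  # version number, tab, file name
      done | sort -n | cut -f2- )
}

# Show the oldest and the newest version in the current directory.
list_ace_by_version . | head -n 1
list_ace_by_version . | tail -n 1
```

The function only prints names; it deliberately does not delete anything.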

EXPORTING THE CONSENSUS

14) Exporting the consensus. Bring the Aligned Reads Window into view

again. Hold down the left mouse button on the 'File' menu and

release the button on 'Export consensus sequence'. Notice that the

consensus will be stored (in this case) in a file called

'Contig1.fasta'. Click 'OK'. There is now a file in your edit_dir

directory called 'Contig1.fasta' that has the consensus sequence in

it. If you want to see the file, bring up another Xterm (if you are

UNIX literate), and type:

cd standard/edit_dir

more Contig1.fasta

15) Fancier exporting the consensus. Bring the Aligned Reads Window

into view again. Hold down the left mouse button on the 'File' menu

but this time release on 'Export consensus sequence (with

options)...'. Just export a little snip of the consensus, from 400

to 410. (You will notice this contains a pad * character.) Ask for

both the bases file and the quality file. Click 'OK'. Consed will

want to call this file 'Contig1.fasta' again. You can overwrite the

existing file.

Look in your other Xterm at these files:

more Contig1.fasta

more Contig1.fasta.qual

The one file contains the bases (but no * pads) and the other

contains the corresponding qualities of those bases.
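
Since each exported base should have one quality value, a quick sanity check is to count both. The sketch below assumes (this is not stated above) that both files start with a '>' header line and that the quality file holds whitespace-separated numbers; count_bases and count_quals are made-up helper names.

```shell
# count_bases FILE: number of sequence characters, ignoring '>' header lines.
count_bases() {
    grep -v '^>' "$1" | tr -d ' \t\r\n' | wc -c | tr -d ' '
}

# count_quals FILE: number of whitespace-separated quality values,
# ignoring '>' header lines.
count_quals() {
    grep -v '^>' "$1" | wc -w | tr -d ' '
}

# Only report if the exported files are present in this directory.
if [ -f Contig1.fasta ] && [ -f Contig1.fasta.qual ]; then
    echo "bases:     $(count_bases Contig1.fasta)"
    echo "qualities: $(count_quals Contig1.fasta.qual)"
fi
```

If the two numbers differ, the files probably came from different exports.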

16) Exporting the consensus of all contigs at once: Go to the Main

Consed Window. Point to 'File', hold down the left mouse button, and

release on 'Write all contigs to fasta file'. You then can choose a

filename for all contigs to be written to.

17) (For this step, first click on the 'Dim' menu and release on 'Dim

Nothing'.) Point to the 'Color' menu, hold down the left mouse button

and release on 'Color Means Edited and Tags'. Notice that the bases

that you have edited (make sure you have edited some bases) will stand

out in either white or grey (depending on whether the base was made

high quality or low quality). Observe this both in the Trace Window

and the Aligned Reads window. This colormode is useful if you are

interested in easily spotting which bases are edited.

Return to the 'Color Means Quality and Tags' colormode by the

following: point to the 'Color' menu, hold down the left mouse button

and release on 'Color Means Quality and Tags'.

FIND MAIN WINDOW

18) On the Aligned Reads window, click on 'Find Main Win'. This will

cause the Consed Main Window to pop up in the event you have buried it under

other windows or iconified it. (This may not work with some settings of

your X emulator. In that case you will have to find and click on the

Main Window to bring it up.)

MULTIPLE UNDO EDIT

19) Now that the Consed Main Window is visible, click the 'Undo Edit...'

button. There will be a popup indicating the most recent edit. Click

'undo'. Then you will see the edit that was done before that. Click

'undo'. You can continue undoing if you like. You now know how to

undo more than one edit. You cannot choose which edits to undo and

which to not undo--edits can only be undone in precisely reverse order

from the order you made them.

SCROLLING TRACES AND ALIGNED READS TOGETHER

20) In the Aligned Reads window, scroll along the contig to a

different point. Click the left mouse button on a read whose trace is

already up. Notice that the existing trace instantly scrolls to the

corresponding location. Now go to the Trace Window and scroll the

traces to a new location. Click on the edt line with the left mouse

button. You will notice that the Aligned Reads window will instantly

scroll to the corresponding location. Thus you can keep the Aligned

Reads window and the traces scrolled to the same location.

EXAMINING ALL TRACES

21) Go to a region where there are lots of reads, say base 1660. Push

down the right mouse button and release on 'Display traces for all

reads'. You will see all traces displayed in a scrolling window. You

can drag the scrollbar on the right down and up to see all the traces.

This feature is particularly useful for polymorphism/mutation

detection work. This feature was added to work in cooperation with

polyphred. To see it in action, exit consed.

CONSED-POLYPHRED INTERACTION

Polyphred is a program for finding polymorphic sites; it was developed by

Debbie Nickerson's group (contact them at http://droog.mbt.washington.edu).

We have a test database, 'polyphred', which has had polyphred run on

it already. Polyphred has put a polymorphism tag on each polymorphic

site.

Type:

cd ../../polyphred/edit_dir

../../consed_(computer type)

where (computer type) is one of solaris, hp, alpha, sgi, or linux.
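
If you are not sure which (computer type) applies, the uname command reports the operating system. The mapping below is a sketch based on the values uname -s commonly prints on these systems; it is an assumption, not something taken from the consed documentation, and consed_suffix is a made-up helper name. On syntom the answer is simply linux.

```shell
# consed_suffix OSNAME: map `uname -s` output to the (computer type) suffix.
# The mapping is an assumption based on common uname values; adjust as needed.
consed_suffix() {
    case "$1" in
        Linux)        echo linux ;;
        SunOS)        echo solaris ;;
        HP-UX)        echo hp ;;
        OSF1)         echo alpha ;;
        IRIX|IRIX64)  echo sgi ;;
        *)            echo "unknown system: $1" >&2; return 1 ;;
    esac
}

# Print the command to run on this machine, if the system is recognized.
if suffix=$(consed_suffix "$(uname -s)"); then
    echo "run: ../../consed_$suffix"
fi
```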

Double click on example2.fasta.screen.ace.1

When consed comes up, you should see 2 contigs.

Double click on Contig2

In the Aligned Reads Window, push the left mouse button while pointing

to the 'Navigate' menu and release on

'Toggle feature: when navigating to consensus location, pop up all

traces (currently off)'

That will turn this feature on.

Now push the left mouse button while pointing to the 'Navigate' menu

and release on 'Tags'. Up should pop a list of tag types. Double

click on 'polymorphism'. Polyphred has already been run so the

consensus is tagged with polymorphism tags at each polymorphic site.

Up will pop a window labelled 'Polymorphism Tags' with a list of

sites. Click on 'Next'.

If you correctly followed the instructions above, all the traces should

pop up at the first polymorphic site. You may want to reposition the

traces window to see it better.

Now ignore the original 'Polymorphism Tags' window and instead click

on 'Next' in the *traces* window. This will take you to the next

polymorphic site. Pretty nice, huh?

After you are done playing with this feature, exit consed and go back

to the previous database:

cd ../../standard/edit_dir

../../consed_(computer type)

Double click on standard.fasta.screen.ace.1

Double click on Contig1 to bring up the Aligned Reads Window again in

preparation for the next step.

NAVIGATING

22) In the Aligned Reads window, pull down the Navigate menu and

release on 'Low consensus quality'. You will see a list of locations.

Move the 'Low consensus quality' window down so you can see the

Aligned Reads window. Repeatedly click on 'Next' until you reach the

end of the list. (Low consensus quality means an area in which the

bases each have too high a probability of being wrong.) This saves you from

having to look through large amounts of high quality data trying to

find problem areas.

Alternatively, you can click on the 'Prev' and 'Next' buttons on the

Aligned Reads Window. Thus you can keep the Aligned Reads Window in

front with input focus and keep the Low consensus quality window

pushed out of the way.

You may want to click on the 'Save' button in the Low consensus

quality Window to save to a file a copy of this list of problem areas

as you work through them.

In our experience, this will be the most important navigate list you

will use. In fact, finishing consists mainly of adding reads and

rephrapping until this list is reduced to nothing.

23) Dismiss the Low consensus quality window. Pull down the

'Navigate' menu again and release on 'High quality discrepancies as

above, but omitting tagged compressions and G_dropouts'. You will

probably notice there are no entries (unless you created some yourself

by editing). That is because there are no high quality discrepancies

with this dataset. So let's force there to be some by lowering the

quality threshold. First, dismiss the High quality discrepancies

window.

Click on 'Find Main Win'. In the main consed window, pull down the

'Options' menu and release on 'General Preferences'. Notice that the

default for 'Threshold for High Quality Discrepancy' is 40. Change it

to 15 and click 'Apply & Dismiss'.

Then follow the steps above to bring up the High quality discrepancies

menu. Now you will see several entries. Click 'next' repeatedly to

go successively to the next high quality discrepancy in the Aligned

Reads Window.

You can also double click on a particular line in the High quality

discrepancies window to go to that location. Alternatively, you can

single click on a line and then click the 'Go' button.

Dismiss the High quality discrepancies window.

24) Similarly, try the other navigate lists: Unaligned high quality

regions (this list will be empty with this data set), Edits, Regions

covered by only 1 strand and only 1 chemistry, and Regions covered by only 1

subclone.

Unaligned high quality regions are regions in which the traces are

high quality so there is no question of the bases, but the region

differs so much from other reads that phrap has given up trying to

align the region with the consensus. This could be due to a chimeric

read, or perhaps the read belongs somewhere else.

We believe that regions covered by only 1 subclone should be covered

by a 2nd subclone to prevent the possibility of there being a deletion

in the single subclone.

There are so many different problem lists that you may forget to check

one of them and thus miss a serious problem. Thus we combined them

all into a single list. This is the first menu item: 'Low Cons/High

Qual Discrep/Single Stranded/Single Subclone/Unaligned High'. We

suggest you use this list.

25) Also try navigating by tags by selecting 'Tags' under 'Navigate': when

the Select Tag Type Window appears, double click on 'compression'.

(Note that you can't do anything else until you deal with this

window.) This gives a list of a particular tag type in a particular

contig.

26) There is also a way of getting a list of a particular tag type in

all contigs: Click on 'Find Main Win'. In the Main Consed Window,

point to the 'Navigate' menu, hold down the left mouse button, and

release on 'Tags in all contigs'. Continue as in the previous step.

PRIMER-PICKING

**** Temporary step ****

After you have completed the 'install vector files' step (below), you

should never do this.

Click on 'Find Main Win'. On the Main Window, open the Options menu,

and release on 'Primer Picking Preferences'. Notice the question

'Screen Primers Against Sequences in File?' (If you have trouble

finding this question, scroll the Primer Picking Preferences list

down. It is between 'PrimersNumberOfTemplatesToDisplayInFront' and

'Pick subclone templates for primers?') Click on 'False'. Then click

'Apply & Dismiss' and the Primer Picking Preferences box will pop

down.

(In real use, 'Screen Primers Against Sequences in File?' should be

set to 'True'. I have had you set it to False just this once so you

can go ahead and see how this is supposed to work until your system

administrator has time to correctly install the vector sequences file.)

**** end of temporary step ****

27) Go to some location near the right end of the contig, say base

2570. Click with the right mouse button on the consensus and click on

either one of the top strand primer choices (either from subclone

template or from clone template). Consed will pause a moment, and

then there will appear a selection of primers that pass all of

consed's requirements. Templates are also chosen for each primer.

You may have to scroll the primer list to the right to see the

templates. Consed lists these templates in order of quality--all of

them will cover the read you want to make.

Double click on one of the primers in the Primers Window. That will

cause the Aligned Reads Window to scroll to show that oligo in

context. Click on 'Accept Primer'. Notice that a yellow oligo tag is

created on the consensus for that primer. That tag contains all the

information you need to order that oligo and do the reaction--you will

learn how to pop it up below under 'tags'.

What is the difference between 'Pick Primer from Subclone Template'

and 'Pick Primer from Clone Template'?

There are 3 differences:

A. which vector file the primers are screened against. In the former

case, the primer is screened against the file primerSubcloneScreen.seq

and in the latter case against the file primerCloneScreen.seq

B. In checking for false matches elsewhere in the assembly, if the

template is the whole clone, then consed must check for false matches

in the *entire* assembly, including all other contigs. But if the

template is just going to be a subclone, consed only needs to check

elsewhere in that subclone. Actually, to be conservative, consed

checks for false matches +/- the maximum insert size of a subclone.

C. If you are picking primers for subclone template, then the primer

picker can also pick the subclone templates. If it doesn't find any

suitable subclone template, it will reject the primer. (By default,

picking of subclone templates is turned off. You can turn it on

temporarily or permanently. To turn it on temporarily, go to the

Consed Main Window, point to the Options menu, hold down the left

mouse button and release on 'Primer Picking Preferences'. Scroll down

to 'Pick Subclone Templates for Primers' and click 'True'. Click on

'Apply and Dismiss'. To change this permanently, see CONSED

CUSTOMIZATION below. Beware: you must correctly customize

determineReadTypes.perl for template picking to work. See INSTALLING

CONSED below.)
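
To make difference B concrete, here is a minimal Python sketch of the
region that would be screened for false matches when the template is a
subclone. This is not consed's code. The function name, the 1-based
inclusive coordinates, and the insert sizes in the example are all
assumptions made for illustration; this guide only states that consed
checks +/- the maximum insert size of a subclone.

```python
def false_match_window(primer_start, primer_end, max_insert_size, contig_length):
    """Return the (left, right) consensus range to screen for false matches.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    The window extends max_insert_size bases beyond each end of the primer
    and is clipped to the ends of the contig.
    """
    left = max(1, primer_start - max_insert_size)
    right = min(contig_length, primer_end + max_insert_size)
    return left, right


# Hypothetical primer at 2570-2590 in a 4000-base contig.
print(false_match_window(2570, 2590, 500, 4000))   # (2070, 3090)
print(false_match_window(2570, 2590, 3000, 4000))  # (1, 4000)
```

For a whole-clone template no such window applies, since the entire
assembly (all contigs) has to be checked.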

If you are interested in the details of primer-picking, see the

section 'PRIMER PARAMETERS' (below).

When you are done editing and have saved the assembly and exited

consed, run ace2Oligos.perl (supplied with this distribution--make

sure your system administrator installed it), which will extract all

the oligos you just created. This is handy for email ordering of

oligos.

In the xterm, type:

ace2Oligos.perl standard.fasta.screen.ace.2 oligos.txt

where standard.fasta.screen.ace.2 is whatever the name is of the ace

file you just saved.

SEARCH FOR STRING

28) Try the 'Search for String' button (left side of the Aligned Reads

Window). Type in a string (such as aaaca), and click 'OK'. There

should be a list of 'hits'. Double click on one of the hits (or

single click on it and click on 'go'.) Notice that the Aligned Reads

Window scrolls to that position and has the cursor on the found

string. (It might be complemented.)

Dismiss this window. Try this again, only this time in the Search For

String Window select 'Search Just Reads'. Then click 'OK'. You will

notice there are many more hits. This is because this shows hits in

each read, even if they are at the same consensus position.
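
If you are curious what a search like this involves, the following is a
minimal Python sketch, not consed's actual implementation. It looks for
a query on both strands of a single sequence, ignoring case, and reports
1-based start and end positions. The function names and the hit format
are assumptions made for illustration, and pads in the consensus are not
handled.

```python
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq):
    """Reverse-complement a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hits(sequence, query):
    """Return sorted (start, end, orientation) hits, 1-based and inclusive."""
    seq = sequence.lower()
    q = query.lower()
    if not q:
        return []
    patterns = [(q, "uncomplemented")]
    rc = reverse_complement(q)
    if rc != q:  # a palindromic query would otherwise be reported twice
        patterns.append((rc, "complemented"))
    hits = []
    for pattern, orientation in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, start + len(pattern), orientation))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


print(find_hits("ttAAACAgggtgttt", "aaaca"))
# [(3, 7, 'uncomplemented'), (11, 15, 'complemented')]
```

Running the same function over every read, instead of over the consensus
alone, would report one hit per read, which is why 'Search Just Reads'
gives a longer list.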

COPY AND PASTE

29) In the Aligned Reads Window, swipe some bases by holding down the

left mouse button. You should see the bases turn yellow, at least

temporarily. Then click the 'Search for String' button. Use the

middle mouse button to paste the bases you have just swiped into the

'Query string:' box. Notice that you can swipe bases either from the

consensus or from a read.

The search for string is case-insensitive so don't worry about the

pasting being upper or lowercase.

CORRECTING FALSE JOINS MADE BY PHRAP

30) Phrap may put several reads together that you believe do not belong

together. (For example, you may see several high quality

discrepancies between the reads.) If you are sure these reads do not

belong together, you can force a subsequent reassembly by phrap to not

assemble those reads together. You do this by finding a location

where there is a high quality discrepancy. Then click on the read

with the right mouse button and release on 'Tell phrap not to overlap

reads discrepant at this location'. There are no high quality

discrepancies with this dataset so consed won't let you do this.

(Try it and see.) However, when you use your own data, you may get

the chance!

ADDING READS

31) For this to work, your system administrator must have set up

everything correctly. (See INSTALLING CONSED below.) Assuming

everything has been set up correctly, you can now experiment with adding

reads.

Now bring up consed again using ace file standard.fasta.screen.ace.1

If it asks if you want to apply edits, just say 'no'.

On the Main Window, click on the Add New Reads button. There will

appear a list of files ending with .fof. These are files that contain

lists of chromatograms. Double click on 'reads_to_add.fof'. There

should be lots of progress output in the xterm from which you started

consed. When it completes, there will be a Reads Added Window popup

with a report of which reads were added. In this case, it should say

that 9 reads were successfully added and list them.
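
As a rough illustration of what a .fof file holds, here is a small Python
sketch that reads such a list. It assumes one chromatogram file name per
line with blank lines ignored; this guide only says that these files
contain lists of chromatograms, so treat the layout, the function name,
and the example read names as assumptions.

```python
def parse_fof(text):
    """Return the file names listed in the text of a .fof file.

    Assumes one name per line; surrounding whitespace and blank lines
    are dropped.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical contents of a .fof file.
example = "read_a.s1\nread_b.s1\n\nread_c.s1\n"
print(parse_fof(example))  # ['read_a.s1', 'read_b.s1', 'read_c.s1']
```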

TEARS AND JOINS

32) When phrap really screws up, you may want to just tear the contig

apart in several places and then join the pieces back together in a

different way. Although we discourage you from doing this, we do give

you the power to do it, if you want to. Let's try it:

Go to location 1550. Point the mouse at the consensus base at 1550

and push the right mouse button down. Release the button on 'Tear

Contig at This Consensus Position'. Up will pop a list of reads with

2 little buttons next to them <- and ->. Leave everything as it is

and just click 'Do Tear'. (If you want to play around with which

reads go into which contig, do that another time.)

Now you should have 2 Aligned Reads Windows on top of each other. One

should contain 'Contig2' and the other 'Contig3'.

Now let's join these 2 contigs back together:

Click on 'Search for String' and type in the following bases:

agctgccatc

Click 'OK'.

Search for string should find 2 locations, one in Contig2 and one in

Contig3:

Contig2 (consensus) 1447-1456 (uncomplemented)

Contig3 (consensus) 829-838 (uncomplemented)

Double click on the first one. The Aligned Reads Window for Contig2

will scroll to location 1447 and the window will raise up. In that

Aligned Reads Window, click on 'Compare Cont'.

Now double click on the 'Contig3' line in the above Search for String

results. The Aligned Reads Window for Contig3 will scroll to location

829 and lift up. In that Aligned Reads Window, click on 'Compare

Cont'.

Now the Compare Contigs Window should be visible. In the Compare

Contigs Window, try scrolling back and forth. You can change the

cursors (blinking red), but if you do, please return them to the

locations 1447 and 829 for the next step. The cursors 'pin' these

bases together when doing an alignment. (The algorithm is a pinned

Smith-Waterman alignment.)
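
To give a feel for what 'pinning' means, here is a simplified Python
sketch. It is not consed's algorithm. It forces one base of each
sequence to be aligned with the other, then extends the alignment
outward from that pair in both directions, stopping on each side where
the score peaks. The scoring values, the function names, and the 0-based
pin positions are assumptions chosen for illustration.

```python
def extend(a, b, match=1, mismatch=-2, gap=-3):
    """Align the starts of a and b, stopping where the score peaks.

    Returns (aligned_a, aligned_b), with '-' marking gaps.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    best, best_i, best_j = 0, 0, 0
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_i, best_j = score[i][j], i, j

    # Trace back from the best-scoring cell to the origin.
    i, j = best_i, best_j
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def pinned_align(a, b, pin_a, pin_b):
    """Align a and b so that a[pin_a] is paired with b[pin_b] (0-based).

    Returns (aligned_a, aligned_b, pin_column).
    """
    # Extend leftward by aligning the reversed prefixes, then un-reverse.
    left_a, left_b = extend(a[:pin_a][::-1], b[:pin_b][::-1])
    right_a, right_b = extend(a[pin_a + 1:], b[pin_b + 1:])
    aligned_a = left_a[::-1] + a[pin_a] + right_a
    aligned_b = left_b[::-1] + b[pin_b] + right_b
    return aligned_a, aligned_b, len(left_a)


print(pinned_align("TTAGCTGCCATCGG", "CAGCTGCCATCAA", 6, 5))
# ('AGCTGCCATC', 'AGCTGCCATC', 4)
```

Moving the pin to a different pair of bases can change the alignment
completely, which is why the cursor positions matter before you click
'Align'.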

Click on Align. Try scrolling the alignment by dragging the thumb in

the lower half of the Compare Contigs Window. An 'X' means there is a

discrepancy between the 2 contigs. There is also a 'P' (see if you

can find it!) The P indicates the bases that you pinned together.
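
The 'X' and 'P' markers can be thought of as a line computed column by
column from the two aligned sequences. The Python sketch below shows one
way to do that. The function name is hypothetical, and it assumes that
matching columns are left blank, that case is ignored, and that the
pinned column shows 'P' even if its bases differ; consed's actual display
rules may differ.

```python
def marker_line(aligned_a, aligned_b, pin_column):
    """Build a marker string for two aligned sequences of equal length.

    'P' marks the pinned column (0-based), 'X' marks a discrepancy,
    and a space marks agreement.
    """
    markers = []
    for column, (base_a, base_b) in enumerate(zip(aligned_a, aligned_b)):
        if column == pin_column:
            markers.append("P")
        elif base_a.lower() != base_b.lower():
            markers.append("X")
        else:
            markers.append(" ")
    return "".join(markers)


top = "agctgcc"
bottom = "agatgcc"
print(top)
print(marker_line(top, bottom, 4))  # "  X P  "
print(bottom)
```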

Click with the left mouse button on either contig in the bottom

alignment. You will notice that both contigs will have the red

blinking cursor in the same position. Click on 'Scroll Both Aligned

Reads Windows' and look at the Aligned Reads Windows to see that they

scroll to the corresponding positions. You can have traces up for the

contigs, and they will scroll as well. Experiment with this. Then

click 'Join'. The 2 previous Aligned Reads Windows will disappear and

there will be a new one which has a new contig 'Contig4'. You have

made a join!

It is possible to have more than one Compare Contigs window up at a

time. This allows you to investigate a repeat that has more than 2 copies.

Compare Contigs is one method of exploring joins of contigs that were

not made by phrap. Another method is to use phrapview, supplied with

phrap. phrapview gives a high level view of all internal joins while

'compare contigs' shows the alignment of a single internal join. Some

users have found them to work well together--phrapview to find a join

and, having found it, 'compare contigs' to examine it in more detail.

REMOVING READS

33) You can also remove individual reads and put them into their own

contigs. For example, in the Aligned Reads Window, go to location

2000. Point to the read name of read djs74_2664.s1 and hold down the

right mouse button. Release on 'Put read djs74_2664.s1 into its own

contig.' Consed will ask you 'Are you sure...?' Answer 'yes'.

Presto-chango! The read is put into its own contig and the old

contig is redrawn without the read in it. At this point you should

save the assembly--you should always save the assembly after removing

a read.