Multiple alignment program - interface to
ClustalW program
qualifiers for parameter: sequence
Name of old dendrogram file
Protein pairwise alignment matrix options
The scoring table which describes the similarity of each
amino acid to each other.
There are three 'in-built' series of weight matrices offered.
Each consists of several matrices which work differently at
different evolutionary distances. To see the exact details, read
the documentation. Crudely, we store several matrices in
memory, spanning the full range of amino acid distance (from
almost identical sequences to highly divergent ones). For very
similar sequences, it is best to use a strict weight matrix
which only gives a high score to identities and the most
favoured conservative substitutions. For more divergent
sequences, it is appropriate to use 'softer' matrices which give
a high score to many other frequent substitutions.
1) BLOSUM (Henikoff). These matrices appear to be the best
available for carrying out database similarity (homology)
searches. The matrices used are: Blosum80, 62, 45 and 30.
2) PAM (Dayhoff). These have been extremely widely used since
the late '70s. We use the PAM 120, 160, 250 and 350 matrices.
3) GONNET. These matrices were derived using almost the same
procedure as the Dayhoff one (above) but are much more up to
date and are based on a far larger data set. They appear to be
more sensitive than the Dayhoff series. We use the GONNET 40,
80, 120, 160, 250 and 350 matrices.
We also supply an identity matrix which gives a score of 1.0 to
two identical amino acids and a score of zero otherwise. This
matrix is not very useful.
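As a rough illustration of how a series can be used, the sketch below picks one matrix from the BLOSUM series according to the percent identity of the sequences being aligned. The threshold values and the function name pick_matrix are assumptions made for this example; the actual cut-offs are given in the ClustalW documentation.

```python
# Hypothetical identity thresholds (percent); not the real ClustalW cut-offs.
BLOSUM_SERIES = [
    (80.0, "Blosum80"),  # strict matrix for near-identical sequences
    (60.0, "Blosum62"),
    (30.0, "Blosum45"),
    (0.0, "Blosum30"),   # soft matrix for highly divergent sequences
]


def pick_matrix(percent_identity, series=BLOSUM_SERIES):
    """Return the first matrix whose threshold the identity reaches."""
    for threshold, name in series:
        if percent_identity >= threshold:
            return name
    return series[-1][1]


print(pick_matrix(92.0))
print(pick_matrix(45.0))
```

The more similar the sequences, the stricter the matrix that is selected; divergent pairs fall through to the softer matrices at the end of the list.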
DNA pairwise alignment matrix options
The scoring table which describes the scores assigned to
matches and mismatches (including IUB ambiguity codes).
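To make the role of ambiguity codes concrete, the sketch below treats two nucleotide symbols as a match when the sets of bases they stand for overlap. The code table follows the standard IUB/IUPAC definitions; the match and mismatch values and the function names are assumptions for this example and are not the scores the program uses.

```python
# Standard IUB/IUPAC nucleotide codes mapped to the bases they represent.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(x, y):
    """True when the two codes share at least one possible base."""
    return bool(set(IUB[x.upper()]) & set(IUB[y.upper()]))


def score(x, y, match=1.0, mismatch=0.0):
    """Hypothetical scoring: match if compatible, otherwise mismatch."""
    return match if compatible(x, y) else mismatch


print(score("R", "A"))  # R stands for A or G
print(score("R", "Y"))  # purine against pyrimidine
```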
Filename of user pairwise matrix
Protein multiple alignment matrix options
This gives a menu where you are offered a choice of
weight matrices. The default for proteins is the PAM series
derived by Gonnet and colleagues. Note, a series is used! The
actual matrix that is used depends on how similar the sequences
to be aligned at this alignment step are. Different matrices
work differently at each evolutionary distance.
There are three 'in-built' series of weight matrices offered.
Each consists of several matrices which work differently at
different evolutionary distances. To see the exact details, read
the documentation. Crudely, we store several matrices in
memory, spanning the full range of amino acid distance (from
almost identical sequences to highly divergent ones). For very
similar sequences, it is best to use a strict weight matrix
which only gives a high score to identities and the most
favoured conservative substitutions. For more divergent
sequences, it is appropriate to use 'softer' matrices which give
a high score to many other frequent substitutions.
1) BLOSUM (Henikoff). These matrices appear to be the best
available for carrying out database similarity (homology)
searches. The matrices used are: Blosum80, 62, 45 and 30.
2) PAM (Dayhoff). These have been extremely widely used since
the late '70s. We use the PAM 120, 160, 250 and 350 matrices.
3) GONNET. These matrices were derived using almost the same
procedure as the Dayhoff one (above) but are much more up to
date and are based on a far larger data set. They appear to be
more sensitive than the Dayhoff series. We use the GONNET 40,
80, 120, 160, 250 and 350 matrices.
We also supply an identity matrix which gives a score of 1.0 to
two identical amino acids and a score of zero otherwise. This
matrix is not very useful. Alternatively, you can read in your
own (just one matrix, not a series).
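The identity matrix mentioned above can be sketched in a few lines, which also shows why it is of limited use: every substitution scores zero, however conservative it is. The alphabet constant and the helper names are assumptions for this example.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # the 20 standard residues


def identity_matrix(alphabet=AMINO_ACIDS):
    """Score 1.0 for identical residues and 0.0 for every other pair."""
    return {(a, b): 1.0 if a == b else 0.0
            for a in alphabet for b in alphabet}


def ungapped_score(seq1, seq2, matrix):
    """Sum the matrix scores over aligned positions (no gaps handled)."""
    return sum(matrix[(a, b)] for a, b in zip(seq1, seq2))


matrix = identity_matrix()
# I/L is a conservative substitution but contributes nothing here.
print(ungapped_score("ACDI", "ACDL", matrix))
```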
Nucleotide multiple alignment matrix options
This gives a menu where a
single matrix (not a series) can be selected.
Filename of user multiple alignment matrix
Slow pairwise alignment: gap opening
penalty
The penalty for opening a gap in the pairwise
alignments.
Slow pairwise alignment: gap extension
penalty
The penalty for extending a gap by 1 residue in the
pairwise alignments.
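The two pairwise penalties are commonly combined into an affine gap cost. The sketch below assumes the cost of a gap is the opening penalty plus the extension penalty for each gapped residue; that exact formula, the function name and the penalty values are assumptions for this example.

```python
def gap_cost(length, gap_open, gap_extend):
    """Affine cost of one gap: opening penalty plus per-residue extension."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


# One gap of 4 residues versus two gaps of 2 residues (hypothetical values).
one_long = gap_cost(4, gap_open=10.0, gap_extend=0.5)
two_short = 2 * gap_cost(2, gap_open=10.0, gap_extend=0.5)
print(one_long, two_short)
```

With a large opening penalty relative to the extension penalty, a single long gap costs less than several short ones covering the same number of residues.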
Fast pairwise alignment: similarity scores:
K-Tuple size
This is the size of the exactly matching fragment that is
used. Increase for speed (maximum 2 for proteins, 4 for DNA);
decrease for sensitivity. For longer sequences (e.g. >1000
residues) you may need to increase the default.
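The effect of the k-tuple size can be seen in a small sketch that lists every position pair at which two sequences share an exact fragment of length k. This is only an illustration of k-tuple matching in general, not the program's own implementation; the function names are assumptions.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map each k-tuple in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def ktuple_matches(seq1, seq2, k):
    """Return (i, j) pairs where seq1[i:i+k] equals seq2[j:j+k]."""
    index = ktuple_index(seq2, k)
    return [(i, j)
            for i in range(len(seq1) - k + 1)
            for j in index.get(seq1[i:i + k], [])]


print(len(ktuple_matches("ACGT", "CGTA", 1)))
print(ktuple_matches("ACGT", "CGTA", 2))
```

A larger k yields fewer matches to process (faster), but short regions of similarity can be missed (less sensitive).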
Fast pairwise alignment: similarity scores:
gap penalty
This is a penalty for each gap in the fast alignments. It
has little effect on the speed or sensitivity except for
extreme values.
Fast pairwise alignment: similarity scores:
number of diagonals to be considered
The number of k-tuple matches on each diagonal (in an
imaginary dot-matrix plot) is calculated. Only the best ones
(with most matches) are used in the alignment. This parameter
specifies how many. Decrease for speed; increase for
sensitivity.
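A minimal sketch of the diagonal selection described above: each k-tuple match at positions (i, j) lies on diagonal i - j, the matches per diagonal are counted, and the top few diagonals are kept. The tie-breaking rule and the function name are assumptions for this example.

```python
from collections import Counter


def best_diagonals(matches, top_n):
    """Return the top_n diagonals (i - j) holding the most k-tuple matches."""
    counts = Counter(i - j for i, j in matches)
    # Most matches first; ties broken by diagonal number (an arbitrary choice).
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return ranked[:top_n]


# Hypothetical match positions: three on diagonal 1, one each on -3 and 0.
matches = [(1, 0), (2, 1), (3, 2), (0, 3), (5, 5)]
print(best_diagonals(matches, 1))
print(best_diagonals(matches, 2))
```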
Fast pairwise alignment: similarity scores:
diagonal window size
This is the number of diagonals around each of the 'best'
diagonals that will be used. Decrease for speed; increase for
sensitivity.
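The window can be pictured as a band around each selected diagonal. The sketch below assumes the window size counts diagonals on each side of a 'best' diagonal; that interpretation and the function name are assumptions for this example.

```python
def diagonals_in_window(best, window):
    """Return every diagonal within `window` of any of the best diagonals."""
    allowed = set()
    for d in best:
        allowed.update(range(d - window, d + window + 1))
    return sorted(allowed)


print(diagonals_in_window([1, 10], 2))
```

A wider window lets the alignment stray further from the best diagonals, which costs time but tolerates more insertions and deletions.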
Fast pairwise alignment: similarity scores:
suppresses percentage score
Multiple alignment: Gap opening penalty
The penalty for opening a gap in the alignment.
Increasing the gap opening penalty will make gaps less
frequent.
Multiple alignment: Gap extension penalty
The penalty for extending a gap by 1 residue. Increasing
the gap extension penalty will make gaps shorter. Terminal gaps
are not penalised.
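The statement that terminal gaps are not penalised can be illustrated by scoring the gaps in one aligned row while ignoring those at either end. The penalty values, the per-gap formula (opening plus extension per residue) and the function name are assumptions for this example.

```python
import re


def internal_gap_penalty(row, gap_open, gap_extend, gap_char="-"):
    """Sum penalties over internal gap runs; leading/trailing gaps are free."""
    core = row.strip(gap_char)  # drop terminal gaps before scoring
    total = 0.0
    for run in re.finditer(re.escape(gap_char) + "+", core):
        total += gap_open + gap_extend * len(run.group())
    return total


# Two internal gaps (lengths 3 and 1); the end gaps contribute nothing.
print(internal_gap_penalty("--AC---GT-A--", gap_open=10.0, gap_extend=0.5))
```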
Use end gap separation penalty
End gap separation: treats end gaps just like internal
gaps for the purposes of avoiding gaps that are too close (set
by 'gap separation distance'). If you turn this off, end gaps
will be ignored for this purpose. This is useful when you wish
to align fragments where the end gaps are not biologically
meaningful.
Gap separation distance
Gap separation distance: tries to decrease the chances
of gaps being too close to each other. Gaps that are less than
this distance apart are penalised more than other gaps. This
does not prevent close gaps; it makes them less frequent,
promoting a block-like appearance of the alignment.
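As an illustration of the distance test, the sketch below reports neighbouring gaps that lie closer together than the separation distance. How the extra penalty is then computed is not shown; the function name, the use of gap start positions and the example values are assumptions.

```python
def close_gap_pairs(gap_positions, min_distance):
    """Return neighbouring gap positions that are closer than min_distance."""
    positions = sorted(gap_positions)
    return [(a, b)
            for a, b in zip(positions, positions[1:])
            if b - a < min_distance]


# Hypothetical gap start columns in an alignment.
print(close_gap_pairs([3, 6, 20, 40, 43], min_distance=4))
```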
No residue specific gaps
Residue specific penalties: amino acid specific gap
penalties that reduce or increase the gap opening penalties at
each position in the alignment or sequence. As an example,
positions that are rich in glycine are more likely to have an
adjacent gap than positions that are rich in valine.
List of hydrophilic residues
This is a set of the residues 'considered' to be
hydrophilic. It is used when introducing Hydrophilic gap
penalties.
No hydrophilic gaps
Hydrophilic gap penalties: used to increase the
chances of a gap within a run (5 or more residues) of
hydrophilic amino acids; these are likely to be loop or random
coil regions where gaps are more common. The residues that are
'considered' to be hydrophilic are set by '-hgapres'.
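Locating the runs in question can be sketched as a search for stretches of 5 or more residues drawn from the hydrophilic set. The residue list used as the default argument below is an assumption (check the value actually passed as '-hgapres'), as is the function name.

```python
import re


def hydrophilic_runs(seq, hydrophilic="GPSNDQEKR", min_run=5):
    """Return (start, end) spans of runs of hydrophilic residues.

    The default residue set is an assumed example, not a confirmed default.
    """
    pattern = "[" + re.escape(hydrophilic) + "]{" + str(min_run) + ",}"
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq)]


# Residues 4..10 (GSPDEKN) form one run; the final "GS" is too short.
print(hydrophilic_runs("MVLAGSPDEKNWWILVGS"))
```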
Cut-off to delay the alignment of the most
divergent sequences
This switch delays the alignment of the most distantly
related sequences until after the most closely related sequences
have been aligned. The setting shows the percent identity level
required to delay the addition of a sequence; sequences that are
less identical than this level to any other sequences will be
aligned later.
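One way to read this rule is that a sequence is delayed when its best percent identity to every other sequence falls below the cut-off. The sketch below implements that reading; the interpretation, the data layout, the function name and the example values are all assumptions.

```python
def delayed_sequences(identity, cutoff):
    """Return names whose best identity to any other sequence is below cutoff.

    identity maps each sequence name to {other name: percent identity}.
    """
    delayed = []
    for name, row in identity.items():
        best = max(pct for other, pct in row.items() if other != name)
        if best < cutoff:
            delayed.append(name)
    return delayed


# Hypothetical pairwise percent identities for three sequences.
identity = {
    "a": {"b": 85.0, "c": 20.0},
    "b": {"a": 85.0, "c": 25.0},
    "c": {"a": 20.0, "b": 25.0},
}
print(delayed_sequences(identity, cutoff=30.0))
```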
Format of the output sequence (outseq)
Start the web service and receive the result. Blocks until job is finished.
Start the web service and receive a job-id. Returns immediately.
Wait until a job (by job-id) has finished. Blocks until job is finished.
Get status information about a running job. Returns immediately.
Get the results of a job (by job-id)
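The difference between the blocking and non-blocking operations listed above can be sketched with an in-memory stand-in. FakeJobService, its method names, the status strings and the result layout are all hypothetical and exist only for this illustration; they are not the real service client.

```python
import itertools


class FakeJobService:
    """Hypothetical in-memory stand-in; not the real web service client."""

    def __init__(self, ticks_to_finish=2):
        self._ids = itertools.count(1)
        self._jobs = {}
        self._ticks = ticks_to_finish

    def run(self, inputs):
        """Start a job and return its job-id immediately."""
        job_id = "job-%d" % next(self._ids)
        self._jobs[job_id] = {"inputs": inputs, "remaining": self._ticks}
        return job_id

    def get_status(self, job_id):
        """Return status at once; each call advances the simulated job."""
        job = self._jobs[job_id]
        if job["remaining"] > 0:
            job["remaining"] -= 1
            return "RUNNING"
        return "COMPLETED"

    def wait_for(self, job_id):
        """Block until the job has finished."""
        while self.get_status(job_id) != "COMPLETED":
            pass

    def get_results(self, job_id):
        """Return the results of a finished job."""
        job = self._jobs[job_id]
        if job["remaining"] > 0:
            raise RuntimeError("job not finished: " + job_id)
        return {"outseq": "aligned:" + job["inputs"]["sequence"]}

    def run_and_wait_for(self, inputs):
        """Start a job, block until it finishes, and return the results."""
        job_id = self.run(inputs)
        self.wait_for(job_id)
        return self.get_results(job_id)


service = FakeJobService()
job_id = service.run({"sequence": "x"})   # returns immediately
print(service.get_status(job_id))          # non-blocking status check
service.wait_for(job_id)                   # blocks until finished
print(service.get_results(job_id))
```

The non-blocking calls suit clients that submit many jobs and poll them later; the combined call is simpler when a single result is needed straight away.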