Multiple Sequence Alignment
Multiple sequence alignments and alignment parameters
1. Copy the Plasmodium falciparum erythrocyte membrane protein (pfemp) sequence fragments (which are in fasta format) from this link and paste them into a new file on your computer.
2. Copy the sequence fragments of kinetoplastid membrane proteins from Leishmania (which are also in fasta format) from this link into a new file.
3. Using a text editor (or the sequence alignment editor BioEdit if you prefer), align each of the two sets of sequences by hand. Which alignment was harder, and why?
4. Align the sequences using ClustalX with the default parameters and compare the results with the alignments you produced by hand. Which alignments do you think are better: yours or Clustal's?
5. Test the effect of parameter adjustments on the alignments of the fragments of both proteins by repeating the alignments in ClustalX with different multiple alignment parameters. First set a large gap opening penalty (e.g. 90) and a small (or even zero) gap extension penalty; then set a large gap extension penalty and a small gap opening penalty. Also try small gap opening penalties (< 1) together with small gap extension penalties (< 0.1). Note any changes in the blocks at the bottom of the screen showing the quality of the alignment.
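To build some intuition for step 5 before trying it in ClustalX, the following Python sketch compares the total penalty for one long gap against several short gaps under a simple affine model. It is only an illustration: the function name, the example gap lengths and the formula (opening penalty plus extension penalty times gap length) are assumptions made here, and ClustalX adjusts its penalties in ways this sketch ignores.

```python
def affine_gap_cost(gap_lengths, gap_open, gap_extend):
    """Total penalty for a set of gaps under a simple affine model.

    Assumption: each gap costs gap_open + gap_extend * length.
    """
    return sum(gap_open + gap_extend * length for length in gap_lengths)


# Two ways of placing six gap characters in one sequence.
one_long_gap = [6]                    # a single gap of length 6
six_short_gaps = [1, 1, 1, 1, 1, 1]   # six separate gaps of length 1

# Penalty settings similar to those suggested in step 5
# (the exact numbers are illustrative).
settings = {
    "large opening, zero extension": (90.0, 0.0),
    "small opening, large extension": (0.5, 10.0),
    "small opening, small extension": (0.5, 0.05),
}

for label, (gap_open, gap_extend) in settings.items():
    long_cost = affine_gap_cost(one_long_gap, gap_open, gap_extend)
    short_cost = affine_gap_cost(six_short_gaps, gap_open, gap_extend)
    print(f"{label}: one long gap = {long_cost:.2f}, "
          f"six short gaps = {short_cost:.2f}")
```

With a large opening penalty, scattering gaps is far more expensive than opening one long gap, so you would expect fewer, longer gaps. With a small opening penalty the two cases cost almost the same, and when both penalties are small, gaps of any kind become cheap. Compare these expectations with what you actually see in ClustalX.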
Profile alignments
1. Imagine that you are an HIV researcher. You have spent the last several months sequencing a portion of the gag gene from 158 samples which you have collected from patients in your country. You have aligned these sequences using ClustalX and carefully edited the alignment with BioEdit. This is a large number of sequences to work with, and the alignment has taken you quite some time to produce, but you are very proud of your efforts. A new sample turns up, and after sequencing it you would like to add it to your alignment, but you really don't want to start from scratch. You can use the profile alignment method in ClustalX to add the additional sequence without disturbing the rest of the alignment, as follows:
2. Copy the single HIV gag sequence from here and save it on your computer. This is the sequence that you want to add to the alignment above (without disturbing the alignment).
3. Open ClustalX and switch to Profile Alignment Mode.
4. Load the HIV alignment as profile one (under the file menu).
5. Load the extra sequence as profile two.
6. To add the extra sequence to the alignment, select the 'align sequences to profile one' option under the alignment menu. Take note of the name of the file where the results will be stored.
Note: ClustalX does not provide a good way of viewing the alignments produced in profile mode. You should open the resulting alignment in BioEdit.
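If you want to confirm that the profile alignment really did leave your original alignment undisturbed, a check along the following lines may help. This is a minimal sketch, not part of ClustalX or BioEdit: it assumes that both the original alignment and the new one have been saved in fasta format, that gaps are written as "-", and that sequence names are unchanged. The function names and the toy sequences are made up for illustration.

```python
def parse_fasta(text):
    """Return a dict mapping sequence names to sequences from fasta text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # first word of the header
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def drop_all_gap_columns(rows, gap_char="-"):
    """Remove columns in which every row has a gap character."""
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned sequences must all have the same length")
    keep = [c for c in range(width)
            if any(row[c] != gap_char for row in rows)]
    return ["".join(row[c] for c in keep) for row in rows]


def original_rows_preserved(before, after):
    """True if the rows of `before` are unchanged in `after`, ignoring
    gap-only columns that were opened to fit the new sequence."""
    names = list(before)
    if any(name not in after for name in names):
        return False
    old_rows = drop_all_gap_columns([before[n].upper() for n in names])
    new_rows = drop_all_gap_columns([after[n].upper() for n in names])
    return old_rows == new_rows


# Toy example (made-up sequences, not real HIV data).
before = parse_fasta(">s1\nATG-CC\n>s2\nATGACC\n")
after = parse_fasta(">s1\nATG--CC\n>s2\nATGA-CC\n>new\nATGATCC\n")
print(original_rows_preserved(before, after))
```

To use this on your own data, read the two files into strings (for example with `open(path).read()`) and pass the parsed results to `original_rows_preserved`.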
Profile alignments 2
Over coffee with a colleague you discover that she has sequenced the complete HIV gag gene from a small number of samples. You would like to make an alignment of the entire set of sequences in order to see where on her sequences your sequences match. You realise that, strictly speaking, you are breaking one of the 'rules of thumb' of sequence alignment: her sequences are far longer than yours, and normally, to produce the best alignment, you try to align sequences of comparable length. Nonetheless, just to get a rough idea of how the two alignments compare, you decide to align her alignment against your existing alignment. You know that this can be done using ClustalX as follows:
1. Your colleague's sequences can be found here. Save them to a file.
2. With ClustalX again in Profile Alignment Mode, load your large alignment as profile 1.
3. Load your colleague's alignment as profile 2.
4. Select the align profile 1 to profile 2 option under the alignment menu.
Again, you should open the result in BioEdit to see the alignment that has been produced. Since your colleague has sequenced the entire gag gene, can you see which portion of the gene you have sequenced?
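Besides inspecting the result by eye in BioEdit, the portion of the gene covered by your fragment can be estimated from the alignment itself. The sketch below is one hedged way to do this: it assumes you have two rows of the combined alignment as strings of equal length with "-" for gaps, one from your partial sequences and one from your colleague's full-length sequences. The function name and the example sequences are hypothetical.

```python
def covered_region(short_aligned, long_aligned, gap_char="-"):
    """Return the 1-based (start, end) positions on the ungapped long
    sequence that are spanned by the short sequence, or None."""
    if len(short_aligned) != len(long_aligned):
        raise ValueError("both rows must come from the same alignment")

    # Alignment columns where the short sequence has a residue.
    columns = [i for i, ch in enumerate(short_aligned) if ch != gap_char]
    if not columns:
        return None
    first, last = columns[0], columns[-1]

    position = 0          # running position on the ungapped long sequence
    start = end = None
    for i, ch in enumerate(long_aligned):
        if ch == gap_char:
            continue
        position += 1
        if first <= i <= last:
            if start is None:
                start = position
            end = position
    if start is None:
        return None
    return start, end


# Made-up example rows (not real gag sequences).
long_row = "ATGGGTGCGAGAGCGTCA"
short_row = "------GCGAGA------"
print(covered_region(short_row, long_row))
```

The positions returned refer to your colleague's sequence with gaps removed, so they can be compared directly with published coordinates for the gene if her sequences start at the beginning of gag.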
If you have time, repeat the alignment of the variable surface glycoprotein sequences using the QAlign programme (http://www.ridom.de/qalign/). You will have to download and install this yourself. This part of the exercise is not essential, and most people are not expected to have time to complete it.