Make a multiple alignment and phylogenetic tree of these four protein sequences using ClustalX:
STN2_HUMAN Q93045
STN2_MOUSE P55821
STN1_RAT P13668
ST2A_XENLA Q09001
First, retrieve the protein sequences from GenBank using the NCBI ENTREZ webserver:
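Remember to choose Protein from the pull-down menu. Set the Display format to FASTA and copy the four protein sequences into one plain-text file in multiple FASTA format (use Notepad): each sequence starts with its own ">" header line, and the records follow one another in the same file.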
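If you would rather script the retrieval than copy and paste, the same records can be requested through the NCBI E-utilities efetch service. The following is only a minimal sketch: it assumes the four accessions listed above resolve in the NCBI protein database, and the function names (efetch_url, fetch_fasta) are made up for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# The four accessions from the exercise.
ACCESSIONS = ["Q93045", "P55821", "P13668", "Q09001"]


def efetch_url(accessions):
    """Build an efetch URL asking for protein records as FASTA text."""
    params = {
        "db": "protein",            # same choice as "Protein" in the pull-down menu
        "id": ",".join(accessions),
        "rettype": "fasta",         # same choice as Display format = FASTA
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(accessions):
    """Download the records (needs network access) and return them as one string."""
    with urlopen(efetch_url(accessions)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_fasta(ACCESSIONS) and saving the returned text to a file should give a multiple FASTA file equivalent to the one assembled by hand in Notepad.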
Now, either download and install CLUSTALX on your computer:
or use a web CLUSTAL server:
http://www.ebi.ac.uk/clustalw
http://www.cmbi.kun.nl/bioinf/tools/clustalw.shtml
http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_clustalw.html
One advantage of most of the web-based tools is that they immediately produce a graphical phylogenetic tree. If you use a local copy of CLUSTALX, then you have to load the text tree file into separate tree-drawing software. I usually use the Phylodendron website:
but there are many stand-alone programs for Windows (and Mac) computers, e.g. TreeView.
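The text tree file is written in Newick (PHYLIP) format, a nested set of parentheses with branch lengths after colons. If you just want a quick look at the grouping before opening a drawing program, a few lines of code can print it as an indented outline. This is a rough sketch, not a full Newick reader: it ignores quoted names and bracketed comments, and the taxon names A to D in the test string are placeholders, not the real tree.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = "".join(text.split()).rstrip(";")  # drop whitespace and the final ';'
    pos = 0

    def read_label():
        # Read characters up to the next structural symbol.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        return text[start:pos]

    def read_node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            children.append(read_node())
            while text[pos] == ",":
                pos += 1  # consume ','
                children.append(read_node())
            pos += 1  # consume ')'
        name = read_label()
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1  # consume ':'
            length = float(read_label())
        return (name, length, children)

    return read_node()


def leaves(node):
    """Return the leaf names in the order they appear in the tree."""
    name, _, children = node
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaves(child)]


def render(node, depth=0):
    """Return the tree as a list of indented text lines."""
    name, length, children = node
    label = name or "(internal)"
    suffix = "" if length is None else f" [{length}]"
    lines = ["  " * depth + label + suffix]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


# Placeholder tree, NOT the result of the exercise.
example = parse_newick("((A:0.1,B:0.2):0.05,C:0.3,D:0.4);")
print("\n".join(render(example)))
```

Sequences that share an indentation level under the same internal node are the ones the tree groups together.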
Once you make the alignment and the tree, you will immediately see something strange. The rat gene clusters away from the other three. How can this be? Mouse is clearly closer to rat than to Xenopus (African clawed frog).
The best solution to this kind of question is to add more sequences to your analysis.
Go to the NCBI BLAST web page and, using each of the four sequences as the query in turn, collect the sequences of the top 5 hits (in FASTA format).
[You should immediately realize that something is up when you look at the pattern of overlaps between the various lists.]
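One way to look at the pattern of overlaps is to compare the four hit lists pairwise and see which queries share hits. The sketch below assumes you have typed the hit identifiers into a dictionary keyed by query name. The identifiers shown (hit_a, hit_b, and so on) are placeholders and not real BLAST results.

```python
from itertools import combinations


def pairwise_overlap(hits_by_query):
    """Map each pair of queries to the set of hits that both lists contain."""
    overlaps = {}
    for first, second in combinations(sorted(hits_by_query), 2):
        shared = set(hits_by_query[first]) & set(hits_by_query[second])
        overlaps[(first, second)] = shared
    return overlaps


# Placeholder hit lists: replace these with the identifiers from your own searches.
example_hits = {
    "query1": ["hit_a", "hit_b", "hit_c"],
    "query2": ["hit_b", "hit_c", "hit_d"],
    "query3": ["hit_x", "hit_y"],
}

for pair, shared in pairwise_overlap(example_hits).items():
    print(pair, len(shared), sorted(shared))
```

Queries whose lists share most of their hits are probably pulling out the same gene from different species, while a query whose list barely overlaps with the others deserves a closer look.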
Now re-build the multiple alignment with all of the sequences (remove the obvious duplicates).
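Removing duplicates by hand gets tedious with up to twenty collected sequences, so here is a small sketch that keeps only the first record for each distinct sequence. It assumes that "obvious duplicates" means records with identical sequences, which is my simplification: the same protein retrieved under two different identifiers is caught, but near-identical variants are not. The function names are made up for this example.

```python
def read_fasta(text):
    """Split multiple-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_duplicates(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for header, sequence in records:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, sequence))
    return unique


def write_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for header, sequence in records:
        lines.append(">" + header)
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"
```

Reading your combined file, passing the records through drop_duplicates, and writing the result back out gives a file that can be loaded straight into CLUSTALX.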
Your resulting tree should help you figure out what is going on here. Just because a group of sequences has enough similarity to make an alignment does not mean that they are all really the same orthologous gene.
You should now begin to realize how important it is to establish that a pair of genes are truly orthologs before generalizing from the function of one to the function of the other.