MBT/Genetics 541 Homework Assignment 2
Due Friday Apr. 13
- Read section 7.3 of Durbin et al.
- Write a program to:
- Read in a DNA sequence data set such as is produced by Dnatree.
Above 60 sites, Dnatree wraps the sequences in a way that will
require thought, so let's assume you have sequences 60 sites
long or less (i.e., don't bother with the wrapping issue unless
you feel energetic).
- Take these sequences and, one site at a time, count the
number of changes they require on a phylogeny of this
tree topology:
(A,(B,(C,(D,(E,(F,(G,(H,(I,J)))))))))
Note that this is of a highly stereotyped shape, with the
lineage to A branching off first, then the lineage to B, then C,
and so on. I chose this so you do not have to deal with the
tedious bookkeeping of representing an arbitrary tree -- you should
be able to do everything with arrays (tables) in a simple way.
The Fitch algorithm will be the best one to use.
- Use Dnatree to produce a data set and then use your program
to count the number of changes it needs on this tree. Of course
you can also count them by using the tree rearrangement part of
Dnatree to make the tree and evaluate it. This will serve as
a check on your program (hint -- for testing try a data set
1 site long).
- Email me (joe@genetics) the data set and your program's output. I can answer a limited number of questions by email as you work.
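
To make the counting step concrete, here is a minimal sketch in Python of the Fitch set operations on the stereotyped tree above. It is one possible arrangement, not a required one. It assumes the sequences have already been read into a dictionary keyed by the tip names A through J, and that every site holds a single unambiguous base (A, C, G or T). The function names fitch_changes_at_site and count_changes are made up for this illustration, and reading the Dnatree file itself is not shown.

```python
def fitch_changes_at_site(states):
    """Count Fitch changes at one site on the tree (A,(B,(C,...(I,J)...))).

    `states` holds the bases at this site in tip order A, B, ..., J.
    Assumption: each state is a single unambiguous base.
    """
    # Start with the state set of the last tip (J) and work outward,
    # joining one more tip at each step: I, then H, ..., finally A.
    current = {states[-1]}
    changes = 0
    for base in reversed(states[:-1]):
        tip = {base}
        common = current & tip
        if common:
            # The sets overlap: keep the intersection, no change needed.
            current = common
        else:
            # The sets are disjoint: take the union and count one change.
            current = current | tip
            changes += 1
    return changes


def count_changes(sequences, order="ABCDEFGHIJ"):
    """Return a list with the number of changes at each site.

    `sequences` maps each tip name to its sequence string; all
    sequences are assumed to have the same length.
    """
    n_sites = len(sequences[order[0]])
    return [
        fitch_changes_at_site([sequences[name][i] for name in order])
        for i in range(n_sites)
    ]
```

For example, if tips A through H have the sequence "AG" and tips I and J have "AT", count_changes returns [0, 1], for a total of 1 change. Because the tree is a simple ladder, a single running set is all the bookkeeping needed; no general tree structure is required.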