Introduction to Bioinformatics: CMSC54610-1, Spring quarter 2006.
Lecture 5
- Chapter 4: Distance methods for phylogeny
- Review from last week:
- About distance metrics for biological sequences.
- Distances in drug design
- Consider the scoring matrices in Figure 2.4 on page 39 of the text.
If S is a scoring matrix whose diagonal entries all equal k, then the
corresponding distance matrix is D = k - S, which is zero on the diagonal.
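The conversion D = k - S can be sketched in a few lines. This is a minimal illustration, not the values from Figure 2.4: the 4x4 nucleotide matrix below (match = 1, transition = -1, transversion = -2) and the function name `score_to_distance` are assumptions made up for this example.

```python
def score_to_distance(S):
    """Return D = k - S for a scoring matrix S whose diagonal is the constant k."""
    n = len(S)
    k = S[0][0]
    # The construction only makes sense if every diagonal entry equals k.
    if any(S[i][i] != k for i in range(n)):
        raise ValueError("diagonal of S is not constant")
    return [[k - S[i][j] for j in range(n)] for i in range(n)]


# Hypothetical scores in the order A, C, G, T (illustrative values only).
S = [
    [1, -2, -1, -2],
    [-2, 1, -2, -1],
    [-1, -2, 1, -2],
    [-2, -1, -2, 1],
]
D = score_to_distance(S)
for row in D:
    print(row)
```

With these made-up scores, identical bases are at distance 0, transitions at distance 2, and transversions at distance 3.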
- Finding the tree for three nodes: solve 3x3 system of equations
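For three leaves joined at a single internal node, the 3x3 system is a + b = d_AB, a + c = d_AC, b + c = d_BC, where a, b, c are the branch lengths leading to A, B, C. A small sketch of its closed-form solution follows; the function name and the sample distances are assumptions chosen for illustration.

```python
def three_leaf_branches(d_ab, d_ac, d_bc):
    """Solve a + b = d_ab, a + c = d_ac, b + c = d_bc for (a, b, c)."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


# Hypothetical pairwise distances between leaves A, B, C.
a, b, c = three_leaf_branches(3, 9, 10)
print(a, b, c)
```

All three branch lengths are non-negative exactly when the distances satisfy the triangle inequality.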
- Rooted versus unrooted trees: too many but not enough
- Gene versus species trees: local versus global view
- UPGMA: clustering technique
- group closest nodes, define distances for new groups
- changes the distance matrix (cf. Table 4.2)
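The two UPGMA steps above (group the closest pair, then define distances for the new group) can be sketched as follows. This is a minimal sketch assuming the usual size-weighted average for the new distances; the data layout (a dict keyed by `frozenset` pairs) and the four-taxon distances are assumptions for illustration, not the entries of Table 4.2.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster by UPGMA.

    names: list of leaf labels.
    dist:  dict mapping frozenset((x, y)) -> distance between x and y.
    Returns (tree, merges): tree is a nested tuple, merges lists
    (new_cluster, height) in the order the clusters were formed.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)                       # working copy of the distance matrix
    merges = []
    while len(sizes) > 1:
        # Group the closest pair of current clusters.
        x, y = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        dxy = d[frozenset((x, y))]
        nx, ny = sizes.pop(x), sizes.pop(y)
        new = (x, y)
        # Distance from the new group to each remaining cluster:
        # average of the old distances, weighted by cluster size.
        for z in sizes:
            d[frozenset((new, z))] = (
                nx * d[frozenset((x, z))] + ny * d[frozenset((y, z))]
            ) / (nx + ny)
        sizes[new] = nx + ny
        merges.append((new, dxy / 2))    # node height is half the distance
    return next(iter(sizes)), merges


# Hypothetical ultrametric distances on four taxa.
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 4, frozenset(("B", "C")): 4,
    frozenset(("A", "D")): 6, frozenset(("B", "D")): 6, frozenset(("C", "D")): 6,
}
tree, merges = upgma(names, dist)
print(tree)
print([height for _, height in merges])
```

Each pass shrinks the distance matrix by one row and column, which is the change to the matrix that the outline refers to.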
- The four point condition: additive distance matrices (revised notes).
- Finding the right pairs: Sattath and Tversky (1977), Psychometrika 42, pp. 319-345
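The four point condition can be checked directly: for every four taxa i, j, k, l, the two largest of the three sums d(i,j)+d(k,l), d(i,k)+d(j,l), d(i,l)+d(j,k) must be equal. The sketch below assumes a symmetric matrix stored as a list of lists; the function name, the tolerance, and the sample matrix (built from a hypothetical tree ((A,B),(C,D)) with leaf branches 1, 2, 3, 4 and internal branch 5) are assumptions for illustration.

```python
from itertools import combinations


def is_additive(D, tol=1e-9):
    """Return True if D satisfies the four point condition for every quartet."""
    n = len(D)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([
            D[i][j] + D[k][l],
            D[i][k] + D[j][l],
            D[i][l] + D[j][k],
        ])
        # The two largest sums must coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Distances read off the hypothetical tree ((A,B),(C,D)); order A, B, C, D.
D_tree = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
# The same matrix with d(A,D) perturbed from 10 to 11.
D_bad = [row[:] for row in D_tree]
D_bad[0][3] = D_bad[3][0] = 11

print(is_additive(D_tree), is_additive(D_bad))
```

In the additive example the smallest sum, d(A,B)+d(C,D) = 10, belongs to the pairing that matches the tree, which is how the condition helps identify which leaves to pair.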
- Chapter 5: Character-based methods of phylogenetics
- Parsimony
- Main difference: internal nodes have a name
- Informative sites: a heuristic to find the critical determinants
- In the end, we compute a distance
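The informative-sites heuristic mentioned above can be sketched as a column filter. This assumes the standard definition (a site is informative when at least two different states each occur in at least two sequences) and an ungapped alignment of equal-length strings; the function name and the toy alignment are made up for illustration.

```python
from collections import Counter


def informative_sites(alignment):
    """Return the 0-based positions of parsimony-informative columns."""
    sites = []
    for pos, column in enumerate(zip(*alignment)):
        counts = Counter(column)
        # Count the states that appear in two or more sequences.
        shared_states = sum(1 for c in counts.values() if c >= 2)
        if shared_states >= 2:
            sites.append(pos)
    return sites


# Hypothetical alignment of four sequences.
alignment = [
    "AAGTC",
    "AAGCC",
    "AGATC",
    "AGACT",
]
print(informative_sites(alignment))
```

Columns 0 and 4 are skipped because they cost the same number of changes on every tree topology, so only the remaining columns can favor one topology over another.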
- Bootstrapping
- Discussion of the term paper
- Homework problems:
- 4-4, 4-8 (pages 95-96)
- Give an example of a distance matrix for which A>B=C, and
show that there are trees with three different topologies that are
optimally close (distance = A-B) in the L1 norm (see section 3 of the
notes on the four point condition).
- 5-2 (page 116)
- Homework is due by the beginning of the next class.
- Homework should be turned in physically before the start of the class
or, in an emergency, sent electronically in a standard format.