Computational Biology (2016–2017)
Peter Dawyndt · Universiteit Gent
Welcome to the Dodona course for the UGent module on Computational Biology (Master of Science in Computer Science and Master of Science in Statistical Data Analysis). The course contains Python3 programming exercises that support automated feedback.
You are not a member of this course.
An algorithm requiring the interpretation and manipulation of networks, or graphs. Rosalind
| Status | Status | Type | Title | Class progress |
|---|---|---|---|---|
| Overlap graphs | ||||
| Completing a tree | ||||
| Introduction to pattern matching | ||||
| Encoding suffix trees | ||||
| Identifying maximal repeats | ||||
| Finding the longest multiple repeat |
A tree modeling the evolutionary scenario deriving a collection of taxa from their proposed ancestors. Rosalind
| Status | Status | Type | Title | Class progress |
|---|---|---|---|---|
| Completing a tree | ||||
| Culling the forest | ||||
| Creating a distance matrix | ||||
| Paths in trees | ||||
| Inferring genotype from a pedigree |
Hidden Markov Models
Hidden Markov models. Rosalind
Solve the exercises of the HMM track on Rosalind. We will use the matrix visualization developed during the project to add the remaining exercises to Dodona with newly generated test cases and sometimes with some slight modifications to the problem statement. Solutions to these exercises will be presented by the students in class. You might find some inspiration about using HMM in the following Jupyter Notebook demos:
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Probability of a hidden path | |||||
| Probability of an outcome given a hidden path | |||||
| Viterbi decoding | |||||
| Outcome likelihood |
- read chapter How do we compare DNA sequences? in the book by Compeau & Pevzner
- check the Eyeless demo (Jupyter Notebook)
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Counting point mutations | |||||
| Transitions and transversions | |||||
| Creating a distance matrix | |||||
| Edit distance | |||||
| Edit distance alignment | |||||
| Counting optimal alignments | |||||
| Global alignment with scoring matrix | |||||
| Global alignment with constant gap penalty | |||||
| Local alignment with scoring matrix | |||||
| Maximizing the gap symbols of an optimal alignment | |||||
| Multiple alignment | |||||
| Global alignment with scoring matrix and affine gap penalty | |||||
| Overlap alignment | |||||
| Semiglobal alignment | |||||
| Local alignment with scoring matrix and affine gap penalty | |||||
| Isolating symbols in alignments | |||||
| Finding a motif with modifications | |||||
| Pairwise global alignment | |||||
| Suboptimal local alignment | |||||
| Global multiple alignment |
Find or design a (non-classic) dynamic programming problem that can be added to Dodona. No problems related to alignment / edit distance. Bonuses for problems that are applications in computational biology. Submit a ZIP-file to Indianio containing:
- problem description (introduction - assignment - examples)
- README file containing notes and references
- where did you find the assignment
- generating test cases
- non-trivial testing of solutions
- your sample solution
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Rabbits and recurrence relations | |||||
| Mortal Fibonacci rabbits | |||||
| Let's go to the movies | |||||
| Subset sum problem | |||||
| Boltzmann's tombstone | |||||
| Throwing dice | |||||
| Minimum number of jumps | |||||
| Accordion folding | |||||
| Change problem | |||||
| Matrix chain multiplication | |||||
| Bytelandian gold coins | |||||
| Counting heads | |||||
| Longest palindromic subsequence |
- read chapter 02 in the textbook: All the sequences’ men
- check the Mycoplasma demo (Jupyter Notebook)
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Transcribing DNA into RNA | |||||
| Translating RNA into protein | |||||
| Introduction to protein databases | |||||
| Finding genes with ORFs | |||||
| Protein translation |
An algorithm involving the manipulation and properties of chains of symbols.
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Finding a shared motif | |||||
| Perfect matchings and RNA secondary structures | |||||
| Finding a spliced motif | |||||
| Catalan numbers and RNA secondary structures | |||||
| Speeding up motif finding | |||||
| Finding a shared spliced motif | |||||
| Ordering strings of varying length lexicographically | |||||
| Maximum matchings and RNA secondary structures | |||||
| Motzkin numbers and RNA secondary structures | |||||
| Interleaving two motifs | |||||
| Introduction to pattern matching | |||||
| Finding disjoint motifs in a gene | |||||
| Encoding suffix trees | |||||
| Linguistic complexity of a genome | |||||
| Identifying maximal repeats | |||||
| Finding all similar motifs |
The mathematical study of the chance of occurrence of random events, or the chance with which a specific event will occur. In preparation:
- read chapter 01 in the textbook: The first look at a genome
- check the Haemophilus demo (Jupyter Notebook)
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Mendel's first law | |||||
| Introduction to random strings | |||||
| Calculating expected offspring | |||||
| Independent alleles | |||||
| Matching random motifs | |||||
| Expected number of restriction sites | |||||
| Independent segregation of chromosomes | |||||
| Counting disease carriers | |||||
| Inferring genotype from a pedigree | |||||
| Sex-linked inheritance | |||||
| The Wright-Fisher model of genetic drift | |||||
| Wright-Fisher's expected behavior | |||||
| The founder effect and genetic drift |
If you are an accomplished coder, then you can write a separate program for every new task you encounter. In practice, these programs only need to be written once and packaged into modules, where those of us who are not great coders can use them quickly and efficiently. In this first series of exercises, we will familiarize ourselves with a sampling of some of the more popular bioinformatics tools taken from “out of the box” software. In preparation:
- read chapter 00 in the textbook: Prologue
- get yourself familiar with the BioPython module
| Status | Type | Title | Class progress | Status | Actions |
|---|---|---|---|---|---|
| Counting DNA nucleotides | |||||
| Transcribing DNA into RNA | |||||
| Translating RNA into protein | |||||
| Introduction to protein databases | |||||
| Finding genes with ORFs | |||||
| Protein translation | |||||
| Complementing a strand of DNA | |||||
| Computing GC-content | |||||
| Finding a motif in DNA | |||||
| Consensus and profile | |||||
| Locating restriction sites | |||||
| RNA splicing | |||||
| Enumerating k-mers lexicographically | |||||
| k-Mer composition |