Deze cursus is gearchiveerd · Deze cursus is nu afgesloten. Je kunt je niet meer registreren of nieuwe oplossingen indienen, maar je eerdere werk en resultaten blijven beschikbaar.

Computational Biology (2016–2017)

Peter Dawyndt · Universiteit Gent

Welcome to the Dodona course for the UGent module on Computational Biology (Master of Science in Computer Science and Master of Science in Statistical Data Analysis). The course contains Python3 programming exercises that support automated feedback.

Je bent niet geregistreerd voor deze cursus.

Leerpad

Graph algorithms

An algorithm requiring the interpretation and manipulation of networks, or graphs. Rosalind

A tree modeling the evolutionary scenario deriving a collection of taxa from their proposed ancestors. Rosalind

BWT and FM-indices for pattern matching

Status Type Titel Voortgang groep Status Acties
Construction of Burrows-Wheeler transformed string
BWT-based read mapping

Hidden Markov Models

Hidden Markov models. Rosalind

Solve the exercises of the HMM track on Rosalind. We will use the matrix visualization developed during the project to add the remaining exercises to Dodona with newly generated test cases and sometimes with some slight modifications to the problem statement. Solutions to these exercises will be presented by the students in class. You might find some inspiration about using HMM in the following Jupyter Notebook demos:

Status Type Titel Voortgang groep Status Acties
Probability of a hidden path
Probability of an outcome given a hidden path
Viterbi decoding
Outcome likelihood
  • read chapter How do we compare DNA sequences? in the book by Compeau & Pevzner
  • check the Eyeless demo (Jupyter Notebook)
Status Type Titel Voortgang groep Status Acties
Counting point mutations
Transitions and transversions
Creating a distance matrix
Edit distance
Edit distance alignment
Counting optimal alignments
Global alignment with scoring matrix
Global alignment with constant gap penalty
Local alignment with scoring matrix
Maximizing the gap symbols of an optimal alignment
Multiple alignment
Global alignment with scoring matrix and affine gap penalty
Overlap alignment
Semiglobal alignment
Local alignment with scoring matrix and affine gap penalty
Isolating symbols in alignments
Finding a motif with modifications
Pairwise global alignment
Suboptimal local alignment
Global multiple alignment

Dynamic programming

Find or design a (non-classic) dynamic programming problem that can be added to Dodona. No problems related to alignment / edit distance. Bonuses for problems that are applications in computational biology. Submit a ZIP-file to Indianio containing:

  • problem description (introduction - assignment - examples)
  • README file containing notes and references
    • where did you find the assignment
    • generating test cases
    • non-trivial testing of solutions
  • your sample solution
Status Type Titel Voortgang groep Status Acties
Rabbits and recurrence relations
Mortal Fibonacci rabbits
Let's go to the movies
Subset sum problem
Boltzmann's tombstone
Throwing dice
Minimum number of jumps
Accordion folding
Change problem
Matrix chain multiplication
Bytelandian gold coins
Counting heads
Longest palindromic subsequence
  • read chapter 02 in the textbook: All the sequences’ men
  • check the Mycoplasma demo (Jupyter Notebook)
Status Type Titel Voortgang groep Status Acties
Transcribing DNA into RNA
Translating RNA into protein
Introduction to protein databases
Finding genes with ORFs
Protein translation

String algorithms

An algorithm involving the manipulation and properties of chains of symbols.

Status Type Titel Voortgang groep Status Acties
Finding a shared motif
Perfect matchings and RNA secondary structures
Finding a spliced motif
Catalan numbers and RNA secondary structures
Speeding up motif finding
Finding a shared spliced motif
Ordering strings of varying length lexicographically
Maximum matchings and RNA secondary structures
Motzkin numbers and RNA secondary structures
Interleaving two motifs
Introduction to pattern matching
Finding disjoint motifs in a gene
Encoding suffix trees
Linguistic complexity of a genome
Identifying maximal repeats
Finding all similar motifs

The mathematical study of the chance of occurrence of random events, or the chance with which a specific event will occur. In preparation:

  • read chapter 01 in the textbook: The first look at a genome
  • check the Haemophilus demo (Jupyter Notebook)
Status Type Titel Voortgang groep Status Acties
Mendel's first law
Introduction to random strings
Calculating expected offspring
Independent alleles
Matching random motifs
Expected number of restriction sites
Independent segregation of chromosomes
Counting disease carriers
Inferring genotype from a pedigree
Sex-linked inheritance
The Wright-Fisher model of genetic drift
Wright-Fisher's expected behavior
The founder effect and genetic drift

Bioinformatics tools

If you are an accomplished coder, then you can write a separate program for every new task you encounter. In practice, these programs only need to be written once and packaged into modules, where those of us who are not great coders can use them quickly and efficiently. In this first series of exercises, we will familiarize ourselves with a sampling of some of the more popular bioinformatics tools taken from “out of the box” software. In preparation:

  • read chapter 00 in the textbook: Prologue
  • get yourself familiar with the BioPython module
Status Type Titel Voortgang groep Status Acties
Counting DNA nucleotides
Transcribing DNA into RNA
Translating RNA into protein
Introduction to protein databases
Finding genes with ORFs
Protein translation
Complementing a strand of DNA
Computing GC-content
Finding a motif in DNA
Consensus and profile
Locating restriction sites
RNA splicing
Enumerating k-mers lexicographically
k-Mer composition

A warm-up challenge

Status Type Titel Voortgang groep Status Acties
Eenvoudige genoomassemblage