Inference Method

ECE 6502/BME 6550: Inference methods (Spring 2017)

In this course, we focus on statistical inference techniques and their applications. Inference allows us to learn about unobserved quantities from observed data based on a probability model. For example, we can infer the evolutionary relationships between organisms based on their genomic sequence data and a probability model of evolutionary changes. We will consider both frequentist and Bayesian methods but will focus on the latter, which aim to combine existing information with new observations in a statistically consistent manner. A main component of the course is computational methods that make Bayesian analysis of large datasets possible. Such datasets are common in many engineering and scientific disciplines, including machine learning, artificial intelligence, computational biology, and statistical physics.
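As a minimal sketch of how a Bayesian analysis combines existing information with new observations, consider estimating the success probability of a repeated binary experiment. The setup below (a Beta prior with a binomial likelihood, the specific counts, and the function names) is an illustrative assumption, not part of the course material.

```python
from fractions import Fraction

def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing binary data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the posterior
    is Beta(alpha + successes, beta + failures) (conjugate update).
    """
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)

# Hypothetical prior belief: Beta(2, 2), centered on 1/2.
prior = (2, 2)

# Hypothetical new observations: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("prior mean:", beta_mean(*prior))
print("posterior parameters:", posterior)
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean lies between the prior mean and the observed success frequency, which is one simple sense in which prior information and data are combined.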

Structure: The first two thirds of the course will consist of lectures. In the last third, part of the time will be devoted to project presentations, and the rest will be instructor lectures.

Activities: The homework will consist of problems and programming exercises. There will also be a final course project, which will involve either analyzing a real dataset to gain new insights or developing new inference approaches.

Prerequisites: Standard linear algebra and calculus; probability theory (briefly reviewed). A basic understanding of molecular biology is helpful but not necessary.

Syllabus

  1. Review of probability
    1. Random variables & processes
    2. Markov chains and Perron-Frobenius theory
    3. Hidden Markov models
  2. Frequentist inference methods
    1. Maximum likelihood
    2. Hypothesis testing
    3. Point estimation methods and intervals
    4. Applications to phylogenetics
  3. Introduction to Bayesian methods
    1. The Bayesian approach
    2. Single-parameter models
    3. Multiparameter models
    4. Hierarchical models
  4. Computational approaches to Bayesian inference
    1. Markov chain Monte Carlo (MCMC)
    2. Expectation-maximization
    3. Variational inference
  5. Hidden Markov models
    1. Three problems: evaluation, decoding, and inference
    2. Gapped sequence alignment, gene finding, protein classification
  6. Information theory and inference in computational biology
    1. Introduction to information theory
    2. Source coding and compression of biological sequences
    3. Stochastic approximation and sequence evolution
    4. Constrained codes and models of DNA as language

References

  • Gelman et al., Bayesian Data Analysis
  • MacKay, Information Theory, Inference, and Learning Algorithms
  • Gascuel (ed.), Mathematics of Evolution and Phylogeny