Advanced Topics in Machine Learning for Computational Biology

Elective course (CS/Bioinformatics M.Sc., open to undergraduates), Tel Aviv University, School of Computer Science, 2024

Summary

Data-driven science is a ubiquitous paradigm in modern biology: heaps of data are collected on a biological system of interest, from which one must extract qualitative scientific insights or build accurate quantitative models. This course is an overview of advanced machine learning algorithms commonly used in modern computational biology research. The goals are threefold: (i) learn the mathematical principles underlying these algorithms; (ii) learn how and when to use them for scientific purposes; (iii) understand their limitations. The algorithms will be illustrated on various biological systems (brain recordings, single-cell data, protein sequences, molecules, etc.). No prior knowledge of biology is required. The course will be given in English. Evaluation will be based on oral presentations of research articles and on home assignments. The syllabus below is tentative and subject to change.

Syllabus

  • Topic 1: Model Interpretability (LASSO, decision trees).
  • Topic 2: Model Explainability (LIME, SHAP).
  • Topic 3: Low-dimensional embeddings (PCA, t-SNE, UMAP).
  • Topic 4: Meaningful feature extraction (ICA, NMF, RBMs).
  • Topic 5: Convolutional Neural Networks (1D & 2D CNNs, Grad-CAM).
  • Topic 6: Graph Neural Networks (GCN, GAT, MPNN).
  • Topic 7: Transformers (masked modeling, autoregressive modeling).
  • Topic 8: Generative models (latent variable models, diffusion models).

Course number: 0368.4238