DESCRIPTION OF COURSES

BI 512/AS 608 BIOINFORMATICS – II (2L+1P) III
(Pre-requisite: AS 571)

Objective
  To expose students to advanced statistical and computational techniques in bioinformatics, and to prepare them to understand bioinformatics principles and their applications.

Theory

UNIT I
  Genomic databases and analysis of high-throughput data sets, Analysis of DNA sequence, Sequence annotation, ESTs, SNPs. BLAST and related sequence comparison methods. EM algorithm and other statistical methods to discover common motifs in biosequences. Multiple alignment and database search using motif models, ClustalW and others. Concepts in phylogeny. Gene prediction based on codons, Decision trees, Classificatory analysis, Neural Networks, Genetic algorithms, Pattern recognition, Hidden Markov models.

UNIT II
  Computational analysis of protein sequence, structure and function. Modeling protein families. Expression profiling by microarray/gene chip, proteomics etc., Multiple alignment of protein sequences, Modeling and prediction of structure of proteins, Designer proteins, Drug designing.

UNIT III
  Markov chains (MC with no absorbing states; Higher-order Markov dependence; Patterns in sequences; Markov chain Monte Carlo – Hastings-Metropolis algorithm, Simulated Annealing, MC with absorbing states), Bayesian techniques and use of Gibbs Sampling, Advanced topics in design and analysis of DNA microarray experiments.

UNIT IV
  Computationally intensive methods (Classical estimation methods, Bootstrap estimation and Confidence Intervals, Hypothesis testing, Multiple Hypothesis testing), Evolutionary models (Models of Nucleotide substitution), Phylogenetic tree estimation (Distances: Tree reconstruction – Ultrametric and Neighbor-Joining cases, Surrogate distances, Tree reconstruction, Parsimony and Maximum Likelihood, Modeling, Estimation and Hypothesis Testing), Neural Networks (Universal Approximation Properties, Priors and Likelihoods, Learning Algorithms – Back propagation, Sequence encoding and output interpretation, Prediction of Protein Secondary Structure, Prediction of Signal Peptides and their cleavage sites, Application for DNA and RNA Nucleotide Sequences), Analysis of SNPs and Haplotypes.

Practical
  Genomic databases and analysis of high-throughput data sets, BLAST and related sequence comparison methods, Statistical methods to discover common motifs in biosequences, Multiple alignment and database search using motif models, ClustalW, Classificatory analysis, Neural Networks, Genetic algorithms, Pattern recognition, Hidden Markov models, Computational analysis of protein sequence, Expression profiling by microarray/gene chip, proteomics, Modeling and prediction of structure of proteins, Bayesian techniques and use of Gibbs Sampling, Analysis of DNA microarray experiments, Analysis of one DNA sequence, Analysis of multiple DNA or protein sequences, Computationally intensive methods, Multiple Hypothesis testing, Phylogenetic tree estimation, Analysis of SNPs and Haplotypes.

Suggested Readings