Benjamin Redelings

Evolution, Computation, Statistics

News

Oct 2016: OpenTree manuscript finally submitted to PeerJ
"A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species" [Preprint]

Jun 2016: Presentation at Evolution 2016 meeting in Austin, TX
"New methods for constructing the supertree of life"

Jan 2016: BAli-Phy 2.3.8 released

Nov 2015: Manuscript published in Genetics
"A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation." [PDF]

Research

I use mathematics and computational techniques to answer questions in evolutionary genetics. The mathematics is mostly Bayesian inference and stochastic process modelling. The computation is primarily Markov chain Monte Carlo (MCMC). The evolutionary genetics is focussed on phylogenetics and multiple sequence alignment, but also includes some coalescent theory.

Work

Since Sep 2015, I have been a part-time remote postdoc with Mark Holder on the Open Tree of Life project. Since Feb 2016, I have also been working part time with Greg Wray at Duke on malaria genomics and PacBio DNA sequencing.

Phylogenetic Alignment

I develop model-based methods for inferring multiple sequence alignments (MSAs) that place insertions and deletions on specific branches of the evolutionary tree, instead of just placing gaps in a matrix. These methods also co-estimate the evolutionary tree along with the alignment. I develop the MCMC software BAli-Phy to perform this estimation.

uncertain                                          certain
              ....310.......320.......330.......340.......350.......360.......370.......
Thermotoga    DEVEIIGLSYEIKKTV---VTSVEMFRKELDEGIAGDNVGCLLRGIDKDEVERGQVLA-----APGSIKPHKRF
Anacystis     ETIEIVGLR-DTRSTT---VTGVEMFQKTLDEGLAGDNVGLLLRGIQKTDIERGMVLA-----KPGSITPHTKF
Escheria      EEVEIVGIK-ETQKST---CTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLA-----KPGTIKPHTKF
Pyrococcus    EVVIFEPASTIFHKPIQGEVKSIEMHHEPLEEALPGDNIGFNVRGVSKNDIKRGDVAGHTTN-PPTVVRTKDTF
Halobacterium DNVSFQPSDVG------GEVKTIEMHHEEVPNAEPGDNVGFNVRGIGKDDIRRGDVCGPADD-PPSVA---DTF
Methanococcus DKVVFEPAGAI------GEIKTVEMHHEQLPSAEPGDNIGFNVRGVGKKDIKRGDVLGHTTN-PPTVA---TDF
Aeropyrum     DKVVFMPPGVV------GEVRSIEMHYQQLQQAEPGDNIGFAVRGVSKSDIKRGDVAGHLDK-PPTVA---EEF
Sulfolobus    DKIVFMPVGKI------GEVRSIETHHTKIDKAEPGDNIGFNVRGVEKKDVKRGDVAGSVQN-PPTVA---DEF
Giardia       MKVVFAPTSQV------SEVKSVEMHHEELKKAGPGDNVGFNVRGLAVKDLKKGYVVGDVTNDPPVGC---KSF
Homo          MVVTFAPVNVT------TEVKSVEMHHEALSEALPGDNVGFNVKNVSVKDVRRGNVAGDSKNDPPMEA---AGF
Euglena       DVVTFAPNNLT------TEVKSVEMHHEALTEAVPGDNVGFNVKNVSVKDIRRGYVASNAKNDPAKEA---ADF
Nicotiana     MVVTFGPTGLT------TEVKSVEMHHEALQEALPGDNVGFNVKNVAVKDLKRGFVASNSKDDPAKGA---ASF
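
To make the difference between "gaps in a matrix" and "events on a branch" concrete, here is a minimal Python sketch. It is not how BAli-Phy works internally. It assumes, purely for illustration, that we already know the aligned sequences at both ends of a single branch (in practice the ancestral sequence is not observed), and it reads off the insertions and deletions that this pairwise alignment implies. The function name `indel_events` and the toy sequences are made up for this example.

```python
from typing import List, Tuple

Event = Tuple[str, int, str]  # (kind, first alignment column, residues involved)


def indel_events(parent: str, child: str, gap: str = "-") -> List[Event]:
    """List the indel events implied by a pairwise alignment along one branch.

    `parent` and `child` are two aligned rows of equal length. A run of columns
    with a gap in the parent is reported as one insertion; a run of columns
    with a gap in the child is reported as one deletion.
    """
    if len(parent) != len(child):
        raise ValueError("aligned sequences must have the same length")

    events: List[Event] = []
    kind = None      # kind of the run currently being collected, if any
    start = 0        # first column of the current run
    residues = []    # residues inserted or deleted in the current run

    for col, (p, c) in enumerate(zip(parent, child)):
        if p == gap and c != gap:
            this, residue = "insertion", c
        elif p != gap and c == gap:
            this, residue = "deletion", p
        else:
            this, residue = None, ""  # match/mismatch, or gap in both rows

        if this != kind:
            # The run type changed: close the previous run, open a new one.
            if kind is not None:
                events.append((kind, start, "".join(residues)))
            kind, start, residues = this, col, []
        if this is not None:
            residues.append(residue)

    if kind is not None:
        events.append((kind, start, "".join(residues)))
    return events


if __name__ == "__main__":
    # Toy branch: "C" is deleted and "TT" is inserted going from parent to child.
    for event in indel_events("ACG--T", "A-GTTT"):
        print(event)
```

In a gap matrix, the two rows above only say which cells are empty. Attaching them to a branch turns each run of gaps into a dated event (an insertion or a deletion), which is the kind of object an evolutionary model can assign a probability to.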
 

Expressive evolutionary models

Evolutionary models often take other models as arguments. For example, the Yang M3 model for positive selection in BAli-Phy is specified like this:

  M3[HKY, 4, F3x4] -- HKY nucleotide model, 4 categories, F3x4 codon frequency model
  M3[GTR, 3, F61]  -- GTR nucleotide model, 3 categories, F61  codon frequency model

I'm interested in expressing evolutionary models in a generic fashion using graphical models. The ultimate goal of this project is that researchers can describe new models in a flexible language without needing to write a new program.
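
As a rough illustration of what "models taking other models as arguments" means structurally, the Python sketch below parses an expression such as `M3[HKY, 4, F3x4]` into a small tree. This is not BAli-Phy's parser, and the grammar it accepts is an assumption inferred only from the two examples above: a name, optionally followed by a bracketed, comma-separated list of arguments, where each argument is an integer or another model expression. The names `Model`, `tokenize` and `parse` are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Model:
    """A model name with zero or more arguments (integers or other models)."""
    name: str
    args: List["Arg"] = field(default_factory=list)


Arg = Union[Model, int]

# Assumed token set: identifiers such as HKY or F3x4, integers, and [ ] ,
_TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[0-9]+|[\[\],])")


def tokenize(text: str) -> List[str]:
    """Split an expression into tokens, dropping anything after a `--` comment."""
    text = text.split("--", 1)[0].strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input: {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _parse_expr(tokens: List[str], i: int) -> Tuple[Arg, int]:
    """Parse one argument starting at tokens[i]; return it and the next index."""
    if i >= len(tokens):
        raise ValueError("unexpected end of expression")
    tok = tokens[i]
    if tok.isdigit():
        return int(tok), i + 1
    if tok in ("[", "]", ","):
        raise ValueError(f"expected a model name or integer, got {tok!r}")

    model = Model(tok)
    i += 1
    if i < len(tokens) and tokens[i] == "[":
        i += 1
        while True:
            arg, i = _parse_expr(tokens, i)  # arguments may themselves be models
            model.args.append(arg)
            if i >= len(tokens):
                raise ValueError("missing closing ']'")
            if tokens[i] == ",":
                i += 1
            elif tokens[i] == "]":
                i += 1
                break
            else:
                raise ValueError(f"expected ',' or ']', got {tokens[i]!r}")
    return model, i


def parse(text: str) -> Arg:
    """Parse a whole expression and reject trailing tokens."""
    tokens = tokenize(text)
    expr, i = _parse_expr(tokens, 0)
    if i != len(tokens):
        raise ValueError(f"unexpected trailing input: {tokens[i:]}")
    return expr


if __name__ == "__main__":
    print(parse("M3[HKY, 4, F3x4] -- HKY nucleotide model, 4 categories"))
    print(parse("M3[GTR, 3, F61]"))
```

The point of the tree representation is that `HKY` and `F3x4` are ordinary arguments: swapping in `GTR` or `F61` changes the data passed to `M3`, not the code that implements it.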