Benjamin Redelings

Evolution, Computation, Statistics


News

May 2017: BAli-Phy 3.0-beta1 released

Feb 2017: OpenTree manuscript accepted in PeerJ
"A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species" [Preprint v1]

Jun 2016: Presentation at Evolution 2016 meeting in Austin, TX
"New methods for constructing the supertree of life"

Nov 2015: Manuscript published in Genetics
"A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation." [PDF]

Research

I use mathematics and computational techniques to answer questions in evolutionary genetics. The mathematics is mostly Bayesian inference and stochastic process modeling. The computation is primarily Markov chain Monte Carlo (MCMC). The evolutionary genetics is focused on phylogenetics and multiple sequence alignment, but also includes some coalescent theory.

Work

As of Sep 2015, I'm doing a part-time remote postdoc with Mark Holder on the Open Tree of Life project. As of Feb 2016, I'm also working part time with Greg Wray at Duke on malaria genomics and PacBio DNA sequencing.

Phylogenetic Alignment

I develop model-based methods for inferring multiple sequence alignments (MSAs) that place insertions and deletions on specific branches of the evolutionary tree, instead of just placing gaps in a matrix. These methods co-estimate the evolutionary tree along with the alignment. I develop the MCMC software BAli-Phy to perform this estimation.

              uncertain                                          certain
              ....310.......320.......330.......340.......350.......360.......370.......
Thermotoga    DEVEIIGLSYEIKKTV---VTSVEMFRKELDEGIAGDNVGCLLRGIDKDEVERGQVLA-----APGSIKPHKRF
Anacystis     ETIEIVGLR-DTRSTT---VTGVEMFQKTLDEGLAGDNVGLLLRGIQKTDIERGMVLA-----KPGSITPHTKF
Escheria      EEVEIVGIK-ETQKST---CTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLA-----KPGTIKPHTKF
Pyrococcus    EVVIFEPASTIFHKPIQGEVKSIEMHHEPLEEALPGDNIGFNVRGVSKNDIKRGDVAGHTTN-PPTVVRTKDTF
Halobacterium DNVSFQPSDVG------GEVKTIEMHHEEVPNAEPGDNVGFNVRGIGKDDIRRGDVCGPADD-PPSVA---DTF
Methanococcus DKVVFEPAGAI------GEIKTVEMHHEQLPSAEPGDNIGFNVRGVGKKDIKRGDVLGHTTN-PPTVA---TDF
Aeropyrum     DKVVFMPPGVV------GEVRSIEMHYQQLQQAEPGDNIGFAVRGVSKSDIKRGDVAGHLDK-PPTVA---EEF
Sulfolobus    DKIVFMPVGKI------GEVRSIETHHTKIDKAEPGDNIGFNVRGVEKKDVKRGDVAGSVQN-PPTVA---DEF
Giardia       MKVVFAPTSQV------SEVKSVEMHHEELKKAGPGDNVGFNVRGLAVKDLKKGYVVGDVTNDPPVGC---KSF
Homo          MVVTFAPVNVT------TEVKSVEMHHEALSEALPGDNVGFNVKNVSVKDVRRGNVAGDSKNDPPMEA---AGF
Euglena       DVVTFAPNNLT------TEVKSVEMHHEALTEAVPGDNVGFNVKNVSVKDIRRGYVASNAKNDPAKEA---ADF
Nicotiana     MVVTFGPTGLT------TEVKSVEMHHEALQEALPGDNVGFNVKNVAVKDLKRGFVASNSKDDPAKGA---ASF

Expressive language for evolutionary models

Evolutionary models for DNA substitution are often constructed by using other, smaller models as building blocks. For example, the GTR+Gamma model of nucleotide substitution is constructed by combining the GTR model with a gamma distribution of rate heterogeneity across sites. In BAli-Phy, this is currently written as GTR+gamma or gamma[submodel=GTR].
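To make the building-block idea concrete, here is a minimal Python sketch of how a rate-heterogeneity layer can wrap a GTR model. It is an illustration only and not BAli-Phy's implementation. The function names (`gtr_rate_matrix`, `with_rate_categories`), the exchangeabilities, the base frequencies, and the four category rates are all hypothetical; in a real GTR+Gamma model the category rates would come from discretizing a gamma distribution with a given shape parameter, a step this sketch skips.

```python
NUCS = "ACGT"

def gtr_rate_matrix(exch, freqs):
    """Build a GTR rate matrix with Q[i][j] = s_ij * pi_j for i != j,
    then rescale it so the expected substitution rate at equilibrium is 1."""
    n = len(NUCS)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Exchangeabilities are symmetric, so look them up by sorted pair.
                key = "".join(sorted((NUCS[i], NUCS[j])))
                q[i][j] = exch[key] * freqs[j]
        # Each row of a rate matrix sums to zero.
        q[i][i] = -sum(q[i])
    expected_rate = -sum(freqs[i] * q[i][i] for i in range(n))
    return [[x / expected_rate for x in row] for row in q]

def with_rate_categories(q, rates):
    """Wrap a base model in an equal-weight mixture of rate-scaled copies.
    The rates are divided by their mean so the mixture's average rate stays 1."""
    mean_rate = sum(rates) / len(rates)
    weight = 1.0 / len(rates)
    return [(weight, [[(r / mean_rate) * x for x in row] for row in q])
            for r in rates]

# Hypothetical parameter values, chosen only for illustration.
freqs = [0.3, 0.2, 0.2, 0.3]                      # pi_A, pi_C, pi_G, pi_T
exch = {"AC": 1.0, "AG": 4.0, "AT": 1.0,
        "CG": 1.0, "CT": 4.0, "GT": 1.0}          # symmetric exchangeabilities
q = gtr_rate_matrix(exch, freqs)                  # the "GTR" building block
mixture = with_rate_categories(q, [0.1, 0.5, 1.0, 2.4])  # the rate layer on top
```

The rate layer never inspects how `q` was built, so any other nucleotide model that produces a rate matrix could be substituted for GTR without changing `with_rate_categories`.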

Likewise, the Yang M3 model for positive selection depends on a nucleotide exchange model such as GTR, a codon frequency model such as F3x4, and a number of omega categories. In BAli-Phy it can be specified like this:

  M3[HKY, 4, F3x4] -- HKY nucleotide model, 4 categories, F3x4 codon frequency model
  M3[GTR, 3, F61]  -- GTR nucleotide model, 3 categories, F61  codon frequency model
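Model strings like the ones above have a simple nested structure. The following Python sketch parses them into a tree. It is a toy reader written for this page's examples, not BAli-Phy's actual parser: the `Model` class, the grammar, and the rule that `A+B` is shorthand for `B[submodel=A]` (generalized from the single GTR+gamma example) are all assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Union

@dataclass
class Model:
    name: str
    args: List["Arg"] = field(default_factory=list)        # positional arguments
    kwargs: Dict[str, "Arg"] = field(default_factory=dict)  # keyword arguments

Arg = Union[Model, int]
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[\[\],=+])")
PUNCTUATION = {"[", "]", ",", "=", "+"}

def tokenize(text):
    text = text.split("--")[0].strip()   # drop a trailing "-- comment"
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def term():
        tok = take()
        if tok in PUNCTUATION:
            raise ValueError(f"expected a name or number, got {tok!r}")
        if tok.isdigit():
            return int(tok)              # e.g. the number of omega categories
        model = Model(tok)
        if peek() == "[":
            take("[")
            while True:
                if pos + 1 < len(tokens) and tokens[pos + 1] == "=":
                    key = take()         # keyword argument: name=value
                    take("=")
                    model.kwargs[key] = expr()
                else:
                    model.args.append(expr())
                if peek() == ",":
                    take(",")
                    continue
                take("]")
                break
        return model

    def expr():
        left = term()
        while peek() == "+":             # assumed sugar: A+B == B[submodel=A]
            take("+")
            right = term()
            if not isinstance(right, Model):
                raise ValueError("right-hand side of '+' must be a model")
            right.kwargs["submodel"] = left
            left = right
        return left

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token {peek()!r}")
    return result

m3 = parse("M3[HKY, 4, F3x4] -- HKY nucleotide model, 4 categories")
same = parse("GTR+gamma") == parse("gamma[submodel=GTR]")
```

Under these assumptions both spellings of GTR+Gamma produce the same tree, and the M3 example becomes a `Model` named `M3` with three positional arguments.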

I'm interested in creating an expressive language to describe evolutionary models that can be used in BAli-Phy, so that new models don't require writing new software. I will accomplish this by extending graphical models.