Event Detail (Archived)

Inviting Darwin into Antibody Language Models

  • This event already took place in October 2025
  • Carson Family Auditorium (CRC)

Event Details

Type
Center for Studies in Physics and Biology Seminars
Speaker(s)
Frederick A. Matsen, Ph.D., professor, Fred Hutchinson Cancer Research Center

Antibodies are coded by nucleotide sequences that are generated by V(D)J recombination and evolve according to nucleotide mutation and selection processes. Existing antibody language models, however, focus exclusively on antibodies as strings of amino acids and are fit using the masked language modeling objective. In this talk, I will first show that fitting using this objective implicitly incorporates nucleotide-level processes as part of the protein language model, which degrades performance when predicting functional properties of antibodies. To address this limitation, we propose a new framework: a deep amino acid selection model (DASM) that predicts the selective effect of replacing every amino acid with every alternate amino acid. By fitting selection as a separate term from the mutation process, the DASM exclusively quantifies functional effects. This separation of concerns leads to substantially improved performance on standard functional benchmarks. Moreover, our model is an order of magnitude smaller and orders of magnitude faster to evaluate than existing approaches, as well as being readily interpretable. I will then describe some surprising conclusions about how natural selection works for antibodies: there is more to the story than framework vs CDRs!
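
The separation of mutation from selection described above can be pictured with a toy calculation. This is only an illustrative sketch, not the DASM itself. It assumes a multiplicative form (substitution probability = neutral mutation probability × selection factor), a uniform per-base mutation rate, and single-nucleotide changes only. The function names, the rate, and the selection factors are all hypothetical.

```python
from collections import defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def neutral_aa_probs(codon, per_base_rate):
    """Probability of reaching each alternate amino acid by ONE nucleotide
    change, before any selection. Assumes (hypothetically) a uniform rate
    split evenly over the three alternate bases at each position."""
    parent_aa = CODON_TABLE[codon]
    probs = defaultdict(float)
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            child = codon[:pos] + base + codon[pos + 1:]
            child_aa = CODON_TABLE[child]
            # Skip synonymous changes and stop codons.
            if child_aa == parent_aa or child_aa == "*":
                continue
            probs[child_aa] += per_base_rate / 3
    return dict(probs)


def substitution_probs(neutral, selection_factors):
    """Combine the mutation term with a separate selection term.
    Amino acids without a listed factor are treated as neutral (1.0)."""
    return {aa: p * selection_factors.get(aa, 1.0) for aa, p in neutral.items()}


# Serine codon AGC: arginine is reachable by three different single-base
# changes, so it is favored by mutation alone, regardless of function.
neutral = neutral_aa_probs("AGC", per_base_rate=0.03)

# Made-up selection factors: here arginine is assumed to be disfavored.
selection = {"R": 0.2, "N": 1.0}
combined = substitution_probs(neutral, selection)

for aa in sorted(neutral):
    print(aa, round(neutral[aa], 4), round(combined[aa], 4))
```

In this toy setting, a model trained only on observed amino acid substitutions would see arginine often simply because it is easy to reach at the nucleotide level. Keeping the selection factor as its own term is what lets it be read as a functional effect.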

Open to
Public
Phone
(212) 327-8636
Sponsor
Melanie Lee
(212) 327-8636
leem@rockefeller.edu