I am a researcher at the MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) focused on machine learning. I work with Prof. David Gifford in the Computational Genomics group, developing new interpretability methods for understanding deep neural networks and investigating novel approaches for designing therapeutics using ML.
Previously, I completed my undergraduate and master's degrees at MIT, double majoring in computer science and mathematics. I also minored in economics. I graduated in June 2017 (undergrad) and June 2019 (MEng), advised by Prof. David Gifford.
My main interests broadly span machine learning and information, particularly as applied in computational biology and natural language processing. I am also intrigued by systems security and cryptography.
I have had the pleasure of working at Google Brain, Facebook, Bloomberg LP, KAYAK, and Leiden University.
I am originally from Long Island, New York. In my free time I enjoy sailing, hacking on various projects, and world traveling.
B. Carter, M. Bileschi, J. Smith, T. Sanderson, D. Bryant, D. Belanger, L. Colwell. Critiquing Protein Family Classification Models Using Sufficient Input Subsets. ICML Workshop on Computational Biology. 2019.
B. Carter, J. Mueller, S. Jain, D. Gifford. What made you do this? Understanding black-box decisions with sufficient input subsets. Artificial Intelligence and Statistics (AISTATS). 2019. [arXiv] [pdf]
B. Carter, K. Leidal, D. Neal, Z. Neely. Survey of Fully Verifiable Voting Cryptoschemes. 2016. [pdf]
S.P. Epstein, K.B. Fernandez, B.M. Carter, S.A. Abdou, N. Gadaria, P. A. Asbell. Safety and Efficacy of Ganciclovir Ophthalmic Gel for Treatment of Adenovirus Keratoconjunctivitis Utilizing Cell Culture and Animal Models. Invest. Ophthalmol. Vis. Sci. 2012;53(14):6203. [url]
Click on any of the projects below to learn more. You can also take a look at some of the contributions I have made on GitHub.
Twitter NLP Follower Prediction
This is the final project from a graduate course at MIT in Advanced Natural Language Processing (6.864), taken in Fall 2015.
The goal of the project was to use NLP-based methods to improve upon traditional machine learning approaches for predicting a Twitter user's follower count from tweet text and simple metadata. The Twitter corpus presents a variety of interesting and difficult ML and NLP problems, partly because tweets are limited to 140 characters and often lack syntactic correctness. Tweets also contain hashtags, mentions, emoticons, links, and conversational abbreviations, all of which can be used to improve NLP models. We trained various regression models to predict the number of Twitter followers from tweets and found that NLP augmentation improves prediction accuracy over purely ML approaches, though the significance of the improvement depended on the type of regression. The most accurate predictions came from models trained with hybrid NLP and ML methods.
The screenshot below shows a portion of a live stream from the Twitter feed, in which the trained model makes a live prediction of the follower count and displays the actual count alongside it. The model performs very well at predicting follower counts for unseen tweets.
The code for this project was written in Python using, among others, the Scikit-learn machine learning library.
The code and paper will be uploaded in the future.
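As a rough illustration of the tweet-specific signals mentioned above (hashtags, mentions, and links), here is a minimal sketch of a feature extractor. It is not the project's actual code, which has not been published; the function name `tweet_features`, the regular expressions, and the feature names are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for illustration; the project's real feature set is not published.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")


def tweet_features(text):
    """Return simple surface counts for a single tweet."""
    return {
        "n_chars": len(text),
        "n_hashtags": len(HASHTAG.findall(text)),
        "n_mentions": len(MENTION.findall(text)),
        "n_links": len(LINK.findall(text)),
    }


if __name__ == "__main__":
    example = "Loving #ICML talks with @alice http://t.co/x #ml"
    print(tweet_features(example))
```

Dictionaries like these could be turned into a numeric matrix (for example with scikit-learn's `DictVectorizer`) and combined with text features before fitting a regression model. Note that this simple pattern would also count a `#fragment` inside a URL as a hashtag.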
ICU Patient Predictions
In January 2016, I visited the Leiden Institute of Advanced Computer Science at Leiden University in the Netherlands. There, I worked on a project that uses machine learning techniques to give doctors greater insight into the often critical decision of administering blood transfusions to patients in the intensive care unit (ICU).
The goal of this project was to use a bank of patient data from the intensive care unit to determine whether patients received a transfusion as well as which patient features are most critical in the classification. This code addresses the problem using a variety of predictive classification models from scikit-learn. It also provides tools for data parsing, statistical analysis, graphing, and PDF output of decision tree models.
We unfortunately cannot publish any data associated with this database and have anonymized attribute and string names from within the data schema. However, this code shows the skeleton used as part of the analysis of the data and the goal of predictive classification of the patient data.
The code for the project (without dataset) can be found on GitHub.
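The sketch below illustrates the general idea of classifying patients and ranking feature importance with a scikit-learn decision tree. It is an illustration under stated assumptions, not the code from the repository: the data is synthetic, and the names `feature_a`, `feature_b`, `feature_c`, `make_synthetic`, and `rank_features` are hypothetical. A plain-text rendering of the tree stands in for the PDF output mentioned above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text


def make_synthetic(n=200, seed=0):
    """Generate synthetic data: the label depends only on the first column."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, 3))
    y = (X[:, 0] > 0.5).astype(int)
    return X, y


def rank_features(X, y, feature_names, max_depth=3, seed=0):
    """Fit a decision tree and return it with features sorted by importance."""
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    clf.fit(X, y)
    ranked = sorted(
        zip(feature_names, clf.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return clf, ranked


if __name__ == "__main__":
    names = ["feature_a", "feature_b", "feature_c"]  # hypothetical, anonymized names
    X, y = make_synthetic()
    clf, ranked = rank_features(X, y, names)
    for name, importance in ranked:
        print(f"{name}: {importance:.3f}")
    # Text rendering of the fitted tree (a stand-in for graphical output).
    print(export_text(clf, feature_names=names))
```

Because the synthetic label is a threshold on the first column only, the tree should attribute nearly all importance to `feature_a`; real clinical data would of course be far less clean.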
Academics for the Future of Science
As co-founder of Academics for the Future of Science at MIT, I built save-science.org, which allows thousands of people to contact Congress and support increased funding for scientific research.
The website is built in AngularJS and builds on multiple open-source projects that capture and present the information needed to fill in forms on Congress members' webpages. This task is rather complex because there are no direct email addresses for contacting Congress; instead, each member has their own form with non-standard fields and a CAPTCHA.
I also built a Python backend API used by the website that resolves a user's address into geographical information through the Google Streets API, as well as database tracking and visualization software to see statistics on the number of submitted forms.
You can find the project and APIs on GitHub.
Ploegh Lab Website
I designed and managed a website for the Ploegh Lab at the Whitehead Institute for Biomedical Research at MIT, hosted at ploeghlab.wi.mit.edu. This website receives hundreds of weekly page views.
During this summer, I was involved in immunology research in the lab. My research focused on engineering novel single-domain antibodies that could be used for tumor vaccine development.
In creating the website, I built a variety of customized tools, including a script that automatically fetches new articles on the Publications page. Because the lab frequently publishes research papers, it is difficult for lab managers to constantly update this page with information about the latest articles. The script runs daily and fetches any new paper data from the PubMed API, adding relevant information and a link to the paper.
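A daily fetch of this kind might look roughly like the sketch below, which queries PubMed through the NCBI E-utilities `esearch` endpoint and keeps only article IDs not already on the page. This is an assumption-laden illustration rather than the lab site's actual script: the function names, the example author query, and the idea of tracking known IDs in a list are all hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(query, max_results=20):
    """Build an esearch URL that returns matching PubMed IDs as JSON."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmode": "json",
        "retmax": max_results,
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_pmids(payload):
    """Extract the list of PubMed IDs from an esearch JSON response."""
    return json.loads(payload)["esearchresult"]["idlist"]


def new_pmids(fetched, known):
    """Return fetched IDs that are not already known, preserving order."""
    known_set = set(known)
    return [pmid for pmid in fetched if pmid not in known_set]


def fetch_new_pmids(query, known):
    """Query PubMed over the network and return only unseen IDs."""
    with urlopen(build_search_url(query)) as response:
        payload = response.read().decode("utf-8")
    return new_pmids(parse_pmids(payload), known)


if __name__ == "__main__":
    # Hypothetical author query and known-ID list, for illustration only.
    print(fetch_new_pmids("Ploegh H[Author]", known=[]))
```

Separating URL construction and response parsing from the network call keeps the pieces easy to test offline; a scheduler such as cron could run the script once a day.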
I also wrote a backend and interface so lab managers can easily update and modify the Members page of the site without touching any code on the website. This online GUI supports adding, removing, and updating lab member data, as well as photo upload and cropping/resizing.
There are also internal-facing databases and interfaces so lab members can track lab inventory and protocols, though these pages are not publicly accessible.
Fall 2011 - Spring 2013
Many behavioral studies have shown the beneficial effects of tutoring for both tutor and student. Further research has concluded that learning is enhanced with educational applications that utilize new technology.
StudentsThink was tested in a high school science setting. Teachers using the site verified that their students who participated in StudentsThink, asking and answering questions, performed better on unit exams.
My email is bcarter [at] mit [dot] edu. Feel free to also connect with me on LinkedIn.
I look forward to getting in touch!