Our lab is developing video-based tools for quantitatively analyzing animal behavior. We use machine vision and learning methods for this purpose. Our analyses begin with video of the animals behaving. This video captures a large amount of information about the behavior of the animals, but is high-dimensional and cannot be interpreted quantitatively in its raw form. In our research, we are searching for scientifically meaningful, concise yet detailed, quantitative representations of the behavioral information contained in input videos.
These tools can be used to gain insight into nervous system function, evolution, and ethology. We use these tools to answer a wide variety of questions, using Drosophila as a model organism, such as:
- What is the behavioral effect of activating a small set of neurons?
- How does the behavior of different drosophilid species differ?
- How does a fly's social experience affect its behavior?
- What are the "rules" governing fly interactions?
Besides providing insight into basic questions in the life sciences, performing large-scale experiments to answer this diverse set of questions informs us as to the limitations of current techniques, and what new tools will have the biggest, broadest impact. Each of these experiments provides a weak signal of the "language" of behavior for Drosophila, and what methodologies are useful in illuminating this underlying structure.
We can think of video of animals behaving as a raw measure of their behavior. Clearly, much processing is necessary for this data to be scientifically interpretable. Tracking the animals - converting these videos into trajectories of their positions in each frame - goes a long way toward compressing the data into a more interpretable form. Simple statistics such as the average speed of an animal or the fraction of time spent in a certain part of the environment are illuminating, but there is more information in the data than this. What is a good representation derived from these trajectories to illustrate the behavioral effects of a given manipulation? What is a general representation that will efficiently and completely describe the behavioral effects of many scientific experiments? This question is related to the question, what is the behavioral vocabulary of a given model organism?
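The two simple statistics mentioned above can be sketched as follows. This is a minimal illustration, not code from our tracking software: the function names, the list-based trajectory format, and the circular "region" are all assumptions made for the example.

```python
import math

def average_speed(xs, ys, fps):
    """Mean speed, in distance units per second, of one tracked animal.

    xs, ys: per-frame positions (hypothetical format); fps: frames per second.
    """
    # Distance moved between each pair of consecutive frames.
    steps = [math.hypot(x1 - x0, y1 - y0)
             for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])]
    return fps * sum(steps) / len(steps)

def fraction_of_time_in_region(xs, ys, in_region):
    """Fraction of frames in which the animal is inside a region.

    in_region: any function (x, y) -> bool describing the region of interest.
    """
    inside = [in_region(x, y) for x, y in zip(xs, ys)]
    return sum(inside) / len(inside)

# Hypothetical four-frame trajectory sampled at 3 frames per second.
xs = [0.0, 3.0, 6.0, 6.0]
ys = [0.0, 4.0, 8.0, 8.0]

def near_center(x, y):
    # Assumed region: within 6 units of the arena center at (0, 0).
    return math.hypot(x, y) <= 6.0

print(average_speed(xs, ys, fps=3))
print(fraction_of_time_in_region(xs, ys, near_center))
```

Both functions reduce an entire trajectory to a single number, which is why they are easy to interpret but discard most of the information in the data.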
Our research focuses on using machine vision and learning to answer these questions. We are developing machine vision software for automatically tracking animals and developing algorithms for learning behavior detectors from manually segmented trajectories. Using these, we are encoding ethologists' knowledge of observed behavioral phenotypes. We are also using machine learning to automatically discover new behavioral phenotypes and statistics. Our data mining approaches use other signals of the neural state of an animal as a semi-supervised labeling of behavior, favoring correlations with, e.g., neuronal expression patterns in thousands of transgenic lines, neural recordings in freely behaving animals, and stimuli (or other measures of the environment of the animals) presented to the animals.
For the algorithms we develop, we are also focused on producing software that can be used by biologists without an in-depth knowledge of computer vision techniques. These algorithms must also generalize to many experiments and be accurate and robust enough to form the basis of scientific results.
We have developed a new machine learning-based system, JAABA, to enable researchers to automatically compute highly descriptive, interpretable, quantitative statistics of the behavior of animals. Through our system, the user encodes their intuition about the structure of behavior in their experiment by labeling the behavior of the animal, e.g. walking, grooming, or chasing, in a small set of video frames. JAABA uses machine learning techniques to convert these manual labels into behavior detectors that can then be used to automatically classify the behaviors of the animals in a large data set with high throughput. JAABA combines a powerful, fast machine learning method, an intuitive interface, and visualizations of the classifier to create an interactive, usable, general-purpose tool for training behavior classifiers. Combined with automatic tracking algorithms such as Ctrax, we envision that it will enable biologists to transform a qualitative understanding about behavioral differences into a quantitative statistic, then systematically look for signals only detectable through automatic, high-throughput, quantitative analysis.
Through a series of ground-truthing experiments, we showed that our system can be used by scientists without expertise in machine learning to independently train accurate behavior detectors. JAABA is general purpose, and we showed that it can be used to easily create a wide variety of accurate individual and social behavior detectors for three important model organisms: flies, larvae, and mice. We also showed that it can be used to create behavior classifiers robust enough to be applied successfully to a large, phenotypically diverse data set: our neural activation screen data.
To create a new behavior classifier, the user begins by labeling the behavior of animals in a small number of frames in which they are certain of the correct behavior label. They then push the "Train" button to pass these labels to the machine learning algorithm, which, within a few seconds, creates an initial behavior classifier that predicts the behavior label in all frames. The user can then examine these results, and find and label frames that the classifier predicts incorrectly and for which they are confident of the correct label. They can then retrain the classifier, and repeat.
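The label, train, inspect, and relabel cycle can be illustrated with a toy simulation. This is only a sketch under strong assumptions: the classifier here is a one-feature threshold rather than JAABA's learning algorithm, the "user" is simulated by a list of true labels, and every name in the code is hypothetical.

```python
def train_stump(features, labels):
    """Toy stand-in for the learning algorithm: a threshold placed halfway
    between the labeled positive and labeled negative frames."""
    pos = [features[i] for i, y in labels.items() if y == 1]
    neg = [features[i] for i, y in labels.items() if y == -1]
    return (min(pos) + max(neg)) / 2

def predict(features, threshold):
    """Predict +1 (behavior) or -1 (not behavior) for every frame."""
    return [1 if f > threshold else -1 for f in features]

def interactive_training(features, truth, labels):
    """Repeat: train, predict all frames, label one mistake, retrain.

    truth simulates the user's knowledge; it is assumed to be separable
    by a single threshold so that the loop terminates.
    """
    labels = dict(labels)
    rounds = 0
    while True:
        threshold = train_stump(features, labels)      # the "Train" button
        predictions = predict(features, threshold)     # predict every frame
        mistakes = [i for i, (p, t) in enumerate(zip(predictions, truth))
                    if p != t]
        if not mistakes:
            return threshold, labels, rounds
        # The simulated user labels one frame the classifier got wrong.
        labels[mistakes[0]] = truth[mistakes[0]]
        rounds += 1

# Hypothetical per-frame speeds; "walking" is assumed to mean speed > 1.0.
features = [0.1, 0.4, 0.9, 1.2, 1.8, 2.5, 0.7, 1.1]
truth = [1 if f > 1.0 else -1 for f in features]
threshold, labels, rounds = interactive_training(features, truth, {0: -1, 5: 1})
print(rounds, sorted(labels))
```

In this toy run, two initial labels plus two corrections are enough to classify all eight frames correctly, which mirrors the idea that a few well-chosen labels go a long way.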
JAABA is a practical implementation of active learning, a subfield of machine learning in which only the most informative training examples are labeled. Traditionally, these are the unlabeled examples on which the current classifier is most unsure. Because of the fuzzy nature of behavior, frames on which the classifier is unsure are often frames for which the behavior label is truly ambiguous. Thus, we instead employ an interactive approach in which the user, aided by visualization and navigation tools for sifting through sets of videos of hundreds of animals and millions of frames, finds and labels frames for which they are certain of the correct label and the current classifier predicts incorrectly. This JAABA interface also increases the communication between the user and the learning algorithm, and allows users with little knowledge of machine learning to understand what the algorithm is capable of, and diagnose why a classifier is misclassifying a given frame.
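The difference between the two ways of choosing frames to label can be sketched as below. The code assumes a classifier that outputs a signed score per frame, with the sign giving the prediction and the magnitude giving the confidence; that representation and the function names are assumptions for illustration only.

```python
def most_uncertain_frames(scores, k):
    """Traditional active learning: the k frames whose score is closest to 0,
    i.e. the frames on which the classifier is least sure."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]))[:k]

def confident_mistakes(scores, user_labels):
    """Interactive alternative: frames the user labeled with certainty
    (+1 or -1) that the current classifier predicts incorrectly."""
    return [i for i, y in user_labels.items() if y * scores[i] <= 0]

# Hypothetical classifier scores for five frames.
scores = [2.1, 0.1, -0.2, -1.5, 0.9]
# Frames the user is certain about; ambiguous frames 1 and 2 are left unlabeled.
user_labels = {0: 1, 3: 1, 4: -1}

print(most_uncertain_frames(scores, k=2))
print(confident_mistakes(scores, user_labels))
```

In this example the uncertainty criterion selects frames 1 and 2, which the user considers ambiguous, while the interactive criterion selects frames 3 and 4, where the classifier is wrong and the user is sure of the label.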
We have developed a high-throughput system for quantifying the locomotion and social behavior of flies with both breadth and depth. This system was developed as part of the Fly Olympiad project at Janelia. We screened the behavioral effects of TrpA neural activation at a rate of 75 GAL4 lines per week over a period of 1.5 years.
In our system, we record video of groups of flies freely behaving in an open-field walking arena, the Fly Bowl. The Fly Bowl is a chamber for observing the locomotion and social behaviors of multiple flies. Its main difference from more standard arenas is that it was designed with automated tracking and behavior analysis algorithms in mind. We have made significant modifications to the original design to increase throughput, consistency, and image quality. These modifications allow us to use Ctrax to track individual flies’ positions accurately in a completely automated way. Our system consists of 8 bowls that we record from simultaneously.
To reduce disk storage, we developed a MATLAB-based data capture system which compresses our videos by a factor of 80 during recording and is lossless for the tracking algorithm (in our screen, we capture 11 TB of raw video data per week). Our data capture system also captures and monitors metadata information about the environment and the preparation of the flies to ensure that results are repeatable and data are comparable over long periods of time across the different rigs.
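As a rough worked example of what this compression means for storage, the weekly figures above can be combined as follows. The sketch assumes that the 11 TB figure is the uncompressed volume and that decimal units are used (1 TB = 1000 GB).

```python
RAW_TB_PER_WEEK = 11       # raw video captured per week, from the text
COMPRESSION_FACTOR = 80    # compression achieved during recording, from the text

# Assumes decimal units: 1 TB = 1000 GB.
stored_tb_per_week = RAW_TB_PER_WEEK / COMPRESSION_FACTOR
stored_gb_per_week = stored_tb_per_week * 1000

print(f"{stored_gb_per_week:.1f} GB stored per week")
```

Under these assumptions, each week of recording occupies roughly 137.5 GB on disk instead of 11 TB.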
To provide oversight for collection of this large data set, we developed visualization tools for examining the stability of experimental conditions and behavior statistics over time, and ensuring that we understood and accounted for correlations between recorded metadata and behavior.
To analyze the data, we developed an automatic pipeline that uses the cluster at Janelia. Typically, data are analyzed within 24 hours of being collected and the results are stored in a database. The first step in our analysis is to track the positions of individual flies using an updated version of Ctrax. From the trajectories, we compute 85 time series of “per-frame” behavior measurements, for instance the fly’s speed in each frame, or the distance from the fly to the closest other fly in each frame. Our first level of characterization of the flies' behavior consists of statistics of these measurements: in these examples, the average speed and the average distance to the closest fly, or histograms of these values.
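One of the per-frame measurements named above, the distance to the closest other fly, can be sketched together with its average and histogram. The nested-list trajectory format and the function names are assumptions for this example and do not reflect the data structures used in our pipeline.

```python
import math

def distance_to_closest_fly(positions):
    """Per-frame distance from each fly to its nearest neighbor.

    positions[t][i] is the (x, y) position of fly i in frame t
    (hypothetical format). Requires at least two flies per frame.
    Returns dist, where dist[t][i] is the distance for fly i in frame t.
    """
    dist = []
    for frame in positions:
        row = []
        for i, (xi, yi) in enumerate(frame):
            row.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(frame) if j != i))
        dist.append(row)
    return dist

def histogram(values, edges):
    """Count values in half-open bins [edges[b], edges[b + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for b in range(len(counts)):
            if edges[b] <= v < edges[b + 1]:
                counts[b] += 1
                break
    return counts

# Two hypothetical frames with three flies each.
positions = [
    [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)],
    [(0.0, 0.0), (0.0, 6.0), (8.0, 6.0)],
]
dist = distance_to_closest_fly(positions)
fly0 = [row[0] for row in dist]          # time series for fly 0
print(fly0, sum(fly0) / len(fly0))       # per-frame values and their average
print(histogram(fly0, edges=[0, 5, 10]))
```

The per-frame list is the time series; the mean and the histogram are the summary statistics computed from it.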
Next, we use behavior classifiers trained using JAABA to compute discrete behavior labels for each fly and frame, labels of whether the fly is or is not performing each of a suite of behaviors, e.g. walking, chasing, and wing grooming. We can then represent the behavior of the flies in terms of the fraction of time that they perform a given behavior. We can also use these discrete behavior categories to segment the trajectories into similar types of data that can be analyzed together. Then, we can look at statistics of our per-frame measurements within these segments, e.g. average speed while chasing, or distance to the closest fly at the beginning of a jump. This allows us to remove the effects of common behaviors such as walking and stopping when scrutinizing less common but highly stereotyped behaviors such as courtship and grooming.
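Given per-frame behavior labels, the statistics described above can be sketched as follows. The boolean label list, the speed values, and the function names are hypothetical; in practice the labels would come from trained classifiers.

```python
def fraction_of_time(is_behavior):
    """Fraction of frames in which the behavior is being performed."""
    return sum(is_behavior) / len(is_behavior)

def mean_during(values, is_behavior):
    """Mean of a per-frame measurement restricted to behavior frames."""
    selected = [v for v, b in zip(values, is_behavior) if b]
    return sum(selected) / len(selected) if selected else float("nan")

def bout_starts(is_behavior):
    """Indices of frames in which a bout of the behavior begins."""
    return [t for t, b in enumerate(is_behavior)
            if b and (t == 0 or not is_behavior[t - 1])]

# Hypothetical labels and speeds for one fly over six frames.
chasing = [False, True, True, False, True, False]
speed = [1.0, 4.0, 6.0, 2.0, 5.0, 1.0]

print(fraction_of_time(chasing))      # fraction of time spent chasing
print(mean_during(speed, chasing))    # average speed while chasing
print(bout_starts(chasing))           # frames at which chasing bouts begin
```

Indexing any other per-frame measurement at the frames returned by `bout_starts` gives statistics at the beginning of a behavior, such as the distance to the closest fly at the start of a jump.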
As part of the Fly Olympiad team project at Janelia, we performed a high-throughput, large-scale activation screen of 2,200 transgenic lines of adult Drosophila from the Rubin GAL4 collection. For each of these lines of flies, a different sparse subset of neurons expresses the temperature-sensitive TRPA1 neural effector using the GAL4-UAS system. When the flies are at an elevated temperature, this sparse subset of neurons is activated. Our goal in the Fly Bowl screen is to understand how exciting these neurons affects behavior. More specifically, for each line, we are producing a quantitative description of how the behavior of flies from this line differs from the behavior of flies from a genetic control. These GAL4 lines are widely used at Janelia and elsewhere, and the behavior annotation provided by our screen will be a useful initial behavioral characterization in many studies.
In addition, the Fly Light team project at Janelia has imaged the dissected nervous systems of flies from each of these GAL4 lines to determine the expression pattern. Using the Fly Bowl screen data and the Fly Light data, we are performing a meta-analysis to determine which brain regions and neural circuits are involved in which behaviors. This analysis has the potential to provide insight into the organization of the fly brain and the function of individual neurons and anatomical regions.
We have screened a total of 2,200 lines. For 70% of these lines, data were collected on at least 2 separate occasions (different crosses at different times of year). Each time a line was tested, we collected 4 videos of 10 male and 10 female flies for 1,000 seconds each. In addition, we collected videos of our control flies multiple times per day. We reliably screened at a rate of 75 GAL4 lines per week over a 1.5 year period. In total, we collected, tracked, and analyzed 225 days of video of a total of 380,000 flies. This data set has been curated using a set of automatic checks to look for errors during collection or processing. We thus have a large, interesting, high quality data set to mine.
The comprehensive reconstruction of cell lineages in complex multicellular organisms is a central goal of developmental biology. We present an open-source computational framework for the segmentation and tracking of cell nuclei with high accuracy and speed. We demonstrate its (i) generality by reconstructing cell lineages in four-dimensional, terabyte-sized image data sets of fruit fly, zebrafish and mouse embryos acquired with three types of fluorescence microscopes, (ii) scalability by analyzing advanced stages of development with up to 20,000 cells per time point at 26,000 cells min−1 on a single computer workstation and (iii) ease of use by adjusting only two parameters across all data sets and providing visualization and editing tools for efficient data curation. Our approach achieves on average 97.0% linkage accuracy across all species and imaging modalities. Using our system, we performed the first cell lineage reconstruction of early Drosophila melanogaster nervous system development, revealing neuroblast dynamics throughout an entire embryo.
Automated image-based tracking and its application in ecology
A. I. Dell, J. A. Bender, K. Branson, I. D. Couzin, G. G. Polavieja, L. P. J. J. Noldus, A. Pérez-Escudero, P. Perona, A. D. Straw, M. Wikelski, and U. Brose. Trends in Ecology & Evolution, 29 (2014)
The behavior of individuals determines the strength and outcome of ecological interactions, which drive population, community, and ecosystem organization. Bio-logging, such as telemetry and animal-borne imaging, provides essential individual viewpoints, tracks, and life histories, but requires capture of individuals and is often impractical to scale. Recent developments in automated image-based tracking offer opportunities to remotely quantify and understand individual behavior at scales and resolutions not previously possible, providing an essential supplement to other tracking methodologies in ecology. Automated image-based tracking should continue to advance the field of ecology by enabling better understanding of the linkages between individual and higher-level ecological processes, via high-throughput quantitative analysis of complex ecological patterns and processes across scales, including analysis of environmental drivers.
We present a machine learning–based system for automatically computing interpretable, quantitative measures of animal behavior. Through our interactive system, users encode their intuition about behavior by annotating a small set of video frames. These manual labels are converted into classifiers that can automatically annotate behaviors in screen-scale data sets. Our general-purpose system can create a variety of accurate individual and social behavior classifiers for different organisms, including mice and adult and larval Drosophila.
Learning animal social behavior from trajectory features
E. Eyjolfsdottir, X. P. Burgos-Artizzu, S. Branson, K. Branson, D. Anderson, and P. Perona. Workshop on Visual Observation and Analysis of Animal and Insect Behavior (2012)
An important role of visual systems is to detect nearby predators, prey, and potential mates, which may be distinguished in part by their motion. When an animal is at rest, an object moving in any direction may easily be detected by motion-sensitive visual circuits [2, 3]. During locomotion, however, this strategy is compromised because the observer must detect a moving object within the pattern of optic flow created by its own motion through the stationary background. However, objects that move creating back-to-front (regressive) motion may be unambiguously distinguished from stationary objects because forward locomotion creates only front-to-back (progressive) optic flow. Thus, moving animals should exhibit an enhanced sensitivity to regressively moving objects. We explicitly tested this hypothesis by constructing a simple fly-sized robot that was programmed to interact with a real fly. Our measurements indicate that whereas walking female flies freeze in response to a regressively moving object, they ignore a progressively moving one. Regressive motion salience also explains observations of behaviors exhibited by pairs of walking flies. Because the assumptions underlying the regressive motion salience hypothesis are general, we suspect that the behavior we have observed in Drosophila may be widespread among eyed, motile organisms.
Prior Publications (3)
Automated tracking of animal movement allows analyses that would not otherwise be possible by providing great quantities of data. The additional capability of tracking in real time--with minimal latency--opens up the experimental possibility of manipulating sensory feedback, thus allowing detailed explorations of the neural basis for control of behaviour. Here, we describe a system capable of tracking the three-dimensional position and body orientation of animals such as flies and birds. The system operates with less than 40 ms latency and can track multiple animals simultaneously. To achieve these results, a multi-target tracking algorithm was developed based on the extended Kalman filter and the nearest neighbour standard filter data association algorithm. In one implementation, an 11-camera system is capable of tracking three flies simultaneously at 60 frames per second using a gigabit network of nine standard Intel Pentium 4 and Core 2 Duo computers. This manuscript presents the rationale and details of the algorithms employed and shows three implementations of the system. An experiment was performed using the tracking system to measure the effect of visual contrast on the flight speed of Drosophila melanogaster. At low contrasts, speed is more variable and faster on average than at high contrasts. Thus, the system is already a useful tool to study the neurobiology and behaviour of freely flying animals. If combined with other techniques, such as 'virtual reality'-type computer graphics or genetic manipulation, the tracking system would offer a powerful new way to investigate the biology of flying animals.
We present a camera-based method for automatically quantifying the individual and social behaviors of fruit flies, Drosophila melanogaster, interacting in a planar arena. Our system includes machine-vision algorithms that accurately track many individuals without swapping identities and classification algorithms that detect behaviors. The data may be represented as an ethogram that plots the time course of behaviors exhibited by each fly or as a vector that concisely captures the statistical properties of all behaviors displayed in a given period. We found that behavioral differences between individuals were consistent over time and were sufficient to accurately predict gender and genotype. In addition, we found that the relative positions of flies during social interactions vary according to gender, genotype and social environment. We expect that our software, which permits high-throughput screening, will complement existing molecular methods available in Drosophila, facilitating new investigations into the genetic and cellular basis of behavior.
JAABA is a machine learning-based system that enables researchers to automatically compute interpretable, quantitative statistics describing video of behaving animals. Through our system, users encode their intuition about the structure of behavior by labeling the behavior of the animal, e.g. walking, grooming, or following, in a small set of video frames. JAABA uses machine learning techniques to convert these manual labels into behavior detectors that can then be used to automatically classify the behaviors of animals in large data sets with high throughput. Our system combines an intuitive graphical user interface, a fast and powerful machine learning algorithm, and visualizations of the classifier into an interactive, usable system for creating automatic behavior detectors. JAABA is complementary to video-based tracking methods, and we envision that it will facilitate extraction of detailed, scientifically meaningful measurements of the behavioral effects in large experiments.
JAABA is an open-source, freely available program developed by members of the Branson lab. It is available for download at: http://jaaba.sourceforge.net/.
Ctrax was designed to allow high-throughput, quantitative analysis of behavior in freely moving flies. Our primary goal in this project is to provide quantitative behavior analysis tools to the neuroethology community; thus, we've endeavored to make the system adaptable to other labs' setups. We have assessed the quality of the tracking results for our setup, and found that it can maintain fly identities indefinitely with minimal supervision, and on average for 1.5 fly-hours automatically.
To further compensate for identity and other tracking errors, we provide the FixErrors Matlab GUI that identifies suspicious sequences of frames and allows a user to correct any tracking errors. We also distribute the BehavioralMicroarray Matlab Toolbox for defining and detecting a broad palette of individual and social behaviors. This software inputs the trajectories output by Ctrax and computes descriptive statistics of the behavior of each individual fly. We provide software for three proof-of-concept experiments to show the potential of the Ctrax software and our behavior detectors.
Ctrax is available for download at http://ctrax.sourceforge.net/.
We are seeking outstanding Postdoctoral Researchers and PhD Students to develop new machine vision and learning algorithms for cutting-edge neuroscience research. In particular, we are looking for computer scientists with expertise in machine vision and learning interested in both developing new algorithms as well as robust, usable systems that will impact the field of neuroscience.
We are primarily a machine vision and learning lab, developing new technologies for neuroscience research. We are focused on the problem of jointly learning the vocabulary of animal behavior and its implementation in the nervous system, in particular developing machine vision and learning algorithms toward this goal. To do this, we are:
- Combining optogenetic/thermogenetic techniques for manipulating the activity of neurons, new machine vision techniques for extracting the behavioral effects of these manipulations from video, and data mining techniques for discovering the underlying structure.
- Jointly learning behavior-anatomy structure from video of animals behaving and video of the neuronal activity (measured via calcium imaging).
The machine vision and learning systems we develop toward these goals are general-purpose, and used by biologists across a variety of disciplines.
Relevant fields of machine vision and learning include:
- Learning-based pose estimation for high-resolution videos of animals behaving.
- Fully and weakly supervised activity recognition.
- Multi-view clustering/structure discovery.
- Semi-supervised discovery of structure using a "human-in-the-loop" learning paradigm.
The successful candidates for these positions will have:
- The creativity to develop new algorithms and learning paradigms for using machine vision and learning for scientific discovery.
- Practical knowledge of the current state-of-the-art in machine vision and learning.
- The commitment and dedication to develop robust, working systems.
- Interest in applications of computer science to biology, and the new discoveries they enable.
- Strong programming expertise in MATLAB and C, C++, Java, or Python.
The application of machine vision and learning to large neuroscience video data sets is an emerging field with great potential impact. New technologies such as optogenetics, calcium imaging, and advances in microscopy have enabled the collection of huge image data sets containing detailed information about the structure and function of the nervous system. Because of the scope and complexity of these data sets, machine vision and learning are of vital importance in extracting scientific understanding. The importance of this field of study has recently been highlighted in Obama's BRAIN Initiative Report (in particular, Section II-4).
HHMI's Janelia Farm Research Campus is a pioneering research center near Washington, D.C., where scientists from many disciplines, from computer science to physics to neuroscience, develop and use emerging and innovative technologies to pursue neuroscience's most challenging problems. Established in 2006, Janelia was modeled after institutes like Bell Labs, with small groups collaborating on high-risk, innovative, big science. All research is internally funded by the Howard Hughes Medical Institute, and salaries and benefits are highly competitive. Postdoc positions at Janelia are renewable one-year appointments. For information about Janelia, please visit: http://www.janelia.org/about-janelia.
Applicants should email a CV and a cover letter summarizing their research experience and interests to Kristin Branson at email@example.com.
- June 5, 2014: Our work was referenced as one of the primary research directions in Obama's BRAIN Initiative Report: BRAIN 2025.
- May 1, 2014: We were profiled in the Cell article "40 under 40".
- Spring, 2013: JAABA was discussed in the HHMI Bulletin.
- January 10, 2013: The JAABA article was recommended by the Faculty of 1000.