3 Janelia Publications
Animals learn trajectories to rewards in both spatial, navigational contexts and relational, non-navigational contexts. Synchronous reactivation of hippocampal activity is thought to be critical for recall and evaluation of trajectories for learning. Do hippocampal representations differentially contribute to experience-dependent learning of trajectories across spatial and relational contexts? In this study, we trained mice to navigate to a hidden target in a physical arena or manipulate a joystick to a virtual target to collect delayed rewards. In a navigational context, calcium imaging in freely moving mice revealed that synchronous CA1 reactivation was retrospective and important for evaluation of prior navigational trajectories. In a non-navigational context, reactivation was prospective and important for initiation of joystick trajectories, even in the same animals trained in both contexts. Adaptation of trajectories to a new target was well-explained by a common learning algorithm in which hippocampal activity makes dissociable contributions to reinforcement learning computations depending upon spatial context.
Recent success in training artificial agents and robots derives from a combination of direct learning of behavioral policies and indirect learning via value functions. Policy learning and value learning employ distinct algorithms that optimize behavioral performance and reward prediction, respectively. In animals, behavioral learning and the role of mesolimbic dopamine signaling have been extensively evaluated with respect to reward prediction; however, to date there has been little consideration of how direct policy learning might inform our understanding. Here we used a comprehensive dataset of orofacial and body movements to understand how behavioral policies evolve as naive, head-restrained mice learned a trace conditioning paradigm. Individual differences in initial dopaminergic reward responses correlated with the emergence of learned behavioral policy, but not the emergence of putative value encoding for a predictive cue. Likewise, physiologically-calibrated manipulations of mesolimbic dopamine produced multiple effects inconsistent with value learning but predicted by a neural network-based model that used dopamine signals to set an adaptive rate, not an error signal, for behavioral policy learning. This work provides strong evidence that phasic dopamine activity can regulate direct learning of behavioral policies, expanding the explanatory power of reinforcement learning models for animal learning.
The interaction of descending neocortical outputs and subcortical premotor circuits is critical for shaping skilled movements. Two broad classes of motor cortical output projection neurons provide input to many subcortical motor areas: pyramidal tract (PT) neurons, which project throughout the neuraxis, and intratelencephalic (IT) neurons, which project within the cortex and subcortical striatum. It is unclear whether these classes are functionally in series or whether each class carries distinct components of descending motor control signals. Here, we combine large-scale neural recordings across all layers of motor cortex with cell type-specific perturbations to study cortically dependent mouse motor behaviors: kinematically variable manipulation of a joystick and a kinematically precise reach-to-grasp. We find that striatum-projecting IT neuron activity preferentially represents amplitude, whereas pons-projecting PT neurons preferentially represent the variable direction of forelimb movements. Thus, separable components of descending motor cortical commands are distributed across motor cortical projection cell classes.