We show how to bound and calculate the likelihood of dynamical large deviations using evolutionary reinforcement learning. An agent, a stochastic model, propagates a continuous-time Monte Carlo trajectory and receives a reward conditioned upon the values of certain path-extensive quantities. Evolution produces progressively fitter agents, potentially allowing the calculation of a piece of a large-deviation rate function for a particular model and path-extensive quantity. For models with small state spaces, the evolutionary process acts directly on rates, and for models with large state spaces, the process acts on the weights of a neural network that parameterizes the model's rates. This approach shows how path-extensive physics problems can be considered within a framework widely used in machine learning.
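
The small-state-space case can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the two-state model with unit rates, the choice of jumps per unit time as the path-extensive quantity, the quadratic penalty and its weight, the log-normal mutation scheme, and the function names `run_trajectory` and `evolve`. The agent is a pair of reference rates. Its trajectory yields the time-averaged observable `a` and the time-averaged log-likelihood ratio `j` between the reference and original dynamics, and `j` estimates an upper bound on the rate function at `a`.

```python
import math
import random


def run_trajectory(ref_rates, orig_rates, total_time, rng):
    """Propagate a two-state jump process under the reference rates.

    Returns (a, j): jumps per unit time, and the log-likelihood ratio of
    the reference path weight to the original path weight per unit time.
    In a two-state model the escape rate of a state is its single rate.
    """
    state, t, jumps, log_ratio = 0, 0.0, 0, 0.0
    while True:
        dwell = rng.expovariate(ref_rates[state])
        excess = ref_rates[state] - orig_rates[state]
        if t + dwell >= total_time:
            # Final partial dwell: only the escape-rate term contributes.
            log_ratio -= excess * (total_time - t)
            break
        # Jump term plus dwell-time term of the path-weight ratio.
        log_ratio += math.log(ref_rates[state] / orig_rates[state])
        log_ratio -= excess * dwell
        t += dwell
        state = 1 - state
        jumps += 1
    return jumps / total_time, log_ratio / total_time


def evolve(orig_rates, target, total_time=1000.0, generations=300,
           sigma=0.1, penalty=10.0, seed=0):
    """Greedy mutate-and-select search acting directly on the rates.

    A mutation is kept when it lowers j plus a penalty for missing the
    target value of the observable (an assumed fitness, for illustration).
    """
    rng = random.Random(seed)

    def cost(rates):
        a, j = run_trajectory(rates, orig_rates, total_time, rng)
        return j + penalty * (a - target) ** 2

    best = list(orig_rates)
    best_cost = cost(best)
    for _ in range(generations):
        # Multiplicative log-normal mutation keeps every rate positive.
        trial = [r * math.exp(sigma * rng.gauss(0.0, 1.0)) for r in best]
        trial_cost = cost(trial)
        if trial_cost < best_cost:
            best, best_cost = trial, trial_cost
    # Re-evaluate the survivor on a fresh, longer trajectory so the
    # reported values are not biased by the selection step.
    a, j = run_trajectory(best, orig_rates, 5.0 * total_time, rng)
    return best, a, j


if __name__ == "__main__":
    rates, a, j = evolve([1.0, 1.0], target=2.0)
    print("reference rates:", rates)
    print("observable a:", a, "bound estimate j:", j)
```

With both original rates equal to 1, the jumps form a Poisson process of unit rate, so the output can be compared against the exact rate function J(a) = a ln a - a + 1. Because trajectories are finite, both `a` and `j` carry sampling noise, and the penalty weight controls how closely the final `a` tracks the target.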
