A fundamental challenge in machine learning is to take an observation and make a prediction about future behaviour. In this work, we consider a model system that evolves according to a set of rules hidden from the machine learning algorithm. Can we observe this system and predict what it will do next? Using maximum likelihood, we train a transformer (a powerful type of neural network architecture) to determine the rates of the processes that occurred during the observation window. The transformer learns these rates accurately enough that we can reliably predict the behaviour of the system over a wide range of conditions, well beyond those seen during the training process.