January 24, 2025
The past few decades have witnessed a revolution in the control of dynamical systems, with computation replacing pen-and-paper analysis. The scalability and adaptability of optimization and learning methods make them particularly powerful, but modern engineering applications involving nonclassical systems (hybrid, [human-]cyber-physical, infrastructure, decentralized / distributed, …) require generalizations of state-of-the-art algorithms. This class will provide a unified treatment of abstract concepts, scalable computational tools, and rigorous experimental evaluation for deriving and applying optimization and (reinforcement) learning techniques to control.
You will learn to do these things:
theory
– derive steepest descent algorithms for optimal control problems (OCP)
– derive policy gradient algorithms for Markov decision processes (MDP)
– prove convergence of gradient-based algorithms to local optima for OCP and MDP
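As a preview of the first theory item, the sketch below shows the adjoint-based gradient that underlies steepest descent for a generic discrete-time OCP; the symbols (f for the dynamics, c and c_T for the stage and terminal costs, λ_t for the costate, α for the step size) are illustrative choices, not necessarily the notation adopted in class.

```latex
\begin{aligned}
& \text{OCP: } \min_{u_0,\dots,u_{T-1}} \; J(u) = \sum_{t=0}^{T-1} c(x_t, u_t) + c_T(x_T)
  \quad \text{s.t. } x_{t+1} = f(x_t, u_t), \ x_0 \text{ given}. \\
& \text{Adjoint recursion: } \lambda_T = \nabla_x c_T(x_T), \qquad
  \lambda_t = \nabla_x c(x_t, u_t) + \Big(\tfrac{\partial f}{\partial x}(x_t, u_t)\Big)^{\!\top} \lambda_{t+1}. \\
& \text{Gradient: } \nabla_{u_t} J = \nabla_u c(x_t, u_t) + \Big(\tfrac{\partial f}{\partial u}(x_t, u_t)\Big)^{\!\top} \lambda_{t+1},
  \qquad \text{steepest descent: } u_t \leftarrow u_t - \alpha\, \nabla_{u_t} J.
\end{aligned}
```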
computation
– implement steepest descent for OCP
– implement receding horizon control for OCP
– implement value and policy iteration for MDP
– implement policy gradient for MDP
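To give a flavor of the computational tools, here is a minimal value-iteration sketch for a finite, tabular MDP; the setup (transition tensor P, reward matrix R, discount gamma) is an illustrative assumption, not the course's reference implementation.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Tabular value iteration (illustrative sketch, not the course's reference code).

    P: transition tensor, shape (S, A, S); P[s, a, s'] = Pr(s' | s, a).
    R: reward matrix, shape (S, A); R[s, a] = expected immediate reward.
    gamma: discount factor in [0, 1).
    Returns the optimal value function V and a greedy policy.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    while True:
        # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:  # stop when the backup is (nearly) a fixed point
            break
        V = V_new
    return V, Q.argmax(axis=1)

# Example usage on a random 5-state, 2-action MDP (purely illustrative):
# rng = np.random.default_rng(0)
# P = rng.dirichlet(np.ones(5), size=(5, 2))   # shape (5, 2, 5), rows sum to 1
# R = rng.standard_normal((5, 2))
# V, pi = value_iteration(P, R)
```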
experimentation
– assess convergence
– estimate convergence rate
– evaluate generalization
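As one hedged example of how the convergence-rate item might be carried out: if an algorithm converges linearly, the suboptimality gap behaves roughly like e_k ≈ C·ρ^k, so ρ can be estimated from the slope of log e_k against the iteration index k. The helper below and its inputs (iterate costs and a reference optimal value J_star) are illustrative assumptions, not a prescribed procedure.

```python
import numpy as np

def estimate_linear_rate(costs, J_star):
    """Estimate a linear (geometric) convergence rate from iterate costs.

    costs: objective values J_0, J_1, ..., J_K recorded along the iterations.
    J_star: reference optimal value (e.g., from a long run or a known solution).
    Assumes the suboptimality gap behaves like e_k ~ C * rho**k and returns rho.
    """
    gaps = np.asarray(costs, dtype=float) - J_star
    gaps = gaps[gaps > 0]                       # keep iterations with a positive gap
    k = np.arange(len(gaps))
    slope, _ = np.polyfit(k, np.log(gaps), 1)   # log e_k ~ log C + k * log(rho)
    return float(np.exp(slope))                 # rho < 1 indicates linear convergence
```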