LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models

GECCO 2026

Amirmohammad Ziaei Bideh, Jonathan Gryak
The Graduate Center, CUNY; Queens College, CUNY

Overview

Genetic programming (GP) is an established approach for automated equation discovery, but it suffers from inefficient search and slow convergence. LLM-ODE addresses this by using a large language model (LLM) as a genetic variation operator that extracts patterns from elite candidate equations to guide symbolic evolution more effectively. Evaluated on 91 dynamical systems, LLM-ODE consistently outperforms standard GP baselines in search efficiency and solution quality, and it scales better to high-dimensional systems than linear or Transformer-only methods.


Brief Introduction to the Method

LLM-ODE is a hybrid symbolic regression framework that combines genetic programming (GP) with LLMs to improve search efficiency in the space of symbolic expressions. LLM-ODE replaces conventional stochastic evolutionary operators, such as random mutation and crossover, with LLM-generated proposals that exploit patterns observed in high-performing candidate equations. Rather than searching directly over full systems of equations, LLM-ODE decomposes the problem across state variables. For each state variable in a system, the algorithm independently learns a symbolic approximation of its time derivative. This decomposition reduces the combinatorial complexity of the search and allows targeted optimization of each equation.

Figure: LLM-ODE architecture
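
The per-variable decomposition and the LLM-as-variation-operator idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`search_one_variable`, `discover_system`, `stub_propose`), the seed expressions, the mean-squared-error fitness, and the finite-difference derivative estimate are all assumptions made for illustration, and the LLM call is replaced by a stub that returns a fixed pool of expressions.

```python
import numpy as np

def mse(fn, X, target):
    """Mean squared error between a candidate expression and the target derivative."""
    return float(np.mean((fn(X) - target) ** 2))

def search_one_variable(X, target, propose, seeds, n_iters=3, n_elite=2):
    """Evolve a single equation for one state variable's time derivative."""
    population = dict(seeds)  # expression string -> callable on X
    for _ in range(n_iters):
        ranked = sorted(population, key=lambda name: mse(population[name], X, target))
        elites = ranked[:n_elite]
        # Variation step: a real system would prompt an LLM with the elites here
        # instead of applying random mutation and crossover.
        population.update(propose(elites))
    return min(population, key=lambda name: mse(population[name], X, target))

def discover_system(X, t, propose, seeds):
    """Run one independent search per state variable (columns of X)."""
    dX = np.gradient(X, t, axis=0)  # assumed: finite-difference derivative estimates
    return [search_one_variable(X, dX[:, d], propose, seeds) for d in range(X.shape[1])]

def stub_propose(elites):
    """Hypothetical stand-in for the LLM: ignores the elites, returns a fixed pool."""
    return {"x1": lambda X: X[:, 1], "-x0": lambda X: -X[:, 0]}

# Hypothetical seed expressions for a two-dimensional system.
SEEDS = {"x0": lambda X: X[:, 0], "x0*x1": lambda X: X[:, 0] * X[:, 1]}

# Toy data: harmonic oscillator with dx0/dt = x1 and dx1/dt = -x0.
t = np.linspace(0.0, 10.0, 1001)
X = np.column_stack([np.cos(t), -np.sin(t)])
system = discover_system(X, t, stub_propose, SEEDS)
print(system)
```

Because each state variable is searched separately, the cost grows with the number of variables rather than with the number of joint combinations of candidate equations.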

Results

The figure shows the discovery rate of GP-based methods as a function of search iterations, for several thresholds that define a successful discovery. At every tested precision threshold, LLM-ODE methods outperform PySR in both total discovery rate and convergence speed.

Figure 1
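
As a rough illustration of how a discovery-rate curve of this kind could be computed, the Python sketch below assumes that a system counts as discovered once its best-so-far error falls at or below a threshold. The page does not state the exact error metric or threshold values, so the function name `discovery_rate`, the input layout, and the numbers are hypothetical.

```python
import numpy as np

def discovery_rate(errors, thresholds):
    """Fraction of systems discovered at each iteration, per threshold.

    errors: array of shape (n_systems, n_iterations) holding the error of the
            best candidate found at each iteration (assumed layout).
    """
    errors = np.asarray(errors, dtype=float)
    # A discovery is permanent, so track the running minimum per system.
    best_so_far = np.minimum.accumulate(errors, axis=1)
    return {tau: (best_so_far <= tau).mean(axis=0) for tau in thresholds}

# Made-up errors for two systems over three iterations.
errors = [[1.0, 0.1, 0.5],
          [1.0, 1.0, 0.001]]
rates = discovery_rate(errors, thresholds=[0.1, 0.01])
print(rates)
```

Plotting each entry of `rates` against the iteration index gives one curve per threshold.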

Example Pareto Fronts

The following figures show the Pareto fronts of the discovered equations for four systems as training progresses. Each point on a Pareto front represents a candidate system. LLM-ODE produces a richer set of solutions than the uninformed GP method, with more diverse and higher-performing systems.


Figure 1. System: Improved logistic equation (harvesting) (D=1)
Figure 2. System: Damped double well oscillator (D=2)
Figure 3. System: Sprott dissipative-conservative system (D=3)
Figure 4. System: Binocular rivalry adaptation (D=4)
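
For readers unfamiliar with Pareto fronts, the Python sketch below extracts the non-dominated candidates from a pool. It assumes the two objectives are expression complexity and fitting loss, both minimized, which is the usual convention in symbolic regression but is not stated on this page. The function name `pareto_front` and the candidate values are made up.

```python
def pareto_front(candidates):
    """Return the non-dominated candidates, ordered by increasing complexity.

    candidates: list of (complexity, loss, label) tuples; lower is better for
                both complexity and loss (assumed objectives).
    """
    front = []
    best_loss = float("inf")
    # Visit simpler candidates first; keep one only if it beats every simpler one.
    for cand in sorted(candidates, key=lambda c: (c[0], c[1])):
        if cand[1] < best_loss:
            front.append(cand)
            best_loss = cand[1]
    return front

# Made-up candidate systems: (complexity, loss, label).
pool = [(1, 0.9, "a"), (3, 0.2, "b"), (3, 0.5, "c"), (5, 0.3, "d"), (7, 0.05, "e")]
front = pareto_front(pool)
print([label for _, _, label in front])
```

A front with more points means more distinct trade-offs between simplicity and accuracy.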