
This research introduces **Exploratory Causal Inference**, a framework for discovering unknown treatment effects in high-dimensional datasets. The authors propose using **foundation models** and **sparse autoencoders (SAEs)** to transform raw data into a dictionary of interpretable latent features. To resolve the "**paradox of exploratory causal inference**", in which increased statistical power makes irrelevant, entangled neurons appear spuriously significant, they develop the **Neural Effect Search (NES)** algorithm. **NES** uses **recursive stratification** to isolate true causal signals, iteratively removing the influence of each previously discovered effect before testing the remaining features (a sketch of this loop follows below). Validated through semi-synthetic tests and ecological trials, the method reliably distinguishes **scientifically relevant outcomes** from experimental noise. Ultimately, this approach bridges the gap between **data-driven empiricism** and human-led **causal interpretation**.
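
The summary describes **NES** only at a high level, so the following is a minimal sketch of what a recursive-stratification search *could* look like, not the authors' implementation. It assumes a binary treatment, Welch t-tests as the per-feature significance test, median splits for stratification, and Bonferroni corrections; the function name `neural_effect_search` and every parameter choice here are hypothetical.

```python
import numpy as np
from scipy import stats


def neural_effect_search(treatment, features, alpha=0.05, max_effects=10):
    """Hedged sketch of a NES-style recursive stratification loop.

    treatment : (n,) binary ndarray of treatment assignments
    features  : (n, d) ndarray of SAE latent activations
    Returns the indices of features accepted as causally affected.
    """
    n, d = features.shape
    remaining = set(range(d))
    discovered = []
    strata = np.zeros(n, dtype=int)  # everyone starts in a single stratum

    while remaining and len(discovered) < max_effects:
        best_j, best_p = None, 1.0
        for j in remaining:
            # Test within each stratum so effects already discovered
            # (encoded in the strata) cannot drive significance.
            pvals = []
            for s in np.unique(strata):
                m = strata == s
                t = features[m & (treatment == 1), j]
                c = features[m & (treatment == 0), j]
                if len(t) > 1 and len(c) > 1:
                    pvals.append(stats.ttest_ind(t, c, equal_var=False).pvalue)
            # Bonferroni across strata; no testable stratum means no evidence.
            p = min(min(pvals) * len(pvals), 1.0) if pvals else 1.0
            if p < best_p:
                best_j, best_p = j, p

        # Bonferroni across remaining features guards against the
        # multiplicity that fuels the paradox: stop when nothing survives.
        if best_j is None or best_p * len(remaining) > alpha:
            break

        discovered.append(best_j)
        remaining.discard(best_j)
        # Recursive stratification: split every stratum on the newly
        # accepted feature (a median split, chosen here for simplicity)
        # before searching the remaining features again.
        split = features[:, best_j] > np.median(features[:, best_j])
        strata = strata * 2 + split.astype(int)

    return discovered
```

The key move in this sketch is that each accepted feature becomes a stratification variable, so subsequent tests run within subgroups where the already-discovered effects are held roughly constant; entangled duplicates of a found effect then lose their significance instead of flooding the results, which is the failure mode the paradox describes.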