Open Thoughts Project

A DataComp and Bespoke Labs community effort to curate the best open reasoning datasets.

Our latest release is OpenThoughts3, the new state-of-the-art (SOTA) open reasoning data recipe. Read the blog and the full paper for more detail!

About us

We are a team of researchers and engineers from Stanford, University of California, Berkeley, University of Washington, Bespoke Labs, UT Austin, Juelich Supercomputing Center (JSC), LAION, UCLA, UNC Chapel Hill, TUM, and Toyota Research Institute, united around building the best datasets (and thus the best models). See our previous work at datacomp.ai and mlfoundations.

Open Thoughts is supported by Bespoke Labs, NSF IFML, UT Austin Machine Learning Lab, Juelich Supercomputing Center, Toyota Research Institute, Lambda Labs, NHR Center of TU Dresden, MCML partition of the Leibniz Supercomputing Center, and the Leonardo Supercomputer of CINECA.

Announcements
