Experiments CI/CD integration

Run Langfuse experiments in GitHub Actions to catch quality regressions before releasing changes to production.
You can now run Langfuse experiments in GitHub Actions and catch quality regressions before they ship. The new langfuse/experiment-action tests your application against a Langfuse dataset, reports the result directly on the pull request, and tracks the experiment run in Langfuse.
Use it to block a PR when an agent's accuracy drops below a threshold, run a release gate against a versioned dataset, or make experiment results part of your existing CI checks.
GitHub Actions
Add the action to your workflow, point it at an experiment script, and choose the dataset to gate on. The pull request shows whether the experiment passed, regressed, or failed to run.
- uses: langfuse/experiment-action@v1.0.0
  with:
    langfuse_public_key: ${{ secrets.LANGFUSE_PUBLIC_KEY }}
    langfuse_secret_key: ${{ secrets.LANGFUSE_SECRET_KEY }}
    langfuse_base_url: https://cloud.langfuse.com
    experiment_path: experiments/support-agent-gate
    dataset_name: support-agent-regression-set
    dataset_version: "2026-04-27T00:00:00Z"
    github_token: ${{ github.token }}

When an experiment score misses your threshold, the workflow fails so reviewers can see the regression before merging.
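To make the gating idea concrete, here is a minimal Python sketch of how a score threshold can translate into a pass or fail signal in CI. It is an illustration only, not the action's actual implementation or the Langfuse SDK: the function name gate_exit_code, the per-item accuracy scores, and the 0.8 threshold are all assumptions made for this example. The one thing it relies on is standard GitHub Actions behavior, where a step that exits with a non-zero code fails the job.

```python
from statistics import mean


def gate_exit_code(item_scores: list[float], threshold: float) -> int:
    """Return 0 when the mean score meets the threshold, otherwise 1.

    Hypothetical helper for illustration; not part of langfuse/experiment-action.
    """
    if not item_scores:
        # No scores means the experiment produced no results; treat that as a failure.
        return 1
    return 0 if mean(item_scores) >= threshold else 1


# Hypothetical per-item accuracy scores (1.0 = correct, 0.0 = incorrect).
scores = [1.0, 1.0, 0.0, 1.0]
threshold = 0.8  # assumed minimum mean accuracy for the gate

exit_code = gate_exit_code(scores, threshold)
print(f"mean accuracy {mean(scores):.2f} vs threshold {threshold:.2f}")

# A real gate script would finish with sys.exit(exit_code) so that a
# non-zero code fails the CI step; it is omitted here to keep the sketch inert.
```

In this sketch an empty score list counts as a failure, so that an experiment which silently produces nothing cannot pass the gate. Whether that matches the behavior you want is a design choice for your own thresholds.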
Get started
Follow the CI/CD integration guide to add the workflow, write your experiment script, and configure the thresholds to protect your production use case.