May 5, 2026

Experiments CI/CD integration

Tobias Wochinger

Run Langfuse experiments in GitHub Actions to catch quality regressions before releasing changes to production.

You can now run Langfuse experiments in GitHub Actions and catch quality regressions before they ship. The new langfuse/experiment-action tests your application against a Langfuse dataset, reports the result directly on the pull request, and tracks the experiment run in Langfuse.

Use it to block a PR when an agent's accuracy drops below a threshold, run a release gate against a versioned dataset, or make experiment results part of your existing CI checks.

GitHub Actions

Add the action to your workflow, point it at an experiment script, and choose the dataset to run the gate against. The pull request then shows whether the experiment passed, regressed, or failed to run.

- uses: langfuse/experiment-action@v1.0.0
  with:
    langfuse_public_key: ${{ secrets.LANGFUSE_PUBLIC_KEY }}
    langfuse_secret_key: ${{ secrets.LANGFUSE_SECRET_KEY }}
    langfuse_base_url: https://cloud.langfuse.com
    experiment_path: experiments/support-agent-gate
    dataset_name: support-agent-regression-set
    dataset_version: "2026-04-27T00:00:00Z"
    github_token: ${{ github.token }}

When an experiment score misses your threshold, the workflow fails so reviewers can see the regression before merging.
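
For context, here is a minimal sketch of how that step might sit inside a complete workflow file. Only the langfuse/experiment-action step and its inputs come from the snippet above. The file path, workflow name, trigger, job id, runner, checkout step, and permissions block are assumptions chosen for illustration, and the action may need additional setup steps that the CI/CD integration guide describes.

```yaml
# .github/workflows/experiment-gate.yml (hypothetical file name)
name: Experiment gate

# Assumption: run the gate on every pull request.
on:
  pull_request:

# Assumption: the action needs write access to pull requests
# in order to report the experiment result there.
permissions:
  contents: read
  pull-requests: write

jobs:
  experiment-gate:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the experiment script is available.
      - uses: actions/checkout@v4

      # Step taken from the announcement; inputs are unchanged.
      - uses: langfuse/experiment-action@v1.0.0
        with:
          langfuse_public_key: ${{ secrets.LANGFUSE_PUBLIC_KEY }}
          langfuse_secret_key: ${{ secrets.LANGFUSE_SECRET_KEY }}
          langfuse_base_url: https://cloud.langfuse.com
          experiment_path: experiments/support-agent-gate
          dataset_name: support-agent-regression-set
          dataset_version: "2026-04-27T00:00:00Z"
          github_token: ${{ github.token }}
```

Pinning dataset_version, as in the announcement's example, keeps the gate comparing against the same dataset items across runs. To make the gate blocking, the resulting check can be marked as required in the repository's branch protection settings.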

Get started

Follow the CI/CD integration guide to add the workflow, write your experiment script, and configure the thresholds to protect your production use case.

CI/CD Integration

langfuse/experiment-action
