feat(api): update API spec from langfuse/langfuse c82119e #1693

Merged
wochinge merged 2 commits into main from api-spec-bot-c82119e
Jun 9, 2026

@langfuse-bot langfuse-bot commented Jun 9, 2026

Collaborator

Greptile Summary

This is a Fern-generated API spec sync (c82119e) that introduces two major additions: a new scores_v3 sub-client with cursor-based pagination and a polymorphic value field, and a refactor of the evaluator/evaluation-rule types to support a second evaluator kind (code) alongside the existing llm_as_judge type.

  • New scores_v3 module: adds ScoresV3Client / AsyncScoresV3Client with a single get_many_v3() method and a full set of response types (ScoreV3, ScoreSubjectV3, etc.) using discriminated unions on dataType.
  • Evaluator type split: Evaluator, CreateEvaluatorRequest, and CreateEvaluationRuleRequest are converted from single pydantic model classes to discriminated Union type aliases (_LlmAsJudge | _Code), introducing concrete subtypes exported alongside the union aliases. A parallel inheritance-based hierarchy (EvaluatorBase → LlmAsJudgeEvaluator / CodeEvaluator) is also added.
  • EvaluatorType.visit() breaking change: the new required code callable parameter will raise TypeError for any existing callers that only supplied llm_as_judge, which is worth noting even for an unstable API.
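The discriminated-union shape described above can be sketched with the standard library alone. This is a simplified illustration, not the Fern-generated pydantic code: the class names `EvaluatorLlmAsJudge` and `EvaluatorCode`, the helper `parse_evaluator`, and the reduced field sets are assumptions made for the example; only the `type` tag values and the `prompt` / `source_code` fields come from the summary.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Union


@dataclass
class EvaluatorLlmAsJudge:
    # Hypothetical, trimmed-down stand-in for the generated LLM-as-judge variant.
    prompt: str
    type: Literal["llm_as_judge"] = "llm_as_judge"


@dataclass
class EvaluatorCode:
    # Hypothetical, trimmed-down stand-in for the generated code variant.
    source_code: str
    type: Literal["code"] = "code"


# The public name becomes a type alias over the variants, not a class.
Evaluator = Union[EvaluatorLlmAsJudge, EvaluatorCode]


def parse_evaluator(payload: Dict[str, Any]) -> Evaluator:
    """Pick the concrete variant by inspecting the `type` discriminator."""
    kind = payload.get("type")
    if kind == "llm_as_judge":
        return EvaluatorLlmAsJudge(prompt=payload["prompt"])
    if kind == "code":
        return EvaluatorCode(source_code=payload["source_code"])
    raise ValueError(f"unknown evaluator type: {kind!r}")
```

One consequence of this shape is that code which previously constructed or `isinstance`-checked a single `Evaluator` class would need to target a concrete variant instead, since a `Union` alias cannot be instantiated.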

Confidence Score: 4/5

Safe to merge with awareness that EvaluatorType.visit() now requires a second code argument, which will break existing callers.

The bulk of the change is straightforward Fern-generated boilerplate for the new scores_v3 endpoint and a well-structured evaluator type split. The one concrete defect is the EvaluatorType.visit() signature: code is now a required parameter with no default, so any consumer of the unstable API who calls visit(llm_as_judge=...) will see a TypeError at runtime without any deprecation warning or migration path. No internal callers were found in the repo, but the method is part of the exported public surface.
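To make the failure mode concrete, here is a minimal sketch that mimics the new two-argument `visit()` on a stand-in enum. The enum body is an assumption reconstructed from the names in this review, not a copy of the generated file, and `visit_compat` is a hypothetical helper showing one softer migration shape rather than anything the generator emits.

```python
import enum
from typing import Callable, Optional, TypeVar

T_Result = TypeVar("T_Result")


class EvaluatorType(str, enum.Enum):
    # Stand-in enum; member names follow the class diagram in this review.
    LLM_AS_JUDGE = "llm_as_judge"
    CODE = "code"

    def visit(
        self,
        llm_as_judge: Callable[[], T_Result],
        code: Callable[[], T_Result],  # required, no default
    ) -> T_Result:
        if self is EvaluatorType.LLM_AS_JUDGE:
            return llm_as_judge()
        return code()


def old_style_call_outcome() -> str:
    """Call visit() the pre-change way and report what happens."""
    try:
        EvaluatorType.LLM_AS_JUDGE.visit(llm_as_judge=lambda: "judge")  # type: ignore[call-arg]
    except TypeError:
        # Raised at call time, even though the member is LLM_AS_JUDGE.
        return "TypeError"
    return "ok"


def visit_compat(
    value: EvaluatorType,
    llm_as_judge: Callable[[], T_Result],
    code: Optional[Callable[[], T_Result]] = None,
) -> T_Result:
    """Hypothetical wrapper: old call sites keep working for LLM_AS_JUDGE."""
    if value is EvaluatorType.LLM_AS_JUDGE:
        return llm_as_judge()
    if code is None:
        raise NotImplementedError("no handler supplied for 'code' evaluators")
    return code()
```

Note that the error is raised while binding arguments, so an old-style call fails regardless of which enum member it is invoked on.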

langfuse/api/unstable/commons/types/evaluator_type.py: the visit() signature change is the only spot that could cause an unexpected runtime failure (a TypeError at the call site) for existing adopters of the unstable API.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class ScoresV3Client {
        +get_many_v3() GetScoresV3Response
        +with_raw_response RawScoresV3Client
    }
    class AsyncScoresV3Client {
        +get_many_v3() GetScoresV3Response
        +with_raw_response AsyncRawScoresV3Client
    }
    class GetScoresV3Response {
        +data List[ScoreV3]
        +meta GetScoresV3Meta
    }
    class ScoreV3 {
        <<Union>>
        ScoreV3_Numeric | ScoreV3_Boolean
        ScoreV3_Categorical | ScoreV3_Text | ScoreV3_Correction
    }
    class Evaluator {
        <<Union>>
        Evaluator_LlmAsJudge | Evaluator_Code
    }
    class Evaluator_LlmAsJudge {
        type: llm_as_judge
        +prompt str
        +output_definition
    }
    class Evaluator_Code {
        type: code
        +source_code str
        +source_code_language
    }
    class EvaluatorBase {
        +id str
        +name str
        +version int
        +scope EvaluatorScope
    }
    class LlmAsJudgeEvaluator
    class CodeEvaluator
    class EvaluatorType {
        <<Enum>>
        LLM_AS_JUDGE
        CODE
        +visit(llm_as_judge, code)
    }
    ScoresV3Client --> GetScoresV3Response
    AsyncScoresV3Client --> GetScoresV3Response
    GetScoresV3Response --> ScoreV3
    Evaluator --> Evaluator_LlmAsJudge
    Evaluator --> Evaluator_Code
    EvaluatorBase <|-- LlmAsJudgeEvaluator
    EvaluatorBase <|-- CodeEvaluator
    EvaluatorType --> Evaluator
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
langfuse/api/unstable/commons/types/evaluator_type.py:21-28
**`visit()` signature change breaks existing callers**

The `code` parameter is required with no default value. Any existing caller that was using `visit(llm_as_judge=lambda: ...)` will now raise `TypeError: visit() missing 1 required positional argument: 'code'` at runtime. Since `EvaluatorType` is exported from the unstable public API, consumers who adopted it are affected. Making `code` optional (e.g. `code: typing.Callable[[], T_Result] = lambda: None`) or raising `NotImplementedError` for unhandled values would be a safer migration path.

### Issue 2 of 2
langfuse/api/client.py:342-347
**Inline import inside property method**

The import `from .scores_v3.client import ScoresV3Client` is placed inside the property method body rather than at the top of the module, violating the project convention. The same pattern is repeated for `AsyncScoresV3Client` in `AsyncLangfuseAPI`. Since this is Fern-generated code and the lazy-load pattern is used throughout the entire file for circular-import avoidance, the fix would need to be applied to the generator template rather than individual property methods.
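For readers unfamiliar with the pattern in Issue 2, the sketch below shows a lazily imported, cached property in isolation. It is an illustration only: `LazyApi` and its `decoder` property are hypothetical names, and the standard-library `json` module stands in for the relative `.scores_v3.client` import so the snippet can run on its own.

```python
from typing import Any, Optional


class LazyApi:
    """Hypothetical stand-in for a client that builds sub-clients on first use."""

    def __init__(self) -> None:
        self._decoder: Optional[Any] = None

    @property
    def decoder(self) -> Any:
        if self._decoder is None:
            # Import deferred to first access; in the generated client this is
            # where the sub-client module would be imported instead.
            from json import JSONDecoder

            self._decoder = JSONDecoder()
        # Later accesses reuse the cached instance.
        return self._decoder
```

Deferring the import this way keeps module import time low and sidesteps circular imports, at the cost of departing from a top-of-module import convention.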

Reviews (1): Last reviewed commit: "feat(api): update API spec from langfuse..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

Context used:

  • Rule used - Move imports to the top of the module instead of p... (source)

Learned From
langfuse/langfuse-python#1387

@langfuse-bot langfuse-bot requested a review from wochinge June 9, 2026 13:54
@github-actions

github-actions Bot commented Jun 9, 2026

@claude review

Comment thread langfuse/api/unstable/commons/types/evaluator_type.py

@claude claude Bot left a comment

LGTM, routine Fern-generated API spec update following the same pattern as recent PRs (e.g., #1685).

Extended reasoning...

Overview

This is an auto-generated Fern API spec update (feat(api): update API spec from langfuse/langfuse c82119e). All 51 changed files live under langfuse/api/ and bear the # This file was auto-generated by Fern from our API Definition. header. The changes add a new scores_v3 client/types module and extend the unstable evaluator/evaluation-rule API to support a code evaluator type alongside llm_as_judge (refactoring the create requests into discriminated unions).

Security risks

None. The diff is entirely generated client code (pydantic models, httpx request plumbing, lazy-import __init__.py shims). No auth, crypto, secret handling, or user-input parsing is introduced or changed; HTTP paths and parameter serialization follow the same conventions as the rest of the generated client.

Level of scrutiny

Low. This matches the established feat(api): update API spec from langfuse/langfuse <sha> PR pattern (see merged #1685, commit 003e045). No hand-written code, no callers inside the repo are affected (grep for evaluators.create/evaluation_rules.create only matches the generated client itself), and the unstable namespace explicitly documents that its surface may evolve, so the create-signature refactor is expected churn.

Other factors

The bug-hunting system found no issues. The change set is large by line count but uniform in shape — auto-generated lazy-import indices, pydantic model classes, and discriminated-union types. Approving.

@wochinge wochinge enabled auto-merge (squash) June 9, 2026 14:05
@wochinge wochinge merged commit da69e63 into main Jun 9, 2026
19 checks passed
@wochinge wochinge deleted the api-spec-bot-c82119e branch June 9, 2026 14:59

@claude claude Bot left a comment

LGTM — auto-generated Fern API spec sync.

Extended reasoning...

Overview

This PR is a Fern auto-generated API spec sync from langfuse/langfuse c82119e. It touches 51 files entirely under langfuse/api/, all of which are auto-generated boilerplate. Two main additions: (1) a new scores_v3 sub-client with cursor-based pagination, and (2) a refactor of the unstable evaluator/evaluation-rule types to support a new code evaluator kind alongside llm_as_judge via discriminated unions.
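As a rough illustration of what cursor-based pagination means for a caller, the sketch below drains a paged source until no cursor is returned. It does not call the langfuse client: `iterate_all`, `fake_fetch`, the page tuple shape, and the cursor values are all assumptions for the example, and the real `get_many_v3()` parameters and response fields may differ.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# A page is (items, next_cursor); next_cursor is None on the last page.
Page = Tuple[List[str], Optional[str]]


def iterate_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every item by following cursors until the source reports no more."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            return


# Hypothetical in-memory backend standing in for an HTTP endpoint.
_FAKE_PAGES: Dict[Optional[str], Page] = {
    None: (["score-1", "score-2"], "cursor-a"),
    "cursor-a": (["score-3"], None),
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return _FAKE_PAGES[cursor]
```

Unlike page-number pagination, the caller never computes offsets; it only passes back whatever opaque cursor the previous response supplied.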

Security risks

No security risks. All changes are auto-generated typed API client code — no auth, crypto, permissions, or input-handling logic is touched. New endpoints inherit the existing client wrapper's auth flow.

Level of scrutiny

Low scrutiny is appropriate. This is mechanically generated code from the upstream API spec, and the langfuse-python repo has established patterns for these syncs (the previous c82119e sync feee649 is already on main). The diff is large but homogeneous — adding modules, type aliases, and discriminated unions following the same pattern as existing modules.

Other factors

Greptile flagged EvaluatorType.visit() gaining a required code callable parameter as a breaking change. This is a real signature change, but it lives on the unstable API surface — the enum's own docstring frames this as an explicitly evolving surface, and no internal callers of visit() exist in the repo. Greptile also flagged inline imports in the lazy-loaded property methods, but that's the established Fern-generated pattern throughout client.py and would need to change at the generator level, not per-PR.
