
Feat/asqav audit plugin #3447

Open
falloficaruss wants to merge 4 commits into flyteorg:master from falloficaruss:feat/asqav-audit-plugin

Conversation

@falloficaruss


Tracking issue

Closes flyteorg/flyte#7085

Why are the changes needed?

In regulated environments (finance, healthcare, etc.), teams need provable, tamper-proof records of what AI workflow steps ran, with what data, and what the outputs were. Execution logs alone are insufficient for auditors.

This plugin provides a @asqav_audit decorator that wraps Flyte tasks with cryptographically signed receipts at each lifecycle point (started, finished, failed) via the Asqav SDK, enabling verifiable audit trails directly from the Flyte UI.

What changes were proposed in this pull request?

New plugins/flytekit-asqav/ plugin package that adds a @asqav_audit decorator (extends ClassDecorator, same pattern as flytekit-wandb)

How was this patch tested?

Tests were added in tests/test_asqav_tracking.py (9 tests, all pass).
The asqav SDK is patched at the module level, so the tests make no network calls.
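
For reviewers unfamiliar with module-level patching, here is a minimal sketch of the idea. It is not the code in tests/test_asqav_tracking.py. The helper names `sign_started` and `run_with_fake_sdk` are hypothetical, and the `api_key` keyword and the argument passed to `sign()` are assumptions about the SDK surface made only for illustration.

```python
import sys
from unittest import mock


def sign_started(agent_name):
    """Hypothetical code under test: imports the SDK lazily and signs one receipt."""
    import asqav  # resolved from sys.modules at call time

    asqav.init(api_key="test-key")  # assumed keyword, for illustration only
    agent = asqav.Agent.create(name=agent_name)
    return agent.sign("started")  # assumed argument shape


def run_with_fake_sdk():
    """Replace the whole `asqav` module with a MagicMock so nothing touches the network."""
    fake_sdk = mock.MagicMock(name="asqav")
    fake_sdk.Agent.create.return_value.sign.return_value = {"event": "started"}

    # patch.dict restores sys.modules when the block exits.
    with mock.patch.dict(sys.modules, {"asqav": fake_sdk}):
        receipt = sign_started("demo-agent")

    # The fake records every call, so the test can assert on SDK usage.
    fake_sdk.Agent.create.assert_called_once_with(name="demo-agent")
    return receipt
```

Because the fake replaces the module entry itself, the same approach works whether or not the real SDK is installed in the test environment.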

Setup process

cd plugins/flytekit-asqav
pip install -e ".[dev]"   # or uv pip install -e .
pytest tests/ -xvs

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

@welcome

welcome Bot commented Jun 24, 2026


Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

    - Add _depth and _max_depth parameters (default 50) to _sync_execution and sync_node_execution
    - Raise FlyteAssertion when nesting exceeds limit instead of crashing with RecursionError
    - Add unit tests verifying depth guard and regression safety

    Fixes #7338

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
       Implements an @asqav_audit decorator (ClassDecorator pattern) that wraps
       Flyte tasks with cryptographically signed receipts at started/finished/failed
       lifecycle points using the Asqav SDK. Receipts are rendered as a Flyte Deck
       card with verification links.

       Closes #7085

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
…ionError"

This reverts commit 51ca417.

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
@falloficaruss falloficaruss force-pushed the feat/asqav-audit-plugin branch from 10fc010 to 3609235 Compare June 24, 2026 13:35
@jagmarques


@falloficaruss great work on this. The architecture is solid, the ClassDecorator pattern matches the flytekit-wandb idiom exactly, and the test suite is genuinely good. As the team behind Asqav, we want to help get this across the finish line. Two things worth fixing before merge, and two smaller follow-ups.

P1, agent identity (the main one). Agent.create(name=self.agent_name) is called inside execute(), which means every task invocation mints a fresh server-side keypair. The practical consequence is that the started and finished receipts across separate runs are signed by different agent identities, so you cannot attribute a run sequence to a single stable signer, which undermines the audit trail's value for attribution. It also fills the org with duplicate agents that share the same display name.

The fix is to resolve the agent once, not per call. If you know the agent id up front, use Agent.get(agent_id) at registration time and reuse it. If you want create-on-first-use semantics, cache the result on the instance (self._agent) so it is created at most once per process. We are happy to push this change or pair with you on it, just say the word.
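
A rough sketch of the create-on-first-use variant, in case it helps. `FakeAgent` and `AuditWrapper` are hypothetical stand-ins and not the plugin's actual classes; only the `Agent.create(name=...)` call shape and the `self._agent` cache come from the discussion above. The lock is an extra assumption for tasks that might run on several threads in one process.

```python
import threading


class FakeAgent:
    """Hypothetical stand-in for the SDK's Agent; counts how many are created."""

    created = 0

    def __init__(self, name):
        self.name = name
        FakeAgent.created += 1

    @classmethod
    def create(cls, name):
        return cls(name)


class AuditWrapper:
    """Hypothetical decorator body showing create-on-first-use caching."""

    def __init__(self, agent_name, agent_factory=FakeAgent.create):
        self.agent_name = agent_name
        self._agent_factory = agent_factory
        self._agent = None
        self._lock = threading.Lock()

    def _get_agent(self):
        # Double-checked locking: the factory runs at most once per instance.
        if self._agent is None:
            with self._lock:
                if self._agent is None:
                    self._agent = self._agent_factory(name=self.agent_name)
        return self._agent

    def execute(self, fn, *args, **kwargs):
        agent = self._get_agent()  # same signer for every invocation
        # Receipt signing with `agent` would happen around this call.
        return fn(*args, **kwargs)
```

With this shape, calling `execute()` any number of times on one wrapper creates a single agent, so every receipt from that process shares one signer.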

P2, secret resolution order. The code in _resolve_api_key() checks callable first, then ASQAV_API_KEY env, then the Flyte Secret object. The README documents the priority as Secret first, then callable, then env. They disagree, and the practical risk is real. In remote execution a user explicitly wires a Flyte Secret, but a stray ASQAV_API_KEY in the pod environment silently wins and the configured Secret is ignored. We would recommend aligning both to Secret first, then callable, then env fallback, so the most-controlled source wins. The current tests pass under the buggy order, so it would be worth adding one test that asserts Secret beats env to lock this in.
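
To make the recommended order concrete, here is a minimal sketch. The function name `resolve_api_key` and its parameters are hypothetical and do not mirror the plugin's `_resolve_api_key()` signature. It assumes the Flyte Secret has already been read into a plain string, and only the `ASQAV_API_KEY` variable name is taken from the PR.

```python
import os


def resolve_api_key(secret_value, key_callable, env=os.environ):
    """Return the API key using Secret first, then callable, then env fallback."""
    # 1. An explicitly wired Flyte Secret always wins.
    if secret_value:
        return secret_value
    # 2. A user-supplied callable is consulted next.
    if key_callable is not None:
        value = key_callable()
        if value:
            return value
    # 3. The environment variable is only a last-resort fallback.
    return env.get("ASQAV_API_KEY")
```

A test that asserts Secret beats env then reduces to calling this with both a secret value and a populated environment mapping and checking that the secret value comes back.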

P3, inconsistent fail-open (smaller). asqav.init(), Agent.create(), and the started agent.sign() are all outside any try/except, so an Asqav network error there raises before the user's task function ever runs. The finished and failed sign calls are wrapped. The simplest fix is to wrap init, create, and the started-sign in the same try/except-and-log as the completion receipts, so an Asqav outage never breaks the workflow. If regulated users want fail-closed behavior, a fail_closed: bool = False parameter would make that an explicit opt-in.
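
One possible shape for a consistent fail-open path with an opt-in `fail_closed` flag is sketched below. `run_with_audit` and the `sign` callable are hypothetical simplifications: in the plugin, init, create, and the started sign would all sit inside the guarded path, and the receipt payloads would be richer than a bare event name.

```python
import logging

logger = logging.getLogger("asqav_audit_sketch")


def run_with_audit(task_fn, sign, fail_closed=False):
    """Run task_fn with started/finished/failed receipts around it."""

    def guarded(event):
        # Every audit call goes through the same log-and-continue path.
        try:
            sign(event)
        except Exception:
            if fail_closed:
                raise  # explicit opt-in: an audit failure stops the task
            logger.warning("audit receipt %r failed; continuing", event, exc_info=True)

    guarded("started")
    try:
        result = task_fn()
    except Exception:
        guarded("failed")
        raise  # the task's own error always propagates
    guarded("finished")
    return result
```

With the default `fail_closed=False`, an outage in the audit service is logged and the task result is returned unchanged. With `fail_closed=True`, the first signing error propagates before the task function runs.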

P4, CI coverage (small). flytekit-asqav is not added to the plugin-names matrix in .github/workflows/pythonbuild.yml, so the 9 tests in this PR will not run in flytekit CI. A maintainer will ask for this. It is a one-line addition in alphabetical order under the community plugins comment.

None of these are big changes and everything else is in good shape. We are glad to help with any of this directly, let us know.

Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
@falloficaruss

Author

@jagmarques You may review

@jagmarques


@falloficaruss this looks great, you addressed all four points cleanly.

  • P1 (agent identity): the agent is now resolved once and cached on the instance (self._agent), so a single stable signer is reused across the started and finished receipts. That restores run-sequence attribution and stops the duplicate-agent fan-out.
  • P2 (secret order): _resolve_api_key() now checks the Flyte Secret first, then the callable, then the ASQAV_API_KEY env fallback, and the README matches. test_secret_beats_env_var locks it in so a stray pod env var can no longer shadow a wired Secret.
  • P3 (fail-open consistency): init, create, and the started sign are wrapped in the same log-and-continue path as the completion receipts, and the new fail_closed flag makes fail-closed an explicit opt-in. Both test_fail_closed_propagates_init_failure and test_fail_open_continues_on_init_failure cover it.
  • P4 (CI): flytekit-asqav is in the plugin-names matrix, so the suite runs in flytekit CI.

From the Asqav side this is in good shape and reads ready for a maintainer pass. Nice work, and thanks for the quick turnaround.


Development

Successfully merging this pull request may close these issues.

Cryptographic audit trails for AI workflow compliance

2 participants