feat: add Tinybird datasources for packages-db tables (CM-1219)#4180

Open
joanagmaia wants to merge 21 commits into main from feat/packages-tables-datasources-tb

@joanagmaia joanagmaia commented Jun 8, 2026

⚠️ This PR should only be merged after https://github.com/linuxfoundation/crowd.dev/pull/4178/changes

Summary

  • Adds 11 Tinybird datasources for packages-db tables: advisories, advisoryAffectedRanges, advisoryPackages, repos, repoScorecardChecks, maintainers, packages, packageMaintainers, packageDependencies, versions, packageRepos.
  • Each datasource uses ReplacingMergeTree with ENGINE_PARTITION_KEY and ENGINE_VER aligned to the same timestamp column (the effective updated_at for each table).
  • Schemas reflect the latest packages-db migrations: email instead of email_hash on maintainers, dependent_count / transitive_dependent_count / impact on packages.
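
As a rough illustration of why ENGINE_VER needs a trustworthy timestamp, the sketch below mimics in plain Python the end state of a ReplacingMergeTree merge: rows sharing a sorting key collapse to the one with the highest version value. The row shape, the `id` key, and the `replacing_merge` helper are illustrative assumptions, not code from this PR.

```python
from datetime import datetime
from typing import Iterable


def replacing_merge(rows: Iterable[dict], key: str, ver: str) -> list[dict]:
    """Keep one row per key: the one with the highest version value.

    Mimics the end state of a ReplacingMergeTree merge. On a version tie,
    the row seen last wins.
    """
    latest: dict = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[ver] >= current[ver]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical replicated rows for a packages-like table.
rows = [
    {"id": 1, "impact": 0.2, "last_synced_at": datetime(2026, 6, 1)},
    {"id": 1, "impact": 0.9, "last_synced_at": datetime(2026, 6, 8)},
    {"id": 2, "impact": 0.5, "last_synced_at": datetime(2026, 6, 2)},
]
merged = replacing_merge(rows, key="id", ver="last_synced_at")
```

In this model, a write that changes a column without advancing the version column is not guaranteed to win the merge, which is the motivation for aligning ENGINE_VER with a column that moves on every meaningful update.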

Notes

  • Tables without a dedicated updated_at use their semantic equivalent: last_synced_at for packages, versions, repos; verified_at for package_repos.
  • packages_universe is intentionally excluded (pending deprecation).
  • Datasources depend on the schema changes in #4178 being applied first.
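
The column choice described in these notes can be written down as a small lookup. This is only a sketch: the table names are spelled as in the notes, `watermark_column` is a hypothetical helper, and the `updated_at` default for the remaining tables is an assumption drawn from the summary rather than from the datasource files.

```python
# Tables that lack a dedicated updated_at, per the notes above.
WATERMARK_OVERRIDES = {
    "packages": "last_synced_at",
    "versions": "last_synced_at",
    "repos": "last_synced_at",
    "package_repos": "verified_at",
}


def watermark_column(table: str) -> str:
    """Return the column acting as the effective updated_at for a table."""
    if table == "packages_universe":
        # Intentionally not replicated (pending deprecation).
        raise ValueError("packages_universe is excluded")
    # Assumed default: every other table carries its own updated_at.
    return WATERMARK_OVERRIDES.get(table, "updated_at")
```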

🤖 Generated with Claude Code


Note

Medium Risk
Touches production packages-db logical replication and ranking SQL; incorrect watermark handling could leave analytics stale or over-publish CDC events.

Overview
Adds 11 Tinybird datasources for packages-db (packages, versions, dependencies, repos, advisories, maintainers, etc.) using ReplacingMergeTree with per-table watermark columns (lastSyncedAt, updatedAt, or verifiedAt).

A packages-db migration enables CDC for that pipeline: creates sequin_pub over those tables with publish_via_partition_root, sets REPLICA IDENTITY FULL on roots and hash-partition leaves, and updates rank_packages() to bump last_synced_at whenever ranking/criticality fields change so Tinybird’s ENGINE_VER stays correct.
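
To make the shape of that migration concrete, here is a minimal sketch that assembles the kind of PostgreSQL statements described: one publication with `publish_via_partition_root`, plus `REPLICA IDENTITY FULL` on each root and leaf table. The helper name `cdc_statements` and the table names in the usage line are hypothetical; the actual migration in the PR may be structured differently.

```python
def cdc_statements(pub: str, roots: list[str], leaves: list[str]) -> list[str]:
    """Build the DDL for a CDC publication over the given tables.

    Sketch only: identifiers are interpolated unquoted, so this is not
    safe for untrusted input.
    """
    stmts = [
        f"CREATE PUBLICATION {pub} FOR TABLE {', '.join(roots)} "
        "WITH (publish_via_partition_root = true);"
    ]
    # Leaves are listed explicitly because each table is altered on its own.
    stmts += [f"ALTER TABLE {t} REPLICA IDENTITY FULL;" for t in roots + leaves]
    return stmts


# Hypothetical table names, for illustration only.
statements = cdc_statements("sequin_pub", ["packages"], ["packages_p0"])
```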

Sync semantics are tightened in workers/DAL: deps.dev repo seeding and Maven upsertRepo no longer set repos.last_synced_at (GitHub enricher owns it); OSV has_critical_vulnerability flips now bump packages.last_synced_at.
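
The "bump only on a flip" rule can be sketched as follows. The dict-based row and the `apply_critical_flag` helper are illustrative assumptions; the real change lives in the workers/DAL SQL and is not reproduced here.

```python
from datetime import datetime, timezone


def apply_critical_flag(pkg: dict, has_critical: bool, now: datetime) -> bool:
    """Update the flag and bump last_synced_at only when the value flips.

    Returns True if the row changed and would therefore emit a CDC event.
    """
    if pkg["has_critical_vulnerability"] == has_critical:
        return False  # no change: leave the watermark alone
    pkg["has_critical_vulnerability"] = has_critical
    pkg["last_synced_at"] = now
    return True


# Hypothetical package row.
pkg = {
    "has_critical_vulnerability": False,
    "last_synced_at": datetime(2026, 6, 1, tzinfo=timezone.utc),
}
changed = apply_critical_flag(pkg, True, datetime(2026, 6, 9, tzinfo=timezone.utc))
```

Skipping the bump when nothing changed avoids publishing redundant events, while bumping on a real flip keeps the version column ahead of the previously replicated row.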

Reviewed by Cursor Bugbot for commit 49aa2ff.

Signed-off-by: Joana Maia <jmaia@contractor.linuxfoundation.org>
Copilot AI review requested due to automatic review settings June 8, 2026 17:01

Copilot AI left a comment

Pull request overview

This PR adds a set of new Tinybird datasource definitions under services/libs/tinybird/datasources/ to replicate packages-db domain tables (packages, versions, repos, advisory graph, maintainers, and relationship tables) into ClickHouse/Tinybird for analytics.

Changes:

  • Add 11 new Tinybird .datasource schemas for packages-db tables (advisories, packages, versions, repos, etc.).
  • Use ReplacingMergeTree across the new datasources with partition/sort keys tuned per entity and a designated “version” column for deduplication.
  • Align datasource columns to the latest packages-db schema fields (e.g. maintainer email, package dependent counts/impact).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| services/libs/tinybird/datasources/advisories.datasource | Defines advisory records (OSV/CVE/GHSA metadata) for Tinybird replication. |
| services/libs/tinybird/datasources/advisoryAffectedRanges.datasource | Defines per-package affected version ranges for advisories. |
| services/libs/tinybird/datasources/advisoryPackages.datasource | Defines mapping between advisories and affected packages. |
| services/libs/tinybird/datasources/maintainers.datasource | Defines registry maintainer identities (incl. email) for analytics. |
| services/libs/tinybird/datasources/packageDependencies.datasource | Defines the package dependency graph edges for analytics queries. |
| services/libs/tinybird/datasources/packageMaintainers.datasource | Defines package↔maintainer relationship rows. |
| services/libs/tinybird/datasources/packageRepos.datasource | Defines package↔repo provenance mapping rows. |
| services/libs/tinybird/datasources/packages.datasource | Defines package-level metadata including criticality/dependency metrics. |
| services/libs/tinybird/datasources/repoScorecardChecks.datasource | Defines per-check OpenSSF Scorecard signals per repo. |
| services/libs/tinybird/datasources/repos.datasource | Defines repository metadata and enrichment signals for analytics. |
| services/libs/tinybird/datasources/versions.datasource | Defines per-version metadata (publish time, prerelease, licenses, etc.). |

Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/advisories.datasource Outdated
Comment thread services/libs/tinybird/datasources/packages.datasource Outdated
Comment thread services/libs/tinybird/datasources/packages.datasource Outdated
Comment thread services/libs/tinybird/datasources/versions.datasource Outdated
@joanagmaia joanagmaia requested a review from epipav June 8, 2026 17:12
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Joana Maia <jmaia@contractor.linuxfoundation.org>
Copilot AI review requested due to automatic review settings June 9, 2026 10:48
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/advisories.datasource Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

Comment thread services/libs/tinybird/datasources/advisories.datasource Outdated
Comment thread services/libs/tinybird/datasources/advisories.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/versions.datasource Outdated
Comment thread services/libs/tinybird/datasources/packages.datasource Outdated
Comment thread services/libs/tinybird/datasources/packages.datasource Outdated
Copilot AI review requested due to automatic review settings June 9, 2026 10:52
joanagmaia and others added 8 commits June 9, 2026 11:53

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/versions.datasource Outdated
Comment thread services/libs/tinybird/datasources/packages.datasource Outdated
Comment thread services/libs/tinybird/datasources/repos.datasource Outdated
Comment thread services/libs/tinybird/datasources/versions.datasource Outdated
Copilot AI review requested due to automatic review settings June 9, 2026 13:05

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit b9c833b.

Comment thread services/libs/tinybird/datasources/packages.datasource

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@joanagmaia joanagmaia requested a review from mbani01 June 9, 2026 13:49
Copilot AI review requested due to automatic review settings June 9, 2026 13:57

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.

3 participants