feat: pom fetcher (CM-1210)#4179
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
…ved to packages_worker) Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
…by run mode Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Pull request overview
This PR adds Maven POM–based enrichment to packages_worker, along with new OSS packages (osspckgs) Data Access Layer helpers to persist Maven package, version, maintainer, and repo metadata into the packages DB.
Changes:
- Added a Maven Temporal workflow/activity pipeline plus scheduling, including a standalone backfill entrypoint.
- Introduced new osspckgs DAL modules (packages, versions, maintainers, repos) with upsert + change-detection/audit helpers.
- Added Maven parsing/normalization utilities (POM fetch + inheritance resolution, metadata version selection) and unit tests for key normalizers.
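
To make "metadata version selection" concrete, here is a minimal sketch of picking a stable release from the version list in maven-metadata.xml. The helper names (`isPrerelease`, `selectStableVersion`), the qualifier list, and the assumption that versions are listed oldest-first are illustrative only and are not taken from the PR.

```typescript
// Assumed set of pre-release qualifiers; the PR's actual rules may differ.
const PRERELEASE_PATTERN =
  /(^|[.\-_])(alpha|beta|rc|cr|m|snapshot|preview|ea)[.\-_]?\d*($|[.\-_])/i

// Hypothetical helper: true for versions such as "2.0.0-RC1" or "1.0-SNAPSHOT".
function isPrerelease(version: string): boolean {
  return PRERELEASE_PATTERN.test(version)
}

// Hypothetical helper: prefer the <release> value when it is stable,
// otherwise walk the version list from newest to oldest.
function selectStableVersion(
  versions: string[],
  release?: string | null,
): string | null {
  if (release && !isPrerelease(release)) return release
  for (let i = versions.length - 1; i >= 0; i--) {
    const candidate = versions[i]
    if (candidate !== undefined && !isPrerelease(candidate)) return candidate
  }
  return null // every listed version is a pre-release
}
```

With this sketch, a list that ends in a release candidate falls back to the newest earlier version that carries no pre-release qualifier.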
Reviewed changes
Copilot reviewed 22 out of 26 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/osspckgs/versions.ts | Bulk upsert versions via UNNEST, with change-field reporting. |
| services/libs/data-access-layer/src/osspckgs/types.ts | Introduces DB upsert shapes for osspckgs (packages, maintainers, versions, repos). |
| services/libs/data-access-layer/src/osspckgs/repos.ts | Adds repo and package↔repo upsert helpers with change-field reporting. |
| services/libs/data-access-layer/src/osspckgs/packages.ts | Adds Maven “to-sync” paging query, package touch, audit logging, and package upsert. |
| services/libs/data-access-layer/src/osspckgs/maintainers.ts | Adds maintainer upsert + package maintainer link replacement helper. |
| services/libs/data-access-layer/src/osspckgs/index.ts | Barrel exports for osspckgs DAL module. |
| services/libs/data-access-layer/src/index.ts | Re-exports osspckgs maintainer/version helpers from the DAL package entrypoint. |
| services/apps/packages_worker/src/workflows/index.ts | Exposes Maven workflows from the worker workflows index. |
| services/apps/packages_worker/src/maven/workflows.ts | Defines Temporal workflows for critical/non-critical Maven processing. |
| services/apps/packages_worker/src/maven/schedule.ts | Registers the maven-critical Temporal schedule (with recreate-on-exists behavior). |
| services/apps/packages_worker/src/maven/runMavenEnrichmentLoop.ts | Implements Maven batch processing, extraction, persistence, and audit logging. |
| services/apps/packages_worker/src/maven/normalize.ts | Adds prerelease detection and repo URL parsing. |
| services/apps/packages_worker/src/maven/metadata.ts | Fetches/parses maven-metadata.xml and selects stable release version. |
| services/apps/packages_worker/src/maven/extract.ts | Fetches/parses POMs with parent inheritance + in-process caching. |
| services/apps/packages_worker/src/maven/activities.ts | Wires Maven batch processing into Temporal activities. |
| services/apps/packages_worker/src/maven/tests/normalize.test.ts | Adds tests for prerelease detection, stable version selection, and SCM/repo URL normalization. |
| services/apps/packages_worker/src/config.ts | Adds getMavenConfig() env parsing. |
| services/apps/packages_worker/src/bin/packages-worker.ts | Registers the Maven-critical schedule in the main packages worker entrypoint. |
| services/apps/packages_worker/src/bin/maven-worker.ts | Adds Maven-only worker entrypoint for local dev. |
| services/apps/packages_worker/src/bin/maven-backfill.ts | Adds one-shot backfill runner for Maven critical queue. |
| services/apps/packages_worker/src/activities.ts | Re-exports Maven activities. |
| services/apps/packages_worker/package.json | Adds Maven scripts + dependencies (axios, fast-xml-parser) and adjusts local monitor script. |
| scripts/services/maven-worker.yaml | Adds docker-compose service definitions for running Maven worker/dev worker. |
| scripts/builders/packages-worker.env | Builder config update for packages-worker image/services. |
| pnpm-lock.yaml | Locks new dependencies (axios, fast-xml-parser) and transitive updates. |
| backend/.env.dist.local | Adds Maven-related env defaults and sets local osspckgs GCP env placeholders. |
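
The normalize.ts rows mention SCM/repo URL normalization. As a rough illustration, the sketch below turns common Maven `<scm>` URL shapes into a plain https URL. `normalizeScmUrl` is a hypothetical name, and the rules (strip the `scm:<provider>:` prefix, convert SSH forms, drop `.git`) are assumptions rather than the PR's implementation.

```typescript
// Hypothetical normalizer for Maven <scm> URLs; returns null when no
// https URL can be derived.
function normalizeScmUrl(raw: string | null | undefined): string | null {
  if (!raw) return null
  let url = raw.trim()
  // Strip Maven SCM prefixes such as "scm:git:" or "scm:svn:".
  url = url.replace(/^scm:[a-z]+:/i, "")
  // Convert SSH forms ("git@host:owner/repo", "ssh://git@host/owner/repo").
  const ssh = /^(?:ssh:\/\/)?git@([^:/]+)[:/](.+)$/.exec(url)
  if (ssh) url = `https://${ssh[1]}/${ssh[2]}`
  // Map git:// and http:// onto https://.
  url = url.replace(/^(git|http):\/\//i, "https://")
  if (!/^https:\/\//i.test(url)) return null
  // Drop trailing slashes and a trailing ".git" suffix.
  return url.replace(/\/+$/, "").replace(/\.git$/i, "")
}
```

Returning null for anything that does not resolve to an https URL keeps free-text SCM values out of downstream repo matching.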
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
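
For the versions.ts entry ("Bulk upsert versions via UNNEST"), the general PostgreSQL pattern can be sketched as follows. The table name, column names, conflict target, and `buildVersionUpsert` itself are hypothetical and only show how one array parameter per column replaces a long list of per-row placeholders.

```typescript
// Hypothetical row shape; the PR's real upsert types live in osspckgs/types.ts.
interface VersionRow {
  packageId: number
  version: string
  publishedAt: string | null
}

// Builds one parameterized statement for any number of rows.
// Assumes a unique constraint on (package_id, version).
function buildVersionUpsert(rows: VersionRow[]): { text: string; values: unknown[][] } {
  const text = `
    INSERT INTO versions (package_id, version, published_at)
    SELECT * FROM UNNEST($1::bigint[], $2::text[], $3::timestamptz[])
    ON CONFLICT (package_id, version)
    DO UPDATE SET published_at = EXCLUDED.published_at
    RETURNING package_id, version`
  // One array per column, all of the same length.
  const values = [
    rows.map((r) => r.packageId),
    rows.map((r) => r.version),
    rows.map((r) => r.publishedAt),
  ]
  return { text, values }
}
```

The statement text stays constant regardless of batch size, so only the array lengths change between calls.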
"monitor:osspckgs:local": "bash -c 'set -a && . ../../../backend/.env.dist.local && . ../../../backend/.env.override.local && set +a && SERVICE=monitor tsx src/scripts/monitorOsspckgs.ts'",
"trigger-bootstrap": "SERVICE=deps-dev-ingest tsx src/scripts/triggerBootstrap.ts",
"trigger-bootstrap:local": "set -a && . ../../../backend/.env.dist.local && . ../../../backend/.env.override.local && set +a && SERVICE=deps-dev-ingest tsx src/scripts/triggerBootstrap.ts",
"monitor:osspckgs:local": "bash -c 'set -a && . ../../../backend/.env.dist.local && . ../../../backend/.env.override.local && set +a && node ../../../scripts/monitor-osspckgs.mjs'",
const maintainerLinks: Array<{ maintainerId: number; role: 'author' | 'maintainer' }> = []
for (const person of allPeople) {
  const username = person.username ?? person.email ?? person.displayName
  if (!username) continue
  const emailHash = person.email
    ? crypto.createHash('sha256').update(person.email.toLowerCase().trim()).digest('hex')
    : null
  const { id: maintainerId, changedFields: mChanged } = await upsertMaintainer(t, {
    ecosystem: 'maven',
    username,
    displayName: person.displayName,
    url: person.url,
    emailHash,
  })
  mChanged.forEach((f) => changed.add(f))
  maintainerLinks.push({ maintainerId, role: person.role })
}

if (maintainerLinks.length > 0) {
  const pmChanged = await replacePackageMaintainers(t, packageId, maintainerLinks)
  pmChanged.forEach((f) => changed.add(f))
}
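
The `emailHash` expression in the snippet above can be read in isolation as a small helper. The version below is a sketch that mirrors the same steps (lowercase, trim, SHA-256, hex); the name `hashEmail` is hypothetical and does not appear in the PR.

```typescript
import { createHash } from "node:crypto"

// Hypothetical helper mirroring the inline expression: normalizes the
// address, then returns a 64-character hex SHA-256 digest.
function hashEmail(email: string | null | undefined): string | null {
  if (!email) return null
  return createHash("sha256").update(email.toLowerCase().trim()).digest("hex")
}
```

Because of the normalization, addresses that differ only in letter case or surrounding whitespace produce the same digest.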
 * Returns null when the artifact is not found (404) or the metadata is
 * malformed.
 */
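
The contract in that doc comment (null on 404 or malformed metadata) could be expressed roughly as below. This is a sketch under assumptions: the HTTP call is left out and only the response handling is shown, regex extraction stands in for the XML parser, `interpretMetadataResponse` is a hypothetical name, and throwing on other non-2xx statuses is a choice made here, not something the comment specifies.

```typescript
interface MavenMetadata {
  release: string | null
  versions: string[]
}

// Hypothetical response handler for a maven-metadata.xml request.
function interpretMetadataResponse(status: number, body: string): MavenMetadata | null {
  if (status === 404) return null // artifact not found
  if (status < 200 || status >= 300) {
    // Assumption: other failures are surfaced rather than hidden as null.
    throw new Error(`Unexpected status ${status} for maven-metadata.xml`)
  }
  const versionPattern = /<version>([^<]+)<\/version>/g
  const versions: string[] = []
  let match: RegExpExecArray | null
  while ((match = versionPattern.exec(body)) !== null) {
    versions.push(match[1].trim())
  }
  if (versions.length === 0) return null // treated as malformed
  const releaseMatch = /<release>([^<]+)<\/release>/.exec(body)
  return { release: releaseMatch ? releaseMatch[1].trim() : null, versions }
}
```

Keeping "not found" and "malformed" on the same null path lets callers treat both as "nothing to enrich" while still noticing transport errors.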
if (!rootPom) {
  return {
    groupId,
    artifactId,
    version,
    purl,
    description: null,
    licenses: [],
    licensesRaw: null,
    scmUrl: null,
    homepageUrl: null,
    developers: [],
    contributors: [],
    parentHops: 0,
    error: `POM not found: ${pomUrl}`,
  }
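
The `parentHops` field and the "POM not found" fallback above suggest a parent-walking resolver with caching. The sketch below shows one way such a loop might look, with a synchronous injected loader for brevity. The names (`resolveWithParents`, `PomLoader`, `MAX_PARENT_HOPS`), the hop limit, and the choice of fields that fall back to the parent are all assumptions.

```typescript
interface Pom {
  licenses: string[]
  scmUrl: string | null
  parent: string | null // parent coordinate, e.g. "group:artifact:version"
}
interface Resolved {
  licenses: string[]
  scmUrl: string | null
  parentHops: number
  error: string | null
}
type PomLoader = (coord: string) => Pom | null

const MAX_PARENT_HOPS = 5 // assumed cap; also guards against parent cycles

function resolveWithParents(
  coord: string,
  load: PomLoader,
  cache: Map<string, Pom | null> = new Map(),
): Resolved {
  // In-process cache: each coordinate is loaded at most once, misses included.
  const cachedLoad = (c: string): Pom | null => {
    if (!cache.has(c)) cache.set(c, load(c))
    return cache.get(c) ?? null
  }
  const root = cachedLoad(coord)
  if (!root) {
    return { licenses: [], scmUrl: null, parentHops: 0, error: `POM not found: ${coord}` }
  }
  let { licenses, scmUrl } = root
  let parentHops = 0
  let next = root.parent
  // Climb only while something is still missing.
  while (next && parentHops < MAX_PARENT_HOPS && (licenses.length === 0 || scmUrl === null)) {
    const parent = cachedLoad(next)
    if (!parent) break
    parentHops++
    if (licenses.length === 0) licenses = parent.licenses
    scmUrl = scmUrl ?? parent.scmUrl
    next = parent.parent
  }
  return { licenses, scmUrl, parentHops, error: null }
}
```

Sharing one `cache` map across packages in a batch means a widely used parent POM is loaded once rather than once per child.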