
RANGER-5656: TagSync Atlas REST sync fails with AbstractMethodError after RANGER-4076 Jersey 2 migration #1035

Merged
ramackri merged 6 commits into master from RANGER-5656-patch on Jun 28, 2026

@ramackri ramackri commented Jun 27, 2026


What changes were proposed in this pull request?

Fixes RANGER-5656: TagSync Atlas REST tag source fails when TAG_SOURCE_ATLASREST_ENABLED=true after RANGER-4076 migrated TagSync to Jersey 2.x, while AtlasRESTTagSource still used AtlasClientV2 (Jersey 1.x from atlas-client-v2).

Problem

TagSync reads classifications from Atlas via AtlasClientV2 (Jersey 1) and pushes tags to Ranger Admin via TagAdminRESTSink (Jersey 2). Both JAX-RS stacks are loaded in the same JVM, and the first Atlas poll fails with:

AbstractMethodError: javax.ws.rs.core.UriBuilder.uri(...)
  at org.glassfish.jersey.internal.util.collection.Values$LazyValueImpl.get(...)

The Atlas REST sync thread exits on the first poll. TagSync may appear running, but no automatic Atlas → Ranger tag push occurs.
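A quick way to confirm that both stacks are visible to one class loader is to probe for a marker class from each. The sketch below is a diagnostic aid only and is not part of this PR; the class name `JaxRsStackCheck` and the choice of marker classes are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JaxRsStackCheck {
    // Marker classes (assumed representative): Jersey 1.x client lives under
    // com.sun.jersey, Jersey 2.x client under org.glassfish.jersey.
    static final String JERSEY1_CLIENT = "com.sun.jersey.api.client.Client";
    static final String JERSEY2_CLIENT = "org.glassfish.jersey.client.JerseyClient";

    // Returns true if the class can be located without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, JaxRsStackCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Probes both stacks and reports which ones are on the classpath.
    static Map<String, Boolean> detect() {
        Map<String, Boolean> found = new LinkedHashMap<>();
        found.put("jersey1", isPresent(JERSEY1_CLIENT));
        found.put("jersey2", isPresent(JERSEY2_CLIENT));
        return found;
    }

    // A conflict is possible only when both stacks are loadable together.
    static boolean hasConflict(Map<String, Boolean> found) {
        return Boolean.TRUE.equals(found.get("jersey1"))
            && Boolean.TRUE.equals(found.get("jersey2"));
    }

    public static void main(String[] args) {
        Map<String, Boolean> found = detect();
        System.out.println("Detected stacks: " + found);
        System.out.println("Possible Jersey 1/2 conflict: " + hasConflict(found));
    }
}
```

Run with the same classpath as TagSync, a report of both stacks present would match the situation described above; it does not by itself prove that the AbstractMethodError will occur.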

| Before RANGER-4076 | After RANGER-4076 (broken) | This PR |
| --- | --- | --- |
| TagSync sink: Jersey 1 + AtlasClientV2: Jersey 1 | TagSync sink: Jersey 2 + AtlasClientV2: Jersey 1 | Atlas REST: HttpURLConnection; sink: Jersey 2 |

Solution

  1. AtlasRESTHttpClient (new) — minimal Atlas REST client using HttpURLConnection + Atlas AtlasType JSON helpers (no Jersey, no AtlasClientV2).

    • POST api/atlas/v2/search/basic — classified entity search
    • GET api/atlas/v2/types/typedefs/ — typedef load
    • Supports Basic auth and Kerberos (UserGroupInformation.doAs)
  2. AtlasRESTTagSource — call AtlasRESTHttpClient instead of AtlasClientV2; remove getAtlasClient().

  3. Packaging — remove atlas-client-v1 / atlas-client-v2 from tagsync/pom.xml and from distro/src/main/assembly/tagsync.xml so Jersey 1 client JARs are not shipped in the TagSync tarball.

Not changed: TagSynchronizer, TagAdminRESTSink, Atlas Kafka tag source, Hive plugin.
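For readers unfamiliar with the Jersey-free approach listed under Solution, the following is a minimal sketch of a Basic-auth JSON POST over HttpURLConnection. It is not the code of AtlasRESTHttpClient: the class name `AtlasRestSketch`, the helper names, the timeouts, and the example host are all hypothetical, the request body is supplied by the caller, and Kerberos handling is omitted. It assumes Java 9 or later for `InputStream.readAllBytes`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AtlasRestSketch {
    // Builds the value of the HTTP Authorization header for Basic auth.
    static String basicAuthHeader(String user, String password) {
        String token = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(token.getBytes(StandardCharsets.UTF_8));
    }

    // Joins a base URL and a relative API path with exactly one slash between them.
    static String resolve(String baseUrl, String relativePath) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        String path = relativePath.startsWith("/") ? relativePath.substring(1) : relativePath;
        return base + path;
    }

    // Sends a JSON body with POST and returns the response body as a string.
    static String postJson(String url, String authHeader, String jsonBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setConnectTimeout(10_000); // illustrative timeouts, not from the PR
            conn.setReadTimeout(30_000);
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setRequestProperty("Accept", "application/json");
            conn.setRequestProperty("Authorization", authHeader);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(jsonBody.getBytes(StandardCharsets.UTF_8));
            }
            int status = conn.getResponseCode();
            if (status < 200 || status >= 300) {
                throw new IOException("HTTP " + status + " from " + url);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Hypothetical host; only the path comes from the PR description.
        System.out.println(resolve("http://atlas.example.com:21000", "api/atlas/v2/search/basic"));
    }
}
```

Because nothing here touches javax.ws.rs, a client in this style cannot pull a second JAX-RS implementation onto the classpath, which is the property the fix depends on.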

Files changed

| File | Change |
| --- | --- |
| tagsync/.../AtlasRESTHttpClient.java | Added |
| tagsync/.../AtlasRESTTagSource.java | Modified: use HTTP client |
| tagsync/pom.xml | Modified: drop atlas-client-v1/v2 dependencies |
| distro/src/main/assembly/tagsync.xml | Modified: drop atlas-client-* from lib/ |
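The packaging change can be spot-checked by listing the lib/ directory of an unpacked TagSync tarball and looking for leftover Jersey 1 or Atlas client JARs. The sketch below is a hypothetical helper, not something shipped with this PR; the file-name patterns are assumptions based on usual Maven artifact naming and may need adjusting.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TagSyncLibCheck {
    // Assumed JAR name patterns for artifacts that should no longer be packaged.
    static final List<Pattern> UNWANTED = List.of(
        Pattern.compile("atlas-client-v[12]-.*\\.jar"),
        Pattern.compile("jersey-client-1\\..*\\.jar"));

    // Returns the JAR names that match any unwanted pattern, in input order.
    static List<String> findUnwanted(List<String> jarNames) {
        List<String> hits = new ArrayList<>();
        for (String name : jarNames) {
            for (Pattern p : UNWANTED) {
                if (p.matcher(name).matches()) {
                    hits.add(name);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Path to the unpacked tarball's lib directory; defaults to ./lib.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(libDir)) {
            System.out.println("Not a directory: " + libDir);
            return;
        }
        List<String> names;
        try (Stream<Path> files = Files.list(libDir)) {
            names = files.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
        }
        List<String> hits = findUnwanted(names);
        System.out.println(hits.isEmpty() ? "No unwanted JARs found" : "Unwanted JARs: " + hits);
    }
}
```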

How was this patch tested?

Build and static checks

mvn package -pl tagsync -am -DskipTests
mvn checkstyle:check -pl tagsync -DskipTests
| Check | Result |
| --- | --- |
| tagsync module compile/package | Pass |
| Checkstyle on new AtlasRESTHttpClient.java | Pass (0 violations) |

Manual testing (Atlas + Ranger + Hive integration)

Manual validation used a docker environment with Ranger Admin, TagSync, Apache Atlas, and Hive (Kerberos-enabled), with TAG_SOURCE_ATLASREST_ENABLED=true and Atlas REST credentials configured in TagSync install properties.

1. TagSync starts without Jersey conflict

  • Rebuilt TagSync from this branch and deployed the new tarball.
  • Started TagSync with Atlas REST source enabled.
  • Confirmed TagSynchronizer process is running and no AbstractMethodError / UriBuilderImpl errors appear in TagSync logs on startup or first poll.

2. Atlas classification → Ranger tag mapping (automatic sync)

  • Created a Hive table with columns suitable for governance testing.
  • Applied a PII classification to a Hive column entity in Atlas via Atlas REST API.
  • Waited for TagSync poll interval (~60 seconds).
  • Verified in Ranger Admin Tag menu: tag definition PII and resource mapping on the Hive service for the classified column (createdBy: rangertagsync, GUID matches Atlas).
  • Verified via Ranger REST: GET /service/tags/resources?serviceName=<hive-service> and GET /service/tags/tagresourcemaps?serviceName=<hive-service>.

3. End-to-end tag policy enforcement in Hive

With tag mappings present, configured Ranger policies on dev_hive (RBAC allow) and dev_tag (tag deny for one user; tag data mask hive:MASK for another on tag PII). Loaded sample row data into the Hive table.

| User | Query | Expected | Result |
| --- | --- | --- | --- |
| User A | SELECT non-PII column | Allowed | Pass |
| User A | SELECT PII-tagged column | Denied (HiveAccessControlException) | Pass |
| User B | SELECT PII-tagged column | Masked values (e.g. nnn-nn-nnnn) | Pass |
| Hive admin | SELECT PII-tagged column | Raw values | Pass |

4. Regression checks

  • TagSync still pushes tags to Ranger Admin via TagAdminRESTSink (Jersey 2 unchanged).
  • No change to file-based or Kafka-based tag sources in this PR.

Related issues

  • RANGER-4076 — Jersey 1 → 2 migration (regression source)
  • RANGER-1897 — original AtlasClientV2 adoption in AtlasRESTTagSource

@ramackri ramackri requested review from mneethiraj and pradeepagrawal8184 and removed request for pradeepagrawal8184 June 27, 2026 10:50
…fter RANGER-4076 Jersey 2 migration

Replace AtlasClientV2 with HttpURLConnection-based AtlasRESTHttpClient so TagSync
does not load Jersey 1.x on the same classpath as Ranger Jersey 2.x REST sink.
Remove atlas-client-v1/v2 from tagsync dependencies and assembly tarball.
@ramackri ramackri force-pushed the RANGER-5656-patch branch 2 times, most recently from 9a52f62 to e483eac Compare June 27, 2026 10:51
…re resolveTag comments

Collapse multi-line statements in AtlasRESTHttpClient and restore original
resolveTag block and inline comments in AtlasRESTTagSource.

@mneethiraj mneethiraj left a comment

@ramackri - this update reduces Ranger's dependency on a few Atlas libraries, which is good! Please review a couple of comments on handling SSL and multiple URLs. It would be ideal if RangerRESTClient could be reused here. Though a few log messages in RangerRESTClient refer to Ranger Admin, the rest of the code seems generic enough to talk to any HTTP server; please review.

ramk added 2 commits June 27, 2026 17:51
Replace HttpURLConnection Atlas client with RangerRESTClient so TagSync
reuses the same SSL and multi-URL failover as TagAdminRESTSink. Initialize
AtlasRESTHttpClient from comma-separated endpoints and keep basic auth when
credentials are configured alongside Kerberos.
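The multi-URL failover mentioned in this commit message can be pictured as: split the comma-separated endpoint list, then try each endpoint in order until one call succeeds. The sketch below illustrates only that general pattern. It is not RangerRESTClient's actual logic, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class EndpointFailoverSketch {
    // Splits "url1, url2" into trimmed, non-empty URLs without trailing slashes.
    static List<String> parseEndpoints(String csv) {
        List<String> endpoints = new ArrayList<>();
        if (csv == null) {
            return endpoints;
        }
        for (String part : csv.split(",")) {
            String url = part.trim();
            if (!url.isEmpty()) {
                endpoints.add(url.endsWith("/") ? url.substring(0, url.length() - 1) : url);
            }
        }
        return endpoints;
    }

    // Tries each endpoint in order; returns the first successful result.
    // If every endpoint throws, the last failure is attached as the cause.
    static <T> T callWithFailover(List<String> endpoints, Function<String, T> call) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return call.apply(endpoint);
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw new IllegalStateException("All " + endpoints.size() + " endpoint(s) failed", last);
    }

    public static void main(String[] args) {
        // Hypothetical hosts; the first one is simulated as unreachable.
        List<String> endpoints = parseEndpoints("http://atlas1.example.com:21000/, http://atlas2.example.com:21000");
        String result = callWithFailover(endpoints, url -> {
            if (url.contains("atlas1")) {
                throw new IllegalStateException("simulated connection failure");
            }
            return "served by " + url;
        });
        System.out.println(result);
    }
}
```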
…asRESTHttpClient

Add blank line between javax and java imports for checkstyle, use single
return in execute(), and keep Atlas REST client initialization on one line.
@ramackri


> @ramackri - this update reduces Ranger's dependency on a few Atlas libraries, which is good! Please review a couple of comments on handling SSL and multiple URLs. It would be ideal if RangerRESTClient could be reused here. Though a few log messages in RangerRESTClient refer to Ranger Admin, the rest of the code seems generic enough to talk to any HTTP server; please review.

Yes, switched to RangerRESTClient.

@ramackri ramackri merged commit 5aa6629 into master Jun 28, 2026
8 checks passed