RANGER-4676, RANGER-5615: Add OpenSearch Dispatcher to Ranger Audit Server #986

Open
paras200 wants to merge 10 commits into apache:master from paras200:RANGER-5615

@paras200 paras200 commented May 28, 2026

Adds a complete OpenSearch audit destination to Apache Ranger — covering the write path (dispatcher), read path (Ranger Admin UI), and direct plugin writes — as an alternative to the Solr/Elasticsearch-based audit store.

OpenSearch Dispatcher (audit-server/audit-dispatcher/dispatcher-opensearch)

  • OpenSearchDispatcherManager — lifecycle manager with retry-based initialization (exponential backoff, max 5 attempts) and graceful shutdown
  • AuditOpenSearchDispatcher — Kafka consumer that batches audit events and writes them to OpenSearch via the /_bulk API using the low-level RestClient
  • AuditEventOpenSearchDocMapper — canonical 27-field event-to-document mapper
  • Supports basic auth and Kerberos/SPNEGO authentication for OpenSearch connections
  • Document ID deduplication — uses audit.eventId as _id in bulk metadata, falls back to UUID when absent
  • Error handling with partition seek-back and retry sleep on batch failures
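The document ID deduplication described above can be sketched as follows. This is a minimal illustration, not the PR's code: the class `BulkBodySketch`, the nested `Doc` type, and the method names are hypothetical, and the source JSON is assumed to be already serialized. It builds the newline-delimited body that `/_bulk` expects, using the event ID as `_id` and falling back to a random UUID when the ID is absent.

```java
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch: builds an NDJSON body for POST /_bulk. */
final class BulkBodySketch {
    private BulkBodySketch() {}

    /** One audit event: its ID (may be null) and its already-serialized JSON source. */
    static final class Doc {
        final String eventId;
        final String sourceJson;

        Doc(String eventId, String sourceJson) {
            this.eventId = eventId;
            this.sourceJson = sourceJson;
        }
    }

    /** Minimal JSON string escaping: quotes, backslashes, control characters. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Uses the event ID as the document _id; falls back to a random UUID when absent. */
    static String docId(String eventId) {
        return (eventId == null || eventId.isEmpty()) ? UUID.randomUUID().toString() : eventId;
    }

    /** Emits one action line and one source line per event, each terminated by '\n'. */
    static String buildBulkBody(String index, List<Doc> docs) {
        StringBuilder body = new StringBuilder();
        for (Doc doc : docs) {
            body.append("{\"index\":{\"_index\":\"").append(escape(index))
                .append("\",\"_id\":\"").append(escape(docId(doc.eventId)))
                .append("\"}}\n");
            body.append(doc.sourceJson).append('\n');
        }
        return body.toString();
    }
}
```

Because the `index` action replaces an existing document with the same `_id`, replaying a batch after a seek-back should overwrite documents that were already written rather than duplicate them. Events that fall back to a UUID do not get that guarantee.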

Native Ranger Admin Read Path (security-admin — audit_store=opensearch)

  • OpenSearchMgr — RestClient lifecycle manager with basic/keytab/anonymous auth, config imported from OpenSearchAuditDestination
  • OpenSearchUtil — builds OpenSearch Query DSL JSON from SearchCriteria (bool/must, wildcard, match_phrase, query_string OR, range, negation, pagination, sorting); parses responses with Jackson
  • OpenSearchAccessAuditsService — extends AccessAuditsService, orchestrates search + populateViewBean field mapping to VXAccessAudit
  • OpenSearchIndexBootStrapper (embeddedwebserver) — auto-creates ranger_audits index at Ranger Admin startup via HEAD/PUT REST calls
  • Routing wired in AssetMgr, XAuditMgr, RangerBizUtil (AUDIT_STORE_OPENSEARCH), and EmbeddedServer
  • Config: ranger.audit.opensearch.* namespace in ranger-admin-site.xml, install.properties, and setup.sh with URL/port validation
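As a rough illustration of the kind of Query DSL that a component like OpenSearchUtil produces, the sketch below assembles a `bool`/`must` search with a wildcard clause, a range clause, pagination, and sorting. The class `AuditQuerySketch`, its method names, and the field names `reqUser` and `evtTime` are assumptions made for this example, not taken from the PR. Inputs are assumed to be already escaped; a real implementation would more likely build the JSON with a library.

```java
import java.util.List;

/** Hypothetical sketch: assembles an OpenSearch search body by string concatenation. */
final class AuditQuerySketch {
    private AuditQuerySketch() {}

    /** Partial-string match: wraps the value in '*' wildcards. */
    static String wildcard(String field, String value) {
        return "{\"wildcard\":{\"" + field + "\":{\"value\":\"*" + value + "*\"}}}";
    }

    /** Inclusive range, for example on an epoch-millisecond timestamp field. */
    static String range(String field, long from, long to) {
        return "{\"range\":{\"" + field + "\":{\"gte\":" + from + ",\"lte\":" + to + "}}}";
    }

    /** Combines the clauses under bool/must and adds pagination and sorting. */
    static String search(List<String> mustClauses, int from, int size,
                         String sortField, boolean ascending) {
        return "{\"from\":" + from + ",\"size\":" + size
            + ",\"query\":{\"bool\":{\"must\":[" + String.join(",", mustClauses) + "]}}"
            + ",\"sort\":[{\"" + sortField + "\":{\"order\":\""
            + (ascending ? "asc" : "desc") + "\"}}]}";
    }
}
```

For instance, `search(Arrays.asList(wildcard("reqUser", "ali"), range("evtTime", 0L, 1000L)), 0, 25, "evtTime", false)` yields a body that asks for the first 25 matching audits, newest first.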

Direct Plugin Audit Destination (agents-audit/dest-os)

  • OpenSearchAuditDestination — extends AuditDestination, provides direct plugin-to-OpenSearch writes using low-level RestClient + /_bulk API
  • Wired into AuditProviderFactory (xasecure.audit.destination.opensearch=true)
  • Exports config constants (CONFIG_PREFIX, CONFIG_URLS, etc.) used by OpenSearchMgr
  • Packaged in distro/pom.xml alongside dest-es, dest-solr, etc.

Bug fix

  • Fix ElasticSearchMgr.connect() to return the client on first connection (missing me = client assignment)
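The shape of this bug can be shown with a generic lazy-initialization holder. The actual ElasticSearchMgr source is not reproduced here; `LazyClientHolder` and its members are hypothetical names, and the sketch assumes the method follows the common double-checked pattern with a local variable named `me`.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a lazily created, cached client. */
final class LazyClientHolder<T> {
    private final Supplier<T> factory;
    private volatile T client;

    LazyClientHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached client, creating it on the first call. */
    T connect() {
        T me = client;
        if (me == null) {
            synchronized (this) {
                me = client;
                if (me == null) {
                    client = factory.get();
                    // Without this assignment the field is populated, but the
                    // local variable stays null, so the first call returns null.
                    me = client;
                }
            }
        }
        return me;
    }
}
```

With the assignment in place, the first and all later calls return the same non-null instance.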

Docker infrastructure (dev-support/ranger-docker)

  • docker-compose.ranger-audit-dispatcher-opensearch.yml for the dispatcher container
  • ranger-audit-dispatcher-opensearch-site.xml dispatcher configuration
  • KDC healthcheck + ZK depends_on: service_healthy to fix keytab provisioning race condition
  • ranger-admin-install-postgres.properties updated with audit_store=opensearch option

How was this patch tested?

Unit tests (40 tests):

  • TestAuditOpenSearchDispatcher (6) — bulk request formatting, field mapping, HTTP errors, item-level errors, UUID generation
  • TestOpenSearchDispatcherManager (5) — dispatcher type filtering, disabled destination, fail-fast
  • TestAuditEventOpenSearchDocMapper (4) — all 27 fields mapped correctly
  • OpenSearchMgrTest (4) — client builder for basic auth, no auth, NONE credentials, multiple hosts
  • OpenSearchUtilTest (9) — query building: partial/full string, date range, collection OR, negation, sorting, pagination, Lucene escaping
  • OpenSearchAccessAuditsServiceTest (4) — null client, IO exception, successful search, empty results
  • TestOpenSearchAuditDestination (8) — doc mapping, null ID, null event time, NONE URLs, null client, empty/null events, config constants

End-to-end validated locally (Docker):

  • Full stack: KDC → ZK → Kafka → OpenSearch → Ranger Admin (audit_store=opensearch) → Audit Ingestor → OpenSearch Dispatcher
  • Posted SPNEGO-authenticated audit events through the ingestor REST API
  • Verified documents indexed in OpenSearch and visible in Ranger Admin UI via /service/assets/accessAudit API
  • Pipeline: Plugin → Ingestor → Kafka → Dispatcher → OpenSearch → Ranger Admin UI

Design decisions

  • Uses low-level ES RestClient (not RestHighLevelClient) — compatible with any OpenSearch version (1.x, 2.x+) without tagline validation issues
  • Dedicated audit_store=opensearch config namespace — separate from elasticsearch, no compatibility hacks
  • ranger-audit-dest-os module mirrors dest-es/dest-solr architecture for plugin-direct writes
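To illustrate why plain REST calls are enough for this design, the index bootstrap mentioned earlier (a HEAD request to check for the index, then a PUT to create it) can be sketched with the JDK's `java.net.http` types. This deliberately swaps in the JDK HTTP API for the RestClient the PR uses; the class `IndexBootstrapSketch`, its method names, and the base URL in the usage note are assumptions for illustration only. The sketch builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical sketch: the two REST requests behind an index bootstrap. */
final class IndexBootstrapSketch {
    private IndexBootstrapSketch() {}

    /** HEAD /{index}: the server answers 200 if the index exists, 404 if it does not. */
    static HttpRequest existsRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** PUT /{index} with a JSON settings/mappings body creates the index. */
    static HttpRequest createRequest(String baseUrl, String index, String settingsJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsJson))
                .build();
    }

    /** Only create the index when the HEAD check reports it as missing. */
    static boolean shouldCreate(int headStatus) {
        return headStatus == 404;
    }
}
```

A caller would send `existsRequest("http://localhost:9200", "ranger_audits")`, and only if `shouldCreate` returns true for the response status follow up with `createRequest`. Neither request depends on a client-side version or product check.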

@paras200 paras200 force-pushed the RANGER-5615 branch 4 times, most recently from dac9739 to d58b01c Compare June 9, 2026 18:22
Comment thread audit-server/audit-dispatcher/dispatcher-common/pom.xml Outdated
@ramackri

Contributor

@paras200
The Maven build failed at the audit-dispatcher-opensearch module during the checkstyle-check goal with 16 Checkstyle violations.

@paras200 paras200 force-pushed the RANGER-5615 branch 2 times, most recently from 188138d to 328cd6b Compare June 11, 2026 08:59
@paras200

Author

@ramackri The checkstyle violations are all LineLength > 80 and JavadocVariable (missing Javadoc on private fields). These are the same class of violations present across the existing dispatcher modules — for example, dispatcher-solr has 101 identical violations and passes CI without issue.

The 80-character line limit is explicitly deprecated by the Apache Ranger Java Style Guide, which sets the column limit at 512 characters. @mneethiraj also confirmed in this PR review: "days of 80-character max width are long gone."

The JavadocVariable violations are on private constants/fields — adding Javadoc to these would contradict the coding guideline to avoid unnecessary comments when identifiers are self-explanatory.

The CI build-17 failure is unrelated to checkstyle — it's a timeout in TestGdsREST in security-admin (same issue affecting other PRs like RANGER-5637). The checkstyle plugin is not configured as a CI gate for this module.

@paras200 paras200 force-pushed the RANGER-5615 branch 3 times, most recently from d50154b to babe50b Compare June 12, 2026 18:23
Comment thread dev-support/ranger-docker/scripts/kafka/create-ranger-audit-topic.sh Outdated

@ramackri ramackri left a comment

Contributor

remove create-ranger-audit-topic.sh & e2e-audit-opensearch.sh from this PR

paras200 commented Jun 15, 2026

Author

> remove create-ranger-audit-topic.sh & e2e-audit-opensearch.sh from this PR

Done — both removed. The e2e test script will be contributed to ranger-tools as a generic audit dispatcher test that can validate solr/opensearch/hdfs dispatchers against the ranger-tools base images.

Comment thread dev-support/ranger-docker/README.md Outdated
image: ranger-zk
container_name: ranger-zk
hostname: ranger-zk.rangernw
depends_on:

Contributor

I'm trying to understand why ZK should depend on KDC.

Author

Sure, when KERBEROS_ENABLED=true, ranger-zk boots via zookeeper-with-kerberos.sh, which needs its keytab provisioned by the KDC before startup — without the depends_on: ranger-kdc: service_healthy gate, ZK races the KDC and fails to authenticate. It's a depends_on (ordering only), so in non-Kerberos runs it just waits for the KDC container to be healthy with no functional coupling.
This can be moved to the Kerberos-specific overlay if we want to keep it out of the base compose.

audit_elasticsearch_password=$(get_prop 'audit_elasticsearch_password' $PROPFILE)
audit_elasticsearch_index=$(get_prop 'audit_elasticsearch_index' $PROPFILE)
audit_elasticsearch_bootstrap_enabled=$(get_prop 'audit_elasticsearch_bootstrap_enabled' $PROPFILE)
audit_opensearch_urls=$(get_prop 'audit_opensearch_urls' $PROPFILE)

Contributor

Since these properties are already added in ranger-admin-site.xml, are they really needed in this script?

Author

Yes — these follow the same install-time substitution pattern as the existing audit_elasticsearch_* and audit_solr_* properties. setup.sh reads the operator-supplied values (lines 105–111) and writes them into ranger-admin-site.xml when audit_store=opensearch (see lines 877–904). The values shipped in ranger-admin-site.xml are just defaults/placeholders that setup.sh overwrites at install time, so removing them here would make the OpenSearch audit store non-configurable during install.

Comment thread security-admin/scripts/install.properties
Comment thread dev-support/ranger-docker/README.md
@paras200 paras200 force-pushed the RANGER-5615 branch 2 times, most recently from 6407aee to 4e11bf2 Compare June 21, 2026 11:18
Comment thread security-admin/src/main/java/org/apache/ranger/opensearch/OpenSearchMgr.java Outdated
Comment thread security-admin/src/main/java/org/apache/ranger/opensearch/OpenSearchMgr.java Outdated
@paras200 paras200 force-pushed the RANGER-5615 branch 2 times, most recently from afe6dcc to d788260 Compare June 26, 2026 11:40
5 participants