[common] Support catalog context without Hadoop configuration#8193

Open
tchivs wants to merge 3 commits into apache:master from tchivs:paimon-create-context-without-hadoop
tchivs commented Jun 10, 2026

Purpose

This is a narrower follow-up to #6653 for engines such as Trino that provide their own FileIO and should not require Hadoop configuration initialization.

Instead of refactoring the FileIO/CatalogContext hierarchy, this keeps all existing CatalogContext.create(...) behavior unchanged and adds an explicit CatalogContext.createWithoutHadoop(...) factory for the no-Hadoop path.
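The shape of the two entry points can be sketched with a simplified stand-in. This is not Paimon's actual `CatalogContext`: `SketchCatalogContext`, `FakeHadoopConf`, and the `warehouse` option key are hypothetical names used only for illustration, and the real factory takes additional arguments (such as a FileIOLoader) that are left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a Hadoop Configuration; not the real class. */
final class FakeHadoopConf {
    final Map<String, String> values = new HashMap<>();
}

/** Simplified, hypothetical model of a catalog context; not Paimon's CatalogContext. */
final class SketchCatalogContext {
    private final Map<String, String> options;
    private final FakeHadoopConf hadoopConf; // null on the Hadoop-free path

    private SketchCatalogContext(Map<String, String> options, FakeHadoopConf hadoopConf) {
        this.options = Objects.requireNonNull(options, "options");
        this.hadoopConf = hadoopConf;
    }

    /** Default path: always builds the Hadoop-style configuration, as before. */
    static SketchCatalogContext create(Map<String, String> options) {
        FakeHadoopConf conf = new FakeHadoopConf();
        conf.values.putAll(options);
        return new SketchCatalogContext(options, conf);
    }

    /** Explicit opt-out: never constructs the Hadoop-style configuration. */
    static SketchCatalogContext createWithoutHadoop(Map<String, String> options) {
        return new SketchCatalogContext(options, null);
    }

    Map<String, String> options() {
        return options;
    }

    /** Fails loudly instead of returning null on a Hadoop-free context. */
    FakeHadoopConf hadoopConf() {
        if (hadoopConf == null) {
            throw new IllegalStateException(
                    "This context was created without Hadoop; hadoopConf() is unavailable.");
        }
        return hadoopConf;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("warehouse", "/tmp/warehouse"); // illustrative option only

        System.out.println(create(options).hadoopConf().values);
        try {
            createWithoutHadoop(options).hadoopConf();
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Keeping the opt-out as a separate factory means existing callers of `create(...)` see no behavior change, and a caller that wrongly assumes Hadoop is present gets an explicit error rather than a `NullPointerException` later.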

Changes

  • Keep existing CatalogContext.create(...) overloads loading Hadoop configuration by default for compatibility.
  • Add CatalogContext.createWithoutHadoop(...) for callers that provide their own FileIOLoader.
  • Make hadoopConf() fail with a clear IllegalStateException when called on a Hadoop-free context.
  • Add catalog tests, including a classloader test that filters Hadoop classes and verifies the new path can create a catalog without Hadoop on the classpath.
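One way to simulate a classpath without Hadoop is a classloader that refuses to resolve anything under a given package prefix. The following is a minimal sketch of that idea, not the test added in this PR: `PrefixFilteringClassLoader` and `isVisible` are hypothetical names. A real test would also need to load the classes under test through the filtering loader, which this sketch does not attempt.

```java
/** Hypothetical classloader that hides every class under a given package prefix. */
final class PrefixFilteringClassLoader extends ClassLoader {
    private final String hiddenPrefix;

    PrefixFilteringClassLoader(ClassLoader parent, String hiddenPrefix) {
        super(parent);
        this.hiddenPrefix = hiddenPrefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Reject filtered names before delegating to the parent loader.
        if (name.startsWith(hiddenPrefix)) {
            throw new ClassNotFoundException("Filtered by test classloader: " + name);
        }
        return super.loadClass(name, resolve);
    }

    /** Returns true if the given loader can resolve the class name. */
    static boolean isVisible(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader filtered =
                new PrefixFilteringClassLoader(
                        ClassLoader.getSystemClassLoader(), "org.apache.hadoop.");
        System.out.println(isVisible(filtered, "org.apache.hadoop.conf.Configuration"));
        System.out.println(isVisible(filtered, "java.lang.String"));
    }
}
```

With the prefix `org.apache.hadoop.`, the Hadoop `Configuration` class is invisible through this loader whether or not Hadoop is actually on the classpath, while JDK classes still resolve through the parent.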

Tests

  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=CatalogFactoryTest test
  • mvn -pl paimon-common -am -Pfast-build -DfailIfNoTests=false -Dtest=FileIOTest,ResolvingFileIOTest test

Notes

A companion Trino change is being prepared to consume this API by passing Trino's FileIO loader and disabling Hadoop default configuration loading.

tchivs commented Jun 10, 2026

Author

Added the Trino 440 companion branch that consumes this API.

The Trino side uses CatalogContext.createWithoutHadoop(...) with its own PaimonFileIOLoader, disables Hadoop default config loading, and avoids installing SecurityContext.

Validation run locally:

  • Paimon: mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=CatalogFactoryTest test
  • Paimon: mvn -pl paimon-common -am -Pfast-build -DfailIfNoTests=false -Dtest=FileIOTest,ResolvingFileIOTest test
  • Trino 440 companion: JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoConnectorFactoryTest,TrinoPluginTest test

tchivs commented Jun 10, 2026

Author

@JingsongLi PTAL.

This is a narrower follow-up to #6653 based on your previous feedback:

  • reduced the change from a broad FileIO/CatalogContext refactor to an explicit CatalogContext.createWithoutHadoop(...) API
  • kept all existing CatalogContext.create(...) behavior unchanged for compatibility
  • added a no-Hadoop classloader test
  • added a Trino 440 companion branch consuming the new API: https://github.com/tchivs/trino/tree/paimon/trino-440-paimon-1.5

Thanks.

tchivs commented Jun 10, 2026

Author

Additional Trino 440 companion validation passed locally on branch https://github.com/tchivs/trino/tree/paimon/trino-440-paimon-1.5:

  • JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoColumnHandleTest,TrinoFilterConverterTest,TrinoPartitioningHandleTest,TrinoRowTest,TrinoSplitTest,TrinoTableHandleTest,PaimonTypeTest,TestTrinoMetadata,TrinoConnectorFactoryTest,TrinoPluginTest test

    • Result: 24 tests passed.
  • JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoITCase test

    • Result: 29 tests passed.

TrinoITCase starts a local DistributedQueryRunner, creates a temporary Paimon warehouse, creates and writes Paimon tables, and verifies Trino read/query paths against Paimon 1.5 with the no-Hadoop catalog context path.

hadoopConf == null ? getHadoopConfiguration(options) : hadoopConf);
hadoopConf == null && !loadHadoopConf
? null
: new SerializableConfiguration(

Contributor


Can you just create a method that does this in a try/catch? If the Hadoop classes are not present, set it directly to null.
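A minimal sketch of what this suggestion could look like, assuming the check is done reflectively so the helper itself has no compile-time Hadoop dependency. `OptionalHadoopConf` and `tryCreateHadoopConf` are hypothetical names, and the code in the PR may instead wrap its existing `getHadoopConfiguration(options)` call.

```java
/** Hypothetical helper: returns a Hadoop Configuration if available, otherwise null. */
final class OptionalHadoopConf {
    private static final String HADOOP_CONF_CLASS = "org.apache.hadoop.conf.Configuration";

    private OptionalHadoopConf() {}

    /**
     * Creates a Configuration through reflection using the given loader.
     * Returns null when the Hadoop classes cannot be found.
     */
    static Object tryCreateHadoopConf(ClassLoader loader) {
        try {
            Class<?> clazz = Class.forName(HADOOP_CONF_CLASS, false, loader);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            // Hadoop is not on the classpath: fall back to "no configuration".
            return null;
        } catch (ReflectiveOperationException e) {
            // Hadoop is present but broken; surface that instead of hiding it.
            throw new IllegalStateException("Could not instantiate " + HADOOP_CONF_CLASS, e);
        }
    }

    public static void main(String[] args) {
        Object conf = tryCreateHadoopConf(OptionalHadoopConf.class.getClassLoader());
        System.out.println(conf == null ? "Hadoop not found, using null" : "Hadoop found");
    }
}
```

The trade-off compared with an explicit `createWithoutHadoop(...)` factory is that detection becomes implicit: an engine that has Hadoop on its classpath but does not want the default configuration loaded would still get one.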
