fix: std.parseYaml preserves trailing newline for clip-chomped block scalars #1028

Merged
stephenamar-db merged 1 commit into databricks:master from He-Pin:fix/parseyaml-block-scalar-chomping on Jun 24, 2026

He-Pin commented on Jun 24, 2026

Contributor

Summary

This PR fixes clip chomping in std.parseYaml on the JVM for literal (|) and folded (>) YAML block scalars.

SnakeYAML 2.x strips the trailing newline from default clip-chomped block scalars. YAML clip chomping should preserve exactly one trailing newline when no explicit + or - chomping indicator is present.
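
As a rough illustration of the three chomping modes (not the code in this PR), the following Scala sketch applies strip, clip, and keep to block scalar content whose raw trailing line breaks are still present. The names `ChompDemo`, `Chomp`, and `chomp` are hypothetical, and the sketch assumes every content line ends in `\n`.

```scala
// Hypothetical sketch of YAML chomping semantics; not taken from sjsonnet.
object ChompDemo {
  sealed trait Chomp
  case object Strip extends Chomp // explicit '-' indicator
  case object Clip extends Chomp  // no indicator (the default)
  case object Keep extends Chomp  // explicit '+' indicator

  /** Applies a chomping mode to content that still has all of its trailing newlines. */
  def chomp(raw: String, mode: Chomp): String = {
    // Content with every trailing newline removed.
    val body = raw.take(raw.lastIndexWhere(_ != '\n') + 1)
    mode match {
      case Strip => body                                   // drop all trailing newlines
      case Clip  => if (body.isEmpty) "" else body + "\n"  // keep exactly one
      case Keep  => raw                                    // keep all of them
    }
  }
}
```

Under these assumptions, clip turns `"line1\nline2\n\n"` into `"line1\nline2\n"`, which is the result the fix restores for scalars without an explicit indicator.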

Behavior comparison

Local reference versions used:

  • C++ jsonnet v0.22.0
  • go-jsonnet v0.22.0
  • jrsonnet 0.5.0-pre99
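| Input | C++ jsonnet | go-jsonnet | jrsonnet | sjsonnet before | sjsonnet after |
| --- | --- | --- | --- | --- | --- |
| `a: \|\n line1\n line2` | "line1\nline2\n" | same | same | "line1\nline2" | "line1\nline2\n" |
| `a: \|-\n line1\n line2\n` | "line1\nline2" | same | same | same | same |
| `a: \|+\n line1\n line2\n\n` | "line1\nline2\n\n" | same | same | same | same |
| `a: >\n line1\n line2` | "line1 line2\n" | same | same | "line1 line2" | "line1 line2\n" |
| `a: \| # comment -\n line1\n line2` | "line1\nline2" | "line1\nline2\n" | "line1\nline2\n" | "line1\nline2" | "line1\nline2\n" |

"same" means the result is identical to the C++ jsonnet column.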

The header-comment case follows go-jsonnet, jrsonnet, and YAML comment semantics: + or - inside a comment must not be interpreted as a chomping indicator. C++ jsonnet differs for that edge case.

Changes

  • Platform.scala (JVM): detect literal/folded SnakeYAML scalar nodes and inspect the raw YAML block scalar header for explicit chomping indicators
  • Platform.scala (JVM): append the missing trailing newline only for default clip chomping
  • Platform.scala (JVM): stop scanning for chomping indicators at header comments
  • parseyaml_block_scalar_chomping.jsonnet: golden regression coverage for clip/keep/strip, folded/literal, indentation indicators, blank lines, and header comments
  • JS/WASM/Native file tests skip this golden because this fix is JVM/SnakeYAML-specific
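
The header scan described in the list above can be sketched roughly as follows. This is a minimal Scala illustration, not the actual Platform.scala change: `BlockScalarHeader`, `hasExplicitChomping`, and `restoreClipNewline` are hypothetical names, and the sketch assumes the caller has already extracted the header text that follows the `|` or `>` indicator on its line.

```scala
// Hypothetical sketch; the real change in Platform.scala may be structured differently.
object BlockScalarHeader {
  /** True when the header carries an explicit '+' or '-' outside of any comment.
    * `header` is assumed to be the text after '|' or '>' up to the end of that line.
    */
  def hasExplicitChomping(header: String): Boolean = {
    // A valid header holds only digits, '+', '-', whitespace, and an optional comment,
    // so everything from the first '#' onward is comment text and is ignored.
    val indicators = header.takeWhile(_ != '#')
    indicators.exists(c => c == '+' || c == '-')
  }

  /** Re-appends the newline that clip chomping should have preserved.
    * `parsed` is the scalar value as returned by the YAML parser.
    */
  def restoreClipNewline(header: String, parsed: String): String =
    // The endsWith check is a defensive guard (an assumption of this sketch)
    // so that the newline is never added twice.
    if (parsed.nonEmpty && !parsed.endsWith("\n") && !hasExplicitChomping(header)) parsed + "\n"
    else parsed
}
```

With the header ` # comment -`, the `-` sits inside the comment, so `restoreClipNewline(" # comment -", "line1\nline2")` appends the newline, while the header `-` leaves the value untouched.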

Tests

rtk ./mill 'sjsonnet.jvm[3.3.7]'.test.testOnly sjsonnet.ParseYamlTests sjsonnet.FileTests
rtk git diff --check upstream/master...HEAD

Both passed locally after rebasing onto upstream/master.

@stephenamar-db

Collaborator

please rebase

He-Pin force-pushed the fix/parseyaml-block-scalar-chomping branch 2 times, most recently from adfaa86 to 2b05a0a, on June 24, 2026 17:30
Motivation:
SnakeYAML 2.x strips the trailing newline from clip-chomped literal and folded block scalars. std.parseYaml should preserve the YAML default clip chomping behavior on JVM.

Modification:
Detect JVM SnakeYAML literal/folded scalar nodes, inspect the raw block scalar header for explicit keep/strip chomping indicators, and append the missing trailing newline only for default clip chomping. Header comments are ignored when scanning for chomping indicators, and the JVM-only golden test is skipped on JS/WASM/Native.

Result:
std.parseYaml now preserves one trailing newline for clip-chomped block scalars while leaving explicit keep/strip chomping unchanged.
He-Pin force-pushed the fix/parseyaml-block-scalar-chomping branch from 2b05a0a to ec57ec1 on June 24, 2026 17:58
stephenamar-db merged commit 442704e into databricks:master on Jun 24, 2026
5 checks passed