
feat(generators): token-level string interpolation metadata + string fixes #9

Open
theoephraim wants to merge 6 commits into johnsoncodehk:master from dmno-dev:dmno/backtick-string-delims

theoephraim commented Jun 5, 2026

Apologies in advance for the AI-authored PR. I ran into this while wiring up a parser for varlock. The language is called "@env-spec": a small DSL on top of familiar dotenv syntax that adds decorator-style comments and function calls.

From what I understand, the generated TextMate grammar was not handling backtick-quoted strings correctly, and it had similar trouble with string-template-style regions.

Why this exists (even if reimplemented differently)

This PR is intended to document and lock down specific behavior needed by env-spec-style DSL grammars. If the implementation here is not desired, that is totally fine; the important part is preserving these scenarios.

Reimplementation contract (must-pass behavior)

  1. TextMate backtick delimiter inference must be correct for escaped backtick strings.

    • Input token shape:
    token(/`(?:\\.|[^`\\])*`/, { string: true, escape: /\\./ })
    • Expected TM behavior:
      • begin is `
      • end is `|$
      • no fallback to " delimiters
  2. Interpolation metadata must be first-class on string tokens and propagate across generators.

    • Representative declaration:
    token(/"(?:\\.|[^"\\])*"/, {
      string: true,
      escape: /\\./,
      interpolation: [
        { begin: /\$\{/.source, end: /\}/.source },
        { begin: /\$\(/.source, end: /\)/.source },
      ],
    })
    • Expected generator behavior:
      • TextMate emits nested interpolation regions
      • Monarch emits interpolation begin rules + interpolation states
      • tree-sitter emits interpolation token rules/scanner/highlight captures
  3. YAML quoted-scalar continuation checks must only run for grammars that opt into block scalars.

    • For indentation grammars without indent.blockScalar, inline multiline quoted values such as KEY="line1\nline2" must parse (no YAML continuation indentation error).
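To make item 1 concrete, here is a minimal sketch of quote-delimiter inference from a token's regex source. `inferStringDelimiters` is a hypothetical helper written for illustration, not the function this PR changes; the `|$` end form is taken from the expected behavior listed above, and applying the same form to `"` and `'` is an assumption.

```typescript
// Hypothetical sketch: infer TextMate begin/end delimiters for a string token.
// Not Monogram's actual implementation.
type Delimiters = { begin: string; end: string };

const QUOTE_CHARS = new Set(['"', "'", "`"]);

function inferStringDelimiters(pattern: RegExp): Delimiters | null {
  const src = pattern.source;
  if (src.length < 2) return null;
  const first = src.charAt(0);
  const last = src.charAt(src.length - 1);
  // The pattern must open and close with the same literal quote character.
  if (first !== last || !QUOTE_CHARS.has(first)) return null;
  // A closing quote preceded by a backslash is an escape, not a delimiter.
  if (src.charAt(src.length - 2) === "\\") return null;
  // End matches either the closing quote or end of line.
  return { begin: first, end: `${first}|$` };
}

const backtick = inferStringDelimiters(/`(?:\\.|[^`\\])*`/);
const doubleQuote = inferStringDelimiters(/"(?:\\.|[^"\\])*"/);
console.log(backtick, doubleQuote);
```

The point of the sketch is that the delimiter comes from the pattern itself, so a backtick token never falls back to `"`.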

Tests that encode these use cases

  • test/env-spec-regressions.ts
    • backtick delimiter regression
    • block-scalar overreach regression
  • test/interpolation-metadata.ts
    • interpolation metadata propagation in TextMate/Monarch/tree-sitter

If this PR is replaced with a cleaner implementation, keeping these tests (or equivalent) should preserve the same user-facing behavior.

Summary of changes in this PR

  • fix TextMate delimiter inference for escaped backtick string tokens
  • keep existing single/double quote behavior unchanged
  • gate YAML-style multiline quoted-scalar indentation enforcement behind indent.blockScalar
  • add token-level string interpolation metadata (interpolation)
  • consume interpolation metadata in TextMate, Monarch, and tree-sitter generation
  • add docs and regression tests for interpolation metadata + env-spec scenarios
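The block-scalar gating in the list above can be pictured with a small sketch. The config shape (`indent.blockScalar` as an optional, truthy-checked value) and the `continuationErrors` helper are assumptions for illustration, and the indentation rule is deliberately simplified compared with real YAML.

```typescript
// Hypothetical sketch of gating the quoted-scalar continuation check.
// The real option shape in Monogram is not shown in this PR description.
interface GrammarConfig {
  indent?: { blockScalar?: unknown };
}

function continuationErrors(raw: string, config: GrammarConfig): string[] {
  // Grammars that do not opt into block scalars skip the check entirely.
  if (!config.indent?.blockScalar) return [];
  const [, ...rest] = raw.split("\n");
  // Simplified rule: every non-empty continuation line must start with whitespace.
  return rest
    .filter((line) => line.length > 0 && !/^[ \t]/.test(line))
    .map((line) => `continuation line is not indented: ${line}`);
}

const multiline = 'KEY="line1\nline2"';
const withoutBlockScalar = continuationErrors(multiline, { indent: {} });
const withBlockScalar = continuationErrors(multiline, { indent: { blockScalar: true } });
console.log(withoutBlockScalar.length, withBlockScalar.length);
```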

New token option

interpolation can be declared on string tokens:

token(/"(?:\\.|[^"\\])*"/, {
  string: true,
  escape: /\\./,
  interpolation: [
    {
      begin: /\$\{/.source,
      end: /\}/.source,
      beginScope: 'punctuation.definition.interpolation.begin',
      endScope: 'punctuation.definition.interpolation.end',
      contentScope: 'meta.embedded.expression',
      include: '$self',
    },
  ],
})
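As a rough illustration of what "TextMate emits nested interpolation regions" could mean for the declaration above, the sketch below maps one interpolation entry to a TextMate begin/end rule. The output keys (`begin`, `end`, `beginCaptures`, `endCaptures`, `contentName`, `patterns`) are standard TextMate grammar fields; the `toTextMateRegion` function and both interfaces are hypothetical and may not match how this PR's generator is structured.

```typescript
// Hypothetical mapping from an interpolation entry to a TextMate rule.
interface InterpolationSpec {
  begin: string;
  end: string;
  beginScope?: string;
  endScope?: string;
  contentScope?: string;
  include?: string;
}

interface TmRule {
  begin: string;
  end: string;
  beginCaptures?: Record<string, { name: string }>;
  endCaptures?: Record<string, { name: string }>;
  contentName?: string;
  patterns?: Array<{ include: string }>;
}

function toTextMateRegion(spec: InterpolationSpec): TmRule {
  const rule: TmRule = { begin: spec.begin, end: spec.end };
  // Capture group 0 is the whole begin/end match, i.e. the punctuation itself.
  if (spec.beginScope) rule.beginCaptures = { "0": { name: spec.beginScope } };
  if (spec.endScope) rule.endCaptures = { "0": { name: spec.endScope } };
  if (spec.contentScope) rule.contentName = spec.contentScope;
  // '$self' re-enters the whole grammar inside the interpolation.
  if (spec.include) rule.patterns = [{ include: spec.include }];
  return rule;
}

const region = toTextMateRegion({
  begin: /\$\{/.source,
  end: /\}/.source,
  beginScope: "punctuation.definition.interpolation.begin",
  endScope: "punctuation.definition.interpolation.end",
  contentScope: "meta.embedded.expression",
  include: "$self",
});
console.log(JSON.stringify(region, null, 2));
```

A rule like this would sit in the `patterns` array of the enclosing string rule, so the interpolation is only recognized inside the string.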

Notes

  • tree-sitter interpolation planning currently targets scanner-friendly forms:
    • one-char outer string delimiters
    • interpolation opener literals of length 1 or 2 after decoding
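The two constraints above can be read as a simple eligibility check. The sketch below is one possible interpretation: "decoding" is assumed to mean stripping backslash escapes from the regex source to recover a plain literal, and both `decodeLiteral` and `isScannerFriendly` are hypothetical names, not functions from this PR.

```typescript
// Hypothetical check for the "scanner-friendly" forms described above.
// Decode a regex source into a plain literal, or null if it is not one.
function decodeLiteral(source: string): string | null {
  let out = "";
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "\\") {
      const next = source.charAt(i + 1);
      // A trailing backslash or a class escape such as \d or \w is not a literal.
      if (next === "" || /[A-Za-z0-9]/.test(next)) return null;
      out += next;
      i++; // skip the escaped character
    } else if (/[.*+?^$()[\]{}|]/.test(ch)) {
      return null; // unescaped metacharacter
    } else {
      out += ch;
    }
  }
  return out;
}

function isScannerFriendly(delimiter: string, openerSources: string[]): boolean {
  // Outer string delimiter must be exactly one character.
  if (delimiter.length !== 1) return false;
  // Every interpolation opener must decode to a literal of length 1 or 2.
  return openerSources.every((src) => {
    const literal = decodeLiteral(src);
    return literal !== null && (literal.length === 1 || literal.length === 2);
  });
}

const ok = isScannerFriendly('"', [/\$\{/.source, /\$\(/.source]);
const tooLong = isScannerFriendly('"', [/<%=/.source]);
console.log(ok, tooLong);
```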

Validation

  • npm test
    • 15/15 basic tests pass
    • 17/17 interpolation-metadata checks pass
    • 4/4 env-spec regression checks pass

@theoephraim theoephraim changed the title fix(tm): handle escaped backtick string tokens in delimiter inference fix(tm+lexer): backtick escaped strings and block-scalar guard Jun 5, 2026
@theoephraim theoephraim changed the title fix(tm+lexer): backtick escaped strings and block-scalar guard feat(tm): token-level string interpolation metadata + string fixes Jun 5, 2026
@theoephraim theoephraim changed the title feat(tm): token-level string interpolation metadata + string fixes feat(generators): token-level string interpolation metadata + string fixes Jun 5, 2026
johnsoncodehk (Owner) commented Jun 5, 2026

Hi @theoephraim, could you provide a code example and Monogram configuration that I can use to test this change? Never mind, I saw that tests were already done in the latest commit. :)

theoephraim (Author) commented

Yes, apologies. When I saw what was first created, I had it keep cranking away to make things clearer :)

johnsoncodehk (Owner) commented

The main branch has undergone some major restructuring, and I don’t have permission to push changes to this PR. Could you resolve the merge conflicts or grant me push permission?
