Replace the documentation build system with an AsciidoctorJ extension #3455
Open
Cole-Greer wants to merge 28 commits into
Replace the awk-based preprocessor and shell postprocessor pipeline with a Java-based AsciidoctorJ extension (tinkerpop-docs module) that:
- Processes gremlin-groovy listing blocks via GremlinConsole subprocess
- Handles multi-line statement joining and callout stripping
- Generates tabbed HTML output for language variants
- Applies version substitution and callout fixes via postprocessor
- Auto-restarts console on timeout with block-level retry
- Falls back to dry-run output for blocks that fail after retry
The new bin/process-docs.sh orchestrates console/server setup and passes attributes to the AsciidoctorJ plugin via Maven properties.
Known issues to address:
- Echo pattern in console output needs stripping
- Second groovy tab (clean source) not yet generated
- Version x.y.z substitution not wired
- Missing CSS stylesheet in output
- Some callout markers rendered as raw text
- Strip command echo (first line) from console results so output matches published format: gremlin> stmt / ==>result
- Add second 'groovy' tab with clean source code (no prompts/output) to match the published two-tab format
- Pass tinkerpop-version attribute to all asciidoctor executions so the GremlinPostprocessor can substitute x.y.z with actual version
- Update tests for new tab count and graph.traversal() init
- Copy docs/{static,stylesheets} to staging area alongside docs/src/*
to match old build behavior (provides tinkerpop.css to asciidoctor)
- Revert timeout from 120s to 30s since legitimate blocks complete
quickly; infrastructure-dependent blocks will fail fast and retry
- Wrap tab content in <div class='listingblock'><div class='content'> to match published structure
- Render callout markers (<1>, <2>) as proper HTML conum elements with hide-when-copy spans instead of literal text
- Preserve callouts in console tab display (separate display vs execution statement lists)
- Process callouts per-line after HTML escaping
Use the CodeRay gem bundled with AsciidoctorJ to highlight generated code content in tabs. The JRuby runtime is accessed via JRubyRuntimeContext to call CodeRay::Duo directly, producing the same highlighted HTML spans as regular source blocks. Falls back to plain HTML escaping if CodeRay is unavailable.
CodeRay was escaping/mangling callout markers (<1>, <2>) during highlighting. Now callouts are extracted before highlighting, CodeRay processes clean source, and callout HTML is re-injected into the highlighted output at the correct line positions.
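The extract-then-re-inject approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class and method names are hypothetical, the regex assumes at most one trailing callout per line, and the exact conum markup is an assumption based on the class="conum" format mentioned in these commits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; names are illustrative and not taken from the PR. */
final class CalloutSketch {
    // Assumed callout shape: " <1>" or " // <1>" at the end of a source line.
    private static final Pattern CALLOUT = Pattern.compile("\\s*(?://\\s*)?<(\\d+)>\\s*$");

    /** Returns the lines without callouts; calloutsOut gets one entry per line (0 = no callout). */
    static List<String> strip(List<String> lines, List<Integer> calloutsOut) {
        List<String> clean = new ArrayList<>();
        for (String line : lines) {
            Matcher m = CALLOUT.matcher(line);
            if (m.find()) {
                calloutsOut.add(Integer.parseInt(m.group(1)));
                clean.add(line.substring(0, m.start()));
            } else {
                calloutsOut.add(0);
                clean.add(line);
            }
        }
        return clean;
    }

    /** Appends callout markup to each highlighted line that originally carried a callout. */
    static List<String> reinject(List<String> highlighted, List<Integer> callouts) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < highlighted.size(); i++) {
            int n = callouts.get(i);
            out.add(n == 0 ? highlighted.get(i)
                    : highlighted.get(i) + " <b class=\"conum\">(" + n + ")</b>");
        }
        return out;
    }
}
```

The highlighter only ever sees the clean lines, so it cannot mangle the markers; re-injection relies on the highlighter preserving the line count.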
- Handle console startup timeout gracefully (skip block instead of crashing the build)
- Cache CodeRay Duo encoder object in JRuby global variable
- Use heredoc syntax for source input to avoid escaping issues
- Build time reduced from 2.5 hours to under 2 minutes
Published docs show continuation lines with indentation only, not repeated gremlin> prompts. Also handle console startup failure gracefully during restart (skip block instead of crashing build).
Without -pl ., Maven runs process-resources across the entire reactor (30+ modules), adding hours of unnecessary processing. The asciidoc profile is only defined on the root pom.
evalScriptlet with embedded source code forced JRuby to parse a new Ruby script for every highlight call (~970 calls), taking 25-30s each. Now uses callMethod to invoke the cached CodeRay Duo object directly, passing source as a RubyString argument. Dry-run drops from hours to 16 seconds.
- Standalone tab groups now consume consecutive [source,lang] blocks even without the 'tab' attribute (matching published behavior)
- Callout conums use class="conum" (not "conum invisible") matching published format; use class="comment" for // spans
- Remove clear-shadow divs from tab HTML (not in published output)
- Remove postprocessor invisible/hide-when-copy transformations that were overriding the correct format
The Hadoop/Spark blocks use traversal().withEmbedded(graph) which takes ~23s for anonymous TraversalSource resolution, plus SparkGraphComputer first-execution overhead (~10s). With 30s timeout, these were right at the edge and intermittently failing. 60s provides comfortable headroom for all legitimate operations while still failing fast on genuinely broken blocks.
For multi-line statements, the console echoes all lines with continuation prompts (......N>) before the actual results. Previously only the first echo line was skipped, leaving continuation prompts in the output and making it appear the block had no results. Now skips all lines matching the continuation prompt pattern.
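A rough sketch of the echo filtering described above, assuming the continuation prompt is a run of dots followed by a line number and '>' (the exact prompt format and all names here are assumptions, not the PR's implementation):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper; the prompt pattern is an assumption about the console's output. */
final class EchoFilterSketch {
    // Assumed continuation prompt: dots, a line number, then '>' (e.g. "......1>").
    private static final Pattern CONTINUATION = Pattern.compile("^\\.+\\d+>.*");

    /** Drops the echoed statement: the first line plus the continuation-prompt lines right after it. */
    static List<String> stripEcho(List<String> raw) {
        List<String> results = new ArrayList<>();
        boolean inEcho = true;
        for (int i = 0; i < raw.size(); i++) {
            String line = raw.get(i);
            if (inEcho && (i == 0 || CONTINUATION.matcher(line).matches())) {
                continue; // still part of the echoed input
            }
            inEcho = false; // first real result line ends the echo
            results.add(line);
        }
        return results;
    }
}
```

Only leading continuation lines are skipped, so a result line that happens to look like a prompt later in the output is kept.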
Blocks using [gremlin-groovy,theCrew] were falling through to an empty TinkerGraph because only 'crew' was mapped. Added 'theCrew' as an alias for TinkerFactory.createTheCrew().
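The alias fix can be pictured as a lookup from the block's graph attribute to the console statement that creates the graph. The sketch below is illustrative only: the class name, the "modern" and "classic" entries, and the exact fallback statement are assumptions; only the 'crew' and 'theCrew' mapping to TinkerFactory.createTheCrew() and the empty-TinkerGraph fallback come from the commit message.

```java
import java.util.Map;

/** Hypothetical lookup; entries other than 'crew' and 'theCrew' are assumptions. */
final class GraphAliasSketch {
    private static final Map<String, String> INIT = Map.of(
            "modern", "TinkerFactory.createModern()",
            "classic", "TinkerFactory.createClassic()",
            "crew", "TinkerFactory.createTheCrew()",
            "theCrew", "TinkerFactory.createTheCrew()"); // alias added by this fix

    /** Unknown names fall through to an empty TinkerGraph. */
    static String initStatement(String graphName) {
        return "graph = " + INIT.getOrDefault(graphName, "TinkerGraph.open()");
    }
}
```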
Multi-line SPARQL queries use triple-quoted strings that span lines without leading whitespace (e.g., WHERE clauses). Track open/close of triple-quote pairs to keep them as single statements. Reduces skipped blocks from 13 to 3.
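One way to track triple-quote pairs is to toggle an "open" flag whenever a line contains an odd number of `"""` delimiters. The following is a minimal sketch under that assumption; the names are hypothetical, it ignores `'''` strings, and joining with newlines is a choice made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical statement joiner; not the PR's buildStatements implementation. */
final class TripleQuoteSketch {
    /** Groups lines so that a triple-quoted string spanning several lines stays in one statement. */
    static List<String> join(List<String> lines) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean open = false;
        for (String line : lines) {
            if (current.length() > 0) current.append('\n');
            current.append(line);
            if (countTripleQuotes(line) % 2 == 1) open = !open; // odd count flips the state
            if (!open) {
                statements.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) statements.add(current.toString()); // unterminated string: emit as-is
        return statements;
    }

    /** Counts non-overlapping occurrences of the triple double-quote delimiter. */
    static int countTripleQuotes(String line) {
        int count = 0;
        for (int i = line.indexOf("\"\"\""); i >= 0; i = line.indexOf("\"\"\"", i + 3)) count++;
        return count;
    }
}
```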
The sparql-gremlin plugin requires:
1. plugin/ directory with main JAR for SPI discovery
2. Registration in plugins.txt for activation at startup
3. All dependency JARs on the main classpath (lib/) because the ext/ child classloader doesn't properly share Jena classes
Also register hadoop, spark, and neo4j plugins in plugins.txt.
Standalone plugins (hadoop-gremlin, spark-gremlin) need:
1. A plugin/ directory with main JAR for SPI-based plugin discovery
2. All dependency JARs on the main classpath (lib/) for proper classloading of HadoopGraph, SparkGraphComputer, etc.
This mirrors the fix already applied for non-standalone plugins (sparql-gremlin, neo4j-gremlin).
The nested g.inject(g.withComputer()...) block triggers cold Spark initialization which can take 50-60s. 90s gives comfortable headroom while still serving as a failsafe against genuine hangs.
Two fixes for the SPARQL/remote-connect cascade failure:
1. process-docs.sh: fail fast if port 8182 is already in use (stale server from a prior run). Previously the nc readiness check would pass against a stale/incompatible server, causing WebSocket handshake failures that dumped ~500-line Netty stacktraces into every :remote connect block. Also detect early server-process exit (e.g. bind failure) instead of waiting the full 30s timeout.
2. GremlinTreeprocessor: add a 2s delay after closing a dead console before restarting, letting the OS reclaim resources (ports, memory) from Spark/Hadoop blocks so the SPARQL section that follows can recover instead of cascading into repeated timeouts.
The ':remote connect' docs blocks target localhost:8182. Any process occupying that port (stale Gremlin Server or an unrelated service) causes our server to fail binding while nc -z still passes, so the console connects to the wrong service and WebSocket handshakes fail. Updated the fail-fast message to not assume the cause is a stale server and to point at lsof for identifying the actual process.
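The PR performs this check in process-docs.sh; purely for illustration, the same fail-fast idea can be sketched in Java by attempting to bind the port before starting the server. The class name is hypothetical, and note that any bind failure (including a permissions error on a privileged port) is reported as "in use" here.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical pre-flight check; the PR itself does this in shell. */
final class PortCheckSketch {
    /** Returns true when the port cannot be bound, i.e. some process already holds it. */
    static boolean isPortInUse(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return false; // bind succeeded, so the port was free; the probe is closed on exit
        } catch (IOException e) {
            return true;  // typically BindException: address already in use
        }
    }
}
```

A caller would abort the build with a message pointing at the occupied port (8182 in this PR) rather than letting a readiness probe pass against the wrong service.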
Two fixes for no-result blocks:
1. initGraphIfNeeded now re-initializes graph + g for EVERY non-existing block, matching the old preprocessor behavior. Previously it skipped re-init when the graph name matched the prior block, so a block that reassigned g (e.g. 'g = traversal().withEmbedded(graph).withComputer()') leaked the OLAP/mutated source into later blocks that expected a fresh OLTP 'g' (e.g. path().by() blocks returned nothing under GraphComputer).
2. Detect sugar syntax (g.V, g.V[0..2], etc.) and call SugarLoader.load() on a fresh console for those blocks, restarting afterward so the permanent Groovy metaclass mutation doesn't leak into other blocks.
Rework the docs build so a genuine Gremlin error fails the build rather
than rendering as a silently-empty block. Investigation confirmed no
executed gremlin-groovy block is expected to error: error output goes to
the console's stderr (the "Display stack trace?" prompt), which neither
the old nor new build captured, and all error examples in the docs are
hand-authored [source,text] blocks.
GremlinConsole now records the stderr error prompt and surfaces it from
execute() as a GremlinExecutionException; the treeprocessor propagates it
(fatal) and the silent dry-run fallback is removed. The 90s timeout
remains a failsafe and single restart-and-retry recovery is preserved.
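As a rough sketch of the fail-fast behavior described above: after a statement runs, the captured stderr is scanned for the console's "Display stack trace?" prompt, and a match becomes an exception instead of an empty result. The class layout, method name, and message format below are assumptions; only the prompt text and the GremlinExecutionException name come from the commit message.

```java
import java.util.List;

/** Hypothetical sketch; the real GremlinConsole.execute() may be structured differently. */
final class ErrorPromptSketch {
    static final String ERROR_PROMPT = "Display stack trace?";

    /** Stand-in for the exception type named in the commit message. */
    static class GremlinExecutionException extends RuntimeException {
        GremlinExecutionException(String message) {
            super(message);
        }
    }

    /** Returns the stdout lines, or throws if stderr contains the console's error prompt. */
    static List<String> resultOrThrow(String statement, List<String> stdout, List<String> stderr) {
        for (String line : stderr) {
            if (line.contains(ERROR_PROMPT)) {
                throw new GremlinExecutionException(
                        "Statement failed: " + statement + " -> " + String.join(" | ", stderr));
            }
        }
        return stdout;
    }
}
```

Because the exception is unchecked, a caller such as the treeprocessor can let it propagate and fail the build without a fallback path.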
Enabling this surfaced several previously-masked setup issues, also fixed:
- buildStatements tracks bracket depth so multi-line Groovy closures
(e.g. "(1..10).each { ... }; []") stay grouped instead of hanging at a
continuation prompt.
- initGraphIfNeeded closes the prior graph and clears /tmp/neo4j and
/tmp/tinkergraph.kryo before each block, so a stale Neo4j store lock no
longer hangs Neo4jGraph.open (mirrors the old preprocessor).
- SugarLoader runs on a freshly restarted console so it takes effect on a
pristine Groovy metaclass.
- process-docs.sh resolves the Neo4j DB impl onto the console classpath,
strips only the conflicting io.netty 4.1.24 (keeping netty-3.9.x that
Neo4j needs), and registers TinkerGraph and Credential plugins.
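The bracket-depth grouping mentioned in the first item above could look something like the sketch below. It is a simplification under stated assumptions: names are hypothetical, and brackets inside string literals or comments are not treated specially, which a real implementation would likely need to handle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical grouping helper; not the PR's buildStatements code. */
final class BracketDepthSketch {
    /** Keeps appending lines to the current statement until all brackets are closed. */
    static List<String> group(List<String> lines) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            if (current.length() > 0) current.append('\n');
            current.append(line);
            for (char c : line.toCharArray()) {
                if (c == '(' || c == '[' || c == '{') {
                    depth++;
                } else if (c == ')' || c == ']' || c == '}') {
                    depth = Math.max(0, depth - 1); // tolerate stray closers
                }
            }
            if (depth == 0) {
                statements.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) statements.add(current.toString()); // unbalanced input: emit as-is
        return statements;
    }
}
```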
REQUIREMENTS.md FR-4 updated. A remaining Neo4j/Spark Scala classpath
conflict requiring per-book plugin isolation is tracked separately.
(tinkerpop-6jq.14)
Assisted-by: Kiro:claude-opus-4.8 [Kiro CLI]
Neo4j 3.4 (Scala 2.11) and Spark (Scala 2.12) cannot share the docs console's flat classpath. Wire the previously-unwired plugin-exclusion scaffolding so the console restarts with conflicting plugins removed:
- Add PluginDirectoryRestartHandler that toggles ext/<plugin> dirs and keeps ext/plugins.txt in sync (the console drops unlisted plugins whose jars vanish on restart).
- GremlinTreeprocessor: detect :gremlin-docs-plugins-exclude: at section granularity during the AST walk (document baseline + per-section override with latching), bouncing the console when the set changes. Default to the directory handler in production; tests inject their own. Also :set max-iteration 100 on console start to match published output.
- process-docs.sh: install each plugin's deps into ext/<plugin>/plugin/ (deduped vs lib/) instead of the shared lib/, so conflicting deps are isolatable; write plugins.txt deterministically to avoid stale state.
- Add :gremlin-docs-plugins-exclude: attributes to the neo4j, hadoop, spark and gremlin-variants chapters with explanatory comments; update the stale reference index comment and developer docs.
- Fix an undefined-variable typo (marko -> vMarko) and render the olap-spark-yarn recipe (which needs a real YARN/HDFS cluster) as a non-executed block with hardcoded output.
- Set asciidoctor.gemPath under target/ so the JRuby gem extraction no longer creates a gems/ directory at the repo root.
Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
…rpop-6jq.11) The old shell/AWK preprocessor and postprocessor directories have been removed, but the developer documentation still described that system. Rewrite the "Documentation Environment" section to describe the Maven-based AsciidoctorJ extension: it now states the build is Maven-driven, runs OLAP examples against the local filesystem (fs.defaultFS=file:///) so no Hadoop cluster is required, notes the Spark-on-YARN recipe is rendered from pre-captured output, and adds the prerequisite distribution build and --dryRun option. Drop the obsolete pseudo-distributed Hadoop / yarn-site / mapred-site instructions and the AWK/GNU-utils requirements. Point the OLAP jar-conflict note at the new per-book plugin exclusion mechanism, and update stale "preprocessor" wording in the committer docs. Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
…erpop-6jq.7) processStandaloneTabGroup emitted tab code via TabbedHtmlBuilder.codeTab, which passes the source through verbatim, so standalone [source,<lang>,tab] groups and manual language-variant tab groups (e.g. the driver-connection examples in the "Gremlin Server" / connecting-gremlin-server section) rendered without CodeRay highlighting, unlike the published docs. Route these blocks through highlightAsSource + codeTabHighlighted, the same path the gremlin-groovy tabs use. Reference-book CodeRay span count now matches/exceeds published and the connecting-gremlin-server examples render with the expected keyword/string/comment highlighting. Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
…q.7)
The docs build runs the Hadoop-Gremlin OLTP/OLAP examples against the
local filesystem (fs.defaultFS=file:///). Hadoop's RawLocalFileSystem
resolves a bare hdfs.ls() to getHomeDirectory(), which reads the JVM
user.home, so the rendered docs listed the entire contents of the build
machine's home directory instead of the clean HDFS home the published
docs show.
Change the bare hdfs.ls() calls in implementations-hadoop-start and
implementations-hadoop-end to hdfs.ls('tinkerpop-modern.kryo') so they
list only the graph file the example just copied -- deterministic output
with no home-directory leakage, and no change to the shared hadoop-gryo
input location. (A full MiniDFSCluster would reproduce the published
HDFS output exactly; that is tracked separately under tinkerpop-6jq.12.)
Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
…nkerpop-6jq.7) PluginDirectoryRestartHandler moved ext/<plugin> to ext-disabled/<plugin> with Files.move(REPLACE_EXISTING), which throws DirectoryNotEmptyException when ext-disabled/<plugin> already exists as a non-empty directory. An interrupted docs build leaves such a directory, poisoning the next run with "Failed to restart console with excluded plugins". Make the toggle idempotent and source-authoritative: clear any stale destination before moving when disabling, and when enabling drop a leftover disabled duplicate if the plugin is already present in ext/. Also clear ext-disabled/ at the start of bin/process-docs.sh so each build begins from a known state. Adds PluginDirectoryRestartHandlerTest covering the round-trip, double-exclude, and stale-state scenarios. Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
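A minimal sketch of an idempotent, source-authoritative toggle along the lines described above. It is not the PR's PluginDirectoryRestartHandler: the class and method names are hypothetical, plugins.txt handling is omitted, and it assumes ext/ and ext-disabled/ live on the same filesystem so that moving a populated directory is a simple rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical toggle; illustrates the stale-destination handling only. */
final class PluginToggleSketch {
    /** Moves ext/<plugin> aside, clearing any stale copy left by an interrupted run. */
    static void disable(Path extDir, Path disabledDir, String plugin) throws IOException {
        Path src = extDir.resolve(plugin);
        Path dst = disabledDir.resolve(plugin);
        if (!Files.exists(src)) return;   // already disabled: nothing to do
        deleteRecursively(dst);           // a non-empty destination would make the move fail
        Files.createDirectories(disabledDir);
        Files.move(src, dst);
    }

    /** Restores the plugin; if ext/<plugin> already exists it wins and the duplicate is dropped. */
    static void enable(Path extDir, Path disabledDir, String plugin) throws IOException {
        Path src = disabledDir.resolve(plugin);
        Path dst = extDir.resolve(plugin);
        if (Files.exists(dst)) {
            deleteRecursively(src);
            return;
        }
        if (Files.exists(src)) {
            Files.createDirectories(extDir);
            Files.move(src, dst);
        }
    }

    /** Deletes a directory tree, children before parents; a missing path is a no-op. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) return;
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) Files.delete(p);
    }
}
```

Calling disable twice, or calling it when a leftover ext-disabled/<plugin> directory exists, ends in the same state, which is the property the round-trip, double-exclude, and stale-state tests are described as covering.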
Summary
This PR replaces TinkerPop's legacy shell/AWK documentation preprocessor + postprocessor pipeline with a Maven-based AsciidoctorJ extension (tools/tinkerpop-docs). The new extension walks each AsciiDoc book's AST, executes [gremlin-groovy] code blocks against a long-lived Gremlin Console subprocess, and renders the console output as tabbed, syntax-highlighted HTML, producing output structurally equivalent to the published 3.7.7-SNAPSHOT docs while being easier to maintain, test, and run.
Motivation
The old build was a fragile pipeline of bash + awk scripts under docs/preprocessor/ and docs/postprocessor/ that was hard to test, OS-sensitive (required GNU coreutils on macOS), silently swallowed Gremlin execution errors, and depended on a manually configured pseudo-distributed Hadoop cluster. The replacement is a single Maven module with unit tests, fail-fast error handling, and a local-filesystem Hadoop configuration that needs no daemons.
What changed
New AsciidoctorJ extension (tools/tinkerpop-docs)
Orchestration: bin/process-docs.sh rewritten to validate the console/server distributions, install plugins, start a Gremlin Server and Gephi mock, and invoke Maven. Supports --dryRun (render without executing).
Per-book plugin isolation: Neo4j 3.4 (Scala 2.11) and Spark (Scala 2.12) cannot share the console's flat classpath. A :gremlin-docs-plugins-exclude: section attribute drives a console restart with the conflicting plugin directories toggled aside, so both the Neo4j and Spark examples render correctly in the same run. Plugin dependencies are installed into ext/<plugin>/plugin/ (not the shared lib/) so they can be isolated, and the toggle is idempotent/resilient to interrupted builds.
Docs source updates
Removed — the entire docs/preprocessor/ and docs/postprocessor/ script trees (15 files).
Testing
drift.
Tips for reviewers
I've taken the liberty of redeploying the 3.7.7-SNAPSHOT docs from this branch. I would recommend focusing the review on evaluating the built docs. There are a few notable differences worth calling out:
The bare hdfs.ls() calls are replaced with hdfs.ls('tinkerpop-modern.kryo'). This is a minor workaround: the docs build uses the host machine's local filesystem instead of running a local Hadoop cluster, so a bare hdfs.ls() would dump the existing contents of the host's home directory. The old format could be restored by having the docs system internally manage a MiniDFSCluster. This is a viable fix, but I've left it out of scope for this PR to limit complexity.
Future
The goal of this work was to replace the old docs system while keeping the docs output 1:1 equivalent. I think this new extension gives us a better platform for building future enhancements to the docs.
gremlin-groovy examples, and automatically add tabs for all language variants (excluding groovy-specific examples)