Skip to content

Allow Steps to Accept Traversal#3458

Open
xiazcy wants to merge 6 commits into
masterfrom
steps-taking-traversal
Open

Allow Steps to Accept Traversal#3458
xiazcy wants to merge 6 commits into
masterfrom
steps-taking-traversal

Conversation

@xiazcy

@xiazcy xiazcy commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary

Allow child traversals as arguments to has(), is(), V(), E(), property(), where(P), and all P/TextP predicates. The child traversal is evaluated per-traverser at runtime, enabling dynamic value resolution.

JIRA:

  • TINKERPOP-2586: Traversal objects inside within() silently return empty instead of executing or erroring - fully addressed (traversals are now executed)
  • TINKERPOP-2777: V() cannot accept traversals like select() for data-driven lookups - fully addressed
  • TINKERPOP-3005: property() should accept a traversal producing a Map - fully addressed
  • TINKERPOP-1463: has(key, traversal) doesn't work as expected - partially addressed (dynamic filtering works; runtime folding into index lookups is deferred as a future optimization)

What it enables

// with Modern graph
// Data-driven vertex lookup (TINKERPOP-2777)
gremlin> g.inject(['1','2','3']).as('a').V(__.select('a')).values('name')
==>marko
==>vadas
==>lop

// Dynamic filtering - find people older than marko
gremlin> g.V().has('age', P.gt(__.V(1).values('age'))).values('name')
==>josh
==>peter

// Computed property mutation (TINKERPOP-3005) - set multiple properties from a Map
gremlin> g.V(4).property(__.V(1).project('friendCount','createdSoftware').by(__.out('knows').count()).by(__.out('created').values('name')))
==>v[4]
gremlin> g.V(4).valueMap('friendCount','createdSoftware')
==>[friendCount:[2],createdSoftware:[lop]]

// with AirRoutes graph
// Find airports where the city name ends with the airport code (lowercased)
gremlin> g.V().has("city", TextP.endingWith(__.values("code").toLower())).limit(10).values("code","city").fold()
==>[CUN, Cancun, ORK, Cork, OME, Nome, IEV, Kiev, AJU, Aracaju, IRA, Kirakira, OHE, Mohe, RUA, Arua, VAS, Sivas, AZD, Yazd]

Review Guide

This is a large change (68 files, ~7k lines) but structured in layers. Review bottom-up:

# Commit Layer What to look for
1 5f37d11 Predicate resolution Core: P.resolve(), empty-result semantics, AndP short-circuit, ConnectiveP/NotP delegation
2 6a547d2 Step implementations How steps use P.resolve(). HasStep single path, GraphStep idTraversal, AddPropertyStep mapForm, AcceptsChildPredicateTraversal marker
3 e80423a Grammar Gremlin.g4 nestedTraversal alternatives. Visitors delegate to API (not addStep())
4 1b21d15 GLVs within()/without() serialization. .NET typed overloads
5 793e7a8 Tests + docs Cucumber .feature files define behavioral contract. Reference doc updates
6 400f5ae Cleanup Dead code removal from HasContainer unification. Test quality fixes

Architecture (single resolution path)

                    has("age", __.V(1).values("age"))
                              │
                    DSL wraps: has("age", P.eq(traversal))
                              │
                              ▼
                    HasContainer(key, P.eq(traversal))
                              │
              ┌───────────────┴───────────────┐
              │ HasStep.resolvePredicate()     │
              │   P.resolve(traverser)         │
              │   hc.test(element)             │
              └────────────────────────────────┘

Key decisions to scrutinize

  1. Single resolution path - At the DSL level, has(key, traversal) wraps the traversal in P.eq(traversal) before creating the HasContainer. This means HasStep only has one code path for traversal resolution: call P.resolve(traverser) on the predicate, then hc.test(element). The GremlinLang serialization still records the original has(key, traversal) form (verified by round-trip tests). This is a new 4.0.0 feature, so no existing provider code is affected. Providers implementing custom strategies must check HasContainer.hasTraversal() before folding into index lookups.

  2. Empty-result semantics - When a child traversal produces zero results, behavior depends on predicate type. Collection predicates (within/without) resolve to an empty collection and delegate to Contains.test() which applies correct set-theoretic logic: within([]) is always false, without([]) is always true. Scalar predicates (eq/gt/lt/etc.) set resolvedEmpty=true and steps short-circuit to false since there is no meaningful comparison value. A null result (traversal produces null) is distinct from an empty result: it resolves normally and null is used as the comparison value, consistent with existing P.eq(null) semantics.

  3. AndP short-circuits, OrP doesn't - AndP.resolve() overrides ConnectiveP.resolve() to stop resolving children at the first scalar predicate that resolves empty. This is sound because and(unsatisfiable, X) is always false regardless of X. OrP does NOT short-circuit: a non-empty resolution does not guarantee the predicate will pass test(), so later children must already be resolved when their turn comes. Note: short-circuiting means side-effecting child traversals (e.g., ones using aggregate) in later AndP operands may not execute. This matches standard boolean short-circuit semantics.

  4. P.resolve() mutates in place - resolve() overwrites literals, isCollection, and resolvedEmpty fields on the P instance. In OLTP this is safe since resolve and test happen sequentially per traverser. In OLAP, HasStep.clone() deep-clones predicates per worker, providing thread isolation. The mutation pattern is a known limitation documented for future improvement (return an immutable resolved value instead). Predicate trees of arbitrary depth (AndP containing OrP containing NotP, etc.) are handled by recursive resolution and recursive cloning.

  5. Mutation guard - ChildTraversalValidator rejects any child traversal containing a step that implements the Mutating interface (addV, addE, drop, property). This runs at construction time (DSL methods and P factory methods) and again at strategy time via ChildTraversalVerificationStrategy. Aggregating side-effects (aggregate, store, group) are NOT blocked because they don't modify graph state. Whether they should be restricted is a question for reviewers.

  6. Provider contract - HasContainer.hasTraversal() is the single check point providers must use before folding. Providers that fold HasContainers into index-backed steps without this check will attempt to use an unresolved P value (returns null before resolve() is called) as an index key, producing wrong results. TinkerGraphStepStrategy demonstrates the pattern: skip traversal-bearing HasSteps with continue so that subsequent literal HasSteps can still be folded. Providers with custom GraphStep replacements must also copy the idTraversal field during strategy replacement, otherwise V(traversal) falls through to returning all vertices.

  7. choose() restriction - choose(P) and choose().option(P, ...) reject traversal-bearing predicates at construction time. This is a deliberate scope decision. PredicateTraversal (which evaluates option-key predicates in BranchStep) has access to the traverser and could technically call P.resolve() before P.test(), but extending the branching infrastructure is out of scope for this feature. Users needing dynamic branching can use the choose(Traversal, ...) form where the branch predicate is a full traversal (e.g., choose(__.is(P.gt(traversal)), trueChoice, falseChoice)). This works because IsStep handles resolution normally. Restricting is() when used inside choose()'s branch traversal is not feasible since choose() receives it as an opaque Traversal.Admin and does not inspect internal steps.

What's NOT in this PR

  • P.resolve() thread-safety refactoring (future)
  • instanceof DefaultGraphTraversal cleanup (deferred)
  • Runtime index folding for deterministic child traversals
  • FilterRankingStrategy cost awareness

Compatibility

  • Additive overloads only. No existing step or predicate signature changed.
  • Serialized GremlinLang form is unchanged: has(key, traversal) still serializes as has(key, traversal), verified by GremlinLangTraversalRoundTripTest and GremlinQueryParserTraversalTest.

Testing

All gremlin-core unit tests pass (9132, 0 failures) and the full TinkerGraph Gherkin suite passes (2172 scenarios, 0 failures) against the current master.

Category Coverage
Behavioral contract HasTraversal.feature, IsTraversal.feature, WhereTraversal.feature, VETraversal.feature, PropertyTraversal.feature, ChildTraversalVerification.feature
Internal mechanics PTraversalTest, CloneIndependenceTraversalTest, HasContainerTest, ChildTraversalValidatorTest, ChildTraversalVerificationStrategyTest
Grammar + serialization GremlinQueryParserTraversalTest, GremlinLangTraversalRoundTripTest
Strategy correctness TinkerGraphStepStrategyTraversalTest

Checklist

  • CHANGELOG entry added
  • Reference documentation updated (the-traversal.asciidoc)
  • Upgrade documentation added (release-4.x.x.asciidoc)
  • All GLVs updated (Python, .NET, Go, JavaScript)
  • Full mvn clean install passes
  • No new external dependencies introduced

VOTE +1

xiazcy added 6 commits June 12, 2026 14:07
Extend P, TextP, NotP, and the connective predicates (AndP/OrP) to carry a
child traversal whose result is resolved per-traverser at runtime instead of a
literal value. P.resolve() splits the traverser, seeds and runs the child
traversal, and installs the result as the comparison value for the current test
cycle.

Semantics:
- Scalar predicates (eq/neq/gt/lt/gte/lte) take the first result; an empty
  result cannot be satisfied and is flagged resolved-empty so steps short-circuit.
- Collection predicates (within/without) resolve to a collection; an empty
  result resolves to an empty set so within(empty) is false and without(empty)
  is true, matching literal P.within([])/P.without([]) semantics.
- Multi-traversal within/without combine the first result of each child.
- AndP short-circuits resolution at the first child that resolves empty.
- NotP exposes getWrapped() so traversal collection does not rely on negate().

GremlinLang serializes traversal-bearing predicates as op(traversal).
…mutation guards

Wire the runtime-resolved predicates into the filter, lookup, and mutation
steps and their DSL entry points:
- has(key|T|label, traversal), hasLabel(traversal): normalized to a HasContainer
  holding P.eq(traversal) so there is a single resolution path; the direct
  traversal form on HasContainer is removed.
- is(traversal), where(P) with a traversal-bearing predicate.
- V(traversal)/E(traversal) start steps seed a synthetic traverser (consistent
  with mergeV/mergeE) so the id traversal can run without an incoming traverser.
- property(traversal) producing a Map of key/value pairs, dispatched by an
  internal mapForm flag rather than a sentinel key.
- All/Any/None step support for traversal-bearing predicates.

Mutation safety is enforced in two layers: ChildTraversalValidator at DSL
construction time and ChildTraversalVerificationStrategy at strategy time, the
latter validating any step marked with the new AcceptsChildPredicateTraversal
interface. GraphStep/TinkerGraphStepStrategy skip folding traversal-bearing
HasContainers into index lookups since their value is dynamic.
Extend Gremlin.g4 and the traversal/predicate/spawn visitors to accept a
nestedTraversal where a literal or predicate is expected in has(), hasLabel(),
is(), V(), E(), property(), and the P/TextP predicate productions. Start-step
V(traversal)/E(traversal) parsing is supported; mutating child traversals are
rejected downstream by the verification strategy.
Mirror the Java overloads in gremlin-python, gremlin-go, gremlin-dotnet, and
gremlin-javascript so has(), hasLabel(), is(), V(), E(), property(), and the
P/TextP predicates accept child traversals. Update the .NET translator (Java
and TS) and the translation fixtures, and add the generated cucumber step
bindings for the new feature scenarios.
…ng steps

Add cross-language Gherkin scenarios covering has/is/where/property/V/E with
child-traversal arguments and the child-traversal mutation verification, plus
CHANGELOG, upgrade, and reference documentation describing the feature, its
first-result/empty-set semantics, and the provider-side HasContainer folding
guard.
…te leftover null guards and dead code from HasContainer unification

Assisted-by: Kiro:claude-opus-4-6
Comment on lines +65 to +70
// If the value is a DefaultGraphTraversal (the type created by __.xxx() anonymous traversals),
// treat it as a child traversal rather than a literal. This handles the case where Java's
// overload resolution picks P(BiPredicate, V) instead of P(BiPredicate, Traversal) when
// the caller passes a GraphTraversal. We specifically check for DefaultGraphTraversal
// rather than Traversal to avoid catching internal traversal types like ConstantTraversal,
// ValueTraversal, and IdentityTraversal which are used as literal values in P.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree with handling Traversals other than DefaultGraphTraversal as literal values. I can't imagine a case where I would want to P.eq() a ConstantTraversal, and have that mean constantTraversal.equals(myValue) instead of constantTraversal().next().equals(myValue) like any regular GraphTraversal. That said, I'm not sure when I would ever want to pass a ConstantTraversal into a predicate.

Even if the choice is kept to exclude most of the oddball traversals, I would think testing for GraphTraversal instead of DefaultGraphTraversal would be more reasonable.

// the caller passes a GraphTraversal. We specifically check for DefaultGraphTraversal
// rather than Traversal to avoid catching internal traversal types like ConstantTraversal,
// ValueTraversal, and IdentityTraversal which are used as literal values in P.
if (value instanceof org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic should be folded into setValue() instead of kept exclusively in the constructor. I don't like the implication that new P(Compare.eq, __.V(5)) won't behave the same as new P(Compare.eq, 10).setValue(__.V(5))

private Traversal.Admin<?, ?> traversalValue;
private List<Traversal.Admin<?, ?>> traversalValues;

public P(final PBiPredicate<V, V> biPredicate, final V value) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be careful to avoid the assumption that V value here can't also be a List<Traversal> simply because that explicit overload is present, or because the type parameter theoretically forbids it. This class really abuses its generics, and relies heavily on type erasure to even function. It's not exactly type safe, and generally best to treat V as Object in most instances here.

* Recursively integrates all child traversals found in the predicate tree into the given parent step.
* Handles {@link ConnectiveP} (recurses into children) and {@link NotP} (recurses into wrapped predicate).
*/
public static void integrateTraversals(final P<?> p, final TraversalParent parent) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a bit of an odd API for P to surface, we already have collectTraversals() below. Couldn't the parent steps use this to integrate all of their children directly, instead of delegating this through P?

Comment on lines +60 to +61
private Traversal.Admin<?, ?> traversalValue;
private List<Traversal.Admin<?, ?>> traversalValues;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The result types of these traversals should arguably be V throughout this class. I expect that would help with a lot of the @SuppressWarnings("unchecked") annotations throughout this class. There's probably an argument that the generics are so broken in this class already that it hardly matters, but it would be more proper this way.

Suggested change
private Traversal.Admin<?, ?> traversalValue;
private List<Traversal.Admin<?, ?>> traversalValues;
private Traversal.Admin<?, V> traversalValue;
private List<Traversal.Admin<?, V>> traversalValues;

return new P(Contains.without, value);
}

/**

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit on organization, it would be nice to interleave these overloads with their matching pairs (all the eq's together, all the gt's together...)

*
* @since 4.0.0
*/
public static <V> P<V> eq(final Traversal<?, ?> traversalValue) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't assume that traversals will never end up in the original eq(V value) overload, it's essentially eq(Object value) after erasure and java can be unreliable when selecting overloads sometimes. We ran into that problem quite a bit with GValue, the only value the GValue overloads really gave was making the generics behave a bit nicer in the average case. There's plenty of times where Java chooses to ignore the narrow overloads.

*
* @since 4.0.0
*/
public static <V> P<V> without(final Traversal<?, ?> first, final Traversal<?, ?> second, final Traversal<?, ?>... more) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need special cases for these extra first, second, more overloads?

* Throws {@link IllegalArgumentException} if one is found.
*/
public static void validate(final Traversal.Admin<?, ?> child) {
for (final Step<?, ?> step : child.getSteps()) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this should probably just use

public static <S> List<S> getStepsOfAssignableClassRecursively(final Class<S> stepClass, final Traversal.Admin<?, ?> traversal) {

* @since 4.0.0
*/
public static <V> P<V> lt(final Traversal<?, ?> traversalValue) {
ChildTraversalValidator.validate(traversalValue.asAdmin());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to pull this validation into the P constructor or even setValue(), in most cases P's are built via these static methods but not always.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants