Skip to content

Make Tree no longer extend HashMap#3448

Open
GumpacG wants to merge 6 commits into
apache:masterfrom
GumpacG:tree
Open

Make Tree no longer extend HashMap#3448
GumpacG wants to merge 6 commits into
apache:masterfrom
GumpacG:tree

Conversation

@GumpacG

@GumpacG GumpacG commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Changes Tree from extends HashMap<T, Tree<T>> to a final class that
holds a Map internally and exposes a tree-shaped public API. Tree is no
longer a Map, which removes inherited Map methods whose behavior did not
align with tree semantics (e.g. size() returned root-entry count, not total
nodes; containsKey only checked immediate children).

Per the dev-list discussion: https://lists.apache.org/thread/o0nqh6kmrkdht531655p351ldjll045d

This is a breaking change targeted at TinkerPop 4.0.0.

Changes

Tree now exposes:

  • Navigation: rootNodes(), childAt(key) (throws if absent), hasChild(key),
    contains(value) (recursive), findSubtree(key) (recursive, Optional),
    getOrCreateChild(key) (construction primitive).
  • Structure: isLeaf(), nodeCount(), getNodesAtDepth(int),
    getTreesAtDepth(int), getLeafNodes(), getLeafTrees().
  • Composition/output: addTree(), splitParents(), prettyPrint(),
    structural equals/hashCode/toString.

Behavior fixes that fall out of the rewrite:

  • isLeaf() returns true on an empty tree instead of throwing.
  • getNodesAtDepth(0) returns the root nodes (now 0-based).

Internal callers migrated off the removed Map surface:

  • TreeStep / TreeSideEffectStep use getOrCreateChild during construction.
  • GraphBinary TreeSerializer and GraphSON TreeJacksonSerializer/
    Deserializer (V1–V4) iterate via rootNodes()/childAt() and rebuild via
    getOrCreateChild() + addTree().
  • DetachedFactory / ReferenceFactory gained an explicit Tree branch
    (previously handled via the instanceof Map path).

Breaking changes

  • Code treating a Tree as a Map (assignment to Map, instanceof Map,
    get/put/keySet/entrySet/size) no longer compiles.
  • Gremlin patterns that worked only because Tree was a Map (select(keys),
    count(local), unfold() on a tree) no longer compose; process the result
    client-side after next() or reshape the upstream traversal.

Speed and Memory

Analytical estimates (not benchmarked); the backing HashMap table and entries
are identical before and after.

  • Memory: ~16 B more per node — Tree now holds a HashMap instead of
    being one, adding one wrapper object per node. The map itself is unchanged.
  • Speed: comparable. Construction is slightly faster (getOrCreateChild is
    1–2 map ops vs. the old 3); childAt is one op slower (collapsible to one).
    Wire-format encoding is unchanged.

Testing

  • TreeTest rewritten against the new API (navigation, recursive
    containment/search, depth queries, leaf detection, null keys, structural
    equality, toString, pretty-print); TreeSupplierTest updated.

Assisted-by: Kiro: Claude Opus 4.8

Assisted-by: Kiro: Claude Opus 4.8
@GumpacG GumpacG marked this pull request as draft June 4, 2026 16:22
@codecov-commenter

codecov-commenter commented Jun 4, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 89.07104% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.32%. Comparing base (a28cd1f) to head (5a9c33e).
⚠️ Report is 23 commits behind head on master.

Files with missing lines Patch % Lines
...n/structure/io/graphson/GraphSONSerializersV2.java 0.00% 8 Missing ⚠️
...n/structure/io/graphson/GraphSONSerializersV3.java 0.00% 8 Missing ⚠️
...rpop/gremlin/process/traversal/step/util/Tree.java 96.29% 1 Missing and 2 partials ⚠️
...n/structure/io/graphson/GraphSONSerializersV4.java 87.50% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3448      +/-   ##
============================================
- Coverage     76.35%   76.32%   -0.04%     
- Complexity    13424    13541     +117     
============================================
  Files          1012     1017       +5     
  Lines         60341    61014     +673     
  Branches       7075     7147      +72     
============================================
+ Hits          46076    46571     +495     
- Misses        11548    11639      +91     
- Partials       2717     2804      +87     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@GumpacG GumpacG marked this pull request as ready for review June 5, 2026 00:07
Assisted-by: Kiro: Claude Opus 4.8
Comment thread docs/src/upgrade/release-4.x.x.asciidoc Outdated
tree['marko']
tree['marko']['josh']
tree.getObjectsAtDepth(3)
tree.childAt('marko')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice if we didn't lose the groovy syntax. childAt is better than the Java alternative but less so for Groovy. a bit of Groovy sugar would bring that back.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to include tree in the SugarLoader to include support the old syntax. Docs will just have the old syntax in the comment as an option with the sugar plugin.

|`tree.containsKey(key)` |`tree.hasChild(key)`
|`tree.keySet()` |`tree.rootNodes()`
|`tree.size()` |`tree.rootNodes().size()` for root entries, or `tree.nodeCount()` for total nodes
|`getObjectsAtDepth(d)` |`getNodesAtDepth(d)` (now 0-based: depth 0 returns the roots)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know we wanted more tree terminology so we moved from "Objects" to "Node", but it just occurred to me that the return here isn't exactly a "node", is it? I wonder if that will be confusing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's not exactly a "node" however, renaming to "objects" may make other tree functions not very tree-like. For example, nodeCount() would have to be objectCount() and rootNodes() would have to be rootObjects(). I think it would be more conducive for tree users to keep "node".

Comment thread docs/src/upgrade/release-4.x.x.asciidoc
Comment thread CHANGELOG.asciidoc Outdated
spmallette added a commit to spmallette/tinkerpop that referenced this pull request Jun 11, 2026
Determines whether a PR's changed files form one coherent change or
multiple disconnected clusters. Uses the OLAP 'a' traversal source
with connectedComponent() step.

Tested on PR apache#3448: correctly identifies 21-file main cluster
(Tree + serializers + tests) and a 2-file satellite cluster
(test infrastructure with minor cascading changes).
GumpacG added 2 commits June 11, 2026 15:54
Assisted-by: Kiro: Claude Opus 4.8
@kenhuuu

kenhuuu commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

VOTE +1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants