Add CycloneDX SBOM generation for Ballerina packages #44574

Open
Ranvin36 wants to merge 4 commits into ballerina-platform:master from Ranvin36:pack_sbom

@Ranvin36 Ranvin36 commented Apr 27, 2026

Purpose

Generates a CycloneDX SBOM (bom.cdx.json) inside the bala for each package, which enables Xray vulnerability scanning.

Summary

This pull request adds CycloneDX SBOM (Software Bill of Materials) generation capability to Ballerina package archives. The implementation creates a bom.cdx.json file within each bala package, enabling dependency analysis and component tracking.

Changes

New SBOM Generation Utility

  • Introduces SbomGenerator class that converts package manifests (Ballerina.toml and dependencies.toml) into CycloneDX JSON format
  • Parses TOML configuration to extract project metadata and dependency information
  • Generates component entries with Package URL (PURL) identifiers for both Ballerina and Maven packages
  • Resolves dependency relationships and produces formatted CycloneDX JSON output

Integration into Bala Archive Creation

  • Enhanced BalaWriter to invoke SBOM generation during package archiving
  • Creates temporary artifacts during generation and ensures cleanup through exception handling
  • Embeds the generated SBOM file directly into the bala archive
  • Non-fatal failures in SBOM generation do not prevent bala packaging

Module Dependencies

  • Added toml4j module requirement to support TOML manifest parsing

Quality Assurance

  • Added comprehensive test suite covering basic SBOM generation and dependency resolution scenarios
  • Tests validate correct component inclusion and dependency relationships in generated SBOMs

Outcome

Packages now include machine-readable component and dependency information, enabling downstream tools to perform analysis of package contents and transitive dependencies.


coderabbitai Bot commented Apr 27, 2026

Walkthrough

The changes introduce SBOM (Software Bill of Materials) generation for Bala archives. A new SbomGenerator class extracts package metadata and dependencies from Ballerina.toml and dependencies.toml, converting them to CycloneDX JSON format. The BalaWriter integrates this by invoking SBOM generation when packaging, writing the generated bom.cdx.json into the archive.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| SBOM Generation: compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java | New class that parses TOML manifests (Ballerina.toml and dependencies.toml), extracts project metadata and dependencies (Ballerina and Maven packages), constructs CycloneDX component maps keyed by PURL, resolves versionless PURLs by matching group+name, merges dependency nodes, and generates a CycloneDX JSON (bom.cdx.json) using GSON pretty-printing. |
| Bala Archive Integration: compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java | Integrated SBOM generation into Bala packaging: added private helper method addBOM() that creates temporary directory, invokes SbomGenerator.generateBom(), locates generated .cdx.json, writes it to archive, and cleans up; failures are caught and ignored. Minor adjustment to addPackageDoc to use method invocation on project().sourceRoot(). |
| Module Dependency: compiler/ballerina-lang/src/main/java/module-info.java | Added requires toml4j; declaration to support TOML manifest parsing. |
| SBOM Tests: compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java | New TestNG test suite with two test methods: testGenerateBomBasic() validates SBOM generation from minimal Ballerina.toml; testGenerateBomWithDependencies() validates inclusion of Ballerina and Maven dependencies in the SBOM with correct PURL references and dependency links. Includes helper methods for locating generated .cdx.json and recursive directory cleanup. |

Sequence Diagram(s)

sequenceDiagram
    participant BalaWriter
    participant SbomGenerator
    participant ManifestFile as Manifest<br/>(Ballerina.toml/<br/>dependencies.toml)
    participant FileSystem
    participant BalaArchive

    BalaWriter->>SbomGenerator: generateBom(manifestPath, outputPath)
    activate SbomGenerator
    
    SbomGenerator->>ManifestFile: read & parse TOML
    activate ManifestFile
    ManifestFile-->>SbomGenerator: project metadata & dependencies
    deactivate ManifestFile
    
    SbomGenerator->>SbomGenerator: extract packages & libraries
    SbomGenerator->>SbomGenerator: build PURL mappings<br/>(Ballerina & Maven)
    SbomGenerator->>SbomGenerator: resolve versionless PURLs
    SbomGenerator->>SbomGenerator: merge & construct<br/>CycloneDX JSON
    
    SbomGenerator->>FileSystem: write bom.cdx.json
    FileSystem-->>SbomGenerator: ✓ file written
    deactivate SbomGenerator
    
    BalaWriter->>FileSystem: locate bom.cdx.json
    FileSystem-->>BalaWriter: ✓ file path
    
    BalaWriter->>BalaArchive: add bom.cdx.json
    BalaArchive-->>BalaWriter: ✓ added to archive
    
    BalaWriter->>FileSystem: cleanup temporary files

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A SBOM in every archive we weave,
CycloneDX components, dependencies achieved!
From Ballerina manifests, PURLs unfold,
Maven links intertwine, a story retold.
In bom.cdx.json, our treasures reside,
With Bala packages bundled with pride! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is severely incomplete, providing only the Purpose section while missing critical sections including Approach, Samples, Remarks, and the entire Check List. | Complete the description by adding the Approach section explaining the implementation design, optional Samples section, Remarks for known issues/related PRs, and the Check List with checkboxes confirming contribution guidelines compliance and testing coverage. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 56.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding CycloneDX SBOM generation for Ballerina packages, which directly aligns with all modified files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 12

🧹 Nitpick comments (12)
compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java (1)

146-146: Naming: prefer addBom over addBOM to match the surrounding camelCase convention.

Other helpers in this class are addBalaJson, addPackageJson, addDependenciesJson, etc. addBOM is the only all-caps acronym. Rename to addBom for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java`
at line 146, Rename the method addBOM to addBom in the BalaWriter class and
update all call sites (e.g., the call currently written as
addBOM(balaOutputStream)) to use addBom to follow camelCase convention; change
the method declaration name from addBOM to addBom and update any internal
references, Javadoc, and tests that refer to addBOM to the new name to avoid
compilation errors.
compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java (3)

34-34: Test class should be final and follow the project's test conventions.

Minor: most other test classes in compiler/ballerina-lang/src/test/java are final and avoid wildcard imports (line 31: import java.util.*;). Aligning here keeps the pattern consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java`
at line 34, Make the test class final and remove the wildcard java import:
change the class declaration SbomGeneratorTest to "final class
SbomGeneratorTest" and replace the "import java.util.*;" wildcard with explicit
imports for only the types used in this file (e.g., List, Map, Optional, etc.)
so the test follows the project's conventions and matches other tests in the
directory.

95-97: Loose assertions weaken these tests.

buildPurl(...) is deterministic: when org="ballerina" (no dot, non-empty), the produced purl is always pkg:ballerina/ballerina/io@1.8.0. The contains(...) || contains(...) form lets a future refactor silently change the emitted format on one branch without the test noticing. Pin the exact expected purl and assert just that — and include the actual purl set in the failure message so debugging is fast:

🔧 Proposed fix
-                Assert.assertTrue(purls.contains("pkg:ballerina/ranvin/math_utils_v2@2.1.2"));
-                Assert.assertTrue(purls.contains("pkg:ballerina/ballerina/io@1.8.0") || purls.contains("pkg:ballerina/io@1.8.0"),
-                        "Expected ballerina io component");
-                Assert.assertTrue(purls.contains("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"), "Expected maven log4j component");
+                Assert.assertTrue(purls.contains("pkg:ballerina/ranvin/math_utils_v2@2.1.2"), "purls=" + purls);
+                Assert.assertTrue(purls.contains("pkg:ballerina/ballerina/io@1.8.0"),
+                        "Expected ballerina io component, purls=" + purls);
+                Assert.assertTrue(purls.contains("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"),
+                        "Expected maven log4j component, purls=" + purls);
@@
-                Assert.assertTrue(depsSet.contains("pkg:ballerina/ballerina/io@1.8.0") || depsSet.contains("pkg:ballerina/io@1.8.0"));
-                Assert.assertTrue(depsSet.contains("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"));
+                Assert.assertTrue(depsSet.contains("pkg:ballerina/ballerina/io@1.8.0"), "depsSet=" + depsSet);
+                Assert.assertTrue(depsSet.contains("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"),
+                        "depsSet=" + depsSet);

Also applies to: 109-110

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java`
around lines 95 - 97, The assertions in SbomGeneratorTest use a loose
"contains(a) || contains(b)" pattern for purls produced by buildPurl; change
these to assert equality against the exact deterministic purl string (e.g.,
"pkg:ballerina/ballerina/io@1.8.0" and
"pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1") and include the actual
purls set (the variable purls) in the assertion failure message so failures show
what was produced; update the two occurrences around the checks that reference
buildPurl/purls (the io and log4j assertions) accordingly.

34-34: Add coverage for failure / no-op paths.

The two tests cover the happy paths but miss several branches that are easy to regress on:

  • A Ballerina.toml missing required fields (e.g., no name/version) — should surface a TomlException per extractPackageMetadata.
  • A manifest where [package] is absent and componentsByPurl ends up empty — currently generateBom writes nothing and doesn't create an output file. BalaWriter.addBOM then silently produces no bom.cdx.json in the bala. Both behaviors deserve a pinned test.
  • Verifying that the output filename is normalized to end with .cdx.json even when the caller passed bom-output.json (the tests already pass this, but never assert the filename of the file actually written — they only locate any *.cdx.json).
  • A duplicate-dependency case across Ballerina.toml (Maven) and dependencies.toml (Ballerina) to lock in the de-duplication-by-purl behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java`
at line 34, Add unit tests in SbomGeneratorTest that cover the failure/no-op
branches: (1) a test that feeds a Ballerina.toml missing required fields and
asserts extractPackageMetadata throws TomlException; (2) a test that builds a
manifest with no [package] section so generateBom yields an empty
componentsByPurl and verify that BalaWriter.addBOM produces no bom file (assert
no output file is created or that generateBom returns an expected no-op result);
(3) a test that requests an output filename like "bom-output.json" and asserts
the actual file written by generateBom/BalaWriter ends with ".cdx.json"
(normalize behavior); and (4) a duplicate-dependency test where the same
dependency appears in Ballerina.toml (Maven style) and dependencies.toml
(Ballerina style) and assert the final BOM deduplicates by purl. Locate and
exercise extractPackageMetadata, generateBom, and BalaWriter.addBOM when adding
these tests.
compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java (7)

32-37: Class Javadoc could use @since and tighten the description.

This class is part of a public package (io.ballerina.projects.util is exported in module-info.java), so the Javadoc is part of the API contract. Add @since and clarify that generateBom is the sole entry point and that it writes one *.cdx.json file per call. Currently the second sentence in the class Javadoc starts in the description block without a blank <p> paragraph break, which renders awkwardly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
around lines 32 - 37, Update the SbomGenerator class Javadoc to include an
`@since` tag and reword/format the description: add a blank paragraph break (<p>)
before the second sentence, state that generateBom(...) is the sole public entry
point, and explicitly document that each call to generateBom writes a single
"*.cdx.json" CycloneDX BOM file; reference the class name SbomGenerator and the
public method generateBom in the Javadoc so the API contract is clear.

30-30: Avoid wildcard import.

import java.util.*; masks which collection types this class actually relies on (LinkedHashMap, LinkedHashSet, UUID, Comparator, etc.) and is at odds with the rest of io.ballerina.projects.util, which uses explicit imports. Same concern in SbomGeneratorTest.java line 31. Replace with explicit imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
at line 30, Replace the wildcard import in SbomGenerator.java (and the similar
wildcard in SbomGeneratorTest.java) with explicit java.util imports used by the
class: remove "import java.util.*;" and add the specific types referenced (for
example LinkedHashMap, LinkedHashSet, UUID, Comparator, List, Map, Set,
ArrayList — or whichever exact collection/utility types SbomGenerator and its
test actually use) so the file shows explicit imports instead of a wildcard.

264-265: Long lines exceed typical Ballerina compiler module style (120 cols).

The exception messages here run well past 120 columns on a single line, which doesn't match the surrounding style in compiler/ballerina-lang. Consider splitting them. Same applies to lines 75, 79, 98, 115 in this file. (Static analysis may flag these once Checkstyle is run.)

Also applies to: 313-314, 373-374

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
around lines 264 - 265, The long single-line exception messages in SbomGenerator
(e.g., the throw new IllegalArgumentException("Package metadata is
incomplete...") and similar throws around the class) exceed the 120-column
style; update each throw site in SbomGenerator to split the message into
concatenated string literals or build the message with String.format/variables
so no source line exceeds 120 columns (identify the sites by the throw new
IllegalArgumentException/RuntimeException statements and the long messages
currently around the class, including the ones you noted at the earlier line
ranges) and ensure each split preserves the exact message content when
concatenated at runtime.

78-82: Catching Exception is too broad and risks masking programming errors.

These catch (Exception e) blocks turn any RuntimeException (NPE, ClassCastException, etc.) thrown by the TOML parser or by your own helpers into a TomlException, hiding the original location. Narrow to the exceptions toml4j actually throws (IllegalStateException for type mismatches, plus IOException from the underlying read — already declared via read(File)), or at the very least re-throw Error/RuntimeException when they're not parser errors.

Note: line 78 also catches IllegalArgumentException, which the previous catch on line 74 already handles — so the first catch is partially redundant. Coordinating the catches makes the intent clearer.

Also applies to: 97-101, 113-118
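
As a rough illustration of the narrowing suggested here, the sketch below wraps only the two exception types the comment names (IOException from the read, IllegalStateException from the parser) and lets every other unchecked exception propagate. Everything in it is hypothetical: ManifestParseException stands in for the PR's TomlException, and parseName is a toy stand-in for the real TOML parser, not toml4j itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class NarrowCatchSketch {

    /** Hypothetical stand-in for the PR's TomlException. */
    static final class ManifestParseException extends Exception {
        ManifestParseException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Toy stand-in for a TOML parser (assumption, not toml4j): returns the value
     * of the first "name = ..." line and signals malformed input with
     * IllegalStateException.
     */
    static String parseName(String content) {
        for (String line : content.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("name")) {
                int eq = trimmed.indexOf('=');
                if (eq < 0) {
                    throw new IllegalStateException("expected '=' after name");
                }
                return trimmed.substring(eq + 1).trim().replace("\"", "");
            }
        }
        throw new IllegalStateException("missing 'name' key");
    }

    /**
     * Converts only I/O and parser failures into the checked exception.
     * A NullPointerException or ClassCastException is deliberately not caught,
     * so a programming error keeps its original stack trace.
     */
    static String readPackageName(Path manifest) throws ManifestParseException {
        try {
            return parseName(Files.readString(manifest));
        } catch (IOException | IllegalStateException e) {
            throw new ManifestParseException(
                    "Failed to read " + manifest + ": " + e.getMessage(), e);
        }
    }
}
```

With this shape, a malformed manifest surfaces as ManifestParseException with the parser error as its cause, while passing a null path still fails with a plain NullPointerException.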

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
around lines 78 - 82, The catch blocks in SbomGenerator that currently use
"catch (Exception e)" (and a redundant catch for IllegalArgumentException) are
too broad; change them to only handle the TOML/parser exceptions you expect
(e.g., IllegalStateException and IOException from toml4j or the file read),
preserve the original cause by wrapping those specific exceptions into a
TomlException, and rethrow unchecked Throwables (RuntimeException and Error)
unchanged so you don't mask programming errors; also remove the duplicate
IllegalArgumentException catch and coordinate the remaining catches in the
methods of class SbomGenerator where TomlException is constructed so only
parser/IO exceptions are converted.

597-597: Heuristic org.contains(".") to decide Maven vs Ballerina is brittle.

The PURL kind is being inferred from whether the org string contains a dot. This conflates two different concerns:

  • A Ballerina org that happens to contain a . (the validator may evolve over time) is silently mis-categorized as Maven.
  • A Maven groupId without a dot (rare but valid — junit:junit:4.x historical examples) is mis-categorized as Ballerina.

Pass the kind explicitly. The two callers already know which kind they're emitting:

  • extractMavenDependencies → always Maven.
  • extractPackageMetadata / extractBallerinaDependencies → always Ballerina.
🔧 Sketch
private static String buildBallerinaPurl(String org, String name, String version) { ... }
private static String buildMavenPurl(String group, String artifactId, String version) { ... }
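
To make the two signatures above concrete, here is one possible filling-in. It is a sketch only: the class name PurlBuilders and the versionSuffix helper are hypothetical, the output format is inferred from the PURLs asserted in the PR's tests, and percent-encoding of special characters (which a complete PURL implementation needs) is left out.

```java
final class PurlBuilders {

    private PurlBuilders() {
    }

    /** Always emits a Ballerina PURL, whatever characters the org contains. */
    static String buildBallerinaPurl(String org, String name, String version) {
        return "pkg:ballerina/" + org + "/" + name + versionSuffix(version);
    }

    /** Always emits a Maven PURL, even for a groupId without a dot. */
    static String buildMavenPurl(String groupId, String artifactId, String version) {
        return "pkg:maven/" + groupId + "/" + artifactId + versionSuffix(version);
    }

    /** Hypothetical helper: a missing version yields a versionless PURL. */
    private static String versionSuffix(String version) {
        return (version == null || version.isEmpty()) ? "" : "@" + version;
    }
}
```

Because the caller picks the builder, a dotless Maven groupId such as junit is no longer at risk of being emitted as a Ballerina package.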
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
at line 597, The heuristic using "org.contains(\".\")" to set isMavenDependency
in SbomGenerator is brittle; remove that check and change the PURL construction
to two explicit builders (e.g., buildBallerinaPurl(String org, String name,
String version) and buildMavenPurl(String group, String artifactId, String
version)) or add an explicit kind parameter to the existing PURL builder, then
update the callers: have extractMavenDependencies always call the Maven builder
(or pass "maven") and have extractPackageMetadata/extractBallerinaDependencies
always call the Ballerina builder (or pass "ballerina"), ensuring the logic no
longer relies on org.contains(".") and that PURL creation uses the correct
builder based on the caller.

522-522: Consider bumping specVersion to a current CycloneDX version.

CycloneDX 1.4 is from 2022; 1.5 (2023), 1.6 (2024), and 1.7 (2025) are widely supported by tools like Xray, Dependency-Track, and Trivy. Since this generator is brand new, picking 1.5–1.7 avoids an immediate follow-up bump and unlocks fields like lifecycles, licenses with SPDX expressions, and improved bom-ref handling. Tools that ignore unknown fields will usually read a newer document without trouble, but consumers that validate strictly against the 1.4 schema may reject it; weigh this against the benefits of staying current.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
at line 522, Update the CycloneDX specVersion set in SbomGenerator (the
bom.put("specVersion", ...) call in SbomGenerator.java) to a current supported
version (e.g., "1.7" or at least "1.6"/"1.5") by changing the literal value;
after updating, run/adjust any SBOM schema validations or unit tests that assert
the previous "1.4" string to ensure compatibility with the newer spec and any
new fields you plan to emit.

38-38: Consider adopting cyclonedx-core-java (v12.1.0) to replace hand-rolled SBOM generation.

This file implements ~840 lines of functionality—PURL construction via string concatenation, PURL parsing via manual string operations (.indexOf(), .substring(), .split()), component model, JSON serialization—that org.cyclonedx:cyclonedx-core-java already provides out-of-the-box. The library includes Bom, Component, Dependency, BomGenerator, BomJsonGenerator classes, and PackageURL (via com.github.package-url:packageurl-java), all schema-validated against the official CycloneDX specification.

Benefits of migrating:

  • Eliminates brittle string parsing in parsePurlToComponent and parseDependencyString (replace with PackageURL).
  • Guarantees output validates against the published CycloneDX JSON schema (currently targeting v1.4; library supports up to v1.6.1).
  • Spec version upgrades become trivial.
  • Reduces codebase by ~600 LOC.

This is a significant refactor worth prioritizing before the code accumulates more dependents. Consider opening a tracking issue if not done immediately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
at line 38, Replace the hand-rolled SBOM generation in SbomGenerator with
cyclonedx-core-java v12.1.0: stop building PURLs and parsing them manually in
methods like parsePurlToComponent and parseDependencyString and instead use
com.github.packageurl.PackageURL to parse package URLs and
org.cyclonedx.models.Bom / Component / Dependency and
org.cyclonedx.generators.json.BomJsonGenerator to build and serialize the BOM;
update SbomGenerator to construct Component instances from PackageURL (mapping
type/name/version/namespace), wire dependencies via Dependency objects, and emit
schema-compliant JSON via BomJsonGenerator (ensuring you remove the manual
string concatenation/parsing logic and replace it with these library calls).
compiler/ballerina-lang/src/main/java/module-info.java (1)

23-23: Use the existing TOML parsing APIs instead of pulling in toml4j.

The codebase already has io.ballerina.toml and related APIs (BallerinaToml, DependencyManifest, PackageManifest) that are actively used throughout the compiler module for parsing manifests. Introducing toml4j adds unnecessary dependency surface, divergence in TOML parsing behavior, and supply-chain risk by pinning a second TOML library.

Additionally, toml4j (com.moandjiezana.toml) is unmaintained (last release several years ago) and does not ship as a proper JPMS named module. The requires toml4j; declaration in module-info.java will rely on an automatic module name, which is fragile and not recommended for a long-lived compiler module.

Refactor to use the existing manifest APIs or io.ballerina.toml directly through PackageContext, which also gives you access to resolved/effective dependency versions instead of re-parsing raw TOML.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@compiler/ballerina-lang/src/main/java/module-info.java` at line 23, The
module declares a new requires toml4j in module-info.java which introduces an
unnecessary, unmaintained TOML dependency; replace uses of com.moandjiezana.toml
by switching callers to the existing compiler manifest APIs (e.g. BallerinaToml,
DependencyManifest, PackageManifest) or use PackageContext to obtain
parsed/effective manifest data instead of re-parsing with toml4j; update any
code that referenced toml4j parsing to call the appropriate API
(BallerinaToml.parse / PackageContext.getManifest / DependencyManifest
accessors) and remove the requires toml4j line from module-info.java so the
module relies on the in-tree io.ballerina.toml implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java`:
- Around line 187-188: The ternary that computes the local variable pkg (String
pkg = packageContext != null ? packageContext.packageName().toString() :
"<unknown";) is dead and unused; remove that declaration and the null-check, and
simplify the exception throw at the end of the SBOM generation block (the throw
new ProjectException(..., se) line) so it does not depend on pkg or redundant
packageContext checks—leave the throw using se.getMessage() and se as before.
Target the BalaWriter code paths around packageContext and the SBOM generation
failure throw to remove the unused pkg variable and any unnecessary null logic.
- Around line 192-194: Replace the direct call to Files.list(tmpDir) with a
try-with-resources to ensure the directory stream is closed: declare
Optional<Path> generated outside the block, then use try (Stream<Path> stream =
Files.list(tmpDir)) { generated = stream.filter(p ->
p.getFileName().toString().toLowerCase().endsWith(".cdx.json")).findFirst(); }
so the stream returned by Files.list(tmpDir) is closed; update the BalaWriter
code where generated is computed to use this pattern.
- Around line 183-218: The current inner catch around SbomGenerator.generateBom
in BalaWriter (catch (TomlException | IOException se) that throws new
ProjectException) will abort packaging; change this to best-effort: replace the
throw with logging a warning (include packageContext.packageName() if non-null
and se.getMessage()) and skip adding SBOM (e.g., return from the SBOM-generation
block or continue) so bala packaging proceeds. Also ensure the outer catch
(IOException e) no longer silently swallows errors — log an error/warning with
the exception details (instead of empty comment) so real IO failures (like
Files.createTempDirectory) are visible. Target symbols:
SbomGenerator.generateBom, the inner catch handling the variable se, the thrown
ProjectException, and the outer catch (IOException e).
- Around line 162-178: Replace hardcoded manifest filenames in BalaWriter (the
block that computes manifestPath using packageContext.project().sourceRoot())
with ProjectConstants.BALLERINA_TOML and ProjectConstants.DEPENDENCIES_TOML;
specifically, look up the canonical ProjectConstants.BALLERINA_TOML first, then
fall back to ProjectConstants.DEPENDENCIES_TOML (and avoid checking a
lowercase-only variant), so the code uses the established constants and the
canonical names in the checks and Files.exists calls.
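
The Files.list item above (lines 192-194) could look roughly like the following. This is a minimal sketch, not the PR's code: the class and method names are hypothetical, and it returns the Optional from inside the try block rather than assigning to an outer variable. Locale.ROOT is added so the lower-casing does not depend on the default locale.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Stream;

final class CdxFileLocator {

    private CdxFileLocator() {
    }

    /**
     * Returns the first entry in dir whose name ends with ".cdx.json".
     * The stream from Files.list holds an open directory handle, so it is
     * closed by try-with-resources once the lookup finishes.
     */
    static Optional<Path> findGeneratedBom(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT)
                            .endsWith(".cdx.json"))
                    .findFirst();
        }
    }
}
```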

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`:
- Line 87: The code in SbomGenerator uses
manifestPath.getParent().resolve("dependencies.toml") which will miss the
canonical Ballerina filename on case-sensitive filesystems; change the lookup to
use the canonical constant ProjectConstants.DEPENDENCIES_TOML (or otherwise
check both "Dependencies.toml" and "dependencies.toml") when resolving the
dependencies file, updating any usage around the manifestPath variable in
SbomGenerator to rely on ProjectConstants.DEPENDENCIES_TOML to ensure the
correct file is found.
- Around line 122-201: The current branch quietly does nothing when
componentsByPurl is empty, resulting in no SBOM file; update SbomGenerator so
that when componentsByPurl.isEmpty() it still builds and writes a minimal
CycloneDX BOM JSON (with bomFormat, specVersion, serialNumber, metadata and
empty components/dependencies arrays) by calling buildBomJson(...) with an empty
components map and empty dependency list, then writing to the path produced by
ensureCdxJsonExtension(outputPath) (creating parent dirs and writing to
finalOutputPath as done in the existing try-with-resources). Also, add a clear
signal (either a return value or an exception) or ensure BalaWriter logs a
warning when the generator writes just the minimal BOM so callers can
distinguish "generated empty SBOM" from "skipped".
- Around line 61-63: The code dereferences manifestPath.getFileName() and
manifestPath.getParent() unguarded, which can NPE for relative/root paths; in
SbomGenerator.generateBom (and any constructor/usage creating ballerinaTomlPath)
normalize or guard manifestPath first: convert manifestPath to an
absolute/normalized Path (e.g., manifestPath =
manifestPath.toAbsolutePath().normalize()) or check for nulls before use, then
compute ballerinaTomlPath by first ensuring getFileName()!=null and using
getParent()!=null (fallback to manifestPath itself or Paths.get(".") as
appropriate) before calling resolve("Ballerina.toml"); update both the
ballerinaTomlPath assignment and the similar logic at the other occurrence
(around line 87) to use the same null-safe/normalized approach.
- Around line 198-200: Replace the PrintWriter usage that swallows IOExceptions
with the underlying BufferedWriter so write errors propagate: in SbomGenerator
(the method that opens finalOutputPath) remove the try-with-resources creating
new PrintWriter(Files.newBufferedWriter(finalOutputPath)) and instead use the
BufferedWriter returned by Files.newBufferedWriter(finalOutputPath) directly to
write bomJson (and let IOException propagate); also remove the now-unused import
java.io.PrintWriter. Ensure the resource is closed via try-with-resources and no
checkError() calls are used.
- Around line 822-833: The if-block in SbomGenerator that checks "if
(s.contains(\"/\") && s.contains(\"@\"))" is unreachable because earlier logic
already returns for any s containing "@"; remove this entire branch (the check
and its local variables atPos, left, version, slash, group, artifact and the
call to buildPurl) to eliminate dead code and keep the remaining parsing logic
intact in SbomGenerator.java.
- Line 518: The current SBOM serial is non-deterministic because SbomGenerator
builds serial with "urn:uuid:"+UUID.randomUUID(), which breaks reproducible bala
builds; change SbomGenerator to compute the serial deterministically from the
package coordinates (e.g., organization, package name, version and any module
identifier) by creating a stable byte representation of those fields and
deriving a UUID via UUID.nameUUIDFromBytes(...) or by hashing (SHA-1/SHA-256)
the concatenated coordinates and formatting as
"urn:uuid:"+UUID.nameUUIDFromBytes(hashBytes). Replace the UUID.randomUUID()
usage where the local variable serial is assigned so the same source always
yields the same serialNumber in bom.cdx.json. Ensure you use the same stable
field order and encoding when building the input bytes.
- Around line 289-314: The code currently throws IllegalArgumentException when a
Maven coord is incomplete (version null), breaking valid path-only dependencies;
update SbomGenerator so it only requires groupId and artifactId (allowing
version to be null) and do not throw for path-only entries: in the block that
currently checks if (groupId != null && artifactId != null && version != null)
change the condition to require only groupId and artifactId, call
buildPurl(groupId, artifactId, version) which already tolerates null version,
populate componentsByPurl and dependencyNodes as before, and for entries that
lack groupId/artifactId (path-only platform.javaXX.dependency) simply
skip/continue instead of throwing; keep extractDependencyRefs, componentsByPurl,
and dependencyNodes usage intact.
- Line 275: Replace the hardcoded loop over new
String[]{"java8","java11","java17","java21"} in SbomGenerator (the for loop that
processes javaVersion) with code that discovers Java platforms dynamically:
obtain the "platform" table from the parsed project model, iterate its child
keys, filter keys that start with "java" (e.g., "java25"), collect them (skip
null table), and iterate that list so SBOM generation (the existing logic that
consumes javaVersion) covers any newly added JDK platforms instead of silently
missing them.
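
The path-handling, writer, and serial-number items above can be sketched together as follows. This is a minimal illustration, not the PR's code: the class `BomFileSketch` and its three helper names are hypothetical, and the choice of `org/name@version` as the input to the name-based UUID is an assumption about which package coordinates should feed the serial.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical sketch class; none of these names come from the PR.
final class BomFileSketch {

    private BomFileSketch() {
    }

    // Resolves Ballerina.toml next to the manifest without dereferencing a null parent.
    static Path siblingBallerinaToml(Path manifestPath) {
        Path normalized = manifestPath.toAbsolutePath().normalize();
        Path parent = normalized.getParent();
        // A filesystem root has no parent; fall back to the path itself in that case.
        Path baseDir = parent != null ? parent : normalized;
        return baseDir.resolve("Ballerina.toml");
    }

    // Derives a stable serial from the package coordinates (assumed field order:
    // org, name, version) using a name-based UUID instead of UUID.randomUUID().
    static String deterministicSerial(String org, String name, String version) {
        String coordinates = org + "/" + name + "@" + version;
        return "urn:uuid:" + UUID.nameUUIDFromBytes(coordinates.getBytes(StandardCharsets.UTF_8));
    }

    // Writes the BOM through a BufferedWriter so that write failures surface as
    // IOException rather than being swallowed, as PrintWriter would do.
    static void writeBom(Path outputPath, String bomJson) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(outputPath)) {
            writer.write(bomJson);
        }
    }
}
```

With this shape, two builds of the same package version would produce the same `serialNumber`, and a failed write would reach the caller as an `IOException`.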

---

Nitpick comments:
In `@compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java`:
- Line 146: Rename the method addBOM to addBom in the BalaWriter class and
update all call sites (e.g., the call currently written as
addBOM(balaOutputStream)) to use addBom to follow camelCase convention; change
the method declaration name from addBOM to addBom and update any internal
references, Javadoc, and tests that refer to addBOM to the new name to avoid
compilation errors.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`:
- Around line 32-37: Update the SbomGenerator class Javadoc to include an `@since`
tag and reword/format the description: add a blank paragraph break (<p>) before
the second sentence, state that generateBom(...) is the sole public entry point,
and explicitly document that each call to generateBom writes a single
"*.cdx.json" CycloneDX BOM file; reference the class name SbomGenerator and the
public method generateBom in the Javadoc so the API contract is clear.
- Line 30: Replace the wildcard import in SbomGenerator.java (and the similar
wildcard in SbomGeneratorTest.java) with explicit java.util imports used by the
class: remove "import java.util.*;" and add the specific types referenced (for
example LinkedHashMap, LinkedHashSet, UUID, Comparator, List, Map, Set,
ArrayList — or whichever exact collection/utility types SbomGenerator and its
test actually use) so the file shows explicit imports instead of a wildcard.
- Around line 264-265: The long single-line exception messages in SbomGenerator
(e.g., the throw new IllegalArgumentException("Package metadata is
incomplete...") and similar throws around the class) exceed the 120-column
style; update each throw site in SbomGenerator to split the message into
concatenated string literals or build the message with String.format/variables
so no source line exceeds 120 columns (identify the sites by the throw new
IllegalArgumentException/RuntimeException statements and the long messages
currently around the class, including the ones you noted at the earlier line
ranges) and ensure each split preserves the exact message content when
concatenated at runtime.
- Around line 78-82: The catch blocks in SbomGenerator that currently use "catch
(Exception e)" (and a redundant catch for IllegalArgumentException) are too
broad; change them to only handle the TOML/parser exceptions you expect (e.g.,
IllegalStateException and IOException from toml4j or the file read), preserve
the original cause by wrapping those specific exceptions into a TomlException,
and rethrow unchecked Throwables (RuntimeException and Error) unchanged so you
don't mask programming errors; also remove the duplicate
IllegalArgumentException catch and coordinate the remaining catches in the
methods of class SbomGenerator where TomlException is constructed so only
parser/IO exceptions are converted.
- Line 597: The heuristic using "org.contains(\".\")" to set isMavenDependency
in SbomGenerator is brittle; remove that check and change the PURL construction
to two explicit builders (e.g., buildBallerinaPurl(String org, String name,
String version) and buildMavenPurl(String group, String artifactId, String
version)) or add an explicit kind parameter to the existing PURL builder, then
update the callers: have extractMavenDependencies always call the Maven builder
(or pass "maven") and have extractPackageMetadata/extractBallerinaDependencies
always call the Ballerina builder (or pass "ballerina"), ensuring the logic no
longer relies on org.contains(".") and that PURL creation uses the correct
builder based on the caller.
- Line 522: Update the CycloneDX specVersion set in SbomGenerator (the
bom.put("specVersion", ...) call in SbomGenerator.java) to a current supported
version (e.g., "1.7" or at least "1.6"/"1.5") by changing the literal value;
after updating, run/adjust any SBOM schema validations or unit tests that assert
the previous "1.4" string to ensure compatibility with the newer spec and any
new fields you plan to emit.
- Line 38: Replace the hand-rolled SBOM generation in SbomGenerator with
cyclonedx-core-java v12.1.0: stop building PURLs and parsing them manually in
methods like parsePurlToComponent and parseDependencyString and instead use
com.github.packageurl.PackageURL to parse package URLs and
org.cyclonedx.models.Bom / Component / Dependency and
org.cyclonedx.generators.json.BomJsonGenerator to build and serialize the BOM;
update SbomGenerator to construct Component instances from PackageURL (mapping
type/name/version/namespace), wire dependencies via Dependency objects, and emit
schema-compliant JSON via BomJsonGenerator (ensuring you remove the manual
string concatenation/parsing logic and replace it with these library calls).
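
For the `Line 597` item above, the two explicit builders could look roughly like the sketch below. The method names follow the review's own suggestion; the class name `PurlBuilders`, the private helper, and the decision to omit percent-encoding are assumptions made for brevity, and the `pkg:ballerina/...` and `pkg:maven/...` shapes are taken from the purl strings quoted in the test nitpicks.

```java
// Hypothetical sketch: each caller picks the builder that matches its package kind,
// so no heuristic on the org string is needed.
final class PurlBuilders {

    private PurlBuilders() {
    }

    // For packages read from the [package] section and Ballerina dependency entries.
    static String buildBallerinaPurl(String org, String name, String version) {
        return withVersion("pkg:ballerina/" + org + "/" + name, version);
    }

    // For platform (Maven) dependency entries.
    static String buildMavenPurl(String groupId, String artifactId, String version) {
        return withVersion("pkg:maven/" + groupId + "/" + artifactId, version);
    }

    // Appends "@version" only when a non-empty version is available.
    // Percent-encoding of special characters is deliberately left out of this sketch.
    private static String withVersion(String base, String version) {
        return (version != null && !version.isEmpty()) ? base + "@" + version : base;
    }
}
```

A Ballerina organization name that happens to contain a dot would still be emitted as a `pkg:ballerina` purl here, which is the case the `org.contains(".")` heuristic gets wrong.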

In `@compiler/ballerina-lang/src/main/java/module-info.java`:
- Line 23: The module declares a new requires toml4j in module-info.java which
introduces an unnecessary, unmaintained TOML dependency; replace uses of
com.moandjiezana.toml by switching callers to the existing compiler manifest
APIs (e.g. BallerinaToml, DependencyManifest, PackageManifest) or use
PackageContext to obtain parsed/effective manifest data instead of re-parsing
with toml4j; update any code that referenced toml4j parsing to call the
appropriate API (BallerinaToml.parse / PackageContext.getManifest /
DependencyManifest accessors) and remove the requires toml4j line from
module-info.java so the module relies on the in-tree io.ballerina.toml
implementation.

In
`@compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java`:
- Line 34: Make the test class final and remove the wildcard java import: change
the class declaration SbomGeneratorTest to "final class SbomGeneratorTest" and
replace the "import java.util.*;" wildcard with explicit imports for only the
types used in this file (e.g., List, Map, Optional, etc.) so the test follows
the project's conventions and matches other tests in the directory.
- Around line 95-97: The assertions in SbomGeneratorTest use a loose
"contains(a) || contains(b)" pattern for purls produced by buildPurl; change
these to assert equality against the exact deterministic purl string (e.g.,
"pkg:ballerina/ballerina/io@1.8.0" and
"pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1") and include the actual
purls set (the variable purls) in the assertion failure message so failures show
what was produced; update the two occurrences around the checks that reference
buildPurl/purls (the io and log4j assertions) accordingly.
- Line 34: Add unit tests in SbomGeneratorTest that cover the failure/no-op
branches: (1) a test that feeds a Ballerina.toml missing required fields and
asserts extractPackageMetadata throws TomlException; (2) a test that builds a
manifest with no [package] section so generateBom yields an empty
componentsByPurl and verify that BalaWriter.addBOM produces no bom file (assert
no output file is created or that generateBom returns an expected no-op result);
(3) a test that requests an output filename like "bom-output.json" and asserts
the actual file written by generateBom/BalaWriter ends with ".cdx.json"
(normalize behavior); and (4) a duplicate-dependency test where the same
dependency appears in Ballerina.toml (Maven style) and dependencies.toml
(Ballerina style) and assert the final BOM deduplicates by purl. Locate and
exercise extractPackageMetadata, generateBom, and BalaWriter.addBOM when adding
these tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1253ccc6-5297-4c6d-8e7c-986e0d26851e

📥 Commits

Reviewing files that changed from the base of the PR and between f59068a and 0e9d5db.

📒 Files selected for processing (4)
  • compiler/ballerina-lang/src/main/java/io/ballerina/projects/BalaWriter.java
  • compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java
  • compiler/ballerina-lang/src/main/java/module-info.java
  • compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java

private static void extractMavenDependencies(Toml toml, Map<String, Map<String, Object>> componentsByPurl,
                                             List<Map<String, Object>> dependencyNodes) {
    // Look for all java/maven dependencies
    for (String javaVersion : new String[]{"java8", "java11", "java17", "java21"}) {

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm what platform.* targets are actually used in the codebase / sample projects.
rg -n --type=toml '^\[platform\.' | head -40

Repository: ballerina-platform/ballerina-lang

Length of output: 1775


🏁 Script executed:

# First, locate and examine SbomGenerator.java around the target line
find . -name "SbomGenerator.java" -type f

Repository: ballerina-platform/ballerina-lang

Length of output: 163


🏁 Script executed:

# Once located, examine the code around line 275 with broader context
cat -n compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java | sed -n '250,300p'

Repository: ballerina-platform/ballerina-lang

Length of output: 3176


🏁 Script executed:

# Check if there's any documentation or configuration about supported platforms
rg -n "java8|java11|java17|java21|supported.*platform" compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java -A 2 -B 2 | head -80

Repository: ballerina-platform/ballerina-lang

Length of output: 449


🏁 Script executed:

# Search the broader codebase to see how other code handles platform discovery
rg -n "platform\." compiler/ballerina-lang/src/main/java/io/ballerina/projects/ -A 1 -B 1 | grep -E "(java8|java11|java17|java21|keySet|toMap)" | head -30

Repository: ballerina-platform/ballerina-lang

Length of output: 254


🏁 Script executed:

# Check if java8 is referenced elsewhere in the codebase as a supported platform
rg -i "java8" compiler/ballerina-lang/src/main/java/io/ballerina/projects/ -B 2 -A 2 | head -60

Repository: ballerina-platform/ballerina-lang

Length of output: 849


🏁 Script executed:

# Look for any documentation or constants about supported platforms
rg -n "supported.*platform|platform.*version|java.*version" compiler/ballerina-lang/src/main/java/io/ballerina/ --type=java | grep -i "8\|11\|17\|21\|version" | head -30

Repository: ballerina-platform/ballerina-lang

Length of output: 2156


🏁 Script executed:

# Check what other keys exist under "platform" table in test TOML files
rg -A 10 '^\[platform\.' compiler/ballerina-lang/src/test/resources/ballerina-toml/ | grep -E '^\[|^[a-z]' | head -50

Repository: ballerina-platform/ballerina-lang

Length of output: 3508


🏁 Script executed:

# Check the git history to see if java8 was recently removed or deprecated
git log --oneline -p compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java | grep -B 5 -A 5 "java8\|java11\|java17\|java21" | head -80

Repository: ballerina-platform/ballerina-lang

Length of output: 840


🏁 Script executed:

# Look for configuration files or documentation about supported versions
fd -e properties -e xml -e yaml -e yml . compiler/ballerina-lang/src/main/java/io/ballerina/ | xargs grep -l "java\|platform\|version" 2>/dev/null | head -10

Repository: ballerina-platform/ballerina-lang

Length of output: 59


🏁 Script executed:

# Check if there are any tests for the extractMavenDependencies method
rg -n "extractMavenDependencies" compiler/ballerina-lang/src/test/ -B 2 -A 10 | head -80

Repository: ballerina-platform/ballerina-lang

Length of output: 59


🏁 Script executed:

# Look for any issue tracking or comments about platform versions
rg -n "TODO|FIXME|BUG|XXX" compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java

Repository: ballerina-platform/ballerina-lang

Length of output: 59


🏁 Script executed:

# Check what the actual platform table looks like in a real test - get more context
cat -n compiler/ballerina-lang/src/test/resources/ballerina-toml/valid-ballerina.toml

Repository: ballerina-platform/ballerina-lang

Length of output: 1012


Hardcoded Java platform list will silently drop new JDK dependencies from SBOM.

The current code only checks ["java8", "java11", "java17", "java21"]. When Ballerina adds support for a new JDK (e.g., java25), Maven dependencies under platform.java25 will be missed by SBOM generation silently, without an error or warning. Xray and other security scanners that rely on a complete dependency list would then scan an incomplete SBOM.

Discover platforms dynamically by iterating the platform table's children with a java* prefix filter. This is safe because all test TOML files use only [platform.java*] entries (no non-Java platform keys to accidentally include).

🔧 Proposed fix
-        // Look for all java/maven dependencies
-        for (String javaVersion : new String[]{"java8", "java11", "java17", "java21"}) {
-            String platformKey = "platform." + javaVersion;
-            Toml platformTable = toml.getTable(platformKey);
-            if (platformTable != null) {
+        // Look for all java/maven dependencies under any platform.java* table
+        Toml platformsTable = toml.getTable("platform");
+        if (platformsTable != null) {
+            Map<String, Object> platformMap = platformsTable.toMap();
+            for (String javaVersion : platformMap.keySet()) {
+                if (!javaVersion.startsWith("java")) {
+                    continue;
+                }
+                Toml platformTable = platformsTable.getTable(javaVersion);
+                if (platformTable == null) {
+                    continue;
+                }
                 // Get the dependency(s) from the platform table
                 List<Toml> depTables = getTomlTables(platformTable, "dependency");
                 ...
             }
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
at line 275, Replace the hardcoded loop over new
String[]{"java8","java11","java17","java21"} in SbomGenerator (the for loop that
processes javaVersion) with code that discovers Java platforms dynamically:
obtain the "platform" table from the parsed project model, iterate its child
keys, filter keys that start with "java" (e.g., "java25"), collect them (skip
null table), and iterate that list so SBOM generation (the existing logic that
consumes javaVersion) covers any newly added JDK platforms instead of silently
missing them.
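
The key-filtering step of the proposed fix can also be shown in isolation. The sketch below works on a plain `Map<String, Object>` standing in for the parsed `[platform]` table (an assumption; the PR itself reads the table through toml4j), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the discovery step, independent of any TOML library.
final class PlatformDiscoverySketch {

    private PlatformDiscoverySketch() {
    }

    // Returns the keys of the [platform] table that name a Java platform
    // (java11, java17, java25, ...), in the map's iteration order.
    static List<String> javaPlatformKeys(Map<String, Object> platformTable) {
        List<String> keys = new ArrayList<>();
        if (platformTable == null) {
            return keys; // no [platform] table at all
        }
        for (Map.Entry<String, Object> entry : platformTable.entrySet()) {
            // Keep only sub-tables whose key starts with "java"; skip scalars and other keys.
            if (entry.getKey().startsWith("java") && entry.getValue() instanceof Map) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }
}
```

The existing per-platform logic would then iterate over the returned keys instead of the fixed four-element array.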

Comment on lines +289 to +314
if (groupId != null && artifactId != null && version != null) {
    String purl = buildPurl(groupId, artifactId, version);
    Map<String, Object> comp = new LinkedHashMap<>();
    comp.put("type", "library");
    comp.put("group", groupId);
    comp.put("name", artifactId);
    comp.put("version", version);
    comp.put("purl", purl);
    componentsByPurl.putIfAbsent(purl, comp);

    Object depsObj = depMap.get("dependencies");
    if (depsObj == null) {
        depsObj = depMap.get("dependsOn");
    }

    // Extracts the dependsOn dependencies and store them in the form of purl's
    List<String> depRefs = extractDependencyRefs(depsObj, componentsByPurl);

    // Bring ref & dependsOn dependencies to a single data structure
    Map<String, Object> depNode = new LinkedHashMap<>();
    depNode.put("ref", purl);
    depNode.put("dependsOn", depRefs);
    dependencyNodes.add(depNode);
} else {
    throw new IllegalArgumentException("Maven dependency is incomplete in the TOML file. 'groupId', 'artifactId', and 'version' are required fields.");
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm whether path-only platform.javaXX.dependency entries exist in test fixtures.
rg -n --type=toml -B1 -A4 '^\[\[platform\.java[0-9]+\.dependency\]\]' | head -80
rg -n --type=toml -B1 -A4 '\bpath\s*=' compiler/ballerina-lang/src/test 2>/dev/null | head -40

Repository: ballerina-platform/ballerina-lang

Length of output: 12597


🏁 Script executed:

cat -n compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java | sed -n '280,320p'

Repository: ballerina-platform/ballerina-lang

Length of output: 2492


🏁 Script executed:

rg -A 20 'private.*buildPurl' compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java

Repository: ballerina-platform/ballerina-lang

Length of output: 950


🏁 Script executed:

rg -n 'version != null' compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java

Repository: ballerina-platform/ballerina-lang

Length of output: 332


🏁 Script executed:

rg -n 'SbomGenerator' compiler/ballerina-lang/src/test --type=java | head -20

Repository: ballerina-platform/ballerina-lang

Length of output: 539


🏁 Script executed:

fd -name '*SbomGenerator*Test*' compiler/ballerina-lang/src/test

Repository: ballerina-platform/ballerina-lang

Length of output: 308


🏁 Script executed:

rg -B5 -A10 'path.*=' compiler/ballerina-lang/src/test/resources/ballerina-toml/local-dependencies.toml

Repository: ballerina-platform/ballerina-lang

Length of output: 557


🏁 Script executed:

cat -n compiler/ballerina-lang/src/test/java/io/ballerina/projects/util/SbomGeneratorTest.java | head -100

Repository: ballerina-platform/ballerina-lang

Length of output: 5416


🏁 Script executed:

rg -B2 -A5 'path.*=' compiler/ballerina-lang/src/test/resources/ballerina-toml/ --type=toml | grep -E '(^\[|path|groupId|artifactId|version)' | head -50

Repository: ballerina-platform/ballerina-lang

Length of output: 5831


🏁 Script executed:

# Check if there are any test fixtures with ONLY path and no groupId/artifactId
rg --type=toml '^\[\[platform\.java' compiler/ballerina-lang/src/test/resources -A5 | grep -B5 'path =' | grep -v 'groupId\|artifactId\|version'

Repository: ballerina-platform/ballerina-lang

Length of output: 6809


🏁 Script executed:

cat compiler/ballerina-lang/src/test/resources/ballerina-toml/additional-props-ballerina.toml

Repository: ballerina-platform/ballerina-lang

Length of output: 739


🏁 Script executed:

cat compiler/ballerina-lang/src/test/resources/ballerina-toml/platform-without-org-name-version.toml

Repository: ballerina-platform/ballerina-lang

Length of output: 449


🏁 Script executed:

# Check if SbomGenerator test or other tests explicitly test path-only entries
rg -B5 -A5 'path.*only' compiler/ballerina-lang/src/test --type=java

Repository: ballerina-platform/ballerina-lang

Length of output: 59


Skip or gracefully handle path-only platform.javaXX.dependency entries in SBOM generation.

Throwing IllegalArgumentException when version is null breaks valid Ballerina manifests that use path instead of Maven coordinates. Test fixtures (e.g., additional-props-ballerina.toml, project_no_class_def_found/Ballerina.toml) confirm path-only entries are legitimate; the current code makes them un-packageable.

The buildPurl() method already handles a null version safely (it checks if (version != null && !version.isEmpty())), so entries with groupId and artifactId can still produce a component when version is absent, and entries without Maven coordinates (path-only) can be skipped instead of failing the build.

🔧 Proposed fix
-                    if (groupId != null && artifactId != null && version != null) {
+                    if (groupId != null && artifactId != null) {
                         String purl = buildPurl(groupId, artifactId, version);
                         ...
-                    } else {
-                        throw new IllegalArgumentException("Maven dependency is incomplete in the TOML file. 'groupId', 'artifactId', and 'version' are required fields.");
-                    }
+                    }
+                    // Skip path-based or URL-based platform entries without Maven coordinates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
around lines 289 - 314, The code currently throws IllegalArgumentException when
a Maven coord is incomplete (version null), breaking valid path-only
dependencies; update SbomGenerator so it only requires groupId and artifactId
(allowing version to be null) and do not throw for path-only entries: in the
block that currently checks if (groupId != null && artifactId != null && version
!= null) change the condition to require only groupId and artifactId, call
buildPurl(groupId, artifactId, version) which already tolerates null version,
populate componentsByPurl and dependencyNodes as before, and for entries that
lack groupId/artifactId (path-only platform.javaXX.dependency) simply
skip/continue instead of throwing; keep extractDependencyRefs, componentsByPurl,
and dependencyNodes usage intact.
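
A standalone sketch of the skip-instead-of-throw behavior described above follows. It models one `[[platform.javaXX.dependency]]` entry as a plain map (an assumption; the PR reads entries through toml4j), and the class and method names are hypothetical rather than taken from SbomGenerator.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: turns one platform dependency entry into a Maven purl,
// or into an empty result when the entry has no Maven coordinates.
final class MavenDependencySketch {

    private MavenDependencySketch() {
    }

    static Optional<String> toPurl(Map<String, Object> dep) {
        Object groupId = dep.get("groupId");
        Object artifactId = dep.get("artifactId");
        // Path-only entries carry no groupId/artifactId: skip them, do not throw.
        if (!(groupId instanceof String) || !(artifactId instanceof String)) {
            return Optional.empty();
        }
        StringBuilder purl = new StringBuilder("pkg:maven/")
                .append(groupId).append('/').append(artifactId);
        // The version is optional; append it only when present and non-empty.
        Object version = dep.get("version");
        if (version instanceof String && !((String) version).isEmpty()) {
            purl.append('@').append(version);
        }
        return Optional.of(purl.toString());
    }
}
```

A caller would add a component and a dependency node only when the result is present, so a manifest that mixes path-only and coordinate-based entries still packages successfully.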

Comment on lines +822 to +833
// Maven slash form or simple path-like: group/artifact@version
if (s.contains("/") && s.contains("@")) {
    int atPos = s.indexOf('@');
    String left = s.substring(0, atPos);
    String version = s.substring(atPos + 1);
    int slash = left.indexOf('/');
    if (slash > 0) {
        String group = left.substring(0, slash);
        String artifact = left.substring(slash + 1);
        return buildPurl(group, artifact, version);
    }
}

⚠️ Potential issue | 🟡 Minor

Dead code: this branch is unreachable.

The block at lines 807–820 already handles every s.contains("@") case (with or without /) and returns. By the time control could reach line 823, s.contains("@") is necessarily false, so the s.contains("/") && s.contains("@") test can never be true. Drop it.

🔧 Proposed fix
-        // Maven slash form or simple path-like: group/artifact@version
-        if (s.contains("/") && s.contains("@")) {
-            int atPos = s.indexOf('@');
-            String left = s.substring(0, atPos);
-            String version = s.substring(atPos + 1);
-            int slash = left.indexOf('/');
-            if (slash > 0) {
-                String group = left.substring(0, slash);
-                String artifact = left.substring(slash + 1);
-                return buildPurl(group, artifact, version);
-            }
-        }
-
         // As a last resort, return null to avoid malformed purls
         return null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@compiler/ballerina-lang/src/main/java/io/ballerina/projects/util/SbomGenerator.java`
around lines 822 - 833, The if-block in SbomGenerator that checks "if
(s.contains(\"/\") && s.contains(\"@\"))" is unreachable because earlier logic
already returns for any s containing "@"; remove this entire branch (the check
and its local variables atPos, left, version, slash, group, artifact and the
call to buildPurl) to eliminate dead code and keep the remaining parsing logic
intact in SbomGenerator.java.
