Add extractFontFamily method #705

Merged
MaximPlusov merged 4 commits into integration from font-family
Jun 9, 2026
Conversation

@LonelyMidoriya

@LonelyMidoriya LonelyMidoriya commented Jun 8, 2026

Contributor

Summary by CodeRabbit

  • Bug Fixes
    • Improved font-family detection and normalization so fonts render consistently despite subset markers, trailing style tokens, or irregular casing.
    • Now inserts word breaks in concatenated names, collapses extra whitespace, and falls back to a generic sans-serif when no family can be derived.

@LonelyMidoriya LonelyMidoriya self-assigned this Jun 8, 2026
@coderabbitai

coderabbitai Bot commented Jun 8, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5af1366c-3dd3-411d-ae9b-6cac9823df0c

📥 Commits

Reviewing files that changed from the base of the PR and between d6cc2f3 and f0efbca.

📒 Files selected for processing (1)
  • src/main/java/org/verapdf/pd/font/PDFontDescriptor.java
📝 Walkthrough

Walkthrough

Adds a new FontConstants.STYLE_SUFFIXES list and a public PDFontDescriptor.extractFontFamilyFromFontName(String) that normalizes PDF font names by trimming, removing subset markers and known style suffixes, inserting word boundaries, collapsing whitespace, and defaulting empty results to "sans-serif".

Changes

Font Family Extraction Utility

| Layer / File(s) | Summary |
| --- | --- |
| Imports and style-suffix constants (src/main/java/org/verapdf/pd/font/PDFontDescriptor.java, src/main/java/org/verapdf/tools/FontConstants.java) | Import section changes to use java.util.*; new FontConstants defines STYLE_SUFFIXES, a list of known font style/variant suffix tokens. |
| Font family extraction and normalization (src/main/java/org/verapdf/pd/font/PDFontDescriptor.java) | New public static extractFontFamilyFromFontName(String) method: trims input, strips subset wildcard markers, iteratively removes style suffixes from FontConstants.STYLE_SUFFIXES (case-insensitive, optional -), removes trailing -, inserts spaces between lower-to-upper letter boundaries, collapses whitespace, and returns "sans-serif" when the result is empty. |
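
The normalization steps summarized in the walkthrough can be sketched roughly as follows. This is not the merged code: the class name `FontFamilySketch`, the shortened suffix list, and the choice to return the fallback for `null` input are assumptions made for illustration. Only the `"\\*\\d+"` replacement and the `"sans-serif"` fallback are taken from the review text.

```java
import java.util.Arrays;
import java.util.List;

final class FontFamilySketch {

    // Hypothetical, shortened stand-in for FontConstants.STYLE_SUFFIXES.
    private static final List<String> STYLE_SUFFIXES = Arrays.asList(
            "BoldItalic", "Bold", "Italic", "Oblique", "Regular", "Light", "Medium");

    private static final String FALLBACK = "sans-serif";

    static String extractFontFamily(String fontName) {
        // Assumption: null input falls back too, as the review suggests.
        if (fontName == null) {
            return FALLBACK;
        }
        // Trim, then drop "*<digits>" markers (the replacement quoted in the review).
        String name = fontName.trim().replaceAll("\\*\\d+", "");

        // Strip one trailing style token per pass (case-insensitive, optional "-")
        // and repeat until no token matches.
        boolean stripped = true;
        while (stripped) {
            stripped = false;
            for (String suffix : STYLE_SUFFIXES) {
                int start = name.length() - suffix.length();
                if (start >= 0 && name.regionMatches(true, start, suffix, 0, suffix.length())) {
                    name = name.substring(0, start);
                    if (name.endsWith("-")) {
                        name = name.substring(0, name.length() - 1);
                    }
                    stripped = true;
                    break;
                }
            }
        }

        // Remove leftover trailing hyphens.
        name = name.replaceAll("-+$", "");
        // Insert a space at each lower-to-upper letter boundary ("NewRoman" -> "New Roman").
        name = name.replaceAll("(?<=[a-z])(?=[A-Z])", " ");
        // Collapse runs of whitespace.
        name = name.replaceAll("\\s+", " ").trim();
        return name.isEmpty() ? FALLBACK : name;
    }

    public static void main(String[] args) {
        System.out.println(extractFontFamily("TimesNewRoman-Bold")); // Times New Roman
        System.out.println(extractFontFamily("Arial-BoldItalic"));   // Arial
        System.out.println(extractFontFamily("   "));                // sans-serif
    }
}
```

Stripping in a loop rather than in a single pass means a name such as "Arial-BoldItalic" is reduced correctly regardless of the order of the suffix list.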

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A rabbit nibbles font-name threads,
Peels off suffixes from sleepy heads,
Splits the camel, trims the mess,
Returns a family clean and dressed. 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add extractFontFamily method' directly and accurately describes the main change: introduction of a new public static helper method extractFontFamilyFromFontName in PDFontDescriptor. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main/java/org/verapdf/pd/font/PDFontDescriptor.java`:
- Around line 58-63: The STYLE_SUFFIXES array in PDFontDescriptor includes the
overly broad "It" entry which causes valid family names ending in "it" (e.g.,
"Summit") to be truncated; remove the "It" string from STYLE_SUFFIXES (or
replace it with an unambiguous token like "Ital" only if needed) and update any
tests that relied on removing that short suffix; ensure the code path that trims
name suffixes (the logic that iterates STYLE_SUFFIXES when normalizing family
names) still covers all intended style markers such as "Italic", "Oblique",
"Bold", etc., without the ambiguous "It".
- Around line 132-133: extractFontFamily currently returns null for a null/empty
fontName which breaks its fallback contract; change the initial guard in
PDFontDescriptor.extractFontFamily so that when fontName is null or empty it
returns the CSS fallback "sans-serif" instead of null, ensuring callers always
receive a valid font-family string and preserving the method's fallback
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3848ea0-03f5-4c20-aff6-7c36eee64da4

📥 Commits

Reviewing files that changed from the base of the PR and between 862b1f2 and 97a1917.

📒 Files selected for processing (1)
  • src/main/java/org/verapdf/pd/font/PDFontDescriptor.java

Comment thread src/main/java/org/verapdf/pd/font/PDFontDescriptor.java Outdated
Comment thread src/main/java/org/verapdf/pd/font/PDFontDescriptor.java Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
src/main/java/org/verapdf/tools/FontConstants.java (1)

7-77: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

"It" suffix is too broad and corrupts valid family names.

The "It" entry on line 27 will incorrectly strip valid font names ending in "it", such as "Summit" → "Summ". The list already includes more specific alternatives like "Italic" and "Ita". Remove "It" to prevent false normalization.

🛡️ Proposed fix
         // ---- Slope ----
-        "Italic", "Ita", "It",
+        "Italic", "Ita",
         "Oblique", "Obl", "Caps", "CapsI",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/org/verapdf/tools/FontConstants.java` around lines 7 - 77,
STYLE_SUFFIXES contains a too-broad "It" suffix that wrongly trims valid family
names (e.g., "Summit" → "Summ"); remove the "It" entry from the STYLE_SUFFIXES
list in FontConstants to stop accidental normalization, keeping the more
specific "Italic" and "Ita" entries intact; update any unit tests or
normalization logic that assumed "It" to rely on the remaining suffixes or add a
targeted rule if genuinely needed.
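
To make the false positive concrete, here is a small sketch of case-insensitive suffix stripping run with and without the "It" token. The class `ShortSuffixPitfall` and the helper `stripSuffix` are hypothetical and much simpler than the method under review; they only illustrate why a two-letter token is risky.

```java
import java.util.Arrays;
import java.util.List;

final class ShortSuffixPitfall {

    // Hypothetical helper: removes the first matching trailing token, ignoring case.
    static String stripSuffix(String name, List<String> suffixes) {
        for (String suffix : suffixes) {
            int start = name.length() - suffix.length();
            if (start > 0 && name.regionMatches(true, start, suffix, 0, suffix.length())) {
                return name.substring(0, start);
            }
        }
        return name;
    }

    public static void main(String[] args) {
        List<String> withIt = Arrays.asList("Italic", "Ita", "It");
        List<String> withoutIt = Arrays.asList("Italic", "Ita");

        System.out.println(stripSuffix("Summit", withIt));         // Summ
        System.out.println(stripSuffix("Summit", withoutIt));      // Summit
        System.out.println(stripSuffix("ArialItalic", withoutIt)); // Arial
    }
}
```

Because matching is case-insensitive and does not require a separator, any family name that happens to end in the token's letters is affected, which is why the shorter the token, the more collisions it produces.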
src/main/java/org/verapdf/pd/font/PDFontDescriptor.java (1)

125-126: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return "sans-serif" instead of null for empty input.

Returning null for null or empty input breaks the method's fallback contract. Line 155 already returns "sans-serif" when the processed name is empty, so the initial null check should do the same for consistency. This forces unnecessary null checks at call sites.

🛡️ Proposed fix
 public static String extractFontFamilyFromFontName(String fontNameWithoutSubset) {
-    if (fontNameWithoutSubset == null || fontNameWithoutSubset.isEmpty()) return null;
+    if (fontNameWithoutSubset == null || fontNameWithoutSubset.isEmpty()) return "sans-serif";
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/org/verapdf/pd/font/PDFontDescriptor.java` around lines 125 -
126, The null/empty guard in PDFontDescriptor.extractFontFamilyFromFontName
currently returns null for a null or empty fontNameWithoutSubset which breaks
the method's fallback contract; change that guard to return "sans-serif" instead
so it matches the later empty-name fallback (and avoids forcing callers to
null-check). Update the behavior in extractFontFamilyFromFontName to return
"sans-serif" when fontNameWithoutSubset is null or empty, keeping the rest of
the name-processing logic unchanged.
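
The effect on call sites can be illustrated with a short sketch. Both helper methods below are hypothetical stand-ins that keep only the guard clause, and the CSS-building caller is an assumed usage, not code from this repository.

```java
final class FallbackContractSketch {

    // Stand-in for the guard as originally written: null or empty input yields null.
    static String nullReturning(String fontName) {
        if (fontName == null || fontName.isEmpty()) return null;
        return fontName;
    }

    // Stand-in for the guard as proposed: null or empty input yields the fallback.
    static String fallbackReturning(String fontName) {
        if (fontName == null || fontName.isEmpty()) return "sans-serif";
        return fontName;
    }

    public static void main(String[] args) {
        // A caller that forgets the null check emits the literal text "null".
        System.out.println("font-family: " + nullReturning("") + ";");     // font-family: null;
        // With the fallback guard the same caller needs no extra check.
        System.out.println("font-family: " + fallbackReturning("") + ";"); // font-family: sans-serif;
    }
}
```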
🧹 Nitpick comments (1)
src/main/java/org/verapdf/pd/font/PDFontDescriptor.java (1)

29-29: 💤 Low value

Consider explicit imports over wildcard.

Wildcard imports (java.util.*) can obscure dependencies and cause conflicts if types are added to the package in the future. Explicit imports improve clarity.

♻️ Suggested refactor
-import java.util.*;
+import java.util.Iterator;
+import java.util.Map;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/org/verapdf/pd/font/PDFontDescriptor.java` at line 29, Replace
the wildcard import at the top of PDFontDescriptor (the "import java.util.*;"
statement) with explicit imports for only the types actually used by the class
(e.g., List, Map, Set, Optional, Collection, Iterator, etc. as applicable);
update the import list to include those specific java.util classes referenced
anywhere in PDFontDescriptor to improve clarity and avoid accidental conflicts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main/java/org/verapdf/pd/font/PDFontDescriptor.java`:
- Line 129: The replacement removing "*<digits>" in
PDFontDescriptor.extractFontFamilyFromFontName is inconsistent with this
project's subset format and the fact the method is unused; update the method to
either (A) remove the name.replaceAll("\\*\\d+", "") line and rely on
caller-side subset stripping, or (B) change the normalization to handle this
repo's subset marker by detecting PDFont.SUBSET_REGEX (or applying substring(7)
when it matches) before further processing, and add a short comment explaining
why subset stripping is done here if you keep it; ensure the change is made
inside extractFontFamilyFromFontName and reference PDFont.SUBSET_REGEX or
substring(7) as appropriate.
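
Option (B) could look roughly like the sketch below. In PDF, a font subset name carries a tag of six uppercase letters followed by a plus sign (for example "ABCDEF+Arial"), which is why seven characters are skipped. The value of PDFont.SUBSET_REGEX is not shown in this thread, so the pattern used here is an assumption, and `SubsetTagSketch` is a hypothetical class name.

```java
import java.util.regex.Pattern;

final class SubsetTagSketch {

    // Assumed pattern; the real PDFont.SUBSET_REGEX may differ.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Returns the name without its "XXXXXX+" subset tag, or the input unchanged.
    static String stripSubsetTag(String fontName) {
        if (fontName != null && SUBSET_TAG.matcher(fontName).matches()) {
            return fontName.substring(7); // six tag letters plus the '+'
        }
        return fontName;
    }

    public static void main(String[] args) {
        System.out.println(stripSubsetTag("ABCDEF+Arial-Bold")); // Arial-Bold
        System.out.println(stripSubsetTag("Arial-Bold"));        // Arial-Bold
    }
}
```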


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 731047db-bc46-4e31-bf1e-b9a3d7565fc4

📥 Commits

Reviewing files that changed from the base of the PR and between 97a1917 and d6cc2f3.

📒 Files selected for processing (2)
  • src/main/java/org/verapdf/pd/font/PDFontDescriptor.java
  • src/main/java/org/verapdf/tools/FontConstants.java

Comment thread src/main/java/org/verapdf/pd/font/PDFontDescriptor.java
@MaximPlusov MaximPlusov merged commit 7cae651 into integration Jun 9, 2026
9 checks passed

2 participants