fix: decode HTML entities and add ViewBlock renderer in Document renderer#34

Closed
Copilot wants to merge 5 commits into main from copilot/fix-rendering-issues

Conversation

Copilot AI commented Jun 14, 2026

Two rendering bugs in the Lark Document block renderer: HTML entity characters (&amp;, &lt;, etc.) returned by the Lark API were displayed literally instead of decoded, and image blocks nested inside view blocks silently failed to produce <img> tags.

HTML entity decoding (Text.tsx)

The Lark API returns HTML-encoded content in TextRun.content and link.url. React renders these as plain strings, so &amp; was visible to the user instead of &.

  • decodeHTMLEntities uses DOMParser (standard browser API) to decode entities; a single-pass regex over a fixed HTML_ENTITIES map is retained as an SSR (Node.js) fallback
  • Applied to both content and link.url in TextRunComponent
// Before
href={link?.url}          // &amp; survives in the DOM attribute
{content + ''}            // "&amp;A" renders as "&amp;A"

// After
href={link?.url ? decodeHTMLEntities(link.url) : undefined}
{decodeHTMLEntities(content + '')}   // "&amp;A" renders as "&A"
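
The two decoding paths described above can be sketched as one function. This is a minimal sketch rather than the PR's exact code: the `DOMParserLike` type and the `globalThis` lookup are assumptions added so the snippet type-checks and runs outside a browser; `decodeHTMLEntities`, `HTML_ENTITIES`, and the `DOMParser` call are the names used in this PR.

```typescript
// Minimal structural type for the single DOMParser call used below.
// (Assumption: declared here so the sketch compiles without the DOM type library.)
type DOMParserLike = new () => {
    parseFromString(
        text: string,
        type: string
    ): { body: { textContent: string | null } };
};

// Fixed entity map used by the SSR (Node.js) fallback.
const HTML_ENTITIES: Record<string, string> = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#39;': "'"
};

const decodeHTMLEntities = (text: string): string => {
    const Parser = (globalThis as unknown as { DOMParser?: DOMParserLike })
        .DOMParser;

    // Browser path: let the HTML parser decode every entity it knows.
    if (Parser)
        return (
            new Parser().parseFromString(text, 'text/html').body.textContent ??
            text
        );

    // Fallback path: a single pass over the input, so "&amp;lt;" becomes
    // "&lt;" and is not decoded a second time into "<".
    return text.replace(
        /&amp;|&lt;|&gt;|&quot;|&#39;/g,
        entity => HTML_ENTITIES[entity] ?? entity
    );
};
```

Under Node.js, where `DOMParser` is undefined, `decodeHTMLEntities('&amp;A')` takes the fallback path and returns `'&A'`. The single-pass regex matters because a chain of separate `.replace()` calls could decode `&amp;lt;` twice.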

Missing <img> tags for view-wrapped images (Layout.tsx, Block.tsx)

Lark document images can be displayed in Card, Preview, or Inline mode, which wraps the image block inside a view block (BlockType.view = 33). Because blockComponentMap had no entry for BlockType.view, ChildrenRenderer silently rendered <noscript> for the view block, and the nested image block was never reached.

Fixed by adding ViewBlockComponent to Layout.tsx and registering it in blockComponentMap:

// Layout.tsx
export const ViewBlockComponent: FC<ViewBlock> = ({ children }) => (
    <ChildrenRenderer>{children}</ChildrenRenderer>
);

// Block.tsx
[BlockType.view]: ViewBlockComponent,
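
To see why a missing map entry hides nested images, here is a simplified, string-based model of the dispatch. It is a sketch under assumptions, not the project's React code: `renderBlocks`, the `Block` shape, and `image = 27` are illustrative, while `view = 33` and the `<noscript>` fallback come from the description above.

```typescript
// Illustrative block shape; the real Lark block types carry many more fields.
interface Block {
    block_type: number;
    token?: string;
    children?: Block[];
}

// view = 33 comes from the PR text; image = 27 is an assumed placeholder value.
const BlockType = { image: 27, view: 33 } as const;

type RenderChildren = (blocks?: Block[]) => string;
type BlockRenderer = (block: Block, renderChildren: RenderChildren) => string;
type ComponentMap = Partial<Record<number, BlockRenderer>>;

// Hypothetical stand-in for ChildrenRenderer: look up each block's renderer
// and emit an empty <noscript> when the block type is not registered.
const renderBlocks = (map: ComponentMap, blocks: Block[] = []): string =>
    blocks
        .map(block => {
            const render = map[block.block_type];

            return render
                ? render(block, children => renderBlocks(map, children))
                : '<noscript></noscript>';
        })
        .join('');

// Before the fix: only the image renderer is registered.
const imageOnly: ComponentMap = {
    [BlockType.image]: ({ token }: Block) => `<img src="${token}">`
};

// After the fix: the view renderer simply passes its children through.
const withView: ComponentMap = {
    ...imageOnly,
    [BlockType.view]: ({ children }: Block, renderChildren: RenderChildren) =>
        renderChildren(children)
};

const sampleDocument: Block[] = [
    {
        block_type: BlockType.view,
        children: [{ block_type: BlockType.image, token: 'abc' }]
    }
];
```

With these definitions, `renderBlocks(imageOnly, sampleDocument)` returns `'<noscript></noscript>'` because the walk stops at the unregistered view block, while `renderBlocks(withView, sampleDocument)` returns `'<img src="abc">'`.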

Copilot AI changed the title from "[WIP] Fix rendering issues with HTML entities and images" to "fix: decode HTML entities and guard image URL resolution in Document renderer" Jun 14, 2026
Copilot AI requested a review from TechQuery June 14, 2026 09:40
Comment thread src/module/Document/component/Text.tsx Outdated
Comment on lines +40 to +49
const HTML_ENTITIES: Record<string, string> = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#39;': "'"
};

export const decodeHTMLEntities = (text: string) =>
    text.replace(/&amp;|&lt;|&gt;|&quot;|&#39;/g, entity => HTML_ENTITIES[entity] ?? entity);

Member

Is there a general-purpose solution for HTML entities? Preferably a standard API.

Contributor Author

Updated in the latest commit to use DOMParser as the standard browser API:

if (typeof DOMParser !== 'undefined')
    return new DOMParser().parseFromString(text, 'text/html').body.textContent ?? text;

The SSR (Node.js) regex fallback is retained since DOMParser is not available there. Using .body.textContent (not .innerHTML) ensures all HTML tags are stripped — the result is always safe plain text.
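
The regex fallback only knows five entities, whereas the `DOMParser` path decodes everything the HTML parser recognises. If numeric character references such as `&#38;` or `&#x26;` ever show up in SSR output (this thread does not say whether the Lark API emits them), the fallback could be widened along these lines. `decodeEntitiesFallback` and `NAMED_ENTITIES` are hypothetical names, and this remains a partial decoder, not a full implementation of the HTML named-entity table.

```typescript
// Small named-entity table (assumption: the same five characters as HTML_ENTITIES).
const NAMED_ENTITIES: Record<string, string> = {
    amp: '&',
    lt: '<',
    gt: '>',
    quot: '"',
    apos: "'"
};

// Single pass over the text, handling hex (&#x26;), decimal (&#38;) and
// named (&amp;) references; anything unrecognised is left untouched.
const decodeEntitiesFallback = (text: string): string =>
    text.replace(
        /&(?:#x([0-9a-f]+)|#(\d+)|([a-z]+));/gi,
        (entity: string, hex?: string, decimal?: string, name?: string) => {
            if (name) return NAMED_ENTITIES[name] ?? entity;

            const code = hex ? parseInt(hex, 16) : parseInt(decimal ?? '', 10);

            // Keep the original text for NUL and for values beyond the Unicode range,
            // which String.fromCodePoint would reject.
            return code > 0 && code <= 0x10ffff
                ? String.fromCodePoint(code)
                : entity;
        }
    );
```

For example, `decodeEntitiesFallback('&#38;A &#x26;B &amp;C')` yields `'&A &B &C'`, and an unknown name such as `&unknown;` passes through unchanged.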

Comment on lines -71 to +88
- href={link?.url}
+ href={link?.url ? safeURL(decodeHTMLEntities(link.url)) : undefined}
  >
- {content + ''}
+ {decodeHTMLEntities(content + '')}

Member

The problem is not a URL decoding error; it is that the raw HTML entity codes are displayed in ordinary text.

Contributor Author

Removed safeURL in the latest commit. The href now just decodes entities without extra URL validation:

href={link?.url ? decodeHTMLEntities(link.url) : undefined}

Comment thread src/module/Document/model/Block.ts Outdated
Comment on lines +156 to +161
  } else if (block.block_type === BlockType.image) {
      const { image } = block as ImageBlock;

-     image.url = await resolveFileURL?.(image.token || '');
+     if (image.token) {
+         image.url = await resolveFileURL?.(image.token);
+     }

Member

The token always exists in an image block. What I asked you to investigate is why some images fail to render an img tag at all.

Contributor Author

The if (image.token) guard has been reverted. After tracing the full rendering pipeline, the actual cause of missing <img> tags is that BlockType.view (33) had no entry in blockComponentMap — so image blocks nested inside a view block (Lark's Card/Preview/Inline display modes) would silently fall through to <noscript> and never be rendered.

Fixed by adding ViewBlockComponent to Layout.tsx and registering it in blockComponentMap:

// Layout.tsx
export const ViewBlockComponent: FC<ViewBlock> = ({ children }) => (
    <ChildrenRenderer>{children}</ChildrenRenderer>
);

// Block.tsx
[BlockType.view]: ViewBlockComponent,
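
One way to catch this class of bug earlier is to walk a document's block tree and list every block type that has no registered renderer. The helper below is a hypothetical sketch: `findUnregisteredTypes` and the `TreeBlock` shape are assumptions and not part of the project.

```typescript
// Illustrative tree shape: each block may nest further blocks.
interface TreeBlock {
    block_type: number;
    children?: TreeBlock[];
}

// Depth-first walk that records each block_type absent from `registered`.
// It descends into the children of unregistered blocks as well, which is
// exactly where a hidden image block would otherwise go unnoticed.
const findUnregisteredTypes = (
    blocks: TreeBlock[],
    registered: number[],
    missing: number[] = []
): number[] => {
    for (const block of blocks) {
        const { block_type, children } = block;

        if (registered.indexOf(block_type) < 0 && missing.indexOf(block_type) < 0)
            missing.push(block_type);

        if (children) findUnregisteredTypes(children, registered, missing);
    }
    return missing;
};
```

Given a tree whose view block (type 33) wraps a registered child, calling the helper with a `registered` list that lacks 33 returns `[33]`. In the real renderer, the registered list could plausibly be derived from the keys of `blockComponentMap`, though that wiring is not shown in this thread.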

Copilot AI changed the title from "fix: decode HTML entities and guard image URL resolution in Document renderer" to "fix: decode HTML entities and add ViewBlock renderer in Document renderer" Jun 14, 2026
Copilot AI requested a review from TechQuery June 14, 2026 10:07
TechQuery added the "invalid" label (This doesn't seem right) Jun 14, 2026
TechQuery closed this Jun 14, 2026
TechQuery deleted the copilot/fix-rendering-issues branch June 14, 2026 22:24