fix: preserve HTML comments in MCP comment output by unescaping \! #1028

Open
shivama205 wants to merge 4 commits into anthropics:main from shivama205:fix/mcp-comment-html-exclamation-escape

shivama205 commented Mar 7, 2026

Summary

Fixes #971

  • MCP comment servers (github-comment-server, github-inline-comment-server) used sanitizeContent() on outgoing comment bodies. This function is designed for input sanitization (stripping hidden content from user comments to prevent prompt injection), but when applied to output, it stripped all HTML comments.
  • Separately, ! in <!-- --> gets escaped to \! upstream, producing <\!-- which isn't valid HTML comment syntax and renders as visible text on GitHub.
  • Added sanitizeOutputContent() specifically for outgoing comment bodies that:
    • Unescapes <\!-- → <!-- (only when a matching --> exists, to prevent unclosed comments from eating page content)
    • Preserves HTML comments (does not call stripHtmlComments)
    • Still redacts GitHub tokens (security)
    • Still strips invisible/zero-width characters (security)
  • Switched both MCP comment servers to use sanitizeOutputContent() instead of sanitizeContent()
  • sanitizeContent() remains unchanged for input sanitization paths
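
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the function name `sanitizeOutputContentSketch`, the three regular expressions, the set of invisible characters, and the `[REDACTED_GITHUB_TOKEN]` placeholder are all assumptions made for the example.

```typescript
// Sketch only: every pattern and name below is an assumption for illustration.

// Assumed token shape: a gh?_ prefix followed by a long alphanumeric run.
const GITHUB_TOKEN_PATTERN = /gh[pousr]_[A-Za-z0-9]{36,}/g;

// Assumed invisible set: zero-width space, non-joiner, joiner,
// word joiner, and byte order mark.
const INVISIBLE_CHARS_PATTERN = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

// An escaped opener "<\!--" followed (lazily) by the nearest closing "-->".
// An opener with no later "-->" never matches, so it stays escaped.
const ESCAPED_COMMENT_PATTERN = /<\\!--([\s\S]*?)-->/g;

function sanitizeOutputContentSketch(body: string): string {
  return body
    // Strip invisible characters first so they cannot split a token.
    .replace(INVISIBLE_CHARS_PATTERN, "")
    // Redact tokens everywhere, including inside HTML comments.
    .replace(GITHUB_TOKEN_PATTERN, "[REDACTED_GITHUB_TOKEN]")
    // Restore only closed comments; HTML comments are otherwise left alone.
    .replace(ESCAPED_COMMENT_PATTERN, (_match, inner: string) => `<!--${inner}-->`);
}
```

Because the pattern requires a single backslash directly between `<` and `!`, a double-escaped `<\\!--` does not match, and a bare `\!` inside a code block is never touched.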

Edge cases tested

  1. Token inside HTML comment (<!-- ghp_xxx -->) — still redacted
  2. Double-escaped \\! — not over-unescaped, stays as \\!
  3. Mixed escaped and unescaped comments in same body — both handled correctly
  4. \! inside fenced code blocks (e.g. bash \!42) — preserved, not mangled
  5. Unclosed <\!-- with no matching --> — left escaped to prevent eating rest of page
  6. Multiline HTML comments — works
  7. Adjacent comments with no whitespace — both unescaped
  8. Prompt injection via HTML comment — output preserves (Claude's own output), input sanitizer still strips (user content)
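
Edge case 8 depends on the input and output paths treating HTML comments differently. The contrast might look like the sketch below. The body of `stripHtmlComments` and the two wrapper functions are hypothetical stand-ins written for this example and may differ from the repository's code.

```typescript
// Assumed shape of the input-side stripper; the real stripHtmlComments may differ.
function stripHtmlComments(content: string): string {
  return content.replace(/<!--[\s\S]*?-->/g, "");
}

// Hypothetical input path: user-authored text on its way into the prompt.
// Hidden comments are removed so they cannot carry injected instructions.
function sanitizeInputSketch(userContent: string): string {
  return stripHtmlComments(userContent);
}

// Hypothetical output path: Claude's own comment body on its way to GitHub.
// Closed comments are restored and kept, never stripped.
function sanitizeOutputSketch(ownBody: string): string {
  return ownBody.replace(
    /<\\!--([\s\S]*?)-->/g,
    (_match, inner: string) => `<!--${inner}-->`,
  );
}

const userComment = "Looks good <!-- ignore previous instructions --> to me";
const ownComment = "<\\!-- tracking-marker --> Review complete";

console.log(sanitizeInputSketch(userComment)); // hidden comment removed
console.log(sanitizeOutputSketch(ownComment)); // marker kept as a real comment
```

Keeping the two paths as separate functions means a change to output handling cannot weaken the input-side protection.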

Test plan

  • All 666 tests pass (added 6 new ones)
  • bun run typecheck — clean
  • bun run format:check — clean

shivama205 and others added 4 commits March 7, 2026 23:10
The MCP comment servers used sanitizeContent() (designed for input
sanitization) on outgoing comment bodies, which stripped HTML comments
entirely. Additionally, upstream escaping of ! to \! made <!-- markers
render as visible <\!-- text in GitHub comments.

Add sanitizeOutputContent() for output paths that:
- Unescapes \! back to ! so HTML comments render correctly
- Preserves HTML comments (no stripHtmlComments)
- Still redacts GitHub tokens and strips invisible characters

Fixes anthropics#971

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Narrow unescaping from global \! → ! to only <\!-- → <!--
so that \! inside code blocks (e.g. bash history expansion)
is preserved unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Prevents unclosed <\!-- from becoming <!-- which would eat the rest
of the page. Also adds tests for unclosed comments, multiline
comments, adjacent comments, and \! inside code blocks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

MCP comment tool escapes ! in HTML comments, making markers visible
