Class IllegalSymbolCheck
java.lang.Object
com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
com.puppycrawl.tools.checkstyle.api.AbstractViolationReporter
com.puppycrawl.tools.checkstyle.api.AbstractCheck
com.puppycrawl.tools.checkstyle.checks.coding.IllegalSymbolCheck
- All Implemented Interfaces:
Configurable, Contextualizable
Checks that specified symbols (by Unicode code points or ranges) are not used in code.
By default, it blocks several common symbol ranges.
Rationale: This check helps prevent emoji symbols and special characters in code (commonly added by AI tools), enforce coding standards, or forbid specific Unicode characters.
Default ranges cover:
- U+2190–U+27BF: Arrows, Mathematical Operators, Box Drawing, Geometric Shapes, Miscellaneous Symbols, and Dingbats
- U+1F600–U+1F64F: Emoticons
- U+1F680–U+1F6FF: Transport and Map Symbols
- U+1F700–U+10FFFF: Alchemical Symbols and other pictographic symbols
For a complete list of Unicode characters and ranges, see: List of Unicode characters
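As a rough illustration of what the default ranges cover, the following Java sketch tests a code point against the four ranges listed above. The class name `DefaultRangeSketch` and the method `inDefaultRanges` are hypothetical and are not part of Checkstyle; the check's own lookup may be implemented differently.

```java
final class DefaultRangeSketch {
    // The default ranges as listed above, as {start, end} pairs (both inclusive).
    private static final int[][] DEFAULT_RANGES = {
        {0x2190, 0x27BF},
        {0x1F600, 0x1F64F},
        {0x1F680, 0x1F6FF},
        {0x1F700, 0x10FFFF},
    };

    /** Hypothetical helper: true if the code point lies in any default range. */
    static boolean inDefaultRanges(int codePoint) {
        for (int[] range : DEFAULT_RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] samples = {'A', 0x2705, 0x1F600, 0x1F650};
        for (int codePoint : samples) {
            System.out.printf("U+%04X -> %b%n", codePoint, inDefaultRanges(codePoint));
        }
    }
}
```

For instance, U+2705 falls in the first range, while U+1F650 sits in the gap between the second and third ranges and is therefore not covered by the defaults.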
-
Property
symbolCodes - Specify the symbols to check for, as Unicode code points or ranges. Format: comma-separated list of hex codes or ranges (e.g., "0x2705, 0x1F600-0x1F64F"). To allow only ASCII characters, use "0x0080-0x10FFFF". Type is java.lang.String. Default value is "0x2190-0x27BF, 0x1F600-0x1F64F, 0x1F680-0x1F6FF, 0x1F700-0x10FFFF".
- Since:
- 13.3.0
-
Nested Class Summary
Nested Classes
- private static final record IllegalSymbolCheck.CodePointRange: Represents a parsed Unicode range.
Nested classes/interfaces inherited from class com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
AbstractAutomaticBean.OutputStreamOptions
-
Field Summary
Fields
- private final Set<IllegalSymbolCheck.CodePointRange> codePointRanges: Precomputed code point ranges.
- private boolean initialized: Flag to track if ranges have been initialized.
- static final String MSG_KEY: A key pointing to the warning message text in the "messages.properties" file.
- private static final String RANGE_SEPARATOR: String Range Separator.
- singleCodePoints: Precomputed single code points.
- private String symbolCodes: Specify the symbols to check for, as Unicode code points or ranges.
-
Constructor Summary
Constructors
- IllegalSymbolCheck()
-
Method Summary
Methods
- private void checkText(text, ast): Check the text for illegal symbols.
- int[] getAcceptableTokens(): The configurable token set.
- int[] getDefaultTokens(): Returns the default tokens a check is interested in.
- int[] getRequiredTokens(): The tokens that this check must be registered for.
- boolean isCommentNodesRequired(): Whether comment nodes are required or not.
- private boolean isIllegalSymbol(int codePoint): Check if a code point is illegal based on configured ranges.
- private boolean isInSymbolCodes(int codePoint): Check if code point is in the configured symbol codes.
- private static int parseCodePoint(String str): Parse a code point from string representation.
- private void parseRange(String rangeStr): Parse and store a range.
- void setSymbolCodes(String symbols): Setter to specify the symbols to check for.
- void visitToken(DetailAST ast): Called to process a token.
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractCheck
beginTree, clearViolations, destroy, finishTree, getFileContents, getFilePath, getLine, getLineCodePoints, getLines, getTabWidth, getTokenNames, getViolations, init, leaveToken, log, log, log, setFileContents, setTabWidth, setTokens
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractViolationReporter
finishLocalSetup, getCustomMessages, getId, getMessageBundle, getSeverity, getSeverityLevel, setId, setSeverity
Methods inherited from class com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
configure, contextualize, getConfiguration, setupChild
-
Field Details
-
MSG_KEY
A key pointing to the warning message text in the "messages.properties" file.
-
RANGE_SEPARATOR
String that separates the start and end of a range.
-
singleCodePoints
Precomputed single code points. -
codePointRanges
Precomputed code point ranges. -
symbolCodes
Specify the symbols to check for, as Unicode code points or ranges. -
initialized
Flag to track if ranges have been initialized.
-
-
Constructor Details
-
IllegalSymbolCheck
public IllegalSymbolCheck()
-
-
Method Details
-
setSymbolCodes
Setter to specify the symbols to check for. Format: comma-separated list of hex codes or ranges (e.g., "0x2705, 0x1F600-0x1F64F").
- Parameters:
symbols - the symbols specification
- Throws:
IllegalArgumentException - if the format is invalid
- Since:
- 13.3.0
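To make the accepted format concrete, here is a minimal Java sketch of how a specification such as "0x2705, 0x1F600-0x1F64F" could be split into entries and validated. It is not the check's actual code: the class `SymbolCodesSketch`, the record `Range`, and the helper `parseHex` are hypothetical names, only the `0x` prefix is handled, and `-` is assumed as the range separator based on the examples above.

```java
import java.util.ArrayList;
import java.util.List;

final class SymbolCodesSketch {
    /** Hypothetical holder for an inclusive range; a single code point has start == end. */
    record Range(int start, int end) { }

    /** Splits a comma-separated specification into inclusive ranges. */
    static List<Range> parse(String spec) {
        List<Range> result = new ArrayList<>();
        for (String entry : spec.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("Empty entry in: " + spec);
            }
            // Limit -1 keeps trailing empty parts, so "0x2705-" is rejected below.
            String[] parts = trimmed.split("-", -1);
            if (parts.length > 2) {
                throw new IllegalArgumentException("Invalid range: " + trimmed);
            }
            try {
                int start = parseHex(parts[0]);
                int end = parts.length == 2 ? parseHex(parts[1]) : start;
                if (start > end) {
                    throw new IllegalArgumentException("Range start exceeds end: " + trimmed);
                }
                result.add(new Range(start, end));
            } catch (NumberFormatException ex) {
                // Report malformed hex digits as an invalid format.
                throw new IllegalArgumentException("Invalid code point in: " + trimmed, ex);
            }
        }
        return result;
    }

    /** Parses hex digits with an optional 0x prefix (assumption: no other prefixes). */
    private static int parseHex(String text) {
        String digits = text.trim();
        if (digits.regionMatches(true, 0, "0x", 0, 2)) {
            digits = digits.substring(2);
        }
        return Integer.parseInt(digits, 16);
    }
}
```

Under these assumptions, `SymbolCodesSketch.parse("0x2705, 0x1F600-0x1F64F")` yields two entries, one single code point and one range, and a malformed entry surfaces as an `IllegalArgumentException`, matching the contract documented above.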
-
getDefaultTokens
Description copied from class: AbstractCheck
Returns the default tokens a check is interested in. Only used if the configuration for a check does not define the tokens.
- Specified by:
getDefaultTokens in class AbstractCheck
- Returns:
- the default tokens
-
getAcceptableTokens
Description copied from class: AbstractCheck
The configurable token set. Used to protect Checks against malicious users who specify an unacceptable token set in the configuration file. The default implementation returns the check's default tokens.
- Specified by:
getAcceptableTokens in class AbstractCheck
- Returns:
- the token set this check is designed for.
-
getRequiredTokens
Description copied from class: AbstractCheck
The tokens that this check must be registered for.
- Specified by:
getRequiredTokens in class AbstractCheck
- Returns:
- the token set this must be registered for.
-
isCommentNodesRequired
Description copied from class: AbstractCheck
Whether comment nodes are required or not.
- Overrides:
isCommentNodesRequired in class AbstractCheck
- Returns:
- false as a default value.
-
visitToken
Description copied from class: AbstractCheck
Called to process a token.
- Overrides:
visitToken in class AbstractCheck
- Parameters:
ast - the token to process
-
checkText
Check the text for illegal symbols.
- Parameters:
text - the text to check
ast - the AST node
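Many of the default ranges lie above U+FFFF, so a scan of the text has to work on code points rather than on individual `char` values, because such symbols occupy two `char`s (a surrogate pair) in a Java string. The sketch below shows one way to do that with standard `String` and `Character` methods. The class `CheckTextSketch`, the method `findIllegalOffsets`, and the predicate parameter are hypothetical; the real `checkText` reports violations against an AST node and may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class CheckTextSketch {
    /**
     * Hypothetical helper: returns the char offsets at which a code point
     * accepted by the given predicate starts.
     */
    static List<Integer> findIllegalOffsets(String text, IntPredicate isIllegal) {
        List<Integer> offsets = new ArrayList<>();
        int index = 0;
        while (index < text.length()) {
            // Reads a full code point, combining a surrogate pair if present.
            int codePoint = text.codePointAt(index);
            if (isIllegal.test(codePoint)) {
                offsets.add(index);
            }
            // Advance by 1 or 2 chars depending on the code point's width.
            index += Character.charCount(codePoint);
        }
        return offsets;
    }
}
```

A caller could pass a predicate such as `cp -> cp >= 0x80` to mirror the ASCII-only configuration described for `symbolCodes`.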
-
isIllegalSymbol
Check if a code point is illegal based on configured ranges.
- Parameters:
codePoint - the code point to check
- Returns:
- true if the code point is illegal
-
isInSymbolCodes
Check if code point is in the configured symbol codes.
- Parameters:
codePoint - the code point to check
- Returns:
- true if in symbol codes
-
parseRange
Parse and store a range.
- Parameters:
rangeStr - the range string
- Throws:
IllegalArgumentException - if range format is invalid
-
parseCodePoint
Parse a code point from string representation. Supports formats: 0x1234, \u1234, U+1234, or plain hex.
- Parameters:
str - the string to parse
- Returns:
- the code point value
- Throws:
NumberFormatException - if the string cannot be parsed
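The four notations can be handled by stripping a two-character prefix and parsing the remainder as hexadecimal. The following Java sketch illustrates that idea under a few assumptions that the documentation does not confirm: prefixes are matched case-insensitively, surrounding whitespace is ignored, and values outside the Unicode range are rejected. The class name `ParseCodePointSketch` is hypothetical, and the private method in the check may behave differently in these edge cases.

```java
final class ParseCodePointSketch {
    // Two-character prefixes: 0x, U+, and backslash followed by u.
    private static final String[] PREFIXES = {"0x", "U+", "\\u"};

    /** Parses a code point written with one of the prefixes above or as plain hex. */
    static int parseCodePoint(String str) {
        String digits = str.trim();
        for (String prefix : PREFIXES) {
            // Assumption: prefixes are matched case-insensitively.
            if (digits.regionMatches(true, 0, prefix, 0, prefix.length())) {
                digits = digits.substring(prefix.length());
                break;
            }
        }
        // Throws NumberFormatException for empty or non-hex input.
        int value = Integer.parseInt(digits, 16);
        if (!Character.isValidCodePoint(value)) {
            // Assumption: values outside U+0000..U+10FFFF are treated as unparsable.
            throw new NumberFormatException("Not a Unicode code point: " + str);
        }
        return value;
    }
}
```

With this sketch, the inputs "0x2705", "U+2705", and "2705" all parse to the same value, as does the backslash form.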
-