Checkstyle incorrectly counts unicode characters #10183

Closed
nrmancuso opened this issue Jun 26, 2021 · 1 comment

@nrmancuso (Member):

➜  src cat Test.java
public class Test {
    String aaaaaac = "  "; // two spaces
    String aaaaaad = "💩💩"; // two emojis
}

➜  src java -jar checkstyle-8.43-all.jar -t Test.java
CLASS_DEF -> CLASS_DEF [1:0]
|--MODIFIERS -> MODIFIERS [1:0]
|   `--LITERAL_PUBLIC -> public [1:0]
|--LITERAL_CLASS -> class [1:7]
|--IDENT -> Test [1:13]
`--OBJBLOCK -> OBJBLOCK [1:18]
    |--LCURLY -> { [1:18]
    |--VARIABLE_DEF -> VARIABLE_DEF [2:4]
    |   |--MODIFIERS -> MODIFIERS [2:4]
    |   |--TYPE -> TYPE [2:4]
    |   |   `--IDENT -> String [2:4]
    |   |--IDENT -> aaaaaac [2:11]
    |   |--ASSIGN -> = [2:19]
    |   |   `--EXPR -> EXPR [2:21]
    |   |       `--STRING_LITERAL -> "  " [2:21]
    |   `--SEMI -> ; [2:25]
    |--VARIABLE_DEF -> VARIABLE_DEF [3:4]
    |   |--MODIFIERS -> MODIFIERS [3:4]
    |   |--TYPE -> TYPE [3:4]
    |   |   `--IDENT -> String [3:4]
    |   |--IDENT -> aaaaaad [3:11]
    |   |--ASSIGN -> = [3:19]
    |   |   `--EXPR -> EXPR [3:21]
    |   |       `--STRING_LITERAL -> "💩💩" [3:21]
    |   `--SEMI -> ; [3:27]                          // should be 3:25
    `--RCURLY -> } [4:0]

➜  src while read line; do echo -n "$line" | wc -m ; done< Test.java  
19
22
22
1

I would expect the following:

CLASS_DEF -> CLASS_DEF [1:0]
|--MODIFIERS -> MODIFIERS [1:0]
|   `--LITERAL_PUBLIC -> public [1:0]
|--LITERAL_CLASS -> class [1:7]
|--IDENT -> Test [1:13]
`--OBJBLOCK -> OBJBLOCK [1:18]
    |--LCURLY -> { [1:18]
    |--VARIABLE_DEF -> VARIABLE_DEF [2:4]
    |   |--MODIFIERS -> MODIFIERS [2:4]
    |   |--TYPE -> TYPE [2:4]
    |   |   `--IDENT -> String [2:4]
    |   |--IDENT -> aaaaaac [2:11]
    |   |--ASSIGN -> = [2:19]
    |   |   `--EXPR -> EXPR [2:21]
    |   |       `--STRING_LITERAL -> "  " [2:21]
    |   `--SEMI -> ; [2:25]
    |--VARIABLE_DEF -> VARIABLE_DEF [3:4]
    |   |--MODIFIERS -> MODIFIERS [3:4]
    |   |--TYPE -> TYPE [3:4]
    |   |   `--IDENT -> String [3:4]
    |   |--IDENT -> aaaaaad [3:11]
    |   |--ASSIGN -> = [3:19]
    |   |   `--EXPR -> EXPR [3:21]
    |   |       `--STRING_LITERAL -> "💩💩" [3:21]
    |   `--SEMI -> ; [3:25]
    `--RCURLY -> } [4:0]
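
The off-by-two column on the second SEMI is what you would see if columns were counted in UTF-16 code units instead of Unicode code points, since each of these emojis occupies two `char`s in a Java `String`. The following is only a minimal sketch of that difference, not Checkstyle's actual code; the class `ColumnCount` and both helper methods are hypothetical names made up for illustration.

```java
public class ColumnCount {

    // Column of the first occurrence of target, counted in UTF-16 code units.
    // String.indexOf returns a char index, so a surrogate pair counts as two.
    static int columnInCodeUnits(String line, char target) {
        return line.indexOf(target);
    }

    // Column of the first occurrence of target, counted in Unicode code points.
    // codePointCount treats each surrogate pair before the index as one unit.
    static int columnInCodePoints(String line, char target) {
        int index = line.indexOf(target);
        return line.codePointCount(0, index);
    }

    public static void main(String[] args) {
        // Line 3 of Test.java; \uD83D\uDCA9 is the surrogate pair for U+1F4A9.
        String line = "    String aaaaaad = \"\uD83D\uDCA9\uD83D\uDCA9\";";

        System.out.println(columnInCodeUnits(line, ';'));  // 27, the reported column
        System.out.println(columnInCodePoints(line, ';')); // 25, the expected column
    }
}
```

For the line with two spaces both methods return 25, which matches the `[2:25]` reported for the first SEMI.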

@nrmancuso (Member, Author):

Fixed via #10280:

➜  src cat Test.java
public class Test {
    String aaaaaac = "  "; // two spaces
    String aaaaaad = "💩💩"; // two emojis
}

➜  src java -jar ~/IdeaProjects/checkstyle/target/checkstyle-9.0-SNAPSHOT-all.jar -t Test.java
CLASS_DEF -> CLASS_DEF [1:0]
|--MODIFIERS -> MODIFIERS [1:0]
|   `--LITERAL_PUBLIC -> public [1:0]
|--LITERAL_CLASS -> class [1:7]
|--IDENT -> Test [1:13]
`--OBJBLOCK -> OBJBLOCK [1:18]
    |--LCURLY -> { [1:18]
    |--VARIABLE_DEF -> VARIABLE_DEF [2:4]
    |   |--MODIFIERS -> MODIFIERS [2:4]
    |   |--TYPE -> TYPE [2:4]
    |   |   `--IDENT -> String [2:4]
    |   |--IDENT -> aaaaaac [2:11]
    |   |--ASSIGN -> = [2:19]
    |   |   `--EXPR -> EXPR [2:21]
    |   |       `--STRING_LITERAL -> "  " [2:21]
    |   `--SEMI -> ; [2:25]
    |--VARIABLE_DEF -> VARIABLE_DEF [3:4]
    |   |--MODIFIERS -> MODIFIERS [3:4]
    |   |--TYPE -> TYPE [3:4]
    |   |   `--IDENT -> String [3:4]
    |   |--IDENT -> aaaaaad [3:11]
    |   |--ASSIGN -> = [3:19]
    |   |   `--EXPR -> EXPR [3:21]
    |   |       `--STRING_LITERAL -> "💩💩" [3:21]
    |   `--SEMI -> ; [3:25]
    `--RCURLY -> } [4:0]
