Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClassRefTypeSignature.getFullyQualifiedClassName() can return incorrect classnames in some circumstances, using "." instead of "$" #854

Open
chubert-ch opened this issue Apr 26, 2024 · 1 comment

Comments

@chubert-ch
Copy link

Reproduction steps

Best exemplified by the following testcase which (erroneously) passes on the latest branch if you pop it in a ClassTypeSignatureTest class:

    @Test
    void demonstrateIncorrectQualifiedName() {
        ClassLoader mainClassLoader = ClassTypeSignatureTest.class.getClassLoader();
        ScanResult scanResult = new ClassGraph()
                .enableClassInfo()
                .enableAnnotationInfo()
                .ignoreClassVisibility()
                .ignoreFieldVisibility()
                .ignoreMethodVisibility()
                .overrideClassLoaders(mainClassLoader)
                .verbose()
                .scan();

        String anonymousClass = "com.google.common.collect.TreeRangeMap$SubRangeMap$1";
        ClassInfo classInfo = scanResult.getClassInfo(anonymousClass);
        ClassRefTypeSignature signature = classInfo.getTypeSignatureOrTypeDescriptor().getSuperclassSignature();

        String subRangeMapAsMapClassName = signature.getFullyQualifiedClassName();
        // ClassGraph returns a "fully qualified classname" that is malformed and it can't use
        assertNull(scanResult.getClassInfo(subRangeMapAsMapClassName));
        assertEquals("com.google.common.collect.TreeRangeMap$SubRangeMap.SubRangeMapAsMap", subRangeMapAsMapClassName);
        // With the "$" separating the inner class names rather than the "." all is well
        String rightClassName = "com.google.common.collect.TreeRangeMap$SubRangeMap$SubRangeMapAsMap";
        assertNotNull(scanResult.getClassInfo(rightClassName));
    }

TreeRangeMap is already on the classgraph classpath, so this test can be run directly. The test is also available here.

Why I think this behavior is incorrect

The javadoc of getFullyQualifiedClassName indicates that it should be returning the $ version, particularly because this is the string that could become a valid path to the class file if all the "."s were replaced by "/"s.

Additionally, it's odd to be returning a class name in a format that classgraph itself can't use.

Thoughts as to why it happens and how it could be fixed

From what I can tell this happens because:

  1. The type signature in the constant pool is Lcom/tngtech/archunit/thirdparty/com/google/common/collect/TreeRangeMap<TK;TV;>.SubRangeMap.SubRangeMapAsMap;
  2. ClassRefTypeSignature.parse attempts to parse it, but passes the "SubRangeMap.SubRangeMapAsMap" fragment to TypeUtils.getIdentifierToken(), which unlike ClassRefTypeSignature.parse does NOT split on both $ and ., instead just splitting on $, meaning the entire suffix is thought to be "SubRangeMap.SubRangeMapAsMap" rather than there being two suffixes, "SubRangeMap" and "SubRangeMapAsMap".

I am, however, not confident enough in this conclusion to make a PR with this change to TypeUtils.getIdentifierToken().

Logs

I have included verbose logs as requested, but do not think they will be relevant:
ClassGraphLogs.log

@lukehutch
Copy link
Member

@chubert-ch Thanks for reporting this, and for trying to track down the source of the issue.

To be totally honest, when I wrote this, I couldn't figure out all the rules Java uses for when to use $ as a separator rather than .. So I just made my best guess, and I anticipated that one day someone would file a bug like this!

I tried to follow the Java spec pretty closely when writing the type signature parsing code:

https://docs.oracle.com/javase/specs/jvms/se22/html/jvms-4.html#jvms-4.7.9.1

Although looking back at the code now, it looks like I coded defensively, e.g. where the spec says you'll see a . separator, I wrote while (parser.peek() == '.' || parser.peek() == '$').

Unfortunately I don't have time to work on this right now, because I'm busy getting ready to launch a company, but the link above may give you a good starting-point for checking the parser code to see if it conforms to the spec. It should be very straight-forward to compare the code to the spec, because I implemented a simple recursive-descent parser that pretty much exactly parallels the spec in structure.

If the parser is compliant with the spec, then the next place to look might be the code for signature.getFullyQualifiedClassName().

Any further help you can give in getting this fixed would be much appreciated. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants