Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

External classes are included by default in upwards scan #463

Open
rgovind3 opened this issue Aug 10, 2020 · 1 comment
Open

External classes are included by default in upwards scan #463

rgovind3 opened this issue Aug 10, 2020 · 1 comment

Comments

@rgovind3
Copy link

External classes are included by default to the scan result while doing the "upward scanning" added in 8a46a5e#diff-05aabb17a3fb7ed8696198513c8926ac. I think this behavior is confusing to have on by default. Since there is a separate enableExternalClasses modifier I believe the expected behavior here should be to exclude external classes by default from the scan result and include them only when enableExternalClasses = true.

Here's a slightly modified version of Issue261Test that highlights this issue, and tests for what I think should be the expected behavior.

public class Issue261Test {
    private static class SuperSuperCls {  }

    private static class SuperCls extends SuperSuperCls {  }

    private static class Cls extends SuperCls {  }

    @Test
    public void issue261Test() {
        try (ScanResult scanResult = new ClassGraph().whitelistClasses(Cls.class.getName()).enableAllInfo()
                .scan()) {

            // --- Current check ---
            // assertThat(scanResult.getSubclasses(SuperSuperCls.class.getName()).getNames())
            //        .containsOnly(SuperCls.class.getName(), Cls.class.getName());

            // --- Expected check ---
            assertThat(scanResult.getSubclasses(SuperSuperCls.class.getName()).getNames())
                    .containsOnly(Cls.class.getName());
        }
    }

    @Test
    public void currentIssueTest() {
        // Include external classes
        try (ScanResult scanResult = new ClassGraph().whitelistClasses(Cls.class.getName()).enableAllInfo()
                .enableExternalClasses()
                .scan()) {

            assertThat(scanResult.getSubclasses(SuperSuperCls.class.getName()).getNames())
                    .containsOnly(SuperCls.class.getName(), Cls.class.getName());
        }
    }
}
@lukehutch
Copy link
Member

lukehutch commented Aug 11, 2020

Hmm... there are a surprising number of nonintuitive things about classpath scanning that do not match people's expectations, and it's a constant battle to figure out how to set the defaults so that the fewest people possible are surprised by the default behavior.

In this case, that change was introduced because of complaints that external superclasses were not being returned. People are used to the Java reflection API, so they're highly surprised if they can't get a superclass of a class, even if the class but not its superclass is in the accept list (formerly "whitelist").

There is an ability for any method of ScanResult or ClassInfo to return only non-external classes (unless external classes are enabled). For ClassInfo#getSuperclasses(), currently all superclasses of an external class are returned, because of the above-mentioned expectation: see here (/* strictAccept = */ false):

https://github.com/classgraph/classgraph/blob/latest/src/main/java/io/github/classgraph/ClassInfo.java#L1538

However for ClassInfo#getSubclasses(), currently if the superclass is itself an external class (as in your example test), all subclasses are returned, whereas if the superclass is an accepted (whitelisted) class, then only accepted subclasses are returned: see here (/* strictAccept = */ !isExternalClass):

https://github.com/classgraph/classgraph/blob/latest/src/main/java/io/github/classgraph/ClassInfo.java#L1525

You can see the logic here for how classes end up in the final list (or how classes are excluded from the final list):

https://github.com/classgraph/classgraph/blob/latest/src/main/java/io/github/classgraph/ClassInfo.java#L772

Unfortunately if you look at ScanResult and ClassInfo, you'll see that there are many different methods that filter ClassInfo objects in this manner (search for strictAccept in those classes), and an executive decision has to be made for every one of them to match people's most likely expectations. The situation you gave an example of is only one of many.

I'm hesitant to change the current behavior unless it's obvious that any change is likely to reduce the average degree of surprise as to the behavior, and break the minimum amount of existing code. But there may be a way to walk that fine line.

One possibility is to never return any non-external class, but that is the former behavior that I got lots of complaints about, because it didn't match people's expectations.

However we need to figure out some general principles that can be applied to all the methods in ClassInfo and ScanResult in a consistent way. The current general principle is basically, if it's an "upwards" search through the class hierarchy, then external classes are included; if it's a "downwards" search, then: (a) if the start class for the search is not an external class, then only external classes are excluded; otherwise (b) external classes are included.

If you can suggest a better general heuristic, I'm all ears!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants