
[Enhancement] Improve test coverage for SDK #703

Open
5 tasks
saratvemulapalli opened this issue Apr 25, 2023 · 7 comments
Labels
discuss enhancement New feature or request good first issue Good for newcomers

Comments

@saratvemulapalli
Member

saratvemulapalli commented Apr 25, 2023

Is your feature request related to a problem?

Code coverage for the SDK is trending downward and sits at 66% as of April 25, 2023 [1].

Writing up this issue to improve our coverage to at least 75%.
Feel free to pick up any of these items to improve test coverage.

Proposal

  • Add coverage to bring the SDK up to 75%, or a number the maintainers agree on.
  • Enforce test coverage minimum % for PRs

[1] https://app.codecov.io/gh/opensearch-project/opensearch-sdk-java/tree/main/src/main/java/org/opensearch/sdk
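The "enforce a minimum %" idea from the proposal could be expressed as a Codecov status check. A minimal sketch of a codecov.yml, assuming the repo uses Codecov's status checks (the 75% figure mirrors the proposal above; this is not the repo's actual configuration):

```yaml
# Hypothetical codecov.yml sketch, not the repo's actual configuration.
coverage:
  status:
    project:
      default:
        target: 75%     # fail the status check if overall coverage is below 75%
        threshold: 1%   # tolerate small run-to-run fluctuations
```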

@saratvemulapalli saratvemulapalli added enhancement New feature or request untriaged good first issue Good for newcomers labels Apr 25, 2023
@dbwiddis
Member

dbwiddis commented Apr 25, 2023

Proposal

  • Add coverage to bring the SDK up to 75%, or a number the maintainers agree on.
  • Enforce test coverage minimum % for PRs

OK, time to get on my soapbox again. :-)

First, note that I asked about this in #250 and got no responses other than my own, which became the de facto default.

I'll rehash some of my opinions from that issue. TL;DR: I think code coverage is an arbitrary metric that's easy to cheat. It can give you the illusion of well-tested software when in fact it's not.

I like code coverage reports that we can drill into periodically in order to find things we may have missed (as you have done with this issue). I dislike it as a mandatory element of PR workflow that encourages devs to find the fastest way to get a passing metric, even if it's a useless test.

I can easily "cover" a line of code by executing it. But am I properly testing the assertions? The API link above is mostly copy/paste from the plugin interfaces... do we really need to test that a default empty list is an empty list? Or any other default value? What value does that bring? Other than the default returns, the offending code appears to be toString() or equals(). Sure, we can write tests to "cover" those but I expect the toString will be perfunctory (and it's intended for user consumption, not code, so I argue it has little to no value) and it's a pain to write a test for all possible equals comparisons.
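To make the "default value" point concrete, here's a sketch of the kind of low-value test in question. The `ExtensionSettings` interface below is a hypothetical stand-in for the copied plugin interfaces, not the SDK's actual API:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a copied plugin interface; not the SDK's actual API.
interface ExtensionSettings {
    default List<String> getRoutes() {
        return Collections.emptyList();
    }
}

public class DefaultValueCoverage {
    public static void main(String[] args) {
        ExtensionSettings settings = new ExtensionSettings() {};
        // This "covers" the default method, but the check only restates the
        // implementation: asserting that a default empty list is an empty list.
        System.out.println("routes empty: " + settings.getRoutes().isEmpty());
    }
}
```

A test like this moves the coverage number without exercising any behavior a user would care about.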

The SDKTransportService code is a good side-by-side example. The sendRegisterTransportActionsRequest() method is covered, while the sendRemoteExtensionActionRequest() isn't. Both requests really should be tested via Integration Tests and maybe when those are active we can revisit our coverage. Because the "coverage" for the first one is just a mock verifying that the method was called. It provides near-zero value and exists primarily to up our coverage numbers.
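A minimal sketch of why that mock-verify style adds so little: the test below "covers" the send path while asserting only that the method was invoked. The names are hypothetical (a hand-rolled recording stub playing the role of a Mockito mock), not the actual SDKTransportService API:

```java
// Hypothetical sketch; names do not reflect the actual SDKTransportService API.
public class MockVerifyCoverage {
    interface TransportSender {
        void sendRegisterRequest(String uniqueId);
    }

    // A hand-rolled recording stub, standing in for a Mockito mock + verify().
    static class RecordingSender implements TransportSender {
        boolean called = false;

        @Override
        public void sendRegisterRequest(String uniqueId) {
            called = true; // records that the call happened; inspects nothing about it
        }
    }

    public static void main(String[] args) {
        RecordingSender sender = new RecordingSender();
        sender.sendRegisterRequest("sample-extension");
        // The line is now "covered", but neither the request contents nor any
        // response handling has actually been tested.
        System.out.println("invoked: " + sender.called);
    }
}
```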

I don't think we can even test ActionListener outside of an IT.

The handler code we could probably test more rigorously with unit tests, but it's best tested in an IT environment as well.

OK, back off my soapbox. My thoughts:

  1. Any number we pick right now is going to change (drastically) when we enable ITs. I'd prefer to wait until the IT framework is up, use the code coverage to identify holes, get it fixed up, and maintain it.
  2. I think codecov has an "auto" setting that we can use instead of a fixed number. I'm not sure of the math/theory behind it, but I suspect it's better than any arbitrary fixed value. This article suggests 80% is a good value to shoot for, but I don't think we can get there without either ITs or useless mocks that "cover" the code but don't actually test it.
  3. We already have the tools available to look at diff code coverage (a comment is posted on every PR). But diff coverage varies even more than regular coverage. Add a new method with 2 lines of code and the values will be 0%, 50%, or 100%.
    What's a good threshold?
    • All maintainers should be looking at that diff coverage when reviewing PRs. (We probably aren't doing this.) So to help prompt us to do so, I'm ok with one of the following two values for "diff code coverage":
      • 100%, with the understanding that we are allowed to merge with a failing check. This would at least prompt a maintainer to explicitly review the diff coverage and identify whether the missing coverage is acceptable (for example, an "exception that should never happen" branch). I still don't particularly like this, but I'd rather have 100% than any lower value, which would imply that covering less than everything you can/should is OK.
      • ">0%", which will fail if there is zero coverage on the diff. This will prompt at least some tests. And maybe even seeing a passing ">0" check may prompt maintainers to look at the code diff we already have available.
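0%", which">
Both diff-coverage options above map onto Codecov's patch status settings. A hedged sketch of a codecov.yml fragment, assuming the repo uses Codecov (this is illustrative, not the repo's configuration):

```yaml
# Hypothetical sketch of the two "diff coverage" options discussed above.
coverage:
  status:
    patch:
      default:
        target: 100%          # option 1: flag any uncovered line in the diff
        informational: true   # ...but allow merging with a failing check
        # target: 1%          # option 2 (">0%"): fail only on zero diff coverage
```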

@dbwiddis
Member

Whoever does SDKTransportService should do #585 at the same time.

@saratvemulapalli
Member Author

Thanks @dbwiddis for writing this up. I've added the discuss label.
💯 with you that ITs (integration tests) will cover more real use cases; unit tests will also add value.
I like the idea that PR diff coverage should be a metric we can rely on for quality, and let the maintainers decide if it's OK to merge the changes.
I'm sure we can build mechanisms around it to enforce this.

This issue is a consequence of #704: we were trying to call RegisterRestActions without validating the extension uniqueId, and with the Protobuf changes it doesn't accept null.
This issue is to find ways to improve our quality, with tests (including IT and UT).

@saratvemulapalli
Member Author

We just realized the SSL changes we merged [1] dropped our coverage by double digits.
@cwperks do you mind picking this up? As I get time, I'll chip in as well.

[1] #619

@cwperks
Member

cwperks commented May 4, 2023

Hi @saratvemulapalli, yes I definitely plan to help on this issue, and I'll try to highlight some of the challenges in getting better coverage in this area. For context, the security plugin performs a lot of the testing of SSL configuration in this test class: SSLTest.

This test class subclasses SingleClusterTest, which subclasses AbstractSecurityUnitTest. The tests in this suite also utilize the ClusterHelper to spin up in-memory clusters with different configurations.

I highlighted the areas of the security codebase above to give examples of what I think the test framework for testing changes like SSL configuration should look like. I'd like to be able to create in-memory configurations of an extension plus an OpenSearch node (with security enabled), test the different SSL configuration options, and verify that extension bootstrap happens successfully.

I briefly took a look on Tuesday to see what would be required to build this kind of framework and the process wasn't straightforward.

@cwperks
Member

cwperks commented May 4, 2023

I will look into whether it's possible to publish a test jar from the security plugin and utilize that for tests in this repo.
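If the security plugin did publish such a test jar, consuming it here would be a small Gradle change. A sketch with hypothetical coordinates and classifier (the security plugin does not currently publish this artifact):

```groovy
// Hypothetical sketch: coordinates, version property, and "test" classifier
// are illustrative; no such artifact is published today.
dependencies {
    testImplementation "org.opensearch.plugin:opensearch-security:${securityVersion}:test"
}
```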

@dbwiddis
Member

dbwiddis commented May 4, 2023

I will look into whether it's possible to publish a test jar from the security plugin and utilize that for tests in this repo.

@cwperks @saratvemulapalli given #714 we may not want to spend too much time figuring out a solution in this repo. I think we all agree the code needs to be tested, but the new location would be a better place to do that. We can configure jacoco to skip that package.
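Excluding a package from the coverage report is a small JaCoCo tweak in the Gradle build. A sketch, assuming the JaCoCo plugin is already applied (the package path is illustrative, standing in for the SSL package that would move out of this repo):

```groovy
// Hypothetical sketch: filter one package out of the jacocoTestReport class dirs.
tasks.named('jacocoTestReport') {
    classDirectories.setFrom(
        files(classDirectories.files.collect {
            fileTree(dir: it, exclude: ['org/opensearch/sdk/ssl/**'])
        })
    )
}
```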
