Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SUREFIRE-2058] Corrupted STDOUT by directly writing to native stream in forked JVM 1 with UTF-8 console logging #518

Merged
merged 3 commits into from Apr 26, 2022

Conversation

zoltanmeze
Copy link
Contributor

@zoltanmeze zoltanmeze commented Apr 11, 2022

Two cases are covered here:

  • Test run swallowing some of the outputs and finishing with the following warning:
    [WARNING] Corrupted channel by directly writing to native stream in forked JVM 1. 
    
  • Test run getting completely stuck

Both cases are caused by the same issue, output buffer overflow in AbstractStreamDecoder#readString(Memento, int), because output char buffer is not cleared if it's not completely full.

Added unit tests:

  • AbstractStreamDecoderTest#shouldReadStringOverflowOnNewLine:
    Input buffer of 1026 bytes: 1023 x 1-byte + 1 x 2-bytes encoded character + new line (LF).
    • In first iteration it decodes the first 1023 bytes from the input buffer without clearing the output buffer.
    • In second iteration decodeString returns an overflow, but it still force reads the 2-byte encoded character from input to output. readString ends up with 1 remaining byte (LF) on input buffer causing channel to become corrupted.
  • AbstractStreamDecoderTest#shouldReadStringOverflowOn4BytesEncodedSymbol:
    Input buffer of 1027 bytes: 1023 x 1 byte + 1 x 4-bytes encoded characters.
    • In first iteration it decodes the first 1023 bytes from the input buffer without clearing the output buffer.
    • In second iteration decodeString returns an overflow, without reading anything from input buffer. This is causing an infinite loop in readString, it never exits.

Tested locally with some private projects. In one of the project 1k corrupted channel messages on M6, 0 wtih these changes. Also tried just just printing out a ton of messages generated using RandomStringUtils#random.

Following this checklist to help us incorporate your
contribution quickly and easily:

  • Make sure there is a JIRA issue filed
    for the change (usually before you start working on it). Trivial changes like typos do not
    require a JIRA issue. Your pull request should address just this issue, without
    pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [SUREFIRE-XXX] - Fixes bug in ApproximateQuantiles,
    where you replace SUREFIRE-XXX with the appropriate JIRA issue. Best practice
    is to use the JIRA issue title in the pull request title and in the first line of the
    commit message.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean install to make sure basic checks pass. A more thorough check will
    be performed on your pull request automatically.
  • You have run the integration tests successfully (mvn -Prun-its clean install).

If your pull request is about ~20 lines of code you don't need to sign an
Individual Contributor License Agreement if you are unsure
please ask on the developers list.

To make clear that you license your contribution under
the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 11, 2022

Hi @zoltanmeze , we fixed buffer overflow in 8e30194
This is different buffer overflow?
The changes in the algorithm must be reviewed very carefully. Let me check it.

@Tibor17 Tibor17 self-requested a review April 11, 2022 20:14
@zoltanmeze
Copy link
Contributor Author

I believe it it a different issue, not throwing any BufferOverflowException.

CharsetDecoder#decode in decodeString call (here) results in overflow when there is not enough space left in the output char buffer. And output buffer is not cleared because the !output.hasRemaining() condition. Clearing after each chunk ensures underflow in decodeString, so it has always enough space to write decoded string into the the output buffer.

You can verify this with the two included unit tests. Test method naming is probably not the best, in the first it doesn't needs to be a new line character after a 2-bytes encoded character, it can be anything else.

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 12, 2022

@zoltanmeze
I have used only your two unit tests on master without the fix. I could reproduce the issue. Thx for the fix.

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 12, 2022

@zoltanmeze
I wrote this code but I would like to remove the loop. I have already done it and I would like to make anew PR after yours. I will invite you there. Do you want to add something into this PR?

@zoltanmeze
Copy link
Contributor Author

@zoltanmeze I wrote this code but I would like to remove the loop. I have already done it and I would like to make anew PR after yours. I will invite you there. Do you want to add something into this PR?

@Tibor17 No, everything good.

That do-while looked suspicious to me, I don't think that condition is ever true. But didn't want to touch it.

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 12, 2022

StringBuilder s = new StringBuilder( 1025 ) extra 1000 nano seconds are not visible by humans. StringBuilder reallocates the internal array. The other devs may think that 1025 is some business value in this object but we know that it is not. We can call the default constructor.

Channel channel = new Channel( s.toString().getBytes( UTF_8 ), s.length() );

Mock thread = new Mock( channel, new MockForkNodeArguments(),
Collections.<Segment, ForkedProcessEventType>emptyMap() );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here can be static import of emptyMap() without Generics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Other test methods are also using explicit type arguments, so just used the same way.
For consistency replaced in all places in this class, see 0b1c564.

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 13, 2022

@zoltanmeze
I would like to continue with some changes regarding the same class and subclasses in #519 , pls have a look.

@zoltanmeze zoltanmeze mentioned this pull request Apr 13, 2022
8 tasks
@walterFor
Copy link

@Tibor17
Actually runMode in 5df49 can be set to null in line 119 TestOutputReportEntry.java , which will cause another BufferOverflowException. Attach stacktrace for my case:
java.nio.BufferOverflowException at java.nio.Buffer.nextPutIndex(Buffer.java:547) at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:172) at org.apache.maven.surefire.api.stream.AbstractStreamEncoder.encodeString(AbstractStreamEncoder.java:127) at org.apache.maven.surefire.api.stream.AbstractStreamEncoder.encodeStringData(AbstractStreamEncoder.java:171) at org.apache.maven.surefire.api.stream.AbstractStreamEncoder.encode(AbstractStreamEncoder.java:157) at org.apache.maven.surefire.booter.spi.EventChannelEncoder.encodeMessage(EventChannelEncoder.java:398) at org.apache.maven.surefire.booter.spi.EventChannelEncoder.setOutErr(EventChannelEncoder.java:188) at org.apache.maven.surefire.booter.spi.EventChannelEncoder.testOutput(EventChannelEncoder.java:183) at org.apache.maven.surefire.api.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.java:113) at org.apache.maven.surefire.api.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.java:44) at org.apache.maven.surefire.common.junit4.JUnit4RunListener.writeTestOutput(JUnit4RunListener.java:235) at org.apache.maven.surefire.api.report.ConsoleOutputCapture$ForwardingPrintStream.println(ConsoleOutputCapture.java:144)

@zoltanmeze
Copy link
Contributor Author

@walterFor Is that somehow related to this issue? It doesn't look related. Can you create a ticket with reproducible steps?

@zoltanmeze zoltanmeze requested a review from Tibor17 April 14, 2022 19:45
@Tibor17
Copy link
Contributor

Tibor17 commented Apr 14, 2022

@zoltanmeze
@walterFor
yes, that's true, the code

        if ( runMode != null )
        {
            // one byte of length + one delimiter character ':' + <string> + one delimiter character ':'
            lengthOfMetadata += 1 + 1 + runMode.geRunmode().length() + 1;
        }

should become

        // one byte of length + one delimiter character ':' + <string> + one delimiter character ':'
        lengthOfMetadata += 1 + 1 + ( runMode == null ? 0 : runMode.getRunmodeBinary().length ) + 1;

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 23, 2022

Hi @zoltanmeze
Would you have time to join us again and continue?

Copy link
Contributor

@Tibor17 Tibor17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zoltanmeze
I have qualified the problem with runMode as a bug, see the comment. The outcome would be the same buffer overflow and the same meaning for this Jira.

…ing output buffer

- shouldReadStringOverflowOnNewLine - ends up with 1 single byte (LF) remaining on input buffer
- shouldReadStringOverflowOn4BytesEncodedSymbol - causing an infinite loop with 4 bytes left on input buffer
Overflow can happen even when output buffer has still some remaining space left
@zoltanmeze
Copy link
Contributor Author

zoltanmeze commented Apr 25, 2022

@zoltanmeze I have qualified the problem with runMode as a bug, see the comment. The outcome would be the same buffer overflow and the same meaning for this Jira.

@Tibor17

Done. See 511b5c4. Also rebased the branch onto latest master.

I still think that BufferOverflowException on null runMode should be a separate issue/PR, it's not really related to SUREFIRE-2058. It's a bug in StreamEncoder, but 2058 is a StreamDecoder issue with no exception thrown. So it should be probably linked to SUREFIRE-2056 instead. If you change you mind I can pull that last commit into separate PR.

@zoltanmeze zoltanmeze requested a review from Tibor17 April 25, 2022 08:17
@Tibor17
Copy link
Contributor

Tibor17 commented Apr 25, 2022

@zoltanmeze
Feel free to extract the last commit to a separate PR and Jira.
The Jira SUREFIRE-2056 is about another data element - testRunId.

@zoltanmeze zoltanmeze changed the title [SUREFIRE-2058] - Corrupted STDOUT by directly writing to native stream in forked JVM 1 with UTF-8 console logging [SUREFIRE-2058] Corrupted STDOUT by directly writing to native stream in forked JVM 1 with UTF-8 console logging Apr 25, 2022
@zoltanmeze
Copy link
Contributor Author

@Tibor17 Done, created a separate jira ticket and PR for BufferOverflowException on null runMode, see #529

@Tibor17
Copy link
Contributor

Tibor17 commented Apr 25, 2022

Let's run the CI build...

@Tibor17 Tibor17 merged commit 754dd9c into apache:master Apr 26, 2022
@Tibor17
Copy link
Contributor

Tibor17 commented Apr 26, 2022

@zoltanmeze
Thx for contributing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants