Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][misc] Upgrade Netty to 4.1.86.Final and Netty Tcnative to 2.0.54.Final #18599

Merged
merged 3 commits into from
Dec 13, 2022

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Nov 24, 2022

Motivation

Modifications

  • Upgrade Netty to 4.1.86.Final
  • Update Netty Tcnative to 2.0.54.Final (referenced in netty-bom)
  • Upgrade netty-iouring.version to 0.0.16.Final which is compatible with 4.1.85.Final/4.1.86.Final
  • Remove extra ctx.read() which is no more required when autoread is switched on.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: lhotari#103

Copy link
Member

@nodece nodece left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good!

@lhotari lhotari force-pushed the lh-upgrade-netty-4.1.85.Final branch from f0d6153 to 8084010 Compare November 24, 2022 10:16
@codecov-commenter
Copy link

codecov-commenter commented Nov 24, 2022

Codecov Report

Merging #18599 (2223362) into master (3180a4a) will decrease coverage by 18.04%.
The diff coverage is 33.33%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #18599       +/-   ##
=============================================
- Coverage     46.17%   28.13%   -18.05%     
+ Complexity    10359     5391     -4968     
=============================================
  Files           703      623       -80     
  Lines         68845    59108     -9737     
  Branches       7382     6147     -1235     
=============================================
- Hits          31788    16628    -15160     
- Misses        33448    40131     +6683     
+ Partials       3609     2349     -1260     
Flag Coverage Δ
unittests 28.13% <33.33%> (-18.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../main/java/org/apache/pulsar/PulsarStandalone.java 0.00% <0.00%> (ø)
...broker/intercept/ManagedLedgerInterceptorImpl.java 0.00% <0.00%> (ø)
...va/org/apache/pulsar/broker/service/ServerCnx.java 32.40% <ø> (-15.04%) ⬇️
...va/org/apache/pulsar/client/impl/ConsumerImpl.java 15.09% <0.00%> (-0.04%) ⬇️
...he/pulsar/client/impl/MultiTopicsConsumerImpl.java 22.78% <0.00%> (-0.09%) ⬇️
...nsaction/pendingack/impl/PendingAckHandleImpl.java 6.19% <50.00%> (-45.65%) ⬇️
.../pulsar/broker/service/AbstractBaseDispatcher.java 36.57% <66.66%> (-20.90%) ⬇️
...sistent/PersistentDispatcherMultipleConsumers.java 40.97% <100.00%> (-10.76%) ⬇️
...n/java/org/apache/pulsar/client/api/RawReader.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pulsar/broker/admin/v1/Brokers.java 0.00% <0.00%> (-100.00%) ⬇️
... and 245 more

@lhotari lhotari marked this pull request as draft November 24, 2022 10:55
@lhotari
Copy link
Member Author

lhotari commented Nov 24, 2022

There's an odd StackoverflowError in the build now. Attempts to fix are PRs #18604 and #18602

@lhotari lhotari marked this pull request as ready for review November 24, 2022 18:44
@lhotari lhotari closed this Nov 24, 2022
@lhotari lhotari reopened this Nov 24, 2022
@lhotari
Copy link
Member Author

lhotari commented Nov 25, 2022

There are odd test failures with newer Netty version.

I was investigating the issue with org.apache.pulsar.client.api.PatternMultiTopicsConsumerTest . In some cases, messages are not delivered to the consumer. This happens in the testNotifications test case. Since the topic is subscribed instantly after it's created, this could be some sort of race condition. Perhaps the latency is reduced with the newer Netty version slightly, just enough to trigger the race condition?

@lhotari lhotari force-pushed the lh-upgrade-netty-4.1.85.Final branch from db271d0 to bc69b20 Compare November 25, 2022 18:12
@lhotari
Copy link
Member Author

lhotari commented Nov 25, 2022

Simple instructions to reproduce the issue:

git clone https://github.com/apache/pulsar
cd pulsar
# checkout PR 18599 in detached mode
git fetch origin pull/18599/head && git checkout FETCH_HEAD
mvn -Pcore-modules,-main -T 1C clean install -DskipTests -Dspotbugs.skip=true
# run test in a loop until it fails. The problem usually reproduces in less than 5 runs (usually 1st or 2nd round).
while mvn -DredirectTestOutputToFile=false -DtestRetryCount=0 test -pl pulsar-broker -Dtest=PatternMultiTopicsConsumerTest#testNotifications; do echo "Rerunning..."; done

Now downgrade Netty version to 4.1.84.Final

perl -i -p -e 's/4.1.85.Final/4.1.84.Final/g' $(git grep -l 4.1.85.Final)
# run test in a loop, it doesn't fail with 4.1.84.Final
while mvn -DredirectTestOutputToFile=false -DtestRetryCount=0 test -pl pulsar-broker -Dtest=PatternMultiTopicsConsumerTest#testNotifications; do echo "Rerunning..."; done

This proves that changes in netty/netty#12888 are the source of the regression in Pulsar:

# revert any changes (like downgrade to 4.1.84.Final)
git reset --hard HEAD
pushd
# patch netty-common-4.1.85.Final.jar file
cd ~/.m2/repository/io/netty/netty-common/4.1.85.Final
cp -p netty-common-4.1.85.Final.jar netty-common-4.1.85.Final.jar.orig
unzip ../4.1.84.Final/netty-common-4.1.84.Final.jar 'io/netty/util/HashedWheelTimer*'
zip netty-common-4.1.85.Final.jar io/netty/util/HashedWheelTimer*.class
rm -rf io/

# go back to the checkout directory
popd
# run test in a loop, it doesn't fail with 4.1.85.Final patched with HashedWheelTimer class from 4.1.84.Final
while mvn -DredirectTestOutputToFile=false -DtestRetryCount=0 test -pl pulsar-broker -Dtest=PatternMultiTopicsConsumerTest#testNotifications; do echo "Rerunning..."; done

Remember to restore the patched jar file:

cd ~/.m2/repository/io/netty/netty-common/4.1.85.Final
mv netty-common-4.1.85.Final.jar.orig netty-common-4.1.85.Final.jar

@lhotari
Copy link
Member Author

lhotari commented Nov 25, 2022

Although the Netty upgrade makes the problem appear in Pulsar, it might be a bug in Pulsar that causes the problem. @andrasbeni Do you have any ideas why PatternMultiTopicsConsumerTest fails with Netty 4.1.85? (there's instructions to reproduce in #18599 (comment))

@lhotari
Copy link
Member Author

lhotari commented Nov 27, 2022

Although the Netty upgrade makes the problem appear in Pulsar, it might be a bug in Pulsar that causes the problem. @andrasbeni Do you have any ideas why PatternMultiTopicsConsumerTest fails with Netty 4.1.85? (there's instructions to reproduce in #18599 (comment))

@andrasbeni you can ignore the previous comment. This has been confirmed to be caused by a bug in Netty. I'm expecting that the fix will be available in the next Netty 4.1.x release 4.1.86 .

@HQebupt
Copy link
Contributor

HQebupt commented Dec 4, 2022

Netty 4.1.85.Final released
A potential memory leak bug has been fixed in the pooled allocator (#12897)

Nice. I encountered the memory leak problem in Pulsar.
image

@lhotari lhotari changed the title [improve][misc] Upgrade Netty to 4.1.85.Final and Netty Tcnative to 2.0.54.Final [improve][misc] Upgrade Netty to 4.1.86.Final and Netty Tcnative to 2.0.54.Final Dec 12, 2022
@lhotari lhotari force-pushed the lh-upgrade-netty-4.1.85.Final branch from bc69b20 to 7839fdf Compare December 12, 2022 17:29
@lhotari lhotari force-pushed the lh-upgrade-netty-4.1.85.Final branch from 7839fdf to 2223362 Compare December 12, 2022 17:47
@lhotari lhotari merged commit 1cd20c5 into apache:master Dec 13, 2022
lhotari added a commit that referenced this pull request Dec 13, 2022
lhotari added a commit that referenced this pull request Dec 13, 2022
lhotari added a commit to datastax/pulsar that referenced this pull request Dec 13, 2022
….0.54.Final (apache#18599)

(cherry picked from commit 1cd20c5)
(cherry picked from commit 3cace2f)
@Technoboy- Technoboy- added this to the 2.12.0 milestone Dec 15, 2022
Demogorgon314 pushed a commit to Demogorgon314/pulsar that referenced this pull request Dec 26, 2022
Demogorgon314 pushed a commit to Demogorgon314/pulsar that referenced this pull request Dec 29, 2022
lifepuzzlefun pushed a commit to lifepuzzlefun/pulsar that referenced this pull request Jan 10, 2023
hangc0276 pushed a commit to apache/bookkeeper that referenced this pull request Jan 11, 2023
### Motivation

- see release notes for 4.1.85.Final: https://netty.io/news/2022/11/09/4-1-85-Final.html
- see release notes for 4.1.86.Final: https://netty.io/news/2022/12/12/4-1-86-Final.html

It is unknown whether there are fixes that specifically target Netty features used in Bookkeeper. One of the motivation of this PR is to upgrade Netty in Bookkeeper when the Netty version gets upgraded in Pulsar. The PR for Netty 4.1.86.Final upgrade in Pulsar is apache/pulsar#18599


### Changes

Upgrade Netty version to 4.1.86.Final
yaalsn pushed a commit to yaalsn/bookkeeper that referenced this pull request Jan 30, 2023
### Motivation

- see release notes for 4.1.85.Final: https://netty.io/news/2022/11/09/4-1-85-Final.html
- see release notes for 4.1.86.Final: https://netty.io/news/2022/12/12/4-1-86-Final.html

It is unknown whether there are fixes that specifically target Netty features used in Bookkeeper. One of the motivation of this PR is to upgrade Netty in Bookkeeper when the Netty version gets upgraded in Pulsar. The PR for Netty 4.1.86.Final upgrade in Pulsar is apache/pulsar#18599


### Changes

Upgrade Netty version to 4.1.86.Final
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants