Bug
I encountered an issue with a particular behavior in a Hazelcast cluster setup. After intentionally shutting down one of the cluster nodes, I noticed the following logs on the remaining nodes:
WARN […] com.hazelcast.internal.server.tcp.TcpServerConnectionErrorHandler - Removing connection to endpoint […] Cause => java.io.IOException {Connection refused to address /[…]}, Error-Count: 5
WARN […] com.hazelcast.internal.cluster.impl.MembershipManager - […] Member […] is suspected to be dead for reason: No connection
The remaining nodes correctly detect that the shut-down node is gone. However, despite the intentional shutdown, the other nodes still attempt to reconnect to it.
I tested Hazelcast versions 5.3.2, 5.0.2, and 4.0.3, and all of them produce the same logs.
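While investigating, I noticed that Hazelcast exposes documented connection-monitor properties that appear related to the retry behavior seen in the first log line (the Error-Count value). This is only an experiment, not a fix; the property names are from the Hazelcast system-properties documentation, and the values below are illustrative, not recommendations:

```java
import com.hazelcast.config.Config;

public class ConnectionMonitorConfigExample {
    public static void main(String[] args) {
        Config config = new Config();
        // Documented Hazelcast system properties controlling how failed
        // connections are counted and how often they are checked.
        // Values here are illustrative only; defaults may differ per version.
        config.setProperty("hazelcast.connection.monitor.interval", "100");
        config.setProperty("hazelcast.connection.monitor.max.faults", "3");
        // Pass this config to Hazelcast.newHazelcastInstance(config) as usual.
    }
}
```

Tuning these changed how quickly the WARN lines appeared, but did not stop the reconnection attempts.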
Expected behavior
I expect that when a node is intentionally shut down, the other nodes do not attempt to reconnect to it.
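For what it's worth, a membership listener confirms that the surviving node does observe the shutdown as a clean member removal, even while the reconnection attempts continue. This is a sketch using the standard `com.hazelcast.cluster.MembershipListener` API, separate from the failing test below:

```java
import com.hazelcast.cluster.MembershipEvent;
import com.hazelcast.cluster.MembershipListener;
import com.hazelcast.core.HazelcastInstance;

public class MemberRemovalWatcher {
    // Registers a listener on the surviving instance; memberRemoved fires
    // when the other node leaves the cluster, gracefully or not.
    static void watch(HazelcastInstance survivor) {
        survivor.getCluster().addMembershipListener(new MembershipListener() {
            @Override
            public void memberAdded(MembershipEvent event) {
                System.out.println("member added: " + event.getMember());
            }

            @Override
            public void memberRemoved(MembershipEvent event) {
                System.out.println("member removed: " + event.getMember());
            }
        });
    }
}
```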
How to reproduce
I created a test to reproduce the error.
package com.nm.test.hazelcast.shutdown;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;
import com.hazelcast.config.Config;
import com.hazelcast.config.TcpIpConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.internal.cluster.impl.MembershipManager;
import com.hazelcast.spi.properties.ClusterProperty;
import com.nm.test.hazelcast.utils.StoreLoggedEventsAppender;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.apache.log4j.Logger;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
// Test handling intentional node shutdown.
public class TestShutDown6 {
private List<HazelcastInstance> instances;
private final String[] targetLoggers = { "com.hazelcast.internal.server.tcp.TcpServerConnectionErrorHandler", MembershipManager.class.getName() };
private StoreLoggedEventsAppender tcpServerConnectionErrorHandlerAppender;
private StoreLoggedEventsAppender membershipManagerAppender;
private Logger tcpServerConnectionErrorHandlerLogger;
private Logger membershipManagerLogger;
@BeforeEach
public void setUp() throws Exception {
// create individual appenders for target loggers
tcpServerConnectionErrorHandlerAppender = new StoreLoggedEventsAppender();
membershipManagerAppender = new StoreLoggedEventsAppender();
// add appenders to the respective loggers
tcpServerConnectionErrorHandlerLogger = Logger.getLogger(targetLoggers[0]);
tcpServerConnectionErrorHandlerLogger.setAdditivity(true);
tcpServerConnectionErrorHandlerLogger.addAppender(tcpServerConnectionErrorHandlerAppender);
membershipManagerLogger = Logger.getLogger(targetLoggers[1]);
membershipManagerLogger.setAdditivity(true);
membershipManagerLogger.addAppender(membershipManagerAppender);
instances = new ArrayList<>();
}
@AfterEach
public void tearDown() {
// remove the appenders
tcpServerConnectionErrorHandlerLogger.removeAppender(tcpServerConnectionErrorHandlerAppender);
membershipManagerLogger.removeAppender(membershipManagerAppender);
// shutdown all Hazelcast instances
for (HazelcastInstance instance : instances) {
instance.getLifecycleService().terminate();
}
}
@Test
public void testNoReconnectAfterNode1Shutdown() throws InterruptedException {
// create config and start a 2 node cluster
configAndStart2NodeCluster();
HazelcastInstance hcInstance1 = instances.get(0);
HazelcastInstance hcInstance2 = instances.get(1);
// shut down hcInstance1 intentionally
hcInstance1.getLifecycleService().shutdown();
// wait for some time to ensure any reconnection attempts would have happened
TimeUnit.SECONDS.sleep(10);
// ensure only one member remains in the cluster after shutting down hcInstance1
assertEquals(1, hcInstance2.getCluster().getMembers().size());
// ensure no reconnection attempts are made by hcInstance2:
// assert that there were no WARN messages from the target loggers
assertTrue(tcpServerConnectionErrorHandlerAppender.getWarnLogs().isEmpty());
assertTrue(membershipManagerAppender.getWarnLogs().isEmpty());
}
private void configAndStart2NodeCluster() {
// create config
Config config = new Config();
// route Hazelcast logging through Log4j 1.x, matching the org.apache.log4j appender API used in setUp()
config.setProperty(ClusterProperty.LOGGING_TYPE.getName(), "log4j");
// enable TCP-IP config
TcpIpConfig tcpIpConfig = config.getNetworkConfig().getJoin().getTcpIpConfig();
tcpIpConfig.setEnabled(true);
tcpIpConfig.setMembers(List.of("127.0.0.1"));
HazelcastInstance hcInstance1 = Hazelcast.newHazelcastInstance(config);
HazelcastInstance hcInstance2 = Hazelcast.newHazelcastInstance(config);
instances.add(hcInstance1);
instances.add(hcInstance2);
}
}
package com.nm.test.hazelcast.utils;
import java.util.ArrayList;
import java.util.List;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.Level;
import org.apache.log4j.spi.LoggingEvent;
public class StoreLoggedEventsAppender extends AppenderSkeleton {
private List<String> debugLogs = new ArrayList<>();
private List<String> infoLogs = new ArrayList<>();
private List<String> warnLogs = new ArrayList<>();
private List<String> errorLogs = new ArrayList<>();
@Override
protected void append(LoggingEvent loggingEvent) {
if (Level.DEBUG.equals(loggingEvent.getLevel())) {
debugLogs.add(loggingEvent.getRenderedMessage());
} else if (Level.INFO.equals(loggingEvent.getLevel())) {
infoLogs.add(loggingEvent.getRenderedMessage());
} else if (Level.WARN.equals(loggingEvent.getLevel())) {
warnLogs.add(loggingEvent.getRenderedMessage());
} else if (Level.ERROR.equals(loggingEvent.getLevel())) {
errorLogs.add(loggingEvent.getRenderedMessage());
}
}
@Override
public void close() {
}
@Override
public boolean requiresLayout() {
return false;
}
public List<String> getDebugLogs() {
return debugLogs;
}
public List<String> getInfoLogs() {
return infoLogs;
}
public List<String> getWarnLogs() {
return warnLogs;
}
public List<String> getErrorLogs() {
return errorLogs;
}
}