After intentional shutdown of a node of the cluster, the other nodes are still attempting to reconnect to the shutdown node #26319

federicasuriano opened this issue Apr 24, 2024 · 0 comments

Bug
I encountered this issue with a particular behavior in a Hazelcast cluster setup. After intentionally shutting down one of the cluster nodes, I noticed that the remaining nodes logged the following warnings:

WARN […] com.hazelcast.internal.server.tcp.TcpServerConnectionErrorHandler - Removing connection to endpoint […] Cause => java.io.IOException {Connection refused to address /[…]}, Error-Count: 5
WARN […] com.hazelcast.internal.cluster.impl.MembershipManager - […] Member […] is suspected to be dead for reason: No connection

The remaining nodes correctly detect that the shut-down node is gone. However, despite the shutdown being intentional, they keep attempting to reconnect to it.
I tested this on Hazelcast versions 5.3.2, 5.0.2, and 4.0.3, and all of them produce the same logs.
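The `Error-Count: 5` in the first warning suggests a per-endpoint failure counter that eventually flags the connection as faulty. A minimal sketch of such a counter, purely illustrative (this class and its threshold are assumptions, not Hazelcast's actual implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: counts consecutive connection failures per endpoint and
// flags the endpoint as suspect once a threshold is reached.
public class ConnectionErrorCounter {

    private final int maxFaults;
    private final Map<String, AtomicInteger> errorCounts = new ConcurrentHashMap<>();

    public ConnectionErrorCounter(int maxFaults) {
        this.maxFaults = maxFaults;
    }

    /** Records a failure; returns true once the endpoint should be suspected. */
    public boolean onError(String endpoint) {
        int count = errorCounts
                .computeIfAbsent(endpoint, e -> new AtomicInteger())
                .incrementAndGet();
        return count >= maxFaults;
    }

    /** A successful connection resets the counter for that endpoint. */
    public void onSuccess(String endpoint) {
        errorCounts.remove(endpoint);
    }

    public static void main(String[] args) {
        ConnectionErrorCounter counter = new ConnectionErrorCounter(5);
        String endpoint = "127.0.0.1:5701";
        for (int i = 1; i <= 4; i++) {
            System.out.println(counter.onError(endpoint)); // false for the first 4 errors
        }
        System.out.println(counter.onError(endpoint)); // true at the 5th error
    }
}
```

The open question in this issue is why such a counter keeps being exercised at all after a *graceful* shutdown, rather than the endpoint being dropped from the retry set.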

Expected behavior
I expect that when a node is intentionally shut down, the other nodes do not attempt to reconnect to it.

How to reproduce
I created a test that reproduces the behavior.

package com.nm.test.hazelcast.shutdown;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import com.hazelcast.config.Config;
import com.hazelcast.config.TcpIpConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.internal.cluster.impl.MembershipManager;
import com.hazelcast.spi.properties.ClusterProperty;
import com.nm.test.hazelcast.utils.StoreLoggedEventsAppender;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.apache.log4j.Logger;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;

// Test handling intentional node shutdown.
public class TestShutDown6 {

	private List<HazelcastInstance> instances;

	private final String[] targetLoggers = { "com.hazelcast.internal.server.tcp.TcpServerConnectionErrorHandler", MembershipManager.class.getName() };

	private StoreLoggedEventsAppender tcpServerConnectionErrorHandlerAppender;
	private StoreLoggedEventsAppender membershipManagerAppender;

	private Logger tcpServerConnectionErrorHandlerLogger;
	private Logger membershipManagerLogger;

	@BeforeEach
	public void setUp() throws Exception {

		// create individual appenders for target loggers
		tcpServerConnectionErrorHandlerAppender = new StoreLoggedEventsAppender();
		membershipManagerAppender = new StoreLoggedEventsAppender();

		// add appenders to the respective loggers
		tcpServerConnectionErrorHandlerLogger = Logger.getLogger(targetLoggers[0]);
		tcpServerConnectionErrorHandlerLogger.setAdditivity(true);
		tcpServerConnectionErrorHandlerLogger.addAppender(tcpServerConnectionErrorHandlerAppender);

		membershipManagerLogger = Logger.getLogger(targetLoggers[1]);
		membershipManagerLogger.setAdditivity(true);
		membershipManagerLogger.addAppender(membershipManagerAppender);

		instances = new ArrayList<>();
	}

	@AfterEach
	public void tearDown() {

		// remove the appenders
		tcpServerConnectionErrorHandlerLogger.removeAppender(tcpServerConnectionErrorHandlerAppender);
		membershipManagerLogger.removeAppender(membershipManagerAppender);

		// shutdown all Hazelcast instances
		for (HazelcastInstance instance : instances) {
			instance.getLifecycleService().terminate();
		}
	}

	@Test
	public void testNoReconnectAfterNode1Shutdown() throws InterruptedException {

		// create config and start a 2 node cluster
		configAndStart2NodeCluster();

		HazelcastInstance hcInstance1 = instances.get(0);
		HazelcastInstance hcInstance2 = instances.get(1);

		// shut down hcInstance1 intentionally
		hcInstance1.getLifecycleService().shutdown();

		// wait for some time to ensure any reconnection attempts would have happened
		TimeUnit.SECONDS.sleep(10);

		// ensure only one member remains in the cluster after shutting down hcInstance1
		assertEquals(1, hcInstance2.getCluster().getMembers().size());

		// ensure no reconnection attempts were made by hcInstance2:
		// assert that there were no WARN messages from the target loggers
		assertTrue(tcpServerConnectionErrorHandlerAppender.getWarnLogs().isEmpty());
		assertTrue(membershipManagerAppender.getWarnLogs().isEmpty());
	}

	private void configAndStart2NodeCluster() {

		// create config
		Config config = new Config();

		// route Hazelcast logging through Log4j 1.x so the test's appenders can capture it
		config.setProperty(ClusterProperty.LOGGING_TYPE.getName(), "log4j");

		// enable TCP-IP config
		TcpIpConfig tcpIpConfig = config.getNetworkConfig().getJoin().getTcpIpConfig();
		tcpIpConfig.setEnabled(true);
		tcpIpConfig.setMembers(List.of("127.0.0.1"));

		HazelcastInstance hcInstance1 = Hazelcast.newHazelcastInstance(config);
		HazelcastInstance hcInstance2 = Hazelcast.newHazelcastInstance(config);

		instances.add(hcInstance1);
		instances.add(hcInstance2);
	}
}
The StoreLoggedEventsAppender utility used by the test:

package com.nm.test.hazelcast.utils;

import java.util.ArrayList;
import java.util.List;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.Level;
import org.apache.log4j.spi.LoggingEvent;

public class StoreLoggedEventsAppender extends AppenderSkeleton {

	private List<String> debugLogs = new ArrayList<>();

	private List<String> infoLogs = new ArrayList<>();

	private List<String> warnLogs = new ArrayList<>();

	private List<String> errorLogs = new ArrayList<>();

	@Override
	protected void append(LoggingEvent loggingEvent) {

		if (Level.DEBUG.equals(loggingEvent.getLevel())) {
			debugLogs.add(loggingEvent.getRenderedMessage());
		} else if (Level.INFO.equals(loggingEvent.getLevel())) {
			infoLogs.add(loggingEvent.getRenderedMessage());
		} else if (Level.WARN.equals(loggingEvent.getLevel())) {
			warnLogs.add(loggingEvent.getRenderedMessage());
		} else if (Level.ERROR.equals(loggingEvent.getLevel())) {
			errorLogs.add(loggingEvent.getRenderedMessage());
		}
	}

	@Override
	public void close() {
	}

	@Override
	public boolean requiresLayout() {
		return false;
	}

	public List<String> getDebugLogs() {
		return debugLogs;
	}

	public List<String> getInfoLogs() {
		return infoLogs;
	}

	public List<String> getWarnLogs() {
		return warnLogs;
	}

	public List<String> getErrorLogs() {
		return errorLogs;
	}
}
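If the goal is only to reduce the retry noise rather than eliminate it, Hazelcast's connection-monitor system properties can be tuned. A hedged sketch only: the property names come from Hazelcast's documented system properties, but whether they shorten the post-shutdown reconnect window in this scenario is an assumption to verify against your version.

```java
Config config = new Config();
// Assumption: a lower fault threshold and shorter monitor interval may cause a
// dead endpoint's connection to be closed sooner; verify against your version.
config.setProperty("hazelcast.connection.monitor.interval", "100"); // ms between checks
config.setProperty("hazelcast.connection.monitor.max.faults", "3"); // faults before the connection is closed
```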