Run failed TestEndpointSwitchResolvesViolation #13442

Closed

LeoYang90 opened this issue Oct 26, 2021 · 7 comments

LeoYang90 commented Oct 26, 2021

The following test case fails when run against master:

tests/integration/clientv3/ordering_util_test.go:

func TestEndpointSwitchResolvesViolation(t *testing.T) {
	integration2.BeforeTest(t)
	clus := integration2.NewClusterV3(t, &integration2.ClusterConfig{Size: 3})
	defer clus.Terminate(t)
	eps := []string{
		clus.Members[0].GRPCURL(),
		clus.Members[1].GRPCURL(),
		clus.Members[2].GRPCURL(),
	}
	cfg := clientv3.Config{Endpoints: []string{clus.Members[0].GRPCURL()}}
	cli, err := integration2.NewClient(t, cfg)
	if err != nil {
		t.Fatal(err)
	}
	defer cli.Close()

	ctx := context.TODO()

	if _, err = clus.Client(0).Put(ctx, "foo", "bar"); err != nil {
		t.Fatal(err)
	}
	// ensure that the second member has current revision for key "foo"
	if _, err = clus.Client(1).Get(ctx, "foo"); err != nil {
		t.Fatal(err)
	}

	// create partition between third members and the first two members
	// in order to guarantee that the third member's revision of "foo"
	// falls behind as updates to "foo" are issued to the first two members.
	clus.Members[2].InjectPartition(t, clus.Members[:2]...)
	time.Sleep(1 * time.Second) // give enough time for the operation

	// update to "foo" will not be replicated to the third member due to the partition
	if _, err = clus.Client(1).Put(ctx, "foo", "buzz"); err != nil {
		t.Fatal(err)
	}

	cli.SetEndpoints(eps...)
	orderingKv := ordering.NewKV(cli.KV, ordering.NewOrderViolationSwitchEndpointClosure(cli))
	// set prevRev to the second member's revision of "foo" such that
	// the revision is higher than the third member's revision of "foo"
	_, err = orderingKv.Get(ctx, "foo")
	if err != nil {
		t.Fatal(err)
	}

	t.Logf("Reconfigure client to speak only to the 'partitioned' member")
	cli.SetEndpoints(clus.Members[2].GRPCURL())
	resp, err := orderingKv.Get(ctx, "foo", clientv3.WithSerializable())
	if err != ordering.ErrNoGreaterRev {
		if err == nil {
			t.Fatalf("err nil %v+", resp)
		}
		t.Fatalf("While speaking to partitioned leader, we should get ErrNoGreaterRev error %v+", err)
	}
}
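
For context, the ordering wrapper used here is expected to behave roughly like the simplified sketch below (my own illustration of the revision-monotonicity idea, not the actual clientv3/ordering code): it remembers the highest header revision it has seen and treats any response whose revision goes backwards as an ordering violation, which is why pointing the client only at the partitioned member should end with ordering.ErrNoGreaterRev.

// Simplified illustration only, not the actual clientv3/ordering implementation.
package main

import (
	"errors"
	"fmt"
)

var errOrderViolation = errors.New("ordering violation: revision went backwards")

// revisionGuard tracks the highest header revision observed so far.
type revisionGuard struct {
	prevRev int64
}

// check flags responses whose revision is lower than one already seen;
// clientv3/ordering reacts to such a violation by invoking its
// OrderViolationFunc, e.g. the switch-endpoint closure used in the test.
func (g *revisionGuard) check(rev int64) error {
	if rev < g.prevRev {
		return errOrderViolation
	}
	g.prevRev = rev
	return nil
}

func main() {
	g := &revisionGuard{}
	fmt.Println(g.check(3)) // <nil>: first Get served by an up-to-date member (revision 3)
	fmt.Println(g.check(2)) // violation: Get served by the lagging, partitioned member
}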

result:

    ordering_util_test.go:73: Reconfigure client to speak only to the 'partitioned' member
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[core] ccResolverWrapper: sending update to cc: {[{unix:localhost:m2 localhost <nil> 0 <nil>}] 0xc00009f560 <nil>}]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[balancer] base.baseBalancer: got new ClientConn state:  {{[{unix:localhost:m2 localhost <nil> 0 <nil>}] 0xc00009f560 <nil>} <nil>}]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[core] addrConn: tryUpdateAddrs curAddr: {unix:localhost:m2 localhost <nil> 0 <nil>}, addrs: [{unix:localhost:m2 localhost <nil> 0 <nil>}]]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[core] addrConn: tryUpdateAddrs curAddrFound: true]
    logger.go:130: 2021-10-27T10:44:40.449+0800	DEBUG	client	retrying of unary invoker	{"target": "etcd-endpoints://0xc0007f8780/localhost:m0", "attempt": 0}
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[core] Subchannel Connectivity change to SHUTDOWN]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[transport] transport: loopyWriter.run returning. connection error: desc = "transport is closing"]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[transport] transport: loopyWriter.run returning. connection error: desc = "transport is closing"]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[core] Subchannel Connectivity change to SHUTDOWN]
    logger.go:130: 2021-10-27T10:44:40.449+0800	INFO	grpc	[[transport] transport: loopyWriter.run returning. connection error: desc = "transport is closing"]
    ordering_util_test.go:78: err nil &{cluster_id:1088614747799843982 member_id:1740948910635890621 revision:3 raft_term:5  [key:"foo" create_revision:2 mod_revision:3 version:2 value:"buzz" ] false 1 {} [] 0}+
    logger.go:130: 2021-10-26T20:24:33.791+0800	INFO	grpc	[[core] Channel Connectivity change to SHUTDOWN]

It looks like InjectPartition does not take effect.
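
One way to sanity-check whether the partition actually takes effect would be to compare the partitioned member's serializable view of "foo" with a quorum read right after the second Put, for example (a rough sketch reusing the test's own helpers; this is not currently in the test):

// Sketch only: if the partition is in place, member 2 should still serve
// the old revision of "foo" on a serializable (local, non-quorum) read.
partitioned, err := clus.Client(2).Get(ctx, "foo", clientv3.WithSerializable())
if err != nil {
	t.Fatal(err)
}
quorum, err := clus.Client(1).Get(ctx, "foo")
if err != nil {
	t.Fatal(err)
}
if partitioned.Kvs[0].ModRevision >= quorum.Kvs[0].ModRevision {
	t.Fatalf("partition not effective: member 2 mod_revision=%d, quorum mod_revision=%d",
		partitioned.Kvs[0].ModRevision, quorum.Kvs[0].ModRevision)
}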

LeoYang90 changed the title from "Distributors Application for <YOUR DISTRIBUTION HERE>" to "Run failed TestEndpointSwitchResolvesViolation" on Oct 26, 2021
kerthcet commented:

I will take it for further research.
/assign

LeoYang90 (Author) commented Oct 27, 2021

I will take it for further research. /assign

Got it!
The gRPC balancer switched the endpoint to m0 rather than m2.

...
cluster.go:263:  - m0 -> 501edbcb8e5ae1d8 (unix://localhost:m0)
cluster.go:263:  - m1 -> b783595d8f41c27f (unix://localhost:m1)
cluster.go:263:  - m2 -> 43c9b6fcf11ebc22 (unix://localhost:m2)
...
logger.go:130: 2021-10-27T10:44:40.449+0800	DEBUG	client	retrying of unary invoker	{"target": "etcd-endpoints://0xc0007f8780/localhost:m0", "attempt": 0}
ordering_util_test.go:77: While speaking to partitioned leader, we should get ErrNoGreaterRev error <nil>+, &{cluster_id:16997566679625247269 member_id:5773293439648719320 revision:3 raft_term:4  [key:"foo" create_revision:2 mod_revision:3 version:2 value:"buzz" ] false 1 {} [] 0}+

Member id 5773293439648719320 converts to hex 501edbcb8e5ae1d8, which is m0.
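
For reference, the decimal member_id from the response header can be printed as hex with a one-liner, matching the m0 ID shown in the cluster.go log above:

package main

import "fmt"

func main() {
	// member_id reported in the response header, shown as hex.
	fmt.Printf("%x\n", uint64(5773293439648719320)) // prints 501edbcb8e5ae1d8, i.e. m0
}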

@kerthcet

ptabor (Contributor) commented Oct 29, 2021

I'm not sure whether it really switches to etcd-endpoints://0xc0007f8780/localhost:m0, or whether this is a consequence of the naming representation change introduced by #13192, where the 'first' endpoint is taken as the connection name. @serathius

LeoYang90 (Author) commented:

I'm not sure whether it really switches to etcd-endpoints://0xc0007f8780/localhost:m0, or whether this is a consequence of the naming representation change introduced by #13192, where the 'first' endpoint is taken as the connection name. @serathius

The response header gives member id 5773293439648719320, which is 501edbcb8e5ae1d8 (m0). @ptabor

stale bot commented Jan 29, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Jan 29, 2022
ahrtr removed the stale label on Feb 18, 2022
ahrtr (Member) commented Feb 18, 2022

I tried many times and could not reproduce this issue, so I am closing it. Feel free to reopen it if you can still reproduce it.

ahrtr closed this as completed on Feb 18, 2022