New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

RFC: reduce traffic for resolved ts #108

Open

you06 wants to merge 5 commits into tikv:master from you06:reduce-traffic-resolved-ts

you06 commented May 23, 2023 •

edited


          add RPC for reducing traffic for resolved ts

2fb55da

Signed-off-by: you06 <you1474600@gmail.com>

you06 force-pushed the reduce-traffic-resolved-ts branch from d005adb to 2fb55da Compare

May 23, 2023 12:52

Connor1996 reviewed

View reviewed changes

text/0000-reduce-traffic-resolved-ts.md Outdated


		The proposed method for reducing the traffic of pushing safe ts is hibernated region optimization. This method can be especially beneficial for large clusters since it can save a significant amount of traffic.

		The motivation behind this approach is that TiKV currently uses the `CheckLeader` request to push resolved ts, and the traffic it costs grows linearly with the number of regions. By utilizing hibernated region optimization, it is possible to reduce this traffic and improve overall performance.

Member

Connor1996 May 24, 2023

But hibernated region is removed in raftstore-v2

Author

you06 May 26, 2023

Yeah, maybe the traffic of pushing safe ts won't be a problem if the region size is dynamic. This RFC is only an optimization in raftstore-v1.

cfzjywxk mentioned this pull request

[WIP]storage: optimize check leader flow, using raft check first to get valid leaders tikv/tikv#14832

Draft

you06 mentioned this pull request

[WIP] resolved-ts: push resolved ts in two phases tikv/tikv#14837

Open

you06 added 3 commits

May 30, 2023 19:43


          update a safe version

48f1f78

Signed-off-by: you06 <you1474600@gmail.com>


          update the issue of safe ts lag in followers

03c21d8

Signed-off-by: you06 <you1474600@gmail.com>


          rename file

Signed-off-by: you06 <you1474600@gmail.com>

you06 force-pushed the reduce-traffic-resolved-ts branch from ac74ce1 to 5890509 Compare

June 5, 2023 16:39

cfzjywxk reviewed

View reviewed changes

text/0108-reduce-traffic-resolved-ts.md

		@@ -0,0 +1,114 @@
		## Summary

		The proposed method for reducing the traffic of pushing safe ts is hibernated region optimization. This method can be especially beneficial for large clusters since it can save a significant amount of traffic.

cfzjywxk Jun 7, 2023

The proposed method for reducing the traffic of pushing safe ts is hibernated region optimization.

Currently, the main strategy in the new design is to minimize leader validation traffic and safe-ts pushing traffic for valid leaders.

text/0108-reduce-traffic-resolved-ts.md

		The `CheckLeaderResponse` respond with the regions that pass the Raft safety check. The leader can then push its `safe_ts` for those regions. Since most regions will pass the safety check, it is not necessary to respond with the IDs of all passing regions. Instead, we can respond with only the IDs of regions that fail the safety check.

		Another optimization is that we can confirm the leadership if the leader lease is hold, by calling Raft's read-index command. But this will involve in the Raftstore thread pool, more CPU will be used by this.

cfzjywxk Jun 7, 2023

We could add some explanations:
By default, the raft lease is set to 10s. If the advance-ts-interval is configured to 100ms, only one leader validation using the read index method will be required for 100 rounds which saves 99 rounds of unnecessary leader checks.

Tema Jun 20, 2023

Does it mean that in this case we don't need to send CheckLeaderRequest every 100ms and can only send ApplySafeTsRequest while we are under the lease?

cfzjywxk Jul 3, 2023

Yes as the validation and pushing are separated into two phases.

text/0108-reduce-traffic-resolved-ts.md

		```

		To save traffic, `ApplySafeTsRequest.unsafe_regions` only contains the regions whose leader may be changed. In the ideal case, this request is small because there is almost no unsafe regions.

cfzjywxk Jun 7, 2023

This is an example that will make it easier for readers to understand. Like when advance-ts-interval is set to 100ms, how many bytes could be saved running 100 rounds comparing the new protocol and the previous one.

text/0108-reduce-traffic-resolved-ts.md

		Safety is guaranteed as long as the safe ts is generated from a valid leader. By decoupling the pushing of safe ts into two phases, the leader peer can ensure that the leadership is still unchanged before safe ts is generated.

		For inactive regions, only the region IDs are sent. If the term has changed since the last active region check, the follower will respond with a check failure. When the leader receives the `CheckLeaderResponse`, and the inactive region ID is not in `CheckLeaderResponse.failed_regions`, it means that the terms of the leader and follower are both unchanged. After checking leadership for inactive regions, it's safe to push the safe timestamps from the leader. Additionally, since there are no following writes in inactive regions, the `applied_index` check is unnecessary.

cfzjywxk Jun 7, 2023

Better to explain How safety is guaranteed if the leader is a stale one or the pushing request is from a stale one when the network partition happens.

text/0108-reduce-traffic-resolved-ts.md

		Another optimization is that we can confirm the leadership if the leader lease is hold, by calling Raft's read-index command. But this will involve in the Raftstore thread pool, more CPU will be used by this.

		### Inactive Regions

cfzjywxk Jun 7, 2023

We need to explain how this new design interacts with the existing hibernated region feature and identify any dependencies it may have. Additionally, we should outline how to determine if a region is inactive.

As I understand it, the current workflow involves:

Checking if a region leader is active.
Validating that the leader role is still held for both active and inactive regions using different methods.
Sending pushing ts requests; note that different request types are required for active and inactive regions.

you06 added a commit to you06/kvproto that referenced this pull request


          implement tikv/rfcs#108

8b06726

Signed-off-by: you06 <you1474600@gmail.com>

This was referenced Jun 12, 2023

Add resolved ts traffic optimizations pingcap/kvproto#1128

Open

Tracking issue of reducing traffic for resolved ts tikv/tikv#14919

Open

Tema reviewed

View reviewed changes

text/0108-reduce-traffic-resolved-ts.md

+              }
+              message ApplySafeTsResponse {
+                  uint64 ts = 1;

Tema Jun 20, 2023

Why do we need get a ts back in response? Also would we need ts for each active region in this case not just one ts?

Author

you06 Jun 25, 2023

The ts is actually not used.

would we need ts for each active region in this case not just one ts?

No, because leader doesn not need to know the safe ts in followers.

Tema reviewed

View reviewed changes

text/0108-reduce-traffic-resolved-ts.md

+              ```diff
+              message CheckLeaderRequest {
+                  repeated LeaderInfo regions = 1;

Tema Jun 20, 2023

If CheckLeaderRequest does not advance safe_ts anymore (as subsequent ApplySafeTsRequest does it), we probably can strip some data out of LeaderInfo as well. E.g. no need to pass ReadState anymore?

Author

you06 Jun 25, 2023

We send LeaderInfo without ReadState as a transition state from active to inactive(you can see "Transition State" section in the updates).
For active regions(which have continues write), they are not contains in CheckLeaderRequest.regions, because the leadership is checked by Raft lease, we send the checked state of them in ApplySafeTsRequest.checked_leaders.

Tema reviewed

View reviewed changes

text/0108-reduce-traffic-resolved-ts.md

+              ```protobuf
+              message CheckedLeader {
+                  uint64 region_id = 1;
+                  ReadState read_state = 2;

Tema Jun 20, 2023

Do we need both AppliedIndex and SafeTs here? Anyway to save space further, we can introduce ReadStateDiff which would contain AppliedIndex and SafeTs diff with ApplySafeTsRequest.ts which would only take 2-4 bytes instead of 8 bytes. I believe uint64 is never a 8 bytes thanks to varint protobug encoding, but point is same. By passing only diff towards ApplySafeTsRequest.ts, it will always be much shorter and thanks to varint encoding we can save lot of space.

Author

you06 Jun 25, 2023

Do we need both AppliedIndex and SafeTs here?

Yes , it's contained in ReadState, because in active regions, the applied index in follower may delay from leader, we cannot apply the safe ts until the apply index catches up CheckedLeader.read_state.applied_index.

Anyway to save space further, we can introduce ReadStateDiff

That's fun and I'll do some research of sending diff when we continue to push this project.


          update transition state

706458b

Signed-off-by: you06 <you1474600@gmail.com>

overvenus reviewed

View reviewed changes

text/0108-reduce-traffic-resolved-ts.md

+              }
+              ```
+              To apply safe ts for valid leaders as soon as possible, instead of waiting for the next round of advancing resolved timestamps, we need to send another `ApplySafeTS` request. The `ApplySafeTS` request is usually small, the traffic caused by it can be ignored.

Member

overvenus Jul 3, 2023

The ApplySafeTS request is usually small

How small is it? Does it grow linearly with the number of regions?

Author

you06 Jul 3, 2023

The size of ApplySafeTS request grows by the number of active regions whose the apply index is changed.

But we just move the leader info from CheckLeader request to ApplySafeTS request, even all regions are active, there won't be more traffic.

text/0108-reduce-traffic-resolved-ts.md

Comment on lines +56 to +61

+              message ApplySafeTsRequest {
+                  uint64 ts = 1;
+                  uint64 store_id = 2;
+                  repeated uint64 unsafe_regions = 3;
+                  repeated CheckedLeader checked_leaders = 4;
+              }

Member

overvenus Jul 3, 2023

Could you add some comments about these fields?

text/0108-reduce-traffic-resolved-ts.md


		### Active Regions

		The `CheckLeaderResponse` respond with the regions that pass the Raft safety check. The leader can then push its `safe_ts` for those regions. Since most regions will pass the safety check, it is not necessary to respond with the IDs of all passing regions. Instead, we can respond with only the IDs of regions that fail the safety check.

Member

overvenus Jul 3, 2023

Instead, we can respond with only the IDs of regions that fail the safety check.

What if messages get out of order? We may miss some failed regions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment