Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pd-client: share the connection when GetStoreMinResolvedTS from PD #922

Open
AndreMouche opened this issue Aug 2, 2023 · 7 comments · Fixed by #921
Open

pd-client: share the connection when GetStoreMinResolvedTS from PD #922

AndreMouche opened this issue Aug 2, 2023 · 7 comments · Fixed by #921

Comments

@AndreMouche
Copy link
Member

Hi,
currently, we try to GetStoreMinResolvedTS for all stores from PD every 2 seconds

client-go/tikv/kv.go

Lines 602 to 607 in 719e645

if s.pdHttpClient != nil {
safeTS, err = s.pdHttpClient.GetStoreMinResolvedTS(ctx, storeID)
if err != nil {
logutil.BgLogger().Debug("get resolved TS from PD failed", zap.Error(err), zap.Uint64("store-id", storeID))
}
}

Here we use http request and do not share the connections.

client-go/util/pd.go

Lines 89 to 94 in 719e645

// GetStoreMinResolvedTS get store-level min-resolved-ts from pd.
func (p *PDHTTPClient) GetStoreMinResolvedTS(ctx context.Context, storeID uint64) (uint64, error) {
var err error
for _, addr := range p.addrs {
query := fmt.Sprintf("%s/%d", storeMinResolvedTSPrefix, storeID)
v, e := pdRequest(ctx, addr, query, p.cli, http.MethodGet, nil)

This will be a problem (too many connection handshakes on PD) when there are a large number of tikv and tidb instances. For example when we have 100 tikvs and 100 tidbs, then the connections for pd/api/v1/min-resolved-ts to PD would be 100*100/2=5000 in 1 seconds. this will be a great pressure for PD. Meanwhile, From the above code on L92, it is possible that the first address is not the PD leader, and then it will cause the increasement of PD followers connections/CPU too.
Here comes the suggestion: Use GRPC(pd-client) to GetStoreMinResolvedTS instead of http-client, which will share the connections and check the pd's leader regularly

@AndreMouche
Copy link
Member Author

@HuSharp Please take a look, thanks!

@AndreMouche
Copy link
Member Author

related issue pingcap/tidb#43737

@Connor1996
Copy link
Member

When enabling TLS, it would waste so much time on TLS handshake
image

@HuSharp
Copy link
Member

HuSharp commented Aug 3, 2023

@AndreMouche @Connor1996 This is the reason why we need http api tikv/pd#6386
I don't think removing it would be a better idea.

For example when we have 100 tikvs and 100 tidbs, then the connections for pd/api/v1/min-resolved-ts to PD would be 100*100/2=5000 in 1 seconds.

To focus on this issue, I tried to extend the api to get the stores together, which will reduce it to 1 * 100 / 2 = 50
Do you think this is a solution?
PTAL, thx!
pd: tikv/pd#6880
client-go: #921

@Connor1996
Copy link
Member

@HuSharp Why not provide a gRPC API?

@HuSharp
Copy link
Member

HuSharp commented Aug 3, 2023

@HuSharp Why not provide a gRPC API?

Oh, I misunderstood you, I'll provide the grpc that got all the store's ts or support for long connections later.

@mayjiang0203
Copy link

/severity major

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants