Opportunities to batch / avoid truncate and delete shards operations, triggered via chitchat. #4908

fulmicoton · 2024-04-25T04:06:31Z

No description provided.

fulmicoton · 2024-04-25T05:30:09Z

Shard deletion is only triggered via chitchat. Nothing is done if the shard is no longer part of the control plane model, so there should not be any redundancy in this operation.
Batching the delete shards requests could, however, relieve some load on the metastore:
right now, when a few indexers restart, they close their shards, which in turn end up being fully consumed and deleted, all at the same time. This has been observed to be one of the most frequent queries on project airmail.
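
To make the batching idea concrete, here is a minimal sketch in Python (not Quickwit code). It assumes deletions can be buffered for a short window and then grouped per (index_uid, source_id), so that each group becomes a single request carrying several shard ids. `DeleteShardsBatcher`, its methods, and the request dict layout are hypothetical names used only for illustration.

```python
from collections import defaultdict


class DeleteShardsBatcher:
    """Hypothetical buffer grouping shard deletions per (index_uid, source_id)."""

    def __init__(self):
        # (index_uid, source_id) -> set of shard ids waiting to be deleted.
        self._pending = defaultdict(set)

    def add(self, index_uid, source_id, shard_id):
        # Using a set drops duplicate deletions of the same shard.
        self._pending[(index_uid, source_id)].add(shard_id)

    def flush(self):
        """Return one request per (index_uid, source_id) and empty the buffer."""
        requests = [
            {"index_uid": index_uid, "source_id": source_id, "shard_ids": sorted(shard_ids)}
            for (index_uid, source_id), shard_ids in self._pending.items()
        ]
        self._pending.clear()
        return requests


batcher = DeleteShardsBatcher()
batcher.add("my-index:01", "_ingest-source", "shard-a")
batcher.add("my-index:01", "_ingest-source", "shard-b")
batcher.add("my-index:01", "_ingest-source", "shard-a")  # duplicate, ignored
batcher.add("other-index:02", "_ingest-source", "shard-c")

requests = batcher.flush()
for request in requests:
    print(request)
```

With this shape, four deletion events turn into two metastore requests. How long to buffer before flushing is a trade-off between metastore load and deletion latency, and is left open here.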

Truncation can be triggered via chitchat or by receiving a gRPC request.
Truncating entails writing something to the mrecordlog.
This is not an expensive operation per se (there is no fsync involved, for instance). We also properly check that the queue is trailing behind the requested position before applying the truncation.

Because the gRPC path does not update the shardpositions model, we often end up receiving the same truncation twice, and the check helps avoid the redundant write.

The check (and the truncation), however, requires acquiring the write lock. Maybe we could make this operation cheaper by doing the check under the partial lock, by updating the shardpositions model on gRPC, or possibly by somehow batching truncations.
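
As a rough illustration of the batching option, the Python sketch below (not Quickwit code) coalesces truncation requests per queue and keeps only the highest requested position. It assumes positions are plain integers and that only the flush step would need the write lock. `TruncationCoalescer` and all of its members are hypothetical.

```python
class TruncationCoalescer:
    """Hypothetical helper that merges truncation requests per queue."""

    def __init__(self):
        self._current = {}  # queue_id -> last applied truncation position
        self._pending = {}  # queue_id -> highest requested, not yet applied
        self.writes = 0     # number of simulated mrecordlog writes

    def request(self, queue_id, position):
        """Record a request. Returns True if it changed the pending state."""
        # Cheap check: skip requests already covered by an applied or
        # pending truncation (for example the chitchat/gRPC duplicate).
        if position <= self._current.get(queue_id, -1):
            return False
        if position <= self._pending.get(queue_id, -1):
            return False
        self._pending[queue_id] = position
        return True

    def flush(self):
        """Apply pending truncations. Assumed to be the only locked step."""
        applied = {}
        for queue_id, position in self._pending.items():
            if position > self._current.get(queue_id, -1):
                self._current[queue_id] = position
                self.writes += 1
                applied[queue_id] = position
        self._pending.clear()
        return applied


coalescer = TruncationCoalescer()
coalescer.request("queue-1", 10)  # e.g. learned via chitchat
coalescer.request("queue-1", 10)  # same truncation, e.g. via gRPC
coalescer.request("queue-1", 25)  # a later, higher position
coalescer.request("queue-2", 5)

applied = coalescer.flush()
print(applied, coalescer.writes)
```

Here four requests result in two writes, one per queue. A request that arrives after the flush with a position at or below the applied one is rejected without touching the pending state.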

fulmicoton · 2024-04-25T07:58:34Z

The delete operation seems to be using the index correctly.

Delete on shards  (cost=0.41..8.43 rows=1 width=6)
   ->  Index Scan using shards_pkey on shards  (cost=0.41..8.43 rows=1 width=6)
         Index Cond: (((index_uid)::text = 'simian_chico_8976363586344670227:01HWA32X693SVC1NVBCC2T0ND5'::text) AND ((source_id)::text = '_ingest-source'::text) AND ((shard_id)::text = ANY ('{01HWA3QH2X5E3TJEXBB4NEJH48}'::text[])))

fulmicoton · 2024-04-25T08:21:34Z

Upon deletion of an indexer, I see 317 "deleting shards" log lines.
Each node hosts about 150 shards, so that is about twice as many as expected.

Looking at a specific shard ID does show a single line. The second half is probably "rebalancing" kicking in.

fulmicoton added the enhancement label · 2024-04-25
