This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

relay: refactor dm-worker relay logic #2214

Open
6 tasks
lichunzhu opened this issue Oct 13, 2021 · 0 comments
lichunzhu commented Oct 13, 2021

User story

  1. When relay log is disabled, every subtask on a dm-worker pulls binlogs from the same source separately. This puts extra pressure on the upstream server, and the average performance of each source drops when the number of connections is large.
  2. When relay log is disabled, if the upstream purges its binlogs while DM replication is lagging behind, replication fails and the user can only recover by redoing the full task.
  3. When relay log is enabled, DM replication latency under light load is higher than without relay log, which makes some users reluctant to enable it (see Testing synchronization link latency and CPU usage).
  4. In earlier versions of DM, the CPU consumption of dm-worker is significantly higher with relay log enabled than without it.
  5. Relay log can only be read by the local dm-worker. If the source is transferred to another worker, the relay log can only be pulled again from the upstream MySQL, not from the dm-worker that already has it.

Target

  • Reduce the latency of relay synchronization
  • Reduce the CPU and other machine resources that relay uses while keeping latency low
  • Improve the code structure and readability of the relay module
  • Optimize the relay purger logic
  • (To be discussed) Let the relay reader support transferring binlog to other dm-workers over gRPC.

New relay module design

relay writer

In the existing relay design, the nature of binlog means that the relay writer writes sequentially. After comparing it with the MySQL and TiDB Binlog designs, I think the existing implementation has no major problems, so this update can focus on optimizing the code structure and streamlining the process.

The relay log structure and directory are not changed, so there is no compatibility problem.

relay reader

This proposal focuses on the reader. We propose the following changes to the DM relay module.

  1. Move the relay reader code into the relay module. The relay module is responsible for scheduling and management, and implements a unified interface for reading binlog either from the local relay log or from another dm-worker through gRPC.
  2. When the downstream consumes faster than the upstream is fetched, the unified relay module spares the reader from repeatedly checking the local file size to learn whether a new binlog event has been written. Preliminary testing suggests this should bring the latency with relay enabled down to about the same level as with relay disabled.
  3. (To be discussed) The relay module caches the most recent section of binlog that has been read and is about to be written. If a reader requests binlog in this section, the relay module serves it directly from memory. This mainly avoids the extra cost of writing the binlog to disk and reading it back when the downstream consumes faster than the upstream is pulled.
    • Cache size design and peak -> trough switching
    • Forward and backward switching, using the relay log file switch as the switching time
    • Quickly locating the event by GTID/pos when switching
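
To make item 3 more concrete, here is a minimal sketch of what such a cache could look like. It is only an illustration under assumptions: the `Event` type, the `EventCache` name and its methods are hypothetical and are not taken from the DM code base, and lookup is by file name and position only (GTID lookup and the switching logic listed above are left out).

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a binlog event.
type Event struct {
	File string // relay log file name
	Pos  uint32 // start offset of the event inside File
	Data []byte // raw event payload
}

// EventCache keeps the most recent events in a fixed-size ring buffer.
type EventCache struct {
	buf   []Event
	start int // index of the oldest cached event
	size  int // number of valid events in buf
}

// NewEventCache creates a cache holding at most capacity events.
func NewEventCache(capacity int) *EventCache {
	if capacity < 1 {
		capacity = 1
	}
	return &EventCache{buf: make([]Event, capacity)}
}

// Add stores an event, evicting the oldest one when the cache is full.
func (c *EventCache) Add(e Event) {
	if c.size < len(c.buf) {
		c.buf[(c.start+c.size)%len(c.buf)] = e
		c.size++
		return
	}
	c.buf[c.start] = e
	c.start = (c.start + 1) % len(c.buf)
}

// Get returns the cached event that starts at (file, pos).
// A miss means the reader has to fall back to the relay log file on disk.
func (c *EventCache) Get(file string, pos uint32) (Event, bool) {
	for i := 0; i < c.size; i++ {
		e := c.buf[(c.start+i)%len(c.buf)]
		if e.File == file && e.Pos == pos {
			return e, true
		}
	}
	return Event{}, false
}

func main() {
	cache := NewEventCache(2)
	cache.Add(Event{File: "mysql-bin.000001", Pos: 4})
	cache.Add(Event{File: "mysql-bin.000001", Pos: 120})
	cache.Add(Event{File: "mysql-bin.000001", Pos: 310}) // evicts the event at Pos 4

	_, hit := cache.Get("mysql-bin.000001", 310)
	fmt.Println("pos 310 cached:", hit)
	_, hit = cache.Get("mysql-bin.000001", 4)
	fmt.Println("pos 4 cached:", hit)
}
```

The linear scan in `Get` is only reasonable for a small cache. A real implementation would probably need an index by position or GTID and would have to be safe for concurrent use by the writer and several readers.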

Subtasks

Phase 1 - relay reader refactoring

  • Add a reader interface to the relay module to unify the local/gRPC entry #2215
  • Add a new EventListener inside relay to notify each reader goroutine after an event is written successfully
  • BinlogReader currently listens for binlog file changes; abstract this into an EventNotifier so that relay can notify it directly.
  • Move the relay reader part of pkg/streamer into the relay module
  • Optimize code content and structure, such as merging relayHolder into the relay section
  • Fix unit tests and integration tests
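
The notification tasks above could be sketched roughly as follows. This is not the actual DM implementation: the names `EventNotifier`, `Listeners` and `readerNotifier` and all method signatures are assumptions for illustration. The idea is that the relay writer calls `NotifyAll` after each successful write, and every reader goroutine waits on its own channel instead of polling the relay log file.

```go
package main

import (
	"fmt"
	"sync"
)

// EventNotifier is a hypothetical interface: relay calls OnEvent
// after an event has been written to the relay log successfully.
type EventNotifier interface {
	OnEvent()
}

// Listeners fans a write notification out to every registered notifier.
type Listeners struct {
	mu   sync.RWMutex
	subs map[EventNotifier]struct{}
}

func NewListeners() *Listeners {
	return &Listeners{subs: make(map[EventNotifier]struct{})}
}

// Register adds a notifier, for example when a subtask starts reading.
func (l *Listeners) Register(n EventNotifier) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.subs[n] = struct{}{}
}

// Unregister removes a notifier, for example when a subtask stops.
func (l *Listeners) Unregister(n EventNotifier) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.subs, n)
}

// NotifyAll is called by the relay writer after a successful write.
func (l *Listeners) NotifyAll() {
	l.mu.RLock()
	defer l.mu.RUnlock()
	for n := range l.subs {
		n.OnEvent()
	}
}

// readerNotifier wakes up one reader goroutine. The channel has a buffer
// of one and sends never block, so a slow reader cannot stall the writer
// and several pending notifications collapse into a single wake-up.
type readerNotifier struct {
	ch chan struct{}
}

func newReaderNotifier() *readerNotifier {
	return &readerNotifier{ch: make(chan struct{}, 1)}
}

func (r *readerNotifier) OnEvent() {
	select {
	case r.ch <- struct{}{}:
	default: // a wake-up is already pending
	}
}

// Notified returns the channel the reader goroutine waits on.
func (r *readerNotifier) Notified() <-chan struct{} { return r.ch }

func main() {
	listeners := NewListeners()
	r := newReaderNotifier()
	listeners.Register(r)

	done := make(chan struct{})
	go func() {
		defer close(done)
		<-r.Notified() // block until relay reports a new event
		fmt.Println("reader woke up: new relay event available")
	}()

	listeners.NotifyAll() // the relay writer has just written an event
	<-done
}
```

Because notifications are coalesced, a reader woken up this way should read until it reaches the end of the written data rather than assume exactly one new event per wake-up.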