Commit sst in dedicated threads #1275

rockeet · 2023-03-01T04:39:11Z

Execute `Rdb_sst_info::commit_sst_file` in dedicated threads. This improves performance for two reasons:

1. `Rdb_sst_file_ordered::commit` may use a stack to reverse the input data, which is time-consuming.
2. `m_sst_file_writer->Finish` may be time-consuming; at the very least it needs to call `fsync`.

Running `Rdb_sst_info::commit_sst_file` in dedicated threads increases parallelism with minimal code changes.
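
A minimal sketch of the general pattern, not the code in this PR: each commit is handed to its own `std::thread`, the thread list is guarded by a mutex, and the equivalent of `finish()` joins every outstanding thread before returning. `SstCommitter`, `commit_async`, and `join_all` are hypothetical names used only for illustration; the real change lives in `Rdb_sst_info::commit_sst_file` and `Rdb_sst_info::finish`.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for Rdb_sst_info; it is not the MyRocks class.
class SstCommitter {
 public:
  // Hand the expensive commit work (in the PR: the stack reversal plus
  // Finish/fsync) to a dedicated thread and return immediately.
  void commit_async(std::function<void()> commit_work) {
    std::lock_guard<std::mutex> guard(m_threads_mutex);
    m_threads.emplace_back(std::move(commit_work));
  }

  // Wait for every outstanding commit, as finish() would have to do.
  void join_all() {
    std::vector<std::thread> pending;
    {
      // Take ownership of the list under the lock, then join outside it.
      std::lock_guard<std::mutex> guard(m_threads_mutex);
      pending.swap(m_threads);
    }
    for (auto &thr : pending) {
      if (thr.joinable()) thr.join();
    }
  }

  // A joinable std::thread must not be destroyed, so join on teardown.
  ~SstCommitter() { join_all(); }

 private:
  std::mutex m_threads_mutex;
  std::vector<std::thread> m_threads;
};
```

Swapping the list out under the lock keeps the critical section short, so new commits are not blocked while earlier ones are being joined.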

bladepan · 2023-04-25T22:38:32Z

storage/rocksdb/rdb_sst_info.cc

@@ -392,13 +393,23 @@ int Rdb_sst_info::open_new_sst_file() {
 }

 void Rdb_sst_info::commit_sst_file(Rdb_sst_file_ordered *sst_file) {
+  m_commiting_threads_mutex.lock();


Consider `std::lock_guard<std::mutex>`? We use the `RDB_MUTEX_LOCK_CHECK` macros elsewhere in this file.

bladepan · 2023-04-25T22:39:46Z

storage/rocksdb/rdb_sst_info.cc

@@ -479,6 +490,11 @@ int Rdb_sst_info::finish(Rdb_sst_commit_info *commit_info,
    close_curr_sst_file();
  }

+  for (auto& thr : m_commiting_threads) {


Do we need to grab a lock here too?

bladepan · 2023-04-25T22:44:58Z

> Execute `Rdb_sst_info::commit_sst_file` in dedicated threads. This improves performance for two reasons:
>
> 1. `Rdb_sst_file_ordered::commit` may use a stack to reverse the input data, which is time-consuming.
> 2. `m_sst_file_writer->Finish` may be time-consuming; at the very least it needs to call `fsync`.
>
> Running `Rdb_sst_info::commit_sst_file` in dedicated threads increases parallelism with minimal code changes.

The keys are already ordered, so why would it reverse the input data?
Does the `fsync` happen when the SST file is written to disk?

Could you please provide some rough calculations of the time savings?

rockeet · 2023-04-26T16:45:21Z

> > Execute `Rdb_sst_info::commit_sst_file` in dedicated threads. This improves performance for two reasons:
> >
> > 1. `Rdb_sst_file_ordered::commit` may use a stack to reverse the input data, which is time-consuming.
> > 2. `m_sst_file_writer->Finish` may be time-consuming; at the very least it needs to call `fsync`.
> >
> > Running `Rdb_sst_info::commit_sst_file` in dedicated threads increases parallelism with minimal code changes.
>
> The keys are already ordered, so why would it reverse the input data?
> Does the `fsync` happen when the SST file is written to disk?
>
> Could you please provide some rough calculations of the time savings?

1. In `Rdb_sst_file_ordered::commit()`, if `m_use_stack` is true, the key-value pairs are written from the stack.
2. `Rdb_sst_file_ordered::Rdb_sst_file::commit()` calls `m_sst_file_writer->Finish()`, which writes the file and calls `fsync`.

In our private branch, when `m_use_stack` is true (reverse CF), index creation time drops by more than 30%.
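
To illustrate why the stack path costs extra work at commit time, here is a rough sketch under simplifying assumptions: key-value pairs that arrive in the opposite order to the one the SST writer needs are buffered on a stack, and the commit step pops them all so they come out reversed. `ReversingBuffer` and `KeyValue` are hypothetical names, and `std::string` stands in for the real key and value types; this is not the `Rdb_sst_file_ordered` implementation.

```cpp
#include <stack>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Hypothetical stand-in for the m_use_stack path; not the MyRocks class.
class ReversingBuffer {
 public:
  // Buffer a pair that arrived in the "wrong" order for the writer.
  void put(std::string key, std::string value) {
    m_stack.emplace(std::move(key), std::move(value));
  }

  // Pop every buffered pair. Output order is the reverse of insertion
  // order, so all of the real writing is deferred to commit time.
  std::vector<KeyValue> commit() {
    std::vector<KeyValue> out;
    out.reserve(m_stack.size());
    while (!m_stack.empty()) {
      out.push_back(std::move(m_stack.top()));
      m_stack.pop();
    }
    return out;
  }

 private:
  std::stack<KeyValue> m_stack;
};
```

Because nothing reaches the writer until `commit()` runs, the whole file's worth of writes lands in the commit step, which is the part this PR moves off the calling thread.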

bladepan · 2023-04-26T23:57:52Z

> 1. In `Rdb_sst_file_ordered::commit()`, if `m_use_stack` is true, the key-value pairs are written from the stack.
> 2. `Rdb_sst_file_ordered::Rdb_sst_file::commit()` calls `m_sst_file_writer->Finish()`, which writes the file and calls `fsync`.
>
> In our private branch, when `m_use_stack` is true (reverse CF), index creation time drops by more than 30%.

`Rdb_index_merge` and `Rdb_sst_file_ordered` both use the CF's comparator. When `Rdb_index_merge` writes to `Rdb_sst_file_ordered`, the keys should already be in the CF's increasing order, so `m_use_stack` should be false in this case.

rockeet · 2023-04-28T03:19:44Z

> > 1. In `Rdb_sst_file_ordered::commit()`, if `m_use_stack` is true, the key-value pairs are written from the stack.
> > 2. `Rdb_sst_file_ordered::Rdb_sst_file::commit()` calls `m_sst_file_writer->Finish()`, which writes the file and calls `fsync`.
> >
> > In our private branch, when `m_use_stack` is true (reverse CF), index creation time drops by more than 30%.
>
> `Rdb_index_merge` and `Rdb_sst_file_ordered` both use the CF's comparator. When `Rdb_index_merge` writes to `Rdb_sst_file_ordered`, the keys should already be in the CF's increasing order, so `m_use_stack` should be false in this case.

We removed `merge_record::m_comparator`, since it is identical for every `std::set` element, and use `Slice::operator<` instead to avoid the `rocksdb::Comparator` virtual function call. The merge output is therefore in reverse order with respect to a reverse CF, which makes `m_use_stack` true.
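
The ordering mismatch can be sketched as follows, with `std::string` standing in for `rocksdb::Slice` and a plain function standing in for a reverse column family's comparator. All names here (`reverse_cf_less`, `ascending_under`, `merge_bytewise`) are hypothetical and only illustrate the argument: a set ordered by bytewise `<` emits keys that are ascending bytewise but descending under a reversed comparator, so a consumer expecting the CF's order would have to reverse them.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Stand-in for a reverse CF comparator: larger byte strings sort first.
inline bool reverse_cf_less(const std::string &a, const std::string &b) {
  return a > b;
}

// True when `keys` is strictly ascending under the ordering `less`.
template <class Less>
bool ascending_under(const std::vector<std::string> &keys, Less less) {
  for (std::size_t i = 1; i < keys.size(); ++i) {
    if (!less(keys[i - 1], keys[i])) return false;
  }
  return true;
}

// Emit keys the way a std::set ordered by plain bytewise `<` would.
inline std::vector<std::string> merge_bytewise(
    const std::vector<std::string> &input) {
  std::set<std::string> ordered(input.begin(), input.end());
  return std::vector<std::string>(ordered.begin(), ordered.end());
}
```

With a forward CF the two orderings agree and no reversal is needed; the mismatch only appears when the CF's comparator is the reverse of the bytewise one.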

rockeet added 2 commits March 1, 2023 12:26

rdb_sst_info: commit sst_file in dedicated threads

a447264

rdb_sst_info: commit sst_file in dedicated threads: simplify

33787bb

facebook-github-bot added the CLA Signed label Mar 1, 2023

bladepan reviewed Apr 25, 2023
