New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove need for &mut self in create_cf
and drop_cf
#496
Conversation
@aleksuss thanks for the comment! From my understanding of reading: #279 (comment), the reason this previous change was reverted was because However this seems to imply the caller will have to eat the cost of at least one call to some locking mechanism on the DB in order to get the mutable reference anyways. If instead we wrapped the
|
@aleksuss pinging for update, thanks |
@aleksuss @stanislav-tkach Hi, I'm colleague with @carllin. You reacted very promptly to my pr in the past (wow, already ~1.5 years ago): #339 We greatly appreciated that. :) As a quick context, this pr is blocking us like the previous pr, so we'd like this to move forward in some way. The problem we're experiencing is spontaneous but whole processing stalls of our product due to the underlying shared rocksdb (ref: solana-labs/solana#14586), happening on production systems.. Hope you can understand our sentiment of this bug... ;) To implement the possible solution to the stall, we want to modify column families via a background thread. Ideally, we don't want to take a unfortunate fork/ |
To be honest, I'm not ready to merge this PR, at least right now. I added changes like this in PR: #197 and then I reverted it as you can see. I think if you need shared mutable access to a db you should provide synchronization by yourself. E.g. I use rocksdb in my project with shared access and I use |
@aleksuss Thanks for quick reply and valuable advise! At least, we can progress in some way. :) We'll consider whether the approach can work for our case or not. :) |
@aleksuss @stanislav-tkach Thanks for the careful consideration! Unfortunately for our use case, I think wrapping the entire database object in a lock is sub-optimal, because:
I noted your concern about single-threaded lock access perf concerns, and have proposed a change in my latest commit that I think will fit both single and multi-threaded scenarios. The proposed change here would introduce a configurable
Would really appreciate your thoughts, and thanks for taking the time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carllin The proposed approach in commit c51484c looks much much better. Thanks. I really like it and I think I will use it in my project 👍🏻
But couple nitpiks:
- Could you please add comment-docs to public methods?
- I think if you add tests to cover multithreaded case with cfs then clippy will be happy 😄
src/column_family.rs
Outdated
@@ -43,6 +43,7 @@ impl ColumnFamilyDescriptor { | |||
|
|||
/// An opaque type used to represent a column family. Returned from some functions, and used | |||
/// in others | |||
#[derive(Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really think it is safe ? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe so, here's my analysis:
We are cloning this object
pub struct ColumnFamily {
pub(crate) inner: *mut ffi::rocksdb_column_family_handle_t,
}
where inner
is this pointer https://github.com/facebook/rocksdb/blob/master/db/c.cc#L973-L974 rocksdb_column_family_handle_t* handle = new rocksdb_column_family_handle_t;
, which aliases an object defined here:
class ColumnFamilyHandle {
public:
virtual ~ColumnFamilyHandle() {}
// Returns the name of the column family associated with the current handle.
virtual const std::string& GetName() const = 0;
// Returns the ID of the column family associated with the current handle.
virtual uint32_t GetID() const = 0;
// Fills "*desc" with the up-to-date descriptor of the column family
// associated with this handle. Since it fills "*desc" with the up-to-date
// information, this call might internally lock and release DB mutex to
// access the up-to-date CF options. In addition, all the pointer-typed
// options cannot be referenced any longer than the original options exist.
//
// Note that this function is not supported in RocksDBLite.
virtual Status GetDescriptor(ColumnFamilyDescriptor* desc) = 0;
// Returns the comparator of the column family associated with the
// current handle.
virtual const Comparator* GetComparator() const = 0;
};
where the destructor ~ColumnFamilyHandle()
is what decrements the refcount to the column family and performs cleanup of old files: https://github.com/facebook/rocksdb/blob/master/db/column_family.cc#L68-L84
The destructor however is invoked through ffi::rocksdb_column_family_handle_destroy
https://github.com/facebook/rocksdb/blob/master/db/c.cc#L997-L1000, which is not what's called in the wrapper's drop_cf
: https://github.com/rust-rocksdb/rust-rocksdb/blob/master/src/db.rs#L687, which is only called when the DB
object is dropped:
fn drop(&mut self) {
unsafe {
self.cfs.get(|cf_map| {
for cf in cf_map.values() {
ffi::rocksdb_column_family_handle_destroy(cf.inner);
}
ffi::rocksdb_close(self.inner);
})
}
}
This means cloning the handle here should be a harmless operation that does not incur any cleanup or additional data races.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. But my concern here is that the Clone
will allow you to write something like this:
#[test]
fn test_unsafe_clone_column_family() {
let danger_cf;
let db_path = DBPath::new("_rust_rocksdb_create_clone_column_family1");
{
let mut opts = Options::default();
opts.create_if_missing(true);
let mut db = DB::open(&opts, &db_path).expect("db open");
db.create_cf("cf1", &opts).unwrap();
let cf = db.cf_handle("cf1").unwrap();
danger_cf = cf.clone();
db.put_cf(&cf, b"1", b"2").unwrap();
}
let db_path = DBPath::new("_rust_rocksdb_create_clone_column_family2");
{
let mut opts = Options::default();
opts.create_if_missing(true);
let mut db = DB::open(&opts, &db_path).expect("db open");
db.create_cf("cf1", &opts).unwrap();
db.put_cf(&danger_cf, b"1", b"2").unwrap(); // BOOOOM !!! (signal 11, invalid memory reference)
}
}
It's a common case in C++, but Rust is positioned as a safe language and to allow such cases it's a bad practice. IMHO.
cc @stanislav-tkach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aleksuss Taking your concern into account, in my latest commit: 2cf5f6d I've removed the Clone
on the object and instead implemented a separate cf_handle_multi_threaded()
path that returns an Arc<ColumnFamily>
. This means only users of the multithreaded config will risk running into the inconsistency you described above, while by default, existing users will be unaffected.
@aleksuss sorry for the dirty PR, it was a proof-of-concept and not entirely cleaned up as you have duly noted 😄 I wanted to get sign-off on this direction before investing more in polishing up the code/tests. Now that I have it, I will get this cleaned up to be more robust. Thanks for reviewing! |
130c61d
to
8694813
Compare
@aleksuss I have added a |
src/column_family.rs
Outdated
@@ -43,6 +43,7 @@ impl ColumnFamilyDescriptor { | |||
|
|||
/// An opaque type used to represent a column family. Returned from some functions, and used | |||
/// in others | |||
#[derive(Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. But my concern here is that the Clone
will allow you to write something like this:
#[test]
fn test_unsafe_clone_column_family() {
let danger_cf;
let db_path = DBPath::new("_rust_rocksdb_create_clone_column_family1");
{
let mut opts = Options::default();
opts.create_if_missing(true);
let mut db = DB::open(&opts, &db_path).expect("db open");
db.create_cf("cf1", &opts).unwrap();
let cf = db.cf_handle("cf1").unwrap();
danger_cf = cf.clone();
db.put_cf(&cf, b"1", b"2").unwrap();
}
let db_path = DBPath::new("_rust_rocksdb_create_clone_column_family2");
{
let mut opts = Options::default();
opts.create_if_missing(true);
let mut db = DB::open(&opts, &db_path).expect("db open");
db.create_cf("cf1", &opts).unwrap();
db.put_cf(&danger_cf, b"1", b"2").unwrap(); // BOOOOM !!! (signal 11, invalid memory reference)
}
}
It's a common case in C++, but Rust is positioned as a safe language and to allow such cases it's a bad practice. IMHO.
cc @stanislav-tkach
let inner = self.create_inner_cf_handle(name.as_ref(), opts)?; | ||
match &mut self.cfs { | ||
ColumnFamilyHandles::MultiThread(_) => { | ||
panic!("Not available on MultiThread implementation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my pickiness, but I have a feeling that it's not good to panic here. What about to bound an invoke of this with trait ??? I'll try to provide the idea with code soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aleksuss Hi, again. I want this to move forward :) (cc: @carllin)
What about to bound an invoke of this with trait ??? I'll try to provide the idea with code soon.
Maybe, were you thinking about like this?: ryoqun@1be386d
Sorry for very rough code. But this eliminates bunch of various concerns arisen from this pr's review:
- ColumnFamily is bounded in the case of MultiThread with zero runtime overhead (I even stopped using Arc)
- ... and no
derive(Clone)
onColumnFamily
- will have no
panic!
s (if everything is correctly rewritten) - keeps api compatibility and no multithread methods.
- no enum runtime variant branching (although this is cheap)
- downside: a bit of internal code churn (I'll promise I'll work hard to fix them all)
If this direction is really to go from maintainer's perspective, I'll finish this very quickly. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, were you thinking about like this?: ryoqun/rust-rocksdb@1be386d
Yes. That is exactly what I meant 👍🏻
Unfortunately, I have no time now to provide my vision on it. But your approach is very very near to my vision. Let's try to implement it 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem: Creating and Dropping column families currently requires a mutable reference. In a system where we have shared access to the database object across multiple threads, this means we need exclusive access to the
DB
Proposal: Wrap the
cfs
column in an RwLock, so an exclusive lock is not needed on the entireDB
object