Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

manage immutable Arc<Rkv> rather than interiorly mutable Arc<RwLock<Rkv>> #130

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mykmelez
Copy link
Contributor

Manager currently manages Arc<RwLock<Rkv>> instances, which enables interior mutability of the Rkv instances. That can result in surprising behavior when fetching a mutated Rkv from the Manager. It also requires consumers to acquire a (read, at least) lock to use a managed Rkv, even though all of its methods take an immutable reference to self (&self).

This branch removes the RwLock from managed Rkv instances, such that Manager returns Arc<Rkv>, which is immutable, cannot result in surprising behavior when fetching an Rkv from the Manager, and doesn't need to be locked.

Note that this is a breaking change that will require an incompatible version bump (currently a minor version bump, as we're still pre-1.0).

It's also worth considering whether there is some reason to keep this lock, even though it isn't currently necessary. For example, if we want consumers to be able to close an environment in the future using mdb_env_close(), which can only be called by a single thread, then we'll need a way to lock the Rkv instance.

In that case, I think we can implement internal locking in Rkv of its env field, making it hold an RwLock<MDB_env> and lock it for writing before closing the environment.

@ncloudioj What do you think?

…kv>>

Manager currently manages Arc<RwLock<Rkv>> instances, which enables interior mutability of the Rkv instances. That can result in surprising behavior when fetching a mutated Rkv from the Manager. It also requires consumers to acquire a (read, at least) lock to use a managed Rkv, even though all of its methods take an immutable reference to self (&self).

This branch removes the RwLock from managed Rkv instances, such that Manager returns Arc<Rkv>, which is immutable, cannot result in surprising behavior when fetching an Rkv from the Manager, and doesn't need to be locked.
@mykmelez mykmelez requested a review from ncloudioj March 12, 2019 17:35
@ncloudioj
Copy link
Member

One possible use case of RwLock<Rkv> is for map resizing. lmdb-rs only exposes set_map_size() through EnvironmentBuilder. Hence once the env gets created and opened, there is no way to resize it unless we reopen it with a new EnvironmentBuilder. Although LMDB does allow us to call mdb_env_set_mapsize on an open env.

Perhaps we can change lmdb-rs to expose the resize API from Environment? Other than that, I think it's fine to drop the RwLock here.

@ncloudioj
Copy link
Member

ncloudioj commented Mar 14, 2019

This looks good.

Since this patch allows us to resize the map without closing it, I can't find any other use cases for the mutability any more.

@mykmelez
Copy link
Contributor Author

Note: I'm hold off landing this so I can think about it a bit more.

@ncloudioj
Copy link
Member

After rethinking about the implementation for the early environment close, I just realized that RwLock could be useful to enforce those conditions for a safe close: 1) all transactions, databases, and cursors must already be closed; 2) Only a single thread may call mdb_env_close().

Let's say we add a new API called close so the consumers can close the environment,

manager::singleton().write().unwrap().close("path_to_env");

To implement it, since lmdb::Environment implements Drop, in which it calls mdb_env_close(). We can just delete the entry in Manager.environments once we acquire the exclusive lock on RwLock<rkv>, it works because all other transactions need to hold the shared lock to proceed.

With those two rw locks on Manager and Rkv, I think we'd meet the above conditions to safely close the environment, correct?

@ncloudioj
Copy link
Member

Let's say we do want to keep the RwLock, in order to prevent consumers from mutating the managed rkv as demonstrated in #129 , shall we mark all the rkv constructors as pub(crate) so that the manager would be the only entry point to open&create&close rkv environments?

@mykmelez What do you think?

@mykmelez
Copy link
Contributor Author

I think it's a good goal, but at least one recent consumer (bug 1429796) has had to avoid using Manager to work around a bug in either Firefox (bug 1531887) or Rust (rust-lang/rust#58862). So I'm a bit hesitant to block direct access to rkv environments at the moment!

@ncloudioj
Copy link
Member

So I'm a bit hesitant to block direct access to rkv environments at the moment!

Oops , that ship has sailed! 😉

Sounds good to me. I think we can remind rkv consumers in other ways to avoid the accidental mutation as mentioned in #129 as well as to avoid opening the same env more than once in one process if they have to access rkv directly.

@victorporof
Copy link
Contributor

Sounds like we can WONTFIX this? @ncloudioj

@ncloudioj
Copy link
Member

Yes, we can WONTFIX this for now, only re-open it if we find it necessary.

@victorporof
Copy link
Contributor

The consumer mentioned in #130 (comment) uses the manager now, and all consumers in m-c are transitioned to avoid LMDB. I think we can (and should) reopen this now, because the RwLock is incredibly annoying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants