Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge issue43 hotfix #55

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

carterian8
Copy link

@carterian8 carterian8 commented May 4, 2022

Closes #43. Watcher::add_appointment spawns an async tokio::task to handle cases when Watcher::store_triggered_appointment runs into bitcoind connectivity issues.

Introducing this async change caused a substantial ripple effect through the code with the most notable being an update to the Carrier to use tokio::sync::Notify to listen for changes to the status of bitcoind_reachable (please see #43 for justification). Other changes just further support the constraints introduced by incorporating more async operations.

All unit tests were updated to verify these changes and an additional test, Watcher::test_add_appointment_bitcoind_unreachable, was added to verify the correctness of the specific corner case presented in #43.

@sr-gi sr-gi self-requested a review May 12, 2022 09:31
Copy link
Member

@sr-gi sr-gi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a first pass. It really feels like this should be solvable in an easier way.

Going the async route means we need to access async functions from chain::Listen (as you already mentioned in #43) which means the design has to be changed significantly. On top of that, it also means accessing the Carrier in an async way, but this interacts with bitcoind synchronously, so things become pretty hacky.

I need deeper understanding on Rust async and lifetimes to judge how to properly fix this. Given this fixes a pretty rare edge case I'll leave it on hold for now.

PS: I've created a branch trying to simplify this but didn't really go further before realizing what I stated in the review https://github.com/talaia-labs/rust-teos/tree/55-tokio-notify. I will revisit this once the cln plugin is merged.

pub(crate) async fn send_transaction(&self, tx: &Transaction) -> ConfirmationStatus {
let mut continue_looping = true;
let mut receipt: Option<ConfirmationStatus> = None;
while continue_looping {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels real hacky, I thing we'd be better off using async_recursion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly open to async_recursion here but usually tend towards non-recursive solutions to save the stack.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this is just a side effect of the async change made in Watcher

ConfirmationStatus::InMempoolSince(self.block_height)
pub(crate) async fn send_transaction(&self, tx: &Transaction) -> ConfirmationStatus {
let mut continue_looping = true;
let mut receipt: Option<ConfirmationStatus> = None;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed. I think it comes from the continue_looping approach so in the case you need to loop again you have no value for receipt. I think recursion may make more sense.

@@ -439,12 +478,13 @@ impl Watcher {
///
/// If the appointment is rejected by the [Responder] (i.e. for being invalid), the data is wiped
/// from the database but the slot is not freed.
fn store_triggered_appointment(
&self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting this into a class method is one of the things I was trying to avoid to be honest, because it means having to redesign how the data structures of the Watcher are accessed (Arc<Mutex<>> now). I'm guessing you ended up going this route because you hit an issue regarding lifetimes in the Watcher when trying to call store_triggered_appointment (it needing to be static). That's the exact issue I didn't know exactly how to fix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this didn't feel right to me either but based on my understanding of Rust this is inevitable because you cannot use self within an async call as it violates the static lifetime requirement.

@carterian8
Copy link
Author

@sr-gi

Yeah I agree this fix is a bit meaty for the problem it addresses.

Regarding lifetimes and Rust, I'm not sure you'll find a clean solution combining async and self. I stumbled across some workarounds but, based on my understanding of Rust, referring to self within an async call violates the static lifetime requirement and calls for some unconventional techniques.

In my opinion, if you'd like to use async to fix this issue, its going to require a redesign of the magnitude presented here.

Maybe we can chat when you're ready and converge on a solution you'd be happy with.

@sr-gi
Copy link
Member

sr-gi commented Jun 13, 2022

@carterian8 I chatted about this recently with @TheBlueMatt since it goes above my current understanding of Rust. He mentioned that Rust needs to make sure that the object is pointed by &self will exist in the same place in memory forever (mainly what static does by my understanding, hence the compiler's suggestion). He also mentioned that the simplest way to make this work may be to replace &self for something like us: Arc<Object>.

I may try to make this work in a simpler scenario, like for the watchtower-plugin retrier to convince myself how this may work (i.e. convert it into an object given it is currently spawning async tasks).

@sr-gi sr-gi added help wanted Extra attention is needed hard to review sharpen your review knife async related to async stuff labels Aug 18, 2022
Comment on lines +868 to +871
join_handle_option = Some(tokio::spawn(async move {
let mut appointments_to_delete = HashSet::from_iter(
invalid_breaches.into_keys());
let mut delivered_appointments = HashSet::new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This would make this method exit early and a task will be spawned on tokio to do the job. This could potentially make the responder's/gatekeeper's own block_connected method run before or interleaving with this task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
async related to async stuff hard to review sharpen your review knife help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Watcher::store_triggered_appointment should be called as an async task
3 participants