[BUG] IO error #402

Closed
chronicl opened this issue Dec 21, 2022 · 9 comments
Labels
f: Help wanted, o: Windows (a Windows OS exclusive issue), t: Bug


@chronicl

Any task that doesn't complete very quickly fails with an IO error.

Steps to reproduce the bug

  1. pueued --daemonize
  2. pueue add <Some task that takes longer than 3 seconds>
  3. wait for error

For example, running the following program as a task with pueue add cargo run yields the error for me.

fn main() {
    std::thread::sleep(std::time::Duration::from_secs(3));
}

With a sleep of less than 3 seconds this task doesn't always throw the error, although I've tested other tasks that take less than 3 seconds and still throw it consistently, for example Start-Sleep -Seconds 1 in PowerShell.

Expected behavior

No errors.

Logs/Output

When calling pueue status, the error is described as "some IO error. Check daemon log." The daemon log shows [ERROR] Child 1 failed with io::Error: Os { code: 258, kind: TimedOut, message: "The wait operation timed out." } (I'm not sure whether this is actually the daemon log).

Additional context

  • Operating System: Windows 11
  • Pueue version: 3.0.0. This error does not happen in version 2.0.0.
@Nukesor
Owner

Nukesor commented Dec 21, 2022

You can check the daemon log by running pueued -d -vv.
The output on that terminal is that of the daemon.
It would be nice to get a bit more debug output, as I cannot see what the problem is from your description.

You might have to do some debugging yourself, as I no longer own a Windows machine.

I assume that this has something to do with the new way processes are handled in v3.0, but I'm not sure what could have caused this.

@chronicl
Author

Thanks for the fast reply!
I just tested it on another windows pc, same error. pueued -d -vv only shows

10:28:19 [INFO] Checking path: "C:\\Users\\*\\AppData\\Roaming\\pueue\\pueue.yml"
10:28:19 [INFO] Found config file at: "C:\\Users\\*\\AppData\\Roaming\\pueue\\pueue.yml"
10:28:19 [INFO] Restoring state
10:28:53 [INFO] Didn't find pueue alias file at "C:\\Users\\*\\AppData\\Roaming\\pueue_aliases.yml".
10:28:53 [INFO] Started task: Start-Sleep -Seconds 1
10:28:54 [ERROR] Child 0 failed with io::Error: Os { code: 258, kind: TimedOut, message: "The wait operation timed out." }

I'm a bit busy the next 2 weeks, but once I find time I'll look into this more.
I like pueue a lot, it would be a shame if it didn't work on windows.

@Nukesor
Owner

Nukesor commented Dec 22, 2022

The error comes from somewhere inside this function.

Though I'm really not sure what the problem is. "The wait operation timed out" seems to be a super generic error for all kinds of windows process problems. Are you by any chance familiar with Rust and Windows systems programming?

This probably needs a bit of a deep dive and a proper debugging session by someone with a bit of experience.

@Nukesor
Owner

Nukesor commented Dec 22, 2022

A bit of context:

The error we see appears when pueued checks whether there're any finished tasks.

Over here we check if a child has finished via the try_wait call.
This call is usually expected to be successful, as anything else indicates that there's probably a problem while communicating with the system about the status of the process.

The error is then forwarded to the handle_finished_tasks function, where I luckily added some error handling code for this scenario.
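
To make the three possible outcomes of that check concrete, here is a minimal sketch using the standard library's Child::try_wait. It is not Pueue's actual code: TaskState, classify and poll_child are hypothetical names chosen for illustration, and since v3.0.0 Pueue goes through the command_group wrapper rather than the stdlib Child.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Hypothetical summary of what a single poll can tell us about a task.
#[derive(Debug, PartialEq)]
enum TaskState {
    /// The child is still running; poll again later.
    Running,
    /// The child has exited and its status was collected.
    Finished,
    /// Talking to the OS about the child failed (the case seen in this issue).
    Failed(io::ErrorKind),
}

/// Map the result of a `try_wait`-style call onto a `TaskState`.
fn classify(result: io::Result<Option<ExitStatus>>) -> TaskState {
    match result {
        Ok(None) => TaskState::Running,
        Ok(Some(_status)) => TaskState::Finished,
        Err(err) => TaskState::Failed(err.kind()),
    }
}

/// Non-blocking check of a spawned child (stdlib `Child`, as an assumption).
fn poll_child(child: &mut Child) -> TaskState {
    classify(child.try_wait())
}
```

In this picture, the bug reported here corresponds to the Failed branch being hit for a task that should have been reported as Running.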

Since v3.0.0, Pueue no longer uses the stdlib implementation of Child, but rather a wrapper around it via the command_group library.

The custom windows implementation for try_wait and wait can be found over here.

@Nukesor added the t: Bug, f: Help wanted and o: Windows labels on Dec 22, 2022
@Nukesor removed their assignment on Dec 22, 2022
@chronicl
Author

I'm familiar with rust and have used the windows api a decent amount in rust before, so I can give it a shot, but I'm not home for the next few days.
Quickly looking at the lines of code you linked, I think overlapped.as_ptr() here is never null, but I can't check right now.

@chronicl
Author

I checked out the windows docs now too and I think the following has been happening:

What I mentioned in my last comment should be correct: overlapped has type MaybeUninit<LPOVERLAPPED>, and MaybeUninit::as_ptr returns *const LPOVERLAPPED, which is what gets checked for null, although we should be checking whether the LPOVERLAPPED itself is null. We probably want to use MaybeUninit::assume_init instead, or just not use MaybeUninit at all: let overlapped = std::ptr::null_mut(); should be fine?
The error we get isn't really an error, but expected behavior when calling GetQueuedCompletionStatus with a timeout of 0, because we aren't calling it to wait until a completion packet is sent, we are calling it to check whether a completion packet has been sent. But once the null check is fixed we won't be checking for the error anymore.

This is just what I came up with now, will test in a few days and make a PR if this was indeed the issue.
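
A small, platform-independent sketch of the pointer mix-up described above. LpOverlapped is a hypothetical stand-in for the Windows LPOVERLAPPED type (here just *mut u8), and both function names are made up for illustration; this is not the command_group code itself.

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical stand-in for the Windows LPOVERLAPPED pointer type,
// so the example compiles on any platform.
type LpOverlapped = *mut u8;

/// Mirrors the problematic check: `as_ptr` yields a pointer to the slot that
/// holds the LPOVERLAPPED, and that slot always has a valid address.
fn outer_pointer_is_null() -> bool {
    let overlapped: MaybeUninit<LpOverlapped> = MaybeUninit::new(ptr::null_mut());
    overlapped.as_ptr().is_null()
}

/// Checks the stored pointer itself, which is what the null check is meant to test.
fn inner_pointer_is_null() -> bool {
    let overlapped: MaybeUninit<LpOverlapped> = MaybeUninit::new(ptr::null_mut());
    // SAFETY: the value was initialized by `MaybeUninit::new` above.
    let inner: LpOverlapped = unsafe { overlapped.assume_init() };
    inner.is_null()
}
```

Even though the stored pointer is null in both functions, only the second one reports it: the first checks the address of the slot, which is never null.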

@Nukesor
Owner

Nukesor commented Dec 24, 2022

Awesome, thanks for the quick follow up!

I'll try to tackle the test suite refactoring soon, which will hopefully be a big step for running the tests on all supported platforms :)

@chronicl
Author

I submitted this PR now to the command-group crate. Pueue seems to work as expected on windows with this change!

@Nukesor
Owner

Nukesor commented Dec 31, 2022

The dependency is bumped and a point release is published :)

@Nukesor closed this as completed on Dec 31, 2022