Skip to content

Commit

Permalink
Avoid bisect command to get stuck
Browse files Browse the repository at this point in the history
From Palkan in benoittgt/rspec_repro_bisect_deadlock#1

First, I've tried to play with the number of specs which led to the
interesting conclusion: **the process hangs only at 1548+ specs**.

```diff
 RSpec.describe "a bunch of nothing" do
   (0...(ENV.fetch('N', 3000).to_i)).each do |t|
     it { expect(t).to eq t }
   end
 end
```

Try to run with `N=1547` and `N=1548`.

Seems suspicious, right?

Let's add `pry-byebug` to the equation (or Gemfile).

In order it to work we need to tweak our runner code a bit:

```diff
- $stdout = $stderr = @spec_output
+ # $stdout = $stderr = @spec_output
```

After a bit of `puts` debugging I localized the problem:
[`@channel.send`](/lib/rspec/core/bisect/fork_runner.rb@7b6b9c3#L122).

 `Channel#send` calls `IO#write` here
 /lib/rspec/core/bisect/utilities.rb@7b6b9c3#L41:

 ```ruby
 def send(message)
   packet = Marshal.dump(message)
   @write_io.write("#{packet.bytesize}\n#{packet}")
 end
 ```

 Do you know, what is the `packet.bytesize` for `N=1548`? It's **65548**.
 This number is very important: the pipe size is only **65536** on MacOS
 (see docs for [`IO#write_nonblock`](ruby-doc.org/core-2.6.3/IO.html#method-i-write_nonblock)
 for more).

 That makes `@write_io.write` hangs forever, because no one reads the
 buffer: we call `Channel#receive` only after `Process.waitpid(pid)`,
 thus waiting for the write operation to complete.

 -----------

 A basic proposal will be to use WNOHANG. From waitpid doc:

 > If WNOHANG was specified in options and there were no children
 > in a waitable state, then waitid() returns 0 immediately (...)

 To validate this proposal on OSX we run just before running bisect:
 `lsof -n -P -r1 -c ruby | grep -e 'PIP' -e '===' -e 'COMMAND'`

 This will give us in loop the PIPE sizes of Ruby processes. Without our
 patch we see that quickly we hit 65536 bytes on two pipes, with the patch
 we keep pipes at the right size.

```
COMMAND PID    USER     FD   TYPE DEVICE                SIZE/OFF  NODE NAME
ruby    40134  benoit    3   PIPE 0xf3b025a6a6cd6005    16384     ->0xf3b025a6a6cd5045
ruby    40134  benoit    4   PIPE 0xf3b025a6a6cd5045    16384     ->0xf3b025a6a6cd6005
ruby    40134  benoit    5   PIPE 0xf3b025a6a6cd7805    16384     ->0xf3b025a6a6cd7145
ruby    40134  benoit    7   PIPE 0xf3b025a6a6cd7145    16384     ->0xf3b025a6a6cd7805
ruby    40134  benoit   10   PIPE 0xf3b025a6a6cd6fc5    16384     ->0xf3b025a6a6cd5a05
ruby    40134  benoit   11   PIPE 0xf3b025a6a6cd5a05    16384     ->0xf3b025a6a6cd6fc5
ruby    40144  benoit    3   PIPE 0xf3b025a6a6cd5d05    16384     ->0xf3b025a6a6cd5c45
ruby    40144  benoit    4   PIPE 0xf3b025a6a6cd5c45    16384     ->0xf3b025a6a6cd5d05
ruby    40144  benoit    5   PIPE 0xf3b025a6a6cd7085    16384     ->0xf3b025a6a6cd6785
ruby    40144  benoit    7   PIPE 0xf3b025a6a6cd6785    16384     ->0xf3b025a6a6cd7085
ruby    40144  benoit   10   PIPE 0xf3b025a6a6cd6fc5    16384     ->0xf3b025a6a6cd5a05
ruby    40144  benoit   11   PIPE 0xf3b025a6a6cd5a05    16384     ->0xf3b025a6a6cd6fc5
```

But if we look properly from the doc we can even go further.

> If status information is immediately available on an appropriate child process,
waitpid() returns this information. Otherwise, waitpid() returns immediately with
an error code indicating that the information was not available. In other words,
WNOHANG checks child processes without causing the caller to be suspended.

and as pirj mention: "With this in mind, do we really need to check that information
that waitpid returns? We don't seem to use it."

Removing "waitpid" produce the same behavior as with `WNOHANG`.

Improvements:
The bisect command request lot's of ram. The next step should be to
reduce that consumption.

Related:
 - fix: #2637
 - PR discussion: #2669
  • Loading branch information
benoittgt committed Nov 26, 2019
1 parent b7067c5 commit 8590b53
Showing 1 changed file with 0 additions and 1 deletion.
1 change: 0 additions & 1 deletion lib/rspec/core/bisect/fork_runner.rb
Expand Up @@ -92,7 +92,6 @@ def initialize(runner, channel)

def dispatch_specs(run_descriptor)
pid = fork { run_specs(run_descriptor) }
Process.waitpid(pid)
end

private
Expand Down

0 comments on commit 8590b53

Please sign in to comment.