Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The unit test test_shrink make no sense #114

Open
yanghaku opened this issue Apr 20, 2022 · 2 comments
Open

The unit test test_shrink make no sense #114

yanghaku opened this issue Apr 20, 2022 · 2 comments

Comments

@yanghaku
Copy link

yanghaku commented Apr 20, 2022

In the test_shrink, if comment the function call pool.set_num_threads(TEST_TASKS); , the test also can pass. Because after_shrink_size == TEST_TASKS and active threads == TEST_TASKS.

Then I want to try write a test case that can test shrink, but it will test fail sometimes (because the implementation in lib.rs).

The code is as follows, test_no_shrink is for the origin test case I metion above, test_shrink_in_sleep is my custom test case using sleep.

#[cfg(test)]
mod test {
    use std::{
        sync::{Arc, Barrier},
        thread::sleep,
        time::Duration,
    };
    use threadpool::ThreadPool;

    const TEST_TASKS: usize = 4;
    #[test]
    fn test_no_shrink() {
        let test_tasks_begin = TEST_TASKS + 2;

        let pool = ThreadPool::new(test_tasks_begin);
        let b0 = Arc::new(Barrier::new(test_tasks_begin + 1));
        let b1 = Arc::new(Barrier::new(test_tasks_begin + 1));

        for _ in 0..test_tasks_begin {
            let (b0, b1) = (b0.clone(), b1.clone());
            pool.execute(move || {
                b0.wait();
                b1.wait();
            });
        }

        let b2 = Arc::new(Barrier::new(TEST_TASKS + 1));
        let b3 = Arc::new(Barrier::new(TEST_TASKS + 1));

        for _ in 0..TEST_TASKS {
            let (b2, b3) = (b2.clone(), b3.clone());
            pool.execute(move || {
                b2.wait();
                b3.wait();
            });
        }

        b0.wait();

        // no shrink call, it also can test successfully
        // pool.set_num_threads(TEST_TASKS);

        assert_eq!(pool.active_count(), test_tasks_begin);
        b1.wait();

        b2.wait();
        assert_eq!(pool.active_count(), TEST_TASKS);
        b3.wait();
    }

    #[test]
    fn test_shrink_in_sleep() {
        let test_tasks_begin = TEST_TASKS / 2;
        let shrink_tasks = 1;

        let mut pool = ThreadPool::new(test_tasks_begin);

        for _ in 0..TEST_TASKS {
            pool.execute(move || {
                sleep(Duration::from_secs(1));
            })
        }

        sleep(Duration::from_millis(100));
        pool.set_num_threads(shrink_tasks);

        /*
         *  if TEST_TASKS = 4:
         *
         *  time        tasks
         *  [0-1]       task0, task1
         *  [1-2]       task2
         *  [2-3]       task3
         */

        sleep(Duration::from_secs(1));
        assert_eq!(pool.active_count(), 1);
    }
}

This is the test result:

running 2 tests
test test::test_no_shrink ... ok
test test::test_shrink_in_sleep ... FAILED

failures:

---- test::test_shrink_in_sleep stdout ----
thread 'test::test_shrink_in_sleep' panicked at 'assertion failed: `(left == right)`
  left: `2`,
 right: `1`', src/main.rs:77:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    test::test_shrink_in_sleep

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.10s

Although I know the thread pool does not need such a precise shrink, I want to ask how to write a test case to test whether the shrink really works and the test case can pass everytime.

@dns2utf8
Copy link
Member

One of the problems is, that we don't have control over the OS and the other tasks on the machine. Depending on the load, sleeping for 100ms may not be enough for the workers to settle down in time.
We used to have tighter tolerances, because personal development machines usually have more resources than CI machines.

I am currently rewriting the crate to make it faster and more flexible, but the focus again is on faster work speeds not faster admin reactions.

May I ask, how did you stumble over this test :)

@yanghaku
Copy link
Author

yanghaku commented May 3, 2022

Sorry for my late reply.

1. The reason for my fail test test_shrink_in_sleep

In my test code, Firstly initialize a thread pool with an initial number of threads of 2. And use the thread pool to run 4 tasks with sleep 1s, assuming t_0,t_1,t_2,t_3.
Let the time to start running the task be 0, then in the [0, 1000ms], the thread pool will run t_0 and t_1. And t_2, t_3 will be in the queued channel.

At the 100 ms, we set the number of thread pools to 1. therefore:
[0, 100ms]: max_thread_count=2, active_count=2, queued_count=2,
[100ms, 1000ms]: max_thread_count=1, active_count=2, queued_count=2

When task t_0 or t_1 ends, the corresponding thread will continue the next round of the loop. First, determine the value of active_count and max_thread_count, if active_count is larger, stop this thread, which accomplishes shrink.

The code I describe is in lib.rs: L745-L766
judge if (active_count <= max_thread_count ) ==> recv a job ==> active_count increases by 1 ==> running the job

As expected, because max_thread_num=1, it will run t_2 at [1000ms, 2000ms] and run t_3 at [2000ms, 3000ms].
So I check assert_eq!(pool.active_count(), 1) at 1100ms.

But! [L745-L766] this operation is not atomic, maybe both threads are not stopped at the same time because active_count <= max_thread_count is true.
In this situation,
[1000ms, 2000ms]: max_thread_count=1, active_count=2, queued_count=2
So this unit test fails.

The point of the thread pool is to work faster, not a faster management response, I think so. It is tolerable to have such a situation, so I feel that my test case is unsuitable.

2. The test case test_set_num_threads_decreasing in lib.rs L813

My test case test_shrink_in_sleep is very similar to test_set_num_threads_decreasing, but the key difference is that my test case is new threads < queued tasks, and in test_set_num_threads_increasing is new threads == queued tasks

You can try modify the code in lib.rs L821, change pool.set_num_threads(new_thread_amount); to pool.set_num_threads(new_thread_amount-1);, then the test case will fail too.

But I think, because in the original thread pool max_thread_count > queued task, even if the set_num_threads function does not work, it can still pass assert_eq!(pool.active_count(), new_thread_amount).

So the test case test_set_num_threads_decreasing does not observe whether set_num_threads works, test_shrink in lib.rs L953 is the same.

So I want to ask if it is necessary to rewrite a better test case to test if thread number decreasing really works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants