Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix quadratic performance in FileTask#out_of_date? #224

Merged
merged 1 commit into from Oct 18, 2017
Merged

Commits on Oct 5, 2017

  1. fix quadratic performance in FileTask#out_of_date?

    FileTask#out_of_date? has been changed in 462e403 to call #needed?
    which calls #out_of_date? recursively. In some cases, where the
    graph of dependencies is fairly dense, this leads to quadratic
    performance when it was linear before.
    
    Use #all_prerequisite_tasks to avoid the problem. This also saves
    a File.exist? test as #timestamp already takes it into account.fix quadratic performance in FileTask#out_of_date?
    
    FileTask#out_of_date? has been changed in 462e403 to call #needed?
    which calls #out_of_date? recursively. In some cases, where the
    graph of dependencies is fairly dense, this leads to quadratic
    performance when it was linear before.
    
    Use #all_prerequisite_tasks to avoid the problem. This also saves
    a File.exist? test as #timestamp already takes it into account.
    
    I made a benchmark that measures the difference in duration for
    out_of_date? on 12.0, 12.1 and 12.1 with this commit applied.
    
    The benchmark used to validate the performance creates 5 layers
    of FileTask, where all tasks of the parent layer are connected
    to all tasks of the child. A root task is added at the top
    and #out_of_date? is called on it.
    
    The benchmark varies the number of tasks per layer.
    
    **12.0**
    
    | Tasks per layers | Duration (s) | Standard deviation |
    |------------------|--------------|--------------------|
    | 1                | 1.32e-05     | 0.96e-05           |
    | 2                | 8.69e-05     | 1.11e-06           |
    | 5                | 1.84e-05     | 2.43e-06           |
    | 8                | 2.89e-05     | 1.05e-05           |
    | 10               | 3.35e-05     | 4.12e-06           |
    | 15               | 4.97e-05     | 6.74e-06           |
    | 20               | 6.19e-05     | 6.23e-06           |
    
    **12.1**
    
    | Tasks per layers | Duration (s) | Standard deviation |
    |------------------|--------------|--------------------|
    | 1                | 7.00e-05     | 5.62e-05           |
    | 2                | 3.98e-04     | 7.38e-05           |
    | 5                | 2.32e-02     | 1.02e-03           |
    | 8                | 0.22         | 0.006              |
    | 10               | 0.65         | 0.006              |
    | 15               | 4.78         | 0.048              |
    | 20               | 20           | 0.49               |
    
    **PR 224**
    
    | Tasks per layers | Duration (s) | Standard deviation |
    |------------------|--------------|--------------------|
    | 1                | 4.47e-05     | 2.68e-05           |
    | 2                | 7.56e-05     | 2.92e-05           |
    | 5                | 2.42e-03     | 4.16e-05           |
    | 8                | 0.51e-03     | 7.21e-05           |
    | 10               | 0.77e-03     | 0.13e-03           |
    | 15               | 14.2e-03     | 0.11e-03           |
    | 20               | 24.2e-03     | 0.16e-03           |
    
    Benchmarking code:
    
    ~~~ ruby
    require 'rake'
    
    LAYER_MAX_SIZE = 20
    LAYER_COUNT = 5
    
    def measure(size)
        app = Rake::Application.new
        layers = (0...LAYER_COUNT).map do |layer_i|
            (0...size).map do |i|
                app.define_task(Rake::FileTask, "#{layer_i}_#{i}")
            end
        end
        layers.each_cons(2) do |parent, child|
            child_names = child.map(&:name)
            parent.each { |t| t.enhance(child_names) }
        end
        root = app.define_task(Rake::FileTask, "root")
        root.enhance(layers[0].map(&:name))
    
        tic = Time.now
        root.send(:out_of_date?, tic)
        Time.now - tic
    end
    
    FileUtils.touch "root"
    sleep 0.1
    LAYER_COUNT.times do |layer_i|
        LAYER_MAX_SIZE.times do |i|
            FileUtils.touch "#{LAYER_COUNT - layer_i - 1}_#{i}"
        end
        sleep 0.1
    end
    
    COUNT = 100
    [1, 2, 5, 8, 10, 15, 20].each do |size|
        mean = 0
        sum_squared_deviations = 0
    
        COUNT.times do |count|
            duration = measure(size)
            old_mean = mean
            mean     = old_mean + (duration - old_mean) / (count + 1)
            sum_squared_deviations = sum_squared_deviations + (duration - old_mean) * (duration - mean)
        end
    
        puts "#{size} #{mean} #{Math.sqrt(sum_squared_deviations / (COUNT - 1))}"
    end
    ~~~
    doudou committed Oct 5, 2017
    Copy the full SHA
    85fa80d View commit details
    Browse the repository at this point in the history