Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance fastforward speed if no count value has been given #438

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

KAYLukas
Copy link
Contributor

@KAYLukas KAYLukas commented Dec 7, 2018

No description provided.

@codecov
Copy link

codecov bot commented Dec 7, 2018

Codecov Report

Merging #438 into master will decrease coverage by 0.01%.
The diff coverage is 98.57%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #438      +/-   ##
============================================
- Coverage     98.87%   98.85%   -0.02%     
- Complexity     1809     1849      +40     
============================================
  Files            65       65              
  Lines          5160     5259      +99     
============================================
+ Hits           5102     5199      +97     
- Misses           58       60       +2
Impacted Files Coverage Δ Complexity Δ
lib/Recur/RRuleIterator.php 99.09% <98.57%> (-0.33%) 191 <100> (+40)
lib/ITip/Broker.php 99.32% <0%> (ø) 154% <0%> (ø) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b86533...8eaabc0. Read the comment docs.

@staabm
Copy link
Member

staabm commented Dec 7, 2018

Thanks for the PR.

This adds a lot of code. For which use-case do these changes improve performance? Did you measure how much it improves performance?

@KAYLukas
Copy link
Contributor Author

KAYLukas commented Dec 7, 2018

@staabm It's dependent on the other PRs (which I included in this branch too) because otherwise I would have to write tests that are on purpose incorrect.

It improves performance for basically every rrule that does not have the COUNT limit that is included. I'm working on a good benchmark, I did benchmark it a few weeks ago, just lost the data ;-)

@evert
Copy link
Member

evert commented Dec 7, 2018

I'm also a little worried about the complexity this adds. It's not a super easy to maintain part of the code. At the very least the newly introduced code is not really documented, and it could stand to benefit from more in-line documentation and better variable naming. It's pretty impressive though. I have no doubt that this greatly improves performance.

@staabm
Copy link
Member

staabm commented Dec 7, 2018

Yeah its a lot of complexity. I cant overlook the impact of the change though, therefore will delegate the final call to @evert

@evert
Copy link
Member

evert commented Dec 7, 2018

For what it's worth, the intent of the fastForward function (for me) was always to have a hook where future optimizations (exactly such as these) can happen.

The way I envisioned that these optimizations would happen, is that the fastForward function looks if the rules from RRULE are considered "simple enough", such as RRULE:FREQ=DAILY, and instead of calling nextDaily() just immediately skip ahead to those dates.

By giving every nextXXX() function an amount argument, the fast-forwarding logic is effectively handed off to those respective functions, except those functions themselves don't respect COUNT or UNTIL. Thankfully they're protected.

So usually what I will ask people who need a feature like this is: Are there 2 or 3 specific cases that would solve 90% of your problem, because that tends to be a bit more effective in the long. For example, having support for faster fastForwarding in hourly events might not be interesting to anyone because hourly recurring events are so rare.

I think another feeling with this patch is that it feels very much like a patch on existing logic, when maybe the existing logic/structure wasn't really good enough. This makes me wonder if a better design exists to satisfy this requirement.

Anyway, I realize that these are fairly vague and not-that-constructive comments :/

@KAYLukas
Copy link
Contributor Author

KAYLukas commented Dec 11, 2018

So usually what I will ask people who need a feature like this is: Are there 2 or 3 specific cases that would solve 90% of your problem, because that tends to be a bit more effective in the long. For example, having support for faster fastForwarding in hourly events might not be interesting to anyone because hourly recurring events are so rare.

@evert So in general, we want to use this library in a more enterprise setting. Fastforward will be included and needs to be performant enough to be used without causing a DDOS vulnerability. Essentially this means that we will have to block any RRULEs that can not be fast-forwarded without causing a performance problem. In this case the idea would be to block COUNTs that are larger than say 500. So just improving the performance in some cases is not good enough in my opinion. Especially hour events should be fixed as these are the most expensive in fact.

It's very important to have this or a similar performance improvement as we are relying on this. So hopefully, we can find a solution as it's best to have this in the main branch of course :-).

At the very least the newly introduced code is not really documented, and it could stand to benefit from more in-line documentation and better variable naming.

I understand, I will fix this, of course. Would be better to sort out a way that this speed up can be merged first, though.

As I promised the performance measure. I wrote an easy example of the speed up like this (note that there are a lot of tests included in the FastForwardTest case which stress test the performance and test if the fast forward does run in a certain timeframe)

require_once 'vendor/autoload.php';

foreach (['YEARLY', 'MONTHLY', 'DAILY'] as $freq) {
    $timezone = 'America/New_York';
    $startDate = new \DateTime('1992-10-23 00:00:00', new \DateTimeZone($timezone));
    $rrule = new \Sabre\VObject\Recur\RRuleIterator("FREQ=$freq", $startDate);

    $start_time = microtime(true);
    for ($i = 0; $i < 1000; $i++) {
       $rrule->rewind();
       $rrule->fastForward(new \DateTime('now'));
    }
    $end_time = microtime(true);

    echo 'Frequency: ', $freq, PHP_EOL;
    echo 'Time spent for 1: ', ($end_time - $start_time), ' ms', PHP_EOL, PHP_EOL;
}

The output for the master branch is:

$ php ./benchmark.php 
Frequency: YEARLY
Time spent for 1: 0.067322969436646 ms

Frequency: MONTHLY
Time spent for 1: 0.58958315849304 ms

Frequency: DAILY
Time spent for 1: 13.813991069794 ms

For the proposed branch:

php ./benchmark.php 
Frequency: YEARLY
Time spent for 1: 0.034175872802734 ms

Frequency: MONTHLY
Time spent for 1: 0.056872129440308 ms

Frequency: DAILY
Time spent for 1: 0.015760898590088 ms

is that the fastForward function looks if the rules from RRULE are considered "simple enough", such as RRULE:FREQ=DAILY, and instead of calling nextDaily() just immediately skip ahead to those dates.

So that is general not easily done. Consider this case: (new \DateTime("31-1-2018"))->modify("+1 month”): gives 2018-03-1. So we can construct an rrule that just skipping x months forward is not easily done:

DTSTART:20180131T000000
RRULE:FREQ=MONTHLY;BYMONTHDAY=-1

Of course, we could patch this, but then why won’t we already rely on logic that is already there to handle these cases? The benefit of this is that this code is used way more, so there is less chance that a bug will go unnoticed.

Hopefully, we can figure something out here :-)

@valentinbonneaud valentinbonneaud deleted the performance/fast-forward branch May 19, 2020 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants