Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imporve AppendHTMLEscape fast path #1249

Merged
merged 1 commit into from Mar 17, 2022

Conversation

zhangyunhao116
Copy link
Contributor

@zhangyunhao116 zhangyunhao116 commented Mar 16, 2022

The < is occurs more frequently in the HTML string.

Move <

name                 old time/op  new time/op  delta
AppendHTMLEscape-16  99.3ns ± 1%  93.8ns ± 1%  -5.50%  (p=0.000 n=10+10)

And I am wondering maybe we could disable the fast path? Looks like it is not used in most cases.

Disable fast path

name                 old time/op  new time/op  delta
AppendHTMLEscape-16  99.3ns ± 1%  82.3ns ± 3%  -17.10%  (p=0.000 n=10+10)

@erikdubbelboer
Copy link
Collaborator

How about using strings.IndexAny to find the first special character. If it returns -1 you can return append(dst, s...). Otherwise you can dst = append(dst, s[:i]...) and continue the for loop from there. I'm not 100% sure if strings.IndexAny doesn't allocate though.

@zhangyunhao116
Copy link
Contributor Author

The strings.IndexByte actually use assembly directly(SSE AVX etc), but strings.IndexAny is more complicate (contains more Go codes), it will be slower in this case.

use strings.IndexAny

name                 old time/op  new time/op   delta
AppendHTMLEscape-16  99.3ns ± 1%  106.2ns ± 1%  +6.92%  (p=0.000 n=10+10)

@erikdubbelboer
Copy link
Collaborator

In that case I'm in favor of removing the fast path as well.

But in that case isn't it also possible to use strings.IndexByte to find the next index character to encode instead of looping though the whole string one by one?

@zhangyunhao116
Copy link
Contributor Author

But in that case isn't it also possible to use strings.IndexByte to find the next index character to encode instead of looping though the whole string one by one?

I haven't come up with a good plan yet, the strategy seems complicated, maybe we could do it in the future.

This PR removes the fast path now.

name                 old time/op  new time/op  delta
AppendHTMLEscape-16  99.3ns ± 1%  82.3ns ± 3%  -17.10%  (p=0.000 n=10+10)

@erikdubbelboer erikdubbelboer merged commit 3101938 into valyala:master Mar 17, 2022
@erikdubbelboer
Copy link
Collaborator

Thanks. I like simplifying the code if possible so I think this is also a good solution.

u5surf pushed a commit to u5surf/fasthttp that referenced this pull request Mar 23, 2022
u5surf pushed a commit to u5surf/fasthttp that referenced this pull request Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants