Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diff highlighting broken (looks streaming search related) #19403

Closed
rvantonder opened this issue Mar 24, 2021 · 9 comments · Fixed by #19543
Closed

Diff highlighting broken (looks streaming search related) #19403

rvantonder opened this issue Mar 24, 2021 · 9 comments · Fixed by #19543
Assignees
Labels
bug An error, flaw or fault that produces an incorrect or unexpected result, or behavior.

Comments

@rvantonder
Copy link
Contributor

Looks like a regression with streaming search, because it seems to manifest only with streaming on:

example:

Screen Shot 2021-03-23 at 7 12 22 PM

With streaming off:

Screen Shot 2021-03-23 at 7 11 38 PM

Commit message highlighting via type:commit appears unaffected. Not sure if backend or frontend so just assigning everyone.

@rvantonder rvantonder added bug An error, flaw or fault that produces an incorrect or unexpected result, or behavior. team/search labels Mar 24, 2021
@github-actions
Copy link
Contributor

Heads up @lguychard - the "team/search" label was applied to this issue.

@keegancsmith
Copy link
Member

Will take a look. I believe this happens without streaming as well sometimes. Need to see if I can find the other issue about this (or maybe I just have it in my org files).

@keegancsmith
Copy link
Member

keegancsmith commented Mar 24, 2021

This is correctly highlighted in the src-cli + streaming, which makes me think it has something to do with how webapp highlights commits. I think this is an old issue, I do have memory of seeing it before:

src search -stream 'repo:^github\.com/sourcegraph/sourcegraph$ type:diff return count:10'

image

vs

image

(match count difference is due to using a lower count: limit when I did the cli query)

@limitedmage
Copy link
Contributor

limitedmage commented Mar 24, 2021

Looking. Also ran into the separate issue: https://sourcegraph.slack.com/archives/C07KZF47K/p1616608611137800 (update: that issue is now resolved)

@limitedmage
Copy link
Contributor

limitedmage commented Mar 24, 2021

Root cause: The library we're using to convert Markdown to HTML, Marked, converts tabs to spaces with no way to disable this. So all our highlighting calculations are broken once the tabs have been converted to spaces.

This doesn't happen in non-streaming search because we return HTML of the code instead of Markdown.

@limitedmage
Copy link
Contributor

Not sure what the best way to fix this is. @keegancsmith What is the feasibility of modifying streaming search backend to return HTML like GraphQL search does instead of Markdown?

@keegancsmith
Copy link
Member

Not sure what the best way to fix this is. @keegancsmith What is the feasibility of modifying streaming search backend to return HTML like GraphQL search does instead of Markdown?

We can do that. It is a bit strange we have markdown as the response type, really you would just expect a raw byte buffer like we get with line content/etc. So I'd like to better understand how this all works. I would expect we can have a better arch for streaming before we start having 3rd party clients depend on the API. It seems we ask the backend to highlight code blocks, I see requests to graphql asking to highlight a codeblock adn we get back HTML. Is there a better way to structure this overall?

@stefanhengl
Copy link
Member

@keegancsmith and I discussed this a bit. We think the backend should just send back content as plain text, without the ```COMMIT_MSG or ```diff markers. The frontend wouldn't have to call "Marked" at all. We would still call the backend for highlighting though.

Only somewhat related, we noticed that sending content back to the backend for highlighting can take a lot of time. Maybe streaming could send down the first N events already highlighted? @limitedmage WDYT?

@limitedmage
Copy link
Contributor

I think we should move forward with returning HTML identical to the GraphQL response as we know this works. We can work on optimizing later (eg. returning HTML already with syntax highlighting).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An error, flaw or fault that produces an incorrect or unexpected result, or behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants