Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve repl output window performance #2011

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

julienvincent
Copy link
Contributor

@julienvincent julienvincent commented Jan 8, 2023

This PR introduces two new config entries - replOutputThrottleRate and replOutputMaxLines which are both used to improve the performance of the repl output window/file.

replOutputThrottleRate

If set to a non-0 number this will throttle output from the repl connection. If more output items are received than the throttle rate in a 500ms window then they will just be dropped.

replOutputMaxLines

If set to a non-0 number this will cause the output window to be truncated if it grows
beyond this threshold.

This should fix #2010 and #804 and should help address #942 (or maybe fix it?)

@julienvincent julienvincent changed the title Feature/repl output performance Improve repl output performance Jan 8, 2023
@julienvincent julienvincent changed the title Improve repl output performance Improve repl output window performance Jan 8, 2023
When an excessive amount of output is produced by the repl (for example
as a result of a rogue loop that is printing to stdout) Calva can
sometimes hang while trying to write all the output to the output
window/file.

The only way to resolve is to restart/reload the VSCode window.

This commit introduces a new config entry `replOutputThrottleRate` which
when set to a non-0 number will throttle output from the repl
connection. If more output items are received than the throttle rate in
a 500ms window then they will just be dropped.

Addresses BetterThanTomorrow#942
Fixes BetterThanTomorrow#2010
Calvas performance can be drastically affected as the size of the repl
output window grows.

This commit adds a config entry `replOutputMaxLines` which if set to a
non-0 number will cause the output window to be truncated if it grows
beyond this threshold.

Fixes BetterThanTomorrow#804
@julienvincent julienvincent force-pushed the feature/repl-output-performance branch from 6f95b38 to 99b6e9d Compare January 8, 2023 22:44
Comment on lines +778 to +782
"calva.replOutputMaxLines": {
"markdownDescription": "The maximum number of lines to retain in the repl output window. Having the repl output window grow too large can significantly affect performance. Setting this to 0 will disable truncating",
"type": "number",
"default": 1000
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's generally not the number of lines that is the problem, rather the number of tokens on the top level, or in a sexpr. A way to count tokens on the top level is to count words when printing output. That will introduce state that can be tricky to manage, though...

1000 lines is very low. It could be some 50 evaluations of not too big structures. This would be completely non-problematic for Calva to handle. While printing 1000 lines of output with 100 words each might start to be problematic (I actually don't know where the problems start.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's generally not the number of lines that is the problem, rather the number of tokens on the top level, or in a sexpr

While this makes sense, I don't think the additional complexity needed to implement this is really worth it. My assumption is that counting lines is a close-enough analogy in the average case for this to probably be sufficient. Especially if we make sure the threshold is high enough for the user to not care about the truncation.

1000 lines is very low. It could be some 50 evaluations of not too big structures

This value was chosen pretty arbitrary, I'd be happy to increase it to a degree. While working on this I started to see noticeable performance degradation after about 10k lines of stack-trace content (I was printing Exceptions to stdout) so I don't think it should be increased too much. Maybe 5k lines?

I also think it's unlikely that a user would really be scrolling that far back in the repl history too often for us to need to worry about truncating after a few thousand lines.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlikely, maybe, but I know of a user who does this all the time. 😄 I search the file rather than scroll, but anyway. (I keep tons of browser tabs open as well.) To me it makes sense to disable this limit feature (by setting limit to 0, e.g.) and to have that as default.

Copy link
Member

@bpringe bpringe Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @julienvincent, thanks for taking time to look into this.

I think deleting at an arbitrary line length will cause unbalanced forms and break syntax highlighting and maybe also formatting, unless I'm missing something. If I'm not though, I don't think that's really an option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it makes sense to disable this limit feature (by setting limit to 0, e.g.) and to have that as default.

I'm happy with that too.

I think deleting at an arbitrary line length will cause unbalanced forms and break syntax highlighting ...

It does break highlighting but in a localised way (only the broken form will be colored incorrectly). The rest of the file maintains the correct highlighting. I think this is an probably an acceptable behaviour.

and maybe also formatting

Which formatting do you mean? I haven't noticed anything obviously wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Here can see what I mean

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the structural editor works, when there is broken balance in the document, all bets are off. It will sometimes seem to work (or even work), but in many cases the structural editor is broken and with that a lot of Calva functionality that depends on this.

If we limit on lines, it will have to do it such that the balance is never broken. There are some ways to do this, I can think of two right now:

  1. Add the missing brackets. There is code for that in Calva, even if it might be a bit tricky to re-use (or, messy code, tbh).
  2. Always print full results. The limit then need to happen before the buffering, I think. (Not that I know how it is done now, but anyway.)

That said, I think we could also consider popping up a warning when the line limit is reached, instead of enforcing the limit. The warning can have a button for clearing the window.

Copy link
Member

@bpringe bpringe Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does break highlighting but in a localised way (only the broken form will be colored incorrectly). The rest of the file maintains the correct highlighting. I think this is an probably an acceptable behaviour.

Yeah, I'd say that, in particular, is acceptable too.

By formatting I meant if someone is typing code in the repl window, the auto-formatting could/would be broken if there's an unbalanced form (which Peter mentioned above - structural editing). And as he said other functionality may be broken too in that case.

CHANGELOG.md Outdated Show resolved Hide resolved
@PEZ
Copy link
Collaborator

PEZ commented Jan 10, 2023

Thanks for caring about this issue!

I'm not sure about the solution:

Generally we try to challenge the need to for adding configuration. There might be some other ways to tackle this that don't introduce configuration, or that introduces configuration that is easier to understand for the user. I think I posted an issue at some point, suggesting that we move the stdout and stderr output to another place, like an output channel, where we don't need to support structural editing (which doesn't make super much sense for print output anyway). That would leave mostly evaluation results in the REPL/output window and there is already configuration that can limit the length of these.

Seems reasonable throttling the pace of printing. And we might need/want to do that even if we print stdout/stderr somewhere else. I have no clue about what would be a good default though. Throttling too eagerly will disrupt things too as missing output can be very confusing.

Printing together with the evaluation makes sense for many use cases, so we might want to have config for this if we go the print-somewhere-else route.

We might want to collect more information about the problem. What if we added counters to the documents about how many tokens they have and how long it takes to scan them and to iterate through them? We could use the token counter for limitation settings instead of line count. And we could store this information in an edn file or something under the .calva supporting a Calva diagnostics command. At least for the output window, which is a bit special.

Co-authored-by: Peter Strömberg <pez@pezius.com>
@julienvincent
Copy link
Contributor Author

julienvincent commented Jan 10, 2023

I think I posted an issue at some point, suggesting that we move the stdout and stderr output to another place, like an output channel

Personally I really like that all my output lives in one place and I also much prefer the user-experience of it just being a document/editor window over the vscode native output channels. If we wanted to add support for outputting to a vscode channel I would ask for making that configurable so a user can choose to keep it in the repl/output window.

Seems reasonable throttling the pace of printing. And we might need/want to do that even if we print stdout/stderr somewhere else. I have no clue about what would be a good default though

My guess is that while outputting to a vscode channel might be more performant than the output window, we would probably still want to throttle yes.

I have no clue about what would be a good default though

Setting this to something quite high, like 2k, as a default starting point is probably quite safe. I think it would be quite rare that people actually run into this - it's mostly intended as a safety guard against rogue code, which I think people would struggle to trigger under normal development.

Throttling too eagerly will disrupt things too as missing output can be very confusing

It's worth highlighting that throttled/dropped items are reported in the output window which should help to reduce confusion. Perhaps when reporting we should also show a help message for how to increase/disable the throttling?

We could use the token counter for limitation settings instead of line count

Do you mean as a way to drive the decision for setting a default or are you proposing some kind of mechanism for Calva to automatically decide on a value?

@PEZ
Copy link
Collaborator

PEZ commented Jan 10, 2023

We could use the token counter for limitation settings instead of line count

Do you mean as a way to drive the decision for setting a default or are you proposing some kind of mechanism for Calva to automatically decide on a value?

I was thinking that if the document knows how many top level forms it has, it will be as easy to limit the output on this as on line counts.

Then we could consider measuring two things on each document:

  • Time it took to scan it (the lexer work)
  • Time it took to traverse it, I think we go through the whole document with forwardSexp() at some points and we could record the time

If we store this info on the document as well, it could also be used for limiting the output window output. Then the mechanism could be that we inform the user when the output window is getting slow, with the instruction to clear the window, rather than having an automatic limit.

@bpringe
Copy link
Member

bpringe commented Jan 11, 2023

If we wanted to add support for outputting to a vscode channel I would ask for making that configurable so a user can choose to keep it in the repl/output window.

I believe this was suggested / the idea, were that to be implemented.

My guess is that while outputting to a vscode channel might be more performant than the output window, we would probably still want to throttle yes.

I doubt throttling is necessary in output channels at all, but of course we lose the benefits of it going to an editor, and this is a bit of a tangent - I just wanted to comment.

I wonder if spending much more time and effort on the current output window is worth it. I've been thinking lately that Calva might be better off without the current output window. In its place we could use Portal or (to not add such a dependency) some read-only display provided by VS Code that supports syntax highlighting.

@PEZ I know the term "webview" might stir things inside you from past Calva days 😄, but I think a read-only webview would be drastically simpler than when we had another paredit implementation in a webview. In a webview we could use an existing JS solution for syntax highlighting. Maybe we could also use markdown instead of or within a webview? I don't know if that would be a good idea or not.

Anyway, I've just been thinking we can remove a lot of complexity and get rid of some of these issues by doing something like the above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants