Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration of the simulation feature seems deficient #532

Open
eric-s-raymond opened this issue May 19, 2022 · 13 comments
Open

Integration of the simulation feature seems deficient #532

eric-s-raymond opened this issue May 19, 2022 · 13 comments

Comments

@eric-s-raymond
Copy link
Contributor

I'm developing a TUI as part of Shimmer, a project that aims to eliminate the single points of failure that are centralized forge sites by enabling issue and merge-request data to be embedded directly in a Git repository. The user agent for browsing this embedded data will field in three flavors; a CI for scripting (done), a TUI for convenient local browsing (in progeess), and a web-server that resembles a forge interface but can run anywhere there's a copy of the repo (not yet started). You can read about the project here:

https://gitlab.com/esr/shimmer

I'm using tcell for the TUI. I found tcell very pleasant to use until the program developed a crash bug. It's perfectly reproducible, and I even know the line of source code where the program is dying -- I think it's panicking on a bad array index. But tcell is just as bad as old-time curses about silently crashing without leaving trace information, I wanted to work around this by using the terminal simulation and adding a bunch of logging, but found it too difficult.

The fact that simulation screens have a different allocation function from regular ones is the beginning of the pain. Makes it difficult to write my code so I can dynamically choose between a real console and a simulated one. I also don't understand why the regular and simulation-screen interfaces are as different as they are.

Here's what I think I ought to be able to do:

. Set TERM to a magic value and get back a Screen that does the simulatiion thing internally, not messing with the tty settings.

. Call a method on the returned object that gives me an event channel I can push events to. Its version of event polling waits on that channel.

. Have an available method of the object that dumps the screen state in a textual form.

Is there some good reason it doesn't work that way now? If I wrote patches to fix this, would you take them?

@gdamore
Copy link
Owner

gdamore commented May 19, 2022

So the simulation screen was written quite a long time ago, and was intended to be a framework to support some of the unit tests. Maybe StubScreen would have been a more apt name. Some of the interfaces it exposes are solely to facilitate unit tests.

Having said that, I'm completely amenable to a set of improvements around this area. The ideas you've presented here seem like they would be meaningful improvements, and I like the idea of being able to feed events into a channel to simulate user interaction. (We could also implement some kind of test events if we need to send messages into the simulation to access non-public state or make changes or read state from within the context of the event loop.)

So assuming the contributions are well-formed (follow style, licensing, etc.) and there aren't technical deficiencies found in them that are unresolvable, I see no reason we wouldn't accept such contributions.

@eric-s-raymond
Copy link
Contributor Author

eric-s-raymond commented May 19, 2022

Thanks, that's good to know. I'll be quite happy to write under tcell's Apache license, I'm very accustomed to contributing to open source. Also I'm a former maintainer of ncurses.

In the absence of a SimulationScreen I could actually use, I've been chasing my bug with log messages to a disk file, narrowing down the bug location that way. It looks like there might be some more general problem with writes to offscreen locations - in my example composing a TUI status line that is too wide for the screen. Do you know of anything in tcell's data structures that might create such a vulnerability?

I've been sketching a design in my head and I think it can be pretty simple. There might not even be a requirement for a special terminal type. Suppose there were a method of Screen that could be called before Init that did two things:

  • Replaced normal input polling with a read from a character channel it returns

  • Turned off the messing with tty settings that Init() and Fini() normally do.

That would actually do most of what's needed to de-complicate driving and monitoring TUI code. Two more methods would be useful:

  • Redirect tcell output to a specified io.Writer()
  • Change the way terminfo control strings are shipped so that all characters become printable.

With these facilities it wouldn't be required to have any magic TERM value

@gdamore
Copy link
Owner

gdamore commented May 20, 2022

I actually know who you are. I'm a little surprised our paths haven't crossed before this, actually.

Your approach to simulation could certainly work... be advised however that the actual APIs are carefully designed to abstract some things for screens because e.g. the Windows console API works quite differently than a TTY based terminal.

As far as writing to offscreen locations, I don't think it will actually work in tcell directly, because tcell itself only has knowledge more or less of the "physical" screen. A higher layer (for example look in the views directory for a somewhat incomplete example) can provide logical windows (views in that case!) that might extend quite a lot larger than what tcell can actually display.

Essentially tcell keeps its "canvas" clipped to the size of the physical screen, rather some kind of logical frame buffer behind it might do.

You might find that what you're looking for is actually a higher level toolkit, like the views package. A number of such examples have been written. I think the README has links to a few of them. If you wanted to start writing up your own, I would recommend looking at the views directory, as it has a pretty starting point and I have built some actual functional UIs with it. (Of course, this is an area that gets very opinionated, and you might find the direction I took with views to not be to your taste. But looking won't hurt. ;-)

@eric-s-raymond
Copy link
Contributor Author

You are correct, there is no issue with offscreen writes. I mistakenly thought there might be one because I couldn't see the actual panic trace. I have ideas about how to fix the library to surface that panic trace. I'll try to budget some time for this over the next month.

I did look at some higher-level widget libraries. The problem I ran into is that they're all so poorly documented that I judged getting up to speed in them would take longer than doing the simple things I wanted to do in bare tcell. And, in fact, I now have the browsing interface I wanted done; there are a few features still needed, but I don't expect them to pose any real difficulties if your disengage()/engage() pair works as expected

I'm leaving this issue open in a browser tab to remind me to write those entry points I described upthread.

Finally: It's a long shot, but if you're in easy travel distance of Charlotte NC and free 10-12 June I'll be doing a talk about Shimmer there. Possibly a keynote address, that hasn't been decided yet.

https://southeastlinuxfest.org/

@gdamore
Copy link
Owner

gdamore commented May 24, 2022

Thanks.

I won't be anywhere near there at that time, and have other plans those dates anyway. But good luck!

@gdamore
Copy link
Owner

gdamore commented May 24, 2022

As to surfacing a panic, I think the idiomatic defer / recover solution is best. I just don't know how to regularly expose that via API.

I suppose we could do this automatically if we find a callback that we are executing panics, but that only covers a small percentage of possible panics. Really the application should do this as part of its setup, because you want to catch any panic when you leave the application's scope.

I more "complete" application framework (one where the user's code all runs as callbacks/events, sort of like what the views package does) might do this for the application. Although you'd still miss panics that originate from disassociated goroutines, I think.

If you panic'd without deferring your shutdown of the screen, you'd have seen some attempt to write stuff to your screen, although it might be garbled and your tty settings and other settings (such as mouse, colors, etc.) might be other than what you'd like.

By deferring the screen.Fini(), you wind up disabling the alternate screen buffer on the terminal emulator, and whatever was written there by panic is not displayed.

I'm not aware of a better solution to this problem, generally.

@eric-s-raymond
Copy link
Contributor Author

eric-s-raymond commented May 26, 2022

I may have a solution to the panic problem,. It's this:

func main() {
	s, err := tcell.NewScreen()
	if err != nil {
		croak("error during screen allocation: %s", err)
	}

	if err := s.Init(); err != nil {
		croak("error during screen initialization: %s", err)
	}

	defer func() {
		maybePanic := recover()
                s.Fini()
		if maybePanic != nil {
			panic(maybePanic)
		}
	}()

        // Application code goes here
}

That should clean up tcell's state and then re-raise the panic where it's visible. Can you think of any obvious reason this wouldn't work?

@eric-s-raymond
Copy link
Contributor Author

Amending this: I've just verified that that trick works by adding it, and a command to induce a panic, to my TUI.

I'm going to write some patches for your documentation now.

@gdamore
Copy link
Owner

gdamore commented May 26, 2022

Thanks. I have been thinking of doing the same since this conversation started. I'd welcome your contribution here.

@eric-s-raymond
Copy link
Contributor Author

Working on it now. MR to follow shortly.

@eric-s-raymond
Copy link
Contributor Author

OK, I now have a branch in which the Screen interface has a SetDebug method, only actually implemented in tScreen, Setting the flag to true does the following things:

  1. Disables messing with the TTY modes

  2. Tells the TPuts method to render non-printables in what it ships as backslash escapes, C style escapes the ESC is rendered as \e because we want to be able to recognize the ANSI escape-sentence leader easily.

This was actually a remarkably small and well-confined change.

It appears PollEvent already reads from an event queue fed by a goroutine in tty_unix_go, so that part is easy.

I'm thinking about how to test this now.

@eric-s-raymond
Copy link
Contributor Author

set-debug.txt

I'm tossing you a text patch rather than an MR because it's not ready to merge, needing a bit of detail I think you can fill in better than I.

At the point in engage() where Start() is called, if t.debug is true t.tty needs to be initialialized so writes to it go to os.Stdout. I also expect something unhelpful to happen when t.tty.WindowSize() is called after Start() has been skipped.

@eNV25
Copy link
Contributor

eNV25 commented Jun 18, 2022

Any chance of enabling that at build time?

Something like this:

// main.go
if debug {
	...
}
...
// debug_true.go
//go:build debug
const debug = true
// debug_false.go
//go:build !debug
const debug = false

And you should be able to enable debugging with go build -tags debug.

The advantage here is that you don't need to add a SetDebug method to Screen. And debugging code would be removed from normal builds because of dead code elimination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants