Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tmp must not exit the process on its own on SIGINT #216

Closed
silkentrance opened this issue Jan 24, 2020 · 11 comments
Closed

tmp must not exit the process on its own on SIGINT #216

silkentrance opened this issue Jan 24, 2020 · 11 comments

Comments

@silkentrance
Copy link
Collaborator

silkentrance commented Jan 24, 2020

originates from #193

To recap:

@ferm10n the reason for that is that SIGINT handlers are expected to terminate the process by calling process.exit(), see https://nodejs.org/api/process.html for more information. Here is an excerpt that explains why tmp currently (and in the past) will enforce the process exit while also attaching its SIGINT listerner last to the chain of available listener.

'SIGTERM' and 'SIGINT' have default handlers on non-Windows platforms that reset the terminal 
mode before exiting with code 128 + signal number. If one of these signals has a listener installed, its 
default behavior will be removed (Node.js will no longer exit).

And, AFAIK, this was the original behaviour of tmp before that I put my dirty fingers on it :) but I might be wrong. Memory fails over the years...

Regardless, the current approach is prone to failure, especially if the application itself installs a SIGINT listener that will exit upon its own, so the tmp listener might never be called.

I will rethink and I might already know a proper solution, based on your input

a) tmp never exits the process
b) tmp registers its SIGINT handler as the first handler in the list of handlers

this, however, requires the application to move any files or dirs of interest and which are currently under "supervision" by tmp to a different location, in case that these must be kept.

Unless, of course, that the opts specified that the file or dir must be kept.

also, the cleanup callback must be refactored so that it will not throw an exception if a file or directory is missing.

@silkentrance
Copy link
Collaborator Author

silkentrance commented Jan 24, 2020

TODO this change requires existing tests that rely on tmp exiting the process on terminating signals to install additional signal handlers.

@silkentrance
Copy link
Collaborator Author

silkentrance commented Jan 27, 2020

The main problem with this is that, if the user does not install a sigint handler, the application will keep on running and will not exit. So this needs to be communicated in the documentation.

  1. Alternatively, we could have a tmp.exitOnInterruption method that will allow the user from opting out of the standard behaviour, which is to exit on SIGINT.

  2. Alternatively, we could also get rid of the interrupt handler altogether and leave it to the application to call on process.exit() on SIGINT. The standard tmp provided exit handler should still be called.

@ferm10n @raszi what do you think? which would be best? keep the original behaviour and make it optional (opt out) or get rid of the SIGINT listener altogether, provided that node will call our exit handler on process exit and will not use some SIGINT specific code path? (need to test this)

Personally, and as suggested by you, I would go for the second approach, as this will get rid of some code, provided that our exit handler is still being called, otherwise we need to go the first route.
And this would be a compatibility breaking change, too, as users in the past might have relied on tmp to exit the process, so this should be communicated well.

As for support for jest and node sandboxing, this is a completely different topic, including the complexity of figuring out previously installed handlers.

@raszi
Copy link
Owner

raszi commented Jan 27, 2020

Signal handling is a mess and by now it is clear for me that we cannot have a clear solution or a good enough workaround for that.

As I expressed in a different ticket I believe we should not try to add any listener but we should provide a generic clean-up callback, what the users of tmp could call and that would get rid of the files generated so far.

There are a lot of different use-cases and cleaning up the temporary files when the node process exists is just one of them, there should also a way to clean-up temporary files regularly.

@silkentrance
Copy link
Collaborator Author

silkentrance commented Jan 27, 2020

@raszi I am glad that you are still alive and kicking. I was already wondering whether something bad has happened.

Lets see about these new requirements, having a generic cleanup callback in a Jest/node sandboxed environment should be similar to the current process exit listener.

However, this should be an additional installment in respect to the standard on process exit listener.

As for continuous garbage collection, this should not be made a part of tmp itself, rather it should be made a call, see the generic cleanup callback. And it will then be left to the user to repeatedly call the garbage collector. Which in turn might add some extra complexity on whether a file is currently open and cannot be deleted and so on. Not to mention overall complexity with directory trees, where a specific file in a given directory is still being required for reading and so on.
So automated and periodic cleanup is currently out of the question unless the process exits.

And also given node sandboxes in respect to the Jest testing framework, the references to the previous instances of tmp controlled filesystem objects will be lost. By that, we need to keep track of the garbage using the listeners alone, unless we add the required information to the process environment as process variables, i.e. keeping track of objects that need to be removed on either process exit or forced garbage collection.

The latter, as already mentioned is rather complicated in respect to possibly deeply nested directories.

But give me some time on thinking about a proper solution, unless you have one already.

@ferm10n
Copy link

ferm10n commented Feb 2, 2020

I've been doing a bit of thinking on this and some research into where the handlers were introduced.

I have a couple questions though:

  1. If there are other SIGINT handlers not installed by tmp, isn't it reasonable to assume that those will be responsible for calling process.exit? If that is true, then maybe that could be the condition on which tmp actually calls exit or not (kinda how doExit is used). I know that the check for existingListeners happens when tmp installs its own listener, but maybe that should happen inside tmp's SIGINT handler itself after SIGINT is actually called. That way things won't get weird if application SIGINTs are added after tmp's. I'll try to post a PR tomorrow night to show what I mean...
    but I do understand why not the SIGINT calls process.exit and I think I can see a clean way to do it. I still agree that exposing a garbage collection method is a good idea.

  2. Why are there handlers for both process.on('exit') and process.on(SIGINT)? Wouldn't the exit handler be sufficient? If it is, maybe that eliminates the need for SIGINT handling overall! I'll keep doing research to see what the circumstances were (PRs and commits and whatnot) that introduced the exit handler in addition to SIGINT.

@silkentrance
Copy link
Collaborator Author

silkentrance commented Feb 3, 2020

@ferm10n Thank you very much for participating on this.

Let's start with answering the second question first:

In the past, AFAIK node v6 and maybe also later versions, exit handlers would not be called with the node standard SIGINT handler. Therefore we introduced the SIGINT handler as requested by a user. And with early Jest sandboxing, this became even more of a hassle, leading to the current "install listeners safely" situation.

With Jest being fixed now, we still need to see whether the standard SIGINT handler will call the exit handlers.

process.addListener('exit', function () { console.log('exiting'); });

function loop() {
  console.log('waiting for SIGINT');
  setTimeout(loop, 1000);
}

loop();

Running this shows that the exit handler will not be called, even with node v13.
So one needs to install a custom SIGINT handler that will then call process.exit().

process.addListener('exit', function () { console.log('exiting'); });
process.on('SIGINT', process.exit);

function loop() {
  console.log('waiting for SIGINT');
  setTimeout(loop, 1000);
}

loop();

And this works as expected.

See #121 (original issue) and #192 (tmp must not exit the process on its own) and #129 (original sandboxing by jest caused tmp to install listeners multiple times).

See also nodejs/node#2853 for more information on why the SIGINT handler will not trigger the process exit event handlers.

And according to this nodejs/node-v0.x-archive#457 (comment) it is also unsafe to call exit(), however, how would you then be able to terminate the process?

@silkentrance
Copy link
Collaborator Author

silkentrance commented Feb 3, 2020

@ferm10n Now for question one

Of course, this would be a good solution. As for exposing the garbage collection function, well this is definitely an option but might result in some other problems, where the user has worker threads that will call the garbage collector while other threads are still working with some temporary files a/o directories. The garbage collector does not know whether a resource is still in use and will simply remove the files and directories, leading to unwanted/unspecified side effects and errors even.
On windows this is not that much of a problem as files will always be locked by the system. On Linux and MacOS this is more problematic as there are no intrinsic locks to a file or directories even.

I am currently looking into various file lock packages on npm but I do not think that @raszi would appreciate adding yet another dependency. And I even started prototyping my own naive solution that is quite similar to the https://www.npmjs.com/package/lockfile package, with the locking mechanism being self-advisory only, i.e. the garbage collector would check on a file or directory still being locked and will then refrain from garbage collecting it.

And doing proper mandatory locking on Linux seems quite complicated: https://www.thegeekstuff.com/2012/04/linux-file-locking-types/.

And also there is always the keep option but this leaves it entirely to the user to actually call the cleanup callback, so we might get away with that and keep the solution as simple as possible.

@silkentrance
Copy link
Collaborator Author

silkentrance commented Feb 3, 2020

@ferm10n So if we would have a tmp.removeGarbage() function, it should be communicated well to the user that this is a risky operation. Also, the removal must be async in order to not stall the application. So again we have two different modes of operation here, one that uses the async cleanup callbacks and one that has to use the sync cleanup callbacks (exit handler requires that all operation is synchronous).

@raszi
Copy link
Owner

raszi commented Feb 3, 2020

Or to drop this automatic functionality altogether.

We provide a cleanupCallback with which the users of the library could signal that the files could be removed.

We could provide a final cleanup function which could be called whenever the user feels like (usually on exit) and that would clean up all the created files.

@silkentrance
Copy link
Collaborator Author

@raszi see #233. the sigint handler has been dropped altogether.

@silkentrance
Copy link
Collaborator Author

Closing this in favor of #233 which tackles this issue and others as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants