Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to automatically delete multipart temporary files #5653

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

ikhoon
Copy link
Contributor

@ikhoon ikhoon commented Apr 30, 2024

Motivation:

Armeria does not automatically delete the uploaded files, so users should manually remove the temporary files themselves. It would be useful if we provided some options for how to delete multipart temporary files.

Related: #5652

Modifications:

  • Add MultipartRemovalStrategy that is used to determine how to delete multipart files. For now, three options are supported.
    • NEVER
    • ON_RESPONSE_COMPLETION
    • ON_JVM_SHUTDOWN
  • Add builder methods to server/virtualhost/service builders.

Breaking changes:

  • Multipart temporary files are now automatically removed when a response is fully sent. If you want to keep the existing behavior, use MultipartRemovalStrategy.NEVER.

Result:

  • You can now specify when to remove multipart temporary files using MultipartRemovalStrategy.
Server
  .builder()
  .multipartRemovalStrategy(MultipartRemovalStrategy.ON_RESPONSE_COMPLETION)

Motivation:

Armeria does not automatically delete the uploaded files, so users
should manually remove the temporary files themselves.
It would be useful if we provided some options for how to delete
multipart temporary files.

Related: line#5652

Modifications:

- Add `MultipartRemovalStrategy` that is used to determine how to delete
  multipart files. For now, three options are supported.
  - NEVER
  - ON_RESPONSE_COMPLETION
  - ON_JVM_SHUTDOWN
- Add builder methods to server/virtualhost/service builders.

Breaking changes:

- Multipart temporary files are now automatically removed when a
response is fully sent. If you want to keep the existing behavior,
use `MultipartRemovalStrategy.NEVER.

Result:

You can now specify when to remove multipart temporary files using
`MultipartRemovalStrategy`.

```java
Server
  .builder()
  .multipartRemovalStrategy(MultipartRemovalStrategy.ON_RESPONSE_COMPLETION)
```
}, blockingExecutorService);
break;
case ON_JVM_SHUTDOWN:
tempFile.toFile().deleteOnExit();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it result in a kind of resource leak, since the JVM needs to keep the list of files to remove on exit?

What do you think about this:

  1. Introduce an additional flag purgeMultipartUploadsOnExit (default: true) that registers a shutdown hook that scans the directory and remove the items.
  2. Remove ON_JVM_SHUTDOWN from (default)multipartRemovalStrategy.

Copy link
Contributor Author

@ikhoon ikhoon Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduce an additional flag purgeMultipartUploadsOnExit (default: true)

I think purgeMultipartUploadsOnExit and MultipartRemovalStrategy are mutually exclusive. If MultipartRemovalStrategy.ON_RESPONSE_COMPLETION is set, purgeMultipartUploadsOnExit means nothing. If some users want to keep the temporary files for n days as a backup even after the server restarts, they have to set both MultipartRemovalStrategy.NEVER and purgeMultipartUploadsOnExit=false.

that registers a shutdown hook that scans the directory and remove the items.

How about moving multipart files to a special directory such as temporary if ON_JVM_SHUTDOWN is set? The shutdown hook will delete all temporary directories under multipartUploadsLocations.

- service1-upload/
  - incomplete/
  - running/ <- for ON_RESPONSE_COMPLETION
  - complete/ <- for NEVER
  - temporary/ <- for ON_JVM_SHUTDOWN
- service2-upload/
  - incomplete/
  - running/
  - temporary/

Copy link
Contributor

@jrhee17 jrhee17 May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 for @ikhoon 's idea to at least group the user-facing APIs in one place. I do think we will need to add a shutdown hook somewhere if ON_JVM_SHUTDOWN is selected though.

Eventually, I imagine an API like the following (if very advanced features are needed)

val mps = MultipartRemovalStrategy.
  // builder methods
  .onJvmShutdown()
  .onResponseComplete()
  .never()
  // maybe advanced
  .builderForScheduledDeletion().poll(1 day).olderThan(7 days)
ServerBuilder.multipartRemovalStrategy(mps)

How about moving multipart files to a special directory such as temporary if ON_JVM_SHUTDOWN is set?

I'm not sure I understood the need to separate the directory between complete and temporary.

Also, would it make sense to somehow embed the server id for each multipart file in case multiple servers are running in the same jvm? This way, we can still delete files that are for a specific jvm and the flag ON_JVM_SHUTDOWN makes more sense.

e.g. <pid>-<server id>- as the prefix for the temporary files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if ON_JVM_SHUTDOWN is a really useful option, it is worth implementing the complicated functions. For now, I would like to delete ON_JVM_SHUTDOWN from the PR and make it unsupported. If users have requests for it, we would like to consider them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I would like to delete ON_JVM_SHUTDOWN from the PR and make it unsupported.

Sounds good to me.

@jrhee17
Copy link
Contributor

jrhee17 commented May 7, 2024

The build seems to be failing. Is this PR ready for review?

@ikhoon
Copy link
Contributor Author

ikhoon commented May 7, 2024

Fixed all failures. This PR is now ready for review.

@trustin
Copy link
Member

trustin commented May 7, 2024

There's still one test failure left.

@ikhoon
Copy link
Contributor Author

ikhoon commented May 7, 2024

I wonder if it is useful to synchronize the getters of VirtualHost and ServiceConfig.
MultipartRemovalStrategy is used when there is a matching service. So ServiceConfig will always exist. VirtualHost.multipartRemovalStrategy() won't be used internally or by users.

Anyway, I fixed the failure.

@trustin
Copy link
Member

trustin commented May 8, 2024

ServiceConfig will always exist. VirtualHost.multipartRemovalStrategy() won't be used internally or by users.

A user could do this:

serverBuilder
  .withVirtualHost("foo.com")
    .multipartRemovalStrategy(NEVER)
    .service(...)
    .service(...);

Then virtual host level properties will be useful? If we have any service-level default, we should not do that, so that virtual host level (or default virtual host level if not set at virtual host level) is used when service level property is unset.

Copy link
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀🚀🚀

routingCtx -> RequestId.of(1L),
ServerErrorHandler.ofDefault().asServiceErrorHandler(), NOOP_CONTEXT_HOOK);
final ServiceConfig config = newServiceConfig(HealthCheckService.builder().build(),
ServiceNaming.fullTypeName());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏👏👏

@minwoox
Copy link
Member

minwoox commented May 8, 2024

Could you check the failure?

com.linecorp.armeria.internal.server.MultipartTempFileRemovalTest.[2] type=/default - 
org.opentest4j.AssertionFailedError: 
Expecting value to be false but was true
	at jdk.internal.reflect.GeneratedConstructorAccessor35.newInstance(Unknown Source)
	at java.base@17.0.10/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base@17.0.10/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at app//com.linecorp.armeria.internal.server.MultipartTempFileRemovalTest.testRemovalStrategy(MultipartTempFileRemovalTest.java:79)
	at java.base@17.0.10/java.lang.reflect.Method.invoke(Method.java:568)
	at java.base@17.0.10/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	```

@minwoox
Copy link
Member

minwoox commented May 8, 2024

This isn't relevant to your PR but could you also fix this line? MultiPart -> Multipart

<type://MultiPart#toHttpRequest(String)> and transmitted to a server using a <type://WebClient>.

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me once #5653 (comment) is handled 👍 Thanks @ikhoon ! 🙇 👍 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically remove multipart temporary files
4 participants