Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize buffering for ResourceContainers #20

Open
3 tasks
alexstaeding opened this issue Nov 26, 2021 · 0 comments
Open
3 tasks

Optimize buffering for ResourceContainers #20

alexstaeding opened this issue Nov 26, 2021 · 0 comments
Labels
bug size:L This can be dealt with in 2-3 weeks

Comments

@alexstaeding
Copy link
Member

Currently, standard implementations of ResourceContainer such as ZipResourceContainer buffer the backing stream. This was a necessary safety precaution in the initial implementation of the launcher module (#12), as it is not initially clear how callers may use the stream. In the interest of time, this was the best solution.

As an example, this is the current state of ZipResourceContainer as of v0.2.1:
https://github.com/SourceGrade/Jagr/blob/30c517952cdcc53ce4f2c54ee6d6cc87a3d3a69e/launcher/src/main/kotlin/org/sourcegrade/jagr/launcher/io/ZipResourceContainer.kt#L57-L63

The result from the underlying ZipInputStream is buffered using ByteArrayResource:
https://github.com/SourceGrade/Jagr/blob/30c517952cdcc53ce4f2c54ee6d6cc87a3d3a69e/launcher/src/main/kotlin/org/sourcegrade/jagr/launcher/io/Resource.kt#L70-L73

Further compounding the issue, is the fact that zip.readBytes() creates its own buffer and then copies the array again to cut it to the correct size (IOStreams.kt in the Kotlin stdlib):

/**
 * Reads this stream completely into a byte array.
 *
 * **Note**: It is the caller's responsibility to close this stream.
 */
@SinceKotlin("1.3")
public fun InputStream.readBytes(): ByteArray {
    val buffer = ByteArrayOutputStream(maxOf(DEFAULT_BUFFER_SIZE, this.available()))
    copyTo(buffer)
    return buffer.toByteArray()
}

Why turn off buffering by default?

In cases where the original resource is directly transformed, it is not necessary to save the original state of the stream. While some buffering may be required for the transformation, that should be up to the transformer to implement. A "general" buffering strategy is not useful here and may significantly harm performance.,

A possible solution to this is to not buffer the standard implementations of ResourceContainer at all. Instead, for cases where a "general" buffer is required, provide a method similar to InputStream.buffered from the Kotlin stdlib (which returns a wrapped, buffered InpustStream) but applied to either ResourceContainer and/or Resource.

Deliverables

  • Remove all redundant array copies and stop buffering standard implementations of ResourceContainer by default
  • Provide an alternative method when buffering is needed
  • Document the effects of non-buffered resources
@juliushardt juliushardt self-assigned this Nov 26, 2021
@alexstaeding alexstaeding added this to the v0.3 milestone Nov 26, 2021
@alexstaeding alexstaeding added the semver:major A major change breaking backwards compatibility label Nov 26, 2021
@alexstaeding alexstaeding modified the milestones: v0.3, v0.4 Dec 7, 2021
@alexstaeding alexstaeding modified the milestones: v0.4, v0.5 Mar 3, 2022
@alexstaeding alexstaeding modified the milestones: v0.5, v0.6 Jun 1, 2022
@alexstaeding alexstaeding removed this from the v0.6 milestone Sep 23, 2022
@alexstaeding alexstaeding added size:L This can be dealt with in 2-3 weeks bug and removed semver:major A major change breaking backwards compatibility labels Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug size:L This can be dealt with in 2-3 weeks
Projects
Development

No branches or pull requests

2 participants