Implement text encoding helpers on top of the new runtime #679

Merged: 6 commits merged on Jun 21, 2019

@dcodeIO dcodeIO commented Jun 19, 2019

As a follow-up to #564, this PR implements the API proposed in the old PR as helpers on top of the String class, namely String.UTF8 and String.UTF16, and removes the old lengthUTF8/fromUTF8/toUTF8 helpers.

The API is:

declare namespace String {
  /** Encoding helpers for UTF-8. */
  export namespace UTF8 {
    /** Calculates the byte length of the specified string when encoded as UTF-8, optionally null terminated. */
    export function byteLength(str: string, nullTerminated?: bool): i32;
    /** Encodes the specified string to UTF-8 bytes, optionally null terminated. */
    export function encode(str: string, nullTerminated?: bool): ArrayBuffer;
    /** Decodes the specified buffer from UTF-8 bytes to a string, optionally null terminated. */
    export function decode(buf: ArrayBuffer, nullTerminated?: bool): string;
    /** Decodes raw UTF-8 bytes to a string, optionally null terminated. */
    export function decodeUnsafe(buf: usize, len: usize, nullTerminated?: bool): string;
  }
  /** Encoding helpers for UTF-16. */
  export namespace UTF16 {
    /** Calculates the byte length of the specified string when encoded as UTF-16. */
    export function byteLength(str: string): i32;
    /** Encodes the specified string to UTF-16 bytes. */
    export function encode(str: string): ArrayBuffer;
    /** Decodes the specified buffer from UTF-16 bytes to a string. */
    export function decode(buf: ArrayBuffer): string;
    /** Decodes raw UTF-16 bytes to a string. */
    export function decodeUnsafe(buf: usize, len: usize): string;
  }
}
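
To make the byteLength contract above concrete, here is a minimal sketch in plain TypeScript of what the two functions are documented to compute. It is not the code from this PR: the names utf8ByteLength and utf16ByteLength are hypothetical stand-ins, and two details are assumptions, namely that the null terminator adds exactly one byte and that a lone surrogate counts as three bytes.

```typescript
// Sketch only: mirrors the documented contract of String.UTF8.byteLength
// and String.UTF16.byteLength. Function names are hypothetical.

/** UTF-8 byte length of a string held as UTF-16 code units. */
function utf8ByteLength(str: string, nullTerminated: boolean = false): number {
  // Assumption: a null terminator costs one extra byte.
  let len = nullTerminated ? 1 : 0;
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i);
    if (c < 0x80) {
      len += 1; // ASCII
    } else if (c < 0x800) {
      len += 2; // two-byte sequence
    } else if (
      c >= 0xd800 && c <= 0xdbff &&
      i + 1 < str.length &&
      (str.charCodeAt(i + 1) & 0xfc00) === 0xdc00
    ) {
      len += 4; // valid surrogate pair encodes one 4-byte code point
      i++;      // skip the low surrogate
    } else {
      len += 3; // rest of the BMP; assumption: lone surrogates also take 3 bytes
    }
  }
  return len;
}

/** UTF-16 byte length: two bytes per code unit. */
function utf16ByteLength(str: string): number {
  return str.length * 2;
}
```

A caller would typically use the UTF-8 length to size a buffer before encoding, passing nullTerminated = true when the bytes are handed to code that expects C strings.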

I believe that streaming encoders and decoders will be an entirely different beast, so I moved these helpers to a place better suited to minimal, fast helpers. Wdyt?

dcodeIO commented Jun 20, 2019

One thing to note here for implementers interfacing with C/Rust environments: because the API returns string and ArrayBuffer values instead of pointers, it is safe in managed scenarios, but that doesn't mean it can only be used in safe ways. For example, if a pointer to the bytes returned from String.UTF8.encode is required, the switch from managed to unsafe now happens after the return, with changetype<usize>(returnedArrayBuffer) essentially obtaining a pointer to the raw data. Of course, if done that way and such a pointer leaves managed boundaries, it must be __retained in case it sticks around longer than the immediate external function call (and __released once it is no longer needed), because the underlying ArrayBuffer wrapping the bytes could otherwise be collected in the meantime.

@dcodeIO dcodeIO merged commit b6feaab into master Jun 21, 2019

dcodeIO commented Jun 21, 2019
