Extend the capabilities of convert_ruby_to_v8 #128

Taiki-San · 2019-01-26T14:25:22Z

Extend the capabilities of convert_ruby_to_v8 to convert more types, and to work better when the encoding of strings isn't UTF-8.

This new code will call to_s when it's available in order to avoid simply sending an error message to V8.
Moreover, this PR makes the string conversion code more resilient to varying string encodings, since UTF-8 is not always used. A fast path handles the most common encodings (UTF-8, ASCII, Latin1), and a slower path performs the actual conversion when more exotic encodings are used.
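
As a rough illustration of the behaviour described above, here is a minimal Ruby model of the `to_s` fallback and the fast/slow split. The PR implements this in C++; `FAST_PATH_ENCODINGS` and `prepare_for_v8` are hypothetical names used only for this sketch, and the real fast path may avoid transcoding altogether.

```ruby
# Hypothetical model only: the actual change lives in the C++ extension.
FAST_PATH_ENCODINGS = [
  Encoding::UTF_8,
  Encoding::US_ASCII,
  Encoding::ISO_8859_1 # Latin1
].freeze

# Returns [path, utf8_string], where path is :fast or :slow.
def prepare_for_v8(value)
  # Non-string values are stringified with to_s instead of being rejected.
  str = value.is_a?(String) ? value : value.to_s

  path = FAST_PATH_ENCODINGS.include?(str.encoding) ? :fast : :slow

  # Transcode to UTF-8. For a string already tagged UTF-8 this is a no-op
  # and does NOT validate the bytes; validation is a separate step.
  [path, str.encode(Encoding::UTF_8)]
end
```

For example, `prepare_for_v8(42)` takes the fast path because `Integer#to_s` returns a US-ASCII string, while a UTF-16LE string takes the slow path and is transcoded.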

SamSaffron · 2019-02-06T05:02:46Z

I am seeing lots of failures in Travis. Can you have a look? Overall I welcome this change, but we need to make sure it works everywhere.

Taiki-San · 2019-02-10T10:06:32Z

Oh, you're correct, I forgot to include a header. This should be good to go!

SamSaffron · 2019-02-10T21:36:51Z

Hmmm, the UTF-8 checker seems to come from https://github.com/lemire/fastvalidate-utf-8 ... we need to properly attribute this.

SamSaffron · 2019-02-10T21:38:21Z

Also, I wonder if we can just go slower and carry a lot less code by force encoding if we get something that is not UTF-8 from Ruby? I am uneasy about carrying all this extra C++ code.

Taiki-San · 2019-02-10T22:24:04Z

This change was initially instigated on our side as a performance optimisation. Due to the context in which our code is used, we have to be extremely robust to weird encodings and avoid any corruption of strings being transmitted to V8. This change resulted in substantial performance improvements in our benchmark (10-20% faster end-to-end; the overhead is now quite probably close to nil) over a similar Ruby implementation. Because those constraints are quite unusual, I can see why the maintenance tradeoff wouldn't be worth it for you.

If you prefer it as it was, I can roll back the encoding changes. I believe force_encoding only changes the string metadata and thus is pointless if we're about to collect its bytes into a C++ object.
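
The point about force_encoding can be checked in plain Ruby. The following sketch (variable names are illustrative) contrasts relabelling a Latin1 string with actually transcoding it:

```ruby
# One byte, 0xE9, which is "é" in Latin1 (ISO-8859-1).
latin1 = "\xE9".b.force_encoding(Encoding::ISO_8859_1)

# force_encoding only changes the encoding tag; the bytes stay the same.
relabelled = latin1.dup.force_encoding(Encoding::UTF_8)

# encode rewrites the bytes into the target encoding.
transcoded = latin1.encode(Encoding::UTF_8)

p relabelled.bytes            # [233]
p relabelled.valid_encoding?  # false: a lone 0xE9 is not valid UTF-8
p transcoded.bytes            # [195, 169]
p transcoded.valid_encoding?  # true
```

So a consumer that copies the raw bytes sees no difference after force_encoding; only encode (or an equivalent conversion on the C++ side) produces valid UTF-8.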

If you feel comfortable proceeding with our changes, I agree that I initially didn't properly credit the UTF-8 checking code. Besides in a header, where do you think I should update the credit?

cataphract · 2019-02-12T12:39:04Z

@Taiki-San @SamSaffron I've added the notices required for compliance with the terms of the Apache 2 license, under which the added header is licensed.

As to whether the header is necessary: not really. Ruby has the same functionality for validating UTF-8: we can call valid_encoding?. It's just faster this way. The UTF-8 validation is critical in our case and represents a big chunk of our overhead, so this is a valuable improvement for us.
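
For reference, the pure-Ruby check mentioned above might look like the following sketch. The helper name `utf8_clean?` is hypothetical; `String#valid_encoding?` is the built-in method the comment refers to.

```ruby
# Hypothetical helper: true only for strings tagged UTF-8 whose bytes
# form a valid UTF-8 sequence.
def utf8_clean?(str)
  str.encoding == Encoding::UTF_8 && str.valid_encoding?
end

p utf8_clean?("caf\u00e9")  # true
p utf8_clean?("\xFF\xFE")   # false: 0xFF never appears in valid UTF-8
```

The bundled C++ validator answers the same question; the difference discussed in this thread is speed, not behaviour.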

SamSaffron · 2019-02-12T20:41:12Z

@cataphract / @Taiki-San I am open to accepting this on the condition that we aim for this to be temporary. Can you open a change request with Ruby (an issue at https://bugs.ruby-lang.org/issues and a PR to the ruby/ruby repo) to expose a "fast" valid_encoding? to C extensions? Then, longer term, in 2.7 we can remove this extra file.

ahorek · 2019-12-23T01:29:51Z

Too late for Ruby 2.7, but Shopify/ruby#2 looks promising!

Extend the capabilities of convert_ruby_to_v8 to convert more types, and to work better when the encoding of strings isn't UTF-8.

c9354d8

Add a forgotten header

5b0afc8

Taiki-San force-pushed the backport/improve-string-conversion branch from b90b2f9 to 5b0afc8 on February 10, 2019 10:04

Comply with the terms of the Apache license

70d2a56

awinograd mentioned this pull request Apr 3, 2019

Marshalling nested rb Date to js Date is not working. #136

Open
