Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow translating from javascript files #15612

Open
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

cofiem
Copy link
Contributor

@cofiem cofiem commented Mar 17, 2024

This PR enables translating strings in javascript.
It demonstrates the js translation in use by applying it to the timeago.js js util.
Note that the js util fetch-gettext.js returns a promise which is the Fetch API promise.

Please consider:

  • any concerns with this approach?
  • how to get the plurals to be included in the make translations output?

Changes:

  • enables extracting strings from javascript using pybabel,
  • adds a route that renders the appropriate translation using the existing machinery, and
  • adds js util for obtaining the translated string.

In progress:

  • tests
  • getting plurals in make translations output from js strings

Issues:

  • I can't seem to change the function name, and the plurals don't seem to be extracted, both are similar to this pybabel issue

Relates to issues:

@di
Copy link
Member

di commented Mar 19, 2024

Very cool! I like this approach a lot, I think this will definitely work.

One thing I'm wondering: since we probably won't have a ton of JS translations, would it make sense to load them all for a given locale in a bundle via a single request, as opposed to making N different requests?

@cofiem
Copy link
Contributor Author

cofiem commented Mar 19, 2024

One thing I'm wondering: since we probably won't have a ton of JS translations, would it make sense to load them all for a given locale in a bundle via a single request, as opposed to making N different requests?

I'm not sure I understand this.

I found a way to use the server-side rendering of translations because:

  • I can't see that there's a way to get just the js translation strings to make a bundle of translation strings
  • each translation can have up to 6 variants, which are chosen based on the num parameter
  • some translations have placeholders in the string, and I didn't want to have to find or make a way to fill the placeholders in js

If you're concerned about the server load or frequency of requests, the browser cache seems to work well, and there's likely some additional caching to be done in the server-side view that could reduce the need to execute the localiser.pluralize or localizer.translate.

Please let me know if I've misunderstood what you're asking.

@cofiem
Copy link
Contributor Author

cofiem commented Mar 19, 2024

I managed to get the plurals extracted.

It looks like python-babel expects particular function names.

TODO:

  • I'll include some screenshots of the js translations in action.
  • I may also have time to go through and make the rest of the text that's been identified in the other issues as translatable.
  • fix the couple of failing tests

Question:

  • Is there an existing way to cache the output of a view that depends on the request.params? That would work well for caching the translation view on the server side.
    Maybe origin_cache?

@di
Copy link
Member

di commented Mar 20, 2024

If you're concerned about the server load or frequency of requests, the browser cache seems to work well, and there's likely some additional caching to be done in the server-side view that could reduce the need to execute the localiser.pluralize or localizer.translate.

Yes, I was thinking about balancing the number of requests with the size of the requests. It's probably an over-optimization though, our CDN will cache these responses by default so it's not the load on our backend that I was concerned with, just the number of requests the user's browser would have to make after the page has loaded. I think it's fine to move forward as-is, just wanted to think through it a bit.

Is there an existing way to cache the output of a view that depends on the request.params? That would work well for caching the translation view on the server side.

By default, we cache all requests, but strip all URL parameters when caching a given view (with some exceptions). We can add an exception for this view. This happens here:

https://github.com/pypi/infra/blob/9568346c6781a6d9e754e9edc3883dd34febd97b/terraform/warehouse/vcl/main.vcl#L103-L126

@cofiem
Copy link
Contributor Author

cofiem commented Mar 24, 2024

The only remaining question is what @di was asking - is the number of requests made by the browser a problem?

Some example screenshots of the js translations in use.

Screenshot 2024-03-24 181338
Screenshot 2024-03-24 181556
Screenshot 2024-03-24 181952
Screenshot 2024-03-24 182303

@cofiem
Copy link
Contributor Author

cofiem commented Mar 24, 2024

Test errors:

> msg += " " + " ".join(results["feedback"]["suggestions"])
E TypeError: unsupported operand type(s) for +=: 'LazyString' and 'str'

This one could be fixed by adding str(msg), but there might be a better way?

- msg += " " + " ".join(results["feedback"]["suggestions"])
+ msg = str(msg) + " " + " ".join(results["feedback"]["suggestions"])

> return request.localizer.translate(_translation_factory(message, **kwargs))
E AttributeError: 'NoneType' object has no attribute 'localizer'

I'm not sure how to fix this error. Other form tests seem to be able to set request=pretend.stub(), but the Validator directly uses from warehouse.i18n import localize as _, which errors at request.localizer.

@cofiem cofiem marked this pull request as ready for review March 24, 2024 10:40
@cofiem cofiem requested a review from a team as a code owner March 24, 2024 10:40
warehouse/forms.py Outdated Show resolved Hide resolved
@miketheman
Copy link
Member

Not to completely derail this direction, but had you looked at https://www.npmjs.com/package/@zainulbr/i18n-webpack-plugin or such at all? We currently use Webpack to generate bundles, so it'd be cool if we could have those localized and served once, instead of on demand for each page.

@cofiem
Copy link
Contributor Author

cofiem commented Apr 6, 2024

@miketheman commit 8fb15dd is a draft of an alternate approach that generates a messages.json file as part of make translations.
It uses gettext.js to show the translations.

It's a draft (I couldn't get gettext.js to import / require properly).

Comparison:

server-side approach:
pros:

  • no changes to existing translation generation
  • already implemented

cons:

  • may make many http requests for translations

client-side approach:
pros:

  • generates js translation file as part of translation generation
  • loads all translations needed in js as part of page https requests
  • no additional requests

cons:

  • additional js dependency
  • includes all translations in page, even though only one will ever be needed
  • requires changing the translation generation

@cofiem
Copy link
Contributor Author

cofiem commented Apr 6, 2024

I had a look at that webpack plugin, and some other approaches.
They all seemed to assume that the translations were only done in js, and so included all translations. I also couldn't see how to do plurals.
Since only a very few translations are needed in js, gettext.js seemed like the most appropriate approach.

@miketheman miketheman added javascript requires change to JavaScript files i18n Internationalization labels Apr 19, 2024
@miketheman
Copy link
Member

@cofiem Taking another look here at the alternative approach from commit 8fb15dd

  1. It sounds like messages.json would contain all translated strings for all languages, is that correct?

  2. If so, can we fingerprint the file (again webpack might help here, it does this for other assets already) so we don't have worry about cDN/cache busting?

  3. On failure to load either the js lib or messages.json (degrades gracefully), would any strings end up on the page as a fallback?

If the answer to all of these "yes", it seems like this approach would be better for now.

@cofiem
Copy link
Contributor Author

cofiem commented Apr 20, 2024

Alright, I'll try to explain what I've done.

I attempted an approach similar to https://www.npmjs.com/package/@zainulbr/i18n-webpack-plugin
i.e. embed the translations into separate js bundles, one for each language.
Unfortunately, the existing webpack plugins were not suitable (issues with webpack 5 and/or did not work for the warehouse setup).
However, I did manage to create a webpack plugin that does the job (in webpack.plugin.localize.js).

There is a problem with gettext.js: it does a kind of js eval, which is not allowed by the CSP: Uncaught EvalError: call to Function() blocked by CSP.
I've solved this by using webpack plugin to check the plural forms and embed them into the generated js bundles.

So, the development process with these changes:

  1. Use gettext or ngettext as appropriate in js files
  2. Run make translations: pybabel extracts messages from js files, then a python script generates messages.json files for only the KNOWN_LOCALES and only includes translations that are needed in js
  3. Run make static_pipeline: the webpack plugin embeds the translated data from the messages.json files
  4. At runtime, the matching js bundle for the current locale is set in base.html, and gettext.js is initialised with the translation data. It can determine plurals and do string templating.

These are the benefits of this approach:

  • Uses the existing translation and js tooling - make, pybabel, locale files, webpack
  • Enables plurals and template placeholders
  • Limits or does not use the hard to cache routes nor additional http requests
  • Limits the number of unmaintained packages required, and works with webpack 5
  • Straightforward to maintain and use as part of contributions

@cofiem cofiem requested a review from di April 24, 2024 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
i18n Internationalization javascript requires change to JavaScript files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants