Skip to content

Can this module be used from other JavaScript actions? #55

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bahmutov opened this issue Nov 5, 2019 · 28 comments
Closed

Can this module be used from other JavaScript actions? #55

bahmutov opened this issue Nov 5, 2019 · 28 comments
Assignees
Labels
enhancement New feature or request

Comments

@bahmutov
Copy link

bahmutov commented Nov 5, 2019

I am trying to write a JavaScript action that runs npm ci and caches ~/.npm and ~/.cache/Cypress folders. I can use this action from YML file like this

# https://github.com/actions/cache
- name: Cache node modules
  uses: actions/cache@v1
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

- name: Cache Cypress binary
  uses: actions/cache@v1
  with:
    path: ~/.cache/Cypress
    key: cypress-${{ runner.os }}-node-${{ hashFiles('**/package.json') }}
    restore-keys: |
      cypress-${{ runner.os }}-node-

- name: install dependencies
  env:
    # make sure every Cypress install prints minimal information
    CI: 1
  run: npm ci

But I would like to have our own Cypress action that does the caching and NPM install for the user, something like

- name: Install NPM and Cypress
  uses: @cypress/github-action

This means from Cypress GH action JS code I need to call restore and save files in this action. I have tried but without success, for example

// after executing `npm ci` successfully
const homeDirectory = os.homedir()
const npmCacheDirectory = path.join(homeDirectory, '.npm')

core.saveState('path', npmCacheDirectory)
core.saveState('key', 'abc123')
core.saveState('restore-keys', 'abc123')

console.log('loading cache dist restore')
require('cache/dist/restore')

which gives me

done installing NPM modules
loading cache dist restore
##[error]Input required and not supplied: path
##[error]Node run failed with exit code 1

I think this particular error is due to a different core singleton between "regular" npm module and ncc- bundled actions/cache one. But in general, do you have by any chance a plan to document how to use this module from JavaScript? I could write TypeScript and bundle it as a top level action, which should bundle actions/cache code I think.

@byCedric
Copy link

byCedric commented Nov 5, 2019

I don't think states are accessible from other actions, as you can see here. Also, it's kind of hard to use this repository's code as npm dependency due to the ncc bundled code, like you mentioned. However, I'm also interested if we can implement the caching mechanism in 3rd party actions.

Right now, I'm trying to implement the necessary API calls based on actions/cache, and I don't see any reason why it wouldn't work. We have the same security tokens, and access to the API doesn't seem to be limited to 1st party apps only. But, it does require some additional work actions/cache is doing here already.

Wouldn't it make sense to create a toolkit package that handles most of the cache code?

@bahmutov
Copy link
Author

bahmutov commented Nov 5, 2019

absolutely would make sense to create a caching utility that other actions could reuse programmatically - and I think even rewriting this module would make sense. For example - when you require dist/restore you immediately get "run()" that executes - but does not return a promise, so the code that requires it cannot wait on it. For now I think I can clone this repo and use it myself via code, but eagerly would use an official module instead

@chrispat
Copy link
Member

chrispat commented Nov 5, 2019

Saving state is meant to allow a given instance of an action in a workflow to save some information that can be read in its post entrypoint if it defines one. It is not meant for other actions to consume. To output information for other actions to consume you should define outputs.

Right now e do not have any plans to create a core sharable module for the caching service.

@DanTup
Copy link

DanTup commented Nov 12, 2019

I'd like to see this too. I created actions for setting up Dart and Flutter, for example:

      - name: Setup Dart ${{ matrix.dart-version }}
        uses: DanTup/gh-actions/setup-dart@master
        with:
          channel: ${{ matrix.dart-version }}

Since the Dart stable branch doesn't change very often (and similar for Flutter) it seems ideal for caching, however I don't want to do it in the workflow file because:

  1. The URL for the stable build never changes, even when a new build is made (this makes it difficult or impossible to know when to invalidate the cache)
  2. Users of the action shouldn't really need to pollute their workflow file with extra junk just to get a small performance increase (if they have to install 10 dependencies that all need their own caching configuration in the yaml file, it's going to explode)

It would be nice if the action could just manage this for the user. It may expose a flag to opt in/out, but ideally it would just always be on and be transparent (and reliable). In the action I can send a request for a JSON file that describes the version, figure out if it's one that already exists in the cache, and if so use that. It's actually what I thought https://github.com/actions/toolkit/tree/master/packages/tool-cache was going to help me to do, but sadly it appears not :-)

@bahmutov
Copy link
Author

We have cloned this repo and factored out core.getInput to make sure functions like saveCache and restoreCache can be imported and used. See the changes and example use in this internal pull request cypress-io#1

@joshmgross joshmgross added the enhancement New feature or request label Nov 13, 2019
@joshmgross
Copy link
Member

I've marked this as enhancement

A few concerns I have with a separate library for this:

  • Users may unknowingly be using caching if it's built into another action, causing other caches to be wasted or evicted early due to per-repo storage limits
  • The cache is only supported for push and pull_request events, and consumers of a library will have to know and handle that accordingly
  • Caching is not always ideal or optimal for all projects/scenarios, and a default caching from a setup action may delay or harm build times
  • Caching in a separate setup action limits how much configuration a user has when setting up caching

While I appreciate the integration and ease of use that can come from setup actions adding caching via a library, I think explicit use of the cache is preferred. Maybe setup actions could have a set of outputs that make it easy to start using caching?

Example:

- name: Setup Dart ${{ matrix.dart-version }}
  id: setup-dart
  uses: DanTup/gh-actions/setup-dart@master
    with:
      channel: ${{ matrix.dart-version }}
- name: Cache
  uses: actions/cache@v1
  with:
    path: ${{ steps.setup-dart.outputs.cache-path }}
    key: ${{ steps.setup-dart.outputs.cache-key }}

@joshmgross joshmgross self-assigned this Nov 13, 2019
@DanTup
Copy link

DanTup commented Nov 13, 2019

I do understand the concerns, but actions could easily provide a flag to let users enable/disable the cache. It's not in action authors interests to build something that works sub-optimally, since people won't use their actions.

The problem with the example above, is that if you have multiple things to set up, you'll very quickly end up with tens of lines of junk just to get a (potentially very small) benefit of caching. Most likely you just won't bother. One of my workflow files already has like 47 lines) of "setup bloat" in it.

Of course, I could wrap some of that up into an action that's stored in the same repo - but then we're back to the same request - how do I implement the caching in the action instead of the workflow file?

When coding, if a function gets too big or complex, we might extract parts of it into smaller functions. It feels like the same should be possible here - if my workflow file gets large and complex, I should be able to extract a chunk of it into an action. It's hard to do that if some features only work embedded directly in the workflow file.

@byCedric
Copy link

I have to agree with Danny, but I'd like to think about how the separated cache approach is supposed to work. For example, how does the original setup action "knows" there's a cache? Or is the cache action always executed before any other action, and because of that, it should only check a directory?

@joshmgross
Copy link
Member

how does the original setup action "knows" there's a cache

Right now, there would be no way to know. Checking the directory and proper placement of the setup step in a workflow is required. We've talked about the concept of "pre" actions that run before all other non-pre actions, but we don't support that at the moment (similar to the "post" action that actions/cache uses to save).

@bahmutov
Copy link
Author

Just as another data point - I have published https://github.com/bahmutov/npm-install which makes any Node project a breeze to use - because it takes care of NPM installation (or yarn)

name: main
on: [push]
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    name: Build and test
    steps:
      - uses: actions/checkout@v1
      - uses: bahmutov/npm-install@v1
      - run: npm t

I doubt users will be confused by what it does - most projects just need npm ci or yarn --frozen-lockfile on CI before continuing onto npm run build, npm test, tec.

@Cyberbeni
Copy link

If you don't want people to accidentally use caching then add an "allows implicit caching" property to jobs that defaults to false that just makes restore/save just do nothing in the importable version of the caching action.

@chrispat
Copy link
Member

chrispat commented Dec 5, 2019

@Cyberbeni I am not sure how adding a property would do anything really becasue the action would have to respect that setting. What we would likely have to do is have that setting control the scopes in the runtime token and for the given job you simply would not have permissions to read or update the cache.

@DanTup
Copy link

DanTup commented Dec 5, 2019

Is the issue here one of convenience/defaults, or security? My understanding was that an action basically has full access to everything running anyway (it can run arbitrary processes?) so it could easily read/write data that comes from/goes to the cache? (eg. you should never use actions you don't completely trust?)

@Cyberbeni
Copy link

@chrispat Can't you just copy paste the cache action code into your action to be able to use caching? If yes, then having a property you have to respect is no different, just would be more convenient.

@joshmgross
Copy link
Member

@Cyberbeni Yes that's an option.

We're working on moving the core cache logic into a package (@actions/cache) that will live in the actions toolkit

@Skycoder42
Copy link

@joshmgross Any updates on this yet?

@joshmgross
Copy link
Member

No updates at this time, but this is definitely something we plan on doing in the future.

@eregon
Copy link
Contributor

eregon commented Mar 21, 2020

👍 for this, it would allow to use caching for setup-* actions without a lot of tricky-to-write boilerplate: caring about cache keys when just wanting to setup some tool leaks a lot of implementation details. The setup-* has a much better knowledge of what would be an appropriate cache key and could do all that transparently with cache in the toolkit.

@eregon
Copy link
Contributor

eregon commented Mar 21, 2020

Also, now that there is a artifact package in the toolkit, this feels like the logical next step.

@ktmud
Copy link

ktmud commented Apr 12, 2020

I found a way to monkeypatch this module to allow core.getState to read states specific to a cache key. It's very hacky, but can probably serve as an alternative before the official support comes in.

To do this, we must import the source code. Therefore we have to update tsconfigs.json to include node_modules/@actions:

{
  "compilerOptions": {
    "target": "es6",
    "module": "commonjs",
    "lib": ["esnext"],
    "moduleResolution": "node",
    "outDir": "./lib",
    "rootDir": ".",
    "strict": true,
    "noImplicitAny": true,
    "esModuleInterop": true,
    "preserveSymlinks": true,
  },
  "include": [
    "./src",
    "./node_modules/@actions"
  ],
  "exclude": ["**/*.test.ts", "__tests__"]
}

With this, I managed to create a Github Action that was able to setup multiple cache layers in one step. E.g.

steps:
- uses: actions/checkout@v2
- uses: ktmud/cached-dependencies@v1
  with:
    parallel: true
    run: |
      cache-restore npm
      npm install
      cache-save npm

      cache-restore pip
      pip install -r requirements.txt
      cache-save pip

      cache-restore cypress
      cd cypress/
      npm install
      cache-save cypress

More examples can be found here and here.

@eregon
Copy link
Contributor

eregon commented Apr 16, 2020

@chrispat @ethomson Could you prioritize this?
In the current state the manual caching with this action is so verbose (10 lines vs 1-2 lines) that many people don't use it.
This seems to notably result in excessive requests to RubyGems.org: rubygems/rubygems#3463

If there was a actions/cache package, we could automatically enable caching for ruby/setup-ruby and many other actions.

@aiqiaoy
Copy link
Contributor

aiqiaoy commented Apr 29, 2020

@eregon I'm picking this up. Hopefully it will get done next week.

@eregon
Copy link
Contributor

eregon commented May 1, 2020

One more reason for this is it's actually very difficult to write a correct cache key if caching any result of native compilation such as Ruby gems with C extensions.
As an example of a problem due to this: sass/sassc-ruby#183 (comment)

Probably the key should include OS, architecture, Ruby engine, exact Ruby version, Gemfile/Gemfile.lock contents, and maybe even processor model version.
This is all easier, cleaner and more flexible if done by some action instead of hardcoded in everyone's workflows.

@LinusU
Copy link

LinusU commented May 1, 2020

Another use case:

I'm currently using this workflow:

    - name: Use Node.js 12.15.0
      uses: actions/setup-node@v1
      with:
        node-version: 12.15.0

    - name: Cache dependencies
      id: cache-node-modules
      uses: actions/cache@v1
      with:
        path: node_modules
        key: ${{ runner.os }}-node-12.15.0-modules-${{ hashFiles('package-lock.json') }}

    - name: Install dependencies
      run: npm ci
      if: steps.cache-node-modules.outputs.cache-hit != 'true'

The problem is that the key leaves some things to be desired, like including the current processor arch, not including "version" in package-lock since that doesn't affect dependencies, etc. It also has the problems that the node_modules folder is written to the cache in a post step, instead of directly after npm ci. Potentially some script could have modified something in-between. I also want to save node_modules to the cache even if a later npm test step fails.

I would love it if I could just do:

    - name: Use Node.js 12.15.0
      uses: actions/setup-node@v1
      with:
        node-version: 12.15.0

    - name: Install dependencies w/ cache
      uses: LinusU/npm-install@v1

@oprypin
Copy link

oprypin commented May 2, 2020

I found that there are quite a few cases where people stumbled upon @actions/tool-cache and tried to use it for caching within the action. I spent quite some time under its illusion as well ("why isn't it caching anything???") before searching to find this issue.

@aiqiaoy
Copy link
Contributor

aiqiaoy commented May 15, 2020

@actions/cache@0.1.0 is released. Please try it out and file issues under the toolkit repo if you encountered any issue. Feedback is welcomed!

Code is here.

Cache action v2 (which we are releasing later this month) will be using the cache node package #313

@aiqiaoy
Copy link
Contributor

aiqiaoy commented May 20, 2020

@actions/cache@0.2.1 is released with a couple of fixes around zstd compression.

@aiqiaoy aiqiaoy closed this as completed May 20, 2020
@thisismydesign
Copy link

thisismydesign commented May 22, 2020

This is great, I managed to simplify our caching a lot:

1-liner NPM cache: https://github.com/c-hive/gha-npm-cache
1-liner Yarn cache: https://github.com/c-hive/gha-yarn-cache

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests