New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search for the CodeQL bundle in multiple places. #123
Conversation
e6f00e5
to
5254a0d
Compare
5254a0d
to
813cb04
Compare
for (let downloadSource of uniqueDownloadSources) { | ||
let [apiURL, repository] = downloadSource; | ||
// If we've reached the final case, short-circuit the API check since we know the bundle exists and is public. | ||
if (apiURL === GITHUB_DOTCOM_API_URL && repository === CODEQL_DEFAULT_ACTION_REPOSITORY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue you could simplify the logic by removing this default case from the list of potential download sources, and moving this return statement to after the loop.
This would have a number of knock-on advantages:
- We get rid of the worrying "could not find an accessible CodeQL bundle" error case too which is actually dead code at the moment.
- It would also let you simplify the list of potential sources to just the repository name, as the instance URL is no longer needed there as it's always the current instance.
- We can avoid adding
octokit
as a explicit dependency and thus keep the size of this PR down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code can definitely be simplified, but I don't think we can easily remove the default case from the list, as we rely on filtering out the non-unique sources to handle the case where INSTANCE_API_URL
is the same as GITHUB_DOTCOM_API_URL
so we can break out early.
I've restructured the code slightly to achieve some of the advantages you've mentioned while preserving the deduplication , but let me know if I've missed a neater way of doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it up to you to decide if it's simpler but I think this function is equivalent to the following which doesn't use a look at all. You'd better check my working though as I found it hard getting my head around the logic to work out if the two implementations are indeed equivalent or not.
async function checkRepoForCodeQLBundle(instanceApiUrl: string, repository: string): Promise<string | undefined> {
let [repositoryOwner, repositoryName] = repository.split("/");
try {
const release = await api.getApiClient().repos.getReleaseByTag({
owner: repositoryOwner,
repo: repositoryName,
tag: CODEQL_BUNDLE_VERSION
});
for (let asset of release.data.assets) {
if (asset.name === CODEQL_BUNDLE_NAME) {
core.info(`Found CodeQL bundle in ${repository} on ${instanceApiUrl} with URL ${asset.url}.`);
return asset.url;
}
}
} catch (e) {
core.info(`Looked for CodeQL bundle in ${repository} on ${instanceApiUrl} but got error ${e}.`);
}
return undefined;
}
async function getCodeQLBundleDownloadURL(): Promise<string> {
const instanceApiUrl = getInstanceAPIURL();
const codeQLActionRepository = getCodeQLActionRepository();
if (codeQLActionRepository !== CODEQL_DEFAULT_ACTION_REPOSITORY) {
const assetUrl = await checkRepoForCodeQLBundle(instanceApiUrl, codeQLActionRepository);
if (assetUrl !== undefined) {
return assetUrl;
}
}
if (instanceApiUrl !== GITHUB_DOTCOM_API_URL) {
const assetUrl = await checkRepoForCodeQLBundle(instanceApiUrl, CODEQL_DEFAULT_ACTION_REPOSITORY);
if (assetUrl !== undefined) {
return assetUrl;
}
}
return `https://github.com/${CODEQL_DEFAULT_ACTION_REPOSITORY}/releases/download/${CODEQL_BUNDLE_VERSION}/${CODEQL_BUNDLE_NAME}`;
}
I agree, this seem does seem like a security risk if they pick a non-default repository name. Also I think it's unlikely it would ever usefully trigger.
Thank you for the description. Also for raising two improvements to the actions core code. |
@@ -139,13 +142,32 @@ async function getCodeQLBundleDownloadURL(): Promise<string> { | |||
core.info(`Looked for CodeQL bundle in ${downloadSource[1]} on ${downloadSource[0]} but got error ${e}.`); | |||
} | |||
} | |||
throw new Error("Could not find an accessible CodeQL bundle."); | |||
return `https://github.com/${CODEQL_DEFAULT_ACTION_REPOSITORY}/releases/download/${CODEQL_BUNDLE_VERSION}/${CODEQL_BUNDLE_NAME}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it's possible to run into an error on line 142 above and then still return the default URL for the bundle.
This seems dangerous to me. If a user specified a bundle that can't be found, should we end the action with an error so this can be fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, to me the alternative seems more dangerous. People expect to be able to fork the CodeQL Action and have it still work, but if we require that CodeQL exists on the forked repository that won't necessarily be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the implementation that's there now is equivalent to how it would work now. Currently the download URL is hardcoded to point to the default action, unless a fork explicitly changes that behaviour. So that makes me lean towards keeping it as it is and always checking the default action.
Note that if a user specifies to use a specific bundle by setting the tools
input then all of this implementation is skipped and we just use the value they provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a couple more points for potential minor improvements but I'm going to tentatively approve this with the goal of unblocking it. This looks good to go to me assuming it's had enough manual testing.
const CODEQL_DEFAULT_ACTION_REPOSITORY = "github/codeql-action"; | ||
|
||
function getInstanceAPIURL(): string { | ||
return process.env["GITHUB_API_URL"] || GITHUB_DOTCOM_API_URL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that this environment variable should always be populated now, so the default is unnecessary. We could use util.getRequiredEnvParam
to ensure it exists too. Do you agree?
for (let downloadSource of uniqueDownloadSources) { | ||
let [apiURL, repository] = downloadSource; | ||
// If we've reached the final case, short-circuit the API check since we know the bundle exists and is public. | ||
if (apiURL === GITHUB_DOTCOM_API_URL && repository === CODEQL_DEFAULT_ACTION_REPOSITORY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it up to you to decide if it's simpler but I think this function is equivalent to the following which doesn't use a look at all. You'd better check my working though as I found it hard getting my head around the logic to work out if the two implementations are indeed equivalent or not.
async function checkRepoForCodeQLBundle(instanceApiUrl: string, repository: string): Promise<string | undefined> {
let [repositoryOwner, repositoryName] = repository.split("/");
try {
const release = await api.getApiClient().repos.getReleaseByTag({
owner: repositoryOwner,
repo: repositoryName,
tag: CODEQL_BUNDLE_VERSION
});
for (let asset of release.data.assets) {
if (asset.name === CODEQL_BUNDLE_NAME) {
core.info(`Found CodeQL bundle in ${repository} on ${instanceApiUrl} with URL ${asset.url}.`);
return asset.url;
}
}
} catch (e) {
core.info(`Looked for CodeQL bundle in ${repository} on ${instanceApiUrl} but got error ${e}.`);
}
return undefined;
}
async function getCodeQLBundleDownloadURL(): Promise<string> {
const instanceApiUrl = getInstanceAPIURL();
const codeQLActionRepository = getCodeQLActionRepository();
if (codeQLActionRepository !== CODEQL_DEFAULT_ACTION_REPOSITORY) {
const assetUrl = await checkRepoForCodeQLBundle(instanceApiUrl, codeQLActionRepository);
if (assetUrl !== undefined) {
return assetUrl;
}
}
if (instanceApiUrl !== GITHUB_DOTCOM_API_URL) {
const assetUrl = await checkRepoForCodeQLBundle(instanceApiUrl, CODEQL_DEFAULT_ACTION_REPOSITORY);
if (assetUrl !== undefined) {
return assetUrl;
}
}
return `https://github.com/${CODEQL_DEFAULT_ACTION_REPOSITORY}/releases/download/${CODEQL_BUNDLE_VERSION}/${CODEQL_BUNDLE_NAME}`;
}
@@ -139,13 +142,32 @@ async function getCodeQLBundleDownloadURL(): Promise<string> { | |||
core.info(`Looked for CodeQL bundle in ${downloadSource[1]} on ${downloadSource[0]} but got error ${e}.`); | |||
} | |||
} | |||
throw new Error("Could not find an accessible CodeQL bundle."); | |||
return `https://github.com/${CODEQL_DEFAULT_ACTION_REPOSITORY}/releases/download/${CODEQL_BUNDLE_VERSION}/${CODEQL_BUNDLE_NAME}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the implementation that's there now is equivalent to how it would work now. Currently the download URL is hardcoded to point to the default action, unless a fork explicitly changes that behaviour. So that makes me lean towards keeping it as it is and always checking the default action.
Note that if a user specifies to use a specific bundle by setting the tools
input then all of this implementation is skipped and we just use the value they provided.
The aim of this change is to resolve a number of issues with where the CodeQL bundle is fetched from. In many cases it is either not possible or not desired to fetch the CodeQL bundle from the canonical location at
https://github.com/github/codeql-action/releases/download/codeql-bundle-20200630/codeql-bundle.tar.gz
. This might be because you are using the Action on a runner that does not have internet access. It may also be because you've copied the CodeQL Action repository to your own organization so that you can control what code is running on your runner machines.The general idea of this change was inspired by the way the Actions runner attempts to find an Action. We can apply a similar principle to finding the CodeQL bundle. This process works by looking for the CodeQL bundle in a place that's as local as possible and then working outwards. It checks if the bundle exists:
It downloads the bundle from the first place it finds it. For symmetry I initially considered also checking GitHub.com at the repository name for the current Action, however I decided against that due to the accidental security risk someone might expose themselves to if they incorrectly mirror the Action without copying the bundle.
To check whether the bundle exists in these locations we use the GitHub API. This is required because on GHES you must authenticate to access the API, even for accessing public projects. The one exception to this is when we're finally getting the bundle from the canonical location. In this case we do not use the API. This is because if we access the API from GHES we cannot authenticate with our token, meaning we have to make the request anonymously. You could easily run out of API rate limit making these anonymous requests. Because we know for sure the bundle exists in the canonical location we can safely download it directly rather than using the API.
This change currently relies on two hacks to work around limitations in other parts of Actions. I've opened pull requests to solve these limitations but it's unclear when they might be reviewed. The two pull requests are:
downloadTool(...)
method of the toolcache.Sorry for the excessive description. This change ended up being a bit more complicated than I expected and I wanted to make it clear why I made the design decision I have.
Merge / deployment checklist