Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CodeQL workflow #1721

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jorgectf
Copy link

Hello from GitHub Security Lab!

Your repository is critical to the security of the Open Source Software (OSS) ecosystem and as part of our mission to make OSS safer, we are contributing a CodeQL configuration for code scanning to your repository. By enabling code scanning with CodeQL, you will be able to continuously analyze your code and surface potential vulnerabilities before they can even reach your codebase. In fact, you may have seen some alerts already appearing on this pull request!

We’ve tested the configuration manually before opening this pull request and adjusted it to the needs of your particular repository, but feel free to tweak it further! Check this page for detailed documentation.

Questions? Check out the FAQ below!

FAQ

Click here to expand the FAQ section

How often will the code scanning analysis run?

By default, code scanning will trigger a scan with the CodeQL engine on the following events:

  • On every pull request — to flag up potential security problems for you to investigate before merging a PR.
  • On every push to your default branch and other protected branches — this keeps the analysis results on your repository’s Security tab up to date.
  • Once a week at a fixed time — to make sure you benefit from the latest updated security analysis even when no code was committed or PRs were opened.

What will this cost?

Nothing! The CodeQL engine will run inside GitHub Actions, making use of your unlimited free compute minutes for public repositories.

What types of problems does CodeQL find?

By default, code scanning runs the default query suite.

How do I upgrade my CodeQL engine?

No need! New versions of the CodeQL analysis are constantly deployed on GitHub.com; your repository will automatically benefit from the most recently released version.

The analysis doesn’t seem to be working

If you get an error in GitHub Actions that indicates that CodeQL wasn’t able to analyze your code, please follow the instructions here to debug the analysis.

Which source code hosting platforms does code scanning support?

GitHub code scanning is deeply integrated within GitHub itself. If you’d like to scan source code that is hosted elsewhere, we suggest that you create a mirror of that code on GitHub.

@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@madrob
Copy link
Contributor

madrob commented Jun 29, 2023

@jorgectf Thanks for starting us on this! I took a brief look at the scan output and notices some things that I'd like to make sure we're able to do before moving forward on the PR. I tried to look for answers in the docs, but didn't see how to make the appropriate configurations, so I'm coming back to you.

  1. We need to be able to exclude certain paths. For example, I don't think we're interested in the javascript scan results for solr/webapp/web/libs/* as that is all third party code that we don't intend to change.
  2. Is there a way to mark certain methods as returning sanitized inputs? There are some very long taint analysis chains provided by CodeQL which is awesome, but some of them seem to miss that a value becomes safe at some point.
  3. Is there a way to run this or similar analysis locally for developers? We have tried very hard in the past to avoid "surprises" in PRs where the developer runs into a failing check but has no way to verify that they've actually solved it other than to keep resubmitting their PR. It tends to lead to some bad experiences for contributors which we tend to care quite a bit about.
  4. Who will the report be visible to? Project members? Everybody?

@dsmiley
Copy link
Contributor

dsmiley commented Jun 29, 2023

What do we gain here over Sonatype Lift? We are already using Lift, which is a PR based workflow to existing static analyzers like Errorprone, FindSecBugs, and Infer. https://help.sonatype.com/lift/configuring-lift

https://lift.sonatype.com/results/github.com/apache/solr

# The branches below must be a subset of the branches above
branches: [ 'main' ]
schedule:
- cron: '27 21 * * 4'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be a cron job?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL libraries and queries are open-sourced in https://github.com/github/codeql. They are heavily contributed to by CodeQL engineers and the security/development community. By running CodeQL weekly, we make sure you have the most up-to-date alerts in the Security tab regardless when the last change was made to Solr.

@jorgectf
Copy link
Author

@madrob

  1. We need to be able to exclude certain paths. For example, I don't think we're interested in the javascript scan results for solr/webapp/web/libs/* as that is all third party code that we don't intend to change.

You can use paths-ignore in a CodeQL configuration file. See https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/customizing-code-scanning#specifying-directories-to-scan. f9c7be6 should do the trick :)

  1. Is there a way to mark certain methods as returning sanitized inputs? There are some very long taint analysis chains provided by CodeQL which is awesome, but some of them seem to miss that a value becomes safe at some point.

That is what we call "Sanitizers". You are welcome to contribute to https://github.com/github/codeql or open a False Positive Issue so we can take a look and improve the query!

  1. Is there a way to run this or similar analysis locally for developers? We have tried very hard in the past to avoid "surprises" in PRs where the developer runs into a failing check but has no way to verify that they've actually solved it other than to keep resubmitting their PR. It tends to lead to some bad experiences for contributors which we tend to care quite a bit about.

That would be technically possible, but not comfortable for the PR author IMO. The pull request alerts should be easily actionable so the author knows what to do, and when they succeeded fixing the alert.

  1. Who will the report be visible to? Project members? Everybody?

The alerts will be visible in the Security tab of the repository for those with at least write access to the repository and security managers.

@madrob
Copy link
Contributor

madrob commented Jun 30, 2023

@dsmiley I think this produces a different set of results than our other scanning tools. At least, it produces results that I have not personally reviewed before. It took a bit of work to figure out, but you can preview what CodeQL would generate over at https://github.com/apache/solr/security/code-scanning?query=is%3Aopen+branch%3Amain+pr%3A1721

@madrob
Copy link
Contributor

madrob commented Jul 17, 2023

@dsmiley Also relevant, I just saw that Lift is going away, so we'll need some replacement anyway.

@dsmiley
Copy link
Contributor

dsmiley commented Jul 18, 2023

FYI I requested removal of Lift https://issues.apache.org/jira/browse/INFRA-24802

@risdenk
Copy link
Contributor

risdenk commented Oct 15, 2023

I merged main into this to get the latest results.

Copy link

This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the dev@solr.apache.org mailing list. Thank you for your contribution!

@github-actions github-actions bot added the stale PR not updated in 60 days label Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale PR not updated in 60 days
Projects
None yet
5 participants