Huge outputs #1056
-
Have you considered outputting the report in CSV format instead? I believe most CSV readers process the file line by line, so you shouldn't run into OutOfMemory issues.
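As a rough illustration of the line-by-line idea, here is a minimal Python sketch that tallies findings per rule while holding only one row in memory at a time. The column names (`RuleID`, `File`) and the sample rows are placeholders for illustration, not necessarily the headers gitleaks writes.

```python
import csv
import io
from collections import Counter


def count_by_rule(lines, column="RuleID"):
    """Tally rows per rule; csv.DictReader pulls one row at a time."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Hypothetical sample data. For a real report, pass a file object
# opened with open(path, newline="") instead of a StringIO.
sample = io.StringIO(
    "RuleID,File\n"
    "aws-access-token,a.txt\n"
    "generic-api-key,b.txt\n"
    "aws-access-token,c.txt\n"
)
counts = count_by_rule(sample)
```

Because the reader is an iterator, memory use stays roughly constant regardless of how many rows the report contains.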
-
In general, CSVs should be able to support multi-line strings. In gitleaks, we might have to tweak how the CSV is written, e.g. by enclosing multi-line strings in quotes. As a simple PoC, here is a simple CSV:
And when I open it in Excel, it shows the multi-line string in a single cell. If Excel can do it, we can do it...
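To make the quoting point concrete, here is a small Python sketch (not gitleaks code, which is written in Go) showing that a standard CSV writer wraps a field containing a newline in quotes, and that a standard reader restores it as a single cell. The column names and values are made up for illustration.

```python
import csv
import io

# Hypothetical rows: the second field of the data row spans two lines.
rows = [["id", "secret"], ["1", "line one\nline two"]]

# With the default QUOTE_MINIMAL setting, the writer quotes any field
# that contains a line break, delimiter, or quote character.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Reading it back yields the multi-line value as one field again.
parsed = list(csv.reader(io.StringIO(text, newline="")))
```

The catch for consumers is that a record can then span several physical lines, so the file has to be read with a real CSV parser rather than split on newlines.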
-
@jiri-bocan which type of leak? Please also give us an example of a secret you are seeing that is a false positive (scramble the secret before posting it).
-
Hello team,
I just came across a peculiar issue.
Imagine you have a repository with on the order of 10,000 to 1,000,000 leaks, and the desired output format is JSON, i.e., the output JSON file will be on the order of 10 to 1,000 MB. The question is how to process such a huge file. Conventional approaches that load the whole file usually end with an OutOfMemory exception. Therefore, I need to write a custom JSON parser.
Can I take for granted that the JSON file will always be expanded ("beautified": one item per line, indented), or can it happen that the JSON is collapsed (the whole JSON on one line, with no newline characters, indentation, or extra whitespace)? So far, the JSON always seems to be expanded, with a depth of 1. Would it be possible to provide a switch that ensures both SARIF and JSON output is either expanded (easier to parse and read) or collapsed (e.g., to save disk space)?
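For what it's worth, a streaming reader does not have to depend on the layout at all. The following is a minimal Python sketch, assuming the report is a single top-level JSON array of objects; it uses the standard library's `json.JSONDecoder.raw_decode` to pull one element at a time from a growing buffer. The function name `iter_json_array` and the field names in the samples are hypothetical.

```python
import io
import json


def iter_json_array(stream, chunk_size=65536):
    """Yield elements of a top-level JSON array one at a time.

    Assumes the elements are objects (as in a findings report); bare
    top-level numbers split across chunks are not handled. Separators
    are skipped leniently, so this is not a strict validator.
    """
    decoder = json.JSONDecoder()
    buf = stream.read(chunk_size).lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]
    eof = False
    while True:
        # Drop whitespace and the comma between elements.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        if buf:
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                if eof:
                    raise  # truncated or malformed input
            else:
                # Only accept a value once more input follows it (or at EOF).
                if end < len(buf) or eof:
                    yield item
                    buf = buf[end:]
                    continue
        elif eof:
            raise ValueError("unexpected end of input before ']'")
        chunk = stream.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True


# Hypothetical samples: the same data expanded and collapsed.
expanded = '[\n {"RuleID": "a", "StartLine": 1},\n {"RuleID": "b", "StartLine": 2}\n]\n'
collapsed = '[{"RuleID":"a","StartLine":1},{"RuleID":"b","StartLine":2}]'

# A tiny chunk size forces elements to straddle chunk boundaries.
from_expanded = list(iter_json_array(io.StringIO(expanded), chunk_size=7))
from_collapsed = list(iter_json_array(io.StringIO(collapsed), chunk_size=7))
```

Only the current element and one chunk are buffered at a time, so memory use depends on the size of the largest element rather than the size of the file. Re-parsing the buffer after each failed attempt is simple but not the fastest approach; a dedicated streaming JSON library would be the more robust choice.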
Thank you in advance for your reply,
jiri
PS: Another option would be to tell gitleaks to output only the first N leaks (e.g., 50,000), which would also limit the output size.
PPS: After upgrading from Gitleaks 8.8 to 8.15, the number of detected leaks of a certain type skyrocketed. I hope there is no error in the corresponding regexes...