Detection of plain strings #1347

Open
anotherbridge opened this issue Feb 16, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@anotherbridge
Contributor

Describe the solution you'd like
In some cases a full regex search is unnecessary, because the "regex" you want to match is really just a plain string. It might therefore be useful to skip the regex search when the pre-check with a single keyword has already succeeded. I guess this could speed up detection significantly, especially if a ruleset contains many such rules. An example would be a list of known leaked credentials that should be included as rules but that do not follow any specific pattern that can be nicely formulated as a regex.

If there is an existing method or a better way of doing this with the features gitleaks already implements, I would be glad to hear about it.

Describe alternatives you've considered
At the moment I am building rules like that as follows:

[[rules]]
id = "known-leaked-credential-0"
description = "Known Leaked Credential"
tags = ["leak"]
regex = '''<my-leaked-credential>'''
keywords = [
    "<my-leaked-credential>",
]
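
To make the request concrete, here is a minimal Python sketch of the proposed two-stage check. It is not how gitleaks is implemented; the names `is_plain_literal` and `rule_matches` are hypothetical, and the case-insensitive keyword pre-check is an assumption made for illustration.

```python
import re
from typing import Sequence

# Characters that carry special meaning in a regular expression.
_REGEX_META = set(r"\.+*?()|[]{}^$")


def is_plain_literal(pattern: str) -> bool:
    """Return True if the pattern contains no regex metacharacters."""
    return not any(ch in _REGEX_META for ch in pattern)


def rule_matches(text: str, pattern: str, keywords: Sequence[str]) -> bool:
    """Hypothetical two-stage check: keyword pre-filter, then literal or regex match."""
    lowered = text.lower()
    # Stage 1: cheap keyword pre-check; skip the rule if no keyword occurs.
    # (Case-insensitive here, which is an assumption of this sketch.)
    if keywords and not any(k.lower() in lowered for k in keywords):
        return False
    # Stage 2: a literal pattern only needs a case-sensitive substring search.
    if is_plain_literal(pattern):
        return pattern in text
    # Otherwise fall back to the full regex search.
    return re.search(pattern, text) is not None
```

Because the keyword pre-check in this sketch ignores case, a successful pre-check alone does not prove a match, so the literal branch still runs an exact substring search instead of returning early.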

cc @zricethezav

@anotherbridge anotherbridge added the enhancement New feature or request label Feb 16, 2024
@rgmz
Contributor

rgmz commented Feb 16, 2024

I think it would be worth doing a benchmark. My intuition is that keywords + regex matching literals is already fast enough that any potential speedup would be negligible.
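
A rough sketch of what such a benchmark could look like, written in Python for brevity. Gitleaks itself is written in Go, so these timings say nothing about gitleaks' actual performance; a meaningful measurement would have to use Go's regex engine and gitleaks' own detection loop. The input text and the names `find_plain`, `find_regex`, and `benchmark` are made up for illustration.

```python
import re
import timeit

# Hypothetical input: a synthetic "file" with the literal near the end.
SECRET = "my-leaked-credential"
TEXT = ("lorem ipsum dolor sit amet\n" * 2000) + f"password = {SECRET}\n"
# The same literal, compiled as an (escaped) regular expression.
PATTERN = re.compile(re.escape(SECRET))


def find_plain(text: str = TEXT) -> bool:
    """Plain substring search."""
    return SECRET in text


def find_regex(text: str = TEXT) -> bool:
    """Regex search for the same literal."""
    return PATTERN.search(text) is not None


def benchmark(repeats: int = 200) -> dict:
    """Time both approaches and return the total seconds for each."""
    return {
        "plain": timeit.timeit(find_plain, number=repeats),
        "regex": timeit.timeit(find_regex, number=repeats),
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f}s")
```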

@somkanade-arzooo

@anotherbridge In the example above, are you trying to detect all secrets that contain a given word/string?

@sergiomarotco
Contributor

sergiomarotco commented May 21, 2024

@anotherbridge I am using the same kind of rule:

[[rules]]
id = "known-leaked-credentials"
description = "Known Leaked Credentials"
tags = ["leak"]
regex = '''(pass1|pass2|pass3|.............Pass1000)'''

The rule contains about 1000 corporate and previously compromised passwords known to me.
The regex string is very long and uncomfortable to work with.
I can't get this config to work:

regexES = [
    '''pass1''',
    '''pass2''',
    '''pass3''',
    ...
    '''Pass1000''',
]

The TOML structure is valid and Gitleaks runs, but the rule does not work (it finds nothing).
Can you share the resulting structure if you managed to reduce the rule to line-by-line form?
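
One possible workaround, assuming a rule only accepts a single `regex` key (which would explain why the `regexES` list is silently ignored): keep the passwords one per line in a separate file and generate the long alternation from it. The sketch below is a hypothetical helper, not part of Gitleaks. `quote_meta` escapes the same characters as Go's `regexp.QuoteMeta`, on the assumption that the generated rule is compiled by Go's regex engine.

```python
from typing import Iterable

# Metacharacters escaped by Go's regexp.QuoteMeta.
_META = set(r"\.+*?()|[]{}^$")


def quote_meta(literal: str) -> str:
    """Backslash-escape regex metacharacters so the text matches literally."""
    return "".join("\\" + ch if ch in _META else ch for ch in literal)


def build_rule(credentials: Iterable[str],
               rule_id: str = "known-leaked-credentials") -> str:
    """Build one TOML rule whose regex is an alternation of all credentials."""
    # Strip whitespace/newlines and drop empty lines.
    cleaned = [line.strip() for line in credentials]
    entries = [quote_meta(c) for c in cleaned if c]
    if not entries:
        raise ValueError("no credentials given; an empty group would match everything")
    # A TOML multi-line literal string cannot contain three single quotes in a row.
    if any("'''" in e for e in entries):
        raise ValueError("a credential contains ''' and cannot be emitted as-is")
    regex = "(" + "|".join(entries) + ")"
    return "\n".join([
        "[[rules]]",
        f'id = "{rule_id}"',
        'description = "Known Leaked Credentials"',
        'tags = ["leak"]',
        f"regex = '''{regex}'''",
    ]) + "\n"
```

Usage could look like `print(build_rule(open("passwords.txt")))`, where `passwords.txt` is a hypothetical file with one password per line. The password list stays readable and diff-friendly, and the generated rule has the same shape as the single-regex rule above.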

4 participants