READY - Stop DOS attacks by making the lexer stop early on evil input. #2892
Merged
Changes from 10 commits (11 commits total)
772084a (bbakerman): This stops DOS attacks by making the lexer stop early.
2caa273 (bbakerman): This stops DOS attacks by making the lexer stop early. Added BadSitua…
30ee65a (bbakerman): This stops DOS attacks by making the lexer stop early. Added BadSitua…
79b989c (bbakerman): This stops DOS attacks by making the lexer stop early. Added per quer…
888d123 (bbakerman): This stops DOS attacks by making the lexer stop early. Added whitespa…
1669ca7 (bbakerman): This stops DOS attacks by making the lexer stop early. Added whitespa…
077e64a (bbakerman): This stops DOS attacks by making the lexer stop early. Added whitespa…
17ec04b (bbakerman): This stops DOS attacks by making the lexer stop early. Added whitespa…
a0a7cfd (bbakerman): This stops DOS attacks by making the lexer stop early. Use array inste…
50206dd (bbakerman): This stops DOS attacks by making the lexer stop early. Use array inste…
0bd81e7 (bbakerman): PR feedback - renamed options and added SDL options
File: ParserOptions.java

@@ -13,32 +13,49 @@
 public class ParserOptions {

     /**
-     * An graphql hacking vector is to send nonsensical queries that burn lots of parsing CPU time and burn
-     * memory representing a document that wont ever execute. To prevent this for most users, graphql-java
+     * A graphql hacking vector is to send nonsensical queries that burn lots of parsing CPU time and burn
+     * memory representing a document that won't ever execute. To prevent this for most users, graphql-java
      * set this value to 15000. ANTLR parsing time is linear to the number of tokens presented. The more you
      * allow the longer it takes.
      *
      * If you want to allow more, then {@link #setDefaultParserOptions(ParserOptions)} allows you to change this
      * JVM wide.
      */
-    public static final int MAX_QUERY_TOKENS = 15000;
+    public static final int MAX_QUERY_TOKENS = 15_000;
+
+    /**
+     * Another graphql hacking vector is to send large amounts of whitespace in operations that burn lots of parsing CPU time and burn
+     * memory representing a document. Whitespace token processing in ANTLR is 2 orders of magnitude faster than grammar token processing,
+     * however it still takes some time to happen.
+     *
+     * If you want to allow more, then {@link #setDefaultParserOptions(ParserOptions)} allows you to change this
+     * JVM wide.
+     */
+    public static final int MAX_WHITESPACE_TOKENS = 200_000;

Review comment (on MAX_WHITESPACE_TOKENS): I think we should name this

+    private static ParserOptions defaultJvmParserOptions = newParserOptions()
+            .captureIgnoredChars(false)
+            .captureSourceLocation(true)
+            .captureLineComments(true)
+            .maxTokens(MAX_QUERY_TOKENS) // to prevent a billion laughs style attacks, we set a default for graphql-java
+            .maxWhitespaceTokens(MAX_WHITESPACE_TOKENS)
+            .build();
+
+    private static ParserOptions defaultJvmOperationParserOptions = newParserOptions()
+            .captureIgnoredChars(false)
+            .captureSourceLocation(true)
+            .captureLineComments(false) // #comments are not useful in query parsing
+            .maxTokens(MAX_QUERY_TOKENS) // to prevent a billion laughs style attacks, we set a default for graphql-java
+            .maxWhitespaceTokens(MAX_WHITESPACE_TOKENS)
+            .build();

     /**
-     * By default the Parser will not capture ignored characters. A static holds this default
+     * By default, the Parser will not capture ignored characters. A static holds this default
      * value in a JVM wide basis options object.
      *
      * Significant memory savings can be made if we do NOT capture ignored characters,
      * especially in SDL parsing.
      *
-     * @return the static default value on whether to capture ignored chars
+     * @return the static default JVM value
      *
      * @see graphql.language.IgnoredChar
      * @see graphql.language.SourceLocation

@@ -48,7 +65,20 @@ public static ParserOptions getDefaultParserOptions() {
     }

     /**
-     * By default the Parser will not capture ignored characters. A static holds this default
+     * By default, for operation parsing, the Parser will not capture ignored characters, and it will not capture line comments into AST
+     * elements. A static holds this default value for operation parsing in a JVM wide basis options object.
+     *
+     * @return the static default JVM value for query parsing
+     *
+     * @see graphql.language.IgnoredChar
+     * @see graphql.language.SourceLocation
+     */
+    public static ParserOptions getDefaultOperationParserOptions() {
+        return defaultJvmOperationParserOptions;
+    }
+
+    /**
+     * By default, the Parser will not capture ignored characters. A static holds this default
      * value in a JVM wide basis options object.
      *
      * Significant memory savings can be made if we do NOT capture ignored characters,

@@ -65,17 +95,35 @@ public static void setDefaultParserOptions(ParserOptions options) {
         defaultJvmParserOptions = assertNotNull(options);
     }

+    /**
+     * By default, the Parser will not capture ignored characters or line comments. A static holds this default
+     * value in a JVM wide basis options object for operation parsing.
+     *
+     * This static can be set to true to allow the behavior of version 16.x or before.
+     *
+     * @param options - the new default JVM parser options for operation parsing
+     *
+     * @see graphql.language.IgnoredChar
+     * @see graphql.language.SourceLocation
+     */
+    public static void setDefaultOperationParserOptions(ParserOptions options) {
+        defaultJvmOperationParserOptions = assertNotNull(options);
+    }

     private final boolean captureIgnoredChars;
     private final boolean captureSourceLocation;
     private final boolean captureLineComments;
     private final int maxTokens;
+    private final int maxWhitespaceTokens;
     private final ParsingListener parsingListener;

     private ParserOptions(Builder builder) {
         this.captureIgnoredChars = builder.captureIgnoredChars;
         this.captureSourceLocation = builder.captureSourceLocation;
         this.captureLineComments = builder.captureLineComments;
         this.maxTokens = builder.maxTokens;
+        this.maxWhitespaceTokens = builder.maxWhitespaceTokens;
         this.parsingListener = builder.parsingListener;
     }

@@ -117,7 +165,7 @@ public boolean isCaptureLineComments() {
     }

     /**
-     * A graphql hacking vector is to send nonsensical queries that burn lots of parsing CPU time and burn
+     * A graphql hacking vector is to send nonsensical queries that burn lots of parsing CPU time and burns
      * memory representing a document that won't ever execute. To prevent this you can set a maximum number of parse
      * tokens that will be accepted before an exception is thrown and the parsing is stopped.
      *

@@ -127,6 +175,17 @@ public int getMaxTokens() {
         return maxTokens;
     }

+    /**
+     * A graphql hacking vector is to send large amounts of whitespace that burn lots of parsing CPU time and burn
+     * memory representing a document. To prevent this you can set a maximum number of whitespace parse
+     * tokens that will be accepted before an exception is thrown and the parsing is stopped.
+     *
+     * @return the maximum number of raw whitespace tokens the parser will accept, after which an exception will be thrown.
+     */
+    public int getMaxWhitespaceTokens() {
+        return maxWhitespaceTokens;
+    }

     public ParsingListener getParsingListener() {
         return parsingListener;
     }

@@ -148,6 +207,7 @@ public static class Builder {
         private boolean captureLineComments = true;
         private int maxTokens = MAX_QUERY_TOKENS;
         private ParsingListener parsingListener = ParsingListener.NOOP;
+        private int maxWhitespaceTokens = MAX_WHITESPACE_TOKENS;

         Builder() {
         }

@@ -157,6 +217,7 @@ public static class Builder {
         this.captureSourceLocation = parserOptions.captureSourceLocation;
         this.captureLineComments = parserOptions.captureLineComments;
         this.maxTokens = parserOptions.maxTokens;
+        this.maxWhitespaceTokens = parserOptions.maxWhitespaceTokens;
         this.parsingListener = parserOptions.parsingListener;
     }

@@ -180,6 +241,11 @@ public Builder maxTokens(int maxTokens) {
         return this;
     }

+    public Builder maxWhitespaceTokens(int maxWhitespaceTokens) {
+        this.maxWhitespaceTokens = maxWhitespaceTokens;
+        return this;
+    }

     public Builder parsingListener(ParsingListener parsingListener) {
         this.parsingListener = assertNotNull(parsingListener);
         return this;
File: SafeTokenSource.java (new file)

@@ -0,0 +1,94 @@
package graphql.parser;

import graphql.Internal;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.TokenFactory;
import org.antlr.v4.runtime.TokenSource;

import java.util.function.BiConsumer;

/**
 * This token source can wrap a lexer, and if it asks for more than a maximum number of tokens
 * the user can take some action, typically throw an exception to stop lexing.
 *
 * It tracks the token count per channel; we have 3 channels at the moment, and they will all be tracked.
 *
 * This is used to protect us from evil input. The lexer will at times eagerly try to find all tokens,
 * and certain inputs (directives butted together, for example) will cause the lexer
 * to keep doing work before the tokens are presented back to the parser,
 * and hence before the parser has a chance to stop work once too much has been done.
 */
@Internal
public class SafeTokenSource implements TokenSource {

    private final TokenSource lexer;
    private final int maxTokens;
    private final int maxWhitespaceTokens;
    private final BiConsumer<Integer, Token> whenMaxTokensExceeded;
    private final int[] channelCounts;

    public SafeTokenSource(TokenSource lexer, int maxTokens, int maxWhitespaceTokens, BiConsumer<Integer, Token> whenMaxTokensExceeded) {
        this.lexer = lexer;
        this.maxTokens = maxTokens;
        this.maxWhitespaceTokens = maxWhitespaceTokens;
        this.whenMaxTokensExceeded = whenMaxTokensExceeded;
        // this could be a Map<int,int> however we want it to be as fast as possible.
        // we only have 3 channels - but they are 0, 2 and 3 so use 5 for safety - still faster than a map get/put
        // if we ever add another channel beyond 5 it will IOBEx during tests so future changes will be handled before release!
        this.channelCounts = new int[]{0, 0, 0, 0, 0};
    }

    @Override
    public Token nextToken() {
        Token token = lexer.nextToken();
        if (token != null) {
            int channel = token.getChannel();
            int currentCount = ++channelCounts[channel];
            if (channel == Parser.CHANNEL_IGNORED_CHARS) {
                // whitespace gets its own max count
                callbackIfMaxExceeded(maxWhitespaceTokens, currentCount, token);
            } else {
                callbackIfMaxExceeded(maxTokens, currentCount, token);
            }
        }
        return token;
    }

    private void callbackIfMaxExceeded(int maxCount, int currentCount, Token token) {
        if (currentCount > maxCount) {
            whenMaxTokensExceeded.accept(maxCount, token);
        }
    }

    @Override
    public int getLine() {
        return lexer.getLine();
    }

    @Override
    public int getCharPositionInLine() {
        return lexer.getCharPositionInLine();
    }

    @Override
    public CharStream getInputStream() {
        return lexer.getInputStream();
    }

    @Override
    public String getSourceName() {
        return lexer.getSourceName();
    }

    @Override
    public void setTokenFactory(TokenFactory<?> factory) {
        lexer.setTokenFactory(factory);
    }

    @Override
    public TokenFactory<?> getTokenFactory() {
        return lexer.getTokenFactory();
    }
}
Review comment: Let's also call this the whitespace channel, not the ignored one, to make it consistent with the options.