Customizing string representation of data flow nodes in SARIF or CSV results for Taint Tracking #16143
Comments
Hi @saikatG 👋 I am not sure there is a super easy way of doing that. Overriding the |
Thanks @mbg. Overriding the |
Hi CodeQL devs, re-opening this issue because I ran into a problem. I tried something like this:
bindingset[n]
string getFullName(DataFlow::Node n) {
  result = n.asExpr().(Call).getCallee().getDeclaringType().getQualifiedName() + "." + n.asExpr().(Call).getCallee().getName()
  or
  result = n.asExpr().(Argument).getCall().getCallee().getDeclaringType().getAnAncestor().getQualifiedName() + "." + n.asExpr().(Argument).getCall().getCallee().getName() + "::arg::" + n.asExpr().(Argument).getPosition()
  or
  result = n.(DataFlow::ParameterNode).asParameter().getCallable().getDeclaringType().getQualifiedName() + "." + n.(DataFlow::ParameterNode).asParameter().getCallable().getDeclaringType().getName() + "::par::" + n.(DataFlow::ParameterNode).asParameter().getPosition()
  or
  result = n.toString()
}
class MyNode extends DataFlow::Node {
  override string toString() {
    result = getFullName(this)
  }
}
It works well, but it's turning out to be too expensive to compute in some cases. Any suggestions on how to optimize this? |
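One possible contributor, offered as an assumption rather than a diagnosis: in the snippet above the disjuncts overlap (a node can be both a `Call` and an `Argument`), `getAnAncestor()` yields one string per supertype, and the last branch adds the default string as well, so a single node may end up with many `toString` values. The sketch below keeps the cases mutually exclusive so that each node gets at most one custom label. `qualifiedCallableName` and `getSpecificName` are hypothetical helper names, not library predicates. The sketch also uses the declaring type directly instead of its ancestors, and the callable's name for parameters, so the output format differs slightly from the original.

```ql
import java
import semmle.code.java.dataflow.DataFlow

/** Gets `package.Type.name` for the callable `c`. (Hypothetical helper.) */
private string qualifiedCallableName(Callable c) {
  result = c.getDeclaringType().getQualifiedName() + "." + c.getName()
}

/**
 * Gets a custom label for `n` if it is an argument, a call, or a parameter.
 * The three cases are written so that at most one of them applies to a node.
 */
private string getSpecificName(DataFlow::Node n) {
  // An argument (including a call that is itself used as an argument).
  exists(Argument a | a = n.asExpr() |
    result = qualifiedCallableName(a.getCall().getCallee()) + "::arg::" + a.getPosition()
  )
  or
  // A call that is not an argument of another call.
  exists(Call c | c = n.asExpr() and not c instanceof Argument |
    result = qualifiedCallableName(c.getCallee())
  )
  or
  // A parameter of a callable.
  exists(Parameter p | p = n.asParameter() |
    result = qualifiedCallableName(p.getCallable()) + "::par::" + p.getPosition()
  )
}

/** Gets the custom label for `n`, or the default string if there is none. */
string getFullName(DataFlow::Node n) {
  result = getSpecificName(n)
  or
  not exists(getSpecificName(n)) and result = n.toString()
}
```

Whether this actually shortens the "Interpreting results" stage would need to be measured on the database in question.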
Could you provide more information? How does this manifest for you (long analysis time / out of memory errors)? How big is the Java project you are running this against? |
So CodeQL just takes a very long time to generate the CSV/SARIF files after the "Interpreting results" stage. I waited for 10-15 minutes and then killed it. Of course, it finishes fast without the string conversion code above. The project has 97k lines of code. I am running the XSS query, but I don't think that should be an important factor?
|
I assume you're running the CodeQL CLI directly on the command-line. What is the command you used and which version of CodeQL are you using? I tried running the XSS query with your additions on the database for the repository you linked to and this worked fine for me using the VSCode extension (although I note that there were no results). |
Sorry about the missing details. Yes, I am using the CodeQL CLI 2.15.3. I use I am using a custom version of the XSS query that uses my own list of 100 sources and 38 sinks. I think this may also be causing the difference in execution. So, I have a
predicate isMySourceMethod(Method m) {
  (
    m.getName() = "get" and
    m.getDeclaringType().getSourceDeclaration().hasQualifiedName("java.util", "Map<String,String>")
  )
  or
  (
    m.getName() = "put" and
    m.getDeclaringType().getSourceDeclaration().hasQualifiedName("java.util", "Map<String,String>")
  )
  or
  ...
and I have a
module MyXssConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    isMySourceMethod(source.asExpr().(MethodCall).getMethod())
  }
  ...
(Same for sinks) I can't share the full files right now, but I hope this gives you an idea. Do you think this may be causing the slowdown? Interestingly, this still works OK when I don't use "toString". Note that I put the "toString" code in this |
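With around 100 sources, the per-method clauses can get long. A minimal sketch, covering only the two clauses shown above and assuming nothing about the elided ones, groups method names that share a declaring type with a set literal. The type check is copied unchanged from the post, so the behaviour for these two clauses should be the same.

```ql
import java

/**
 * Holds if `m` is one of the two source methods shown in the post.
 * The remaining clauses are elided in the thread and are not reproduced here.
 */
predicate isMySourceMethod(Method m) {
  // One type check shared by both method names.
  m.getName() = ["get", "put"] and
  m.getDeclaringType().getSourceDeclaration().hasQualifiedName("java.util", "Map<String,String>")
}
```

This is only a readability change; it is not expected to affect the slowdown discussed above.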
This is a fairly old version. It may be worth upgrading to the latest version to see if that resolves the issue for you. (We release new versions every two weeks.)
This is very much possible of course. You could try to run the XSS query without your custom sources and sinks, but still with the overridden Let me know what the outcomes of those experiments are and we can go from there. |
Thanks a lot @mbg! I will try it out. While we are on this, another question for you: Is there an easy way to specify summaries with a custom tag? For instance, I see yml files in |
There is some documentation in the code which comments on this:
So in other words, What would you hope to accomplish with a custom tag? |
Hi,
Currently, when running a CWE query such as TaintedPath (CWE-22) on a Java project, I retrieve the CodeFlow for each result in the SARIF files as shown below. Is there an easy way to customize the string representation of nodes in the output, such as in the message["text"] part for the node? For instance, for a method call I would like the string to have the format "package:class:methodname" instead of "methodname(...)", which is the default. Would I need to override the data flow node for this?
Thanks in advance!