New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resolve race condition problem #106
Conversation
src/main/java/ognl/OgnlRuntime.java
Outdated
// if (cacheGetMethod.containsKey(targetClass, propertyName)) | ||
// return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove these line instead of commenting them out. Also remove the comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// if (cacheSetMethod.containsKey(targetClass, propertyName)) | ||
// return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove these line instead of commenting them out. Also remove the comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if (method == NULL_REPLACEMENT) | ||
return null; | ||
return method; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks the logic, you will return NULL_REPLACEMENT
instead of null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about this? As far I understand this will return NULL_REPLACEMENT
which was set in put
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry,I can't catch what you said.
Yes, It will return NULL_REPLACEMENT if you put.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far I understand, the previous code will return a raw null
if method
from cache equals to NULL_REPLACEMENT
(which is initialised to ClassPropertyMethodCache#get
). You've changed the code to return method
directly without checking if this is NULL_REPLACEMENT
, so this means you will return NULL_REPLACEMENT
instead of null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's now checked in the callers (getGetMethod and getSetMethod).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @harawata for clarification :)
src/main/java/ognl/OgnlRuntime.java
Outdated
// boolean containsKey(Class clazz, String propertyName) | ||
// { | ||
// ConcurrentHashMap<String,Method> methodsByPropertyName = this.cache.get(clazz); | ||
// if (methodsByPropertyName == null) | ||
// return false; | ||
// | ||
// return methodsByPropertyName.containsKey(propertyName); | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the code instead of commenting it out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@harawata @JCgH4164838Gh792C124B5 @yasserzamani maybe you would like to also check this change. |
Thank you, @rolandhe ! I just took a quick look and didn't find any logical issue. 👍 |
@@ -3963,19 +3966,22 @@ void put(Class clazz, String propertyName, Method method) | |||
if (methodsByPropertyName == null) | |||
{ | |||
methodsByPropertyName = new ConcurrentHashMap<String, Method>(); | |||
this.cache.put(clazz, methodsByPropertyName); | |||
ConcurrentHashMap<String,Method> old = this.cache.putIfAbsent(clazz, methodsByPropertyName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think above if
already has checked that it's absent. Maybe you would replace the whole if statement with?
methodsByPropertyName = this.cache.putIfAbsent(clazz, new ConcurrentHashMap<String, Method>());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
`
/**
* {@inheritdoc}
*
* @return the previous value associated with the specified key,
* or {@code null} if there was no mapping for the key
* @throws NullPointerException if the specified key or value is null
*/
public V putIfAbsent(K key, V value)
`
"if" check and then put, this is classic race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I meant isn't following piece redundant:
ConcurrentHashMap<String,Method> methodsByPropertyName = this.cache.get(clazz);
if (methodsByPropertyName == null)
{
because you've already used putIfAbsent
inside the if
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using read-compare-write style, which would reduce “new” operation, and very few method call new concurrent map object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 thanks!
Hello @rolandhe . Thank you for reporting the race condition scenario (and thanks to @lukaszlenart for advising about the PR conversation). 👍 I read the proposed code changes in the PR and subsequent comments, but remain unsure if the proposed changes will work as intended. Can you tell us approximately how many concurrent threads were involved (10's, 100's, 1000's, etc.) before the symptom started being evident ? Also, could you provide a small example (inline in a comment here, or maybe a link to small reproducer project) that reproduces the condition (causing Since changes to the There may be a trade-off involved in changes to the "two-level-cache logic" that is currently in place for OGNL. Structural changes to avoid the If the race condition can be eliminated (or reduced) without any substantial reduction in performance, or other undesirable side-effects, that would be ideal. If not, then it may be a "judgment call" as to what is "better" (and maybe even something requiring a flag/choice). Thanks again, @rolandhe. Any additional details you can provide will help out. 😄 |
I committed example that reproduce condition exception, it is ognl.race.RaceTestCase#testOgnlRace,,before you run this code you should remove lines 2147 and 2148 of OgnlRuntime.java. Maybe you would run many times and then produce the "NoSuchPropertyException". |
Hello @rolandhe (and everyone). Thanks for providing the reproducer test code. 😄 Unfortunately, using the reproducer code I was unable to reproduce any failures or exceptions with the existing OGNL 3.2.16-SNAPSHOT code (many executions, with variations from a few threads to 1000s). Adding an additional test that called the original test multiple times with varied numbers of threads and iterations, along with cache clears, did not generate any exceptions either. I also tried the test code with the It seems that in practical terms, the probability of the race condition in the original code appears to be very slight, otherwise we would expect to be able to reproduce it with some frequency (at least with 100's or 1000's of threads and many repeated runs). I am not sure if the reproducer test, given it cannot (consistently) demonstrate the presence or absence of the race condition failure, should be included in the final revision of this PR ? Others might have different thoughts on that. The good news is that after running multiple builds, with both the old code and the new code from this PR, there were no indications of any observable performance differences between the two. 😄 With no apparent performance impact seen, the proposed |
Just FYI, here is the repro I used when investigating the import ognl.*;
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;
import java.lang.reflect.*;
public class GetGetMethodRaceCondition {
public static void main(String[] args) throws Exception {
int run = Integer.valueOf(args[0]);
List<Future<Object>> futures = new ArrayList<>();
ExecutorService executor = Executors.newCachedThreadPool();
try {
IntStream.range(0, run).forEach(i -> {
futures.add(executor.submit(() -> {
return OgnlRuntime.getGetMethod(null, String.class, "bytes");
}));
});
for (int i = 0; i < run; i++) {
if (futures.get(i).get() == null) {
System.out.println("!!! ERROR !!!! at round " + i);
}
}
} finally {
executor.shutdown();
}
}
} To reproduce the issue reliably, I had to use a shell script like the below. #!/bin/sh
# compile
javac GetGetMethodRaceCondition.java -cp ognl-3.2.15.jar
echo ''
# run rounds
for i in {1..100}
do
java -cp ognl-3.2.15.jar:javassist-3.27.0-GA.jar:. GetGetMethodRaceCondition 100
done p.s. |
Please copy test code to your MASTER branch code, NOT this PR code, then remove or comment next two lines:
then run the test code, maybe run many times, I tested five times, and reproduce. |
Hi @rolandhe . Sorry if it was not clear from the earlier comment, but I previously had copied and ran the test code against the master branch of OGNL 3.2. I tested both with the original branch state for Today I repeated the same tests again many times manually, and was finally able to reproduce the failure (using the original OGNL 3.2 OgnlRuntime, with the two lines commented out of I also performed the test with the OgnlRuntime code from this PR in its original form, and with the same lines commented out in Using a process like the sample @harawata provided above (thanks @harawata 😄 ), it was easier to produce the error more consistently as well. Anyway, your PR's changes appears to eliminate the race without any noticeable side-effects. Thanks for your work on this. 👏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
I found a race condition problem which cause to throw NoSuchPropertyException when there are to many threads since version 3.1.2. Although it was resolved by trying to call method of "getReadMethod", it will loss some performance, I think it is the best way to resolve race condition.