Performance tweaks #95

daveraja · 2022-03-08T00:10:43Z

Some performance tweaks for predicate facts. A few more things are pushed into the templating so that they are calculated once, when Predicate is sub-classed. Also, the fact's hash value is now calculated and cached at fact creation time rather than later in the __hash__() function.

Internal functions now access the internal data directly rather than going through the public API.

The templating is now used for the Predicate sub-class __bool__() and __len__(), because their values can be worked out when the sub-class is created. There is a slight change of behaviour for __bool__(): it now behaves like a tuple only if the predicate is declared as a tuple. So it returns False only for an empty tuple and True otherwise. This seems like more appropriate behaviour.
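To make the templating idea concrete, here is a minimal sketch of baking `__bool__()` and `__len__()` results into generated source when a class is defined. `TEMPLATE`, `make_dunder_methods` and the three example classes are hypothetical names for illustration; clorm's real `PREDICATE_TEMPLATE` is not shown in this thread and likely differs. Only the `bool_status` expression is taken from the diff.

```python
# Hypothetical template; clorm's real PREDICATE_TEMPLATE is not shown here.
TEMPLATE = """
def __bool__(self):
    return {bool_status!r}

def __len__(self):
    return {arity!r}
"""


def make_dunder_methods(arity, is_tuple):
    """Return (__bool__, __len__) with their results baked in as constants."""
    # Same expression as in the diff: only an empty *tuple* is falsy.
    bool_status = not is_tuple or arity > 0
    namespace = {}
    exec(TEMPLATE.format(bool_status=bool_status, arity=arity), namespace)
    return namespace["__bool__"], namespace["__len__"]


class EmptyFact:  # stands in for a normal predicate with no fields
    __bool__, __len__ = make_dunder_methods(arity=0, is_tuple=False)


class EmptyTuple:  # stands in for a predicate declared as a tuple, no fields
    __bool__, __len__ = make_dunder_methods(arity=0, is_tuple=True)


class Pair:  # stands in for a two-field tuple predicate
    __bool__, __len__ = make_dunder_methods(arity=2, is_tuple=True)
```

The generated methods do no work at call time; the decision is made once per class rather than once per call.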

Calculating the hash at creation time avoids having to check, in the __hash__() function, whether the value has already been calculated and cached. Since __hash__() is called a lot this seems worth it.
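As a rough illustration of caching the hash at creation time, the sketch below uses a hypothetical `Fact` class with made-up attribute names (`_field_values`, `_hash`); it is not clorm's implementation, only the general pattern.

```python
class Fact:
    """Illustrative stand-in for a fact; not clorm's actual class."""

    __slots__ = ("_field_values", "_hash")

    def __init__(self, *values):
        self._field_values = tuple(values)
        # Computed once here, so __hash__ needs no "is it cached yet?" branch.
        self._hash = hash(self._field_values)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return self._field_values == other._field_values
```

The trade-off is that every fact pays for one hash computation up front, even if it is never placed in a set or dict.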

florianfischer91

Do you know how often the ordering methods are called?
In dataclasses these methods always start with `if other.__class__ is self.__class__` instead of using isinstance, but I have not checked whether this gives any performance improvement.
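For reference, the dataclass-style check looks roughly like the sketch below. `Point` and `Point3D` are hypothetical classes used only to show the pattern and its semantic difference from `isinstance`; no timing claim is made here.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        # Exact class match, as in dataclass-generated methods.
        # Unlike isinstance, a subclass instance does not match.
        if other.__class__ is self.__class__:
            return (self.x, self.y) == (other.x, other.y)
        return NotImplemented

    def __lt__(self, other):
        if other.__class__ is self.__class__:
            return (self.x, self.y) < (other.x, other.y)
        return NotImplemented


class Point3D(Point):
    pass
```

Note the behavioural difference: with this check `Point(1, 2) == Point3D(1, 2)` is False, whereas an `isinstance` check would accept the subclass instance.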

If someone uses the clone method quite often to modify just one value, could we change the method so that it calls pytocl only for the changed values, skips __init__, and does all the other required work itself?

florianfischer91 · 2022-03-08T16:14:53Z

clorm/orm/core.py


-    template = PREDICATE_TEMPLATE.format(pdefn=pdefn)
+    bool_status = not pdefn.is_tuple or len(pdefn) > 0


Did you change the bool expression intentionally? Before, we just checked the length.

Yes the change was intentional. I was trying to think what was most intuitive from a user perspective.

Most importantly, I think clorm tuples should behave like a normal tuple as much as possible. So an empty tuple should evaluate to false.

But for a normal fact, even one with no parameters, I'm not sure it makes sense for it to evaluate to false. I think the following would be unintuitive:

    class P(Predicate):
        pass

    p = P()
    if not p:
        print("Feels strange")

Anyway, that was the rationale :)

In any case, it's a bit of a boundary case to define predicates with no parameters, and clorm is not well suited to this use case. It's not very efficient or intuitive (every P instance is equivalent). Ideally there would be a simpler mechanism for representing these cases. I just haven't come up with something that could also be made to fit nicely with the rest of clorm: unification, FactBases and the query stuff.

florianfischer91 · 2022-03-08T16:25:35Z

clorm/orm/core.py

@@ -2384,6 +2384,13 @@ def _generate_dynamic_predicate_functions(class_name: str, namespace: Dict):
        tmp.append(f"{f.name}_cltopy(raw_args[{idx}]), ")
    args_cltopy= "".join(tmp)

+    if pdefn.is_tuple:


I'm not very familiar with defining a custom hash function, but why don't we have to include the sign in the hash?

You're right that the sign probably should be in the hash function. It doesn't matter in terms of correctness but would potentially reduce hash clashes.

This is another boundary case because negated facts are rarely used in well-written ASP programs. So I was doing a bit of premature optimisation :( by thinking that since it's rarely used it's cheaper to leave it out of the hash calculation. Which is a bit crazy considering the tiny cost of adding it!
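The point about correctness versus clashes can be sketched as follows. `SignedFact` and `hash_ignoring_sign` are hypothetical names, and the fields are illustrative rather than clorm's internals: leaving the sign out of the hash stays correct as long as `__eq__` compares it, but a fact and its negation then always collide.

```python
class SignedFact:
    """Illustrative stand-in for a fact with a sign; not clorm's actual class."""

    def __init__(self, name, args, sign=True):
        self.name = name
        self.args = tuple(args)
        self.sign = sign
        # Including the sign separates p(1) from -p(1) at the hash level.
        self._hash = hash((name, sign, self.args))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.name, self.sign, self.args) == (
            other.name,
            other.sign,
            other.args,
        )


def hash_ignoring_sign(fact):
    # The earlier variant: still correct, because __eq__ compares the sign,
    # but a fact and its negation always produce the same hash value.
    return hash((fact.name, fact.args))


pos = SignedFact("p", [1])
neg = SignedFact("p", [1], sign=False)
```

Either way a set containing `pos` and `neg` holds two elements; the sign in the hash only affects how often equality has to be checked to tell them apart.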

daveraja · 2022-03-09T00:55:10Z

Do you know how often the ordering methods are called? In dataclasses these methods always start with `if other.__class__ is self.__class__` instead of using isinstance, but I have not checked whether this gives any performance improvement.

__eq__() (and __hash__()) are called a lot when adding to sets and factbases. For sorting and querying, __lt__() (and less often __gt__()) is also called a lot. So anything that we can do to make them faster would help. They probably should be included in the "templated" functions even if that only removes a few checks.

If someone uses the clone method quite often to modify just one value, could we change the method so that it calls pytocl only for the changed values, skips __init__, and does all the other required work itself?

Yes, that would be useful. It might also be useful to do the templating thing here as well and generate a custom function signature with keywords only and default to MISSING.
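A keyword-only `clone()` with a `MISSING` default might look something like the sketch below. The `Person` class, its fields, and this `MISSING` sentinel are assumptions for illustration; a templated version would generate the signature from the predicate's field names, and the real method would also need to handle conversion and hashing.

```python
MISSING = object()  # sentinel meaning "argument not supplied"


class Person:
    """Hypothetical two-field class standing in for a Predicate sub-class."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    # The kind of signature a per-class template could generate:
    # keyword-only parameters, one per field, each defaulting to MISSING.
    def clone(self, *, name=MISSING, age=MISSING):
        new = object.__new__(self.__class__)  # bypass __init__
        # Only supplied fields change; the rest are copied as-is.
        new.name = self.name if name is MISSING else name
        new.age = self.age if age is MISSING else age
        return new
```

A sentinel is used instead of `None` so that `None` remains a legitimate field value, and a misspelt keyword fails immediately with a `TypeError` instead of being silently ignored.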

Based on the discussion in #97, the decision is to overload the Predicate instance comparison operators to simply delegate to the underlying Symbol object.

Previously, Predicate comparison operators were overloaded to deal with standard Python tuples. This made using standard tuples in queries easy. However, with the change of semantics we now have to detect when a Python tuple is specified in a query and explicitly convert it to the appropriate clorm complex tuple type.

With the change of the Predicate semantics the FactBase set operations are probably broken and need more checking of boundary cases where two facts are of different type but are considered equivalent.
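The boundary case mentioned here can be illustrated with a small sketch. `SymbolBacked`, `P`, `Q` and the `_raw` attribute are hypothetical, and a plain Python tuple stands in for the underlying Symbol so the example has no dependencies; this is not clorm's code.

```python
class SymbolBacked:
    """Hypothetical base class; `_raw` stands in for the underlying Symbol."""

    def __init__(self, raw):
        self._raw = raw

    def __eq__(self, other):
        # Comparison is delegated entirely to the wrapped object.
        if isinstance(other, SymbolBacked):
            return self._raw == other._raw
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, SymbolBacked):
            return self._raw < other._raw
        return NotImplemented

    def __hash__(self):
        return hash(self._raw)


class P(SymbolBacked):
    pass


class Q(SymbolBacked):
    pass


same_symbol = ("p", 1)
# Two facts of different type but equal underlying data collapse to one entry.
facts = {P(same_symbol), Q(same_symbol)}
```

In this sketch `P(same_symbol) == Q(same_symbol)` is True and the set keeps only one of the two, which is the kind of situation the FactBase set operations would need tests for. A bare tuple no longer compares equal to the wrapper, which is why tuples in queries would need explicit conversion.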

daveraja added 3 commits March 8, 2022 01:46

Trivial tweaks to improve performance

e588234

For internal functions access the internal data and not the public API

Calculate the cached hash value for a fact at creation time

19c31d1

This avoids having to check whether the value has already been calculated and cached in the __hash__() function. Since __hash__() is called a lot this seems worth it.

daveraja requested a review from florianfischer91 March 8, 2022 00:10

florianfischer91 approved these changes Mar 8, 2022


daveraja added 4 commits March 10, 2022 09:54

Add a fact's sign to hash generation

ee82644

Comparison operators for Predicate map directly to the Symbol object

931d326

Based on discussions #97 the decision is to overload the Predicate instance comparison to simply call the underlying Symbol object.

Passing current unit tests, but need to add more FactBase tests.

aeb1832

With the change of the Predicate semantics the FactBase set operations are probably broken and need more checking of boundary cases where two facts are of different type but are considered equivalent.
