Faster fact creation #89

daveraja · 2022-02-26T04:27:14Z

Defines the Predicate.init() by custom generating a string version for each predicate type and then exec() it to get the bytecode. Some other performance related tweaks.

Set version to 1.4.0 because of small API tweaks.

Removed old usage of raw in the initialisation. Now raw should only be used for internal API purposes.
Cleaning up of a few other old things (like the field.unifies() method which served no useful purpose).

daveraja · 2022-02-26T06:04:06Z

Added the class factory function Predicate._unify() to also be dynamically generated which speeds up unification.

florianfischer91

Quite nice implementation and good performance improvements :-) (of course it adds a bit more complexity because of building the template)

Sry for the many comments but most of them are just some questions or small suggested changes^^
I have also two general questions:

Do you think it would be helpfull to create a separate attribute default_factory for BaseField so it will make handling callables a bit easier and is more similar to dataclass.py?
What about moving the boilerplate code for creating the methods into a separate file? (this is just a personal preference but core.py gets always longer and longer...) Or is it even not possible because of circular imports?

clorm/orm/core.py

florianfischer91 · 2022-02-26T14:53:58Z

clorm/orm/core.py

+AnySymbol = Union[clingo.Symbol,noclingo.Symbol]
+
+def _get_symbols():
+    return symbols


Is there a reason why you created a function here and not calling symbols directly? (Ah perhaps, because symbols can change when you change the symbol mode?)

Yes, exactly.

But it is ugly and I'm going to change this. The way I've integrated the noclingo stuff is not clean so I'm going to try to make it a bit more self-contained within the noclingo module. The idea is that noclingo.Function(), noclingo.String(), etc will internally decide what type of symbol to create based on the mode so all the indirection is contained within these functions and we don't need hackiness like _get_symbols(). Which will also make it easier if/when noclingo can be removed.

I've removed most of the hackiness in core.py of dealing with noclingo (such as the _get_symbols() function). However, I think there is still one lingering issue.

Now, the overhead introduced by the flexibility of noclingo is confined to noclingo.py and it is simply to add an indirection when calling the various symbol creating functions. This overhead is pretty small and you can only see the performance difference when you look at averages over multiple runs of the test code. But it is still annoying because there are many/most occasions when you don't need this functionality so why have the overhead?.

I've added some if statements in the module that could enable/disable noclingo when the module is loaded. I think the default should be to have it disabled. But I'm not sure what's the most pythonic way to trigger this. You would need to set some flag before the module is loaded to tell it to enable noclingo. Or potentially do it using a environment variable, in a similar way to PYTHONHASHSEED.

Any thoughts?

What about using something like a context-manager if you want to use the noclingo-mode?
I'm not sure if it is possible but you could switch to the noclingo-mode when entering the ctxt-manager and switch back to clingo-mode when leaving the ctxt-manager?

That's a nice idea, and might be a good way to switch to noclingo-mode. But it's not what I had in mind here, where I want to completely disable it, so there is no way to switch modes. We could get a small performance improvement because we can remove the indirection of the various symbol creation functions.

Anyway, for the moment I won't disable it and will just merge the branch. Can revisit this later.

clorm/orm/core.py

daveraja · 2022-02-27T07:07:19Z

Quite nice implementation and good performance improvements :-) (of course it adds a bit more complexity because of building the template)

Sry for the many comments but most of them are just some questions or small suggested changes^^ I have also two general questions:
* Do you think it would be helpfull to create a separate attribute `default_factory` for BaseField so it will make handling callables a bit easier and is more similar to dataclass.py?

* What about moving the boilerplate code for creating the methods into a separate file? (this is just a personal preference but core.py gets always longer and longer...) Or is it even not possible because of circular imports?

I'm not familiar with dataclass default_factory and callables. I'll look into it.

I do agree that core.py is getting too big. I've moved some of the template stuff into a separate file templating.py, but kept the main construction function in core as it relies on things that are declared in core.py and would mean passing a lot parameters into the function.

I'm very tempted to look at Jinja2 to see if it could be used to make a really clean template and simpify that generator function. But I really don't want to add the dependency. One thing I don't understand is that Python does have a standard Template class (https://docs.python.org/3/library/string.html#string.Template) but it seems so limited I don't see the point of it.

Longer term I think a lot of the define_XXX helper functions for defining new fields can be moved out of core into something like extra_fields.py

Predicate._unify(X) is available for internal use and users should use the API unify() functions.

For each Predicate sub-class a custom init function is generated as a string and then turned into Python bytecode by calling exec(). This creates a much leaner and faster initialisation, since as much work as possible is done when defining the init function and not have to be built into the logic of the init function itself.

For clorm tuple we need to coerce any normal tuple (or some other clorm tuple) to the correct type. If the value is not a clorm tuple then it must be of the correct type.

@florianfischer91

Fixes based on @florianfischer91 comments. Especially good to remove the raw paramater from Predicate __init__.

This works for any field that has a non-callable default

I think this is the simplest message but maybe a more descriptive message could be better.

NOCLINGO adds some small overhead when creating symbols. We can remove this overhead and disable NOCLINGO, by adding some conditional statements when loading the module. But I'm not sure the best way to use this; should it be monkey patching or some global variable?

I think this was not possible for clingo 5.4 because it didn't properly return NotImplemented.

Also added some parts to enable noclingo. But it is still not working as expected so noclingo is still enabled by default.

This improves performance of creating facts

daveraja mentioned this pull request Feb 26, 2022

Cleanup predicate init #88

Closed

florianfischer91 reviewed Feb 26, 2022

View reviewed changes

daveraja added 16 commits February 28, 2022 09:50

Cleaning up to reduce the number of pytocl calls

d273ba1

Updating docs to remove creating facts with a raw parameter

6a6e13f

Remove the field unifies() function which was previously deprecated

dd92ebe

Removed Predicate.__init__(raw=X) as this is no longer supported

033a2a5

Predicate._unify(X) is available for internal use and users should use the API unify() functions.

Set to version 1.4.0 because we have small API breaking changes

31b3361

Changed internal _predicate_init functions to avoid arg unpacking

f14cfb5

Add 'Unifier' to the export list

2796407

Predicate init arg mapping to tuple only required for clorm tuples

ba770ba

For clorm tuple we need to coerce any normal tuple (or some other clorm tuple) to the correct type. If the value is not a clorm tuple then it must be of the correct type.

More small improvements to predicate init

7e814ec

Make predicate sub-class _unify() function dynamically generated

e619da2

Main API change Predicate __init__ no longer exposes raw

8163649

Fixes based on @florianfischer91 comments. Especially good to remove the raw paramater from Predicate __init__.

Removed redundant commented out code from the predicate template

461131f

Add references to default values in the generated predicate init

e355c7c

This works for any field that has a non-callable default

Removed bad Predicate.__init__() declaration with strange typing

c83e50e

Chaged error message when the wrong type is passed to a clorm tuple

900bede

I think this is the simplest message but maybe a more descriptive message could be better.

daveraja force-pushed the faster_fact_creation branch from c73bc8d to 900bede Compare February 27, 2022 22:58

daveraja added 5 commits February 28, 2022 12:57

Fixed implementation of NoSymbol to compare correctly with Symbol

fb393bd

I think this was not possible for clingo 5.4 because it didn't properly return NotImplemented.

Simplified implementation of the Raw class

c250b56

Fixes for NoSymbol to behave more like Symbol.

b1a48e9

Also added some parts to enable noclingo. But it is still not working as expected so noclingo is still enabled by default.

In-lined the _instance_from_tuple() function in the templates

e04019a

This improves performance of creating facts

daveraja marked this pull request as ready for review March 1, 2022 00:52

daveraja merged commit 45e0dd5 into master Mar 1, 2022

daveraja deleted the faster_fact_creation branch March 2, 2022 10:55

daveraja mentioned this pull request Mar 2, 2022

Better initialisation for Predicate.__init__() #79

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster fact creation #89

Faster fact creation #89

daveraja commented Feb 26, 2022

daveraja commented Feb 26, 2022

florianfischer91 left a comment

florianfischer91 Feb 26, 2022

daveraja Feb 26, 2022

daveraja Feb 28, 2022 •

edited

florianfischer91 Feb 28, 2022

daveraja Mar 1, 2022

daveraja commented Feb 27, 2022

Faster fact creation #89

Faster fact creation #89

Conversation

daveraja commented Feb 26, 2022

daveraja commented Feb 26, 2022

florianfischer91 left a comment

Choose a reason for hiding this comment

florianfischer91 Feb 26, 2022

Choose a reason for hiding this comment

daveraja Feb 26, 2022

Choose a reason for hiding this comment

daveraja Feb 28, 2022 • edited

Choose a reason for hiding this comment

florianfischer91 Feb 28, 2022

Choose a reason for hiding this comment

daveraja Mar 1, 2022

Choose a reason for hiding this comment

daveraja commented Feb 27, 2022

daveraja Feb 28, 2022 •

edited