[Python] Typeinfo object identity equality #2735

thautwarm · 2022-01-15T15:51:19Z

This addresses fable-compiler/Fable.Python#42 with a different approach. It does not need a change to the compiler code.

This approach relies on the theory (which needs confirmation) that .NET type identity is determined by the full name and the generic type arguments. The only exception is anonymous records, which use unordered field info for equality.

This PR implements type comparisons that consume constant space and constant time.
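The idea can be sketched roughly as follows. This is only an illustration of interning under the assumption stated above (identity = full name + generic arguments); `TypeInfo`, `Key`, `internTable` and `makeTypeInfo` are hypothetical names and not the actual fable-library code, and anonymous records are not handled.

```fsharp
open System.Collections.Generic

// Hypothetical stand-in for the runtime type descriptor; not the actual fable-library class.
type TypeInfo(fullName: string, generics: TypeInfo list) =
    member this.FullName = fullName
    member this.Generics = generics
    // Canonical key: the full name followed by the keys of the generic arguments.
    member this.Key: string =
        match generics with
        | [] -> fullName
        | gs -> fullName + "[" + String.concat "," (gs |> List.map (fun g -> g.Key)) + "]"

// Intern table: one TypeInfo instance per distinct key (it grows without bound in this sketch).
let internTable = Dictionary<string, TypeInfo>()

// Returns the already-interned instance when the same type is constructed again.
let makeTypeInfo (fullName: string) (generics: TypeInfo list) : TypeInfo =
    let candidate = TypeInfo(fullName, generics)
    match internTable.TryGetValue(candidate.Key) with
    | true, existing -> existing
    | _ ->
        internTable.[candidate.Key] <- candidate
        candidate

let listName = "Microsoft.FSharp.Collections.FSharpList`1"
let intList1 = makeTypeInfo listName [ makeTypeInfo "System.Int32" [] ]
let intList2 = makeTypeInfo listName [ makeTypeInfo "System.Int32" [] ]
let strList = makeTypeInfo listName [ makeTypeInfo "System.String" [] ]

printfn "%b" (obj.ReferenceEquals(intList1, intList2)) // true
printfn "%b" (obj.ReferenceEquals(intList1, strList))  // false
```

Once two descriptors are interned, comparing them is a single reference check regardless of how deeply the generic arguments nest.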


dbrattli · 2022-01-15T16:29:28Z

Looks good to me! What do you think @alfonsogarciacaro? Could you have a look at this? Do we btw have this problem in JavaScript?

dbrattli

Looks good, thanks for fixing!

alfonsogarciacaro · 2022-01-19T01:48:38Z

@dbrattli Sorry, I didn't realize you mentioned me in the thread 😅 I wish GH told you when a notification is because of a direct mention (maybe it does and I'm missing something?).

obj.ReferenceEquals is compiled to JS strict equality (===), and I sometimes use it for cheap equality. It doesn't always give you the same results as in .NET, but I think it's the best we can get. According to the documentation, ReferenceEquals is like Equals but cannot be overridden. .NET also gives you counter-intuitive results sometimes (see the remarks in the documentation).

In the case of types, it does seem that .NET actually produces the same instance, even when using MakeGenericType, which I found quite surprising. But it seems to use a special mechanism that's not a cache.

let t1 = typeof<int list>
let t2 = typedefof<obj list>.MakeGenericType([|typeof<int>|])
obj.ReferenceEquals(t1, t2) // true

I guess a cache as in this PR would have the same effect, although I'm always reluctant to include caches in library code unless it's explicit because they may grow out of control. Another solution could be to have a special implementation of ReferenceEquals for built-in types (in the case of Type it could just defer to Equals which should have the same effect).
In that particular example it evaluates to false both times, and with strings and value types, obj.ReferenceEquals can also be counter-intuitive.
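A small sketch of the kind of counter-intuitive result meant here, as it behaves on .NET. The variable names are illustrative only, and when compiled by Fable the results may differ, since ReferenceEquals becomes a plain strict equality check there.

```fsharp
// Value types: each argument is boxed separately, so the two references differ.
let n = 42
let sameInt = obj.ReferenceEquals(box n, box n)

// Strings: equal contents do not imply the same instance.
let s1 = "fable"
let s2 = System.String(s1.ToCharArray())
let sameString = obj.ReferenceEquals(s1, s2)
let equalString = (s1 = s2)

printfn "%b %b %b" sameInt sameString equalString // on .NET: false false true
```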

alfonsogarciacaro · 2022-01-20T05:17:33Z

@dbrattli Sorry, I've been thinking about this, and I'm still unsure it's a good idea to include a cache in the library, particularly not just to try to make ReferenceEquals consistent with .NET. As seen above, ReferenceEquals can also be counter-intuitive in .NET. In Fable JS we've tried to make equality consistent with F# in .NET, but ReferenceEquals falls in the category of "Behavior depends on the platform". Among other reasons, this is because we cannot distinguish reference types from value types in JS as in .NET, e.g. we cannot pass a value by ref (I think it's the same in Python).

@thautwarm Is there a reason you cannot use Equals instead of ReferenceEquals for type comparison?

thautwarm · 2022-01-20T07:41:04Z

I think your idea of avoiding a cache makes sense, but a function that creates typeinfo should be pure, so I think it is okay to cache such a function.

> @thautwarm Is there a reason you cannot use Equals instead of ReferenceEquals for type comparison?

I can always use Equals instead of ReferenceEquals; the only concern here is performance. I use reflection for data parsing/validation, like what pydantic does. Pydantic has unsolved corner cases as Python is not really statically typed (pydantic/pydantic#2678). F# via Fable can provide a novel way (for Python users) to achieve safer and more correct type-driven data parsing/validation: we can define types, and do serialization/deserialization as conveniently as with FSharp.Json. Such capability is a selling point of Fable for me, and one day we might attract Python users with it.

Concretely, type-driven data parsing/validation via F# reflection needs to map type objects (statically resolved in F#, hence more reliable) to handler functions, but map lookup using structural hashing and structural equality is expensive. As can be seen in the following example code, serializing/deserializing a nested data structure will compare types frequently, while using object identity for hashing and equality makes such recursion really cheap.

let rec deserialize (registeredHandlers: Map<System.Type, Handler>) (t: System.Type) (stream): obj =
    if t is record/union then // composite types
        // also lookup the map and call 'deserialize' recursively
    else
        match Map.tryFind t registeredHandlers with
        | None -> ..
        | Some handler -> handler.deserialize stream
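For illustration, identity-based lookup can be sketched on .NET with a dictionary that hashes and compares keys by reference. The `Handler` record, `handlers` and `deserializePrimitive` below are assumptions made for this example (the thread does not define them), and the sketch relies on type objects being unique per type, which under Fable is exactly what this PR is about.

```fsharp
open System
open System.Collections.Generic

// Hypothetical handler shape; the thread does not define 'Handler'.
type Handler = { deserialize: string -> obj }

// Keys are hashed and compared by object identity instead of structurally.
let handlers = Dictionary<Type, Handler>(HashIdentity.Reference)

handlers.[typeof<int>] <- { deserialize = fun (s: string) -> box (Int32.Parse(s)) }
handlers.[typeof<bool>] <- { deserialize = fun (s: string) -> box (Boolean.Parse(s)) }

// Looks up the handler registered for a type, if any, and applies it.
let deserializePrimitive (t: Type) (input: string) : obj option =
    match handlers.TryGetValue(t) with
    | true, h -> Some(h.deserialize input)
    | _ -> None

printfn "%A" (deserializePrimitive typeof<int> "42")
```

A recursive deserializer for records and unions would perform this lookup once per field or case, which is why the cost of a single key comparison matters.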

thautwarm · 2022-01-20T07:45:30Z

I ran a rudimentary benchmark on JSON parsing, which shows that my type-driven data parsing is slow due to F# reflection. I haven't run tests against this PR to show how much the performance improves, as it's difficult to use Fable.Cli locally...

alfonsogarciacaro · 2022-01-20T11:45:24Z

Thanks for your reply @thautwarm! Some notes:

About using caches in Fable library: the problem is that it's difficult to know when to free the cache, and until we do, we retain memory that users will expect to be freed. Suppose we end up adding attribute info to the types: attributes can get arbitrarily large.
About equality in F#/Fable: we can say F# is opinionated about equality by adding structural equality to its native types (unions/records) and collections. Fable has implemented this structural equality to keep the F# semantics, but I'm unsure about trying to match the .NET behavior with ReferenceEquals because its semantics are less clear. I was surprised to learn that ReferenceEquals always returns true for types; I would have always used Equals to compare types. In maps and dictionaries, the performance is not achieved by using ReferenceEquals but by GetHashCode, which allows non-matching elements to be discarded quickly.
Thoth.Json also generates coders using reflection. I think it's quite fast (we haven't really done benchmarks, but nobody has complained it's too slow), and it has a non-cached and an (explicitly) cached version. The coders are stored in a map using the type fullname (there's an optimization in Fable to compile typeof<'T>.FullName as a string).
Finally, to quickly test a local version of Fable :)

git clone https://github.com/fable-compiler/Fable
cd Fable
git checkout beyond # or the branch you're working on
dotnet fsi build.fsx library-py
dotnet fsi build.fsx quicktest-py

Then you can edit src/quicktest-py/quicktest.fsx and the quicktest.py file next to it will be updated (note the generated F# will be enclosed in the #fsharp delimiters so you can keep Python code outside and it won't be overwritten).
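As a rough illustration of the explicit-cache idea from the notes above: a cache keyed by the type's full name and owned by the caller, so its lifetime is visible and it can be cleared. This is only a sketch; `Encoder`, `EncoderCache`, `GetOrAdd` and `GeneratedCount` are hypothetical names and not Thoth.Json's actual API, and the generator is a stand-in for real reflection-based code.

```fsharp
open System
open System.Collections.Generic

// Hypothetical encoder shape; this is not Thoth.Json's actual API.
type Encoder = obj -> string

// Caller-owned cache: it lives as long as this object does and can be cleared explicitly.
type EncoderCache() =
    let cache = Dictionary<string, Encoder>()
    let mutable generated = 0

    // Stand-in for reflection-based encoder generation; counts how often it runs.
    let generate (t: Type) : Encoder =
        generated <- generated + 1
        fun value -> sprintf "%s(%O)" t.Name value

    member this.GeneratedCount = generated
    member this.Clear() = cache.Clear()

    // String keys give cheap hashing without relying on type object identity.
    member this.GetOrAdd(t: Type) : Encoder =
        match cache.TryGetValue(t.FullName) with
        | true, enc -> enc
        | _ ->
            let enc = generate t
            cache.[t.FullName] <- enc
            enc

let encoders = EncoderCache()
let enc1 = encoders.GetOrAdd(typeof<int>)
let enc2 = encoders.GetOrAdd(typeof<int>)

printfn "%s" (enc1 (box 1))             // Int32(1)
printfn "%d" encoders.GeneratedCount    // 1
```

Because the dictionary hashes the full-name string, a lookup only falls back to a full comparison for entries whose hash codes collide.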

dbrattli · 2022-01-20T19:34:10Z

Thanks for the good explanation @alfonsogarciacaro. I agree that caches (especially unbounded ones) are scary, particularly if the user does not know about them. So we need to figure out a better way to solve this or revert.

thautwarm added 2 commits January 15, 2022 22:51

Merge branch 'beyond' of https://github.com/fable-compiler/Fable into beyond

24a1d0e

cache approach

538ba8d

dbrattli changed the title ~~Typeinfo object identity equality~~ [Python] Typeinfo object identity equality Jan 15, 2022

dbrattli requested review from alfonsogarciacaro and dbrattli January 15, 2022 16:42

Merge branch 'beyond' into typeinfo-object-identity-equality

6c48437

dbrattli approved these changes Jan 18, 2022


dbrattli merged commit bc8d787 into fable-compiler:beyond Jan 18, 2022
