Polyglot does not allow ruby to add methods to Java classes. #3139

Open
hymanroth opened this issue Jun 29, 2023 · 9 comments
hymanroth commented Jun 29, 2023

As suggested by @eregon in this StackOverflow question https://stackoverflow.com/questions/76567012/extending-monkey-patching-java-classes-in-truffle-ruby, I am opening an issue here.

We are currently checking the feasibility of porting a large Java/JRuby project to Truffle/GraalVM, and we've hit a major problem with so-called monkey patching of Java classes in Ruby, which JRuby allows but Truffle does not. @eregon suggested that if the Java classes implement the "correct" interfaces, then the Polyglot API will convert additional Ruby methods (e.g. [] or []=) to Java calls automatically.

Unfortunately, this is not going to work for us, and I imagine it won't for other large JRuby projects looking to port either. The problem is as follows: over the course of 10 years and thousands of lines of code, the Ruby programmers generally didn't ask the Java programmers to add domain-specific methods to their Java classes. They just implemented them themselves by monkey patching the Java classes directly in JRuby. This was both an agile and a "clean" approach, because the Java class was kept generic and free from project-specific extensions.

The result is that for some Java classes we have ruby files with additional method definitions running to several hundred lines. There is no way we can justify refactoring the entire ruby code base to rework all of the references to these methods.

Finally, just to emphasize that the approach taken by Polyglot is elegant in theory but infeasible in practice, I will post the Polyglot HashTrait for [] and its analog in our monkey patch file:

Polyglot:

def [](key)
  Truffle::Interop.read_hash_value_or_default(self, key, nil)
end

Our legacy code:

def [](*keys)
  if keys.size == 2 && keys[0].instance_of?(Array)
    self.java_send :fgetObj, [java.lang.String[], java.lang.Object], keys[0].to_s.to_java(:string), keys[1]
  elsif keys.size > 1
    self.java_send :fgetObj, [java.lang.String[]], keys.to_java(:string)
  elsif keys[0].instance_of?(Array)
    self.java_send :fgetObj, [java.lang.String[]], keys[0].to_s.to_java(:string)
  else
    self.java_send :getObj, [java.lang.String], keys[0].to_s.to_java(:string) rescue nil
  end
end

We can see no way round this at the moment and so will continue to use JRuby for our ruby environment and will use polyglot for our Python implementation, which has much less legacy code.

eregon (Member) commented Jun 29, 2023

Thank you for reporting the issue.
We need to look at how JRuby supports that and whether we can support the same or something similar.

eregon (Member) commented Jun 29, 2023

I found these docs about that feature in JRuby:
https://github.com/jruby/jruby/wiki/CallingJavaFromJRuby#reopening-java-classes
https://kofno.wordpress.com/2007/02/24/monkey-patch-java-objects-from-jruby/

JRuby seems to create a Ruby Class for every Java Class (which is some memory overhead), when a Java object is accessed in Ruby:

irb(main):001:0> h=java.util.HashMap.new
=> #<Java::JavaUtil::HashMap: {}>
irb(main):002:0> h.class
=> Java::JavaUtil::HashMap
irb(main):003:0> h.class.ancestors
=> 
[Java::JavaUtil::HashMap,
 Java::JavaLang::Cloneable,
 Java::JavaIo::Serializable,
 Java::JavaUtil::AbstractMap,
 Java::JavaUtil::Map,
 Enumerable,
 Java::JavaLang::Object,
 ConcreteJavaProxy,
 JavaProxy,
 JavaProxyMethods,
 Object,
 PP::ObjectMixin,
 Kernel,
 BasicObject]

irb(main):016:0> java.lang.System.getProperties
=> #<Java::JavaUtil::Properties: {"java.specification.version"=>...

The interesting question is how JRuby maps from a Java class to a Ruby class.
That's probably a hashtable lookup or similar, which seems quite expensive when moving any object from Java to Ruby (as with java.lang.System.getProperties above).

From the ancestors output, it seems JRuby creates a real Ruby object for each Java object passed to Ruby, and forwards method calls via some proxying. In fact one can set instance variables on Java objects in Ruby land in JRuby, which most likely implies a real Ruby object to hold them.
Indeed:
https://github.com/jruby/jruby/blob/0635bb6605838c48c426db9b3cf351716eed0a16/core/src/main/java/org/jruby/java/proxies/ConcreteJavaProxy.java#L58
https://github.com/jruby/jruby/blob/0635bb6605838c48c426db9b3cf351716eed0a16/core/src/main/java/org/jruby/java/proxies/JavaProxy.java#L58
https://github.com/jruby/jruby/blob/0635bb6605838c48c426db9b3cf351716eed0a16/core/src/main/java/org/jruby/RubyObject.java#L77
https://github.com/jruby/jruby/blob/0635bb6605838c48c426db9b3cf351716eed0a16/core/src/main/java/org/jruby/RubyBasicObject.java#L136

In comparison, TruffleRuby (currently) does not create a Ruby class for a Java class; the Java class object is directly exposed to Ruby, and the same goes for Java instances. There is still a HostObject wrapper, which makes it possible to distinguish them from internal Java objects and to implement InteropLibrary.
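To make the proxy model described above concrete, here is a minimal sketch in plain Ruby. It is not JRuby or TruffleRuby code: `HostMap` stands in for a Java class, and `ProxyBase`, `HostMapProxy` and `unwrap` are hypothetical names chosen for illustration. It only shows the general shape: a real Ruby object wraps the host object, forwards unknown calls to it, can carry Ruby-side methods and instance variables, and can be unwrapped before being handed back to the host side.

```ruby
# Hypothetical sketch: plain Ruby stand-ins, not JRuby or TruffleRuby internals.

# Stands in for a Java object (in JRuby this could be a java.util.HashMap).
class HostMap
  def initialize
    @data = {}
  end

  def put(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end
end

# A real Ruby object that wraps the host object and forwards unknown calls.
class ProxyBase
  def initialize(host_object)
    @host_object = host_object
  end

  # The object that would be passed back to the host side, without the wrapper.
  def unwrap
    @host_object
  end

  def method_missing(name, *args, &block)
    if @host_object.respond_to?(name)
      @host_object.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @host_object.respond_to?(name, include_private) || super
  end
end

# One Ruby proxy class per host class; Ruby code can reopen it freely.
class HostMapProxy < ProxyBase
  def [](key)
    get(key.to_s) # forwarded to HostMap#get through method_missing
  end
end

m = HostMapProxy.new(HostMap.new)
m.put("a", 1)
m.instance_variable_set(:@note, "ruby-side state") # lives on the wrapper only
```

In this sketch the added `[]` method and the instance variable exist only on the Ruby wrapper; the object returned by `unwrap` is unchanged, which matches the wiki's statement that Ruby-side modifications are not visible to Java.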

hymanroth (Author) commented Jun 29, 2023

To be honest, I've never really had to think about how JRuby works behind the scenes. If a Ruby proxy class really is created for each Java class then I'm not sure how Java methods can then take as arguments these Java classes which have been extended in Ruby.

When speed is an issue (or a lot of data is involved) we often pass instances of Ruby-extended Java classes to pure Java methods and they work fine. Perhaps behind the scenes JRuby creates a wrapper around the Java instance and strips it away when passing the instance to pure Java methods. But that's just speculation on my part.

Here is some code which shows that Java "sees" native Java classes even if they have been extended in Ruby:

java_import java.util.HashMap

class HashMap
  def ruby_test()
    puts("Dummy code")
  end  
end  

m = HashMap.new
m.ruby_test()
puts(m.getClass())

=> 
Dummy code
class java.util.HashMap

eregon (Member) commented Jun 29, 2023

As the JRuby wiki says, modifications on the Ruby side are not visible to Java.
And the extra methods are just stored in a Ruby class.

Perhaps behind the scenes JRuby creates a wrapper around the Java instance and strips it away when passing the instance to pure Java methods.

Indeed.

The tricky part is creating a Ruby Class for every Java Class where an object of that Java Class goes from Java to Ruby, and the mapping Java Class->Ruby Class, which is probably quite expensive (we can't store a mutable Ruby Class in the Java Class, that would be incorrect, so it most likely needs to be a ConcurrentHashMap per Context or so). I still need to look into how JRuby does that.

eregon (Member) commented Jul 3, 2023

OK I found the ConcurrentHashMap used for proxy classes in JRuby:
https://github.com/jruby/jruby/blob/0635bb6605838c48c426db9b3cf351716eed0a16/core/src/main/java/org/jruby/util/collections/MapBasedClassValue.java
https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/javasupport/JavaSupport.java#L113
https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/javasupport/Java.java#L417-L421 (this also shows there is a second ConcurrentHashMap lookup)
It seems like JRuby can use ClassValue too, and then it creates a new ClassValue per JRuby instance, but that's disabled by default.

This all seems very expensive, especially since in TruffleRuby we do not wrap Java instances, we let Truffle interop handle all of that and generally do not wrap foreign objects at all. So that would mean we would need that ConcurrentHashMap/ClassValue lookup every time we compute the class of a Java object.
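As a rough illustration of the per-context mapping discussed above, the following plain-Ruby sketch keeps a thread-safe table from a host class to a lazily created Ruby class. `ProxyClassRegistry`, `proxy_class_for` and `host_class` are hypothetical names, a `Mutex`-guarded `Hash` stands in for a ConcurrentHashMap, and `String`/`Array` stand in for Java classes. It is a sketch under those assumptions, not how JRuby's code is written.

```ruby
# Hypothetical sketch in plain Ruby; names are illustrative only.
class ProxyClassRegistry
  attr_reader :lookups

  def initialize
    @classes = {}      # host class => Ruby proxy class
    @lock = Mutex.new  # stands in for ConcurrentHashMap's thread safety
    @lookups = 0       # counts how often the table is consulted
  end

  # Returns the Ruby class for host_class, creating it on first use.
  def proxy_class_for(host_class)
    @lock.synchronize do
      @lookups += 1
      @classes[host_class] ||= Class.new do
        define_singleton_method(:host_class) { host_class }
      end
    end
  end
end

registry = ProxyClassRegistry.new # one registry per context
string_proxy = registry.proxy_class_for(String)

# A monkey patch: stored on the Ruby proxy class, never on the host class.
string_proxy.class_eval do
  def shout
    "patched"
  end
end
```

Because the mutable Ruby class lives in the registry rather than on the host class, two registries (two contexts) get independent proxy classes for the same host class. The `lookups` counter makes the cost visible: every request for a class goes through the table.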

Maybe this is something we could support in a JRuby compatibility mode or similar, via an API separate from Polyglot. That API would then always wrap, closer to what JRuby does, so that at least the ConcurrentHashMap/ClassValue lookup is done once when a Java instance is shared with Ruby, and not on every access to a Java instance's Ruby Class.
Implementation-wise we might be able to use something like:

@ExportLibrary(value = InteropLibrary.class, delegateTo = "javaObject")
public final class WrappedJavaObject {
    private final RubyClass javaObject;
    private final RubyClass metaClass;
}

I think this kind of wrapping doesn't work for general interop with other languages: for instance, objects of other languages might not have a class name or a class/metaObject at all, it might not be OK to hold onto their class/metaObject, etc. Also, the class of a foreign object is currently computed dynamically based on its interop traits, which means it can change to reflect changes in the messages that object supports.

eregon self-assigned this Jul 3, 2023

eregon (Member) commented Jul 3, 2023

I will experiment with creating a Ruby Class for every foreign object which has a MetaObject (class), and see how well that works. It adds a hashmap lookup every time we ask/need the class of a foreign object, but we could inline-cache that.
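The inline-caching idea can be sketched in plain Ruby as a one-entry cache attached to a call site: it remembers the last meta object seen and skips the hash lookup when the same one shows up again. `InlineCache`, `lookup` and the counters are hypothetical names for illustration. This is not the Truffle DSL caching mechanism, only an analogy for why repeated lookups at the same site can be cheap.

```ruby
# Hypothetical sketch: a one-entry (monomorphic) inline cache in plain Ruby.
class InlineCache
  UNSET = Object.new # sentinel so that a nil key is not mistaken for a hit

  attr_reader :hits, :misses

  def initialize(&slow_lookup)
    @slow_lookup = slow_lookup
    @cached_key = UNSET
    @cached_value = nil
    @hits = 0
    @misses = 0
  end

  def lookup(key)
    # Fast path: an identity check instead of a hash lookup.
    if @cached_key.equal?(key)
      @hits += 1
      return @cached_value
    end

    # Slow path: consult the global table and remember the result.
    @misses += 1
    @cached_value = @slow_lookup.call(key)
    @cached_key = key
    @cached_value
  end
end

# The global table from meta object to Ruby class (String is a stand-in).
ruby_classes = Hash.new { |table, meta| table[meta] = Class.new }

call_site = InlineCache.new { |meta| ruby_classes[meta] }
3.times { call_site.lookup(String) } # one miss, then two hits
```

A single-entry cache only pays off at call sites that mostly see one class. A site that alternates between classes would miss every time, so a real implementation would likely need a small multi-entry cache plus the global table as fallback.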

hymanroth (Author) commented

I like the idea of having a JRuby compatibility mode. That would make it much less problematic for projects to migrate. Also, I would not worry too much about performance; the main benefit we see from accessing Java from Ruby is the availability of a huge range of very solid and performant Java libraries. In our projects we use Ruby mainly for business logic (concise and highly readable) and Java for the heavy lifting.

It might be worth considering wrapping Java objects in a Ruby object on a lazy basis, i.e. only when they have been extended in Ruby. This means there would be hardly any impact on performance (apart from an initial, not too expensive lookup to check whether a given Java class has been extended), so in situations where performance is crucial, users must simply remember to keep their Java classes "pure". Behind the scenes, pure Java methods would call the Polyglot API directly.
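A minimal plain-Ruby sketch of this lazy-wrapping idea follows. All names (`LazyWrapping`, `extend_host_class`, `to_ruby`, `Wrapper`) are hypothetical, and `Account` and `Ledger` are stand-ins for Java classes. Objects whose class has no registered Ruby extension pass through untouched; only extended classes get a wrapper.

```ruby
# Hypothetical sketch: wrap host objects only when their class was extended.
module LazyWrapping
  EXTENSIONS = {} # host class => Module holding the Ruby-side methods

  # Forwards everything it does not define itself to the wrapped host object.
  class Wrapper
    attr_reader :host_object

    def initialize(host_object, extension)
      @host_object = host_object
      extend(extension) # mix the Ruby-side methods into this wrapper
    end

    def method_missing(name, *args, &block)
      return super unless @host_object.respond_to?(name)
      @host_object.public_send(name, *args, &block)
    end

    def respond_to_missing?(name, include_private = false)
      @host_object.respond_to?(name, include_private) || super
    end
  end

  # Registers Ruby-side methods for a host class (the "monkey patch").
  def self.extend_host_class(host_class, &body)
    (EXTENSIONS[host_class] ||= Module.new).module_eval(&body)
  end

  # Called when a host object crosses into Ruby: one table lookup,
  # and a wrapper only if the class has been extended.
  def self.to_ruby(host_object)
    extension = EXTENSIONS[host_object.class]
    extension ? Wrapper.new(host_object, extension) : host_object
  end
end

class Account # stand-in for a Java class that Ruby code extends
  def balance
    100
  end
end

class Ledger # stand-in for a Java class that is kept "pure"
end

LazyWrapping.extend_host_class(Account) do
  def rich?
    balance > 50 # balance is forwarded to the wrapped Account
  end
end
```

Note that this sketch looks up only the exact class of the object, so an extension registered on a superclass or interface would not apply to subclasses. JRuby's ancestor chain shown earlier in the thread suggests a real implementation would have to handle inheritance as well.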

If you need any help in testing let me know.

eregon (Member) commented Jul 10, 2023

I have a very early, unoptimized proof of concept at #3155 that provides a Ruby class based on the qualified name of the foreign meta object.
It needs more work, including a cache from foreign meta object to Ruby class (both a global cache and an inline cache).

headius (Contributor) commented Aug 16, 2023

This issue came up in my search for JRuby-related bug reports.

@hymanroth If you are having issues with JRuby, please let us know. It is still actively developed and we focus on user issues before just about everything else.
