Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue/2563 #2571

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Issue/2563 #2571

wants to merge 11 commits into from

Conversation

msevcenko
Copy link

Fixes #2563

Purpose

The purpose is to fix #2563 as described therein.

Description

See discussion in #2563. See InferenceFromTypeVariableTest for demo.

Copy link

google-cla bot commented Dec 7, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

to testGenericWrapperWithBoundDeserialization
@Marcono1234
Copy link
Collaborator

Sorry, as seen by the test failures, my suggestion to just change $Gson$Types.getRawType does not work.
In isolation, I think this would have been the correct fix1. But in combination with the other Gson code there seem to be at least two issues:

  1. As seen by the test failures, Gson might now try to construct reflective adapters for the type variable bound, which can fail, even though in the end only serialization is performed and TypeAdapterRuntimeTypeWrapper would not even choose the reflective adapter.
    The behavior was the same before, except that Gson constructed ObjectTypeAdapter before (where construction did not fail) but then TypeAdapterRuntimeTypeWrapper ignored that adapter and preferred the adapter for the runtime type as well.
  2. ObjectTypeAdapter.write(JsonWriter, Object) apparently has also logic to prefer the adapter for the runtime type.
    With this change here, ObjectTypeAdapter is not used anymore, so maybe (?) there are cases now where serialization will be performed using the adapter for the compile-time type instead of the runtime type.

I am not sure at the moment how / if these issues can be solved.
For 1) maybe ReflectiveTypeAdapterFactory should defer the creation of the actual adapter until it is actually used. But it might not be easy to implement, or not worth the effort. It would also remove the fail-fast behavior which can be useful in cases where the reflective adapter would actually have been used.
For 2) that might not actually be a problem (or only for some corner cases) because often Gson uses TypeAdapterRuntimeTypeWrapper anyway, which in most cases would also prefer the adapter for the runtime type.

Any other ideas or thoughts on this are welcome!

Footnotes

  1. Possibly even with an additional change to the type resolution logic so that for example for class MyClass<T extends List<Number>, the raw MyClass becomes MyClass<List<Number>>.

@msevcenko
Copy link
Author

msevcenko commented Dec 8, 2023

Thanks for the analysis @Marcono1234 .

As seen by the test failures

Well, it seems to me that there is only some problem with Enums? And the main problem is, that what is returned by the modified getRawType, is not later intercepted by ENUM_FACTORY type adapter factory, and defaults to ReflectiveTypeAdapterFactory. Let me see what exactly happened there and if it can be rectified.

But the morale of this story is, that something apparently relied on the arguably incorrect behavior of the getRawType method, namely returning ObjectTypeAdapter for <T extends Enum> type variable.

So it is a breaking change, and may break more things.

@msevcenko
Copy link
Author

Thats interesting.

It seems to me, that TypeAdapters.ENUM_FACTORY is designed to handle enum types, but only specific types, not raw Enum.class.

          if (!Enum.class.isAssignableFrom(rawType) || rawType == Enum.class) {
            return null;
          }

but, it seems that ObjectTypeAdapter is handling the raw Enum.class, somehow. And this relied on the former behavior. Using eg EnumSet, was looking for adapter for <T extends Enum>, which returned ObjectTypeAdapter. Improving the type match returned TypeToken with Enum.class, which was intercepted by neither ObjectTypeAdapter nor ENUM_FACTORY.

I think we can fix this by adding a rule that ObjectTypeAdapter can handle both Object.class, and raw Enum.class - if it is handling Enum.class anyway, but only under the assumption that type variable with Enum bound is resolved as Object.

@msevcenko
Copy link
Author

So what I did, I added binding between raw enum type, and ObjectTypeAdapter. It is not directly in the ObjectTypeAdapter, so it is overridable by the user.

Note that there was no such binding, so you couldn't even serialize classes containing raw enum fields.

With this binding, it works as intended. So for instance, testEnumSet, previously failed to even construct the deserializer. This was because raw enum type from the EnumSet declaration, was bound to ReflectiveTypeAdapterFactory, which failed to analyze the bound type (raw Enum).

After the fix, raw enum is bound to ObjectTypeAdapter, as it was before this MR, but via the rule that for type variables, bound is ignored.

@Marcono1234
Copy link
Collaborator

Well, it seems to me that there is only some problem with Enums?

No, it could affect any case where the bound of the variable has no type adapter and construction of the reflective adapter fails. Here is a contrived example where the same occurs. Before the changes to getRawType this prints {"buffer":[1,2]}; after the changes it throws an exception because it tries to access the internal fields of java.nio.Buffer.

static class BufferHolder<T extends java.nio.Buffer> {
  final T buffer;

  public BufferHolder(T buffer) {
    this.buffer = buffer;
  }
}

@Test
public void test() {
  BufferHolder<?> holder = new BufferHolder<>(ByteBuffer.wrap(new byte[] {1, 2}));
  Gson gson = new GsonBuilder()
      .registerTypeHierarchyAdapter(ByteBuffer.class, new JsonSerializer<ByteBuffer>() {
        @Override
        public JsonElement serialize(ByteBuffer src, Type typeOfSrc, JsonSerializationContext context) {
          return context.serialize(src.array());
        }
      })
      .create();
  System.out.println(gson.toJson(holder));
}

I am not sure though how common such cases are. Éamonn is able to run changes against internal Google tests to see if / how many projects are affected, but this would just give an estimation, other users might still be affected.

If you want you can still continue work on this, but just as warning that maybe in the end this pull request could be rejected if the fix for this is not considered worth the (unintentional) behavior changes.

it seems that ObjectTypeAdapter is handling the raw Enum.class, somehow

It looks like the unit tests only covered serialization of the raw Enum class (implicitly through EnumMap and EnumSet), and there the ObjectTypeAdapter.write method will then look up the adapter for the runtime type, which is the proper enum adapter.

Comment on lines +1034 to +1048
public static final TypeAdapterFactory RAW_ENUM_FACTORY =
new TypeAdapterFactory() {
@Override
public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> typeToken) {
Class<? super T> rawType = typeToken.getRawType();
if (rawType == Enum.class) {
@SuppressWarnings("unchecked")
TypeAdapter<T> adapter =
(TypeAdapter<T>) new ObjectTypeAdapter(gson, ToNumberPolicy.DOUBLE);
return adapter;
} else {
return null;
}
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this might be functionally equivalent to the previous behavior, I think if we are going to add an adapter for java.lang.Enum anyway, then we should implement it properly and have it behave explicitly as desired (instead of implicitly relying on ObjectTypeAdapter's behavior).

Something like this should work:

public static final TypeAdapterFactory ENUM_BASE_CLASS_FACTORY = new TypeAdapterFactory() {
  @Override
  public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
    if (type.getRawType() != Enum.class) {
      return null;
    }

    return new TypeAdapter<T>() {
      @Override
      public T read(JsonReader in) throws IOException {
        throw new UnsupportedOperationException("Deserializing base class java.lang.Enum is not supported;"
            + " only deserialization of subclasses is supported");
      }

      @Override
      public void write(JsonWriter out, T value) throws IOException {
        if (value == null) {
          out.nullValue();
          return;
        }

        @SuppressWarnings("unchecked")
        TypeAdapter<T> delegate = (TypeAdapter<T>) gson.getAdapter(value.getClass());
        delegate.write(out, value);
      }
    };
  }
};

@msevcenko
Copy link
Author

msevcenko commented Dec 11, 2023

Ok, it looks like this is not a safe option how to solve the issue.

Please have a look at alternative approach filed as another PR #2573 .

This is still not 100% backward compatible, but can be made so by implementing it as an opt-in feature, with some extra work.

I checked that this would be sufficient to solve my issue - all the usecases I need to solve in my project do work with this fix.

@Marcono1234
Copy link
Collaborator

Sorry for the late response and thank you for your continued work on this!

And disclaimer again that I am not the one who will make the final decision on this.

The example I provided in #2571 (comment) was really contrived, maybe that hypothetical problem described there would not actually affect real projects. Personally I still feel that the approach taken in this PR here is the right and a clean approach.
While the changes in #2573 might solve your specific use case, it requires an additional TypeAdapterFactory for that (ResolveGenericBoundFactory in your example). And the code changes in ObjectTypeAdapter and registering the adapter twice with different configuration feels more like a workaround than a proper solution, and it makes ObjectTypeAdapter more complex and error-prone.

As mentioned in #2571 (comment), maybe another solution could be to have ReflectiveTypeAdapterFactory create a lazy adapter which only creates the real adapter when it is actually used. This would also go in the same direction proposed by #2143, and might have a positive effect on performance.
Though it might also have disadvantages, such as:

  • adapter creation not failing fast anymore
  • delaying creation of the real adapter might cause unpredictable bad performance for the first action which uses these adapters (e.g. deserializing a POST request), instead of predictable at creation time of Gson instance
  • making the ReflectiveTypeAdapterFactory implementation more complex

I have created a hacky proof of concept here to demonstrate what I meant: https://github.com/Marcono1234/gson/commits/marcono1234/type-var-bound-raw-type/
But I am not sure if that is a good approach.

Sorry for this uncertainty regarding this issue and your pull requests. At least to me the current handling of type variables seems to be a valid issue, but I am not sure what the best approach here would be to fix this.

@msevcenko
Copy link
Author

Hello @Marcono1234 , thank you for your feedback.

The example I provided in #2571 (comment) was really contrived, maybe that hypothetical problem described there would not actually affect real projects.

May be it is hypothetical, may be it is not. I don't know who will be the judge for that, definitely not me :) Anyway, it looks like the change is more dangerous and potentially breaking than I previously thought, so I am no longer very confident this approach is doable. We can kindly ask to run this PR against google code base as some indication, how many things it would break.

Personally I still feel that the approach taken in this PR here is the right and a clean approach.
While the changes in #2573 might solve your specific use case, it requires an additional TypeAdapterFactory for that (ResolveGenericBoundFactory in your example). And the code changes in ObjectTypeAdapter and registering the adapter twice with different configuration feels more like a workaround than a proper solution, and it makes ObjectTypeAdapter more complex and error-prone.

#2573 definitely is a workaround. I am not very happy about it, but I think that this might be the best what we can do, in terms of backward compatibility. I don't think it contributes too much to complexity or error-prone-ness to the existing logic.

As mentioned in #2571 (comment), maybe another solution could be to have ReflectiveTypeAdapterFactory create a lazy adapter which only creates the real adapter when it is actually used.

Sounds like a big change to me. I am definitely not able to assess whether you can do this safely without breaking something already on top of this logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generic deserializer should extract type information from generic boundary
2 participants