Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic deserializer should extract type information from generic boundary #2563

Open
msevcenko opened this issue Dec 4, 2023 · 15 comments · May be fixed by #2571 or #2573
Open

Generic deserializer should extract type information from generic boundary #2563

msevcenko opened this issue Dec 4, 2023 · 15 comments · May be fixed by #2571 or #2573
Labels

Comments

@msevcenko
Copy link

Gson version

2.10.1

Java / Android version

17

Description

If field is declared with static type, the type is correctly used during deserialization. If type is generic with boundary, the boundary information is not used, it is presumably deserialized as object, that is usually as map.

Expected behavior

Generic type with boundary should deserialize similarly as static type.

Actual behavior

Boundary type information is ignored

Reproduction steps

	public class Foo {
		private final String text;

		public Foo(String text) {
			this.text = text;
		}
		
		public String getText() {
			return text;
		}
	}
	
	public class BarStatic {
		private final Foo foo;

		public BarStatic(Foo foo) {
			this.foo = foo;
		}

		public Foo getFoo() {
			return foo;
		}
		
	}

	public class BarDynamic<T extends Foo> {
		private final T foo;

		public BarDynamic(T foo) {
			this.foo = foo;
		}

		public T getFoo() {
			return foo;
		}
		
	}

	@Test
	public void testSerialize() {
		Gson gson = new GsonBuilder().create();
		
		BarStatic barStatic = new BarStatic(new Foo("foo"));
		String barJsonStatic = gson.toJson(barStatic);
		
		BarDynamic<Foo> barDynamic = new BarDynamic<>(new Foo("foo"));
		String barJsonDynamic = gson.toJson(barDynamic);
		
		Assert.assertEquals(barJsonStatic, barJsonDynamic);

		BarStatic barStaticDeserialized = gson.fromJson(barJsonStatic, BarStatic.class);
		Assert.assertEquals(barStaticDeserialized.getFoo().getText(), "foo");

		// A problem!
		BarDynamic<Foo> barDeserialized = gson.fromJson(barJsonDynamic, BarDynamic.class);
		
	}
@msevcenko msevcenko added the bug label Dec 4, 2023
@eamonnmcmanus
Copy link
Member

Some clarifying questions. Not wishing to nitpick about language, but when you say "boundary" I think you mean "bound"? So we're talking about the upper bound of a generic type, which is the extends Foo in <T extends Foo>. Then, I think the issue is that you would expect the foo property to be deserialized as an instance of Foo, but actually you get an exception like this:

java.lang.IllegalArgumentException: Can not set final com.google.gson.internal.WhateverTest$Foo field com.google.gson.internal.WhateverTest$BarDynamic.foo to com.google.gson.internal.LinkedTreeMap
        at java.base/jdk.internal.reflect.FieldAccessorImpl.throwSetIllegalArgumentException(FieldAccessorImpl.java:228)
        at java.base/jdk.internal.reflect.FieldAccessorImpl.throwSetIllegalArgumentException(FieldAccessorImpl.java:232)
        at java.base/jdk.internal.reflect.MethodHandleObjectFieldAccessorImpl.set(MethodHandleObjectFieldAccessorImpl.java:115)
        at java.base/java.lang.reflect.Field.set(Field.java:836)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.readIntoField(ReflectiveTypeAdapterFactory.java:251)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$FieldReflectionAdapter.readField(ReflectiveTypeAdapterFactory.java:531)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:489)
        at com.google.gson.Gson.fromJson(Gson.java:1370)
        at com.google.gson.Gson.fromJson(Gson.java:1260)
        at com.google.gson.Gson.fromJson(Gson.java:1169)
        at com.google.gson.Gson.fromJson(Gson.java:1105)

(It would have helped if you had included this information in the bug report.)

I'm not sure if this is a bug in Gson or not. There is in any case a workaround: instead of

gson.fromJson(barJsonDynamic, BarDynamic.class)

use

gson.fromJson(barJsonDynamic, new TypeToken<BarDynamic<Foo>>() {})

@msevcenko
Copy link
Author

msevcenko commented Dec 4, 2023

Sorry for the mistype, of course type bound was what I meant. Yes this is the exception - the object is deserialized as generic map, because the information about the type is not passed to that json node for deserialization.

I don't think the workaround would be useful, I would expect that the static and dynamic types could be used interchangeably inside nested structures, when type information cannot be passed from the outside as you suggest.

For instance, in my case I would need to deserialize a List of BarDynamic instances, and there is also a polymorphism involved, so I need also RuntimeTypeAdapterFactory. But even the RuntimeTypeAdapterFactory needs to get information about the base type being deserialized.

I can deserialize List<BarStatic> instances (even with polymorphism, if Foo has subclasses), but if I put Foos inside generic wrapper (BarDynamic), it won't work, and I would expect it to work. The information about the target type is there, it is just harder to extract.

I would speculate that alternative frameworks like jackson can handle this situation, but I might be wrong. I'll try it.

@msevcenko
Copy link
Author

I tried similar scenario in jackson and it indeed works, out of the box.

You just need to add jackson annotations, and I also noticed that I did the classes non-static, which surprisingly wasn't problem for gson, but was for jackson, so I made them static.

	public static class Foo {
		private final String text;

		@JsonCreator
		public Foo(@JsonProperty("text") String text) {
			this.text = text;
		}
		
		public String getText() {
			return text;
		}
	}

	public static class BarDynamic<T extends Foo> {
		private final T foo;

		@JsonCreator
		public BarDynamic(@JsonProperty("foo") T foo) {
			this.foo = foo;
		}

		public T getFoo() {
			return foo;
		}
		
	}

	@Test
	public void testSerialize() throws JsonProcessingException {
		BarDynamic<Foo> barDynamic = new BarDynamic<>(new Foo("foo"));
		
		ObjectMapper commonMapper = new ObjectMapper();
		
		String barJsonDynamic = commonMapper.writeValueAsString(barDynamic);
		BarDynamic<?> barDynamicDeserialized = commonMapper.readValue(barJsonDynamic, BarDynamic.class);

		Assert.assertEquals("text", barDynamicDeserialized.getFoo().getText());
	}

There is no problem for jackson to deserialize the dynamic wrapper.

After all, if I write in Java

var foo = barDynamic.getFoo();

the compiler infers the type of foo as Foo, so does jackson. As I said, the information about the type is there.

So that's my argument why I do expect this scenario to work.

@msevcenko
Copy link
Author

msevcenko commented Dec 4, 2023

I thought that I could work around with the following type adapter factory:

	public static class ResolveGenericBoundFactory implements TypeAdapterFactory  {

		@SuppressWarnings("unchecked")
		@Override
		public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
			System.out.println(type);
			if (type.getType() instanceof TypeVariable<?> tv) {
				Type[] bounds = tv.getBounds();
				if (bounds.length == 1 && bounds[0] != Object.class) {
					Type bound = bounds[0];
					return (TypeAdapter<T>) gson.getAdapter(TypeToken.get(bound));
				}
			}
			return null;
		}
		
	}

and I actually did, but only form patched gson library. The problem is, that the high-priority ObjectTypeAdapter, rather greedily processes the TypeVariable token, on the condition that TypeToken.getRawType() is Object.class, which it is.

I suppose there is not any chance to get my type adapter factory before ObjectTypeAdapter.

@Marcono1234
Copy link
Collaborator

Marcono1234 commented Dec 4, 2023

Seems to be the same issue as #1741, and the analysis there seems to be still correct: The internal method $Gson$Types.getRawType(Type) ignores the bounds of a TypeVariable.

I assume this could be changed to instead return (recursively) $Gson$Types.getRawType(typeVariable.getBounds()[0]). However, to avoid cryptic runtime exceptions it might be good if ReflectiveTypeAdapterFactory then checked the type variable for the corner cases described in #1741 (comment), and threw an exception when the user tries to deserialize such a type variable (serialization is most likely fine though). Edit: That would probably only catch fields of type T, but not List<T> or similar. So would either have to accept that for type variables with multiple bounds ClassCastExceptions might still be possible, or TypeToken should on construction throw an exception for type variables with multiple bounds (?).

I also noticed that I did the classes non-static, which surprisingly wasn't problem for gson

Yes, Gson supports this by default, but it has its downsides, see GsonBuilder.disableJdkUnsafe().

@msevcenko
Copy link
Author

Many thanks for the reference @Marcono1234 . So it is actually issue of the TypeToken, interesting.

I don´t understand how this could be worked around using custom type adapter factory, as mine never gets a chance to handle the variable token. It is handled by the ObjectTypeAdapter, just because the TypeToken actually behaves as Object.class type as no attempt was made to decode the bound.

@Marcono1234
Copy link
Collaborator

Marcono1234 commented Dec 4, 2023

To clarify, my previous comment was meant as potential solution for how this can be fixed in Gson.

But I think you are right, this can not be worked around using a custom TypeAdapterFactory since the raw type is Object and Gson's built-in factory has higher precedence. This does not apply when using Gson's @JsonAdapter on the field, but I assume it is too cumbersome to place @JsonAdapter on all affected fields.

@msevcenko
Copy link
Author

To clarify, my previous comment was meant as potential solution for how this can be fixed in Gson.

Ok, understood. I was just trying to reuse the proposed solution to provide general workaround.

You are right that @JsonAdapter solves the problem. It is not a great solution as you remarked, but good enough for me - there are not so many fields with generic variable type, and I have control over the code so I can put annotation there.

I would however vote to at least fix the filter of the ObjectTypeAdapter which seems to be too loose. Which together with the fact that ObjectTypeAdapter is not the last resort adapter (which I would expect) prevents to handle generic variable type with custom adapters.

@msevcenko
Copy link
Author

msevcenko commented Dec 5, 2023

So, no :(

The workaround can be used only for situation when the type variable is used standalone. When it is used in another expression (such as collection), there is no way how to hint gson to use custom @JsonAdapter instead of ObjectTypeAdapter for that type :(

So back to square one. Without letting the type variable pass through ObjectTypeAdapter, it is not possible to transport values involving type variables, in expressions.

Unit test:

	public static class Foo {
		private final String text;

		public Foo(String text) {
			this.text = text;
		}
		
		public String getText() {
			return text;
		}
	}
	
	public static class BarDynamic<T extends Foo> {
		// this prevents ObjectTypeAdapter from converting this type variable to object
		@JsonAdapter(ResolveGenericBoundFactory.class)
		private final T foo;

		// however, it is not possible here, so items are deserialized as objects
		private final List<T> foos;

		public BarDynamic(T foo, List<T> foos) {
			this.foo = foo;
			this.foos = foos;
		}

		public T getFoo() {
			return foo;
		}
		
		public List<T> getFoos() {
			return foos;
		}
	}

	public static class ResolveGenericBoundFactory implements TypeAdapterFactory  {

		@SuppressWarnings("unchecked")
		@Override
		public <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type) {
			if (type.getType() instanceof TypeVariable<?> tv) {
				Type[] bounds = tv.getBounds();
				if (bounds.length == 1 && bounds[0] != Object.class) {
					Type bound = bounds[0];
					return (TypeAdapter<T>) gson.getAdapter(TypeToken.get(bound));
				}
			}
			return null;
		}	
	}

	@Test
	public void testSerialize() throws JsonProcessingException {
		Gson gson = new GsonBuilder().registerTypeAdapterFactory(new ResolveGenericBoundFactory()).create();
		
		BarDynamic<Foo> barDynamic = new BarDynamic<>(new Foo("foo"), List.of(new Foo("item")));
		String barJsonDynamic = gson.toJson(barDynamic);
		
		Assert.assertEquals("{\"foo\":{\"text\":\"foo\"},\"foos\":[{\"text\":\"item\"}]}", barJsonDynamic);

		// A problem!
		BarDynamic<Foo> barDeserialized = gson.fromJson(barJsonDynamic, BarDynamic.class);
		
		// the deserialization seemingly succeeded, but caused "heap pollution", later leading to class cast exception
		
		Assert.assertTrue( 
			Assert.assertThrows(ClassCastException.class, () -> {
				Foo item = barDeserialized.getFoos().get(0);
			}).getMessage().startsWith("class com.google.gson.internal.LinkedTreeMap cannot be cast to class "));
		
	}

@msevcenko
Copy link
Author

msevcenko commented Dec 6, 2023

So to conclude, I would suggest the following changes.

  1. as noted above and in the comment Unsatisfactory handling of type variables #1741 (comment) , I would approximate a TypeVariable type with bound type, not Object type.
@@ -145,9 +145,11 @@ public final class $Gson$Types {
       Type componentType = ((GenericArrayType) type).getGenericComponentType();
       return Array.newInstance(getRawType(componentType), 0).getClass();
 
-    } else if (type instanceof TypeVariable) {
-      // we could use the variable's bounds, but that won't work if there are multiple.
-      // having a raw type that's more general than necessary is okay
+    } else if (type instanceof TypeVariable<?> tv) {
+    	// we could use the variable's bounds, but that won't work if there are multiple.
+    	Type[] bounds = tv.getBounds();
+    	if (bounds.length == 1)
+    		return getRawType(bounds[0]);
       return Object.class;
 
     } else if (type instanceof WildcardType) {

rationale: it is a better approximation, and there is a chance that the deserialization would produce consistent data structure.

When approximating non-object type with Object, it is almost certain that the result is incorrect, leading to CCE during deserialization, or during accessing a polluted collection.

  1. I discovered another related problem:

In TypeAdapterRuntimeTypeWrapper the serialization sometimes replaces the declared type, with runtime type. The rule is that declared is used only when ReflectiveTypeAdapterFactory.Adapter is used for serialization of the value, i.e. default serialization is used, not custom.

I think this exception should be extended also for ReflectiveTypeAdapterFactory.RecordAdapter, which is only a different type of default serializer.

Now, when I serialize polymorphic type, whose implementations are records, the serialization fails to include type information needed to deserialize. This works fine when implementations are classes, as they are deserialized with ReflectiveTypeAdapterFactory.Adapter and the rule applies.

EDIT: sorry this turned out to be a false alarm.

I checked that these two modifications solve most of my problems which deserialization of generic structures.

What do you think?

@Marcono1234
Copy link
Collaborator

Marcono1234 commented Dec 6, 2023

For 1) maybe you could omit the bounds.length == 1 check and even return the first bound if there are > 1 bounds. That might improve results for serialization, and for deserialization it either works (for example when an adapter by accident happens to create instances which satisfy all bounds), or still fails with an exception at runtime.
But I am not completely sure. I am not a direct member of this repository though so I don't make the final decision how (or if) this should be changed.

For 2) maybe we should track that in a separate GitHub issue since this does not seem to be directly related to type variables, right? Though could you please provide a short example demonstrating where this makes a difference? It seems RecordAdapter extends Adapter, so I think your code change should not actually make a difference.

@eamonnmcmanus
Copy link
Member

@msevcenko would you be up for making a proof-of-concept PR? What I often do is run a proposed change past all of Google's internal tests for projects that use Gson (which is a lot of them). That tends to show up issues we might not have thought of. If the change doesn't break things then it does seem like it would make sense. At that point we'd want to clean things up, with thorough unit tests and the like.

I agree with @Marcono1234 that the record thing seems like a separate issue. Are you able to make some code where it does one thing and you think it should do another? That would be the basis for a new issue, and also a starting point for the test associated with a fix.

@msevcenko
Copy link
Author

For 1) maybe you could omit the bounds.length == 1 check and even return the first bound if there are > 1 bounds. That might improve results for serialization, and for deserialization it either works (for example when an adapter by accident happens to create instances which satisfy all bounds), or still fails with an exception at runtime.
But I am not completely sure.

I am not sure too. I just wanted to be conservative. You are right that giving up on a bound for whatever reason most certainly makes thinks worse. Since gson does not attempt to detect problems early and goes with Object anyway, with accepting any bound you probably won't make a mistake.

For 2) maybe we should track that in a separate GitHub

True, it is only loosely related, I'll file another report with separate unit test. I believe it is reproducible without fixing the first problem.

@msevcenko
Copy link
Author

@msevcenko would you be up for making a proof-of-concept PR?

I'll try to do that. I'll replace the instanceof variable as it is java 14 construct.

msevcenko added a commit to msevcenko/gson that referenced this issue Dec 7, 2023
msevcenko added a commit to msevcenko/gson that referenced this issue Dec 7, 2023
@msevcenko msevcenko linked a pull request Dec 7, 2023 that will close this issue
@msevcenko
Copy link
Author

msevcenko commented Dec 7, 2023

It seems RecordAdapter extends Adapter, so I think your code change should not actually make a difference.

Yes indeed. I must have made some mistake when experimenting with modified gson, ReflectiveTypeAdapterFactory.RecordAdapter is instanceof ReflectiveTypeAdapterFactory.Adapter. The change has no effect and is not needed.

@msevcenko msevcenko linked a pull request Dec 11, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants