Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ProGuard default shrinking rules #2448

Merged
merged 11 commits into from
Aug 22, 2023

Conversation

sgjesse
Copy link
Contributor

@sgjesse sgjesse commented Jul 26, 2023

This change deals correctly with classes without a no-args constructor. If a no-args constructor was not present the conditional keep rule would not trigger.

See https://issuetracker.google.com/150189783#comment11.

Copy link
Member

@eamonnmcmanus eamonnmcmanus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like ShrinkingIT needs to be updated?

@sgjesse
Copy link
Contributor Author

sgjesse commented Jul 26, 2023

Yes, and that makes sense. However, I cannot reproduce locally. If I run mvn clean verify I get 7 failures, but not the one on ShrinkingIT. What am I missing?

eamonnmcmanus added a commit to eamonnmcmanus/gson that referenced this pull request Jul 26, 2023
@eamonnmcmanus
Copy link
Member

Yes, and that makes sense. However, I cannot reproduce locally. If I run mvn clean verify I get 7 failures, but not the one on ShrinkingIT. What am I missing?

I was seeing the same thing with JDK 11 on Google's internal Linux. #2450 should fix it. With that PR, I am able to reproduce the test failure in that environment.

@Marcono1234
Copy link
Collaborator

This case of using <init>() vs. <init>(...) was what we were discussing in #2420 (comment) I think.

Out of curiosity, could you please explain a bit why you think it is needed?

In the end it is still Éamonn's decission whether this is included, but I am a bit afraid that with this change we contribute to people unwittingly relying on Unsafe, while we would have the chance here to indicate to them that their class is misconfigured (missing a no-args constructor).

@Marcono1234
Copy link
Collaborator

Marcono1234 commented Jul 26, 2023

If I run mvn clean verify I get 7 failures, but not the one on ShrinkingIT. What am I missing?

Just to be sure, you run this in the parent directory, right? ShrinkingIT is part of a separate Maven module, so it won't be executed if you run inside /gson/gson.

Edit: Nevermind, sorry the reason is the other test failures both of you mentioned. Probably due to that the Maven build already fails for the gson module and doesn't reach the shrinker-test module which contains ShrinkingIT.

@sgjesse
Copy link
Contributor Author

sgjesse commented Jul 27, 2023

After testing with the new configuration I realized that with just <init>() the conditional -if rule would not actually match classes without a no-args constructor - which was also what we discussed and decided on. However, for common code patterns data classes will not have a no-args constructor, e.g.:

Java

public class Data {
  @SerializedName("s")
  private final String s;
  public Data(String s) { this.s = s; }
}

Kotlin

data class Data(
  @field:SerializedName("s") val s: String
)

To get no-args constructors requires effectively assigning default values to all fields for these patterns (either directly for the field or assigning in an explicit no-args constructor), which I don't think is feasible to require, and then the provided rules will not work for many developers.

I is still an option for developers to add no-args constructors to not use sun.misc.Unsafe, and if present they will still be kept.

@sgjesse
Copy link
Contributor Author

sgjesse commented Jul 27, 2023

Tests updated, please review.

Copy link
Contributor

@sfreilich sfreilich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three other issues stand out to me here:

  • There are two rules about classes with @SerializedName fields, which I think could be consolidated into one -keepclasseswithmembers rule?
  • More of these -keep/-keepclassmembers declarations could use allowobfuscation, which should work fine with reflection that refers to things by symbol instead of string name (including these annotations).
  • Possibly this should have -keep,allowobfuscation class rules for the annotations used here. I think that can matter when there are multiple optimization passes? Not 100% sure, basing this on other examples that follow that pattern. Not sure if that would be best to do with a rule that matches com.google.gson.annotations.* or just JsonAdapter and SerializedName specifically.

@@ -65,6 +65,6 @@
# See also https://github.com/google/gson/pull/2420#discussion_r1241813541 for a more detailed explanation
-if class *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this if class *, -classeswithmembers... class <1> doesn't do anything relative to -classeswithmembers... class * , after reading some of the follow-up threads on that change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for R8 it does seem to make a difference: -keepclasseswithmembers ... class * always keeps the class, even if it isn't used. Whereas -if class * ... -keepclasseswithmembers ... class <1> only keeps the class if it is actually used.

Unfortunately this is not covered by the tests yet; have created #2455 to add a test for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks also for the other points you raised, but unless sgjesse wants to address them here as well could you please create a separate GitHub issue or pull request? Otherwise the discussion on this pull request here might drift too far away from the original topic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the rules to be two -if rules

  1. Keep all @SerializedName fields for classes which are present after shrinking
  2. If a class with a @SerializedName field is present keep its no-args constructor

@sfreilich
Copy link
Contributor

Possibly this should have -keep,allowobfuscation class rules for the annotations used here.

Realized this bit probably doesn't matter, I was confused by examples for annotations that are just referenced by proguard config rules, and not actually otherwise referenced in the code.

@Marcono1234 Marcono1234 mentioned this pull request Jul 29, 2023
9 tasks
Copy link
Collaborator

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sgjesse, hmmm ok, let's hope then that those users don't encounter the pitfalls of JDK Unsafe (see documentation of GsonBuilder.disableJdkUnsafe()) (and also let's hope that neither the JDK nor Android remove the needed JDK Unsafe functionality any time soon).

Personal opinion on this (click to show)

Personally I am a bit indecisive on these changes; on one hand I can definitely see that it would be convenient for users if it just worked automatically. On the other hand, the problem is that JDK Unsafe usage is opt-out and I assume most users are not aware of it being used at all, nor what the implications of it are (if it was opt-in I wouldn't have a problem with that). If as user you want to use your classes with Gson, then you should (in my opinion) adjust them to follow Gson's requirements (or if necessary differentiate between application logic classes and DTO classes, or however you want to call them), and from that perspective I would appreciate any changes which make users less dependent on JDK Unsafe.

Though I am not a direct member of this project, so it is not up to me to decide this.


Thanks for the test changes, could you please adjust the comments which are now partially obsolete?

Also, ideally there would be a new test in shrinker-test's com.example.Main which covers the successful case of a class without no-args constructor but with fields annotated with @SerializedName, which is now supported with your changes.
If you don't want to do that I can do it in a follow-up PR.

Technically Troubleshooting.md is now also a bit outdated (e.g. it says "has a no-args constructor"), but I would leave it that way for now to encourage usage of a no-args constructor to avoid relying on JDK Unsafe.

@@ -65,6 +65,6 @@
# See also https://github.com/google/gson/pull/2420#discussion_r1241813541 for a more detailed explanation
-if class *
-keepclasseswithmembers,allowobfuscation,allowoptimization class <1> {
<init>();
<init>(...);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please adjust the comment in lines 62 and 63 (GitHub does not let me comment there); for example reword it to:

# If a class is used in some way by the application and has fields annotated with @SerializedName,
# keep those fields and the constructors of the class
# For convenience this also matches classes without no-args constructor, in which case Gson uses JDK Unsafe

Comment on lines 20 to 21
// making class abstract); other constructors are ignored to suggest to user adding default
// constructor instead of implicitly relying on JDK Unsafe
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please adjust this comment (starting in line 19), for example:

// Specify explicit constructor with args to remove implicit no-args default constructor

The current comment mentioning ProGuard rules and JDK Unsafe is now obsolete.

@@ -20,7 +20,7 @@ static class TestClassNotAbstract {
// making class abstract); other constructors are ignored to suggest to user adding default
// constructor instead of implicitly relying on JDK Unsafe
static class TestClassWithoutDefaultConstructor {
@SerializedName("s")
// No @SerializedName annotation to avoid matching the consumer rules from gson.pro.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good addition; could you please copy it to the other classes TestClass and TestClassNotAbstract above as well?

Though I think the trailing period should be omitted, it just looks a bit weird after the file name:

Suggested change
// No @SerializedName annotation to avoid matching the consumer rules from gson.pro.
// No @SerializedName annotation to avoid matching the consumer rules from gson.pro

@sfreilich
Copy link
Contributor

I'll file a PR for whatever @sgjesse doesn't want to handle here. We were just discussing that since I was trying to solve some obstacles to adopting these rules downstream, and we're colleagues, we share a downstream.

@Marcono1234
Copy link
Collaborator

@sgjesse, I have tried to address my review comments in sgjesse#1 now; I hope that is easier for you. That pull request targets your fix-proguard-rules branch, so in case you merge it, the changes should show up here as well.

Comment on lines 18 to 20
#-keep class com.example.ClassWithSerializedName {
# <init>(...);
#}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, no removed the rules in comments. Also removed the rule above, as that was also not needed. I probably added it while testing different rules.

Comment on lines 57 to 59
# Keep fields annotated with @SerializedName for classes which are present.
# If classes with fields annotated with @SerializedName have a no-args
# constructor keep that as well.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove the link to the pull request discussion?

At least in my opinion this -if class * ... is pretty cryptic and its intended behavior is not directly obvious, so it would be good to have a reference explaining what this actually does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, added those references back.

-if class *
-keepclasseswithmembers,allowobfuscation,allowoptimization class <1> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove allowoptimization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having allowoptimization can allow R8 to propagate field values, so if you have an explicit construction of a serialized class always setting a field to a specific value, then that single value can become the default value for the created instances instead of 0/null.

-if class *
-keepclasseswithmembers,allowobfuscation,allowoptimization class <1> {
<init>(...);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that you removed <init>(...) again and used <init>() below? My personal opinion in #2448 (review) was not blocking the merge of this pull request (sorry if I did not make that clear enough).

This change to use <init>() again might render some or all of the shrinker test changes of this pull request obsolete.
Edit: Maybe not, see comment below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least I am a bit surprised that no exception occurs for ClassWithoutDefaultConstructor; I assume it might be related to your newly added -keepclasseswithmembers.

So, do I understand it correctly that your new approach of using -keepclasseswithmembers achieves the same thing as your initial <init>(...) approach (i.e. Gson can instantiate classes without no-args constructor using Unsafe; instead of failing because R8 made the class abstract). But this new approach is better than the <init>(...) rule because that would have unnecessarily kept the constructors with args around, even though they are not actually used by Gson?

Comment on lines -32 to -35
# Keep fields with @SerializedName annotation, but allow obfuscation of their names
-keepclassmembers,allowobfuscation class * {
@com.google.gson.annotations.SerializedName <fields>;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why did you remove this here and add it below guarded by a -if class *? Does that make a difference for the end result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in R8 full mode we have a distinction between referenced and instantiated classes. For classes which are not seen by R8 as instantiated (e.g. if only referenced statically using .class as one will do in fromJson then -keepclassmembers will only keep static members and not instance members. For ProGuard and R8 in compatability mode the two will do the same.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm thanks, ok that sounds like the previous rules would not have worked properly then because -keepclassmembers would not have kepts the non-static @SerializedName fields. Though I think the shrinker-test showed that it did work as expected?
I am bit confused, or am I misunderstanding your comment or are the shrinker-tests flawed and not close enough to real code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to see the difference is to have an instance field annotated with @SerializedName which is not used in the code at all, then the field will be removed with the previous rule in R8 full mode (which is default from AGP 8.0.0). Then in a pass through from fromJson to toJson the field value will be lost. Of course it is a matter of taste if unused fields should be kept this way.

Copy link
Contributor Author

@sgjesse sgjesse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed all comments.

@@ -65,6 +65,6 @@
# See also https://github.com/google/gson/pull/2420#discussion_r1241813541 for a more detailed explanation
-if class *
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the rules to be two -if rules

  1. Keep all @SerializedName fields for classes which are present after shrinking
  2. If a class with a @SerializedName field is present keep its no-args constructor

Comment on lines -32 to -35
# Keep fields with @SerializedName annotation, but allow obfuscation of their names
-keepclassmembers,allowobfuscation class * {
@com.google.gson.annotations.SerializedName <fields>;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in R8 full mode we have a distinction between referenced and instantiated classes. For classes which are not seen by R8 as instantiated (e.g. if only referenced statically using .class as one will do in fromJson then -keepclassmembers will only keep static members and not instance members. For ProGuard and R8 in compatability mode the two will do the same.

Comment on lines 57 to 59
# Keep fields annotated with @SerializedName for classes which are present.
# If classes with fields annotated with @SerializedName have a no-args
# constructor keep that as well.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, added those references back.

-if class *
-keepclasseswithmembers,allowobfuscation,allowoptimization class <1> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having allowoptimization can allow R8 to propagate field values, so if you have an explicit construction of a serialized class always setting a field to a specific value, then that single value can become the default value for the created instances instead of 0/null.

Comment on lines 18 to 20
#-keep class com.example.ClassWithSerializedName {
# <init>(...);
#}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, no removed the rules in comments. Also removed the rule above, as that was also not needed. I probably added it while testing different rules.

Copy link
Collaborator

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your update and also for your work on this!

Though it looks like some of the changes in your last commit might not have been intentional (see comments)?

Also, I think #2448 (comment) is still open; just to clarify that nothing was overlooked.

See also #2448 (comment) (unfortunately GitHub does not show that comment down here as well, so it might be easy to miss).

@@ -22,6 +22,7 @@
# Keep class TypeToken (respectively its generic signature) if present
-if class com.google.gson.reflect.TypeToken
-keep,allowobfuscation class com.google.gson.reflect.TypeToken
#-keep class com.google.gson.reflect.TypeToken { *; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this commented line intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No removed.

-if class *
-keepclasseswithmembers,allowobfuscation class <1> {
@com.google.gson.annotations.SerializedName <fields>;
}
-if class * {
@com.google.gson.annotations.SerializedName <fields>;
}
-keepclassmembers,allowobfuscation class <1> {
-keepclassmembers,allowobfuscation,allowoptimization class <1> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought in #2448 (comment) you just explained that you intentionally removed allowoptimization, but now you are adding it back? Is that intentional?

(Or am I misunderstanding this?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rule is only covering the no-args constructor (<init>()), so here there is no issue with allowoptimization. For the rule above for the annotated fields it should not be there as otherwise R8 can do value propagation.

@@ -38,3 +38,5 @@
-keepclassmembernames class com.example.ClassWithVersionAnnotations {
<fields>;
}

-dontobfuscate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentional? I think this should not be necessary and there were explicit rules which preserved names where needed for the tests.
Though it looks like you (accidentally?) removed one in proguard.pro:

- -keepclassmembernames class com.example.NoSerializedNameMain$TestClassWithoutDefaultConstructor {
-   <fields>;
- }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was all for some testing. These changes should not have been there.

@sgjesse
Copy link
Contributor Author

sgjesse commented Aug 21, 2023

@Marcono1234 Thank you for being such an observant reviewer! Addressed my mistakes in my previous upload. Please take another look.

Copy link
Member

@eamonnmcmanus eamonnmcmanus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is close enough now. If there are further changes needed, we can make them in a follow-up PR.

@eamonnmcmanus eamonnmcmanus merged commit 7b5629b into google:main Aug 22, 2023
9 checks passed
@Marcono1234
Copy link
Collaborator

@Marcono1234 Thank you for being such an observant reviewer!

Thank you for all the work you have put into this, and all the detailed explanations! That was helpful for me to understand at least a bit better how R8 works. And hopefully it will also be useful for future reference.

The changes look good, so I don't think there is any direct follow-up PR needed.

@sfreilich
Copy link
Contributor

@Marcono1234, @sgjesse, @eamonnmcmanus: Thank you all for your work on this, I really appreciate it!

@Marcono1234 Marcono1234 added the proguard-r8 Issues relating to the use of ProGuard and/or R8, such as problems due to obfuscation label Mar 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proguard-r8 Issues relating to the use of ProGuard and/or R8, such as problems due to obfuscation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants