Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using paddedCell, spotless is not idempotent (check passed but apply is not a noop) #338

Closed
6 tasks done
kennknowles opened this issue Jan 16, 2019 · 8 comments · Fixed by #455
Closed
6 tasks done
Labels

Comments

@kennknowles
Copy link

On apache/beam#7505 ./gradlew spotlessCheck passed (you can see it in the Jenkins statuses) but ./gradlew spotlessApply is not a noop, we discovered afterwards.

I confirmed that with paddedCell disabled spotlessCheck failed.

Gradle version: 4.10.3
Spotless plugin version: 3.16.0
GoogleJavaFormat: 1.7
Operating system & version: various Mac & Linux
Configuration block (also easy to see in the PR):

def disableSpotlessCheck = project.hasProperty('disableSpotlessCheck')
spotless {
  enforceCheck !disableSpotlessCheck
  java {
    licenseHeader javaLicenseHeader
    googleJavaFormat()

    // Details see: https://github.com/diffplug/spotless/blob/master/PADDEDCELL.md
    paddedCell()
  }
}

Errors emitted: none

If you are submitting a bug, please include the following:

  • summary of problem
  • gradle or maven version
  • spotless version
  • operating system and version
  • copy-paste your full Spotless configuration block(s), and a link to a public git repo that reproduces the problem if possible
  • copy-paste the full content of any console errors emitted by gradlew spotless[Apply/Check] --stacktrace

If you're just submitting a feature request or question, no need for the above.

@nedtwigg
Copy link
Member

you can see it in the Jenkins statuses

Can you post an absolute link to a failed commit? I cloned https://github.com/kennknowles/beam and checked out spotless (165399), and I can't reproduce on mac.

@nedtwigg
Copy link
Member

Closing due to inactivity. Happy to reopen with a reproducible testcase :)

@kennknowles
Copy link
Author

Try d05da8cf3dd74b2e5496ca85c10f0965c68cadf5. I just repro'd on a mac. Just clone, check out that commit and ./gradlew spotlessCheck passes but ./gradlew spotlessApply changes a ton of files.

@nedtwigg nedtwigg reopened this Feb 10, 2019
@nedtwigg nedtwigg added bug and removed question labels Feb 10, 2019
@nedtwigg
Copy link
Member

Hmm, very interesting. I just checkout that commit. spotlessCheck passes, and spotlessApply causes no changes. Can you copy-paste the output of git diff, or at least some of the output if it's huge?

@kennknowles
Copy link
Author

Sure thing. Btw I think due to gradle cache it doesn't repro after the first time. I rm'd .gradle and it reproduced again.

The diff touches 215 files so I'll just give a few of them:

diff --git a/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java b/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java
index 4a516bb054..100869ae90 100644
--- a/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java
+++ b/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java
@@ -186,7 +186,8 @@ public class WindowedWordCount {
         pipeline
             /* Read from the GCS file. */
             .apply(TextIO.read().from(options.getInputFile()))
-            // Concept #2: Add an element timestamp, using an artificial time just to show windowing.
+            // Concept #2: Add an element timestamp, using an artificial time just to show
+            // windowing.
             // See AddTimestampFn for more detail on this.
             .apply(ParDo.of(new AddTimestampFn(minTimestamp, maxTimestamp)));
 
diff --git a/examples/java/src/main/java/org/apache/beam/examples/common/ExampleUtils.java b/examples/java/src/main/java/org/apache/beam/examples/common/ExampleUtils.java
index 6bcae8da28..14a11b71b5 100644
--- a/examples/java/src/main/java/org/apache/beam/examples/common/ExampleUtils.java
+++ b/examples/java/src/main/java/org/apache/beam/examples/common/ExampleUtils.java
@@ -224,7 +224,8 @@ public class ExampleUtils {
             Transport.getJsonFactory(),
             chainHttpRequestInitializer(
                 options.getGcpCredential(),
-                // Do not log 404. It clutters the output and is possibly even required by the caller.
+                // Do not log 404. It clutters the output and is possibly even required by the
+                // caller.
                 new RetryHttpRequestInitializer(ImmutableList.of(404))))
         .setApplicationName(options.getAppName())
         .setGoogleClientRequestInitializer(options.getGoogleApiTrace());
@@ -237,7 +238,8 @@ public class ExampleUtils {
             Transport.getJsonFactory(),
             chainHttpRequestInitializer(
                 options.getGcpCredential(),
-                // Do not log 404. It clutters the output and is possibly even required by the caller.
+                // Do not log 404. It clutters the output and is possibly even required by the
+                // caller.
                 new RetryHttpRequestInitializer(ImmutableList.of(404))))
         .setRootUrl(options.getPubsubRootUrl())
         .setApplicationName(options.getAppName())
diff --git a/examples/java/src/main/java/org/apache/beam/examples/cookbook/TriggerExample.java b/examples/java/src/main/java/org/apache/beam/examples/cookbook/TriggerExample.java
index 9a3bfc26d5..bc6541341b 100644
--- a/examples/java/src/main/java/org/apache/beam/examples/cookbook/TriggerExample.java
+++ b/examples/java/src/main/java/org/apache/beam/examples/cookbook/TriggerExample.java
@@ -122,7 +122,7 @@ import org.joda.time.Instant;
  * and then exits.
  */
 public class TriggerExample {
-  //Numeric value of fixed window duration, in minutes
+  // Numeric value of fixed window duration, in minutes
   public static final int WINDOW_DURATION = 30;
   // Constants used in triggers.
   // Speeding up ONE_MINUTE or FIVE_MINUTES helps you get an early approximation of results.
@@ -189,18 +189,22 @@ public class TriggerExample {
               .apply(
                   "Default",
                   Window
-                      // The default window duration values work well if you're running the default input
+                      // The default window duration values work well if you're running the default
+                      // input
                       // file. You may want to adjust the window duration otherwise.
                       .<KV<String, Integer>>into(
                           FixedWindows.of(Duration.standardMinutes(windowDuration)))
-                      // The default trigger first emits output when the system's watermark passes the end
+                      // The default trigger first emits output when the system's watermark passes
+                      // the end
                       // of the window.
                       .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
                       // Late data is dropped
                       .withAllowedLateness(Duration.ZERO)
                       // Discard elements after emitting each pane.
-                      // With no allowed lateness and the specified trigger there will only be a single
-                      // pane, so this doesn't have a noticeable effect. See concept 2 for more details.
+                      // With no allowed lateness and the specified trigger there will only be a
+                      // single
+                      // pane, so this doesn't have a noticeable effect. See concept 2 for more
+                      // details.
                       .discardingFiredPanes())
               .apply(new TotalFlow("default"));
 
@@ -229,7 +233,8 @@ public class TriggerExample {
                           FixedWindows.of(Duration.standardMinutes(windowDuration)))
                       // Late data is emitted as it arrives
                       .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
-                      // Once the output is produced, the pane is dropped and we start preparing the next
+                      // Once the output is produced, the pane is dropped and we start preparing the
+                      // next
                       // pane for the window
                       .discardingFiredPanes()
                       // Late data is handled up to one day
@@ -264,8 +269,10 @@ public class TriggerExample {
                               AfterProcessingTime.pastFirstElementInPane()
                                   // Speculative every ONE_MINUTE
                                   .plusDelayOf(ONE_MINUTE)))
-                      // After emitting each pane, it will continue accumulating the elements so that each
-                      // approximation includes all of the previous data in addition to the newly arrived
+                      // After emitting each pane, it will continue accumulating the elements so
+                      // that each
+                      // approximation includes all of the previous data in addition to the newly
+                      // arrived
                       // data.
                       .accumulatingFiredPanes()
                       .withAllowedLateness(ONE_DAY))
@@ -414,7 +421,7 @@ public class TriggerExample {
         return;
       }
       if (laneInfo.length < VALID_NUM_FIELDS) {
-        //Skip the invalid input.
+        // Skip the invalid input.
         return;
       }
       String freeway = laneInfo[2];
diff --git a/examples/java/src/test/java/org/apache/beam/examples/complete/game/LeaderBoardTest.java b/examples/java/src/test/java/org/apache/beam/examples/complete/game/LeaderBoardTest.java
index 6dd112f5e2..962d5fae13 100644
--- a/examples/java/src/test/java/org/apache/beam/examples/complete/game/LeaderBoardTest.java
+++ b/examples/java/src/test/java/org/apache/beam/examples/complete/game/LeaderBoardTest.java
@@ -129,14 +129,16 @@ public class LeaderBoardTest implements Serializable {
             .addElements(
                 event(TestUser.BLUE_ONE, 3, Duration.standardSeconds(3)),
                 event(TestUser.BLUE_ONE, 2, Duration.standardMinutes(1)))
-            // Some time passes within the runner, which causes a speculative pane containing the blue
+            // Some time passes within the runner, which causes a speculative pane containing the
+            // blue
             // team's score to be emitted
             .advanceProcessingTime(Duration.standardMinutes(10))
             .addElements(event(TestUser.RED_TWO, 5, Duration.standardMinutes(3)))
             // Some additional time passes and we get a speculative pane for the red team
             .advanceProcessingTime(Duration.standardMinutes(12))
             .addElements(event(TestUser.BLUE_TWO, 3, Duration.standardSeconds(22)))
-            // More time passes and a speculative pane containing a refined value for the blue pane is
+            // More time passes and a speculative pane containing a refined value for the blue pane
+            // is
             // emitted
             .advanceProcessingTime(Duration.standardMinutes(10))
             // Some more events occur
@@ -238,7 +240,8 @@ public class LeaderBoardTest implements Serializable {
                 event(TestUser.RED_TWO, 2, Duration.ZERO),
                 event(TestUser.RED_TWO, 5, Duration.standardMinutes(1)),
                 event(TestUser.RED_TWO, 3, Duration.standardMinutes(3)))
-            // A late refinement is emitted due to the advance in processing time, but the window has
+            // A late refinement is emitted due to the advance in processing time, but the window
+            // has
             // not yet closed because the watermark has not advanced
             .advanceProcessingTime(Duration.standardMinutes(12))
             // These elements should appear in the final pane
@@ -303,7 +306,8 @@ public class LeaderBoardTest implements Serializable {
                     .plus(ALLOWED_LATENESS)
                     .plus(TEAM_WINDOW_DURATION)
                     .plus(Duration.standardMinutes(1)))
-            // These elements within the expired window are droppably late, and will not appear in the
+            // These elements within the expired window are droppably late, and will not appear in
+            // the
             // output
             .addElements(
                 event(
diff --git a/runners/apex/src/main/java/org/apache/beam/runners/apex/TestApexRunner.java b/runners/apex/src/main/java/org/apache/beam/runners/apex/TestApexRunner.java
index bd10307f7e..c53f48f0ff 100644
--- a/runners/apex/src/main/java/org/apache/beam/runners/apex/TestApexRunner.java
+++ b/runners/apex/src/main/java/org/apache/beam/runners/apex/TestApexRunner.java
@@ -33,7 +33,7 @@ public class TestApexRunner extends PipelineRunner<ApexRunnerResult> {
 
   private TestApexRunner(ApexPipelineOptions options) {
     options.setEmbeddedExecution(true);
-    //options.setEmbeddedExecutionDebugMode(false);
+    // options.setEmbeddedExecutionDebugMode(false);
     this.delegate = ApexRunner.fromOptions(options);
   }
 
diff --git a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
index ca1c7ffa24..eb625e4acc 100644
--- a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
+++ b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java
@@ -77,9 +77,7 @@ class ParDoTranslator<InputT, OutputT>
     List<PCollectionView<?>> sideInputs = transform.getSideInputs();
 
     Map<TupleTag<?>, Coder<?>> outputCoders =
-        outputs
-            .entrySet()
-            .stream()
+        outputs.entrySet().stream()
             .filter(e -> e.getValue() instanceof PCollection)
             .collect(
                 Collectors.toMap(e -> e.getKey(), e -> ((PCollection) e.getValue()).getCoder()));
@@ -138,9 +136,7 @@ class ParDoTranslator<InputT, OutputT>
       List<PCollectionView<?>> sideInputs = transform.getSideInputs();
 
       Map<TupleTag<?>, Coder<?>> outputCoders =
-          outputs
-              .entrySet()
-              .stream()
+          outputs.entrySet().stream()
               .filter(e -> e.getValue() instanceof PCollection)
               .collect(
                   Collectors.toMap(e -> e.getKey(), e -> ((PCollection) e.getValue()).getCoder()));
@@ -221,8 +217,8 @@ class ParDoTranslator<InputT, OutputT>
           .getWindowingStrategy()
           .equals(firstSideInput.getWindowingStrategy())) {
         // TODO: check how to handle this in stream codec
-        //String msg = "Multiple side inputs with different window strategies.";
-        //throw new UnsupportedOperationException(msg);
+        // String msg = "Multiple side inputs with different window strategies.";
+        // throw new UnsupportedOperationException(msg);
         LOG.warn(
             "Side inputs union with different windowing strategies {} {}",

@nedtwigg
Copy link
Member

I rm'd .gradle and it reproduced again.

Aha! I've got it reproducing now also. Here are my observations:

  • delete .gradle, check passes, apply is a no-op, this is all as expected.
  • delete .gradle, apply causes a lot of changes, check says everything is fine, which would be expected if it weren't for the previous result...

It seems that check/apply is kind of self-consistent, but it shouldn't matter which is called first. Thanks very much for this important bug, it is certain to unmask something interesting.

I haven't dug into the code yet, but here's my guess: one of the ways a formatter can fail at idempotence is a cycle. e.g. a line is too short, so it rewraps long. Now the line is too long, so it rewraps short. In this case, PaddedCell has to make an arbitrary decision as to what the "canonical" form ought to be. We make this decision like this:

case CONVERGE: return steps.get(steps.size() - 1);
case CYCLE: return Collections.min(steps, Comparator.comparing(String::length).thenComparing(Function.identity()));
case DIVERGE: throw new IllegalArgumentException("No canonical form for a diverging result");

I'm guessing that somehow this isn't doing what we expect, so that the cycle's canonical form depends on where the file's current state is in the cycle when we first look at it. I don't see how that could be the case, but that's my best guess. Not sure if I'll have time to fix this for our next release, but we'll definitely fix it.

@nedtwigg
Copy link
Member

Fixed in 3.24.3. Sorry it took so long to get this resolved, thanks for isolating the problem.

@kennknowles
Copy link
Author

Nice! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants