Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dyff stutters when diffing some multiline documents #264

Open
dhduvall opened this issue Nov 17, 2022 · 8 comments
Open

dyff stutters when diffing some multiline documents #264

dhduvall opened this issue Nov 17, 2022 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@dhduvall
Copy link

I have files with these contents:

top: |
  object:
    aa_aaa: true
    bbb_bb_bbbbbbb: true
    ccccccc_ccccccc_ccccccc: true
  is_longenough123:
    # Stupid comment one
    # Stupid comment two
    # Stupid comment three

and

top: |
  is_longenough123:
    - name
  object:
    aa_aaa: true
    bbb_bb_bbbbbbb: true
    ccccccc_ccccccc_ccccccc: true

If I dyff them, I get

     _        __  __
   _| |_   _ / _|/ _|  between dyffbug-a.yaml
 / _' | | | | |_| |_       and dyffbug-b.yaml
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned one difference
        |___/

top
  ± value change in multiline text (three inserts, one deletion)
      is_longenough123:
      object:
      object:
      object:
        aa_aaa: true
      object:
        bbb_bb_bbbbbbb: true
      object:
        ccccccc_ccccccc_ccccccc: true
      is_longenough123:
        # Stupid comment one
        # Stupid comment two
        # Stupid comment three

broken-dyff
which I'm pretty sure is not right.

The issue is very sensitive to the contents of the string. I haven't been able to get it to reproduce with fewer than three lines in both object and is_longenough123, or changing the keys very much. The keys in object can't be shortened (or at least not much; I haven't tried all possible combinations) either, which is weird. Other than the length, it doesn't seem like the key names matter.

For instance, if I drop the ccccccc_ccccccc_ccccccc key, I get what I'd expect:

top
  ± value change in multiline text (one insert, one deletion)
      is_longenough123:
        - name
      object:
        aa_aaa: true
        bbb_bb_bbbbbbb: true
      is_longenough123:
        # Stupid comment one
        # Stupid comment two
        # Stupid comment three

working-dyff

(Screenshots are necessary, since the colors are what actually indicate what dyff thinks is being added or removed, and the problem goes away with --color=no. The blue bits are just iTerm2 being helpful.)

@HeavyWombat
Copy link
Member

Thanks for letting me know. I do not know where this happens, but I have a feeling. Support for comments was definitely done in a later iteration when I changed the YAML libraries. So this sounds very much like a gap in the implementation there.

@HeavyWombat
Copy link
Member

By the way, regarding screenshots, I hope this does not sound like cheap advertisement, but these kind of screenshot were actually the reason why I started to write https://github.com/homeport/termshot. You can give it a try, if you want.

@dhduvall
Copy link
Author

I saw that utility, but I think maybe not until after I filed this. Looks nice.

@com6056
Copy link

com6056 commented Sep 8, 2023

👋 any updates on getting to the bottom of this? thanks! 🎉

@HeavyWombat
Copy link
Member

Thanks for the nudge. I checked with the latest version of dyff and the output looks different from the original bug report.
image

From what I can see in the provided files, this kind of looks as expected. In this case dyff compares a multiline string (because of the pipe character for top) in YAML. For these scenarios, it will try to highlight the differences within the text using the same familiar green and red colors. Visually comparing them, it returns what I expect: There are new lines added and some removed.

I do not recall what exactly, but there must have been a fix between the time the issue was opened and now, which addressed the weird stutter when rendering the differences in this multiline output.

@HeavyWombat HeavyWombat self-assigned this Sep 10, 2023
@com6056
Copy link

com6056 commented Sep 11, 2023

Interesting, we still run into it periodically with multiline text (mostly things like ConfigMap values). Here's a repro you can use for debugging, there's still some cases where we get the weird behavior:

I ran prettier against the a portion of a shell script we have inside a ConfigMap and it shows up pretty weirdly in the dyff output:

image

a.yaml:

kind: ConfigMap
apiVersion: v1
metadata:
  name: foo
  namespace: bar
data:
  baz.sh: |
    HOSTNAME="$(hostname)"
    INDEX="${HOSTNAME##*-}"
    DISCOVERY_IPS="$(nslookup ${SERVICE}${JOIN_DOMAIN} | grep -A 1 ^Name | grep ^Address | cut -f2 -d':' | cut -f2 -d' ')"
    # Validate the IPs or exit with error so we don't incorrectly set ordinal 0 pod to primary by default
    for IP in $DISCOVERY_IPS; do
        echo "checking $IP is valid..."
        echo $IP | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && echo ok || exit 1
    done
    MASTER="$(for IP in ${DISCOVERY_IPS};do timeout 5 redis-cli -h ${IP} -p 26379 sentinel get-master-addr-by-name ${SERVICE}-master | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' && break;done)"
    MASTER_GROUP="${SERVICE}-master"
    REDIS_CONF=/data/conf/redis.conf
    REDIS_PORT=6379
    SENTINEL_CONF=/data/conf/sentinel.conf
    SENTINEL_PORT=26379
    set -eu

    setup_defaults() {
        echo "Setting up defaults"
        if [ "$INDEX" = "0" ]; then
            echo "Setting this pod as the default master"
            redis_update "$ANNOUNCE_IP"
            sentinel_update "$ANNOUNCE_IP"
            sed -i "s/^.*slaveof.*//" "$REDIS_CONF"
        else
            DEFAULT_MASTER="$(getent hosts "$SERVICE-0.$SERVICE" | awk '{ print $1 }')"
            if [ -z "$DEFAULT_MASTER" ]; then
                echo "Unable to resolve host"
                exit 1
            fi
            echo "Setting default slave config.."
            redis_update "$DEFAULT_MASTER"
            sentinel_update "$DEFAULT_MASTER"
        fi
    }

b.yaml:

kind: ConfigMap
apiVersion: v1
metadata:
  name: foo
  namespace: bar
data:
  baz.sh: |
    HOSTNAME="$(hostname)"
    INDEX="${HOSTNAME##*-}"
    DISCOVERY_IPS="$(nslookup ${SERVICE}${JOIN_DOMAIN} | grep -A 1 ^Name | grep ^Address | cut -f2 -d':' | cut -f2 -d' ')"
    # Validate the IPs or exit with error so we don't incorrectly set ordinal 0 pod to primary by default
    for IP in $DISCOVERY_IPS; do
      echo "checking $IP is valid..."
      echo $IP | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && echo ok || exit 1
    done
    MASTER="$(for IP in ${DISCOVERY_IPS}; do timeout 5 redis-cli -h ${IP} -p 26379 sentinel get-master-addr-by-name ${SERVICE}-master | grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' && break; done)"
    MASTER_GROUP="${SERVICE}-master"
    REDIS_CONF=/data/conf/redis.conf
    REDIS_PORT=6379
    SENTINEL_CONF=/data/conf/sentinel.conf
    SENTINEL_PORT=26379
    set -eu

    setup_defaults() {
      echo "Setting up defaults"
      if [ "$INDEX" = "0" ]; then
        echo "Setting this pod as the default master"
        redis_update "$ANNOUNCE_IP"
        sentinel_update "$ANNOUNCE_IP"
        sed -i "s/^.*slaveof.*//" "$REDIS_CONF"
      else
        DEFAULT_MASTER="$(getent hosts "$SERVICE-0.$SERVICE" | awk '{ print $1 }')"
        if [ -z "$DEFAULT_MASTER" ]; then
          echo "Unable to resolve host"
          exit 1
        fi
        echo "Setting default slave config.."
        redis_update "$DEFAULT_MASTER"
        sentinel_update "$DEFAULT_MASTER"
      fi
    }

@HeavyWombat
Copy link
Member

Interesting. Thanks for the files to reproduce it. I will have a look.

@silverraven691
Copy link

Dealt with this issue by splitting ConfigMaps off and using diff on them for now

Using casey/just and mikefarah/yq

tmpdir := `mktemp -d`

diff-manifests:
    # generate your manifests in {{tmpdir / "from"}} and {{tmpdir / "to"}}
    mkdir {{tmpdir / "from-cm-only"}} {{tmpdir / "to-cm-only"}}
    yq 'select(.kind != "ConfigMap")' {{tmpdir / "from"}} > {{tmpdir / "from-no-cm"}}
    yq 'select(.kind != "ConfigMap")' {{tmpdir / "to"}} > {{tmpdir / "to-no-cm"}}
    yq 'select(.kind == "ConfigMap") | sort_keys(..)' \
        -s 'filename + "-cm-only/" + .metadata.namespace + ":" + .metadata.name + ".yaml"' \
        {{tmpdir / "from"}} {{tmpdir / "to"}}
    dyff between -i --color=on {{tmpdir / "from-no-cm"}} {{tmpdir / "to-no-cm"}}
    diff -u --color=always {{tmpdir / "from-cm-only"}} {{tmpdir / "to-cm-only"}} || true

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants