😲 Emoji characters in keys & values in v4 are lost/corrupted #814

Open
elasticdotventures opened this issue May 13, 2021 · 8 comments
@elasticdotventures

elasticdotventures commented May 13, 2021

$ yq -V
yq version 4.7.0

The way yq v4 handles emoji is odd, inconsistent, and unpredictable. (This did not occur on earlier yq 2.x versions, which had other limitations.)

yq should (IMHO) pass UTF-8/emoji through unmodified. yq handles Chinese (Mandarin) characters properly, but emoji are so much more powerful and universal that it would be nice to be able to use them too.

For example, take a file emojifile.yaml with these contents:

---
"bash.🔨/init.10级.🥾.b00t.sh": ""
"bash.🔨/init.20级.🐧.linux.sh": ""
"bash.🔨/init.22级.🐙.git.sh": ""
"bash.🔨/init.30级.🐳.层.docker.sh": ""
"bash.🔨/init.32级.💠.层.hashicorp.sh": ""
"bash.🔨/init.40级.🐍.语.python.sh": ""
"bash.🔨/init.40级.🚀.语.node.sh": ""
"bash.🔨/init.42级.🦄.语.typescript.sh": ""
"bash.🔨/init.43级.🥷.语.vue.sh": ""
"bash.🔨/init.44级.☕.语.java.sh": ""
"bash.🔨/init.44级.🏇.语.go.sh": ""
"bash.🔨/init.50级.👾.云☁️.gcp.sh": ""
"bash.🔨/init.50级.🤖.云☁️.azure.sh": ""
"bash.🔨/init.50级.🦉.云☁️.aws.sh": ""
"bash.🔨/init.60级.🎙️💙.应用.vscode.sh": ""
"bash.🔨/init.70级.☎️.msg.sh": ""
"bash.🔨/init.70级.🎬.video.sh": ""
"bash.🔨/init.70级.📱.mobile.sh": ""
"bash.🔨/init.70级.🕹️.gamesim.sh": ""
"bash.🔨/init.70级.🤑.ecommerce.sh": ""
"bash.🔨/init.70级.🥯.crypto.sh": ""
"bash.🔨/init.70级.🧠.ai.sh": ""
"bash.🔨/init.80级.🐱‍💻.esp32.sh": ""

Then running

$ cat emojifile.yaml | yq eval

produces (on my Ubuntu system):

"bash.\/init.10级.\.b00t.sh": ""
"bash.\/init.20级.\.linux.sh": ""
"bash.\/init.22级.\.git.sh": ""
"bash.\/init.30级.\.层.docker.sh": ""
"bash.\/init.32级.\.层.hashicorp.sh": ""
"bash.\/init.40级.\.语.python.sh": ""
"bash.\/init.40级.\.语.node.sh": ""
"bash.\/init.42级.\.语.typescript.sh": ""
"bash.\/init.43级.\.语.vue.sh": ""
"bash.\/init.44级.☕.语.java.sh": ""
"bash.\/init.44级.\.语.go.sh": ""
"bash.\/init.50级.\.云☁️.gcp.sh": ""
"bash.\/init.50级.\.云☁️.azure.sh": ""
"bash.\/init.50级.\.云☁️.aws.sh": ""
"bash.\/init.60级.\️\.应用.vscode.sh": ""
"bash.\/init.70级.☎️.msg.sh": ""
"bash.\/init.70级.\.video.sh": ""
"bash.\/init.70级.\.mobile.sh": ""
"bash.\/init.70级.\️.gamesim.sh": ""
"bash.\/init.70级.\.ecommerce.sh": ""
"bash.\/init.70级.\.crypto.sh": ""
"bash.\/init.70级.\.ai.sh": ""
"bash.\/init.80级.\‍\.esp32.sh": ""
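
For anyone triaging this: in the sample above, the characters that survive (级, 层, 语, ☕, ☎️, ☁️) all sit at or below U+FFFF, while the ones that get lost are emoji above U+FFFF. That is only an observation from this one file, not a confirmed diagnosis. A minimal Python sketch for checking which characters in a key fall in that upper range (the helper name `astral_chars` is made up for illustration and is not part of yq):

```python
def astral_chars(text):
    """Return the characters in `text` whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# A few keys copied from emojifile.yaml in this issue.
keys = [
    "bash.🔨/init.10级.🥾.b00t.sh",
    "bash.🔨/init.44级.☕.语.java.sh",
    "bash.🔨/init.70级.☎️.msg.sh",
]

for key in keys:
    # Show each high code point in the usual U+XXXX notation.
    hits = ["U+{:04X}".format(ord(ch)) for ch in astral_chars(key)]
    print(key, "->", hits)
```

Keys for which the list is empty would be expected to pass through unchanged, if the pattern seen in this sample holds in general.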

This is for the b00t framework.

@elasticdotventures
Author

$ cat emojifile.yaml | yq eval -M

"bash.\U0001F528/init.10级.\U0001F97E.b00t.sh": ""
"bash.\U0001F528/init.20级.\U0001F427.linux.sh": ""
"bash.\U0001F528/init.22级.\U0001F419.git.sh": ""
"bash.\U0001F528/init.30级.\U0001F433.层.docker.sh": ""
"bash.\U0001F528/init.32级.\U0001F4A0.层.hashicorp.sh": ""
"bash.\U0001F528/init.40级.\U0001F40D.语.python.sh": ""
"bash.\U0001F528/init.40级.\U0001F680.语.node.sh": ""
"bash.\U0001F528/init.42级.\U0001F984.语.typescript.sh": ""
"bash.\U0001F528/init.43级.\U0001F977.语.vue.sh": ""
"bash.\U0001F528/init.44级.☕.语.java.sh": ""
"bash.\U0001F528/init.44级.\U0001F3C7.语.go.sh": ""
"bash.\U0001F528/init.50级.\U0001F47E.云☁️.gcp.sh": ""
"bash.\U0001F528/init.50级.\U0001F916.云☁️.azure.sh": ""
"bash.\U0001F528/init.50级.\U0001F989.云☁️.aws.sh": ""
"bash.\U0001F528/init.60级.\U0001F399️\U0001F499.应用.vscode.sh": ""
"bash.\U0001F528/init.70级.☎️.msg.sh": ""
"bash.\U0001F528/init.70级.\U0001F3AC.video.sh": ""
"bash.\U0001F528/init.70级.\U0001F4F1.mobile.sh": ""
"bash.\U0001F528/init.70级.\U0001F579️.gamesim.sh": ""
"bash.\U0001F528/init.70级.\U0001F911.ecommerce.sh": ""
"bash.\U0001F528/init.70级.\U0001F96F.crypto.sh": ""
"bash.\U0001F528/init.70级.\U0001F9E0.ai.sh": ""
"bash.\U0001F528/init.80级.\U0001F431‍\U0001F4BB.esp32.sh": ""
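
The -M output suggests the data is escaped rather than lost: each `\UXXXXXXXX` sequence is the code point of the original emoji. As a stopgap, the output could be post-processed to turn the escapes back into characters. A minimal Python sketch, assuming the escapes only ever appear inside double-quoted scalars as they do here (the function name `unescape_yaml_unicode` is hypothetical, and this is a text-level workaround, not a fix for the encoder):

```python
import re

# Matches an 8-hex-digit escape such as \U0001F528.
_ESCAPE = re.compile(r"\\U([0-9A-Fa-f]{8})")


def unescape_yaml_unicode(text):
    """Replace every \\UXXXXXXXX escape in `text` with the character it names.

    Assumption: the input is yq output like the sample in this issue. An
    escaped backslash followed by "U" and eight hex digits would be decoded
    incorrectly, and a value above 10FFFF would raise ValueError.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


line = r'"bash.\U0001F528/init.10级.\U0001F97E.b00t.sh": ""'
print(unescape_yaml_unicode(line))
```

Characters that were never escaped (级, ☕, the variation selector, the zero-width joiner) are left untouched, so sequences such as 🐱‍💻 are reassembled once both halves are decoded.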

@elasticdotventures
Author

elasticdotventures commented May 13, 2021

But -j (JSON output) apparently works:

$ cat emojifile.yaml | yq eval -j
{
  "bash.🔨/init.10级.🥾.b00t.sh": "",
  "bash.🔨/init.20级.🐧.linux.sh": "",
  "bash.🔨/init.22级.🐙.git.sh": "",
  "bash.🔨/init.30级.🐳.层.docker.sh": "",
  "bash.🔨/init.32级.💠.层.hashicorp.sh": "",
  "bash.🔨/init.40级.🐍.语.python.sh": "",
  "bash.🔨/init.40级.🚀.语.node.sh": "",
  "bash.🔨/init.42级.🦄.语.typescript.sh": "",
  "bash.🔨/init.43级.🥷.语.vue.sh": "",
  "bash.🔨/init.44级.☕.语.java.sh": "",
  "bash.🔨/init.44级.🏇.语.go.sh": "",
  "bash.🔨/init.50级.👾.云☁️.gcp.sh": "",
  "bash.🔨/init.50级.🤖.云☁️.azure.sh": "",
  "bash.🔨/init.50级.🦉.云☁️.aws.sh": "",
  "bash.🔨/init.60级.🎙️💙.应用.vscode.sh": "",
  "bash.🔨/init.70级.☎️.msg.sh": "",
  "bash.🔨/init.70级.🎬.video.sh": "",
  "bash.🔨/init.70级.📱.mobile.sh": "",
  "bash.🔨/init.70级.🕹️.gamesim.sh": "",
  "bash.🔨/init.70级.🤑.ecommerce.sh": "",
  "bash.🔨/init.70级.🥯.crypto.sh": "",
  "bash.🔨/init.70级.🧠.ai.sh": "",
  "bash.🔨/init.80级.🐱‍💻.esp32.sh": ""
}

@elasticdotventures
Author

Just confirmed the same behavior on yq 4.8.0.

@elasticdotventures
Author

Just confirmed that the "other" yq project works properly with emoji:
https://github.com/kislyuk/yq

When I said "earlier" versions worked, that was incorrect.
I didn't realize I'd switched repos.

@mikefarah
Owner

Digging a little into this: as far as I can tell, it's an issue with go-yaml, the underlying YAML parser :(

go-yaml/yaml#279

Not sure if I'll be able to work around it.

@mikefarah
Owner

Raised a new issue here: go-yaml/yaml#737

@mikefarah
Owner

Note that '-j' works because the issue is with the YAML encoder; the JSON encoder works fine.

@zhangguanzhang

If you are working in a shell, you could use this command:

tr -cd '\11\12\15\40-\176' < 1.yml  > new.yml
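
Worth spelling out what that filter does: `tr -cd` deletes every byte that is not in the listed set, and the set here is tab (octal 11), line feed (12), carriage return (15), and printable ASCII (octal 40 through 176). So it removes the emoji instead of preserving them, and it removes the Chinese characters as well. A rough Python equivalent, as a sketch for illustration only (the names `KEEP` and `strip_non_ascii` are made up here):

```python
# Byte values kept by: tr -cd '\11\12\15\40-\176'
# i.e. tab, LF, CR, and the printable ASCII range 0x20-0x7E.
KEEP = frozenset([0o11, 0o12, 0o15]) | frozenset(range(0o40, 0o176 + 1))


def strip_non_ascii(data):
    """Drop every byte of `data` that is not in KEEP, like the tr command."""
    return bytes(b for b in data if b in KEEP)


sample = "bash.🔨/init.10级.🥾.b00t.sh"
# Encode to UTF-8 first, because tr works on the raw bytes of the file.
print(strip_non_ascii(sample.encode("utf-8")).decode("ascii"))
# prints: bash./init.10..b00t.sh
```

Because both the emoji and 级 disappear, keys that differ only in their non-ASCII parts could end up identical after filtering.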
