Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

taskcluster d2g subcommand to support yaml #6731

Open
petemoore opened this issue Dec 7, 2023 · 10 comments · May be fixed by #6765
Open

taskcluster d2g subcommand to support yaml #6731

petemoore opened this issue Dec 7, 2023 · 10 comments · May be fixed by #6765
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@petemoore
Copy link
Member

petemoore commented Dec 7, 2023

We now point users at the d2g subcommand in task logs if they use a docker worker payload. It is likely many users will now look at this utility and start using it, once this lands on firefox-ci.

I suspect several will stumble / give up, when they see that .taskcluster.yml file is yaml, and they need to submit json.

I think it would be reasonable for the d2g subcommand to automatically switch between json/yaml depending on what it receives. If it receives yaml, it outputs yaml back, and vice versa. We could also add a command line option to switch format, but that already might be overkill. Presumably if you have a task definition in one format, and want to change it, you are happy to receive the output in the same format that you sent it, because in all likelihood, you will just replace the input with the output in the original file you got it from. And pretty much that is always yaml, so it would be nice to remove the burden for users to perform this additional translation manually, and just do it in the tool directly.

In Generic Worker we already support converting between yaml and json, so could be some useful code snippets there to see how to implement this.

@petemoore petemoore added bug Something isn't working good first issue Good for newcomers labels Dec 7, 2023
@anirudh-pandey
Copy link

anirudh-pandey commented Dec 27, 2023

Hi @petemoore*, I want to pick this bug. Please let me know if it's okay by you.

Problem statement[what I have understood, Please correct me If I am wrong]:
So we are currently only taking json file as an input/output. But now we need to make the d2g subcommand flexible that it takes yaml file as an input as well(and outputs yaml). Let me know if I have understood the problem statement correctly.

Also I am referring to this readme: https://github.com/taskcluster/taskcluster/blob/main/dev-docs/development-process.md for setting up and getting started with making changes in taskcluster code. Do let me know if I am referring to the correct doc.

PS: This is my first time contributing to any open source project so please forgive me for any silly/basic question. Thank you!

@anirudh-pandey
Copy link

@bhearsum @lotas Hi guys, can you please revert to me about this? As it's been a bit over a week and pete has still not replied. And I am far tooo eager to work on this 😅.

Thank you as always 😃.

@lotas
Copy link
Contributor

lotas commented Jan 5, 2024

Hey @anirudh-pandey and thanks for the interest!
d2g is a "small" part of the entire project, so you wouldn't likely need to follow the whole development-process.md just for that. Of course, you can, but you would not likely need everything, as this will require lots of resource to launch all the services.
You only need recent go to compile and run main shell client

@anirudh-pandey
Copy link

anirudh-pandey commented Jan 6, 2024

Hey @lotas, I have completed the coding of the issue. But I think maybe I don't have write access to this repo due to which I can't push my branch. It is showing me this error, when trying to push:

ERROR: Permission to taskcluster/taskcluster.git denied to anirudh-pandey.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I can fetch changes but can't push. Can you please give me access to this?

@lotas
Copy link
Contributor

lotas commented Jan 8, 2024

Hey @lotas, I have completed the coding of the issue. But I think maybe I don't have write access to this repo due to which I can't push my branch. It is showing me this error, when trying to push:

ERROR: Permission to taskcluster/taskcluster.git denied to anirudh-pandey.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I can fetch changes but can't push. Can you please give me access to this?

hey @anirudh-pandey this is expected. Please create a fork first, and then create a PR from your fork.
Once forked, just add new remote to your local repo, publish and create PR from threre.
Thanks!

@anirudh-pandey
Copy link

anirudh-pandey commented Jan 8, 2024

Hi @lotas, Thanks for the comment. I have always worked on a self created repo or a repo where I had write access so missed the forking part 😅.


I have created a PR, but there are 2 issues in which I need a little bit of your guidance.
So we have 4 different variations of d2g subcommand as follows:-


1. taskcluster d2g -f /path/to/input/payload.json                              [-f option]
2. taskcluster d2g --task-def --file /path/to/input/task-definition.json       [--task-def]
3. cat /path/to/input/task-definition.json | taskcluster d2g                   [Piped input]
3. cat /path/to/input/task-definition.json | taskcluster d2g -t                [Piped input and task def enabled]

So among these the first one is working properly.

So for others now I have 2 doubts, as follows:-


1. The second variant(Task Definition), I think my input case is missing a payload, because it gives me an error saying "Error: task definition input does not contain a payload". I get this SAME error even without any of my code changes.
Sample input json file used for testing: https://pastebin.com/EeheDscb.
Sample input yaml file used for testing: https://pastebin.com/WV4dWzd1.

2. If you see my code, I detect the type of input by reading file's extension. But in piped input I don't get this, so here I am confused on how to move forward. 
I can write my own function to detect this. But I thought that maybe I should ask you once if a function to detect if input is yaml/json is present or not(I tried searching but couldn't find anything like that). Or there is some other approach that I should take instead?

Thank you for your patience. Really appreciate it!!

@petemoore
Copy link
Member Author

Hi @anirudh-pandey,

Apologies for not spotting your messages in this issue earlier. Many thanks for your contributions here!

If you see my code, I detect the type of input by reading file's extension. But in piped input I don't get this, so here I am confused on how to move forward.
I can write my own function to detect this. But I thought that maybe I should ask you once if a function to detect if input is yaml/json is present or not(I tried searching but couldn't find anything like that). Or there is some other approach that I should take instead?

Any json is valid yaml, but not all yaml is valid json. Therefore I would propose first attempting to interpret the input as json. If this succeeds, you know the input is json, and you can generate a json response. If it doesn't, try to interpret it as yaml. If this succeeds, you know you have a yaml input, so generate a yaml output. If it is also not valid yaml, then report that the input is not valid.

The ordering is important here. If you were to test for yaml first, all json inputs would be valid yaml, and yaml output would be generated - that would be undesirable. Instead make sure to test with json first.

Note, you do not need to write separate code to test if the input is json or yaml. Just assume that it is, and if you get an error, you know that it isn't.

Good luck!

@anirudh-pandey
Copy link

Hi @petemoore, Thank you for the reply.

I have pondered hard on your comment but am still slightly confused.

I am mostly confused about these two lines:
but not all yaml is valid json. Therefore I would propose first attempting to interpret the input as json. If this succeeds, you know the input is json, and you can generate a json response. If it doesn't, try to **interpret** it as yaml. and you do not need to write separate code to test if the input is json or yaml.

If not every YAML is a valid JSON, how can I interpret it as YAML? without using YAMLToJSON function like I am doing currently.

So my current thinking/approach(based on your comment) is like this, please let me know incase of any issues:

Let's say we move current(master branch) Convert function to a boolean based function, let's name it ConvertTemp for now. It will take 2 inputs cmd and a bool flag named isJson and will return true/false. False incase of any error occurs.

Now inside my current Convert function I will first call ConvertTemp(cmd, isJson: true). If this returns false(error occured), then I'll call ConvertTemp(cmd, isJson: false). If this returns false I'll simply print an error saying input is invalid.

if isJson flag is true I'll call YAMLToJson and JSONToYAML functions at the start and end of my code like I am doing in my PR currently, even during Piped input. Were you trying to suggest something like this?

PS: Also can you please reply to my first doubt in this comment: #6731 (comment) related to --task-def variant of the command.

Thank you! 😄

@petemoore
Copy link
Member Author

Hi @anirudh-pandey
Many thanks for this - I'll take a look at this later today!

@petemoore
Copy link
Member Author

PS: Also can you please reply to my first doubt in this comment: #6731 (comment) related to --task-def variant of the command.

The --task-def variant is intended when you provide a full task definition, rather than just the payload portion of the task definition (the task definition includes other fields than just payload).

An improvement here would be to also detect if just a payload has been passed, or a full task definition, and handle the output accordingly (but leaving the --task-def option available for backwards compatibility). This would simplify the user experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

3 participants