Research using zod for validating options #5639
Comments
I went through the documentation of Zod. Certainly looks interesting. There are some areas where I immediately see how we can apply it, for example: defining different option object shapes, depending on a certain key. (For example the authType of the login command, which could solve the optional required thing) Other areas do not immediately seem obvious, like optionSets. |
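One way to picture "different option object shapes, depending on a certain key" is zod's `z.discriminatedUnion`. The following is only a minimal sketch: the `loginOptions` schema is hypothetical, covers just three auth types, and is not how the CLI defines its options.

```typescript
import { z } from 'zod';

// Hypothetical, trimmed-down login options: each authType gets its own shape,
// so "required only for this authType" is expressed in the shape itself.
const loginOptions = z.discriminatedUnion('authType', [
  z.object({ authType: z.literal('deviceCode') }),
  z.object({
    authType: z.literal('password'),
    userName: z.string(),
    password: z.string()
  }),
  z.object({ authType: z.literal('secret'), secret: z.string() })
]);

// The inferred type is a union that TypeScript can narrow on authType
type LoginOptions = z.infer<typeof loginOptions>;

// password is missing, so the 'password' shape does not match
const missingPassword = loginOptions.safeParse({
  authType: 'password',
  userName: 'user@contoso.com'
});
const deviceCode = loginOptions.safeParse({ authType: 'deviceCode' });

console.log(missingPassword.success); // false
console.log(deviceCode.success); // true
```

A trade-off of this shape: the discriminator has to be present in the input, so a default `authType` would need to be applied before validation.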
It would solve the flag problem which I noticed the other day. Using zod and coercion of primitives. |
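The thread doesn't say which flag problem is meant, so the following is only a generic sketch of primitive coercion with `z.coerce`. The `pagingOptions` schema and its option names are made up for illustration.

```typescript
import { z } from 'zod';

// Hypothetical options; command-line values usually arrive as strings
const pagingOptions = z.object({
  pageSize: z.coerce.number().int().positive(),
  title: z.coerce.string()
});

// '25' is converted with Number(), 123 with String()
const coerced = pagingOptions.parse({ pageSize: '25', title: 123 });
console.log(coerced.pageSize, coerced.title); // 25 '123'

// 'abc' coerces to NaN, which fails number validation
const invalid = pagingOptions.safeParse({ pageSize: 'abc', title: 'x' });
console.log(invalid.success); // false
```

Note that `z.coerce.boolean()` relies on JavaScript's `Boolean()`, so the string `'false'` coerces to `true`. For boolean flags, it is likely safer to let the argument parser produce real booleans.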
It would be great to do a PoC to see it in action and explore its limitations, if any, in our scenario. There was also a much smaller alternative to zod that we could look into, but I can't recall its name. I'll need to try to find it again |
Might've been valibot |
There's a list of competitors on their site. Valibot isn't one of 'em... |
seems like another huge refactor is on the way 😅. |
We agreed that I'll do a POC so that we have something to look at and see how it meets our needs |
OK, I've done some work, and here's what I've got. Using zod, we can define a schema that allows us to replace const globalOptions = z.object({
query: z.string().optional(),
output: z.enum(['csv', 'json', 'md', 'text', 'none']).optional(),
debug: z.boolean().optional().default(false),
verbose: z.boolean().optional().default(false)
});
const options = globalOptions.extend({
authType: (z.enum(['certificate', 'deviceCode', 'password', 'identity', 'browser', 'secret']).optional().default('deviceCode') as unknown as ZodAliasType).alias('t'),
cloud: z.nativeEnum(CloudType).optional().default(CloudType.Public),
userName: z.string().optional(),
password: z.string().optional(),
certificateFile: z.string().optional()
// single property validator
.refine(filePath => !filePath || fs.existsSync(filePath), filePath => ({
message: `Certificate file ${filePath} does not exist`
})),
certificateBase64Encoded: z.string().optional(),
thumbprint: z.string().optional(),
appId: z.string().optional(),
tenant: z.string().optional(),
secret: z.string().optional()
})
// option sets
.refine(options => options.authType !== 'password' || options.userName, {
message: 'Username is required when using password authentication'
})
.refine(options => options.authType !== 'password' || options.password, {
message: 'Password is required when using password authentication'
})
.refine(options => options.authType !== 'certificate' || !(options.certificateFile && options.certificateBase64Encoded), {
message: 'Specify either certificateFile or certificateBase64Encoded, but not both.'
})
.refine(options => options.authType !== 'certificate' || options.certificateFile || options.certificateBase64Encoded, {
message: 'Specify either certificateFile or certificateBase64Encoded'
})
.refine(options => options.authType !== 'secret' || options.secret, {
message: 'Secret is required when using secret authentication'
}); zod implements basic type validation that you can extend using the For defining a short option, I added a custom function named What's cool about this approach is that we define everything in a single place. We've got one schema that defines options, aliases, autocomplete and whether they're required or not. When the validation succeeds, we get a strongly-typed object. zod runs its validation on an object. To convert user-provided command-line args into an object that we can validate with zod, we can use yargs-parser (we use minimist for this now). yargs-parser is very similar to minimist but offers more control over parsing numbers, removing aliases, etc. Next, I'll update one of the commands to see what it would look like inside the CLI. Looking forward to hearing your thoughts @pnp/cli-for-microsoft-365-maintainers |
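To illustrate the "validation succeeds, we get a strongly-typed object" point, here is a minimal sketch using `safeParse`. The schema is a cut-down, hypothetical variant of the one in the comment above (no aliases, two auth types only), and the input objects stand in for whatever the argument parser would produce.

```typescript
import { z } from 'zod';

// Cut-down, hypothetical schema: defaults plus one option-set rule
const options = z.object({
  authType: z.enum(['deviceCode', 'password']).default('deviceCode'),
  userName: z.string().optional(),
  debug: z.boolean().default(false)
}).refine(o => o.authType !== 'password' || o.userName !== undefined, {
  message: 'Username is required when using password authentication'
});

// Design-time type derived from the schema, no separate interface needed
type Options = z.infer<typeof options>;

// safeParse returns a result object instead of throwing
const ok = options.safeParse({});
const failed = options.safeParse({ authType: 'password' });

if (ok.success) {
  const data: Options = ok.data;
  console.log(data.authType, data.debug); // deviceCode false
}
if (!failed.success) {
  console.log(failed.error.issues[0]?.message);
  // Username is required when using password authentication
}
```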
One more benefit that we get with zod is that we can work with enums more reliably by getting design-time type safety, e.g.: // define output enum type
const OutputType = z.enum(['csv', 'json', 'md', 'text', 'none']);
type OutputType = z.infer<typeof OutputType>;
// add output to global options, make optional
export const globalOptions = z.object({
query: z.string().optional(),
output: OutputType.optional(),
debug: z.boolean().optional().default(false),
verbose: z.boolean().optional().default(false)
});
// parse user-provided command line args using yargs-parser
const output = ["--output", "json"];
const argv = parse(output);
// check what output was provided
switch (argv.output) {
// cool! design-time typesafe enum without fiddling with strings
case OutputType.enum.csv:
console.log('csv');
break;
case OutputType.enum.json:
console.log('json');
break;
case OutputType.enum.md:
console.log('md');
break;
case OutputType.enum.text:
console.log('text');
break;
case OutputType.enum.none:
console.log('none');
break;
} |
For commands that allow unknown options, we extend the schema with const options = globalOptions.extend({
authType: (z.enum(['certificate', 'deviceCode', 'password', 'identity', 'browser', 'secret']).optional().default('deviceCode') as unknown as ZodAliasType).alias('t'),
cloud: z.nativeEnum(CloudType).optional().default(CloudType.Public),
userName: z.string().optional(),
password: z.string().optional(),
certificateFile: z.string().optional()
.refine(filePath => !filePath || fs.existsSync(filePath), filePath => ({
message: `Certificate file ${filePath} does not exist`
})),
certificateBase64Encoded: z.string().optional(),
thumbprint: z.string().optional(),
appId: z.string().optional(),
tenant: z.string().optional(),
secret: z.string().optional(),
dummyNumber: z.number().optional(),
dummyBoolean: z.boolean().optional(),
})
.refine(options => options.authType !== 'password' || options.userName, {
message: 'Username is required when using password authentication'
})
.refine(options => options.authType !== 'password' || options.password, {
message: 'Password is required when using password authentication'
})
.refine(options => options.authType !== 'certificate' || !(options.certificateFile && options.certificateBase64Encoded), {
message: 'Specify either certificateFile or certificateBase64Encoded, but not both.'
})
.refine(options => options.authType !== 'certificate' || options.certificateFile || options.certificateBase64Encoded, {
message: 'Specify either certificateFile or certificateBase64Encoded'
})
.refine(options => options.authType !== 'secret' || options.secret, {
message: 'Secret is required when using secret authentication'
})
// allow unknown options
.and(z.any()); |
After fiddling with it some more, I found out that most likely we can't expose the whole schema incl. option set validation as one definition. |
Interesting work @waldekmastykarz 👍 |
I've got a working setup with options definition and single-option validation being in one part and the option sets being in another. Had to change One thing I like a lot so far is fewer strings and implicit definitions. Options and their types are expressed in code, which gives us more robustness and design-time feedback. By the way, here's the experimental code I've been fiddling with: https://github.com/waldekmastykarz/clim365-zod/blob/main/src/index.ts. It shows the option definition, but also how we can parse the schema to get the necessary information (aliases and value types) to parse args with yargs-parser. |
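A rough sketch of what such a split could look like, assuming the reason is that a refined schema no longer exposes the object shape. The helper name `withOptionSets` and the schema are hypothetical and are not taken from the linked repository.

```typescript
import { z } from 'zod';

// Part 1: option definitions and single-option validation (plain object schema)
const options = z.object({
  authType: z.enum(['deviceCode', 'password']).default('deviceCode'),
  userName: z.string().optional(),
  password: z.string().optional()
});

// A plain object schema still exposes its shape, e.g. to list option names
const optionNames = Object.keys(options.shape);
console.log(optionNames); // [ 'authType', 'userName', 'password' ]

// Part 2: option sets, layered on top as schema-wide refinements.
// withOptionSets is a hypothetical helper name.
function withOptionSets(schema: typeof options) {
  return schema
    .refine(o => o.authType !== 'password' || !!o.userName, {
      message: 'Username is required when using password authentication',
      path: ['userName']
    })
    .refine(o => o.authType !== 'password' || !!o.password, {
      message: 'Password is required when using password authentication',
      path: ['password']
    });
}

const refined = withOptionSets(options);
const result = refined.safeParse({ authType: 'password', userName: 'user@contoso.com' });

if (!result.success) {
  // Only the password rule fails for this input
  console.log(result.error.issues.map(i => i.path.join('.'))); // [ 'password' ]
}
```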
Nice research @waldekmastykarz! 👏 Looks like something we could definitely use. A few remarks from my side:
|
Yes, this is basically the condition you specify on a schema-wide refiner, e.g. .refine(options => options.authType !== 'password' || options.userName, {
message: 'Username is required when using password authentication'
}) Runs only when
We'd do this in yargs-parser, which supports it. If we wait for zod, it's too late because the data is lost by then by whoever parsed command line args. |
@waldekmastykarz another question. We have very few options that accept two types as a value. Take planner task set, for example. Is this supported by zod? This will also be quite annoying with yargs-parser, right? Worst case, we always have to parse it as a string. |
zod allows you to define union types so in this case it could be |
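The actual definition is missing from the comment above. As a generic illustration of a union type in zod, here is a sketch with a hypothetical `priority` option that accepts either a number or one of a few labels. The value range and label names are assumptions for the example.

```typescript
import { z } from 'zod';

// Hypothetical option that accepts a number or a named level
const priority = z.union([
  z.number().int().min(0).max(10),
  z.enum(['Urgent', 'Important', 'Medium', 'Low'])
]);

// Inferred as: number | 'Urgent' | 'Important' | 'Medium' | 'Low'
type Priority = z.infer<typeof priority>;

const asNumber = priority.safeParse(5);
const asLabel = priority.safeParse('Low');
// The string '5' is neither a number nor one of the labels
const asNumericString = priority.safeParse('5');

console.log(asNumber.success, asLabel.success, asNumericString.success);
// true true false
```

The last case shows why the argument parser matters here: if it hands over `'5'` as a string, the union only matches when the parser (or a coercion step) has already turned it into a number.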
Awesome work on this one @waldekmastykarz! Looks very promising to use. |
OK, I've got one command (login) done. Check out: waldekmastykarz@390cfda. I've also replaced minimist with yargs-parser. The current setup lets the old and the new zod-based validation work side by side, so we can gradually migrate commands over time. By the way, zod-based validation is about 3x slower than ours (0.6ms vs. 0.2ms), but the difference is imperceptible. This is just the first step, so looking forward to hearing what you think. |
I've added support for dynamically building the telemetry: waldekmastykarz@fabf164. It turns out, that we don't need to parse the zod schema. Instead, we can use the command options info, which we gather while building the project. We have this information when loading the command and to get access to it, we need to pass it from the runtime into the command. Then it's just a matter of iterating over the options and correctly recording the different properties. |
Nice! That way it simplifies the command codebase even more. Looks very clean! |
Just pushed one more commit to include the necessary tests for 100% coverage. I suggest that we:
|
It's pretty cool to see how much we can simplify building commands. Building the schema will take getting used to, but with a few examples, we'll have the most cases covered. Looking forward to seeing it adopted across our codebase and finally removing the obsolete code. 🚀 |
I'll try to make some time to go through it more thoroughly. But, as mentioned before, I'm a big fan of the fluent writing style. It'll indeed take a bit of time to get used to, but the readability and simplicity will make it worth it in the end. |
I just pulled the code locally and took a deeper look at the setup. It seems like this will be a substantial refactor, especially since we'll need to update the tests too. However, the tests won’t be asynchronous anymore, which is a benefit. Here are a few observations:
Just a few minor points, as the implementation overall seems very well done. Once again, great job on this! |
Yes! This is a factory which is responsible for retrieving invoking specific parsing functions. In each function, we want to have access to the exact type it supports. But when getting the function, we need to support every type and as far as I can tell, there's no shared type across all types that we could, hence
If
I chose for |
I figured this might be the reason. We could create a typed alias for this scenario, but I'm not sure it's worth the effort.
Well, I appear to have overlooked that one 😅
Good point. I considered using |
With that, @pnp/cli-for-microsoft-365-maintainers are we ready to accept this PR or is there anything else we should research first? |
Had one more check (I hope final) on the implementation:
My opinion: if we want to take this step (and we do) and we see the benefit in it (why reinvent the wheel, right?), then let's do it! If the zod implementation can work side by side with what we already have, I don't see a reason to hold this up, and I would open a PR with it 👍.
|
The schema decides which properties are tracked and we're using the same logic which we're using now:
I'll get on submitting the PR. I appreciate your feedback! |
Ah OK, perfect then 🤩😍 |
I've opened the PR. After we merge it, let's create an epic where we can track upgrading all commands to use zod and finally decommissioning the obsolete code. |
I'd like to propose that we research the viability of using zod for validating options. While originally we only had required and optional options, later we introduced option sets. Yet, it seems like we're missing some scenarios (#5636 and #5637). What's more, our validation of options is split into a few parts: required/optional, optionsets, and the actual values and types.
Using a package like zod would allow us to centralize all validation rules in a single schema, covering all aspects for each option. Let's do a PoC to see if it would help us. Refactoring would mean a significant effort, so let's give it a good thought before we commit to it.