Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osbuild-worker: improve error "reason" in case of stage failures (HMS-1442) #4113

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

schuellerf
Copy link
Contributor

This should improve the error in the frontend from

stage_error_before

to

stage_error_after

There is no test framework to test changes like this, right?

without the ".Error()" the json serialization fails
osbuildJobResult.JobError = clienterrors.WorkerClientError(clienterrors.ErrorBuildJob, "osbuild build failed", osbErrors)

reason := "osbuild build failed"
if len(failedStages) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit awkward to be pluralizing things manually :) You could just use in stage(s): instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this ends up in the frontend at the customers - I thought it nicer there...
although I really expected that there is a simple library for this (I'm not the golang guy :-/ )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw, this is usually handled via something like https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html as plural forms are complex in different languages. Not sure if we support i18n in the frontend (yet) but even if we don't preparing for it and getting "free" plural handling might be worth it (or not, idk). The topic of what gettext (if we go this route) to use in go is a bit more complex as there is no clear winner (at least last I looked).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also prefer a library (not sure which one, though) but I got the feeling we only support English for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there be actually more than one failed stage? I don't think so... 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if there will be support for running parallel stages soonish - so I'd rather support it here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, in that case up to you. I was just looking at the current implementation, which can't return failures in more than one stage.

@schuellerf schuellerf changed the title osbuild-worker: improve error "reason" in case of stage failures osbuild-worker: improve error "reason" in case of stage failures (HMS-1442) Apr 30, 2024
@@ -401,7 +401,7 @@ func (impl *OSBuildJobImpl) Run(job worker.Job) error {

osbuildVersion, err := osbuild.OSBuildVersion()
if err != nil {
osbuildJobResult.JobError = clienterrors.WorkerClientError(clienterrors.ErrorBuildJob, "Error getting osbuild binary version", err)
osbuildJobResult.JobError = clienterrors.WorkerClientError(clienterrors.ErrorBuildJob, "Error getting osbuild binary version", err.Error())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw. these lines (this one and 519) are maybe a fix for some cases where the details have not been propagated...

@@ -534,6 +535,7 @@ func (impl *OSBuildJobImpl) Run(job worker.Job) error {
logWithId.Infof(" %s success", stageResult.Type)
} else {
logWithId.Infof(" %s failure:", stageResult.Type)
failedStages = append(failedStages, stageResult.Type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the information about the failed stage name should be instead taken from the osbuildJobResult.OSBuildOutput.Error, which will contain it if the failure is caused by a stage failure.

See https://github.com/osbuild/osbuild/blob/1b5b015fef9da025ee872b63f46e0dc69f0c108b/osbuild/formats/v2.py#L426-L439

BTW, this also means that all of this is already part of the osbErrors added as a detail to JobError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably it still would make sense to add in the the reason (additionally) as it's shown more prominently in the UI - but I'll check to use osbuildJobResult.OSBuildOutput.Error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thozza at that level I get osbuildJobResult.OSBuildOutput.Error only as json.RawMessage not sure if I should try to guess the structure of it - or where could I get the type of the structure for sure? (couldn't find something like the ClientErrorCode or similar in that context)

Screenshot_20240502_113019

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a custom unmarshaler would be needed, which would check the value of the "type" property and unmarshal to the appropriate struct. We never implemented proper osbuild errors unmarshaling, because it was not needed. Until now.

I agree with putting more information into the "reason" of the error 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thozza ... and I thought that the type is not very strict so it would be more some kind of guessing and trying to cover all types of structs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We control what osbuild returns and it currently returns exactly one type of a struct in case of the stage failure. IMHO, it makes sense to standardize this in osbuild if it is not nice and easy to interpret in osbuild-composer (or any other place that consumes osbuild's JSON output).

We do this type of unmarshaling already in multiple places, e.g.:

func (u *uploadRequest) UnmarshalJSON(data []byte) error {
var rawUploadRequest rawUploadRequest
err := json.Unmarshal(data, &rawUploadRequest)
if err != nil {
return err
}
var settings uploadSettings
switch rawUploadRequest.Provider {
case "azure":
settings = new(azureUploadSettings)
case "aws":
settings = new(awsUploadSettings)
case "aws.s3":
settings = new(awsS3UploadSettings)
case "gcp":
settings = new(gcpUploadSettings)
case "vmware":
settings = new(vmwareUploadSettings)
case "oci":
settings = new(ociUploadSettings)
case "generic.s3":
// While the API still accepts provider type "generic.s3", the request is handled
// in the same way as for a request with provider type "aws.s3"
settings = new(awsS3UploadSettings)
case "container":
settings = new(containerUploadSettings)
case "pulp.ostree":
settings = new(pulpOSTreeUploadSettings)
default:
return errors.New("unexpected provider name")
}
err = json.Unmarshal(rawUploadRequest.Settings, settings)
if err != nil {
return err
}
u.Provider = rawUploadRequest.Provider
u.ImageName = rawUploadRequest.ImageName
u.Settings = settings
return err
}

func (target *Target) UnmarshalJSON(data []byte) error {
var rawTarget rawTarget
err := json.Unmarshal(data, &rawTarget)
if err != nil {
return err
}
var options TargetOptions
switch rawTarget.Name {
case TargetNameAzure:
options = new(AzureTargetOptions)
case TargetNameAWS:
options = new(AWSTargetOptions)
case TargetNameAWSS3:
options = new(AWSS3TargetOptions)
case TargetNameGCP:
options = new(GCPTargetOptions)
case TargetNameAzureImage:
options = new(AzureImageTargetOptions)
// Kept for backward compatibility
case TargetNameLocal:
options = new(LocalTargetOptions)
case TargetNameKoji:
options = new(KojiTargetOptions)
case TargetNameVMWare:
options = new(VMWareTargetOptions)
case TargetNameOCI:
options = new(OCITargetOptions)
case TargetNameOCIObjectStorage:
options = new(OCIObjectStorageTargetOptions)
case TargetNameContainer:
options = new(ContainerTargetOptions)
case TargetNameWorkerServer:
options = new(WorkerServerTargetOptions)
case TargetNamePulpOSTree:
options = new(PulpOSTreeTargetOptions)
default:
return fmt.Errorf("unexpected target name: %s", rawTarget.Name)
}
err = json.Unmarshal(rawTarget.Options, options)
if err != nil {
return err
}
target.Uuid = rawTarget.Uuid
target.ImageName = rawTarget.ImageName
target.OsbuildArtifact = rawTarget.OsbuildArtifact
target.Name = rawTarget.Name
target.Created = rawTarget.Created
target.Status = rawTarget.Status
target.Options = options
type compatOptionsType struct {
// Deprecated: `Filename` is now set in the target itself as `ExportFilename`, not in its options.
Filename string `json:"filename"`
}
var compat compatOptionsType
err = json.Unmarshal(rawTarget.Options, &compat)
if err != nil {
return err
}
// Kept for backward compatibility
// If the `ExportTarget` is not set in the `Target`, the request is most probably
// coming from an old composer. Copy the value from the target options.
if target.OsbuildArtifact.ExportFilename == "" {
target.OsbuildArtifact.ExportFilename = compat.Filename
}
return nil
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants