Running compose status fails on Get "http://localhost/api/v1/compose/queue": EOF #3779

Open
lubosmj opened this issue Nov 2, 2023 · 5 comments

lubosmj commented Nov 2, 2023

Describe the bug
With a fresh osbuild install, I cannot check the status of the current queue. Adding a new build to the queue works without any issues.

sudo composer-cli compose status --json
ERROR: List Error: Get "http://localhost/api/v1/compose/queue": EOF
null
sudo systemctl status osbuild-composer.service
● osbuild-composer.service - OSBuild Composer
     Loaded: loaded (/usr/lib/systemd/system/osbuild-composer.service; disabled; preset: disabled)
     Active: active (running) since Thu 2023-11-02 16:47:36 CET; 41min ago
TriggeredBy: ● osbuild-composer.socket
             ● osbuild-local-worker.socket
   Main PID: 104723 (osbuild-compose)
      Tasks: 14 (limit: 18968)
     Memory: 1.9G
        CPU: 4min 633ms
     CGroup: /system.slice/osbuild-composer.service
             └─104723 /usr/libexec/osbuild-composer/osbuild-composer

Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /builddir/build/BUILD/osbuild-composer-92/int>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: net/http.serverHandler.ServeHTTP({0xc000bbc030?}, {0x>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:2947 +>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: net/http.(*conn).serve(0xc0003d41e0, {0x56212b527d40,>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:1991 +>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: created by net/http.(*Server).Serve
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:3102 +>
Nov 02 17:28:35 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:28:35 POST /api/v1/blueprints/new
Nov 02 17:28:36 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:28:36 POST /api/v1/compose
Nov 02 17:28:36 localhost.localdomain osbuild-composer[104723]: time="2023-11-02T17:28:36+01:00" level=warning msg="F>
lines 1-22/22 (END)
sudo journalctl -b -u osbuild-composer.service
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:24:30 GET /api/v1/compose/queue
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:24:30 http: panic serving @: error unmarshaling result for job '5308aeca-384d-4710-9857-ffbc997178fe': unexpected end of JSON input
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: goroutine 168 [running]:
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: net/http.(*conn).serve.func1()
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:1850 +0xbf
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: panic({0x56212b356680, 0xc0009b24f0})
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/runtime/panic.go:890 +0x262
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).getComposeStatus(0xc000296000, {0xc000b448f0, {0x0, {0x56212b5301b8, 0xc000532ea0}, {0xc000da8000, 0x2ca7b, 0x2e000>
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /builddir/build/BUILD/osbuild-composer-92/internal/weldr/api.go:398 +0x2e8
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).composeQueueHandler(0xc000296000, {0x56212b526a50?, 0xc0007622a0}, 0xc0002e60fa?, {0xc000474b00, 0x1, 0x4})
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /builddir/build/BUILD/osbuild-composer-92/internal/weldr/api.go:2819 +0x1c5
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc0004d5a40, {0x56212b526a50, 0xc0007622a0}, 0xc000138100)
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /builddir/build/BUILD/osbuild-composer-92/vendor/github.com/julienschmidt/httprouter/router.go:387 +0x81c
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).ServeHTTP(0xc000296000, {0x56212b526a50, 0xc0007622a0}, 0xc000138100)
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /builddir/build/BUILD/osbuild-composer-92/internal/weldr/api.go:303 +0x16a
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: net/http.serverHandler.ServeHTTP({0xc000bbc030?}, {0x56212b526a50, 0xc0007622a0}, 0xc000138100)
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:2947 +0x30c
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: net/http.(*conn).serve(0xc0003d41e0, {0x56212b527d40, 0xc000b334a0})
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:1991 +0x607
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]: created by net/http.(*Server).Serve
Nov 02 17:24:30 localhost.localdomain osbuild-composer[104723]:         /usr/lib/golang/src/net/http/server.go:3102 +0x4db
Nov 02 17:28:35 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:28:35 POST /api/v1/blueprints/new
Nov 02 17:28:36 localhost.localdomain osbuild-composer[104723]: 2023/11/02 17:28:36 POST /api/v1/compose
Nov 02 17:28:36 localhost.localdomain osbuild-composer[104723]: time="2023-11-02T17:28:36+01:00" level=warning msg="Failed to load consumer certs: no consumer key found" func=github.com/osbuild/images/pkg/rhsm.LoadSystemSubscriptions fil>
lines 542-595/595 (END)

Environment

  • Fedora 37
  • osbuild-composer version (rpm -qi osbuild-composer):
Name        : osbuild-composer
Version     : 92
Release     : 1.fc37
Architecture: x86_64
Install Date: Thu 02 Nov 2023 04:47:07 PM CET
Group       : Unspecified
Size        : 17780
License     : Apache-2.0
Signature   : RSA/SHA256, Wed 18 Oct 2023 11:21:42 AM CEST, Key ID f55ad3fb5323552a
Source RPM  : osbuild-composer-92-1.fc37.src.rpm
weldr-client.x86_64     35.7-1.fc37

To Reproduce
Run composer-cli compose status, e.g.:

echo """name = \"fishy-commit\"
description = \"Fishy OSTree commit\"
version = \"0.0.1\"

[[packages]]
name = \"fish\"
version = \"*\"""" > fishy.toml

sudo composer-cli blueprints push fishy.toml
sudo composer-cli compose start-ostree fishy-commit fedora-iot-commit --ref fedora/stable/x86_64/iot
sudo composer-cli compose status

Expected behaviour
The status of the queue is printed out.

Additional context
I tried following the instructions at https://access.redhat.com/solutions/7016594 and rebooted my system a couple of times as well. Neither resolved the issue. I am sure I was able to run the attached reproducer 6+ months ago.

lubosmj changed the title from 'composer-cli compose status fails on Get "http://localhost/api/v1/compose/queue": EOF' to 'Running compose status fails on Get "http://localhost/api/v1/compose/queue": EOF' on Nov 2, 2023
bcl self-assigned this on Nov 2, 2023
bcl (Contributor) commented Nov 2, 2023

I'm not sure how the system got into this state, but it looks like a job info JSON file in /var/lib/osbuild-composer/jobs became corrupted, and the server does not handle that gracefully.
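To narrow down which file is affected before deleting anything, a small scan along these lines may help. It is only a sketch: it assumes the job files are `*.json` files somewhere under the jobs directory (the thread does not confirm the layout), and `find_unparseable_json` is a hypothetical helper name, not part of osbuild-composer.

```python
import json
from pathlib import Path


def find_unparseable_json(root):
    """Return (path, reason) pairs for *.json files under root that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            # An empty or truncated file raises JSONDecodeError here.
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            bad.append((path, str(err)))
    return bad
```

Calling `find_unparseable_json("/var/lib/osbuild-composer/jobs")` will most likely need root privileges. It only catches files that are not valid JSON as a whole; a file that parses but carries an empty or invalid nested field would not be flagged, so comparing file names or contents against the job ID from the panic message is still worthwhile.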

If you do not mind losing your saved blueprints, sources, previous composes, etc., you should be able to work around this with:

rm -rf /var/lib/osbuild-composer/jobs/*
rm /var/lib/osbuild-composer/state.json
reboot

This deletes all the job files and the state file, which holds all the details of the system. After rebooting, you will have to push any blueprints and re-add any sources you need. This is of course only a temporary fix, and I'll look into making things fail more gracefully.
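As an illustration of what "failing more gracefully" could mean, here is a minimal sketch in Python (the server itself is written in Go, and this is not the fix the maintainers implemented). The idea is to record a decode failure against the one affected job instead of letting it abort the whole queue listing. `summarize_jobs` and its input mapping are hypothetical.

```python
import json


def summarize_jobs(raw_results):
    """Map job ID -> status dict, tolerating results that fail to decode.

    raw_results is a hypothetical mapping of job ID to the raw JSON text
    of that job's result.
    """
    summary = {}
    for job_id, raw in raw_results.items():
        try:
            summary[job_id] = {"ok": True, "result": json.loads(raw)}
        except json.JSONDecodeError as err:
            # Record the failure for this one job; the other jobs still get listed.
            summary[job_id] = {"ok": False, "error": f"undecodable result: {err}"}
    return summary
```

With this shape, one damaged entry shows up as a failed job in the listing rather than turning the whole request into an error.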

lubosmj (Author) commented Nov 2, 2023

Yes. I can confirm that this workaround helped:

[lmjachky@localhost ostree]$ sudo composer-cli blueprints push fishy.toml
[lmjachky@localhost ostree]$ sudo composer-cli compose start-ostree fishy-commit fedora-iot-commit --ref fedora/stable/x86_64/iot
Compose 15881de1-d729-4f7c-9a9e-2114c0eddaaf added to the queue
[lmjachky@localhost ostree]$ sudo composer-cli compose status
ID                                     Status     Time                      Blueprint         Version   Type               Size
15881de1-d729-4f7c-9a9e-2114c0eddaaf   RUNNING    Thu Nov 2 21:57:00 2023   fishy-commit      0.0.1     iot-commit      

Could the JSON file become corrupted during OS upgrades?

bcl (Contributor) commented Nov 3, 2023

I suppose it is possible if the system was rebooted right in the middle of writing the file, but I'd consider that very unlikely.
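For context on the interrupted-write scenario: a common way to keep a reader from ever seeing a half-written file is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general pattern in Python; it makes no claim about how osbuild-composer actually writes its job files, and `write_json_atomically` is a hypothetical name.

```python
import json
import os
import tempfile


def write_json_atomically(path, data):
    """Write data as JSON to path so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If a crash happens before the rename, the previous version of the file is still intact and only a stray `.tmp` file is left behind.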

sjr0228 commented Nov 3, 2023

I found myself in a similar situation this morning after removing the files in the jobs directory.
osbuild-composer-75.1.el8.x86_64

My solution was:

find /var/lib/osbuild-composer/ -type f -delete 
rm -rf /var/cache/osbuild-composer/rpmmd/*
systemctl | grep '^  osbuild.*' | cut -d" " -f 3 | xargs systemctl restart --no-block

This avoids bringing down the whole system; it is just another approach if you need to recover again.

bcl (Contributor) commented Nov 3, 2023

The next time someone hits this, could you tar up the /var/lib/osbuild-composer/ directory tree and email it to me? That is, assuming there is nothing in there you mind me looking at. Two people in one week tells me this is happening way more often than I thought.
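Capturing the tree before running any of the cleanup commands keeps the evidence around. A minimal sketch using Python's standard `tarfile` module follows; `archive_tree` and the output file name in the usage note are hypothetical, and reading the directory will likely require root.

```python
import tarfile
from pathlib import Path


def archive_tree(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tarball and return the archive path."""
    source = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the paths inside the archive relative to the directory name.
        tar.add(str(source), arcname=source.name)
    return Path(archive_path)
```

For example, `archive_tree("/var/lib/osbuild-composer", "osbuild-composer-state.tar.gz")` would produce an archive whose entries start with `osbuild-composer/`.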
