Docker for Mac doesn't work properly with IPv6 networks on MacOS Sierra 10.12.4 #1586

Closed
BrendonW opened this issue May 3, 2017 · 29 comments

BrendonW commented May 3, 2017

Expected behavior

I expect Docker to resolve external names to IP addresses correctly.

Actual behavior

The problem shows itself in many forms. The first was a failing "apk update" inside Alpine images.

fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.5/main: temporary error (try again later)
WARNING: Ignoring APKINDEX.c51f8f92.tar.gz: No such file or directory
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/community/x86_64/APKINDEX.tar.gz
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.5/community: temporary error (try again later)
WARNING: Ignoring APKINDEX.d09172fd.tar.gz: No such file or directory
2 errors; 11 distinct packages available
/ # apk add bind-tools
WARNING: Ignoring APKINDEX.c51f8f92.tar.gz: No such file or directory
WARNING: Ignoring APKINDEX.d09172fd.tar.gz: No such file or directory
ERROR: unsatisfiable constraints:
 bind-tools (missing):
   required by: world[bind-tools]

Another example, where the activity on the host shows that it fails to use the correct DNS resolution, leading to a failure:

Firefly:mariadb brendon$ docker run -it ubuntu bash
Unable to find image 'ubuntu:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: unauthorized: incorrect username or password.
See 'docker run --help'.

Information

This seems to be an interaction with the Mac DHCP/DNS resolution system on machines with IPv6 access. Disabling IPv6 on the machine and changing the network settings to use just 8.8.8.8 for DNS allows existing containers to operate correctly internally (although now your machine can't do IPv6!), but the Docker for Mac VM doesn't stop trying to resolve AAAA records. And if the destination has no IPv6 address, then even finding a valid IPv4 address doesn't help, because it DOES NOT USE the IPv4 address!
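To see what the host's own resolver returns for each address family, a quick check along these lines may help. This is only a minimal sketch built on Python's standard `socket.getaddrinfo`; it is not taken from Docker or vpnkit, and `split_by_family` and `lookup` are hypothetical helper names.

```python
import socket


def split_by_family(addrinfo):
    """Group getaddrinfo() results into IPv4 ("A") and IPv6 ("AAAA") addresses."""
    result = {"A": [], "AAAA": []}
    for family, _socktype, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            key = "A"
        elif family == socket.AF_INET6:
            key = "AAAA"
        else:
            continue  # ignore any other address family
        address = sockaddr[0]
        if address not in result[key]:
            result[key].append(address)
    return result


def lookup(host, port=80):
    """Resolve host through the system resolver (needs network access)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return split_by_family(infos)
```

Calling `lookup("dl-cdn.alpinelinux.org")` on the host shows whether the system resolver hands back IPv4 addresses, IPv6 addresses, or both for that name.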

Docker for Mac: version: 17.05.0-ce-rc1-mac8 (73d01bb48)
macOS: version 10.12.4 (build: 16E195)
logs: /tmp/627A70C3-1104-434E-BD75-52C76F842636/20170503-105450.tar.gz
[OK]     db.git
[OK]     vmnetd
[OK]     dns
[OK]     driver.amd64-linux
[OK]     virtualization VT-X
[OK]     app
[OK]     moby
[OK]     system
[OK]     moby-syslog
[OK]     db
[OK]     env
[OK]     virtualization kern.hv_support
[OK]     slirp
[OK]     osxfs
[OK]     moby-console
[OK]     logs
[OK]     docker-cli
[OK]     menubar
[OK]     disk

  • A reproducible case if this is a bug, Dockerfiles FTW
    Nothing special required.

  • syslog -k Sender Docker output from a docker restart:-

May  3 10:52:48 Firefly Docker[2393] <Notice>: Logging to Apple System Log
May  3 10:52:48 Firefly Docker[2395] <Warning>: Sending SIGKILL to com.docker.hyperkit 1014
May  3 10:52:48 Firefly Docker[2395] <Notice>: Acquired hypervisor lock
May  3 10:52:48 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: dial unix /Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948: connect: connection refused: waiting 0.5s
May  3 10:52:48 Firefly Docker[2395] <Notice>: OSX version = 10.12.4, default value of on-sleep = no not freeze
May  3 10:52:48 Firefly Docker[2394] <Notice>: Logging to Apple System Log
May  3 10:52:48 Firefly Docker[2394] <Notice>: Setting handler to ignore all SIGPIPE signals
May  3 10:52:48 Firefly Docker[2394] <Notice>: vpnkit version aa7a73e738cff5a450de96df974f65378430ebc3 with hostnet version   uwt version 0.1.0 hvsock version 0.13.0 
May  3 10:52:48 Firefly Docker[2394] <Notice>: starting port forwarding server on port_control_url:fd:4 vsock_path:/Users/brendon/Library/Containers/com.docker.docker/Data/connect
May  3 10:52:48 Firefly Docker[2394] <Notice>: 2 upstream DNS servers are configured
May  3 10:52:48 Firefly Docker[2394] <Notice>: attempting to reconnect to database
May  3 10:52:48 Firefly Docker[2394] <Notice>: hosts file has bindings for localhost broadcasthost localhost
May  3 10:52:48 Firefly Docker[2394] <Notice>: Add(2): DNS configuration changed to: use upstream DNS servers nameserver 8.8.8.8#53
	order 0
May  3 10:52:48 Firefly Docker[2394] <Notice>: reconnected transport layer
May  3 10:52:48 Firefly Docker[2394] <Notice>: updating connection limit to 2000
May  3 10:52:48 Firefly Docker[2394] <Notice>: allowing binds to any IP addresses
May  3 10:52:48 Firefly Docker[2394] <Notice>: updating resolvers to use upstream DNS servers nameserver 8.8.8.8#53
	timeout 2000
	order 200000
May  3 10:52:48 Firefly Docker[2394] <Notice>: Add(3): DNS configuration changed to: use upstream DNS servers nameserver 8.8.8.8#53
	timeout 2000
	order 200000
May  3 10:52:48 Firefly Docker[2394] <Notice>: 1 upstream DNS servers are configured
May  3 10:52:48 Firefly Docker[2394] <Notice>: updating resolvers to use host resolver
May  3 10:52:48 Firefly Docker[2394] <Notice>: Remove(3): DNS configuration changed to: use upstream DNS servers nameserver 8.8.8.8#53
	order 0
May  3 10:52:48 Firefly Docker[2394] <Notice>: Add(3): DNS configuration changed to: use host resolver
May  3 10:52:48 Firefly Docker[2394] <Notice>: Will use the host's DNS resolver
May  3 10:52:48 Firefly Docker[2394] <Notice>: Creating slirp server peer_ip:192.168.65.2 local_ip:192.168.65.1 domain_search: mtu:1500 bridge:true
May  3 10:52:48 Firefly Docker[2395] <Notice>: No need to perform database migration: defaults branch already exists
May  3 10:52:49 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: dial unix /Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948: connect: connection refused: waiting 0.5s
May  3 10:52:49 Firefly Docker[2395] <Notice>: hypervisor: native
May  3 10:52:49 Firefly Docker[2395] <Notice>: TRIM is enabled; recycling thread will keep 0 sectors free and will compact after 0 more sectors are free
May  3 10:52:49 Firefly Docker[2395] <Notice>: filesystem: osxfs
May  3 10:52:49 Firefly Docker[2395] <Notice>: network: hybrid
May  3 10:52:49 Firefly Docker[2393] <Notice>: Using protocol TwoThousand msize 16384
May  3 10:52:49 Firefly Docker[2395] <Notice>: Hypervisor: native; BootProtocol: direct; UefiBootDisk: /Users/brendon/UefiBoot.qcow2
May  3 10:52:49 Firefly Docker[2395] <Notice>: Syslog socket is /Users/brendon/Library/Containers/com.docker.docker/Data/00000002.00000202
May  3 10:52:49 Firefly Docker[2395] <Notice>: Logfile is /Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/syslog
May  3 10:52:49 Firefly Docker[2395] <Notice>: Launched[2399]: /Applications/Docker.app/Contents/Resources/bin/hyperkit -A -m 2048M -c 4 -u -s 0:0,hostbridge -s 31,lpc -s 2:0,virtio-vpnkit,uuid=ff1e8970-c750-4461-a68e-45c66e2acb6d,path=/Users/brendon/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0 -s 3,ahci-hd,file:///Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2?sync=os&buffered=1,format=qcow,qcow-config=discard=true;compact_after_unmaps=0;keep_erased=0;runtime_asserts=false -s 4,virtio-9p,path=/Users/brendon/Library/Containers/com.docker.docker/Data/s40,tag=db -s 5,virtio-rnd -s 6,virtio-9p,path=/Users/brendon/Library/Containers/com.docker.docker/Data/s51,tag=port -s 7,virtio-sock,guest_cid=3,path=/Users/brendon/Library/Containers/com.docker.docker/Data,guest_forwards=2376;1525 -l com1,autopty=/Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty,log=/Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/console-ring -f kexec,/Applications/Docker.app/Contents/Resources/moby/vmlinuz64,/Applications/Docker.app/Contents/Resources/moby/initrd.img,earlyprintk=serial console=ttyS0 com.docker.driver="com.docker.driver.amd64-linux", com.docker.database="com.docker.driver.amd64-linux" ntp=gateway mobyplatform=mac vsyscall=emulate page_poison=1 panic=1 -F /Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/hypervisor.pid
May  3 10:52:49 Firefly Docker[2395] <Notice>: SC database lists search domains: 
May  3 10:52:49 Firefly Docker[2395] <Notice>: SC database includes DNS service: { Addresses: 8.8.8.8; Order: 200000; Zones:  }
May  3 10:52:49 Firefly Docker[2395] <Notice>: SC database has domain name: 
May  3 10:52:49 Firefly Docker[2395] <Notice>: watcher.Add(/etc/resolver) failed: &os.PathError{Op:"lstat", Path:"/etc/resolver", Err:0x2}
May  3 10:52:49 Firefly Docker[2395] <Warning>: Failed to watch /etc/resolver: &os.PathError{Op:"lstat", Path:"/etc/resolver", Err:0x2}: if this is needed, create the directory and restart the app
May  3 10:52:49 Firefly Docker[2394] <Notice>: PPP.negotiate: received { magic = VMN3T; version = 1; commit = 73d01bb48e39db1d7d76f279a68f7fadc46b15a1 }
May  3 10:52:49 Firefly Docker[2394] <Notice>: PPP.negotiate: received Ethernet ff1e8970-c750-4461-a68e-45c66e2acb6d
May  3 10:52:49 Firefly Docker[2395] <Notice>: virtio-net-vpnkit: magic=VMN3T version=1 commit=0123456789012345678901234567890123456789
May  3 10:52:49 Firefly Docker[2394] <Notice>: PPP.negotiate: sending { mtu = 1500; max_packet_size = 1550; client_macaddr = 02:50:00:00:00:01 }
May  3 10:52:49 Firefly Docker[2394] <Notice>: Client mac: 02:50:00:00:00:01 server mac: f6:16:36:bc:f9:c6
May  3 10:52:49 Firefly Docker[2394] <Notice>: TCP/IP ready
May  3 10:52:49 Firefly Docker[2394] <Notice>: stack connected
May  3 10:52:49 Firefly Docker[2394] <Notice>: starting introspection server on: fd:5
May  3 10:52:49 Firefly Docker[2394] <Notice>: starting diagnostics server on: fd:6
May  3 10:52:49 Firefly Docker[2395] <Notice>: mirage_block_open: block_config = file:///Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2?sync=os&buffered=1 and qcow_config = discard=true;compact_after_unmaps=0;keep_erased=0;runtime_asserts=false
May  3 10:52:49 Firefly Docker[2395] <Notice>: com.docker.hyperkit: [INFO] Resized file to 284294 clusters (36389632 sectors)
May  3 10:52:49 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: dial unix /Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948: connect: connection refused: waiting 0.5s
May  3 10:52:50 Firefly Docker[2395] <Notice>: mirage_block_open: block_config = file:///Users/brendon/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2?sync=os&buffered=1 and qcow_config = discard=true;compact_after_unmaps=0;keep_erased=0;runtime_asserts=false returning 0
May  3 10:52:50 Firefly Docker[2395] <Notice>: com.docker.hyperkit: [INFO] image has 0 free sectors and 284291 used sectors
May  3 10:52:50 Firefly Docker[2395] <Notice>: mirage_block_stat
May  3 10:52:50 Firefly Docker[2395] <Notice>: vsock init 7:0 = /Users/brendon/Library/Containers/com.docker.docker/Data, guest_cid = 00000003
May  3 10:52:52 Firefly Docker[2395] <Notice>: 
	rdmsr to register 0x34 on vcpu 0
May  3 10:52:52 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: EOF: waiting 0.5s
--- last message repeated 11 times ---
May  3 10:52:58 Firefly Docker[2395] <Notice>: com.docker.hyperkit: [INFO] Allocator: 284291 used; 0 junk; 0 erased; 0 available; 0 copies; 0 roots; 0 Copying; 0 Copied; 0 Flushed; 0 Referenced; max_cluster = 284293
May  3 10:52:58 Firefly Docker[2395] <Notice>: com.docker.hyperkit: [INFO] Allocator: file contains cluster 0 .. 284293 will enlarge file to 0 .. 284805
May  3 10:52:58 Firefly Docker[2395] <Notice>: com.docker.hyperkit: [INFO] resize: adding available clusters (Node ((x 284294) (y 284805) (l Empty) (r Empty) (h 1) (cardinal 512)))
May  3 10:52:58 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: EOF: waiting 0.5s
--- last message repeated 2 times ---
May  3 10:53:00 Firefly Docker[2394] <Notice>: BOOTREQUEST from 02:50:00:00:00:01
May  3 10:53:00 Firefly Docker[2394] <Notice>: BOOTREPLY to 02:50:00:00:00:01 yiddr 192.168.65.2 siddr 192.168.65.1 dns 192.168.65.1 router 192.168.65.1 domain local
May  3 10:53:00 Firefly Docker[2393] <Notice>: transfused: mount /bin/fusermount -o allow_other,max_read=1048576,subtype=osxfs /Users
May  3 10:53:00 Firefly Docker[2393] <Notice>: transfused: mount /bin/fusermount -o allow_other,max_read=1048576,subtype=osxfs /Volumes
May  3 10:53:00 Firefly Docker[2393] <Notice>: transfused: mount /bin/fusermount -o allow_other,max_read=1048576,subtype=osxfs /tmp
May  3 10:53:00 Firefly Docker[2393] <Notice>: transfused: mount /bin/fusermount -o allow_other,max_read=1048576,subtype=osxfs /private
May  3 10:53:00 Firefly Docker[2393] <Notice>: transfused: mount /bin/fusermount -o allow_other,max_read=1048576,subtype=osxfs /host_docker_app
May  3 10:53:00 Firefly Docker[2393] <Notice>: Negotiated transfuse notification channel for /Users
May  3 10:53:00 Firefly Docker[2393] <Notice>: Negotiated transfuse notification channel for /Volumes
May  3 10:53:00 Firefly Docker[2393] <Notice>: Negotiated transfuse notification channel for /tmp
May  3 10:53:00 Firefly Docker[2393] <Notice>: Negotiated transfuse notification channel for /private
May  3 10:53:00 Firefly Docker[2393] <Notice>: Negotiated transfuse notification channel for /host_docker_app
May  3 10:53:00 Firefly Docker[2393] <Notice>: sending continue to client
May  3 10:53:00 Firefly Docker[2395] <Notice>: Docker is not responding: Get http://./info: EOF: waiting 0.5s
May  3 10:53:00 Firefly Docker[2394] <Notice>: Using protocol TwoThousand msize 8192
May  3 10:53:05 Firefly Docker[2395] <Notice>: Docker is responding
May  3 10:53:05 Firefly Docker[1001] <Notice>: VM started at 2017-05-03 10:53:05 -0700 PDT
May  3 10:53:05 Firefly Docker[2393] <Warning>: UNEXPECTED event message 31 (10) FUSE_GETXATTR.p1439.u0.g0 name=security.capability size=0
May  3 10:53:06 Firefly Docker[2395] <Notice>: startWatchingEvents existing (autostarted?) container eefd0f4c38b9ee3a67677bca26bfe35322ac32d45347977c75dd39d4016b7f35
May  3 10:53:41 Firefly Docker[2394] <Warning>: DNS lookup dl-cdn.alpinelinux.org AAAA: NoSuchRecord
May  3 10:53:41 Firefly Docker[2394] <Notice>: DNS lookup dl-cdn.alpinelinux.org A: dl-cdn.alpinelinux.org <IN|176> [CNAME (global.prod.fastly.net)], global.prod.fastly.net <IN|11> [A (151.101.192.249)], global.prod.fastly.net <IN|11> [A (151.101.128.249)], global.prod.fastly.net <IN|11> [A (151.101.64.249)], global.prod.fastly.net <IN|11> [A (151.101.0.249)]
May  3 10:53:52 Firefly Docker[2394] <Notice>: DNS lookup registry-1.docker.io A: registry-1.docker.io <IN|22> [A (52.0.56.248)], registry-1.docker.io <IN|22> [A (34.205.194.204)], registry-1.docker.io <IN|22> [A (50.17.48.108)]
May  3 10:53:52 Firefly Docker[2394] <Warning>: DNS lookup registry-1.docker.io AAAA: NoSuchRecord
May  3 10:53:53 Firefly Docker[2394] <Warning>: DNS lookup auth.docker.io AAAA: NoSuchRecord
May  3 10:53:53 Firefly Docker[2394] <Notice>: DNS lookup auth.docker.io A: auth.docker.io <IN|30> [A (34.205.194.204)], auth.docker.io <IN|30> [A (50.17.48.108)], auth.docker.io <IN|30> [A (52.0.56.248)]
May  3 10:54:17 Firefly Docker[2394] <Warning>: DNS lookup registry-1.docker.io AAAA: NoSuchRecord
May  3 10:54:17 Firefly Docker[2394] <Notice>: DNS lookup registry-1.docker.io A: registry-1.docker.io <IN|22> [A (52.0.56.248)], registry-1.docker.io <IN|22> [A (34.205.194.204)], registry-1.docker.io <IN|22> [A (50.17.48.108)]
May  3 10:54:18 Firefly Docker[2394] <Notice>: DNS lookup auth.docker.io A: auth.docker.io <IN|30> [A (34.205.194.204)], auth.docker.io <IN|30> [A (50.17.48.108)], auth.docker.io <IN|30> [A (52.0.56.248)]
May  3 10:54:18 Firefly Docker[2394] <Warning>: DNS lookup auth.docker.io AAAA: NoSuchRecord

Steps to reproduce the behavior

1. Install Docker for Mac on MacOS Sierra 10.12.4

2. Connect to a network that supports IPv6 using DHCP.

3. Try to use Docker.
Contributor

djs55 commented May 3, 2017

@BrendonW thanks for your report. I'm hoping to have a look at the logs tomorrow.

djs55 self-assigned this May 3, 2017
Contributor

djs55 commented May 4, 2017

@BrendonW the diagnostic upload has 2 kinds of DNS logs: high-precision packet traces which only cover the time period from when the VM was last started; and low-precision logs which cover a couple of days. In the high-precision traces I can see the following lookups:

  • A dl-cdn.alpinelinux.org: this seems to successfully reply with a CNAME and 4 A records
  • AAAA dl-cdn.alpinelinux.org: this replies with does-not-exist
  • A registry-1.docker.io: this seems to successfully reply with 3 A records
  • AAAA registry-1.docker.io: this replies with does-not-exist
  • A auth.docker.io: this seems to successfully reply with 3 A records
  • AAAA auth.docker.io: this replies with does-not-exist

-- these seem to be fine.
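The same pairing of A and AAAA lookups can be pulled out of the syslog lines mechanically. The sketch below assumes the `DNS lookup <name> <type>: <answer>` layout seen in the log excerpt in this thread; the regular expression and the helper names `summarize_lookups` and `ipv4_only` are illustrative, not part of any Docker tooling.

```python
import re

# Layout inferred from the log lines quoted in this thread (an assumption).
LOOKUP = re.compile(r"DNS lookup (\S+) (A|AAAA|PTR): (.*)")


def summarize_lookups(lines):
    """Map (name, record type) to 'missing' or 'answered', keeping the last line seen."""
    summary = {}
    for line in lines:
        match = LOOKUP.search(line)
        if not match:
            continue
        name, rtype, answer = match.groups()
        status = "missing" if answer.strip() == "NoSuchRecord" else "answered"
        summary[(name, rtype)] = status
    return summary


def ipv4_only(summary):
    """Names whose A lookup was answered while the AAAA lookup found no record."""
    names = {name for name, _rtype in summary}
    return sorted(
        name
        for name in names
        if summary.get((name, "A")) == "answered"
        and summary.get((name, "AAAA")) == "missing"
    )
```

A name that shows up in `ipv4_only` simply has no IPv6 address published, which is the normal case described in the list above.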

Looking at the (less accurate) older logs, I see logs like this:

DNS lookup dl-cdn.alpinelinux.org A: global.prod.fastly.net <IN|11> [A (151.101.192.249)], global.prod.fastly.net <IN|11> [A (151.101.128.249)], global.prod.fastly.net <IN|11> [A (151.101.64.249)], global.prod.fastly.net <IN|11> [A (151.101.0.249)], dl-cdn.alpinelinux.org <IN|171> [CNAME (global.prod.fastly.net)], global.prod.fastly.net <IN|11> [A (151.101.192.249)], global.prod.fastly.net <IN|11> [A (151.101.128.249)], global.prod.fastly.net <IN|11> [A (151.101.64.249)], global.prod.fastly.net <IN|11> [A (151.101.0.249)]

-- this might mean you're still suffering from a recent bug where CNAME and A records were permuted in the response, confusing the Go resolver. Did you get a chance to try the experimental binary from #1569 (comment) ? Note that reinstalling Docker for Mac will undo this fix, so you might need to reapply.
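Whether a logged answer has the permuted ordering can be checked by confirming that each CNAME appears before any record owned by its target. The following is a rough sketch only: the record pattern is inferred from the log lines in this thread, `parse_records` and `cname_precedes_targets` are hypothetical names, and it says nothing about how vpnkit or the Go resolver actually process responses.

```python
import re

# "<name> <IN|ttl> [TYPE (value)]" as it appears in the logged answers (assumed layout).
RECORD = re.compile(r"(\S+) <IN\|(\d+)> \[(\w+) \(([^)]*)\)\]")


def parse_records(logged_answer):
    """Return (name, ttl, rtype, value) tuples in the order they were logged."""
    return [
        (name, int(ttl), rtype, value)
        for name, ttl, rtype, value in RECORD.findall(logged_answer)
    ]


def cname_precedes_targets(records):
    """True if no record owned by a CNAME's target is listed before that CNAME."""
    for index, (_name, _ttl, rtype, target) in enumerate(records):
        if rtype != "CNAME":
            continue
        if any(name == target for name, _t, _r, _v in records[:index]):
            return False
    return True
```

Applied to the older log line quoted above, the check returns `False`, because the `global.prod.fastly.net` A records are listed before the CNAME that points at them.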

However I also see unrelated errors of the form:

May  3 17:53:52 moby root: time="2017-05-03T17:53:52.886240345Z" level=debug msg="Trying to pull alpine from https://registry-1.docker.io v2"  
May  3 17:53:53 moby root: time="2017-05-03T17:53:53.677851642Z" level=info msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/library/alpine/manifests/latest: unauthorized: incorrect username or password"  

-- this suggests that there is some stale authentication information in your ~/.docker/config.json file. Try removing the auths and try again.
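Before editing the file by hand, it can be useful to list what it currently holds. The sketch below assumes the usual Docker CLI layout, with registry entries under `auths` and an optional credential helper named by `credsStore`; the function names are made up for this example.

```python
import json
from pathlib import Path


def summarize_docker_config(text):
    """Summarize the credential-related keys of a Docker CLI config.json document."""
    data = json.loads(text)
    return {
        "auths": sorted(data.get("auths", {})),  # registries with stored entries
        "credsStore": data.get("credsStore"),    # credential helper name, if any
    }


def read_docker_config(path=None):
    """Read and summarize config.json (default: ~/.docker/config.json)."""
    if path is None:
        path = Path.home() / ".docker" / "config.json"
    return summarize_docker_config(Path(path).read_text())
```

If `credsStore` is set, the credentials themselves live in the named helper's store rather than in the file, so removing `auths` alone may not clear them.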

If it's still failing, could you do the following:

  • sha1sum /Applications/Docker.app/Contents/Resources/bin/vpnkit just in case the binary has reverted
  • disable IPv6 (to make it work again) and docker run -it alpine sh and then apk update && apk add bind-tools to install dig
  • enable IPv6 and reproduce the problem a few times
  • inside the interactive container, try a couple of lookups such as dig www.google.com and dig www.docker.com
  • upload a new diagnostics: this will include the high-precision packet traces

I'm not sure yet how IPv6 fits into the picture but I've escalated this to our networking team and they're taking a look.

Author

BrendonW commented May 4, 2017

Thanks for the response Dave.

I'll work through all your suggestions and let you know the results of each one. I'll also grab my daughter's MacBook and install (theoretically) the same combination of OS and Edge release. If that works on this network, then digging into why MY machine has the problem will have a different focus.

In the last week, while trying to debug, I see lots of people who seem to have (some) specific similar symptoms and could be suffering from a related issue. Hope I can help make the product more robust, because I'm in love...

Author

BrendonW commented May 5, 2017

However I also see unrelated errors of the form:

May  3 17:53:52 moby root: time="2017-05-03T17:53:52.886240345Z" level=debug msg="Trying to pull alpine from https://registry-1.docker.io v2"  
May  3 17:53:53 moby root: time="2017-05-03T17:53:53.677851642Z" level=info msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/library/alpine/manifests/latest: unauthorized: incorrect username or password"

> -- this suggests that there is some stale authentication information in your ~/.docker/config.json file.
> Try removing the auths and try again.

Oh well, I can't get quote formatting to work right!

I removed the auths section without any other changes and it made no difference. Syslog continues to show the AAAA lookup failures, even though the host machine has IPv6 turned OFF. (Although I think this may also be an artifact of macOS and some of its odd network behaviors.)

Regardless, removing auths and then restarting Docker made no difference.

Author

BrendonW commented May 5, 2017

Morning Dave,

After installing the vpnkit binary, apk update now works inside existing containers even with IPv6 in place.

This is what I get whenever I try to pull an image:

Firefly:Downloads brendon$ docker run -it ubuntu bash
Unable to find image 'ubuntu:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: unauthorized: incorrect username or password.
See 'docker run --help'.

This is after removing the auths section from ~/.docker/config.json.

I will have to upgrade my daughter's mac -- as soon as I get her AppleID and wait the 50 hours that a Sierra upgrade takes :) and then will see if her machine works here.

Contributor

djs55 commented May 5, 2017

@BrendonW good news about the apk update. I'm still not sure where the unauthorised: incorrect username or password is coming from. Here's a quick experiment you could try:

$ docker run --rm --net=host --pid=host --privileged -it justincormack/nsenter1 /bin/sh
/ # docker run -it ubuntu bash

-- the first command runs effectively a root shell inside the helper Linux VM, and the second uses the Linux docker CLI in the VM rather than the Mac docker CLI on the host.

Another random thought: do you have an HTTP/HTTPS proxy set up? Could that be requiring some authentication?

Author

BrendonW commented May 6, 2017

I upgraded my daughter's Mac and installed Docker for Mac. It worked perfectly on the same network.

Now I'm going to try to reinstall on my machine. That's a pain because I believe I have to figure out how (and which images/containers/volumes) to back up so I can put them back after the reinstall.

Author

BrendonW commented May 8, 2017

I completely uninstalled Docker and then reinstalled...

Firefly:STATE brendon$ docker run hello-world
Unable to find image 'hello-world:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/hello-world/manifests/latest: unauthorized: incorrect username or password.

New Diagnostic ID: 59485F30-B400-41E3-9324-24918BC5FBF5

I'm a bit stumped as to where to go next... but I REALLY need to get docker working again!

Author

BrendonW commented May 8, 2017

Is this a problem?
May 7 19:51:23 Firefly Docker[51949] <Warning>: DNS lookup 1.65.168.192.in-addr.arpa PTR: NoSuchRecord

I've seen other reports, but no symptoms, just that the IP doesn't show up in ifconfig.

Author

BrendonW commented May 8, 2017

I tried your experiment on the clean install:

Firefly:STATE brendon$ docker run --rm --net=host --pid=host --privileged -it justincormack/nsenter1 /bin/sh
Unable to find image 'justincormack/nsenter1:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/justincormack/nsenter1/manifests/latest: unauthorized: incorrect username or password.
See 'docker run --help'.

I don't have any proxy and have the machine Firewall turned off.
I also tried to re-run the commands after restarting docker and changing over to a wireless hotspot. The results were the same.

Contributor

djs55 commented May 8, 2017

@BrendonW the DNS 1.65.168.192.in-addr.arpa failure should be benign.

Sorry, I designed a bad experiment. Could you try this revised version:

screen ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty 
# hit enter, should get a prompt inside the VM
docker run -it ubuntu bash

To exit the screen use 'Control+a' and then 'd'.

It's also worth running docker login on the host to refresh your hub login just in case that's out of date.

I checked the IP addresses you're seeing for registry-1 and auth: they're the same I see from here.

Author

BrendonW commented May 8, 2017

I did do a new docker login without changing the results.

Interestingly, your screen command didn't return a prompt but then did take the docker run command:

docker run -it ubuntu bash
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
aafe6b5e13de: Pull complete 
0a2b43a72660: Pull complete 
18bdd1e546d2: Pull complete 
8198342c3e05: Pull complete 
f56970a44fd4: Pull complete 
Digest: sha256:f3a61450ae43896c4332bda5e78b453f4a93179045f20c8181043b26b5e79028
Status: Downloaded newer image for ubuntu:latest
root@eb1b8d850095:/# 

So I then tried in a (new) terminal:

Last login: Fri May  5 06:12:42 on ttys000
You have new mail.
gpg-agent[53395]: a gpg-agent is already running - not starting a new one
Firefly:~ brendon$ docker run -it ubuntu bash
root@8c59053572eb:/# exit  
exit
Firefly:~ brendon$ docker run -it alpine bash
Unable to find image 'alpine:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/alpine/manifests/latest: unauthorized: incorrect username or password.
See 'docker run --help'.
Firefly:~ brendon$ 

Contributor

djs55 commented May 8, 2017

@BrendonW that's really interesting -- if the pull succeeds from inside the VM then there must be something wrong with the CLI on the host. Could you check which binary you're running on the host? I have:

$ which docker
/usr/local/bin/docker

$ ls -l /usr/local/bin/docker
lrwxr-xr-x  1 <userid>  staff  63  8 May 13:26 /usr/local/bin/docker -> /Users/<username>/Library/Group Containers/group.com.docker/bin/docker

$ ls -l "/Users/<username>/Library/Group Containers/group.com.docker/bin/docker"
lrwxr-xr-x  1 <userid>  staff  54  8 May 13:26 /Users/<username>/Library/Group Containers/group.com.docker/bin/docker -> /Applications/Docker.app/Contents/Resources/bin/docker
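The two `ls -l` steps above amount to following a chain of symlinks until a regular file is reached. A small sketch of that walk is shown below; `symlink_chain` is a hypothetical helper, and the `islink`/`readlink` parameters exist only so the function can be exercised without touching the real filesystem.

```python
import os


def symlink_chain(path, islink=os.path.islink, readlink=os.readlink, limit=16):
    """Follow symlinks from path, returning every hop including the final target."""
    chain = [path]
    # The limit guards against symlink loops.
    while islink(path) and len(chain) <= limit:
        target = readlink(path)
        # A relative link target is resolved against the link's own directory;
        # os.path.join returns target unchanged when it is already absolute.
        path = os.path.join(os.path.dirname(path), target)
        chain.append(path)
    return chain
```

Running `symlink_chain("/usr/local/bin/docker")` on the host prints each hop, which makes it easy to spot a link that points somewhere other than the Docker.app bundle.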

Could you also check you have no DOCKER environment variables set? Perhaps use a command like

$ set | grep DOCKER

(just in case there's a DOCKER_HOST pointing somewhere unfortunate)
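The same check can be made from a script. This is a minimal sketch with a hypothetical helper name, `docker_env`; unlike `set | grep DOCKER`, it only sees exported environment variables and only matches on the variable name.

```python
import os


def docker_env(environ=None):
    """Return the environment variables whose names start with DOCKER."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in env.items() if name.startswith("DOCKER")}
```

An empty result from `docker_env()` means nothing such as `DOCKER_HOST` is redirecting the CLI.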

Another experiment on the host, could you try running

$ /usr/local/bin/docker/docker -H unix:///Users/<username>/Library/Containers/com.docker.docker/Data/00000003.00000948 ps

-- this bypasses any DOCKER_HOST and the host-side docker API proxy by talking directly to the VM.

Author

BrendonW commented May 8, 2017

From the Terminal:-

Firefly:~ brendon$ /usr/local/bin/docker run -it ubuntu bash
Unable to find image 'ubuntu:latest' locally
/usr/local/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: unauthorized: incorrect username or password.
See '/usr/local/bin/docker run --help'.

The CLI binary is in the locations you have.

Firefly:~ brendon$ ls -al `which docker`
lrwxr-xr-x  1 brendon  staff    67B May  8 05:51 /usr/local/bin/docker -> /Users/brendon/Library/Group Containers/group.com.docker/bin/docker
Firefly:~ brendon$ ls -alL `which docker`
-rwxr-xr-x  1 brendon  admin    15M Apr 12 08:53 /usr/local/bin/docker
Firefly:~ brendon$ md5 `which docker`
MD5 (/usr/local/bin/docker) = 06a55b4d7ce53edb270d1bec02cda2e9
Firefly:~ brendon$ md5 /Users/brendon/Library/Group\ Containers/group.com.docker/bin/docker
MD5 (/Users/brendon/Library/Group Containers/group.com.docker/bin/docker) = 06a55b4d7ce53edb270d1bec02cda2e9
Firefly:~ brendon$ ls -l /Users/brendon/Library/Group\ Containers/group.com.docker/bin/docker
lrwxr-xr-x  1 brendon  staff    54B May  8 05:51 /Users/brendon/Library/Group Containers/group.com.docker/bin/docker -> /Applications/Docker.app/Contents/Resources/bin/docker
Firefly:~ brendon$ ls -l /Applications/Docker.app/Contents/Resources/bin/docker
-rwxr-xr-x@ 1 brendon  admin    15M Apr 12 08:53 /Applications/Docker.app/Contents/Resources/bin/docker
Firefly:~ brendon$ md5sum /Applications/Docker.app/Contents/Resources/bin/docker
-bash: md5sum: command not found
Firefly:~ brendon$ md5 /Applications/Docker.app/Contents/Resources/bin/docker
MD5 (/Applications/Docker.app/Contents/Resources/bin/docker) = 06a55b4d7ce53edb270d1bec02cda2e9
Firefly:~ brendon$ /Applications/Docker.app/Contents/Resources/bin/docker run ubuntu ps
Unable to find image 'ubuntu:latest' locally
/Applications/Docker.app/Contents/Resources/bin/docker: Error response from daemon: Get https://registry-1.docker.io/v2/library/ubuntu/manifests/latest: unauthorized: incorrect username or password.
See '/Applications/Docker.app/Contents/Resources/bin/docker run --help'.

Anything else to try???

Author

BrendonW commented May 8, 2017

Note, with only one "docker" in the /usr/local/bin/docker command, so not exactly as you had.

> Firefly:~ brendon$ /usr/local/bin/docker -H unix:///Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948 ps
> CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
> 

Contributor

djs55 commented May 8, 2017

Could you try: (sorry I had a typo in my original path)

/usr/local/bin/docker -H unix:///Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948 pull ubuntu

-- my guess is that will fail, but just in case.

Since it worked on your network on your daughter's computer but not on yours, even after you reinstalled Docker, perhaps there is something specific to your user like a config file or keychain item? Could you try

  • closing Docker
  • creating a "test" user on the Mac and logging in with it (mine was an admin user but that shouldn't be necessary)
  • starting Docker
  • running docker pull ubuntu

(Sorry this is taking so long to narrow down)

Author

BrendonW commented May 8, 2017

As you suspected:-

Firefly:monolith brendon$ /usr/local/bin/docker -H unix:///Users/brendon/Library/Containers/com.docker.docker/Data/00000003.00000948 pull ubuntu
Using default tag: latest
Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

I created a new user, non-admin, started docker and ran docker run -ti ubuntu bash which pulled the image and ran.

Here is my env output with the items that are basically the same as the TestUser removed:
term_comp.txt

Hopefully you can see some clues in there...

Author

BrendonW commented May 8, 2017

Current situation after logging out of the test user, and back to my user.

This is in the screen tty:-

/ # docker run -it ubuntu bash
root@0fc94c66c955:/# docker run hello-world
bash: docker: command not found
root@0fc94c66c955:/# exit
exit
/ # docker run hello-world
Unable to find image 'hello-world:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.65.1:53: read udp 192.168.65.2:47152->192.168.65.1:53: i/o timeout.
See 'docker run --help'.
/ # 

And on the mac terminal:

Firefly:monolith brendon$ docker run -it ubuntu bash
root@73db99c44acd:/# ^C
root@73db99c44acd:/# exit
exit
Firefly:monolith brendon$ docker run hello-world
Unable to find image 'hello-world:latest' locally
docker: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
See 'docker run --help'.

@djs55
Contributor

djs55 commented May 8, 2017

Thanks for the update -- I can't see anything unexpected in your environment variables. I think the problem must be in the filesystem state.

Perhaps try:

  • shut down the app completely
  • cp ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2 ~/Docker.qcow2 -- take a backup of the containers and images
  • start the app
  • use the whale menu -> Preferences -> Reset -> Reset to factory defaults -- this will zap the app configuration
  • shut down the app completely (give it 30s to complete)
  • cp ~/Docker.qcow2 ~/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2 -- restore the backup of the containers and images
  • start the app again
  • try to pull an image
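
The backup and restore steps above can be sketched as a pair of shell functions. This is only a minimal sketch, assuming the disk image lives at the path quoted in the list. The function names `backup_qcow2` and `restore_qcow2` and the variables `DOCKER_QCOW2` and `BACKUP_QCOW2` are hypothetical, introduced here for illustration. The app must be fully shut down before either function is called.

```shell
# Hypothetical helpers. The default paths are the ones quoted in the steps above.
DOCKER_QCOW2="${DOCKER_QCOW2:-$HOME/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/Docker.qcow2}"
BACKUP_QCOW2="${BACKUP_QCOW2:-$HOME/Docker.qcow2}"

# Copy the disk image (containers and images) to the backup location.
backup_qcow2() {
  if [ ! -f "$DOCKER_QCOW2" ]; then
    echo "no disk image at $DOCKER_QCOW2" >&2
    return 1
  fi
  cp "$DOCKER_QCOW2" "$BACKUP_QCOW2"
}

# Copy the backup over the disk image that the factory reset created.
restore_qcow2() {
  if [ ! -f "$BACKUP_QCOW2" ]; then
    echo "no backup at $BACKUP_QCOW2" >&2
    return 1
  fi
  cp "$BACKUP_QCOW2" "$DOCKER_QCOW2"
}
```

Intended order: quit the app, run `backup_qcow2`, start the app and reset to factory defaults, quit the app again, run `restore_qcow2`, then start the app.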

(As an aside, I'm a bit suspicious that the error has changed from "unauthorized" to "request canceled". I hope the other instance of Docker is definitely completely shut down -- it might be worth a reboot to make sure.)

Another possibility:

  • start the app
  • try to pull the image
  • mv ~/.docker ~/.docker.backup -- just in case something is added to the config.json file. I'm a bit suspicious about the OSX credentials store... could it be somehow involved in the unauthorised failures?
  • try to pull the image again
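
The `mv` step above can be made a little safer with a small wrapper. This is a minimal sketch, not part of Docker: the function name `stash_docker_config` is hypothetical, and it assumes the CLI config lives in `~/.docker` as in the list. It refuses to run if a previous `.docker.backup` already exists, so an earlier backup is not clobbered.

```shell
# Hypothetical helper: move the Docker CLI config directory aside so the
# next pull runs without the existing config.json.
stash_docker_config() {
  config_dir="${1:-$HOME/.docker}"
  backup_dir="${config_dir}.backup"
  if [ ! -d "$config_dir" ]; then
    echo "nothing to move: $config_dir does not exist" >&2
    return 1
  fi
  if [ -e "$backup_dir" ]; then
    echo "refusing to overwrite existing $backup_dir" >&2
    return 1
  fi
  mv "$config_dir" "$backup_dir"
}
```

Calling `stash_docker_config` with no argument acts on `~/.docker`. To undo it, move `~/.docker.backup` back to `~/.docker`.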

@BrendonW
Author

BrendonW commented May 8, 2017

"I can't see anything unexpected in your environment variables."

Except for my GitHub access token...

How can I make sure no Docker processes are running?

@djs55
Contributor

djs55 commented May 8, 2017

@BrendonW saw the token -- should have suggested you redact it :)

To check if all the docker processes are gone:

ps uax | grep com.docker

-- you can kill any process starting with com.docker

@BrendonW
Author

BrendonW commented May 8, 2017

This is the only thing that keeps running...
root 1153 0.0 0.0 2473088 4116 ?? Ss 5:51AM 0:00.01 /Library/PrivilegedHelperTools/com.docker.vmnetd

@djs55
Contributor

djs55 commented May 8, 2017

com.docker.vmnetd is safe to leave -- everything else should be removed.
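
The process check can be sketched as a filter that hides the entries that are fine to ignore. This is a minimal sketch. The function name `leftover_docker_processes` is hypothetical, and it reads `ps uax` output on stdin so it can also be tried on saved output.

```shell
# Hypothetical helper: print only the com.docker processes that still
# need to be stopped. It drops the grep command itself and
# com.docker.vmnetd, which is safe to leave running.
# Exit status is non-zero when nothing is left to stop.
leftover_docker_processes() {
  grep 'com\.docker' | grep -v 'grep' | grep -v 'com\.docker\.vmnetd'
}
```

Usage: `ps uax | leftover_docker_processes`. No output means only com.docker.vmnetd (if anything) is still running.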

@BrendonW
Author

BrendonW commented May 8, 2017

Same results after the clean reinstall -- I didn't bother with the qcow2 backup -- I'd already lost anything I cared about in earlier reinstall attempts.

I noticed that even though I'd reset everything, I was still logged in... so I logged out and Docker crashed -- or at least went away.

I restarted docker and then everything worked! There is some config that is hiding somewhere through uninstalls and reboots...

@BrendonW
Author

BrendonW commented May 8, 2017

I did kill com.docker.vmnetd, by the way.

@BrendonW
Author

BrendonW commented May 8, 2017

How can I figure out the problem so we can prevent this happening in the future?

@BrendonW
Author

BrendonW commented May 9, 2017

What I don't understand is that I logged out and in multiple times during the process, and only this last logout "fixed" the issue. This logout was different, because Docker literally vanished when I logged out.

@djs55
Contributor

djs55 commented May 9, 2017

I don't understand the cause yet either :( Let me know if you spot any other recurrences.

BTW a new version of edge was released today: Version 17.05.0-ce-mac9 (17434). Be aware that the vpnkit DNS fix is not in this version -- it'll be in the next version -- so if you upgrade you would need to replace the vpnkit binary again (from #1569 (comment))

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jun 23, 2020