OSError: [Errno 24] Too many open files #1030

Open
2 tasks done
SynclabIO opened this issue May 5, 2021 · 18 comments

Comments

@SynclabIO

SynclabIO commented May 5, 2021

Checklist

  • The bug is reproducible against the latest release and/or master.
  • There are no similar issues or pull requests to fix it yet.

Describe the bug

The server hits some limit (I don't know which one) and stops accepting new requests.
It looks like some connections stay alive forever.

To reproduce

Host a server and keep making requests to it. The problem appears after around 24 hours.

Expected behavior

The connection should be closed after 60 seconds.
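
For reference, the expected behaviour can be illustrated with a minimal sketch of an idle timeout in a plain asyncio echo server. This is not uvicorn's implementation; `handle`, `demo`, and the `idle_timeout` parameter are hypothetical names, and the timeout is shortened from 60 seconds so the demo finishes quickly.

```python
import asyncio


async def handle(reader, writer, idle_timeout):
    """Echo lines back; close the connection once it has been idle too long."""
    try:
        while True:
            try:
                data = await asyncio.wait_for(reader.readline(), timeout=idle_timeout)
            except asyncio.TimeoutError:
                break  # no data within idle_timeout: give up on this client
            if not data:
                break  # the client closed its side
            writer.write(data)
            await writer.drain()
    finally:
        # Closing the writer releases the socket's file descriptor.
        writer.close()
        await writer.wait_closed()


async def demo(idle_timeout=0.1):
    async def on_client(reader, writer):
        await handle(reader, writer, idle_timeout)

    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(on_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"ping\n")
        await writer.drain()
        echoed = await reader.readline()
        # Stay idle; read() returns once the server closes the connection.
        eof = await asyncio.wait_for(reader.read(), timeout=idle_timeout * 20)
        writer.close()
        await writer.wait_closed()
    return echoed, eof


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If idle connections were being closed like this, the ESTABLISHED count would be expected to level off rather than keep growing.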

Actual behavior

New requests to the server fail, and the log shows asyncio:socket.accept() out of system resource and OSError: [Errno 24] Too many open files.
I already made sure that every client and browser is closed, but some connections still appear to be alive.

Debugging material

The number of ESTABLISHED connections increases over time.

run lsof in container:

COMMAND  PID TID TASKCMD USER   FD      TYPE             DEVICE SIZE/OFF     NODE NAME
uvicorn    1             root    0u      CHR                1,3      0t0        6 /dev/null
uvicorn    1             root    1w     FIFO               0,13      0t0 18371653 pipe
uvicorn    1             root    2w     FIFO               0,13      0t0 18371654 pipe
uvicorn    1             root    3u  a_inode               0,14        0    13401 [eventpoll]
uvicorn    1             root    4u     unix 0x0000000000000000      0t0 18369360 type=STREAM
uvicorn    1             root    5u     unix 0x0000000000000000      0t0 18369361 type=STREAM
uvicorn    1             root    6u     IPv4           18369363      0t0      TCP *:https (LISTEN)
uvicorn    1             root    7u     IPv4           18413076      0t0      TCP 59c08b6aac89:https->61-66-209-161.askey.com.tw:27651 (ESTABLISHED)
uvicorn    1             root    8u     IPv4           18369366      0t0      TCP 59c08b6aac89:33846->ip-172-31-28-203.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root    9u     IPv4           18600291      0t0      TCP 59c08b6aac89:https->125-227-151-121.HINET-IP.hinet.net:61830 (ESTABLISHED)
uvicorn    1             root   10u     IPv4           18384947      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:14377 (ESTABLISHED)
uvicorn    1             root   11u     IPv4           18388349      0t0      TCP 59c08b6aac89:https->210.241.98.253:12903 (ESTABLISHED)
uvicorn    1             root   12u     IPv4           18402240      0t0      TCP 59c08b6aac89:https->125-227-151-121.HINET-IP.hinet.net:57692 (ESTABLISHED)
uvicorn    1             root   13u     IPv4           18370708      0t0      TCP 59c08b6aac89:33856->ip-172-31-28-203.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   14u     IPv4           18369763      0t0      TCP 59c08b6aac89:56040->ip-172-31-5-149.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   15u     IPv4           18369373      0t0      TCP 59c08b6aac89:56042->ip-172-31-5-149.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   16u     IPv4           18370707      0t0      TCP 59c08b6aac89:56044->ip-172-31-5-149.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   17u     IPv4           18402915      0t0      TCP 59c08b6aac89:https->61-30-51-61.static.tfn.net.tw:26547 (ESTABLISHED)
uvicorn    1             root   18u     IPv4           18373916      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:14034 (ESTABLISHED)
uvicorn    1             root   19u     IPv4           18479991      0t0      TCP 59c08b6aac89:https->125-230-66-32.dynamic-ip.hinet.net:46420 (ESTABLISHED)
uvicorn    1             root   20u     IPv4           18604769      0t0      TCP 59c08b6aac89:https->125-227-151-121.HINET-IP.hinet.net:61864 (ESTABLISHED)
uvicorn    1             root   21u     IPv4           18509169      0t0      TCP 59c08b6aac89:https->114-137-84-64.emome-ip.hinet.net:54518 (ESTABLISHED)
uvicorn    1             root   22u     IPv4           18668143      0t0      TCP 59c08b6aac89:https->125-227-151-121.HINET-IP.hinet.net:63229 (ESTABLISHED)
uvicorn    1             root   23u     IPv4           18520950      0t0      TCP 59c08b6aac89:https->125-227-151-121.HINET-IP.hinet.net:60139 (ESTABLISHED)
uvicorn    1             root   24u     IPv4           18374580      0t0      TCP 59c08b6aac89:https->211-75-187-47.HINET-IP.hinet.net:56651 (ESTABLISHED)
uvicorn    1             root   25u     IPv4           18376897      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:14167 (ESTABLISHED)
uvicorn    1             root   26u     IPv4           18386427      0t0      TCP 59c08b6aac89:https->210.241.98.253:12898 (ESTABLISHED)
uvicorn    1             root   27u     IPv4           18399495      0t0      TCP 59c08b6aac89:https->101.10.59.24:19382 (ESTABLISHED)
uvicorn    1             root   28u     IPv4           18409532      0t0      TCP 59c08b6aac89:https->36-224-32-223.dynamic-ip.hinet.net:1918 (ESTABLISHED)
uvicorn    1             root   29u     IPv4           18386429      0t0      TCP 59c08b6aac89:https->210.241.98.253:12899 (ESTABLISHED)
uvicorn    1             root   30u     IPv4           18388350      0t0      TCP 59c08b6aac89:https->210.241.98.253:12900 (ESTABLISHED)
uvicorn    1             root   31u     IPv4           18388352      0t0      TCP 59c08b6aac89:https->210.241.98.253:12901 (ESTABLISHED)
uvicorn    1             root   32u     IPv4           18494758      0t0      TCP 59c08b6aac89:https->125-230-66-32.dynamic-ip.hinet.net:47366 (ESTABLISHED)
uvicorn    1             root   33u     IPv4           18372777      0t0      TCP 59c08b6aac89:56334->ip-172-31-5-149.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   34u     IPv4           18386431      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:14590 (ESTABLISHED)
uvicorn    1             root   35u     IPv4           18370173      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:13950 (ESTABLISHED)
uvicorn    1             root   36u     IPv4           18372577      0t0      TCP 59c08b6aac89:https->61-222-56-55.HINET-IP.hinet.net:13988 (ESTABLISHED)
uvicorn    1             root   37u     IPv4           18530325      0t0      TCP 59c08b6aac89:https->49.216.39.2:8508 (ESTABLISHED)
uvicorn    1             root   38u     IPv4           18370760      0t0      TCP 59c08b6aac89:56102->ip-172-31-5-149.ap-northeast-1.compute.internal:postgresql (ESTABLISHED)
uvicorn    1             root   39u     IPv4           18388354      0t0      TCP 59c08b6aac89:https->210.241.98.253:12902 (ESTABLISHED)
uvicorn    1             root   40u     IPv4           18554686      0t0      TCP 59c08b6aac89:https->210-209-175-100.veetime.com:9852 (ESTABLISHED)
uvicorn    1             root   41u     IPv4           18441399      0t0      TCP 59c08b6aac89:https->36-227-149-58.dynamic-ip.hinet.net:63682 (ESTABLISHED)
...
...
...
...

After the server has been up for 24 hours, the FD column reaches 500u or more.

When it hits the limit, the server can no longer accept any new requests.
docker logs shows the following:
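
A rough way to get the same picture from inside the process, without lsof, is to read `/proc/<pid>/fd`. The sketch below is Linux-only and assumes `/proc` is mounted; `count_open_fds` is a hypothetical helper name, not part of uvicorn.

```python
import os
from collections import Counter


def count_open_fds(pid="self"):
    """Classify a process's open file descriptors by kind (Linux /proc only)."""
    counts = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        elif target.startswith("anon_inode:"):
            counts["anon_inode"] += 1
        else:
            counts["file"] += 1
    return counts


if __name__ == "__main__":
    print(count_open_fds())
```

A steadily rising `socket` count with no matching traffic would point at connections that are never closed.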

ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 443)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 164, in _accept_connection
    conn, addr = sock.accept()
  File "/usr/local/lib/python3.9/socket.py", line 293, in accept
    fd, addr = self._accept()
OSError: [Errno 24] Too many open files

I have to bring the Docker container down and up again.
After that the number drops back to around 10u.

Run ulimit -a in container:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63280
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63280
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I already tried increasing open files (-n) above 1024.
But it didn't resolve the problem.
It only increases the time until the problem happens.
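
To see how close a process is to that ceiling, the limit can also be read from Python with the standard `resource` module. This is a small sketch for Linux; `nofile_limits` and `fd_headroom` are hypothetical helper names.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fd_headroom():
    """Descriptors still available to this process under the soft limit.

    Linux-only: counts entries in /proc/self/fd. The count includes the
    descriptor that listdir() itself opens briefly, so it is off by one.
    """
    soft, _hard = nofile_limits()
    in_use = len(os.listdir("/proc/self/fd"))
    return soft - in_use


if __name__ == "__main__":
    print("limits:", nofile_limits(), "headroom:", fd_headroom())
```

If descriptors leak at a steady rate, a higher limit only increases the headroom, which matches the observation that raising it postpones the error without removing it.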

Environment

The server is started with the following command:

uvicorn main:app --host 0.0.0.0 --port 443 --proxy-headers --timeout-keep-alive 60 --limit-concurrency 1000 --ssl-keyfile=./xxxx.key --ssl-certfile=./xxxx.cer

uvicorn --version:

Running uvicorn 0.13.4 with CPython 3.9.4 on Linux

python --version:

Python 3.9.4

cat /etc/os-release in container:

PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

@jeho0815

jeho0815 commented May 7, 2021

The open files limit is 1024; you should increase it!
open files (-n) 1024

@SynclabIO
Author

the open files is 1024, you should increase it!
open files (-n) 1024

I already tried that. It only increases the time until this error occurs.
It still happens eventually.
It shouldn't happen.
Even when I set up a firewall to block all connections,
the lsof count still does not go down.

@euri10
Member

euri10 commented May 21, 2021

Hard to say if this is on our side (there's nothing in the logs that points to uvicorn, or am I missing something?).

It would also be interesting to check whether this happens without SSL enabled.

If it only happens after 24h it's going to be hard to reproduce, so please try to come up with a reproducible example that triggers it sooner so we can investigate.

On my side I tried to reproduce with an app running pretty much like yours: uvicorn app:app --proxy-headers --timeout-keep-alive 60 --ssl-keyfile=server.key --ssl-certfile=server.pem
I hit it with wrk for 3 minutes: wrk -c 2048 -d 3m -t16 https://localhost:8000
I also ran an lsof check with watch -n 1 'lsof -i :8000 | wc -l' and it stayed stable at 4098 for the whole 3 minutes, so no issues and no ESTABLISHED count increasing over time. I see no reason why this would occur only after a given period of time; we should be able to see it right away.
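
The same kind of check can be scripted so that growth is flagged automatically. The sketch below samples the descriptor count of a process over time; it is Linux-only (it reads `/proc`), and `fd_count`, `sample_fd_counts`, and `is_growing` are hypothetical names.

```python
import os
import time


def fd_count(pid="self"):
    """Number of entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def sample_fd_counts(pid="self", interval=1.0, samples=5):
    """Take `samples` readings of the descriptor count, `interval` seconds apart."""
    counts = []
    for i in range(samples):
        counts.append(fd_count(pid))
        if i < samples - 1:
            time.sleep(interval)
    return counts


def is_growing(counts):
    """True if the series never decreases and ends higher than it started."""
    never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
    return len(counts) >= 2 and never_drops and counts[-1] > counts[0]


if __name__ == "__main__":
    series = sample_fd_counts(interval=1.0, samples=10)
    print(series, "growing" if is_growing(series) else "stable")
```

Pointing it at the server's PID while replaying traffic should show whether the count plateaus, as in the wrk run above, or keeps climbing.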

@tkor1

tkor1 commented Jun 25, 2021

I see this problem too. Some of my experience:

  1. lsof shows ~800 before starting the server and goes to about 880 upon start. netstat shows ~120 lines (everything included, even headers).
  2. It is not an active server; I'm playing with it on the side, so there is no load except my control test.
  3. I see similar results with the open file limit set to 900, 1024 and 2048.
  4. I start the server and immediately make an HTTP request, which gets a response successfully.
  5. I leave the server running for about 10-15 minutes without any HTTP request. lsof stays pretty stable at around 880. netstat shows ~120 lines (everything included, even headers).
  6. Checking lsof and netstat again shows the same numbers. I make an HTTP request and it doesn't get a response; the server log shows the asyncio message that @SynclabIO posted. The error message only appears when the new HTTP request is made. There is no change in lsof and netstat.
  7. At this point I couldn't find any remedy to make it work again. If I move the task to the background, change the ulimit and move it back to the foreground, it just segfaults.

From what I can tell, some resource is leaking, but I can't point to the specific one.
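
One way to narrow down which resource leaks is to snapshot the descriptor table before and after a suspect operation and diff the two. The sketch below is Linux-only and the helper names `snapshot_fds` and `leaked_fds` are hypothetical.

```python
import os


def snapshot_fds(pid="self"):
    """Map each open descriptor number to its /proc symlink target (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    snapshot = {}
    for name in os.listdir(fd_dir):
        try:
            snapshot[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed while we were iterating.
            continue
    return snapshot


def leaked_fds(before, after):
    """Descriptors that are new, or point somewhere else, in `after`."""
    return {fd: target for fd, target in after.items() if before.get(fd) != target}


if __name__ == "__main__":
    before = snapshot_fds()
    fd = os.open(os.devnull, os.O_RDONLY)  # stand-in for a suspect operation
    print(leaked_fds(before, snapshot_fds()))
    os.close(fd)
```

The targets (for example `socket:[...]` versus a file path) indicate whether the growth comes from client connections, database connections, or ordinary files.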

@euri10
Member

euri10 commented Jun 25, 2021

What would help is the application or the endpoint that you suspect is leaking.

@euri10
Member

euri10 commented Jul 7, 2022

Will close this as stale; feel free to reopen if the issue persists.

@euri10 euri10 closed this as completed Jul 7, 2022
@slava-shor-emporus

slava-shor-emporus commented Jan 30, 2023

Hm... It looks like we experienced precisely the same issue. But in our case it takes a few days until the service enters a loop of this error without recovering.

asyncio - ERROR - socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=9, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 8080)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
    conn, addr = sock.accept()
  File "/usr/local/lib/python3.10/socket.py", line 293, in accept
    fd, addr = self._accept()
OSError: [Errno 24] Too many open files

@AranVinkItility

AranVinkItility commented Feb 1, 2023

Same here, running uvicorn 0.20.0/FastAPI 0.89.1/Python 3.9 inside a container on ECS. Ran fine for a few days with some occasional load and then errors out.

ERROR:asyncio:socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 80)>
Traceback (most recent call last):
  File "/venv/lib/python3.9/asyncio/selector_events.py", line 159, in _accept_connection
  File "/venv/lib/python3.9/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files

@slava-shor-emporus

Same here, running uvicorn 0.20.0/FastAPI 0.89.1/Python 3.9 inside a container on ECS. Ran fine for a few days with some occasional load and then errors out.

Pardon me, I forgot to attach our spec.
We run on AWS Fargate in a Docker image built from python:3.10-slim (current Python version 3.10.2), with these among the dependencies:

  • fastapi 0.89.1
  • uvicorn 0.20.0
  • uvloop 0.17.0

@slava-shor-emporus

I sense, @AranVinkItility, that we have something in common: we are both running containers on AWS.

@DazEdword

DazEdword commented Feb 6, 2023

I am having a similar issue. I am just experimenting with FastAPI for the first time, so my application is barely a hello world. It seems to run OK locally (directly), but when I try to run it containerised it throws this error right away:

Skipping virtualenv creation, as specified in config file.
INFO:     Will watch for changes in these directories: ['/app']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [1] using WatchFiles
Traceback (most recent call last):
  File "/usr/local/bin/uvicorn", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/uvicorn/main.py", line 404, in main
    run(
  File "/usr/local/lib/python3.10/site-packages/uvicorn/main.py", line 564, in run
    ChangeReload(config, target=server.run, sockets=[sock]).run()
  File "/usr/local/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py", line 45, in run
    for changes in self:
  File "/usr/local/lib/python3.10/site-packages/uvicorn/supervisors/basereload.py", line 64, in __next__
    return self.should_restart()
  File "/usr/local/lib/python3.10/site-packages/uvicorn/supervisors/watchfilesreload.py", line 85, in should_restart
    changes = next(self.watcher)
  File "/usr/local/lib/python3.10/site-packages/watchfiles/main.py", line 119, in watch
    with RustNotify([str(p) for p in paths], debug, force_polling, poll_delay_ms, recursive) as watcher:
_rust_notify.WatchfilesRustInternalError: Error creating recommended watcher: Too many open files (os error 24)

My dependencies (using Poetry):

[tool.poetry.dependencies]
python = "^3.10"
fastapi = "^0.89.1"
uvicorn = {extras = ["standard"], version = "^0.20.0"}

The docker image causing trouble is python:3.10-slim-bullseye. Trying to increase the open files in the container doesn't seem to help.

Edit: After some further experimentation with open file limits I have managed to make it work, which suggests there is an issue with how Docker and the local system define their ulimit.

Setting Docker ulimit explicitly in the docker commands did not work:
docker build . -t fastapi-template && docker run --rm --ulimit nofile=90000:90000 -it -p 8000:8000 fastapi-template

However, bumping my local user's ulimit as explained here helped: https://stackoverflow.com/a/24331638
Then I could run:

docker build . -t fastapi-template && docker run --rm -it -p 8000:8000 fastapi-template
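
One candidate worth ruling out on Linux hosts (an assumption; nothing in this thread confirms it): according to the inotify(7) man page, creating an inotify instance can fail with EMFILE when the per-user limit on inotify instances is reached, not only when the per-process descriptor limit is hit. That would be consistent with `--ulimit nofile` making no difference to the watcher error. The sketch below only reads the kernel's inotify limits; `inotify_limits` is a hypothetical helper name.

```python
from pathlib import Path

INOTIFY_DIR = Path("/proc/sys/fs/inotify")
KEYS = ("max_user_instances", "max_user_watches", "max_queued_events")


def inotify_limits():
    """Read the kernel's inotify limits; a value is None if it cannot be read."""
    limits = {}
    for key in KEYS:
        try:
            limits[key] = int((INOTIFY_DIR / key).read_text())
        except (OSError, ValueError):
            # Not Linux, /proc not mounted, or an unexpected file format.
            limits[key] = None
    return limits


if __name__ == "__main__":
    print(inotify_limits())
```

A low `max_user_instances` on a machine running several file watchers would be a reason to look in this direction before the descriptor limit.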

@zvolsky

zvolsky commented Mar 14, 2023

Same here,
pyenv local 3.11.2
poetry env use $(which python)
fastapi
and HelloWorld application.
uvicorn subdir.main:app --reload --reload-include='subdir/'

[tool.poetry.dependencies]
python = "^3.11"
fastapi = "^0.94.1"
uvicorn = {extras = ["standard"], version = "^0.21.0"}
tortoise-orm = "^0.19.3"
aerich = "^0.7.1"
python-multipart = "^0.0.6"
watchfiles = "^0.18.1"

@zvolsky

zvolsky commented Mar 15, 2023

Regarding DazEdword's issue and my similar one, it looks like it is bound to the usage of Poetry. If I recreate the same virtual environment using python -m venv and pip, then I have no problem.
So maybe it is some interaction between the Rust watchfiles notifier and Poetry. Maybe the underlying (Rust?) error is something completely different and is just raised at the Python level as "Too many files"?
I don't understand the mechanism that raises this Python exception and I don't know how to see the original error from Rust.
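
On the mechanism: errno 24 is EMFILE on Linux and macOS, and the "(os error 24)" in the watchfiles message appears to be the same OS error number surfaced by the Rust side. The sketch below reproduces the Python-level OSError on purpose by temporarily lowering the soft descriptor limit and opening sockets until the OS refuses. It says nothing about what consumed the descriptors in the reports above; `open_sockets_until_failure` is a hypothetical name.

```python
import errno
import resource
import socket


def open_sockets_until_failure(soft_limit=64):
    """Lower RLIMIT_NOFILE, open sockets until the OS refuses, then restore.

    Returns (number of sockets opened, errno of the failure or None).
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = soft_limit if hard == resource.RLIM_INFINITY else min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    opened = []
    failure = None
    try:
        # new_soft + 1 attempts are always enough to run past the limit.
        for _ in range(new_soft + 1):
            try:
                opened.append(socket.socket())
            except OSError as exc:
                failure = exc.errno
                break
    finally:
        for sock in opened:
            sock.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(opened), failure


if __name__ == "__main__":
    count, err = open_sockets_until_failure()
    print(count, err, errno.errorcode.get(err))
```

Any code path that asks the kernel for a new descriptor (accepting a connection, opening a file, creating a watcher) can hit the same error, so the message alone does not identify the leaking component.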

@punitvara

punitvara commented Feb 14, 2024

I am facing the same issue as well.

2024-02-14 15:07:08,273 - ERROR - socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=8, family=2, type=1, proto=0, laddr=('0.0.0.0', 8000)>
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/selector_events.py", line 165, in _accept_connection
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py", line 294, in accept
OSError: [Errno 24] Too many open files

What is the solution for this issue?

@BrightXiaoHan

Same issue here. Any update about this?

@abduakhatov

Same error here. Any updates? Shall we reopen the issue?

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]:   File "/usr/lib/python3.9/socket.py", line 293, in accept

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]: OSError: [Errno 24] Too many open files

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]: 2024-04-01 06:15:17,534 ERROR:     loc=asyncio default_exception_handler() L1738 ->  socket.accept() out of system resource

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]: socket: <asyncio.TransportSocket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 8080)>

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]: Traceback (most recent call last):

Apr 01 06:15:23 10.13.7.14 uvicorn[27356]:   File "/usr/lib/python3.9/asyncio/selector_events.py", line 164, in _accept_connection

S

@TheMadrasTechie

I am also facing the same issue.
Please help.
Can we reopen it?

@Kludex Kludex reopened this Apr 17, 2024
@TheMadrasTechie

socket.accept() out of system resource
socket: <asyncio.TransportSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('0.0.0.0', 8000)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 159, in _accept_connection
  File "/usr/local/lib/python3.10/socket.py", line 293, in accept
OSError: [Errno 24] Too many open files

Hi,
I receive this error.
I am using Docker.
Any suggestions for this?
