Poor performance for reading Numpy #460
Comments
@Narsil please help
The reproduction script:

```python
import pickle
import time

import numpy as np
from safetensors.numpy import load, load_file, save_file

w = np.empty([256, 1024, 1024])  # 2 GiB of float64
state_dict = {"weight": w}

fp = "test_file.safetensors"
t1 = time.time()
save_file(state_dict, fp)
print("sf save:", time.time() - t1)

t1 = time.time()
state = load_file(fp)
print("sf load:", time.time() - t1)

t1 = time.time()
with open(fp, "rb") as f:
    data = f.read()
loaded = load(data)
print("sf load2:", time.time() - t1)

fp = fp.replace(".safetensors", ".pickle")
t1 = time.time()
with open(fp, "wb") as f:
    pickle.dump(state_dict, f)
print("pickle save:", time.time() - t1)

t1 = time.time()
with open(fp, "rb") as f:
    state = pickle.load(f)
print("pickle load:", time.time() - t1)
```
@mishig25 can you give some help?
@LysandreJik can you give some help?
For

```rust
let pydata: PyObject = PyByteArray::new(py, tensor.data()).into();
```

Advice: can we support loading the file without an additional MEM->MEM copy? If the memcpy after mmap is inevitable, can we have a substitute?
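A zero-copy load is possible in principle given the documented safetensors layout (an 8-byte little-endian header length, a JSON header, then the raw byte buffer). The sketch below is not the library's implementation, just an illustration of the idea: mmap the file and expose each tensor as a read-only numpy view into the mapping, so no MEM->MEM copy happens at load time. The dtype table covers only a few common cases.

```python
import json
import mmap
import struct

import numpy as np

# Partial dtype table for illustration; the real format defines more types.
_DTYPES = {"F64": np.float64, "F32": np.float32, "F16": np.float16, "I64": np.int64}


def load_file_mmap(path):
    """Hypothetical zero-copy loader: tensors are views into the mmap."""
    f = open(path, "rb")  # kept open: the mapping outlives this function
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    (header_len,) = struct.unpack("<Q", mm[:8])
    header = json.loads(mm[8 : 8 + header_len])
    tensors = {}
    for name, info in header.items():
        if name == "__metadata__":
            continue
        begin, end = info["data_offsets"]  # offsets relative to the byte buffer
        buf = memoryview(mm)[8 + header_len + begin : 8 + header_len + end]
        # np.frombuffer creates a read-only view over the mapping; no copy here.
        arr = np.frombuffer(buf, dtype=_DTYPES[info["dtype"]]).reshape(info["shape"])
        tensors[name] = arr
    return tensors
```

The trade-off is that the returned arrays are read-only and remain backed by the file; writing to them requires an explicit copy.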
Something is wrong in your system, what are you using? Windows + WSL is a usual culprit for very poor mmap support/performance. In order to make things "fast" we could always skip a few things. PyO3 0.21 could enable something a bit faster though, since we could skip the Rust-owned version of the tensors.
My OS is Ubuntu 18.04; you can test using the above script. There are some suggestions for improving mmap/memcpy read performance in https://stackoverflow.com/questions/52845387/improving-mmap-memcpy-file-read-performance
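For reference, the readahead hints discussed in that thread can be applied from Python 3.8+ via `mmap.madvise` (Linux; the `MADV_*` constants are platform-dependent). A minimal sketch, not the library's code:

```python
import mmap


def read_with_readahead(path):
    """Read a whole file via mmap, hinting the kernel to prefetch pages."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            # Tell the kernel we will scan sequentially and want readahead now,
            # so the subsequent copy mostly hits pages already in the cache.
            mm.madvise(mmap.MADV_SEQUENTIAL)
            mm.madvise(mmap.MADV_WILLNEED)
            return bytes(mm)
        finally:
            mm.close()
```

This only changes when pages are faulted in; it does not remove the final copy out of the mapping.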
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Still a big problem. |
System Info

- python 3.10
- safetensors 0.4.2

I tested a 7B model with fp32 weights, stored in numpy format, and found that loading is more than 50% slower than with pickle. Time usage is shown below.

Reproduction

$ cat test_safetensor.py
(script shown above)

Results:

```
sf save: 2.818842887878418
sf load: 1.8608193397521973
pickle save: 2.3684301376342773
pickle load: 1.004188060760498
```

Expected behavior

Loading a safetensors file should be at least as fast as loading the equivalent pickle.