Segfault when using sentence-transformers (2.7.0) with Sklearn (1.3.0) #2631

Open
delip opened this issue May 7, 2024 · 7 comments

delip commented May 7, 2024

I have a strange failure case.

  1. The following code works fine:
    [screenshot: CleanShot 2024-05-07 at 18 22 24]

  2. If I add this single import from sklearn, I get a segfault!
    [screenshot: CleanShot 2024-05-07 at 18 21 15]

Any idea what could be happening? I have tried this in a fresh environment and still see it.
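
A segfault kills the interpreter without a Python traceback, so one way to narrow this down is to run the failing imports in a child process with the standard-library `faulthandler` enabled. The sketch below is only an illustration and is not the code from the screenshots: `run_isolated` is a hypothetical helper, and the snippet passed to it is a placeholder for whichever imports crash.

```python
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    Hypothetical helper: `-X faulthandler` makes CPython dump the Python
    traceback to stderr when the process receives a fatal signal such as
    SIGSEGV, which shows which import or call was running at the time.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # Placeholder snippet: substitute the imports that crash in your environment.
    result = run_isolated("print('imports ok')")
    # On POSIX, a negative return code means the child was killed by a signal.
    print("return code:", result.returncode)
    print(result.stderr)
```

Running the working and the crashing variants through the same helper keeps the parent process alive, so the two stderr dumps can be compared side by side.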

Environment details:

$ python --version
Python 3.12.2

$ pip list
Package                           Version
--------------------------------- ------------
aiobotocore                       2.7.0
aiohttp                           3.9.3
aioitertools                      0.7.1
aiosignal                         1.2.0
alabaster                         0.7.12
altair                            5.0.1
annotated-types                   0.6.0
anyio                             4.2.0
appdirs                           1.4.4
applaunchservices                 0.3.0
appnope                           0.1.3
appscript                         1.1.2
argon2-cffi                       21.3.0
argon2-cffi-bindings              21.2.0
arrow                             1.2.3
astroid                           2.14.2
astropy                           5.3.4
asttokens                         2.0.5
async-lru                         2.0.4
atomicwrites                      1.4.0
attrs                             23.1.0
Automat                           20.2.0
autopep8                          2.0.4
Babel                             2.11.0
backcall                          0.2.0
bcrypt                            3.2.0
beautifulsoup4                    4.12.2
binaryornot                       0.4.4
black                             23.11.0
bleach                            4.1.0
blinker                           1.6.2
bokeh                             3.3.4
botocore                          1.31.64
Bottleneck                        1.3.7
Brotli                            1.0.9
cachetools                        4.2.2
certifi                           2024.2.2
cffi                              1.16.0
chardet                           4.0.0
charset-normalizer                2.0.4
click                             8.1.7
cloudpickle                       2.2.1
colorama                          0.4.6
colorcet                          3.1.0
comm                              0.2.1
constantly                        23.10.4
contourpy                         1.2.0
cookiecutter                      2.6.0
cryptography                      42.0.5
cssselect                         1.2.0
cycler                            0.11.0
cytoolz                           0.12.2
dask                              2023.11.0
datasets                          2.19.1
datashader                        0.16.0
debugpy                           1.6.7
decorator                         5.1.1
defusedxml                        0.7.1
diff-match-patch                  20200713
dill                              0.3.8
distributed                       2023.11.0
distro                            1.9.0
docopt                            0.6.2
docstring-to-markdown             0.11
docutils                          0.18.1
entrypoints                       0.4
et-xmlfile                        1.1.0
executing                         0.8.3
fastjsonschema                    2.16.2
filelock                          3.13.1
flake8                            7.0.0
Flask                             2.2.5
fonttools                         4.25.0
frozenlist                        1.4.0
fsspec                            2023.10.0
gensim                            4.3.2
gitdb                             4.0.7
GitPython                         3.1.37
greenlet                          3.0.1
h11                               0.14.0
h5py                              3.9.0
HeapDict                          1.0.1
holoviews                         1.18.3
httpcore                          1.0.5
httpx                             0.27.0
huggingface-hub                   0.23.0
hvplot                            0.9.2
hyperlink                         21.0.0
idna                              3.4
imagecodecs                       2023.1.23
imageio                           2.33.1
imagesize                         1.4.1
imbalanced-learn                  0.11.0
importlib-metadata                7.0.1
incremental                       22.10.0
inflection                        0.5.1
iniconfig                         1.1.1
intake                            0.6.8
intervaltree                      3.1.0
ipykernel                         6.28.0
ipython                           8.12.3
ipython-genutils                  0.2.0
ipywidgets                        7.8.1
isort                             5.9.3
itemadapter                       0.3.0
itemloaders                       1.1.0
itsdangerous                      2.0.1
jaraco.classes                    3.2.1
jedi                              0.18.1
jellyfish                         1.0.1
Jinja2                            3.1.3
jmespath                          1.0.1
joblib                            1.2.0
json5                             0.9.6
jsonschema                        4.19.2
jsonschema-specifications         2023.7.1
jupyter                           1.0.0
jupyter_client                    8.6.0
jupyter-console                   6.6.3
jupyter_core                      5.5.0
jupyter-events                    0.8.0
jupyter-lsp                       2.2.0
jupyter_server                    2.10.0
jupyter_server_terminals          0.4.4
jupyterlab                        4.0.11
jupyterlab-pygments               0.1.2
jupyterlab_server                 2.25.1
jupyterlab-widgets                1.0.0
keyring                           24.3.1
kiwisolver                        1.4.4
lazy_loader                       0.3
lazy-object-proxy                 1.6.0
lckr_jupyterlab_variableinspector 3.1.0
linkify-it-py                     2.0.0
litellm                           1.35.38
llvmlite                          0.42.0
lmdb                              1.4.1
locket                            1.0.0
lxml                              4.9.3
lz4                               4.3.2
Markdown                          3.4.1
markdown-it-py                    2.2.0
MarkupSafe                        2.1.3
matplotlib                        3.8.0
matplotlib-inline                 0.1.6
mccabe                            0.7.0
mdit-py-plugins                   0.3.0
mdurl                             0.1.0
mistune                           2.0.4
more-itertools                    10.1.0
mpmath                            1.3.0
msgpack                           1.0.3
multidict                         6.0.4
multipledispatch                  0.6.0
multiprocess                      0.70.16
munkres                           1.1.4
mypy                              1.8.0
mypy-extensions                   1.0.0
nbclient                          0.8.0
nbconvert                         7.16.4
nbformat                          5.9.2
nest-asyncio                      1.6.0
networkx                          3.1
nltk                              3.8.1
notebook                          7.0.8
notebook_shim                     0.2.3
numba                             0.59.0
numexpr                           2.8.7
numpy                             1.26.4
numpydoc                          1.5.0
openai                            1.25.2
openpyxl                          3.0.10
overrides                         7.4.0
packaging                         23.2
pandas                            2.1.4
pandocfilters                     1.5.0
panel                             1.3.8
param                             2.1.0
parsel                            1.8.1
parso                             0.8.3
partd                             1.4.1
pathspec                          0.10.3
patsy                             0.5.3
pexpect                           4.8.0
pickleshare                       0.7.5
pillow                            10.2.0
pip                               23.3.1
pipreqs                           0.5.0
platformdirs                      3.10.0
plotly                            5.19.0
pluggy                            1.0.0
ply                               3.11
prometheus-client                 0.14.1
prompt-toolkit                    3.0.43
Protego                           0.1.16
protobuf                          3.20.3
psutil                            5.9.0
ptyprocess                        0.7.0
pure-eval                         0.2.2
py-cpuinfo                        9.0.0
pyarrow                           14.0.2
pyarrow-hotfix                    0.6
pyasn1                            0.4.8
pyasn1-modules                    0.2.8
pycodestyle                       2.11.1
pycparser                         2.21
pyct                              0.5.0
pycurl                            7.45.2
pydantic                          2.7.1
pydantic_core                     2.18.2
pydeck                            0.8.0
PyDispatcher                      2.0.5
pydocstyle                        6.3.0
pyerfa                            2.0.0
pyflakes                          3.2.0
Pygments                          2.15.1
pylint                            2.16.2
pylint-venv                       3.0.3
pyls-spyder                       0.4.0
pyobjc-core                       10.1
pyobjc-framework-Cocoa            10.1
pyobjc-framework-CoreServices     10.1
pyobjc-framework-FSEvents         10.1
pyodbc                            5.0.1
pyOpenSSL                         24.0.0
pyparsing                         3.0.9
PyQt5                             5.15.10
PyQt5-sip                         12.13.0
PyQtWebEngine                     5.15.6
PySocks                           1.7.1
pytest                            7.4.0
python-dateutil                   2.8.2
python-dotenv                     1.0.1
python-json-logger                2.0.7
python-lsp-black                  2.0.0
python-lsp-jsonrpc                1.1.2
python-lsp-server                 1.10.0
python-slugify                    5.0.2
python-snappy                     0.6.1
pytoolconfig                      1.2.6
pytz                              2023.3.post1
pyviz_comms                       3.0.2
pywavelets                        1.5.0
PyYAML                            6.0.1
pyzmq                             25.1.2
QDarkStyle                        3.2.3
qstylizer                         0.2.2
QtAwesome                         1.2.2
qtconsole                         5.5.1
QtPy                              2.4.1
queuelib                          1.6.2
referencing                       0.30.2
regex                             2023.10.3
requests                          2.31.0
requests-file                     1.5.1
rfc3339-validator                 0.1.4
rfc3986-validator                 0.1.1
rich                              13.3.5
rope                              1.12.0
rpds-py                           0.10.6
Rtree                             1.0.1
s3fs                              2023.10.0
safetensors                       0.4.3
scikit-image                      0.22.0
scikit-learn                      1.3.0
scipy                             1.11.4
Scrapy                            2.11.1
seaborn                           0.12.2
Send2Trash                        1.8.2
sentence-transformers             2.7.0
service-identity                  18.1.0
setuptools                        68.2.2
sip                               6.7.12
six                               1.16.0
smart-open                        5.2.1
smmap                             4.0.0
sniffio                           1.3.0
snowballstemmer                   2.2.0
sortedcontainers                  2.4.0
soupsieve                         2.5
Sphinx                            5.0.2
sphinxcontrib-applehelp           1.0.2
sphinxcontrib-devhelp             1.0.2
sphinxcontrib-htmlhelp            2.0.0
sphinxcontrib-jsmath              1.0.1
sphinxcontrib-qthelp              1.0.3
sphinxcontrib-serializinghtml     1.1.5
spyder                            5.5.1
spyder-kernels                    2.5.0
SQLAlchemy                        2.0.25
stack-data                        0.2.0
statsmodels                       0.14.0
streamlit                         1.32.0
sympy                             1.12
tables                            3.9.2
tabulate                          0.9.0
tblib                             1.7.0
tenacity                          8.2.2
terminado                         0.17.1
text-unidecode                    1.3
textdistance                      4.2.1
threadpoolctl                     2.2.0
three-merge                       0.1.1
tifffile                          2023.4.12
tiktoken                          0.6.0
tinycss2                          1.2.1
tldextract                        3.2.0
tokenizers                        0.19.1
toml                              0.10.2
tomli                             2.0.1
tomlkit                           0.11.1
toolz                             0.12.0
torch                             2.3.0
tornado                           6.3.3
tqdm                              4.65.0
traitlets                         5.7.1
transformers                      4.40.2
Twisted                           23.10.0
typing_extensions                 4.9.0
tzdata                            2023.3
uc-micro-py                       1.0.1
ujson                             5.4.0
Unidecode                         1.2.0
urllib3                           2.0.3
w3lib                             2.1.2
watchdog                          2.1.6
wcwidth                           0.2.5
webencodings                      0.5.1
websocket-client                  0.58.0
Werkzeug                          2.2.3
whatthepatch                      1.0.2
wheel                             0.41.2
widgetsnbextension                3.6.6
wrapt                             1.14.1
wurlitzer                         3.0.2
xarray                            2023.6.0
xlwings                           0.29.1
xxhash                            3.4.1
xyzservices                       2022.9.0
yapf                              0.40.2
yarg                              0.1.9
yarl                              1.9.3
zict                              3.0.0
zipp                              3.17.0
zope.interface                    5.4.0

da03 commented May 8, 2024

Works fine for me using Python 3.9.16 though:

python --version

Python 3.9.16

pip list

Package                       Version
----------------------------- ------------------
accelerate                    0.24.1
aiohttp                       3.8.4
aiosignal                     1.3.1
appdirs                       1.4.4
async-timeout                 4.0.2
attrs                         23.1.0
bibtexparser                  1.4.1
blis                          0.7.11
brotlipy                      0.7.0
cachetools                    5.3.3
catalogue                     2.0.10
certifi                       2022.12.7
cffi                          1.15.1
charset-normalizer            2.0.4
click                         8.1.3
cloudpathlib                  0.16.0
colorama                      0.4.6
confection                    0.1.3
contourpy                     1.0.7
cryptography                  39.0.1
cupy-cuda113                  10.6.0
curated-tokenizers            0.0.8
curated-transformers          0.1.1
cycler                        0.11.0
cymem                         2.0.8
datasets                      2.19.0
DAWG-Python                   0.7.2
de-core-news-lg               3.7.0
deepspeed                     0.9.5
dill                          0.3.6
docker-pycreds                0.4.0
docopt-ng                     0.9.0
einops                        0.7.0
en-core-web-lg                3.7.0
en-core-web-trf               3.7.2
es-core-news-lg               3.7.0
evaluate                      0.4.0
fastrlock                     0.8.2
filelock                      3.9.0
fonttools                     4.39.3
fr-core-news-lg               3.7.0
frozenlist                    1.3.3
fsspec                        2023.10.0
gitdb                         4.0.10
GitPython                     3.1.31
gmpy2                         2.1.2
google-api-core               2.17.1
google-auth                   2.28.1
google-cloud-aiplatform       1.43.0
google-cloud-bigquery         3.17.2
google-cloud-core             2.4.1
google-cloud-resource-manager 1.12.2
google-cloud-storage          2.14.0
google-crc32c                 1.5.0
google-resumable-media        2.7.0
googleapis-common-protos      1.62.0
grpc-google-iam-v1            0.13.0
grpcio                        1.62.0
grpcio-status                 1.62.0
hjson                         3.1.0
huggingface-hub               0.22.2
idna                          3.4
importlib-resources           5.12.0
it-core-news-lg               3.7.0
ja-core-news-sm               3.7.0
ja-core-news-trf              3.7.2
Jinja2                        3.1.2
joblib                        1.2.0
kiwisolver                    1.4.4
ko-core-news-lg               3.7.0
ko-core-news-sm               3.7.0
langcodes                     3.3.0
latexcodec                    2.0.1
lxml                          4.9.3
MarkupSafe                    2.1.1
matplotlib                    3.7.1
mkl-fft                       1.3.1
mkl-random                    1.2.2
mkl-service                   2.4.0
mpmath                        1.2.1
multidict                     6.0.4
multiprocess                  0.70.14
murmurhash                    1.0.10
networkx                      2.8.4
ninja                         1.11.1
numpy                         1.23.5
openai                        0.28.0
packaging                     23.1
pandas                        2.0.1
pathtools                     0.1.2
pdfminer.six                  20221105
pdfplumber                    0.10.2
phonenumbers                  8.13.24
Pillow                        9.4.0
pip                           23.0.1
portalocker                   2.8.2
preshed                       3.0.9
presidio-analyzer             2.2.350
presidio-anonymizer           2.2.350
proto-plus                    1.23.0
protobuf                      4.25.3
psutil                        5.9.5
pt-core-news-lg               3.7.0
py-cpuinfo                    9.0.0
pyarrow                       14.0.1
pyarrow-hotfix                0.6
pyasn1                        0.5.1
pyasn1-modules                0.3.0
pybtex                        0.24.0
pycparser                     2.21
pycryptodome                  3.19.0
pydantic                      1.10.11
pylatexenc                    2.10
pymorphy3                     1.2.1
pymorphy3-dicts-ru            2.4.417150.4580142
pyOpenSSL                     23.0.0
pyparsing                     3.0.9
pypdfium2                     4.22.0
PySocks                       1.7.1
python-dateutil               2.8.2
pytz                          2023.3
PyYAML                        6.0
rebiber                       1.1.3
regex                         2023.3.23
requests                      2.28.1
requests-file                 1.5.1
responses                     0.18.0
rsa                           4.9
ru-core-news-lg               3.7.0
sacrebleu                     2.3.3
safetensors                   0.4.2
scikit-learn                  1.3.0
scipy                         1.10.1
sentence-transformers         2.7.0
sentencepiece                 0.1.99
sentry-sdk                    1.24.0
setproctitle                  1.3.2
setuptools                    66.0.0
shapely                       2.0.3
six                           1.16.0
sklearn                       0.0.post4
smart-open                    6.4.0
smmap                         5.0.0
spacy                         3.7.2
spacy-curated-transformers    0.2.0
spacy-legacy                  3.0.12
spacy-loggers                 1.0.5
spacy-pkuseg                  0.0.33
srsly                         2.4.8
SudachiDict-core              20230927
SudachiPy                     0.6.7
sympy                         1.11.1
tabulate                      0.9.0
tenacity                      8.2.3
termcolor                     2.3.0
thinc                         8.2.1
threadpoolctl                 3.1.0
tiktoken                      0.4.0
tldextract                    5.1.0
tokenizers                    0.15.1
torch                         2.0.0
torchaudio                    2.0.0
torchvision                   0.15.0
tqdm                          4.65.0
transformers                  4.38.0.dev0
triton                        2.0.0
tsv                           1.2
typer                         0.9.0
typing_extensions             4.5.0
tzdata                        2023.3
Unidecode                     1.3.7
urllib3                       1.26.15
wandb                         0.15.3
wasabi                        1.1.2
weasel                        0.3.4
wheel                         0.38.4
xxhash                        3.2.0
yarl                          1.9.2
zh-core-web-lg                3.7.0
zh-core-web-trf               3.7.2
zipp                          3.15.0


delip commented May 8, 2024

Could this be a Python 3.12 problem in sentence-transformers?

@chottuthejimmy

It worked for me, apart from a FutureWarning. I am using Python 3.12.0.

pip list
Package               Version
--------------------- ---------
certifi               2024.2.2
charset-normalizer    3.3.2
filelock              3.14.0
fsspec                2024.3.1
huggingface-hub       0.23.0
idna                  3.7
Jinja2                3.1.4
joblib                1.4.2
MarkupSafe            2.1.5
mpmath                1.3.0
networkx              3.3
numpy                 1.26.4
packaging             24.0
pillow                10.3.0
pip                   23.2.1
PyYAML                6.0.1
regex                 2024.4.28
requests              2.31.0
safetensors           0.4.3
scikit-learn          1.4.2
scipy                 1.13.0
sentence-transformers 2.7.0
sympy                 1.12
threadpoolctl         3.5.0
tokenizers            0.19.1
torch                 2.3.0
tqdm                  4.66.4
transformers          4.40.2
typing_extensions     4.11.0
urllib3               2.2.1


da03 commented May 8, 2024

It also works for me on Python 3.12.2 (the exact same version):

python --version

Python 3.12.2

pip list

Package                  Version
------------------------ ----------
certifi                  2024.2.2
charset-normalizer       3.3.2
filelock                 3.14.0
fsspec                   2024.3.1
huggingface-hub          0.23.0
idna                     3.7
Jinja2                   3.1.4
joblib                   1.4.2
MarkupSafe               2.1.5
mpmath                   1.3.0
networkx                 3.3
numpy                    1.26.4
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.20.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.1.105
packaging                24.0
pillow                   10.3.0
pip                      24.0
PyYAML                   6.0.1
regex                    2024.4.28
requests                 2.31.0
safetensors              0.4.3
scikit-learn             1.3.0
scipy                    1.13.0
sentence-transformers    2.7.0
setuptools               69.5.1
sympy                    1.12
threadpoolctl            3.5.0
tokenizers               0.19.1
torch                    2.3.0
torchaudio               2.3.0
torchvision              0.18.0
tqdm                     4.66.4
transformers             4.40.2
typing_extensions        4.11.0
urllib3                  2.2.1
wheel                    0.43.0
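
With both a crashing and a working environment on the same Python version, the package lists in this thread can be compared mechanically rather than by eye. The sketch below assumes plain `pip list` output as input; `parse_pip_list` and `diff_versions` are hypothetical helper names, and the sample rows are copied from the two Python 3.12.2 lists above.

```python
def parse_pip_list(text: str) -> dict[str, str]:
    """Parse `pip list` output into {normalized_name: version}."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blanks, the "Package Version" header, and the dashed separator.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        name, version = parts
        packages[name.lower().replace("_", "-")] = version
    return packages


def diff_versions(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {name: (version_a, version_b)} for shared packages whose versions differ."""
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}


# A few rows from the crashing environment and from the working 3.12.2 one.
crashing = parse_pip_list("""
Package       Version
------------- -------
joblib        1.2.0
scikit-learn  1.3.0
scipy         1.11.4
threadpoolctl 2.2.0
torch         2.3.0
""")
working = parse_pip_list("""
Package       Version
------------- -------
joblib        1.4.2
scikit-learn  1.3.0
scipy         1.13.0
threadpoolctl 3.5.0
torch         2.3.0
""")

for name, (bad, good) in diff_versions(crashing, working).items():
    print(f"{name}: {bad} (crashing) vs {good} (working)")
```

For these rows the differences are in joblib, scipy, and threadpoolctl. Whether any of them is related to the segfault is not established in this thread.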


delip commented May 9, 2024

Forgot to add: the OS I am on is macOS Sonoma 14.4 Beta (23E5205c).
Where are you seeing this working, @da03 @chottuthejimmy?
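
Since the OS and hardware may matter here, it can help if everyone posts the same small environment report. The sketch below is one possible way to collect it, not something any participant ran: `describe_environment` is a hypothetical helper, and the `threadpoolctl` part is optional and only lists the native thread-pool libraries (BLAS/OpenMP) already loaded in the process.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect interpreter, OS, and (optionally) native thread-pool details."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    try:
        # Optional dependency: threadpool_info() reports the BLAS/OpenMP
        # libraries loaded so far, so import the libraries of interest first.
        from threadpoolctl import threadpool_info

        info["threadpools"] = threadpool_info()
    except ImportError:
        info["threadpools"] = None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```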

@chottuthejimmy

forgot to add: the OS I am on is Sonoma 14.4 Beta (23E5205c). Where are you seeing this working, @da03 @chottuthejimmy?

[image]

@chottuthejimmy

@delip I can hop on a call tomorrow if you feel there's more to the problem.
