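The following three ways of using cosine similarity in `sc.pp.neighbors`, each followed by Leiden clustering, give different results:
1. Set `metric = "cosine"` in `sc.pp.neighbors()`.
2. Define a callable cosine metric, wrap it in a `KNeighborsTransformer`, and pass it to the `transformer` option in `sc.pp.neighbors()`.
3. Precompute the cosine matrix, build the connectivities from it, and store them in `adata.obsp['connectivities']`.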
Minimal code sample

```python
import scanpy as sc
from sklearn.neighbors import KNeighborsTransformer
import numpy as np
from numpy.linalg import norm
from sklearn.metrics.pairwise import cosine_similarity

adata = sc.datasets.pbmc68k_reduced()

### use built-in cosine similarity option
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=0, metric="cosine")
sc.tl.umap(adata, random_state=42)
sc.tl.leiden(adata, resolution=10)
clusters = np.array(adata.obs["leiden"]).astype(int)
print('num of clusters: ' + str(len(set(clusters))))

### use callable cosine similarity metrics
def cos_distance(A, B):
    # calculate the distance, return a float
    cosine = np.dot(A, B) / (norm(A) * norm(B))
    return cosine

transformer = KNeighborsTransformer(n_neighbors=15, metric=cos_distance)
sc.pp.neighbors(adata, transformer=transformer, n_pcs=0)
sc.tl.umap(adata, random_state=42)
sc.tl.leiden(adata, resolution=10)
clusters = np.array(adata.obs["leiden"]).astype(int)
print('num of clusters: ' + str(len(set(clusters))))

### use precomputed cosine distance metrics
dis_mat = cosine_similarity(adata.X)
tmp = sc.neighbors._common._get_indices_distances_from_dense_matrix(dis_mat, n_neighbors=15)
adata.obsp["connectivities"] = sc.neighbors._connectivity.umap(
    knn_indices=tmp[0],
    knn_dists=tmp[1],
    n_obs=dis_mat.shape[0],
    n_neighbors=15,
)
adata.uns["neighbors"] = {"connectivities_key": "connectivities", "params": {"method": None}}
sc.tl.umap(adata, random_state=42)
sc.tl.leiden(adata, resolution=10)
clusters = np.array(adata.obs["leiden"]).astype(int)
print('num of clusters: ' + str(len(set(clusters))))
```
Error output

```
num of clusters: 85
num of clusters: 170
num of clusters: 183
```
Versions

```
-----
anndata             0.10.6
scanpy              1.10.1
-----
IPython             8.22.2
PIL                 10.2.0
asttokens           NA
console_thrift      NA
cycler              0.12.1
cython_runtime      NA
dateutil            2.9.0
decorator           5.1.1
executing           2.0.1
h5py                3.10.0
igraph              0.11.4
jedi                0.19.1
joblib              1.3.2
kiwisolver          1.4.5
legacy_api_wrap     NA
leidenalg           0.10.2
llvmlite            0.42.0
matplotlib          3.8.3
mpl_toolkits        NA
natsort             8.4.0
numba               0.59.1
numpy               1.26.4
packaging           24.0
pandas              2.2.1
parso               0.8.3
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.42
psutil              5.9.0
pure_eval           0.2.2
pydev_console       NA
pydev_ipython       NA
pydevconsole        NA
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.17.2
pynndescent         0.5.11
pyparsing           3.1.2
pytz                2024.1
scipy               1.12.0
session_info        1.0.0
six                 1.16.0
sklearn             1.4.1.post1
stack_data          0.6.2
texttable           1.7.0
threadpoolctl       3.4.0
tqdm                4.66.2
traitlets           5.14.2
typing_extensions   NA
umap                0.5.5
wcwidth             0.2.13
-----
Python 3.11.7 (main, Dec 15 2023, 12:09:56) [Clang 14.0.6 ]
macOS-14.3.1-arm64-arm-64bit
-----
Session information updated at 2024-04-22 09:58
```
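Option 1 (built-in `metric="cosine"`) generates 85 clusters, option 2 (callable metric passed through `KNeighborsTransformer`) generates 170 clusters, and option 3 (precomputed matrix) generates 183 clusters.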
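One thing that may be worth checking, offered as an observation rather than a confirmed diagnosis: in scikit-learn, `metric="cosine"` denotes cosine *distance* (1 minus the cosine similarity), whereas the `cos_distance` callable and `cosine_similarity(adata.X)` in the code sample return the similarity itself. A nearest-neighbour search picks the smallest metric values, so feeding it similarities would select the least similar points. The sketch below illustrates this with plain Python on toy vectors; the helper names (`cosine_similarity`, `cosine_distance`, `nearest`) and the data are hypothetical and are not scanpy or scikit-learn APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance: 0.0 for identical directions, 2.0 for opposite ones."""
    return 1.0 - cosine_similarity(a, b)


def nearest(query, points, metric):
    """Index of the point with the smallest metric value, as a k-NN search would pick."""
    return min(range(len(points)), key=lambda i: metric(query, points[i]))


# Toy data (hypothetical): one point almost aligned with the query,
# one orthogonal to it, and one pointing the opposite way.
query = [1.0, 0.0]
points = [[2.0, 0.1], [0.0, 3.0], [-1.0, 0.0]]

by_distance = nearest(query, points, cosine_distance)
by_similarity = nearest(query, points, cosine_similarity)

print("nearest by cosine distance:  ", by_distance)
print("nearest by cosine similarity:", by_similarity)
```

With a distance, the almost-aligned point is chosen; with a similarity used as if it were a distance, the opposite-facing point is chosen. If this applies here, returning `1 - cosine` from the callable and passing `1 - cosine_similarity(adata.X)` as the precomputed matrix would put options 2 and 3 on the same footing as option 1, although small differences between exact and approximate neighbour search could still remain.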