Suggestion to support VisiumHD tissue_position_list files (using parquet files) #2982

Open
Rafael-Silva-Oliveira opened this issue Apr 7, 2024 · 2 comments

Comments

Rafael-Silva-Oliveira commented Apr 7, 2024

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Hey

VisiumHD will now be the main data type for 10X spatial technology. Because it uses a much higher number of barcodes, the tissue positions can no longer be stored in a normal .csv file due to row limitations, so they are now stored in a .parquet file instead. However, scanpy's read_visium method doesn't seem to support that.

Lines of code:

    if load_images:
        files = dict(
            tissue_positions_file=path / 'spatial/tissue_positions_list.csv',
            scalefactors_json_file=path / 'spatial/scalefactors_json.json',
            hires_image=path / 'spatial/tissue_hires_image.png',
            lowres_image=path / 'spatial/tissue_lowres_image.png',
        )


I would suggest something like this to check if there's a csv file or parquet file:

    if load_images:
        # Candidate tissue positions files, in order of preference.
        candidates = [
            path / f'spatial/tissue_positions_list{suffix}'
            for suffix in ('.csv', '.parquet')
        ]
        files = dict(
            tissue_positions_file=next((p for p in candidates if p.exists()), None),
            scalefactors_json_file=path / 'spatial/scalefactors_json.json',
            hires_image=path / 'spatial/tissue_hires_image.png',
            lowres_image=path / 'spatial/tissue_lowres_image.png',
        )

        positions_file = files['tissue_positions_file']
        if positions_file is None:
            # Neither file exists; fail here instead of on `None.suffix` below.
            raise FileNotFoundError(f'No tissue positions file (.csv or .parquet) in {path / "spatial"}')
        if positions_file.suffix == '.csv':
            positions = pd.read_csv(positions_file, header=None)
        else:
            positions = pd.read_parquet(positions_file)

This way we can use read_visium if the tissue location file is .parquet instead:

AnnData object with n_obs × n_vars = 605471 × 18085
    obs: 'in_tissue', 'array_row', 'array_col'
    var: 'gene_ids', 'feature_types', 'genome'
    uns: 'spatial'
    obsm: 'spatial'

Using scanpy 1.9.6
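
For anyone who wants to try the lookup on its own, here is a minimal sketch of the file-resolution step from the suggestion above, using only the standard library. The helper name `find_tissue_positions` and its parameters are hypothetical (nothing like it exists in scanpy), and the default stem `tissue_positions_list` is simply taken from the snippet in this issue; your Space Ranger output may use a different file name, which is why the stem is a parameter.

```python
from pathlib import Path

# Hypothetical helper for illustration only; it is not part of scanpy.
# The suffix order encodes the preference: .csv first, then .parquet.
POSITION_SUFFIXES = (".csv", ".parquet")


def find_tissue_positions(spatial_dir, stem="tissue_positions_list",
                          suffixes=POSITION_SUFFIXES):
    """Return the first existing `<stem><suffix>` file inside `spatial_dir`.

    Raises FileNotFoundError if none of the candidates exist, so callers
    never have to handle a `None` path.
    """
    spatial_dir = Path(spatial_dir)
    for suffix in suffixes:
        candidate = spatial_dir / f"{stem}{suffix}"
        if candidate.exists():
            return candidate
    tried = ", ".join(f"{stem}{s}" for s in suffixes)
    raise FileNotFoundError(
        f"No tissue positions file in {spatial_dir} (tried: {tried})"
    )
```

The returned path can then be dispatched on `.suffix` as in the snippet above (`pd.read_csv` for `.csv`, `pd.read_parquet` for `.parquet`). Raising on a missing file keeps the failure close to its cause rather than surfacing later as an attribute error on `None`.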

@Rafael-Silva-Oliveira Rafael-Silva-Oliveira changed the title Supporting VisiumHD tissue_position.parquet files Suggestion to support VisiumHD tissue_position_list files (using parquet files) Apr 7, 2024
@Rafael-Silva-Oliveira
Author

Hello, I'd like to ask which version of scanpy you're using? The code you provided doesn't match the latest version, which is causing errors when I try to use it.

Hi, I'm using 1.9.6

111-dep commented Apr 9, 2024

Thank you very much, I've resolved the error.
