pandas.read_hdf issue with latest anaconda version #10

fcfcfcfcf · 2020-09-11T13:52:26Z

Hello!

In module 1.1.3, the pandas read_hdf function is used to read AAPL stock price data from an h5 file. Unfortunately, pandas v1.0.0 or higher does not support reading older h5 files such as the one included (according to https://github.com/pandas-dev/pandas/issues/33186). Anaconda seems to be the Python distribution recommended in module 1.1.1, but the latest version of Anaconda includes a version of pandas that is incompatible with the given h5 file. The easiest workaround for me was to install an older version of Anaconda (2019.3), which comes with pandas 0.24.2. It might be worth mentioning this in the course material, as it could be tricky for users to figure out.
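For anyone unsure whether their environment is affected, a quick check of the installed pandas version can help. The following is a minimal sketch: the helper names `major_version` and `may_fail_on_legacy_h5` are hypothetical, and the "1.0.0 or higher" cutoff is taken from the report above rather than verified independently.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '1.1.2'."""
    return int(version_string.split(".")[0])


def may_fail_on_legacy_h5(pandas_version):
    """True for pandas 1.0.0 or later, the range reported as affected in this issue."""
    # Assumption: the cutoff comes from the report above, not from pandas documentation.
    return major_version(pandas_version) >= 1


try:
    installed = version("pandas")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("pandas is not installed in this environment")
elif may_fail_on_legacy_h5(installed):
    print(f"pandas {installed}: read_hdf may fail on the course's h5 file")
else:
    print(f"pandas {installed}: read_hdf is expected to read the file")
```

If the check reports an affected version, the options mentioned in this thread are an older Anaconda release or reading the file without read_hdf.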

c-cunningham · 2020-11-05T09:15:52Z

I had this issue too, as well as other compatibility issues with different versions of Python. For instance, the "alt.renderers.enable('notebook')" line worked in Python 2 but caused the graphs to fail in Python 3.

JoBe10 · 2023-10-02T07:10:32Z

You can solve the read_hdf issue by using the h5py Python library (install that library using pip or conda as usual). For the Apple data the code is the following:

import h5py
import pandas as pd

with h5py.File("data/AAPL.h5", 'r') as f:
    aapl_group = f['AAPL']

    # Initialize an empty dictionary to collect data
    data_dict = {}

    # Iterate over all items in the group
    for name, item in aapl_group.items():
        if isinstance(item, h5py.Dataset):
            # For each dataset, add its data to the dictionary with the dataset's name as the key
            data_dict[name] = item[:]

# Extract the columns and their data
cols_data = {
    **dict(zip(data_dict['block0_items'].astype(str), data_dict['block0_values'].T)),
    **dict(zip(data_dict['block1_items'].astype(str), data_dict['block1_values'].T))
}

# Create the DataFrame
aapl = pd.DataFrame(cols_data, index=pd.to_datetime(data_dict['axis1']))
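The column-merging step above hard-codes two blocks. If a file has a different number of blocks, the same idea can be written as a loop. This is only a sketch: `frame_from_blocks` is a hypothetical helper, it assumes the same `blockN_items` / `blockN_values` / `axis1` naming that the snippet above relies on, and the sample arrays are made up purely to show the call (they are not the AAPL data).

```python
import numpy as np
import pandas as pd


def frame_from_blocks(data):
    """Build a DataFrame from arrays named blockN_items, blockN_values and axis1."""
    columns = {}
    block = 0
    # Walk block0, block1, ... and stop at the first block number that is missing
    while f"block{block}_items" in data:
        names = np.asarray(data[f"block{block}_items"]).astype(str).tolist()
        values = np.asarray(data[f"block{block}_values"])
        # values has one row per date; transpose so each entry is one column
        for name, column in zip(names, values.T):
            columns[name] = column
        block += 1
    # axis1 is assumed to hold timestamps as integer nanoseconds since the epoch
    index = pd.to_datetime(np.asarray(data["axis1"]))
    return pd.DataFrame(columns, index=index)


# Made-up sample in the assumed layout (illustration only)
sample = {
    "axis1": np.array([1577923200000000000, 1578009600000000000], dtype="int64"),
    "block0_items": np.array([b"Open", b"Close"]),
    "block0_values": np.array([[1.0, 2.0], [3.0, 4.0]]),
    "block1_items": np.array([b"Volume"]),
    "block1_values": np.array([[100], [200]]),
}
df = frame_from_blocks(sample)
print(df)
```

With the `data_dict` collected from the h5 file above, the call would be `aapl = frame_from_blocks(data_dict)`.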
