Initialize a Dataset from plain python dicts #491

JWCook · 2021-05-18T00:09:52Z

I'd like to be able to create a Dataset from plain python dicts. I read through just about everything in the docs and some of the code, and can't seem to find a way to do this directly.

What I'd like to be able to do is something like:

source_data = [
     {'column_1': 'value_1A',  'column_2':  'value_2A'},
     {'column_1': 'value_1B',  'column_2':  'value_2B'},
]
data = Dataset().load(*source_data)

Or maybe:

data = import_set(source_data, format='dict')

Current workarounds:

json.dumps the data first before loading
Register a custom format class

Separately add headers and row values:

data = Dataset()
data.headers = source_data[0].keys()
data.extend([item.values() for item in source_data])

So I guess my questions are:

Is there currently a better way to do this?
If not, would you be interested in a PR for this?


brandonrobertz · 2021-10-07T01:40:18Z

data = Dataset()
data.headers = source_data[0].keys()
data.extend([item.values() for item in source_data])

I'll note that this only works if every dict has all of the keys, which a lot of data doesn't. Serializing every blank value as None is wasteful, so many applications emit records that omit some of the possible keys.

I'd like to see movement on this as well and am willing to start developing on it. My current solution is to pass over the data twice: once to collect all the keys and gather the rows, then again to build rows from the full set of headers and append them to the Dataset.
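
The two-pass approach described above can be sketched in plain Python. This is only an illustration: normalize_records and its fill parameter are hypothetical names, not part of tablib, and the sketch assumes a missing key should become None.

```python
def normalize_records(records, fill=None):
    """Return (headers, rows) for a list of dicts that may have different keys."""
    records = list(records)  # allow generators, since two passes are needed

    # Pass 1: collect every key, preserving first-seen order.
    headers = []
    seen = set()
    for record in records:
        for key in record:
            if key not in seen:
                seen.add(key)
                headers.append(key)

    # Pass 2: build one aligned row per record, filling gaps with `fill`.
    rows = [[record.get(key, fill) for key in headers] for record in records]
    return headers, rows


source_data = [
    {'column_1': 'value_1A', 'column_2': 'value_2A'},
    {'column_1': 'value_1B', 'column_3': 'value_3B'},
]
headers, rows = normalize_records(source_data)
```

Because every row now has the same width as the header list, the result should be usable as Dataset(*rows, headers=headers).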

It would be really nice if append could accept a dict, automatically adding a new column whenever an unseen key is encountered and setting that column to blank in all previous rows.
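
A rough sketch of that behaviour, written outside tablib so that it makes no claims about how Dataset.append works internally. GrowingTable and append_dict are hypothetical names, and "blank" is assumed to mean None.

```python
class GrowingTable:
    """Hypothetical row container that widens itself when a new key appears."""

    def __init__(self, blank=None):
        self.headers = []
        self.rows = []
        self.blank = blank

    def append_dict(self, record):
        # Add a column for each unseen key and backfill all earlier rows.
        for key in record:
            if key not in self.headers:
                self.headers.append(key)
                for row in self.rows:
                    row.append(self.blank)
        # Build the new row in header order, blanking any missing keys.
        self.rows.append([record.get(key, self.blank) for key in self.headers])


table = GrowingTable()
table.append_dict({'column_1': 'value_1A'})
table.append_dict({'column_1': 'value_1B', 'column_2': 'value_2B'})
table.append_dict({'column_2': 'value_2C'})
```

Compared with the two-pass approach, this handles records one at a time, at the cost of touching every stored row each time a new column appears.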

brandonrobertz · 2021-11-07T22:35:16Z

An update: I've been working through this, and the code contains references to rows needing to align, but I'm not sure whether this is required only by the validation checks or whether something deeper in the row-appending logic depends on it. Some guidance from a dev would be helpful here!

daghemo · 2022-01-10T10:58:13Z

Same problem/question here.
My dicts have all of the keys, just like in @JWCook's example. That's why I've been able to use:

source_data = [
     {'column_1': 'value_1A',  'column_2':  'value_2A'},
     {'column_1': 'value_1B',  'column_2':  'value_2B'},
]
data = Dataset()
data.dict = source_data

I may also override the column names with:

data.headers = ["Column 1","Column 2"]

This works only if every dict in the list has all of the keys. If not, a tablib.exceptions.InvalidDimensions exception is raised.
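
To find out ahead of time which records would cause that error, a small pre-check can compare each dict's keys against the first one. find_mismatched_records is a hypothetical helper, not a tablib function, and it assumes the first dict defines the expected columns.

```python
def find_mismatched_records(records):
    """Return (index, missing_keys, extra_keys) for each dict whose keys differ from the first."""
    if not records:
        return []
    expected = set(records[0])
    problems = []
    for index, record in enumerate(records):
        keys = set(record)
        if keys != expected:
            # Sort so the report is stable regardless of set ordering.
            problems.append((index, sorted(expected - keys), sorted(keys - expected)))
    return problems


source_data = [
    {'column_1': 'value_1A', 'column_2': 'value_2A'},
    {'column_1': 'value_1B'},
]
problems = find_mismatched_records(source_data)
```

An empty result means all dicts share the same keys, which is the case where assigning to data.dict is reported to work.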
