Allow label based indexing in Rows (incl. test updates) #268

ms8r · 2017-01-01T23:17:49Z

Enables access to the items in a Dataset Row by index or by column header. For example, data[0]['first_name'] == data[0][0] if 'first_name' is the label of the first column as specified in the Dataset's headers. (Ref. issues #22, #158, #265.)

Implemented by adding a Row attribute _dset that stores a reference to the Dataset that "owns" the Row, giving each Row access to the parent Dataset's headers. Constructors, insert methods, and item getters/setters have been updated accordingly. In addition, Dataset has a new attribute _lblidx that indicates whether label-based indexing is possible (i.e. a header with unique labels exists). _lblidx is maintained via the updated headers property.
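For readers who want a picture of the mechanism, here is a minimal, self-contained sketch of the idea. It is not the PR's actual code: the `Row` and `Dataset` classes below are simplified stand-ins, and the helper `_resolve` and the storage attribute `_row` are hypothetical names chosen for illustration. Only `_dset`, `_lblidx`, `_data`, and the `headers` property are names taken from the description above.

```python
class Row:
    """Simplified stand-in for a row (assumption: not tablib's real class)."""

    def __init__(self, values, dset=None):
        self._row = list(values)
        self._dset = dset  # back-reference to the Dataset that owns this Row

    def _resolve(self, key):
        # Hypothetical helper: translate a column label into a position.
        # Integers and slices are passed through unchanged.
        if isinstance(key, str):
            if self._dset is None or not self._dset._lblidx:
                raise KeyError(key)
            try:
                return self._dset.headers.index(key)
            except ValueError:
                raise KeyError(key) from None
        return key

    def __getitem__(self, key):
        return self._row[self._resolve(key)]

    def __setitem__(self, key, value):
        self._row[self._resolve(key)] = value


class Dataset:
    """Simplified stand-in for a dataset (assumption: not tablib's real class)."""

    def __init__(self, *rows, headers=None):
        self._data = [Row(r, dset=self) for r in rows]
        self.headers = headers  # goes through the property setter below

    @property
    def headers(self):
        return self._headers

    @headers.setter
    def headers(self, value):
        self._headers = list(value) if value else None
        # Label-based indexing is only possible with unique, non-empty headers.
        self._lblidx = bool(self._headers) and (
            len(set(self._headers)) == len(self._headers)
        )

    def __getitem__(self, index):
        return self._data[index]


data = Dataset(
    ("John", "Adams"),
    ("George", "Washington"),
    headers=["first_name", "last_name"],
)
first_by_label = data[0]["first_name"]
first_by_index = data[0][0]
```

Because the flag is recomputed in the `headers` setter, assigning headers with duplicate labels switches label lookup off again in this sketch, and string keys then raise `KeyError`.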

To allow label-based access within a Row, the Dataset's __getitem__ now returns a Row rather than a tuple, with the Row behaving essentially like a list externally. This could cause backwards-compatibility issues if client code relies on Dataset items being returned as plain tuples. To minimize the impact, the PR adds __add__, __eq__, and __ne__ methods for Rows. Tests have been updated to use the Row.tuple property for comparisons with tuple literals (the PR would fail existing tests otherwise). Independent of label-based indexing, I'd suggest that returning Dataset items as Rows instead of plain tuples is preferable in any case, since it makes it possible to add functionality in the future.
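A rough sketch of what list-like behavior for a Row could look like is given below. The exact semantics of `__add__`, `__eq__`, and `__ne__` in the PR are not spelled out here, so everything in this block is an assumption: the sketch compares a Row the way a list would (so a Row never equals a tuple literal, which is why `Row.tuple` is needed in such comparisons) and returns a plain list from `+`. The storage attribute `_row` is a hypothetical name.

```python
class Row:
    """Simplified stand-in for a row (assumption: not tablib's real class)."""

    def __init__(self, values):
        self._row = list(values)

    @property
    def tuple(self):
        # Tuple view of the row, useful when comparing against tuple literals.
        return tuple(self._row)

    def __add__(self, other):
        # Assumed semantics: concatenate like a list and return a plain list.
        other_values = other._row if isinstance(other, Row) else list(other)
        return self._row + other_values

    def __eq__(self, other):
        # Assumed semantics: compare like a list, so Row == tuple is False.
        if isinstance(other, Row):
            return self._row == other._row
        return self._row == other

    def __ne__(self, other):
        # Explicit for Python 2, where __ne__ is not derived from __eq__.
        return not self.__eq__(other)


row = Row(("John", "Adams"))
equals_tuple_literal = row == ("John", "Adams")
tuple_equals_literal = row.tuple == ("John", "Adams")
equals_list = row == ["John", "Adams"]
combined = row + [90]
```

Under these assumed semantics, client code that compared `data[0]` directly with a tuple literal would need to switch to `data[0].tuple`, which mirrors the test updates described above.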

Other changes/additions:

Add copy method for Datasets that updates _dset references in the new object's Rows and uses copy.deepcopy instead of copy.copy. This should also fix a bug in the current version where copies (in filter and stack) are shallow and the new object's _data attribute points to the same list as the original object (filter and stack updated accordingly).
Add assertions to existing tests for methods that return new Dataset objects to verify that Row's _dset points to the new object and that the new object is not a shallow copy (filter, stack, stack_col, subset, sorted, and transpose)
Add tests for new functionality (plus one for existing filter)
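The shallow-copy problem and the intended fix can be illustrated with a small standalone sketch. The classes are simplified stand-ins rather than the PR's code, and `_row` is a hypothetical attribute name; only `_data`, `_dset`, and the use of `copy.deepcopy` come from the description above.

```python
import copy


class Row:
    """Simplified stand-in for a row (assumption: not tablib's real class)."""

    def __init__(self, values, dset=None):
        self._row = list(values)
        self._dset = dset  # back-reference to the owning Dataset


class Dataset:
    """Simplified stand-in for a dataset (assumption: not tablib's real class)."""

    def __init__(self, *rows, headers=None):
        self.headers = list(headers) if headers else None
        self._data = [Row(r, dset=self) for r in rows]

    def copy(self):
        new = copy.deepcopy(self)
        # deepcopy's memo already maps the cyclic back-reference to `new`
        # in this sketch; the loop states the invariant explicitly.
        for row in new._data:
            row._dset = new
        return new


data = Dataset(("John", "Adams"), headers=["first_name", "last_name"])

# copy.copy duplicates only the outer object: both share one _data list.
shallow = copy.copy(data)
shallow_shares_data = shallow._data is data._data

# The copy method yields independent rows that point at the new object.
dup = data.copy()
dup._data[0]._row[0] = "Jane"
original_first_name = data._data[0]._row[0]
```

The checks added to the existing tests, as described in the list above, amount to the same two conditions: the new object's rows reference the new object, and its `_data` is not the original list.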

timofurrer · 2019-03-02T11:14:28Z

Can you please resolve the conflicts. Thanks 🎉

ms8r · 2019-03-17T10:12:45Z

Done ;-) This also surfaced a leftover bug in the has_tag method (incorrect unicode handling under Python 2.7)... time to move to Python 3 only...

ms8r added 3 commits January 1, 2017 23:06

Allow label based indexing in Rows (incl. test updates)

204ecbe

Change assertRaises for Python 2.6 compatibility

a79f638

Change format strings for Python 2.6 compatibility

7fba346

DanielDisisto mentioned this pull request Aug 16, 2017

Row should be dict not list #158

Open

ms8r and others added 3 commits March 16, 2019 14:39

Resolve merge conflict with master

c94658e

Merge branch 'master' into develop

791dcd5

fix has_tag method for correct unicode handling

80ae28c

timofurrer approved these changes Mar 17, 2019


hugovk mentioned this pull request Oct 4, 2019

Drop support for Python 2 #390

Closed
