Arraycore #634

albertosm27 · 2017-06-19T15:21:49Z

First implementation of the new array using h5py backend. The objective of this branch is to pass the basic tests in tables/tests/test_array.py.

FrancescAlted · 2017-06-19T15:37:13Z

Looks good to me so far. I especially like the first attempt at getting rid of the node manager, as h5py should be the place to handle this.

tomkooij · 2017-06-19T19:41:01Z

Nice work!

FrancescAlted · 2017-06-27T06:47:55Z

tables/core/group.py

+            if atom is None or shape is None:
+                raise TypeError('if the obj parameter is not specified '
+                                '(or None) then both the atom and shape '
+                                'parametes should be provided.')


FrancescAlted · 2017-06-27T06:59:01Z

tables/core/array.py

+            if not out.flags['C_CONTIGUOUS']:
+                raise ValueError('output array not C contiguous')
+
+            np.copyto(out, arr)


You can avoid the arr temporary by using read_direct(array, source_sel=None, dest_sel=None).
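A minimal sketch of that suggestion, assuming h5py and NumPy are installed. The file name, dataset name, and data are made up for illustration and are not taken from the PR; the file is kept in memory so nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver, no backing store); names are illustrative.
with h5py.File("read_direct_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("arr", data=np.arange(12, dtype="i4").reshape(4, 3))

    # Caller-provided output buffer, checked the same way as in the diff above.
    out = np.empty(dset.shape, dtype=dset.dtype)
    if not out.flags["C_CONTIGUOUS"]:
        raise ValueError("output array not C contiguous")

    # Read straight into `out`, with no intermediate array and no np.copyto().
    dset.read_direct(out)
```

A later commit in this PR is titled "Avoid read_direct() on empty datasets", so a guard for zero-sized datasets is probably worth keeping around a call like this.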

FrancescAlted · 2017-06-27T07:00:21Z

tables/core/group.py

+        else:
+            flavor = flavor_of(obj)
+            # use a temporary object because converting obj at this stage
+            # breaks some test. This is soultion performs a double,


"This is soultion performs " -> "This fix performs".

FrancescAlted · 2017-06-27T07:01:15Z

tables/core/group.py

+                                 strides=(0,) * len(shape))
+        else:
+            flavor = flavor_of(obj)
+            # use a temporary object because converting obj at this stage


[STYLE] Start comments on their own line, with an uppercase letter.

FrancescAlted · 2017-06-27T07:03:43Z

tables/core/leaf.py

+            if self.shape == ():
+                return 1
+            else:
+                return len(self)


For compactness you can make use of the ternary operator there:

return 1 if self.shape == () else len(self)

tacaswell · 2017-07-05T21:17:01Z

Where did the work from the Perth workshop end up?

tacaswell · 2017-07-05T21:20:39Z

🐑 Never mind, I sorted it out (that code is on the pt4 branch which this is a PR into).

tacaswell · 2017-07-05T21:26:20Z

tables/utils.py

+np_byteorders = {
+    'big': '>',
+    'little': '<',
+    'irrelevant': '|'


'irrelevant' should be 'native'

'irrelevant' is meant for ASCII strings and the like, where the order is, well, irrelevant.
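For reference, a small sketch of how NumPy itself reports byte order, assuming NumPy is installed. The helper name `byteorder_name` is hypothetical and is not part of tables/utils.py; it only illustrates the difference between '=' (native) and '|' (not applicable).

```python
import sys
import numpy as np


def byteorder_name(dtype):
    """Map a dtype's byteorder character to a readable name (hypothetical helper)."""
    code = np.dtype(dtype).byteorder
    if code == "=":
        # '=' means native order, so resolve it through sys.byteorder.
        return sys.byteorder
    # '|' is reported for types where order does not apply, e.g. byte strings.
    return {">": "big", "<": "little", "|": "irrelevant"}[code]


names = {spec: byteorder_name(spec) for spec in ("S3", "i4", ">i4", "<i4")}
```

Here `names["S3"]` is 'irrelevant' on any platform, while `names["i4"]` follows the machine's native order.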

tacaswell · 2017-07-05T21:30:14Z

tables/core/group.py

+
+        # Finally, check whether the desired node is an instance
+        # of the expected class.
+        if classname:


It might be worth checking that classname is not None, so that other 'falsy' values (which should probably raise) do not silently pass.
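A sketch of the difference, using a hypothetical stand-in function rather than the real get_node; the error type and message are assumptions chosen for illustration.

```python
def describe_class_check(classname=None):
    """Hypothetical stand-in for the classname handling in get_node."""
    # `if classname:` would silently skip '' or 0; comparing against None
    # lets those values reach the validation below instead.
    if classname is not None:
        if not isinstance(classname, str) or not classname:
            raise TypeError("classname must be a non-empty string or None")
        return "check against " + classname
    return "no class check"
```

With this shape, `describe_class_check()` skips the check, `describe_class_check("Array")` performs it, and `describe_class_check("")` raises instead of passing unnoticed.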

tacaswell · 2017-07-05T21:41:00Z

tables/core/group.py

        return where.create_table(name, desc, *args, **kwargs)

-    def get_node(self, where):
-        return self.root[where]
+    def get_node(self, where, name=None, classname=None):


Is this to match the API of tables.file.File.get_node?

tacaswell · 2017-07-05T23:45:07Z

albertosm27#1 <- getting copy to work.

tacaswell · 2017-07-24T14:54:50Z

tables/tests/test_carray.py

@@ -2209,7 +2209,7 @@ def test00b_zeros(self):
            print("First row-->", ca[0])
            print("Defaults-->", ca.atom.dflt)
        self.assertTrue(allequal(ca[0], numpy.zeros(N, 'S3')))
-        self.assertTrue(allequal(ca.atom.dflt, numpy.zeros(N, 'S3')))
+        self.assertTrue(allequal(ca.atom.dflt, b""))


Why change the tests like this?

Multidimensional dtypes are giving me a hard time.
I thought h5py, like NumPy, was putting the dtype.shape inside the array.shape, so I did the same for Atom.dflt (equivalent to fillvalue in h5py), and that's why I needed to change the test like that.

I still do not know whether multidimensional dtypes are fully supported in h5py. For example, in tables/tests/array_mdatom.h5 you can get '/arr' and it will be a multidimensional array, but trying to copy it (with create_dataset) raises an error like 'Can't broadcast shape (5, 5, 5) to (5, 5, 5, 3)'; the 3 comes from the dtype shape.

Using the h5py copy method from Group copies the array correctly with its multidimensional dtype (reading the array still returns it as a NumPy array with the dtype shape inside the array shape).

What do you mean by 'multidimensional dtypes'?

h5py should support all dtypes that hdf5 supports (and if any are missing it should be fixed in h5py).

I mean dtypes with shape like this numpy.dtype('(3,)i4').

I actually think this is a numpy bug

In [58]: np.zeros(5, dtype=np.dtype('(3,)i4')).shape
Out[58]: (5, 3)

In [59]: np.zeros(5, dtype=np.dtype('(3,)i4,f4')).shape
Out[59]: (5,)

In [60]: np.zeros(5, dtype=np.dtype('(3,)i4')).dtype
Out[60]: dtype('int32')

In [61]: np.zeros(5, dtype=np.dtype('(3,)i4,f4')).dtype
Out[61]: dtype([('f0', '<i4', (3,)), ('f1', '<f4')])

Maybe:

In [63]: np.zeros(5, dtype=np.dtype('i4,i4,i4')).shape
Out[63]: (5,)

OK, so '(3,)i4' and 3*'i4' are definitely different datatypes; the question is whether or not we expect the promotion (demotion?) to an int32 array with shape + (3,).
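The behaviour under discussion can be reproduced in a few lines, assuming only that NumPy is installed. The variable names are illustrative.

```python
import numpy as np

# A bare subarray dtype: 3 int32 values per element.
sub = np.dtype("(3,)i4")
plain = np.zeros(5, dtype=sub)

# The same subarray used as a field of a structured dtype.
rec = np.zeros(5, dtype=np.dtype("(3,)i4,f4"))

# The dtype object itself still knows its shape and base type...
sub_shape, sub_base = sub.shape, sub.base

# ...but the array created from it has the subarray shape appended to its own
# shape and carries only the base dtype.
plain_shape, plain_dtype = plain.shape, plain.dtype

# Inside a structured dtype the subarray is preserved as a field.
rec_shape = rec.shape
field_shape = rec.dtype["f0"].shape
field_view_shape = rec["f0"].shape
```

So the subarray shape survives only when it is a field of a structured dtype; used alone, it is folded into the array shape, which matches the (5, 5, 5) versus (5, 5, 5, 3) mismatch reported above.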

https://docs.scipy.org/doc/numpy/user/basics.rec.html#structured-arrays

Sorry for the stream-of-consciousness comments as I sort this out 🐑

@tacaswell I remember discussing the 'demotion' of multidimensional dtypes in NumPy for atomic dtypes (i.e. not compound or structured) with Travis quite a long time ago. IIRC, while he agreed that the demotion was not consistent, he argued for keeping the current behaviour on the grounds of ease of implementation. Perhaps the NumPy guys can change that, but IMO this ship sailed a long time ago, so it is not a good idea to ask for that change now.

avalentino · 2017-07-29T07:12:18Z

tables/core/array.py

@@ -136,8 +140,10 @@ def copy(self, newparent=None, newname=None,
            newparent = self.root._get_or_create_path(newparent, createparents)
        if self.__class__.__name__ == 'Array':
            create_function = newparent.create_array
-        else:
+        elif self.__class__.__name__ == 'CArray':


It would probably be saner to use isinstance to test the condition, but you need to reverse the order of the tests because CArray is an Array and EArray is a CArray.

I tried, but with isinstance I need to import CArray in array.py and CArray also imports Array which makes it crash.

@albertosm27 yes, you are right. Now I remember the issue :/
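To illustrate the ordering point only, here is a self-contained sketch with stand-in classes defined in a single module, so the circular-import problem described above does not arise. The class bodies and the `creator_name` helper are hypothetical and are not the PR's code.

```python
class Array:
    """Stand-in for the Array class."""


class CArray(Array):
    """Stand-in: a CArray is an Array."""


class EArray(CArray):
    """Stand-in: an EArray is a CArray."""


def creator_name(node):
    """Pick a create method name, testing the most derived class first."""
    # Checking Array first would also match CArray and EArray instances.
    if isinstance(node, EArray):
        return "create_earray"
    if isinstance(node, CArray):
        return "create_carray"
    if isinstance(node, Array):
        return "create_array"
    raise TypeError("unsupported node type: " + type(node).__name__)
```

When the classes live in modules that import each other, a function-level import is one common workaround; the thread does not settle whether that would suit this branch.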

FrancescAlted · 2017-09-12T17:51:21Z

tables/tests/test_vlarray.py

-        self.assertRaises(ValueError, vlarray.__setitem__, 1, "shrt")
-        self.assertRaises(ValueError, vlarray.__setitem__, 1, "toolong")
+        #self.assertRaises(ValueError, vlarray.__setitem__, 1, "shrt")
+        #self.assertRaises(ValueError, vlarray.__setitem__, 1, "toolong")



Make sure that you add new tests for the assignment of values whose length differs from that of the old values.

Also, do not forget to add an entry about this new feature in the CHANGELOG for PyTables 4.0.

albertosm27 added 5 commits June 16, 2017 16:24

tables/node vscode pep8 formatted and fixed a comment

2dcf6b2

tables/leaf vscode pep8 formatted

e22230c

Array working as the original in examples/array1.py

7281ff9

Removed cache

d4c0f59

Passing first two test in test_array::Basic0DOneTestCase

95ad2a4

albertosm27 added 3 commits June 22, 2017 17:22

[WIP] core/array passes tests/test_array BasicTests (still random fai…

0b54c4b

…ls on closing)

[WIP] core/array passes all BasicTests in tests/tes_array.py (array.r…

286a854

…ead() always returns in sys.byteorder)

File create functions now accept paths

ea4c2c7

FrancescAlted requested changes Jun 27, 2017

albertosm27 added 5 commits June 27, 2017 16:28

Fixed requested changes by Francesc Alted

bcfb752

Avoid read_direct() on empty datasets

4111f5a

Added size_on_disk property on the backend h5py.Dataset

ae92a09

[WIP] core/array.py passing UnalignedAndComplexTestCase tests

bb1d9b9

Method get_node accepts now the original parameters

f1feedb

tacaswell reviewed Jul 5, 2017

tacaswell added 6 commits July 5, 2017 17:47

BLD: add h5py as required dependency

763922f

BLD: changes to compile with hdf5 1.10

c3feb6c

- this will conflict with the develop branch - this will become irrelevant on this branch eventually (as all of the cython should be dropped)

STY: whitespace

83184e9

API/TST: create_array

3241158

- createparents is positional-only now, adjust test - match positional order of remaining args

WIP: get more copy tests working

fa62f67

WIP: more work getting copy tests to pass

141c697

tacaswell added 2 commits July 5, 2017 20:03

WIP: fix more array tests

bef6f26

TST: change inheritance a bit to make pytest happy

18ec5a6

albertosm27 added 6 commits July 19, 2017 13:07

WIP: added extdim in Array

9342989

WIP: added flavor

6bb2a60

WIP: fixed flavor copy

f5011f4

WIP: passing AtomDefaultReprNoReopen

e94e0eb

WIP: carray with correct atoms and fillvalues

ff79fc3

WIP: no need to cast to the default numpy flavor

1ca64aa

tacaswell reviewed Jul 24, 2017

albertosm27 added 12 commits July 25, 2017 16:13

TST: pytest friendly earray tests

91b6933

WIP: implemented EArray

3d9e05d

WIP: forgot EArray class

57c071c

WIP: corrected earray append method and nrow property in iterations

1a070b3

WIP: passing basic tests in test_earray.py

48ec03a

WIP: iter over extdim and added filters to attributes

833a5ba

WIP: EArray slices correctly placed in maindim

ec6cf2b

WIP: corrected readarray

8a13ad7

WIP: check None filters and correct byteorder on empty chunked arrays

6e1114c

WIP: added earray copy functionality

5b21de9

WIP: added truncate dataset functionality

cd9ff2e

WIP: check_open raises ClosedNodeError and atom shape reduced for mul…

2616165

…tidimensional dtypes

avalentino reviewed Jul 29, 2017

albertosm27 added 5 commits August 4, 2017 15:48

TST: pytest friendly vlarray tests

d268ff1

WIP: basic vlarray creation implemented

171e093

WIP: accepting simple point selections

2b45b77

WIP: added read case for vlarray

044675d

WIP: vlarray getters with flavor

e42128c

FrancescAlted mentioned this pull request Aug 27, 2017

No track times #642

Merged

albertosm27 added 2 commits September 12, 2017 19:38

TST: provisional skips

e219398

WIP: added vlarray read method and support for VLStringAtom

3c50c56

FrancescAlted reviewed Sep 12, 2017

avalentino pushed a commit to avalentino/PyTables that referenced this pull request Oct 7, 2017

doc(group): document PyTables#634 on group about autocompletions

1cb712f
