Description
If a NumPy array of shape (N, 4) is written to HdfEventWriter.pmu, its columns are not converted to the named fields. Instead, each scalar value is broadcast over the x, y, z, e fields of its own record, resulting in a 4x increase in disk usage.
That is, from this:
array([[-2.52392785e+01, 4.17925601e-02, 1.63704453e+01,
3.00834579e+01],
[-1.31029150e+01, 2.16167656e-02, 8.49904537e+00,
1.56179590e+01],
[ 4.83393040e+02, 4.55137104e+01, 3.82795685e+02,
6.18282166e+02],
...
what gets stored is this:
array([[(-2.52392785e+01, -2.52392785e+01, -2.52392785e+01, -2.52392785e+01),
( 4.17925601e-02, 4.17925601e-02, 4.17925601e-02, 4.17925601e-02),
( 1.63704453e+01, 1.63704453e+01, 1.63704453e+01, 1.63704453e+01),
( 3.00834579e+01, 3.00834579e+01, 3.00834579e+01, 3.00834579e+01)],
[(-1.31029150e+01, -1.31029150e+01, -1.31029150e+01, -1.31029150e+01),
( 2.16167656e-02, 2.16167656e-02, 2.16167656e-02, 2.16167656e-02),
( 8.49904537e+00, 8.49904537e+00, 8.49904537e+00, 8.49904537e+00),
( 1.56179590e+01, 1.56179590e+01, 1.56179590e+01, 1.56179590e+01)],
[( 4.83393040e+02, 4.83393040e+02, 4.83393040e+02, 4.83393040e+02),
( 4.55137104e+01, 4.55137104e+01, 4.55137104e+01, 4.55137104e+01),
( 3.82795685e+02, 3.82795685e+02, 3.82795685e+02, 3.82795685e+02),
( 6.18282166e+02, 6.18282166e+02, 6.18282166e+02, 6.18282166e+02)]
...
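This is a bug: the stored array should have shape (N,), with one (x, y, z, e) record per row. Patch it by converting the (N, 4) array to a view whose dtype is compatible with the HDF5 dataset before writing.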
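A minimal sketch of what such a conversion could look like in plain NumPy, without touching the writer itself. The field names x, y, z, e come from the output above; the float64 field type, the name PMU_DTYPE, and the helper as_pmu_records are assumptions made for illustration and are not part of the library. The assignment into a pre-allocated structured array is only one way to reproduce the same broadcast pattern; it may not be the exact code path the writer takes.

```python
import numpy as np

# Assumed record layout: field names from the issue, float64 is a guess.
PMU_DTYPE = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8"), ("e", "f8")])


def as_pmu_records(pmu):
    """Hypothetical helper: reinterpret an (N, 4) array as N (x, y, z, e) records."""
    # A view needs contiguous memory and an element type matching the fields.
    pmu = np.ascontiguousarray(pmu, dtype=np.float64)
    if pmu.ndim != 2 or pmu.shape[1] != 4:
        raise ValueError(f"expected shape (N, 4), got {pmu.shape}")
    # Each 32-byte row becomes one record; the view has shape (N, 1).
    return pmu.view(PMU_DTYPE).reshape(-1)


raw = np.array([
    [-2.52392785e+01, 4.17925601e-02, 1.63704453e+01, 3.00834579e+01],
    [-1.31029150e+01, 2.16167656e-02, 8.49904537e+00, 1.56179590e+01],
    [4.83393040e+02, 4.55137104e+01, 3.82795685e+02, 6.18282166e+02],
])

# Reproduces the symptom: assigning an unstructured array to a structured
# one copies every scalar into all four fields, keeping the (N, 4) shape.
broadcast = np.empty(raw.shape, dtype=PMU_DTYPE)
broadcast[...] = raw

# Intended layout: one record per row, shape (N,).
records = as_pmu_records(raw)

print(broadcast.shape, broadcast.nbytes)
print(records.shape, records.nbytes)
```

If the input is not already contiguous float64, np.ascontiguousarray makes a copy, so the result is only a true view in the common case. numpy.lib.recfunctions.unstructured_to_structured would be an alternative that always copies but does not depend on the memory layout.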