This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

[O11Y-224] add support for merging parquet files. #45

Merged 146 commits on Jan 10, 2022

Commits on Dec 16, 2021

  1. refactor row traversal

    Achille Roussel committed Dec 16, 2021
    6297550
  2. reorganize code

    Achille Roussel committed Dec 16, 2021
    63faa7c
  3. micro optimization

    Achille Roussel committed Dec 16, 2021
    fe2dad8
  4. optimize struct field lookup

    Achille Roussel committed Dec 16, 2021
    fa039c4
  5. remove unnecessary check

    Achille Roussel committed Dec 16, 2021
    4096e31
  6. export parquet.NewSchema

    Achille Roussel committed Dec 16, 2021
    eb07e76

Commits on Dec 17, 2021

  1. refactor traversal into deconstruct

    Achille Roussel committed Dec 17, 2021
    fa0c576
  2. don't use a static array as row buffer in row group writer

    Achille Roussel committed Dec 17, 2021
    6b205ed

Commits on Dec 19, 2021

  1. working row deconstruction + reconstruction

    Achille Roussel committed Dec 19, 2021
    e6d6860
  2. optimize decoding of repeated values

    Achille Roussel committed Dec 19, 2021
    92f06e0
  3. remove unused code

    Achille Roussel committed Dec 19, 2021
    872fe74

Commits on Dec 20, 2021

  1. fix bugs in delta binary packed encoding + add more tests for parquet.Reader

    Achille Roussel committed Dec 20, 2021
    bc57ee9
  2. fix issue when reading empty sequences of repeated columns

    Achille Roussel committed Dec 20, 2021
    e86f6a5
  3. validate column index in deconstruction/reconstruction tests

    Achille Roussel committed Dec 20, 2021
    efe4bf1
  4. add benchmarks for parquet.Reader

    Achille Roussel committed Dec 20, 2021
    a24ce66

Commits on Dec 21, 2021

  1. introduce new batching APIs + optimizations + bug fixes

    Achille Roussel committed Dec 21, 2021
    028f7d5
  2. add more tests and documentation

    Achille Roussel committed Dec 21, 2021
    500414a
  3. remove more unused code

    Achille Roussel committed Dec 21, 2021
    f0e0754
  4. refactor reader internals to remove intermediary buffers

    Achille Roussel committed Dec 21, 2021
    53d2ea2
  5. optimize row reader

    Achille Roussel committed Dec 21, 2021
    60441d6

Commits on Dec 22, 2021

  1. more tests and more fixes

    Achille Roussel committed Dec 22, 2021
    140a2fe
  2. simplify column read func code

    Achille Roussel committed Dec 22, 2021
    c77ad3a
  3. simplify column read func code (2)

    Achille Roussel committed Dec 22, 2021
    c3319bf
  4. simplify column read func code (3)

    Achille Roussel committed Dec 22, 2021
    89490fc
  5. simplify column read func code (4)

    Achille Roussel committed Dec 22, 2021
    39bf0ab
  6. optimize reading column groups with a single element

    Achille Roussel committed Dec 22, 2021
    a924da2
  7. allow the reader to reuse ColumnPages objects

    Achille Roussel committed Dec 22, 2021
    bbe7d99
  8. fix bugs in RLE encoding and data page reader

    Achille Roussel committed Dec 22, 2021
    d17209e

Commits on Dec 23, 2021

  1. 40433b0
  2. fix counting of null values when writing repeated columns

    Achille Roussel committed Dec 23, 2021
    cba040b
  3. add test with nested lists

    Achille Roussel committed Dec 23, 2021
    2c1624a
  4. fix RLE/Bit-Packed tests + document that the decoders may return io.EOF when values have been read

    Achille Roussel committed Dec 23, 2021
    a9099b2
  5. inline function calls to improve reader throughput

    Achille Roussel committed Dec 23, 2021
    c10efd4
  6. fix test representation of fixed-length byte array

    Achille Roussel committed Dec 23, 2021
    7cc724b

Commits on Dec 26, 2021

  1. split ReadRow/WriteRow into Read/ReadRow and Write/WriteRow

    Achille Roussel committed Dec 26, 2021
    f216a28
  2. add parquet.RowGroup APIs

    Achille Roussel committed Dec 26, 2021
    50bc2a5

Commits on Dec 27, 2021

  1. introduce Page type and refactor PageWriter

    Achille Roussel committed Dec 27, 2021
    f5dd389
  2. refactor package to use encoding.ByteArrayList

    Achille Roussel committed Dec 27, 2021
    2c31f5a
  3. use encoding.ByteArrayList in parquet.RowGroupColumn + add parquet.Page.Size

    Achille Roussel committed Dec 27, 2021
    8beedc8
  4. 209936c
  5. remove traverse.go

    Achille Roussel committed Dec 27, 2021
    975243d
  6. 186d822
  7. simplify value reader/writer interfaces

    Achille Roussel committed Dec 27, 2021
    10ba1df
  8. replace parquet.PageWriter with parquet.RowGroupColumn

    Achille Roussel committed Dec 27, 2021
    fd4532e
  9. delegate generation of repetition and definition levels to parquet.RowGroupColumn

    Achille Roussel committed Dec 27, 2021
    9953df9

Commits on Dec 28, 2021

  1. refactor dictionary types + replace parquet.PageReader with parquet.ValueDecoder

    Achille Roussel committed Dec 28, 2021
    8e25392
  2. expose the content of pages via a value reader

    Achille Roussel committed Dec 28, 2021
    558ce36
  3. lazily configure schema on writer and row group + write row groups to writer

    Achille Roussel committed Dec 28, 2021
    478b224
  4. fix memory management when decoding into parquet.Value

    Achille Roussel committed Dec 28, 2021
    b232d45
  5. hold the number of rows in pages

    Achille Roussel committed Dec 28, 2021
    cf82611
  6. lazily allocate row group columns in writer

    Achille Roussel committed Dec 28, 2021
    15e0906

Commits on Dec 29, 2021

  1. 4063d1a
  2. get rid of the boolean managing state of the writer

    Achille Roussel committed Dec 29, 2021
    da52a81
  3. write footer in rowGroupWriter

    Achille Roussel committed Dec 29, 2021
    d03cac5
  4. handling flush called after close

    Achille Roussel committed Dec 29, 2021
    8b6bf13
  5. split row group column pages to respect target page size

    Achille Roussel committed Dec 29, 2021
    3d6b732
  6. document parquet.SortingColumn

    Achille Roussel committed Dec 29, 2021
    d041a2f
  7. remove unused NodeAt function

    Achille Roussel committed Dec 29, 2021
    b698ad7
  8. [O11Y-219] initial support for parquet MAP type

    Achille Roussel committed Dec 29, 2021
    140149d
  9. Update row_unsafe.go

    Co-authored-by: Thomas Pelletier <thomas.pelletier@segment.com>
    Achille and pelletier committed Dec 29, 2021
    58e5900
  10. add more tests and fix bugs

    Achille Roussel committed Dec 29, 2021
    a249a9a
  11. fix tests

    Achille Roussel committed Dec 29, 2021
    1871d3e
  12. revisit row reconstruction logic

    Achille Roussel committed Dec 29, 2021
    7129198
  13. fix tests

    Achille Roussel committed Dec 29, 2021
    8c0a277
  14. f99088f

Commits on Dec 30, 2021

  1. remove parquet.node type, the abstraction was not worth it

    Achille Roussel committed Dec 30, 2021
    2d38f12
  2. [O11Y-220] add schema conversions

    Achille Roussel committed Dec 30, 2021
    eedb358
  3. remove unused max function

    Achille Roussel committed Dec 30, 2021
    1e30ad6
  4. Merge remote-tracking branch 'origin/main' into O11Y-194

    Achille Roussel committed Dec 30, 2021
    3cc049a
  5. Merge branch 'O11Y-194' of ssh://github.com/segmentio/parquet into O11Y-194

    Achille Roussel committed Dec 30, 2021
    5281285
  6. Merge remote-tracking branch 'origin/O11Y-194' into O11Y-221

    Achille Roussel committed Dec 30, 2021
    7cbeca3
  7. Merge remote-tracking branch 'origin/O11Y-221' into O11Y-219

    Achille Roussel committed Dec 30, 2021
    babea64
  8. Merge remote-tracking branch 'origin/O11Y-219' into O11Y-220

    Achille Roussel committed Dec 30, 2021
    04c6734
  9. 867b716
  10. refactor row group abstractions

    Achille Roussel committed Dec 30, 2021
    f94de8d

Commits on Dec 31, 2021

  1. refactor buffer column to use ReadRowAt/WriteRow

    Achille Roussel committed Dec 31, 2021
    9b4372c
  2. modify RowGroupColumn interface to return the list of pages it contains instead of a single page

    Achille Roussel committed Dec 31, 2021
    2c50b29
  3. add back the slicing of pages to target size in writer

    Achille Roussel committed Dec 31, 2021
    633a257
  4. Merge remote-tracking branch 'origin/main' into O11Y-224

    Achille Roussel committed Dec 31, 2021
    8924c5b
  5. implement parquet.ValueReaderAt on parquet.BufferColumn

    Achille Roussel committed Dec 31, 2021
    9d1d87b
  6. 0dd07b9
  7. express WriteRowGroup as a row copy operation

    Achille Roussel committed Dec 31, 2021
    41a3e2f
  8. 76c2ae9
  9. 88681d3
  10. express reading rows from parquet.Buffer and parquet.Reader with the same algorithm

    Achille Roussel committed Dec 31, 2021
    121a19e
  11. document the APIs

    Achille Roussel committed Dec 31, 2021
    d4d7dd3
  12. comment on the design decisions in parquet.CopyRows

    Achille Roussel committed Dec 31, 2021
    0475a91
  13. work on row group merging

    Achille Roussel committed Dec 31, 2021
    3049139
  14. express sorting columns of row group as []parquet.SortingColumn rather than []format.SortingColumn

    Achille Roussel committed Dec 31, 2021
    0991785
  15. add configuration options to row group merging function

    Achille Roussel committed Dec 31, 2021
    c214ab2
  16. don't precompute the number of rows in merged row groups

    Achille Roussel committed Dec 31, 2021
    7b9412a
  17. expose column index on row group columns

    Achille Roussel committed Dec 31, 2021
    e651579
  18. sort row group sorting columns

    Achille Roussel committed Dec 31, 2021
    e55dc83
  19. refactor parquet.Type.Less into parquet.Type.Compare

    Achille Roussel committed Dec 31, 2021
    cd863e2

Commits on Jan 2, 2022

  1. refactor writer

    Achille Roussel committed Jan 2, 2022
    0307f66
  2. optimize writes using pages

    Achille Roussel committed Jan 2, 2022
    736360f
  3. optimize value count of repeated pages

    Achille Roussel committed Jan 2, 2022
    bcc4220
  4. rename BufferColumn => ColumnBuffer

    Achille Roussel committed Jan 2, 2022
    925e6f8
  5. fix typos

    Achille Roussel committed Jan 2, 2022
    6df25e5
  6. refactor writer to support reading pages from arbitrary row group columns

    Achille Roussel committed Jan 2, 2022
    5e37223
  7. add columnPath type

    Achille Roussel committed Jan 2, 2022
    2ea6b89
  8. introduced BufferedPage/CompressedPage interfaces

    Achille Roussel committed Jan 2, 2022
    809f80d
  9. add tests for row group merging logic

    Achille Roussel committed Jan 2, 2022
    414461e

Commits on Jan 3, 2022

  1. 1866936
  2. remove unused code

    Achille Roussel committed Jan 3, 2022
    afa0d2e
  3. remove more unused code

    Achille Roussel committed Jan 3, 2022
    5e6d1d1
  4. capture buffer schema when creating reader of buffer rows

    Achille Roussel committed Jan 3, 2022
    48c5463
  5. 64a8652
  6. add more validations when writing row groups

    Achille Roussel committed Jan 3, 2022
    205f8be
  7. fix handling of sorting columns when merging row groups

    Achille Roussel committed Jan 3, 2022
    d6ffc4f
  8. move all error values to errors.go

    Achille Roussel committed Jan 3, 2022
    fe303a2
  9. make row group readers more generic

    Achille Roussel committed Jan 3, 2022
    0beac07

Commits on Jan 4, 2022

  1. 8b3cb88
  2. refactor readers to use file row group APIs

    Achille Roussel committed Jan 4, 2022
    bb35581
  3. rename RowGroupColumn to ColumnChunk

    Achille Roussel committed Jan 4, 2022
    f9c4b88
  4. make page readers reusable to address performance regression

    Achille Roussel committed Jan 4, 2022
    ff77f51
  5. support reusing inner fields of page header

    Achille Roussel committed Jan 4, 2022
    35a5b4d
  6. update segmentio/encoding

    Achille Roussel committed Jan 4, 2022
    3e4842f
  7. cleanups

    Achille Roussel committed Jan 4, 2022
    6911f3e

Commits on Jan 5, 2022

  1. PR feedback

    Achille Roussel committed Jan 5, 2022
    08ee163
  2. remove Pages method on parquet.RowGroup

    Achille Roussel committed Jan 5, 2022
    58a5d57
  3. PR feedback

    Achille Roussel committed Jan 5, 2022
    0327ad6
  4. rename: pageHeaderStatisticsOf => filePage.statistics

    Achille Roussel committed Jan 5, 2022
    df21f95
  5. bc88eb1
  6. expose page index on column chunks and re-enable the tests

    Achille Roussel committed Jan 5, 2022
    0f5e476
  7. remove column_chunk.go

    Achille Roussel committed Jan 5, 2022
    930c695
  8. fix page index generation when writing compressed pages

    Achille Roussel committed Jan 5, 2022
    1222779
  9. fix broken build

    Achille Roussel committed Jan 5, 2022
    88d16e7
  10. all: add documentation and links to additional documentation

    Co-authored-by: Achille <achille@segment.com>
    kevinburkesegment and Achille committed Jan 5, 2022
    b64505b

Commits on Jan 6, 2022

  1. always buffer column pages in parquet writer to avoid wrongly reordering pages when writing directly to the output file

    Achille Roussel committed Jan 6, 2022
    9fd4ac7
  2. simplify writer internals

    Achille Roussel committed Jan 6, 2022
    ab3fe9c
  3. Merge branch 'O11Y-224' of ssh://github.com/segmentio/parquet into O11Y-224

    Achille Roussel committed Jan 6, 2022
    3e26561
  4. 09be285
  5. cleanup

    Achille Roussel committed Jan 6, 2022
    835f7f3
  6. Update file.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 6, 2022
    f6a2d84
  7. Update row.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 6, 2022
    95c2f33
  8. Update row.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 6, 2022
    0a527ec
  9. Update value.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 6, 2022
    47d4ab0
  10. Update column_buffer.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 6, 2022
    fbe6d81
  11. PR feedback

    Achille Roussel committed Jan 6, 2022
    616933e

Commits on Jan 10, 2022

  1. wrap up

    Achille Roussel committed Jan 10, 2022
    46b4427
  2. Update buffer.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 10, 2022
    203b961
  3. Update buffer.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 10, 2022
    98650c0
  4. Update format/parquet.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 10, 2022
    acbab33
  5. Update page.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 10, 2022
    61e0490
  6. Update reader.go

    Co-authored-by: Kevin Burke <kevin.burke@segment.com>
    Achille and kevinburkesegment committed Jan 10, 2022
    f346f1f