Describe the bug
As the title says, ParquetRecordBatchReader does not recognize the duration type written by pandas or polars; the column is read back as Int64.
To Reproduce
First, prepare a Parquet file with polars:
    import polars as pl
    from datetime import timedelta

    df = pl.DataFrame({
        "a": [timedelta(days=1) for _ in range(100)]
    })
    df.write_parquet("./test.parquet")
Then, read it in Rust with arrow-rs:
    fn main() -> Result<()> {
        // Create parquet file that will be read.
        let path = "./test.parquet";
        let file = File::open(path).unwrap();

        let parquet_reader = ParquetRecordBatchReaderBuilder::try_new(file)?
            .with_batch_size(8192)
            .build()?;

        let mut batches = Vec::new();
        for batch in parquet_reader {
            batches.push(batch?);
        }

        println!("{:#?}", batches[0].schema());
        Ok(())
    }
Finally, we get:
    Schema {
        fields: [
            Field {
                name: "a",
                data_type: Int64,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
                metadata: {},
            },
        ],
        metadata: {},
    }
Expected behavior
polars result:
    shape: (100, 1)
    ┌──────────────┐
    │ a            │
    │ ---          │
    │ duration[μs] │
    ╞══════════════╡
    │ 1d           │
    │ 1d           │
    │ 1d           │
    │ 1d           │
    │ 1d           │
    │ …            │
    │ 1d           │
    │ 1d           │
    │ 1d           │
    │ 1d           │
    │ 1d           │
    └──────────────┘
pandas result:
              a
    0    1 days
    1    1 days
    2    1 days
    3    1 days
    4    1 days
    ..      ...
    95   1 days
    96   1 days
    97   1 days
    98   1 days
    99   1 days

    [100 rows x 1 columns]
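For context on what the Int64 column likely holds: polars reports the column as duration[μs], so each raw value is presumably a count of microseconds. The sketch below is a std-only illustration under that assumption and is not part of arrow-rs; `micros_to_duration` is a hypothetical helper name introduced here.

```rust
use std::time::Duration;

/// Hypothetical helper: interpret a raw Int64 value as a count of
/// microseconds (the unit is assumed from polars' `duration[μs]` output).
/// Returns `None` for negative values, which `std::time::Duration`
/// cannot represent.
fn micros_to_duration(raw: i64) -> Option<Duration> {
    u64::try_from(raw).ok().map(Duration::from_micros)
}

fn main() {
    // timedelta(days=1) expressed in microseconds (assumed unit).
    let raw: i64 = 86_400_000_000;
    let d = micros_to_duration(raw).expect("non-negative duration");
    println!("{} s", d.as_secs());
}
```

This only shows how the raw values can be interpreted after reading; it does not restore the duration type in the schema that the reader returns.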
Additional context