
question: Does csview fix the stream rendering? #61

Closed · 6 tasks done

zhiburt opened this issue May 31, 2022 · 7 comments

Comments

@zhiburt
Contributor

zhiburt commented May 31, 2022

Hi @wfxr,

I bumped into csview via

phsym/prettytable-rs#104
#19

I think it's a great idea not to allocate one huge chunk of memory but rather to do things lazily.

I was thinking about how to do it myself in zhiburt/tabled#123.
My idea was to require Clone for the iterator, so one pass computes the dimensions and a second pass prints.
But that seems too restrictive to me.
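To make the idea concrete, here is a minimal sketch of that two-pass approach (my own illustration with made-up names, not tabled's actual code): the iterator must be Clone, the first pass measures column widths, and the second pass prints.

```rust
// Hypothetical sketch of the two-pass idea (my names, not tabled's API):
// require Clone on the iterator, measure column widths on the first pass,
// then print on the second pass.
fn compute_widths<I>(rows: I) -> Vec<usize>
where
    I: Iterator<Item = Vec<String>>,
{
    let mut widths: Vec<usize> = Vec::new();
    for row in rows {
        for (i, cell) in row.iter().enumerate() {
            if i >= widths.len() {
                widths.push(0);
            }
            widths[i] = widths[i].max(cell.len());
        }
    }
    widths
}

fn print_table<I>(rows: I)
where
    I: Iterator<Item = Vec<String>> + Clone,
{
    let widths = compute_widths(rows.clone()); // pass 1: dimensions
    for row in rows {
        // pass 2: print each cell left-padded to its column width
        let cells: Vec<String> = row
            .iter()
            .zip(&widths)
            .map(|(cell, &w)| format!("{cell:<w$}"))
            .collect();
        println!("| {} |", cells.join(" | "));
    }
}

fn main() {
    let rows = vec![
        vec!["name".to_string(), "lang".to_string()],
        vec!["csview".to_string(), "rust".to_string()],
    ];
    print_table(rows.into_iter());
}
```

The restriction mentioned above is visible in the signature: not every row source can implement Clone cheaply (e.g. an iterator reading from stdin cannot).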

So I looked at your ideas.

And I just wonder if you've fixed the original issue.

Specifically, my concern is about this Vec allocation:

let (widths, buf) = sniff_widths(&mut rdr, header.as_ref(), sniff_rows)?;

let mut buf = Vec::new();

Am I wrong that we will essentially allocate n * m memory anyway here?

Take care.
Thank you for the cross-issue references :)

Checklist

  • I have read through the README
  • I have searched through the existing issues

Environment info

  • OS
    • Linux
    • Mac OS X
    • Windows
    • Others:

Version

Problem / Steps to reproduce

@wfxr
Owner

wfxr commented May 31, 2022

Hi @zhiburt, sniff_rows is used to detect a reasonable width for each column. It is set by the CLI option --sniff (default 1000). And you are right, we still need to pre-allocate some memory to store the rows during sniffing because we cannot read them twice. But the memory usage no longer grows once sniffing has finished. It's not a big deal since we only need to store a constant number of rows.

The benchmarks section shows the memory usage in some common scenarios:
https://github.com/wfxr/csview#benchmark
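The mechanism described above can be sketched roughly like this (an illustration under my own assumptions, not csview's actual implementation): buffer at most `sniff` rows to estimate column widths, then let the caller stream the remaining rows unbuffered, so memory use stays bounded.

```rust
// Illustrative sketch of the sniffing idea (not csview's actual code):
// buffer at most `sniff` rows to estimate column widths; the caller then
// streams the remaining rows unbuffered, so memory use stays bounded.
fn sniff_widths<I>(rows: &mut I, sniff: usize) -> (Vec<usize>, Vec<Vec<String>>)
where
    I: Iterator<Item = Vec<String>>,
{
    let mut widths: Vec<usize> = Vec::new();
    let mut buf: Vec<Vec<String>> = Vec::new();
    for row in rows.take(sniff) {
        for (i, cell) in row.iter().enumerate() {
            if i >= widths.len() {
                widths.push(0);
            }
            widths[i] = widths[i].max(cell.len());
        }
        buf.push(row); // only the sniffed rows are ever held in memory
    }
    (widths, buf)
}

fn main() {
    let mut rows = (1..=5).map(|n| vec![format!("row{n}")]);
    let (widths, buf) = sniff_widths(&mut rows, 3);
    // print the buffered rows first, then stream the rest with fixed widths
    for row in buf.into_iter().chain(rows) {
        println!("{:<w$}", row[0], w = widths[0]);
    }
}
```

Note that after sniffing, `widths` is frozen: later rows are rendered with the already-decided widths, which is exactly why memory stops growing after the first `sniff` rows.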

@wfxr
Owner

wfxr commented May 31, 2022

let (widths, buf) = sniff_widths(&mut rdr, header.as_ref(), sniff_rows)?;

let mut buf = Vec::new();

Am I wrong that we will essentially allocate n * m memory anyway here?

buf here is to store the sniffed rows, not all rows from the input.

@zhiburt
Contributor Author

zhiburt commented May 31, 2022

Hi @zhiburt, sniff_rows is used to detect a reasonable width for each column. It is set by the CLI option --sniff (default 1000). And you are right, we still need to pre-allocate some memory to store the rows during sniffing because we cannot read them twice. But the memory usage no longer grows once sniffing has finished. It's not a big deal since we only need to store a constant number of rows.

buf here is to store the sniffed rows, not all rows from the input.

Understood

But when the --sniff argument is not provided, will we essentially allocate memory for all rows?

PS: Correct me if I am wrong, but if --sniff was 1000 and row 1001 is "very big" compared to the ones in the first 1000-row chunk, will the displayed table be broken?

@zhiburt
Contributor Author

zhiburt commented May 31, 2022

As I understand it, it's assumed that --sniff is always set to some value, right?

let widths = &self.widths;

csview/src/cli.rs

Lines 41 to 42 in 5a1e6d9

#[clap(long, default_value_t = 1000, name = "LIMIT")]
pub sniff: usize,

@wfxr
Owner

wfxr commented May 31, 2022

As I understand it, it's assumed that --sniff is always set to some value, right?

@zhiburt Right. If --sniff is not set by the user, it takes the default value of 1000. You can also set it to 0, which disables the sniffing mechanism; in that case csview loads all rows into memory to calculate the exact width of each column before printing.

@wfxr
Owner

wfxr commented May 31, 2022

PS: Correct me if I am wrong, but if --sniff was 1000 and row 1001 is "very big" compared to the ones in the first 1000-row chunk, will the displayed table be broken?

In this case the "very big" field will be truncated:

.map(|(cell, &width)| cell.unicode_pad(width, Alignment::Left, true))
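For intuition, here is a hand-rolled stand-in for that padding call (a naive sketch that counts chars rather than Unicode display width, unlike the crate csview actually uses): cells shorter than the sniffed width are padded, and anything longer is truncated.

```rust
// Illustrative stand-in for the unicode-aware padding call (this naive
// version counts chars, not display width): pad short cells to `width`
// and truncate anything longer, as would happen to a too-big row 1001.
fn pad_or_truncate(cell: &str, width: usize) -> String {
    let truncated: String = cell.chars().take(width).collect();
    format!("{truncated:<width$}")
}

fn main() {
    println!("[{}]", pad_or_truncate("ok", 5));
    println!("[{}]", pad_or_truncate("very big field", 5));
}
```

So the table layout itself stays intact; only the oversized field loses its tail.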

@zhiburt
Contributor Author

zhiburt commented May 31, 2022

I like your approach 👍

Thank you for the clarification.
Sorry for bothering you.

And once again take care.
