Preview of first row fails with "UndetectableDelimiter" error if file is too large #1014

Open
briavicenti opened this issue Jul 24, 2023 · 1 comment


@briavicenti

I'm using the preview option when parsing to validate that the header row of a file matches a set of expected headers before attempting to upload:

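    Papa.parse(file, {
      preview: 1,
      complete: (results) => {
        // With preview: 1, the only parsed row is the header row.
        const headers = results.data[0];

        // Compare each parsed header with the expected one at the same index.
        const doHeadersMatch = headers?.every(
          (header, idx) => header === EXPECTED_HEADERS[idx]
        );
      },
    });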

I have 3 files that I am using to test this feature: 100MB, 300MB, and 700MB in size. All 3 have identical headers, since I created the 100MB and 300MB versions by deleting rows from the original 700MB test file. Even so, parsing the 700MB version gives me an empty data field and the following error, while the other two parse fine:

{
    "type": "Delimiter",
    "code": "UndetectableDelimiter",
    "message": "Unable to auto-detect delimiting character; defaulted to ','"
}

Specifying the delimiter and newline options when parsing does not get rid of this error. The error appears after a few seconds' delay, which surprised me anyway as I'm just trying to preview a single line. Would really appreciate any help or suggested workarounds here!
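
A side note on the header check above: `every` only visits the columns that are actually present, so an empty row or a row with fewer columns than expected still passes. Below is a minimal sketch of a stricter comparison. `headersMatch` and the sample values in `EXPECTED_HEADERS` are hypothetical, chosen for illustration only, and are not part of Papa Parse.

```typescript
// Hypothetical expected header row; replace with the real column names.
const EXPECTED_HEADERS: readonly string[] = ["id", "name", "email"];

/**
 * Returns true only when `row` has exactly the expected columns, in order.
 * A missing row (undefined) or a row of a different length fails the check.
 */
function headersMatch(
  row: readonly string[] | undefined,
  expected: readonly string[]
): boolean {
  if (!row || row.length !== expected.length) {
    return false;
  }
  return row.every((header, idx) => header === expected[idx]);
}

console.log(headersMatch(["id", "name", "email"], EXPECTED_HEADERS)); // true
console.log(headersMatch(["id", "name"], EXPECTED_HEADERS)); // false (too short)
console.log(headersMatch(undefined, EXPECTED_HEADERS)); // false (no row parsed)
```

Inside the `complete` callback this would be called as `headersMatch(results.data[0], EXPECTED_HEADERS)`, which also returns `false` when `data` comes back empty, as in the failing 700MB case.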

@dasveloper

dasveloper commented Oct 10, 2023

If you use the "streaming" setup (a step callback), it lets you handle much larger files.

Here is how it's normally used:

const results: string[][] = [];
Papa.parse(file, {
  preview: 100000,
  worker: true,
  step: (result) => {
    // Called once for every row parsed from the file.
    if (result.data) {
      results.push(result.data);
    }
  },
  complete: () => {
    // All rows have been collected by the step callback.
    console.log(results);
  },
});

But since you're only reading one row you can simplify to this:

Papa.parse(file, {
  preview: 1,
  worker: true,
  step: (result) => {
    // First and only row
    console.log(result.data);
  },
});
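
Another possible workaround is to read only the start of the file with the standard `Blob.slice` and `Blob.text` methods and hand that text to the parser. The sketch below assumes that the header row fits within the first 64 KB and contains no quoted line breaks. `readFirstLine` and `maxBytes` are hypothetical names, not part of Papa Parse.

```typescript
/**
 * Reads at most `maxBytes` from the start of a Blob (a File is a Blob) and
 * returns the text before the first line break. If no line break is found
 * within that range, the whole slice is returned.
 */
async function readFirstLine(
  file: Blob,
  maxBytes: number = 64 * 1024
): Promise<string> {
  // slice() does not copy the file; only the requested bytes are read.
  const head = await file.slice(0, maxBytes).text();
  const end = head.search(/\r\n|\n|\r/);
  return end === -1 ? head : head.slice(0, end);
}

// Small in-memory stand-in for a user-selected file.
const sample = new Blob(["id,name,email\n1,Ada,ada@example.com\n"]);
readFirstLine(sample).then((line) => console.log(line)); // prints "id,name,email"
```

The returned string can then be passed to `Papa.parse(line, { preview: 1 })`. Because only the first slice is ever read, the size of the rest of the file should not matter.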
