Default record_delimiter handled differently for utf8 and utf16le #365

Closed
bergos opened this issue Oct 11, 2022 · 1 comment


bergos commented Oct 11, 2022

There seems to be a problem with the default record_delimiter when the input is utf16le encoded. The example below produces different output for utf8 and utf16le: with utf16le, the first field of the second record comes back as '\n1' instead of '1'. If record_delimiter is given explicitly, the result is correct.

const { Readable } = require('stream')
const { Parser } = require('csv-parse')

async function parse (bom, content, record_delimiter) {
  const input = Readable.from(Buffer.concat([bom, content]))
  const parser = new Parser({ bom: true, record_delimiter })

  input.pipe(parser)

  for await (const line of parser) {
    console.log(line)
  }
}

async function main () {
  const lines = ['a,b,c', '1,2,3'].join('\r\n')

  console.log('utf8')
  await parse(Buffer.from([0xef,0xbb,0xbf]), Buffer.from(lines, 'utf8'))

  console.log('utf16le')
  await parse(Buffer.from([0xff, 0xfe]), Buffer.from(lines, 'utf16le'))

  console.log('utf16le \\r\\n')
  await parse(Buffer.from([0xff, 0xfe]), Buffer.from(lines, 'utf16le'), ['\r\n'])
}

main()

/* output:

utf8
[ 'a', 'b', 'c' ]
[ '1', '2', '3' ]
utf16le
[ 'a', 'b', 'c' ]
[ '\n1', '2', '3' ]
utf16le \r\n
[ 'a', 'b', 'c' ]
[ '1', '2', '3' ]

*/
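
For context, the byte-level difference between the two encodings can be seen without csv-parse. The sketch below uses only Node's Buffer API; the helper name `delimiterBytes` is hypothetical, and the snippet illustrates the encodings themselves, not the parser's internals.

```javascript
// Hypothetical helper (not part of csv-parse): list the bytes that a
// delimiter string occupies in a given encoding.
function delimiterBytes (delimiter, encoding) {
  return [...Buffer.from(delimiter, encoding)]
}

// In utf8, '\r\n' is two bytes.
const utf8 = delimiterBytes('\r\n', 'utf8')

// In utf16le, each of '\r' and '\n' takes two bytes (low byte first).
const utf16le = delimiterBytes('\r\n', 'utf16le')

console.log(utf8)    // [ 13, 10 ]
console.log(utf16le) // [ 13, 0, 10, 0 ]
```

If the default delimiter were detected as '\r' alone on utf16le input, the '\n' would stay at the start of the next field, which would match the '\n1' in the output above. That reading is an assumption based on the output, not something confirmed in this thread.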

wdavidw commented Oct 12, 2022

Thank you for reporting. Version 5.3.1 of csv-parse fixes the issue.
