excessive cpu load(Infinite) #324

Closed
tsabbay opened this issue Sep 15, 2021 · 3 comments · Fixed by #314
Labels
bug Something isn't working performance
Comments

tsabbay commented Sep 15, 2021

We noticed 100% CPU load (an apparent infinite loop) when the data contains a line-break character or similar newline codes.
Example code:

const { DOMParser } = require('xmldom')

let xmlDocument = new DOMParser().parseFromString(documentFile)

Attached is a document. Beware of parsing it, as it will choke your instance.
hell end line.docx

Happy to provide more info.

@karfau karfau added bug Something isn't working needs investigation Information is missing and needs to be researched help-wanted External contributions welcome performance labels Sep 15, 2021
Member

karfau commented Sep 15, 2021

Thank you for providing the test data. There are earlier similar reports but nobody was able to provide example data yet.

Do I understand correctly that the docx file should be read into a string and its content parsed? (I thought it was a zip file.)

Author

tsabbay commented Sep 15, 2021 via email

@karfau karfau linked a pull request Sep 16, 2021 that will close this issue
@karfau karfau removed help-wanted External contributions welcome needs investigation Information is missing and needs to be researched labels Sep 16, 2021
@karfau karfau added this to the 0.8.0 milestone Sep 16, 2021
Member

karfau commented Sep 16, 2021

Hey there. I looked into it.
First of all, as I suspected, the docx file is basically a zip file, so xmldom cannot "parse" it directly as your initial post suggested.
It contains the following structure:

├── [Content_Types].xml
├── docProps
│   ├── app.xml
│   └── core.xml
├── _rels
└── word
    ├── document.xml
    ├── fontTable.xml
    ├── _rels
    │   └── document.xml.rels
    ├── settings.xml
    ├── styles.xml
    ├── theme
    │   └── theme1.xml
    └── webSettings.xml

5 directories, 10 files
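
Since the docx is a zip container, the XML has to be pulled out of the archive before it can be handed to a parser. The following is a minimal sketch of that step using only Node's standard library. `readZipEntry` is a hypothetical helper, not part of xmldom, and it assumes a plain archive (no zip64, no encryption, stored or deflate-compressed entries only). A dedicated zip library would be more robust in practice.

```javascript
const zlib = require('zlib')

// Hypothetical helper (not an xmldom API): return one entry of a zip archive as a UTF-8 string.
// Assumes no zip64, no encryption, and compression method 0 (stored) or 8 (deflate).
function readZipEntry(zipBuffer, entryName) {
  // Find the end-of-central-directory record by scanning backwards for its signature.
  let eocd = -1
  for (let i = zipBuffer.length - 22; i >= 0; i--) {
    if (zipBuffer.readUInt32LE(i) === 0x06054b50) {
      eocd = i
      break
    }
  }
  if (eocd < 0) throw new Error('not a zip archive')

  const entryCount = zipBuffer.readUInt16LE(eocd + 10)
  let pos = zipBuffer.readUInt32LE(eocd + 16) // offset of the central directory

  for (let n = 0; n < entryCount; n++) {
    if (zipBuffer.readUInt32LE(pos) !== 0x02014b50) {
      throw new Error('corrupt central directory')
    }
    const method = zipBuffer.readUInt16LE(pos + 10)
    const compressedSize = zipBuffer.readUInt32LE(pos + 20)
    const nameLength = zipBuffer.readUInt16LE(pos + 28)
    const extraLength = zipBuffer.readUInt16LE(pos + 30)
    const commentLength = zipBuffer.readUInt16LE(pos + 32)
    const localOffset = zipBuffer.readUInt32LE(pos + 42)
    const name = zipBuffer.toString('utf8', pos + 46, pos + 46 + nameLength)

    if (name === entryName) {
      // The local header carries its own name/extra lengths, which can differ from the central ones.
      const localNameLength = zipBuffer.readUInt16LE(localOffset + 26)
      const localExtraLength = zipBuffer.readUInt16LE(localOffset + 28)
      const start = localOffset + 30 + localNameLength + localExtraLength
      const data = zipBuffer.subarray(start, start + compressedSize)
      if (method === 0) return data.toString('utf8')
      if (method === 8) return zlib.inflateRawSync(data).toString('utf8')
      throw new Error('unsupported compression method ' + method)
    }
    pos += 46 + nameLength + extraLength + commentLength
  }
  throw new Error('entry not found: ' + entryName)
}

// Usage sketch (file name taken from the report above):
// const xml = readZipEntry(require('fs').readFileSync('hell end line.docx'), 'word/document.xml')
// const doc = new DOMParser().parseFromString(xml, 'text/xml')
```

The returned string, rather than the raw docx bytes, is what would be passed to `parseFromString`.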

I searched the files for the character you mentioned, and only found it in word/document.xml.
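
A search like this can be sketched in a few lines of Node once the archive has been unpacked into a directory. `findFilesContaining` is a hypothetical helper, and the search string is left as a parameter.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: recursively list the files under rootDir whose text contains needle.
// Assumes the files are small enough to read fully and are UTF-8 encoded.
function findFilesContaining(rootDir, needle) {
  const hits = []
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name)
    if (entry.isDirectory()) {
      hits.push(...findFilesContaining(fullPath, needle))
    } else if (fs.readFileSync(fullPath, 'utf8').includes(needle)) {
      hits.push(fullPath)
    }
  }
  return hits
}
```

Calling it with the unpacked docx directory and the suspect character returns the paths of the matching files.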

The good news is that this is already fixed on master since we landed #314.
We are currently working on the last two issues for the 0.8.0 release, which will contain this bugfix. You can check on our progress in the related milestone.
