How to use chardet in this Python code to read ANSI-encoded files? #277

me-suzy · 2023-03-26T10:35:47Z

Traceback (most recent call last):
  File "D:\Convert docx to pdf.py", line 32, in <module>
    file_content = file_path.read_text(encoding='UTF-8')
  File "C:\Program Files\Python39\lib\pathlib.py", line 1133, in read_text
    return f.read()
  File "C:\Program Files\Python39\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 16: invalid continuation byte

This is the code, which I found on the web. It converts all .docx files into PDF files.

import re
import os
from pathlib import Path
from docx import Document
from docx.shared import Inches
import sys
import chardet
from docx2pdf import convert

# The location where the files are located
input_path = r'c:\Folder7\input'
# The location where we will write the PDF files
output_path = r'c:\Folder7\output'
# Create the folder structure if it does not exist
os.makedirs(output_path, exist_ok=True)

# Check that the folder exists
directory_path = Path(input_path)
if directory_path.exists() and directory_path.is_dir():
    print(directory_path, "exists")
else:
    print(directory_path, "is invalid")
    sys.exit(1)

for file_path in directory_path.glob("*"):
    # file_path is a Path object

    print("Processing file:", file_path)
    document = Document()
    # file_path.name is the name of the file as str without the Path
    document.add_heading(file_path.name, 0)

    file_content = file_path.read_text(encoding='UTF-8')
    document.add_paragraph(file_content)

    # build the new path where we store the files
    output_file_path = os.path.join(output_path, file_path.name + ".pdf")

    document.save(output_file_path)
    print("Converted file:", file_path, "to:", output_file_path)

Can anyone update the code so that it reads all ANSI-encoded files with chardet and converts them to UTF-8?
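
One possible approach, as a minimal sketch rather than a definitive fix: read the raw bytes, try UTF-8 first, and only ask chardet for a guess when that fails. The helper name `read_text_any_encoding` and the `fallback="cp1252"` default are assumptions made for illustration. "ANSI" on Windows means the system code page, so for Romanian text `cp1250` may be the better fallback. chardet only guesses, and short files can be misdetected.

```python
from pathlib import Path

try:
    import chardet  # third-party package: pip install chardet
except ImportError:
    chardet = None  # the sketch still works, using only the fallback encoding


def read_text_any_encoding(path, fallback="cp1252"):
    """Return the text of `path`, guessing the encoding when it is not UTF-8.

    Hypothetical helper; `fallback` is an assumed default code page.
    """
    raw = Path(path).read_bytes()

    # Plain ASCII and UTF-8 files decode here without any guessing.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # chardet.detect returns a dict; its "encoding" entry can be None.
    guess = chardet.detect(raw)["encoding"] if chardet else None

    try:
        return raw.decode(guess or fallback)
    except (UnicodeDecodeError, LookupError):
        # Last resort: never raise, mark undecodable bytes with U+FFFD.
        return raw.decode(fallback, errors="replace")
```

In the loop above, `file_path.read_text(encoding='UTF-8')` would then become `read_text_any_encoding(file_path)`. The returned value is an ordinary Python `str`, so no separate conversion to UTF-8 is needed before passing it to `document.add_paragraph`.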
