Commit

Add encoding argument for directives (#119)
* Add `encoding` argument for directives
mondeja committed Sep 7, 2022
1 parent 9dc4eb8 commit 31be084
Showing 18 changed files with 276 additions and 65 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 3.6.1
+current_version = 3.7.0

[bumpversion:file:mkdocs_include_markdown_plugin/__init__.py]

3 changes: 3 additions & 0 deletions .gitignore
@@ -120,6 +120,9 @@ venv.bak/
# Rope project settings
.ropeproject

+# VS Code
+.vscode
+
# mkdocs documentation
/site

8 changes: 5 additions & 3 deletions .pre-commit-config.yaml
@@ -22,11 +22,13 @@ repos:
args:
- --py36-plus
- repo: https://github.com/asottile/setup-cfg-fmt
-rev: v1.20.2
+rev: v2.0.0
hooks:
- id: setup-cfg-fmt
+args:
+- --include-version-classifiers
- repo: https://github.com/PyCQA/flake8
-rev: 4.0.1
+rev: 5.0.4
hooks:
- id: flake8
additional_dependencies:
@@ -71,7 +73,7 @@ repos:
- -c
- .yamllint
- repo: https://github.com/DavidAnson/markdownlint-cli2
-rev: v0.5.0
+rev: v0.5.1
hooks:
- id: markdownlint-cli2
name: markdownlint-readme
18 changes: 12 additions & 6 deletions README.md
@@ -83,6 +83,15 @@ content to include.
`true` and `false`.
- <a name="include-markdown_dedent" href="#include-markdown_dedent">#</a>
**dedent** (*false*): If enabled, the included content will be dedented.
+- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a>
+**exclude**: Specify with a glob which files should be ignored. Only useful
+when passing globs to include multiple files.
+- <a name="include-markdown_trailing-newlines" href="#include-markdown_trailing-newlines">#</a>
+**trailing-newlines** (*true*): When this option is disabled, the trailing newlines
+found in the content to include are stripped. Possible values are `true` and `false`.
+- <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a>
+**encoding** (*utf-8*): Specify the encoding of the included file.
+If not defined `utf-8` will be used.
- <a name="include-markdown_rewrite-relative-urls" href="#include-markdown_rewrite-relative-urls">#</a>
**rewrite-relative-urls** (*true*): When this option is enabled (default),
Markdown links and images in the content that are specified by a relative URL
@@ -97,12 +106,6 @@ content to include.
**heading-offset** (0): Increases or decreases the Markdown headings depth
by this number. Only supports number sign (`#`) heading syntax. Accepts
negative values to drop leading `#` characters.
-- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a>
-**exclude**: Specify with a glob which files should be ignored. Only useful
-when passing globs to include multiple files.
-- <a name="include-markdown_trailing-newlines" href="#include-markdown_trailing-newlines">#</a>
-**trailing-newlines** (*true*): When this option is disabled, the trailing newlines
-found in the content to include are stripped. Possible values are `true` and `false`.

##### Examples

@@ -166,6 +169,9 @@ Includes the content of a file or a group of files.
- <a name="include_trailing-newlines" href="#include_trailing-newlines">#</a>
**trailing-newlines** (*true*): When this option is disabled, the trailing newlines
found in the content to include are stripped. Possible values are `true` and `false`.
+- <a name="include_encoding" href="#include_encoding">#</a>
+**encoding** (*utf-8*): Specify the encoding of the included file.
+If not defined `utf-8` will be used.

##### Examples

22 changes: 14 additions & 8 deletions locale/es/README.md
@@ -78,6 +78,17 @@ indentar la plantilla `{% %}` incluidora. Los valores posibles son `true` y
`false`.
- <a name="include-markdown_dedent" href="#include-markdown_dedent">#</a> **dedent**
(*false*): Si se habilita, el contenido incluido será dedentado.
+- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a> **exclude**:
+Expecifica mediante un glob los archivos que deben ser ignorados. Sólo es útil
+pasando globs para incluir múltiples archivos.
+- <a name="include-markdown_trailing-newlines"
+href="#include-markdown_trailing-newlines">#</a> **trailing-newlines** (*true*):
+Cuando esta opción está deshabilitada, los saltos de línea finales que se
+encuentran en el contenido a incluir se eliminan. Los valores posibles son
+`true` y `false`.
+- <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a> **encoding**
+(*utf-8*): Especifica la codificación del archivo incluído. Si no se define, se
+usará `utf-8`.
- <a name="include-markdown_rewrite-relative-urls"
href="#include-markdown_rewrite-relative-urls">#</a> **rewrite-relative-urls** (*true*):
Cuando esta opción está habilitada (por defecto), los enlaces e imágenes
@@ -94,14 +105,6 @@ href="#include-markdown_heading-offset">#</a> **heading-offset** (0): Incrementa
o disminuye la profundidad de encabezados Markdown por el número especificado.
Sólo soporta la sintaxis de encabezado de caracteres de hash (`#`). Acepta
valores negativos para eliminar caracteres `#` a la izquierda.
-- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a> **exclude**:
-Expecifica mediante un glob los archivos que deben ser ignorados. Sólo es útil
-pasando globs para incluir múltiples archivos.
-- <a name="include-markdown_trailing-newlines"
-href="#include-markdown_trailing-newlines">#</a> **trailing-newlines** (*true*):
-Cuando esta opción está deshabilitada, los saltos de línea finales que se
-encuentran en el contenido a incluir se eliminan. Los valores posibles son
-`true` y `false`.

##### Ejemplos

@@ -165,6 +168,9 @@ pasando globs para incluir múltiples archivos.
(*true*): Cuando esta opción está deshabilitada, los saltos de línea finales que
se encuentran en el contenido a incluir se eliminan. Los valores posibles son
`true` y `false`.
+- <a name="include_encoding" href="#include_encoding">#</a> **encoding** (*utf-8*):
+Especifica la codificación del archivo incluído. Si no se define, se usará
+`utf-8`.

##### Ejemplos

18 changes: 18 additions & 0 deletions locale/es/README.md.po
@@ -315,3 +315,21 @@ msgstr ""
"Las etiquetas de apertura y cierre por defecto son `{%` y `%}`. Se puede "
"cambiar este valor por defecto con los campos de configuración `opening_tag`"
" y `closing_tag`."

msgid ""
"<a name=\"include-markdown_encoding\" href=\"#include-"
"markdown_encoding\">#</a> **encoding** (*utf-8*): Specify the encoding of "
"the included file. If not defined `utf-8` will be used."
msgstr ""
"<a name=\"include-markdown_encoding\" href=\"#include-"
"markdown_encoding\">#</a> **encoding** (*utf-8*): Especifica la codificación"
" del archivo incluído. Si no se define, se usará `utf-8`."

msgid ""
"<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
"(*utf-8*): Specify the encoding of the included file. If not defined `utf-8`"
" will be used."
msgstr ""
"<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
"(*utf-8*): Especifica la codificación del archivo incluído. Si no se define,"
" se usará `utf-8`."
22 changes: 14 additions & 8 deletions locale/fr/README.md
@@ -77,6 +77,17 @@ href="#include-markdown_preserve-includer-indent">#</a> **preserve-includer-inde
l'incluseur modèle `{% %}`. Les valeurs possibles sont `true` et `false`.
- <a name="include-markdown_dedent" href="#include-markdown_dedent">#</a> **dedent**
(*false*): Lorsque est activée, le contenu inclus sera déchiqueté.
+- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a> **exclude**:
+Spécifiez avec un glob quels fichiers doivent être ignorés. Uniquement utile
+lors du passage de globs pour inclure plusieurs fichiers.
+- <a name="include-markdown_trailing-newlines"
+href="#include-markdown_trailing-newlines">#</a> **trailing-newlines** (*true*):
+Lorsque cette option est désactivée, les nouvelles lignes de fin trouvées dans
+le contenu à inclure sont supprimées. Les valeurs possibles sont `true` et
+`false`.
+- <a name="include-markdown_encoding" href="#include-markdown_encoding">#</a> **encoding**
+(*utf-8*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini,
+`utf-8` sera utilisé.
- <a name="include-markdown_rewrite-relative-urls"
href="#include-markdown_rewrite-relative-urls">#</a> **rewrite-relative-urls** (*true*):
Lorsque cette option est activée (par défaut), liens et images Markdown dans le
@@ -93,14 +104,6 @@ href="#include-markdown_heading-offset">#</a> **heading-offset** (0): Augmente
ou diminue la profondeur des en-têtes Markdown de ce nombre. Ne prend en charge
que la syntaxe d'en-tête du signe dièse (`#`). Cet argument accepte les valeurs
négatives pour supprimer les caractères `#` de tête.
-- <a name="include-markdown_exclude" href="#include-markdown_exclude">#</a> **exclude**:
-Spécifiez avec un glob quels fichiers doivent être ignorés. Uniquement utile
-lors du passage de globs pour inclure plusieurs fichiers.
-- <a name="include-markdown_trailing-newlines"
-href="#include-markdown_trailing-newlines">#</a> **trailing-newlines** (*true*):
-Lorsque cette option est désactivée, les nouvelles lignes de fin trouvées dans
-le contenu à inclure sont supprimées. Les valeurs possibles sont `true` et
-`false`.

##### Exemples

@@ -164,6 +167,9 @@ passage de globs pour inclure plusieurs fichiers.
(*true*): Lorsque cette option est désactivée, les nouvelles lignes de fin
trouvées dans le contenu à inclure sont supprimées. Les valeurs possibles sont
`true` et `false`.
+- <a name="include_encoding" href="#include_encoding">#</a> **encoding** (*utf-8*):
+Spécifiez l'encodage du fichier inclus. S'il n'est pas défini, `utf-8` sera
+utilisé.

##### Exemples

18 changes: 18 additions & 0 deletions locale/fr/README.md.po
@@ -315,3 +315,21 @@ msgstr ""
"Les balises d'ouverture et de fermeture par défaut sont `{%` et `%}`. Vous "
"pouvez changer ces balises avec les paramètres de configuration "
"`opening_tag` et `closing_tag`:"

msgid ""
"<a name=\"include-markdown_encoding\" href=\"#include-"
"markdown_encoding\">#</a> **encoding** (*utf-8*): Specify the encoding of "
"the included file. If not defined `utf-8` will be used."
msgstr ""
"<a name=\"include-markdown_encoding\" href=\"#include-"
"markdown_encoding\">#</a> **encoding** (*utf-8*): Spécifiez l'encodage du "
"fichier inclus. S'il n'est pas défini, `utf-8` sera utilisé."

msgid ""
"<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
"(*utf-8*): Specify the encoding of the included file. If not defined `utf-8`"
" will be used."
msgstr ""
"<a name=\"include_encoding\" href=\"#include_encoding\">#</a> **encoding** "
"(*utf-8*): Spécifiez l'encodage du fichier inclus. S'il n'est pas défini, "
"`utf-8` sera utilisé."
2 changes: 1 addition & 1 deletion mkdocs_include_markdown_plugin/__init__.py
@@ -1,2 +1,2 @@
__title__ = 'mkdocs_include_markdown_plugin'
-__version__ = '3.6.1'
+__version__ = '3.7.0'
91 changes: 63 additions & 28 deletions mkdocs_include_markdown_plugin/event.py
@@ -48,33 +48,27 @@
     flags=INCLUDE_TAG_REGEX.flags,
 )

+str_arg = lambda arg: re.compile(
+    rf'{arg}=(?:"({DOUBLE_QUOTED_STR_ARGUMENT_PATTERN})")?'
+    rf"(?:'({SINGLE_QUOTED_STR_ARGUMENT_PATTERN})')?",
+)

+bool_arg = lambda arg: re.compile(
+    rf'{arg}=({BOOL_ARGUMENT_PATTERN})',
+)

 ARGUMENT_REGEXES = {
     # str
-    'start': re.compile(
-        rf'start=(?:"({DOUBLE_QUOTED_STR_ARGUMENT_PATTERN})")?'
-        rf"(?:'({SINGLE_QUOTED_STR_ARGUMENT_PATTERN})')?",
-    ),
-    'end': re.compile(
-        rf'end=(?:"({DOUBLE_QUOTED_STR_ARGUMENT_PATTERN})")?'
-        rf"(?:'({SINGLE_QUOTED_STR_ARGUMENT_PATTERN})')?",
-    ),
-    'exclude': re.compile(
-        rf'exclude=(?:"({DOUBLE_QUOTED_STR_ARGUMENT_PATTERN})")?'
-        rf"(?:'({SINGLE_QUOTED_STR_ARGUMENT_PATTERN})')?",
-    ),
+    'start': str_arg('start'),
+    'end': str_arg('end'),
+    'exclude': str_arg('exclude'),
+    'encoding': str_arg('encoding'),

     # bool
-    'rewrite-relative-urls': re.compile(
-        rf'rewrite-relative-urls=({BOOL_ARGUMENT_PATTERN})',
-    ),
-    'comments': re.compile(rf'comments=({BOOL_ARGUMENT_PATTERN})'),
-    'preserve-includer-indent': re.compile(
-        rf'preserve-includer-indent=({BOOL_ARGUMENT_PATTERN})',
-    ),
-    'dedent': re.compile(rf'dedent=({BOOL_ARGUMENT_PATTERN})'),
-    'trailing-newlines': re.compile(
-        rf'trailing-newlines=({BOOL_ARGUMENT_PATTERN})',
-    ),
+    'rewrite-relative-urls': bool_arg('rewrite-relative-urls'),
+    'comments': bool_arg('comments'),
+    'preserve-includer-indent': bool_arg('preserve-includer-indent'),
+    'dedent': bool_arg('dedent'),
+    'trailing-newlines': bool_arg('trailing-newlines'),

     # int
     'heading-offset': re.compile(r'heading-offset=(-?\d+)'),
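
The hunk above replaces the repeated `re.compile` calls with two small factories that build one compiled regex per argument name. The following is a minimal sketch of how such a factory could extract argument values. It assumes simplified stand-ins for the pattern constants: `DOUBLE_QUOTED`, `SINGLE_QUOTED`, and `BOOL` are hypothetical, since the real constants are defined outside this diff and may be stricter.

```python
import re

# Assumed, simplified stand-ins for the plugin's pattern constants.
DOUBLE_QUOTED = r'[^"]+'
SINGLE_QUOTED = r"[^']+"
BOOL = r'\w+'


def str_arg(arg):
    """Build a regex matching arg="value" or arg='value'."""
    return re.compile(
        rf'{arg}=(?:"({DOUBLE_QUOTED})")?'
        rf"(?:'({SINGLE_QUOTED})')?",
    )


def bool_arg(arg):
    """Build a regex matching arg=word."""
    return re.compile(rf'{arg}=({BOOL})')


ARGUMENT_REGEXES = {
    'encoding': str_arg('encoding'),
    'dedent': bool_arg('dedent'),
}

arguments = 'encoding="latin-1" dedent=true'
encoding_value = ARGUMENT_REGEXES['encoding'].search(arguments).group(1)
dedent_value = ARGUMENT_REGEXES['dedent'].search(arguments).group(1)

# Both quoted groups are optional, so a bare `encoding=` still matches
# with no captured value.
empty_match = ARGUMENT_REGEXES['encoding'].search('encoding=')
```

Because both quoted groups are optional, an empty argument matches without capturing anything, which is presumably the case the "Invalid empty 'encoding' argument" error later in this diff is meant to report.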
@@ -109,6 +103,11 @@ def lineno_from_content_start(content, start):
     return content[:start].count('\n') + 1


+def read_file(file_path, encoding):
+    with open(file_path, encoding=encoding) as f:
+        return f.read()
+
+
 def get_file_content(
     markdown,
     page_src_path,
@@ -259,11 +258,29 @@ def found_include_tag(match):
         else:
             end = None

+        encoding_match = re.search(
+            ARGUMENT_REGEXES['encoding'],
+            arguments_string,
+        )
+        if encoding_match:
+            encoding = parse_string_argument(encoding_match)
+            if encoding is None:
+                lineno = lineno_from_content_start(
+                    markdown,
+                    directive_match_start,
+                )
+                logger.error(
+                    "Invalid empty 'encoding' argument in 'include'"
+                    ' directive at '
+                    f'{os.path.relpath(page_src_path, docs_dir)}:{lineno}',
+                )
+        else:
+            encoding = 'utf-8'
+
         text_to_include = ''
         expected_but_any_found = [start is not None, end is not None]
         for file_path in file_paths_to_include:
-            with open(file_path, encoding='utf-8') as f:
-                new_text_to_include = f.read()
+            new_text_to_include = read_file(file_path, encoding)

             if start is not None or end is not None:
                 new_text_to_include, *expected_not_found = (
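
The hunk above passes the parsed `encoding` value to the new `read_file` helper instead of hard-coding `utf-8`. The sketch below illustrates why that matters for files that are not UTF-8. `read_file` mirrors the helper added in this diff; `resolve_encoding` is a hypothetical simplification of the argument handling, which in the diff logs an error for an empty value rather than silently falling back.

```python
import os
import tempfile


def read_file(file_path, encoding):
    """Read a whole text file using the given encoding."""
    with open(file_path, encoding=encoding) as f:
        return f.read()


def resolve_encoding(value):
    """Hypothetical helper: use utf-8 when no encoding is supplied."""
    return value if value else 'utf-8'


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'included.md')
    # Write bytes that are valid Latin-1 but not valid UTF-8.
    with open(path, 'wb') as f:
        f.write('café\n'.encode('latin-1'))

    decoded = read_file(path, resolve_encoding('latin-1'))

    # Reading the same bytes with the default encoding fails.
    try:
        read_file(path, resolve_encoding(None))
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True
```

With the previous hard-coded `utf-8`, a file like this could not be included at all; with the new argument the directive author can name the file's actual encoding.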
@@ -481,6 +498,25 @@ def found_include_markdown_tag(match):
         else:
             end = None

+        encoding_match = re.search(
+            ARGUMENT_REGEXES['encoding'],
+            arguments_string,
+        )
+        if encoding_match:
+            encoding = parse_string_argument(encoding_match)
+            if encoding is None:
+                lineno = lineno_from_content_start(
+                    markdown,
+                    directive_match_start,
+                )
+                logger.error(
+                    "Invalid empty 'encoding' argument in 'include-markdown'"
+                    ' directive at '
+                    f'{os.path.relpath(page_src_path, docs_dir)}:{lineno}',
+                )
+        else:
+            encoding = 'utf-8'
+
         # heading offset
         offset = 0
         offset_match = re.search(
@@ -499,8 +535,7 @@ def found_include_markdown_tag(match):
         # but they have been specified, so the warning(s) must be raised
         expected_but_any_found = [start is not None, end is not None]
         for file_path in file_paths_to_include:
-            with open(file_path, encoding='utf-8') as f:
-                new_text_to_include = f.read()
+            new_text_to_include = read_file(file_path, encoding)

             if start is not None or end is not None:
                 new_text_to_include, *expected_not_found = (
8 changes: 4 additions & 4 deletions mkdocs_include_markdown_plugin/process.py
@@ -107,8 +107,8 @@ def process_current_paragraph():
         if not _current_fcodeblock_delimiter and not _inside_icodeblock:
             lstripped_line = line.lstrip()
             if (
-                lstripped_line.startswith('```') or
-                lstripped_line.startswith('~~~')
+                lstripped_line.startswith('```')
+                or lstripped_line.startswith('~~~')
             ):
                 _current_fcodeblock_delimiter = lstripped_line[:3]
                 if current_paragraph:
@@ -157,8 +157,8 @@ def transform_line_by_line_skipping_codeblocks(markdown, func):
         if not _current_fcodeblock_delimiter:
             lstripped_line = line.lstrip()
             if (
-                lstripped_line.startswith('```') or
-                lstripped_line.startswith('~~~')
+                lstripped_line.startswith('```')
+                or lstripped_line.startswith('~~~')
             ):
                 _current_fcodeblock_delimiter = lstripped_line[:3]
             else:
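
The two `process.py` hunks only move the `or` operator to the start of the continuation line; the fence detection itself is unchanged. For context, here is a minimal sketch of the kind of fence tracking this fragment appears to belong to. `transform_skipping_fences` is a hypothetical function written for illustration and is not the plugin's implementation, whose full body is not shown in this diff.

```python
def transform_skipping_fences(markdown, func):
    """Apply func to every line that is outside a fenced code block."""
    delimiter = None  # the fence marker currently open, if any
    lines = []
    for line in markdown.splitlines(keepends=True):
        lstripped = line.lstrip()
        if delimiter is None:
            if (
                lstripped.startswith('```')
                or lstripped.startswith('~~~')
            ):
                # Opening fence: remember its three-character marker.
                delimiter = lstripped[:3]
            else:
                line = func(line)
        elif lstripped.startswith(delimiter):
            # Closing fence matching the one that opened the block.
            delimiter = None
        lines.append(line)
    return ''.join(lines)


source = 'title\n```\ncode\n```\ntail\n'
result = transform_skipping_fences(source, str.upper)
```

Only `title` and `tail` are transformed here; the fence lines and the code between them pass through untouched.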