Tokenizer #1637

Merged
merged 15 commits on Apr 16, 2020
79 changes: 76 additions & 3 deletions docs/USING_PRO.md
@@ -16,7 +16,7 @@ const marked = require('marked');
const renderer = new marked.Renderer();

// Override function
renderer.heading = function(text, level) {
  const escapedText = text.toLowerCase().replace(/[^\w]+/g, '-');

  return `
@@ -58,7 +58,7 @@ console.log(marked('# heading+', { renderer }));
- tablerow(*string* content)
- tablecell(*string* content, *object* flags)

`slugger` has the `slug` method to create a unique id from a value:

```js
slugger.slug('foo') // foo
```

@@ -89,9 +89,82 @@ slugger.slug('foo-1') // foo-1-2
- image(*string* href, *string* title, *string* text)
- text(*string* text)

<h2 id="tokenizer">The tokenizer</h2>

The tokenizer defines how to turn markdown text into tokens.

**Example:** Overriding the default `codespan` tokenizer to include LaTeX.

```js
// Create reference instance
const marked = require('marked');

// Get reference
const tokenizer = new marked.Tokenizer();
const originalCodespan = tokenizer.codespan;
// Override function
tokenizer.codespan = function(src) {
  const match = src.match(/^\$+([^\$\n]+?)\$+/);
  if (match) {
    return {
      type: 'codespan',
      raw: match[0],
      text: match[1].trim()
    };
  }
  return originalCodespan.apply(this, arguments);
};

// Run marked
console.log(marked('$ latex code $', { tokenizer }));
```

**Output:**

```html
<p><code>latex code</code></p>
```

### Block level tokenizer methods

- space(*string* src)
- code(*string* src, *array* tokens)
- fences(*string* src)
- heading(*string* src)
- nptable(*string* src)
- hr(*string* src)
- blockquote(*string* src)
- list(*string* src)
- html(*string* src)
- def(*string* src)
- table(*string* src)
- lheading(*string* src)
- paragraph(*string* src)
- text(*string* src)
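
Each of these follows the same contract as the `codespan` example above: given the remaining `src`, a tokenizer either returns a token object (`type`, `raw`, plus type-specific fields) or returns nothing so the next rule can try. A minimal standalone sketch of that contract (the `matchFences` helper here is illustrative, not part of marked):

```javascript
// Sketch of the block tokenizer contract: consume text from the
// start of `src` and return a token, or undefined to fall through
// to the next rule. `raw` must be exactly the text consumed.
function matchFences(src) {
  const match = src.match(/^```([^\n]*)\n([\s\S]*?)\n```/);
  if (match) {
    return {
      type: 'code',          // token type the parser dispatches on
      raw: match[0],         // exact source text consumed
      lang: match[1].trim(), // info string after the opening fence
      text: match[2]         // code body between the fences
    };
  }
  // implicit undefined: lets the lexer try the next block rule
}
```

Returning `raw` accurately matters: the lexer uses its length to advance through the source.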

### Inline level tokenizer methods

- escape(*string* src)
- tag(*string* src, *bool* inLink, *bool* inRawBlock)
- link(*string* src)
- reflink(*string* src, *object* links)
- strong(*string* src)
- em(*string* src)
- codespan(*string* src)
- br(*string* src)
- del(*string* src)
- autolink(*string* src)
- url(*string* src)
- inlineText(*string* src, *bool* inRawBlock)

### Other tokenizer methods

- smartypants(*string* text)
- mangle(*string* text)
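
These two run as post-processing on inline text: `smartypants` converts straight quotes and dashes to typographic forms, and `mangle` obfuscates e-mail addresses by turning characters into HTML entities. A simplified sketch of the mangling idea (marked's real version randomly mixes hex and decimal entities):

```javascript
// Sketch of e-mail mangling: replace each character with its decimal
// HTML entity so the address is unreadable to naive harvesters while
// browsers still render it normally.
function mangle(text) {
  let out = '';
  for (const ch of text) {
    out += '&#' + ch.codePointAt(0) + ';';
  }
  return out;
}
```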

<h2 id="lexer">The lexer</h2>

The lexer takes a markdown string and calls the tokenizer functions.
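
That loop can be sketched in plain JavaScript (the `lex` helper and the two rules below are illustrative stand-ins, not marked's actual internals):

```javascript
// Sketch of a lexer loop: each tokenizer inspects the start of `src`
// and either returns a token (with `raw` = the text it consumed) or
// undefined, in which case the next rule is tried.
function lex(src, rules) {
  const tokens = [];
  while (src) {
    let token;
    for (const rule of rules) {
      if ((token = rule(src))) break;
    }
    if (!token) throw new Error('no rule matched: ' + src.slice(0, 20));
    tokens.push(token);
    src = src.slice(token.raw.length); // advance past the consumed text
  }
  return tokens;
}

// Two toy rules: an ATX heading and a one-line paragraph fallback.
const heading = src => {
  const m = src.match(/^(#{1,6}) (.*)(?:\n|$)/);
  if (m) return { type: 'heading', raw: m[0], depth: m[1].length, text: m[2] };
};
const paragraph = src => {
  const m = src.match(/^[^\n]+(?:\n|$)/);
  if (m) return { type: 'paragraph', raw: m[0], text: m[0].trim() };
};
```

In marked itself, `marked.lexer(markdownString)` runs the real version of this loop and returns the resulting token array.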

<h2 id="parser">The parser</h2>

1 change: 1 addition & 0 deletions docs/index.html
@@ -155,6 +155,7 @@ <h1>Marked.js Documentation</h1>
<a href="#/USING_PRO.md">Extensibility</a>
<ul>
<li><a href="#/USING_PRO.md#renderer">Renderer</a></li>
<li><a href="#/USING_PRO.md#tokenizer">Tokenizer</a></li>
<li><a href="#/USING_PRO.md#lexer">Lexer</a></li>
<li><a href="#/USING_PRO.md#parser">Parser</a></li>
</ul>