Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate Heading IDs #1280

Closed
alystair opened this issue May 28, 2018 · 14 comments · Fixed by #1401
Closed

Duplicate Heading IDs #1280

alystair opened this issue May 28, 2018 · 14 comments · Fixed by #1401

Comments

@alystair
Copy link

alystair commented May 28, 2018

Describe the bug
In situations where a document has repeated headers, the ID is duplicated which is a big no-no for HTML validations (only one instance of an ID is allowed).

This is extremely common in changelogs where H3 headers are Added/Changed/Removed/Fixed etc. It flat out screws up accessibility as screen readers would likely ignore duplicate IDs and will cause formatting issues when an ID clashes with another outside the scope of the markdown render.

You cannot use IDs unless they are unique, so consider using "class" instead as a quick fix, and consider applying a configurable prefix to avoid clashing ('md-' perhaps).

To Reproduce
Steps to reproduce the behavior:
Create duplicate headers such as in this demo. Then copy rendered HTML into W3's HTML5 validator to see it fail validation

Expected behavior
No conflicts/render or screen reader issues.

@alystair alystair changed the title Duplicate IDs in HTML output (should be using class) HTML output fails validation due to duplicate IDs (should be using 'class') May 28, 2018
@styfle
Copy link
Member

styfle commented May 28, 2018

Hi @alystair thanks for the bug report!

You can disable emitting id for headings with an option like so:

var options = { headerIds: false };
var str = `# test heading
Some words
## Subheading
Blahblah
## Subheading`;
var html = marked(str, options);

Was this behavior working in a previous version of marked and broke in the latest release?

@alystair
Copy link
Author

I've only started using markedjs in the last month, but and I've always loaded it via your CDN link https://cdn.jsdelivr.net/npm/marked/marked.min.js

Definitely consider changing the default behaviour.

@UziTech
Copy link
Member

UziTech commented May 28, 2018

We are trying to make marked spec compliant so we are most likely going to remove the headers eventually since they are not part of any markdown spec.

@styfle styfle changed the title HTML output fails validation due to duplicate IDs (should be using 'class') Change default options so that headerIds: false May 29, 2018
@styfle styfle added this to the 0.5.0 - Architecture and extensibility milestone May 29, 2018
@styfle
Copy link
Member

styfle commented May 29, 2018

I changed the title to reflect that this is not a bug we introduced but rather a proposal to change the default options so that marked is complaint with the spec.

This would be a breaking change so the soonest release this could land in would be 0.5.0

@UziTech I can implement this. Any reason to delay or should I change this soon?

@UziTech
Copy link
Member

UziTech commented May 29, 2018

I think the idea behind adding header IDs is so you can link to each header in a markdown document with a hash (i.e. # heading 2 can be linked to by using "https://site.com/page#heading-2"). Using classes instead of ids would remove this linking functionality.

Although it isn't in the spec it looks like GitHub adds a "-[number]" to the id if there are duplicates (demo)

@styfle
Copy link
Member

styfle commented May 29, 2018

Good point! Now the question is, do we implement this to match GitHub even though it isn't in the GFM spec?

My guess is they left it out of the spec because they didn't want to commit to it yet 🤷‍♂️

@joshbruce
Copy link
Member

The reason it's true by default was to maintain consistency for users. Originally the default for Marked was to have the header ids with no way to turn them off. Making false the default could cause users to need to update their code.

@alystair is correct re id usage in HTML, which was a big reason for wanting to get out of the header id business; thereby, leaving it to developers to implement via extension and work with. I believe @UziTech is also correct in the history and why id is being used and why a post-processor like GitHub uses is probably a more viable approach.

@ChrisRobston
Copy link

ChrisRobston commented Jul 4, 2018

@styfle why headerIds: false doesn't work? I have "marked": "^0.3.19"

Why this option is enabled in first place? It make no sense for me.

Who need this ID's?

@ChrisRobston
Copy link

Got it, it works with 4!

@styfle
Copy link
Member

styfle commented Jul 4, 2018

Why this option is enabled in first place?

@canyoudev See this comment.

why headerIds: false doesn't work?

The feature to disable heading ids wasn't implement until v4.0.0 as you have discovered.

This is also indicated in the options documentation.


Also note, I have found an implementation of heading IDs that we might want to adopt called github-slugger (probably want to use this as a reference instead of a new dependency).

@Grawl
Copy link

Grawl commented Oct 16, 2018

There is another problem with header ids. The non-ASCII characters becomes - instead of understandable strings. I see the two options with this:

  1. Try to make any possible string a URL-compatible
  2. Transliterate strings (like privet for привет)

Also, I see another problem. What if I put an image into heading?

# Title ![](https://github.com/fluidicon.png)

Or if I put custom HTML into heading? What will be header id?

There's a workaround in
#919 (comment)

const renderer = new marked.Renderer();
renderer.heading = function(text, level) {
    return `<h${level} id="${text}">${text}</h${level}>`;
};
marked(mdString, { renderer: renderer });

This is easy to replace id to something like nanoid

But I think will be better to use github-slugger if we find a solution for problem with non-text into heading.

@styfle
Copy link
Member

styfle commented Oct 16, 2018

Good point, I created #1354

@Martii
Copy link
Contributor

Martii commented Oct 17, 2018

Outsourcing can be good for this as long as we can do a slower migration path e.g. keep the old style lexer override until we modify each and every comment that has references to existing id's in the DB. This definitely can be user content breaking across all sites that reference us but I realize that there are quite a few issues with auto "bookmark"ing and it could use some improvement.

@styfle

I wanted to get some early feedback on this

I'll see if I have a spare few to test the PR on our dev and see how much content really needs to be modified. Hopefully the outsourcing doesn't change their style output either because one DB traversal for this is enough imho. :)

@styfle styfle mentioned this issue Dec 20, 2018
4 tasks
@styfle styfle changed the title Change default options so that headerIds: false Duplicate Heading IDs Dec 20, 2018
@styfle
Copy link
Member

styfle commented Dec 20, 2018

After much discussion in the previous PR, I created a new one to fix this issue: #1401

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants