Check and enforce quotes with punctuations #101

Merged
merged 7 commits into from Jul 27, 2020

kdeldycke
Contributor

@kdeldycke kdeldycke commented Apr 28, 2020

This is an attempt to specify punctuation rules around quotes, as discussed in #96.

Progress:

  • Validate quoting specifications with @sindresorhus
  • Complete inventory of valid and invalid cases
  • Implement test fixtures
  • Write validation and invalidation logic in Python
  • Translate Python logic to JS
  • Fix Travis
  • Profit
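To make the target behaviour concrete, here is a minimal sketch of the quote-and-punctuation check, assuming the accepted ending marks are `.`, `!`, `?` and `…` and the closing quotes are `"` and `”`. The name `check_quote_punctuation` is hypothetical and the sample strings are only illustrative; this is not the code merged into awesome-lint.

```python
import re


def check_quote_punctuation(desc):
    """Return True when the quote/punctuation ending of `desc` is acceptable.

    Hypothetical helper for illustration only, not awesome-lint's API.
    """
    # Punctuation on both sides of a closing quote ("end.".) is rejected.
    if re.search(r'[.!?…]["”][.!?…]+$', desc):
        return False
    # A closing quote followed by punctuation ("quoted".) is accepted.
    if re.search(r'["”][.!?…]+$', desc):
        return True
    # Punctuation optionally closed by a quote ("A full quote.") is accepted.
    return bool(re.search(r'[.!?…]["”]?$', desc))


# Usage: the first two endings are accepted, the last two are rejected.
for sample in (
    'Ending with a "non-quoted period".',
    '"Description is a full quote ending with a period."',
    'Quote-inducing double punctuation "end.".',
    'Missing ending punctuation',
):
    print(check_quote_punctuation(sample), sample)
```

The double-punctuation test runs first on purpose: the second pattern alone would also accept `"end.".`, so the stricter rejection has to be decided before the more permissive match.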

@sindresorhus
Owner

I agree with what's shown in the fixtures.

@kdeldycke
Contributor Author

Thanks @sindresorhus for validating my fixtures!

As I told you in another comment, I'm no JS dev so I'll need some extra time to figure out the test ecosystem. But I'll be able to produce a nice logical flow and regexps to have the checks properly in place.

@sindresorhus
Owner

@kdeldycke Friendly bump :)

@kdeldycke
Contributor Author

@sindresorhus fair enough! :) I still want to tackle this one for good.

But I'm currently down the rabbit hole trying to reconcile some Markdown edge-cases in other formatters and linters: https://twitter.com/kdeldycke/status/1278735173903945734 . awesome-lint is at the end of the Markdown food chain, so I'm extra-slow.

@kdeldycke
Contributor Author

kdeldycke commented Jul 24, 2020

I've made progress on that one. Again, I'm no JS dev. So I first tried to attack the issue with my language of choice. Here is a working logical implementation in Python:

# -*- coding: utf-8 -*-
import re

valid = """
- [foo](https://foo.com) - Valid description.
- [foo](https://foo.com) - A valid description.
- [foo](https://foo.com) - A valid description...
- [foo](https://foo.com) - A valid description.......
- [foo](https://foo.com) - A valid description…
- [foo](https://foo.com) - A valid description..….....
- [foo](https://foo.com) - A valid description!
- [foo](https://foo.com) - A valid description! ⭐
- [foo](https://foo.com) - A valid description!!!
- [foo](https://foo.com) - A valid description?
- [foo](https://foo.com) - A valid description???
- [foo](https://foo.com) - A valid description????!??
- [foo](https://foo.com) - A valid description..…!?…....
- [foo](https://foo.com) - A valid description with [link](https://bar.org).
- [foo](https://foo.com) - A valid description. ![image](image.png)
- [foo](https://foo.com) - A valid description. <img src="image.png">
- [foo](https://foo.com) - `valid description` here.
- [foo](https://foo.com) - `valid` is a word that is `great`.
- [foo](https://foo.com) - `valid` is a word that is `great`!?!.
- [foo](https://foo.com) - VaLid description.
- [foo](https://foo.com) - anoTher valid description with pascalCase.

- [foo](https://foo.com) - Ending with a period.
- [foo](https://foo.com) - Ending with an exclamation mark!
- [foo](https://foo.com) - Ending with a question mark?
- [foo](https://foo.com) - Ending with an ellipsis…

- [foo](https://foo.com) - Ending with a "non-quoted period".
- [foo](https://foo.com) - Ending with a "non-quoted exclamation point"!
- [foo](https://foo.com) - Ending with a "non-quoted question mark"?
- [foo](https://foo.com) - Ending with a "non-quoted ellipsis"…
- [foo](https://foo.com) - Ending with another kind of “non-quoted period”.
- [foo](https://foo.com) - Ending with another kind of “non-quoted exclamation point”!
- [foo](https://foo.com) - Ending with another kind of “non-quoted question mark”?
- [foo](https://foo.com) - Ending with another kind of “non-quoted ellipsis”…

- [foo](https://foo.com) - "Description is a full quote ending with a period."
- [foo](https://foo.com) - "Description is a full quote ending with an exclamation point!"
- [foo](https://foo.com) - "Description is a full quote ending with a question mark?"
- [foo](https://foo.com) - "Description is a full quote ending with an ellipsis…"
- [foo](https://foo.com) - “Description is another kind of full quote ending with a period.”
- [foo](https://foo.com) - “Description is another kind of full quote ending with an exclamation point!”
- [foo](https://foo.com) - “Description is another kind of full quote ending with a question mark?”
- [foo](https://foo.com) - “Description is another kind of full quote ending with an ellipsis…”

- [foo](https://foo.com) - Ending with a parenthetical. (Japanese)
- [foo](https://foo.com) - Ending with an emphasis parenthetical. *(Japanese)*
- [foo](https://foo.com) - Ending with a strong parenthetical. **(Japanese)**

- [foo](https://foo.com) - Ending with an emoji case 1. 📷
- [foo](https://foo.com) - Ending with an emoji case 1. 📷 📷 📷
- [foo](https://foo.com) - Ending with an emoji case 2. 👩🏿
- [foo](https://foo.com) - Ending with an emoji case 3. ⌚

- [foo](https://foo.com) - [Preview](https://read.amazon.com/kp/embed?asin=B01G7TTKSK&asin=B01G7TTKSK&preview=newtab&linkCode=kpe&ref_=cm_sw_r_kb_dp_DLhOxb0XZ3MEC) 💲
"""


invalid = """
- [foo](https://foo.com) - Missing ending punctuation

- [foo](https://foo.com) - ``
- [foo](https://foo.com) - `invalid quote: too noisy`
- [foo](https://foo.com) - `still invalid quote, even with a period.`
- [foo](https://foo.com) - `still invalid quote, even with an exclamation mark!`
- [foo](https://foo.com) - `still invalid quote, even with a question mark?`
- [foo](https://foo.com) - `still invalid quote, even with an ellipsis…`
- [foo](https://foo.com) - `still invalid quote, even ending with a period`.
- [foo](https://foo.com) - `still invalid quote, even ending with an exclamation mark`!
- [foo](https://foo.com) - `still invalid quote, even ending with a question mark`?
- [foo](https://foo.com) - `still invalid quote, even ending with an ellipsis`…

- [foo](https://foo.com) - `still invalid quote, ending with too much punctuation`…...?
- [foo](https://foo.com) - `still invalid quote, ending with too much punctuation...!?`…...?

- [foo](https://foo.com) - Quote-inducing double punctuation "end.".
- [foo](https://foo.com) - Quote-inducing double punctuation "end."!
- [foo](https://foo.com) - Quote-inducing double punctuation "end."?
- [foo](https://foo.com) - Quote-inducing double punctuation "end."…
- [foo](https://foo.com) - Quote-inducing double punctuation "end!".
- [foo](https://foo.com) - Quote-inducing double punctuation "end!"!
- [foo](https://foo.com) - Quote-inducing double punctuation "end!"?
- [foo](https://foo.com) - Quote-inducing double punctuation "end!"…
- [foo](https://foo.com) - Quote-inducing double punctuation "end?".
- [foo](https://foo.com) - Quote-inducing double punctuation "end?"!
- [foo](https://foo.com) - Quote-inducing double punctuation "end?"?
- [foo](https://foo.com) - Quote-inducing double punctuation "end?"…
- [foo](https://foo.com) - Quote-inducing double punctuation "end…".
- [foo](https://foo.com) - Quote-inducing double punctuation "end…"!
- [foo](https://foo.com) - Quote-inducing double punctuation "end…"?
- [foo](https://foo.com) - Quote-inducing double punctuation "end…"…

- [foo](https://foo.com) - Quote-inducing double punctuation “end.”.
- [foo](https://foo.com) - Quote-inducing double punctuation “end.”!
- [foo](https://foo.com) - Quote-inducing double punctuation “end.”?
- [foo](https://foo.com) - Quote-inducing double punctuation “end.”…
- [foo](https://foo.com) - Quote-inducing double punctuation “end!”.
- [foo](https://foo.com) - Quote-inducing double punctuation “end!”!
- [foo](https://foo.com) - Quote-inducing double punctuation “end!”?
- [foo](https://foo.com) - Quote-inducing double punctuation “end!”…
- [foo](https://foo.com) - Quote-inducing double punctuation “end?”.
- [foo](https://foo.com) - Quote-inducing double punctuation “end?”!
- [foo](https://foo.com) - Quote-inducing double punctuation “end?”?
- [foo](https://foo.com) - Quote-inducing double punctuation “end?”…
- [foo](https://foo.com) - Quote-inducing double punctuation “end…”.
- [foo](https://foo.com) - Quote-inducing double punctuation “end…”!
- [foo](https://foo.com) - Quote-inducing double punctuation “end…”?
- [foo](https://foo.com) - Quote-inducing double punctuation “end…”…

- [foo](https://foo.com) - Quote-inducing double punctuation “end…”??!
- [foo](https://foo.com) - Quote-inducing double punctuation “end?…...”…...?
"""


def ends_with_emoji(string):
    # Quick and dirty check for emoji. Good enough for our tests here.
    return ord(string[-1]) > 2**16


def validate(line):
    """ Python translation of ./rules/list-item.js:validateListItemSuffix() """

    desc = line.split(' - ', 1)[1]
    # print(desc)

    # XXX NEW RULE
    print('Rule #1')
    # Descriptions are not allowed to be fully backticked quotes, whatever the
    # ending punctuation and its position.
    if re.match(r"^`.*[.!?…]*`[.!?…]*$", desc):
        # Still allow multiple backticks if the whole description is not fully
        # quoted.
        if re.match(r"^`.+`.+`.+$", desc):
            return True
        return False

    # XXX NEW RULE
    # Any kind of quote followed by one of our punctuation marks is fine,
    # but only if the quote does not itself follow a punctuation mark.
    print('Rule #2')
    # Use positive lookbehind to search for punctuation following a quote.
    if re.match(r".*(?<=[\"”])[.!?…]+$", desc):
        # If the quote follows a regular punctuation, this is wrong.
        if re.match(r".*[.!?…][\"”][.!?…]+$", desc):
            return False
        return True

    # XXX NEW RULE
    # Any of our punctuation marks, optionally closed by any kind of quote,
    # is good.
    print('Rule #3')
    if re.match(r".*[.!?…][\"”]?$", desc):
        return True

    # XXX ALREADY IMPLEMENTED
    print('Rule #4')
    if not re.findall(r'[\.!?…]', desc):
        tokens = [t for t in re.split(r"[- ;./]", desc) if t]
        if len(tokens) > 2 or not ends_with_emoji(tokens[-1]):
            return False

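    # XXX ALREADY IMPLEMENTED
    print('Rule #5')
    # Note: the closing parenthesis is optional in this pattern, so it matches
    # any single-line description that reaches this point.
    if re.match(r".*\)?$", desc):
        return True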

    # XXX ALREADY IMPLEMENTED
    print('Rule #6')
    if ends_with_emoji(desc):
        return True

    return False


for line in [l for l in valid.splitlines() if l]:
    print("Must be valid: {}".format(line))
    assert validate(line)


for line in [l for l in invalid.splitlines() if l]:
    print("Must be invalid: {}".format(line))
    assert not validate(line)

This covers all the cases.
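
One caveat about the script above: `ends_with_emoji` only recognises code points beyond the Basic Multilingual Plane, so emoji such as ⌚ (U+231A) or ⭐ (U+2B50) are not detected by it. The fixtures still pass because rule #5 accepts those lines before the emoji check is reached. A slightly broader heuristic, sketched here with the standard `unicodedata` module, treats "Symbol, other" characters and skin-tone modifiers as emoji. It is an assumption-laden approximation, not awesome-lint's implementation: it also accepts non-emoji symbols such as ©, and it misses emoji that end with a variation selector (U+FE0F).

```python
import unicodedata


def ends_with_emoji(string):
    """Heuristic check: does the last character look like an emoji?"""
    if not string:
        return False
    last = string[-1]
    # Skin-tone modifiers (U+1F3FB to U+1F3FF) close sequences such as 👩🏿.
    if 0x1F3FB <= ord(last) <= 0x1F3FF:
        return True
    # Most pictographic emoji fall in the "So" (Symbol, other) category.
    return unicodedata.category(last) == "So"


# Usage: print the verdict for a few endings taken from the fixtures.
for sample in ("case 3. ⌚", "description! ⭐", "case 2. 👩🏿", "A valid description."):
    print(ends_with_emoji(sample), sample)
```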

@kdeldycke kdeldycke changed the title Check and enforce quotes with punctuations. [WIP] Check and enforce quotes with punctuations. Jul 24, 2020
@kdeldycke kdeldycke marked this pull request as draft July 24, 2020 21:20
@kdeldycke kdeldycke changed the title [WIP] Check and enforce quotes with punctuations. Check and enforce quotes with punctuations. Jul 24, 2020
@kdeldycke kdeldycke marked this pull request as ready for review July 24, 2020 22:04
@kdeldycke
Contributor Author

This PR is ready for review. At last! :)

@sindresorhus sindresorhus changed the title Check and enforce quotes with punctuations. Check and enforce quotes with punctuations Jul 27, 2020
@sindresorhus sindresorhus merged commit 5afa2a0 into sindresorhus:master Jul 27, 2020
@sindresorhus
Owner

This looks good to me. Nice work 👍🏻

@kdeldycke
Contributor Author

Thanks a lot @sindresorhus for your patience. It took 3 months but I'm glad I finally tackled that tricky feature! 🤟
