Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚧[Experimenting]: Refactoring marked js🚧 #1102

Closed
wants to merge 17 commits into from
Closed

🚧[Experimenting]: Refactoring marked js🚧 #1102

wants to merge 17 commits into from

Conversation

joshbruce
Copy link
Member

Marked version: 0.3.17

Description

Experimenting with the codebase to learn more and try to make marked easier to read without fundamentally breaking things.

Again, this is experimental, please do not merge!

Having said that, this should demonstrate how we can effectively alter how easy it is to read marked without actually changing the functionality. Please see code comments for questions I have regarding why certain decisions were made and whatnot.

I've also noted areas where I do not know if it is possible to test a unit of functionality with our current testing harness; therefore, tagging @UziTech.

@joshbruce joshbruce added RR - refactor & re-engineer Results in an improvement to developers using Marked, or end-users, or both. question labels Feb 28, 2018
@joshbruce
Copy link
Member Author

I really don't trust our test bed to accurately reflect the desired behavior and use cases, to be honest.

@joshbruce
Copy link
Member Author

@intcreator: Interesting findings and notes in here, might help. Maybe we should slow down on trying to fix issues and work on making this thing more readable...?? Or, I'll do that, you do you, and we'll meet in the middle??

@joshbruce joshbruce mentioned this pull request Feb 28, 2018
8 tasks
@intcreator
Copy link

intcreator commented Mar 1, 2018

It looks like the diff got messed up so it's really hard for me to tell what's going on still. As far as fixing vs. making the code legible, I vote we get the builds passing (just fixing #1092) then we can focus on understanding the code more. It's just way easier to check if things are still okay if builds are passing.

Edit: I see the builds are passing now so let's work on figuring out what the code does. This may be a good opportunity for creating some unit tests as well.

Edit 2: Okay, the diffs are messed up because you're reorganizing the layout of the code. Cool. I'm kind of liking the idea someone had earlier to have different things in different files. I think unit tests will make it easy to see what's going on more clearly too.

I'll look at it a bit when I have time and make comments where appropriate.

* @return string
*/
function marked(src, opt, callback) {
// ^^ Can't we do something like marked(src, opt = {}, callback = function())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default parameters are ES6 according to MDN. Maybe in 2.0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ES5 is gonna probably start to annoy me soon. :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can find a transpiler for 1.0.

var highlight = opt.highlight,
tokens,
pending,
i = 0;
Copy link

@intcreator intcreator Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to MDN, the comma operator evaluates each operand and returns the value of the last operand. Here, highlight = opt.highlight, tokens, pending, and i = 0 are all being evaluated and 0 is being returned (and not being used).

This is a somewhat obscure way of declaring tokens, pending, and i as variables in addition to highlight. Here is a reference on Stack Overflow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concur with the assessment of the commenter. Wonder if we can add that to the lint (@UziTech)? Don't do this. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I hate this style, but that is how chjj did it. I prefer initializing the variables as close to when you use them as possible.

In this case since the = operator has a higher presedence than the , operator the expression is essentially:

var (highlight = opt.highlight),
    (tokens),
    (pending),
    (i = 0);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, I don't know who necessarily did it that way, just that I think it should be changed. Totally agree that putting things as close to their contextual relevance is kind of a big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There actually seems to be a lot of "clever" decisions that were made, which may not actually bring the desired ROI compared to writing human readable code. We spend way more time reading than we do writing, you know?

@joshbruce
Copy link
Member Author

@intcreator: On the diff, don't know if this is what you're referring to exactly, but the moving marked() to the top causes the entire file to considered new according to the diff...basically a change to the whole file. I'm trying to make each commit a refactor; so, it might be easier to read one commit at a time.

//
// TODO: Don't some highlighter plugins allow for `js`, effectively being less than three
// despite wanting to have the highlighter run??
if (!highlight || highlight.length < 3) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are better tests than the length of the highlight. Agree with the possibility of js. Will we be supporting syntax highlighting of languages such as c or r? Highlighting doesn't appear to be in the CommonMark spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. See #1043

}

return;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is almost 100 lines long. I highly recommend we break it down into pieces.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tagging #1043 so UziTech can get a better feel on internal pull for the updates to the test harness request. No pressure. Just informational purposes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meant #1103

* @return boolean
*/
marked.receivedOnlySource = function(src, opt, callback) {
return (marked.receivedCallback(opt, callback));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be return !(marked.receivedCallback(opt, callback)); only source means we did not receive those other things.

@joshbruce
Copy link
Member Author

different things in different files

Strictly speaking, is there anything about ES5 (or some other gotcha) that stops us from separating marked.js before 1.0 or later?

.getRegex();
function Parser(options) {
this.tokens = [];
this.token = null;
Copy link
Member Author

@joshbruce joshbruce Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is having it be null necessary? Given this becomes an object with members might be easier for a new person to understand if they saw something like this (or make a Token object):

{
  type: '',
  ...
}

Was going to make a more complete example; however, it appears the Token is just a random set of members...like an undefined type. You can see it in the switch inside Parser.prototype.tok() ~ line 930

Copy link
Member Author

@joshbruce joshbruce Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something more akin to this from traditional OO languages...using Swift highlighting, but not the syntax, strictly speaking.

abstract class Token {}

class Heading extends Token {}

// then the switch could be something like
switch (typeof this.token.type) {
  case 'Heading':
...

But, if the type member is the only thing in common between all the different variations for Token...then there's really no reason for the type member or Token. ??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there are more "hidden" objects inside of here.

} else if (this.options.pedantic) {
this.rules = inline.pedantic;
}
}
Copy link
Member Author

@joshbruce joshbruce Mar 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Renderer is just below here, it's essentially a string concatenation object...there's a separate one for both inline and block elements...think it could actually be much smaller.

When we get there, I'll probably want to work on this one...leverage the work from here and here. More the former than the latter since we need to accommodate client-side and most likely we are the extension of someone else's package, being used in, yet again, someone else's project.

We really should try to make Marked compile to something tiny - that's probably why some of the "clever" decisions were made, in fact. Not sure if the project started prior to the prevalence of minify for JS.

[edit] There's also the inability to minify the "boilerplate" mentioned by @UziTech in #746 - "prototype" can't get minified...and neither can any of the JS standard library calls and JS base language stuff.

@joshbruce
Copy link
Member Author

joshbruce commented Mar 3, 2018

It seems like what Marked is trying to do is sound, but it never got the chance to iterate that far...could be totally wrong here, but looking at the code, it's starting to feel familiar based on other projects:

M⬇ --- parse ---> {} --- render ---> HTML

As we consider the future refactoring and unit tests, might be worth keeping the following HTML element type definition in mind:

{
  // h1, h2, etc.
  elem: string,

  // the attribute list, which is usually a set of key-value pairs, JSON, in JS,
  // where you can easily convert keys to strings to generate the desired `key=value` 
  // string output. 
  attr: array, 

  // this gets tricky because some HTML elements can only take a string; or, there's the 
  // child element problem (what marked seems to use "inline" for). If we create a text 
  // object that only holds a string we can always pass in an array of element objects 
  // (what we seem to call tokens at the moment).
  inner: string|array,

  // this phrasing comes from the HTML5 spec when looking at the definition of an 
  // individual element, the first time I created this type definition I called 
  // `isSelfClosing` but that proved confusing.
  omitEndTag: boolean
}

This is almost the model used by React, I think, but not explicitly just through their createElement method: https://reactjs.org/docs/react-api.html#createelement

Maybe we need an Element object??

Then the parser can just keep creating those. Various Markdown flavors are also only able to use so many of the global attributes afforded by HTML5 that we might be able to do some optimization there as well.

@joshbruce
Copy link
Member Author

Closing as edification was achieved, conflicts, and will most likely happen over a series of PRs not just one.

@joshbruce joshbruce closed this Apr 5, 2018
@joshbruce joshbruce deleted the refactoring-marked-js branch April 5, 2018 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question RR - refactor & re-engineer Results in an improvement to developers using Marked, or end-users, or both.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants