improve client message parsing #699
Conversation
Nice! This needs heavy testing. Did you take a look at the previous PR trying to fix this to see why it wasn't merged?
@xPaw, what do you say we come up with a way to write tests for these? Every time we improve the message parsing, we fix 5 things and break 1. I remember @maxpoulin64 had a list of strings to test against once; we should be able to make this part of our mild test suite.
We do need a test suite for it. Since it's exposed as a separate module, it's trivial to test.
just realized, Object.assign is an ES2015 feature - are you supporting IE11-?
client/js/libs/handlebars/parse.js
Outdated
}); | ||
} | ||
} else if ((text[position] === "#" || text[position === "&"]) && | ||
// ^-- this is basically wrong, because we need the 005 CHANTYPES response from the connect |
The front-end is actually aware of this, as I passed it into client when we moved to irc-fw. The network dom object has it in data.
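To illustrate the point about CHANTYPES, here is a minimal sketch of a prefix check that takes the advertised channel types as input instead of hard-coding `#` and `&`. The function name `isChannelStart` and the `chanTypes` parameter are hypothetical; how the value is actually read from the network object is not shown in this thread.

```javascript
// Hypothetical helper: `chanTypes` stands in for the CHANTYPES value a server
// advertises in its 005 (RPL_ISUPPORT) reply, for example "#&".
function isChannelStart(text, position, chanTypes) {
    // Assumption: fall back to the two prefixes hard-coded in the diff
    // when the server advertised nothing.
    const prefixes = chanTypes || "#&";
    const char = text[position];

    // Out-of-range positions yield undefined and are never a channel start.
    return char !== undefined && prefixes.includes(char);
}
```

With this shape, a network that advertises `#!` would treat `!chan` as a channel, while `&local` would only match on networks that list `&`.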
i would call it done. |
client/js/libs/handlebars/parse.js
Outdated
Handlebars.registerHelper( | ||
"parse", function(text) { | ||
function createFragment(fragment) { | ||
var className = ""; |
You could make this an array and then call .join on it; this would fix the leading space.
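A rough sketch of that suggestion: collect the class names in an array and join them once, so no leading or trailing space can appear. The fragment fields (`bold`, `textColor`) and the class names are illustrative assumptions, not necessarily the ones used in parse.js.

```javascript
// Hypothetical fragment shape and class names, for illustration only.
function classNamesFor(fragment) {
    const classes = [];

    if (fragment.bold) {
        classes.push("irc-bold");
    }

    if (fragment.textColor !== undefined) {
        classes.push("irc-fg" + fragment.textColor);
    }

    // join() only puts separators between entries, so there is never a
    // leading space, and an empty array gives an empty string.
    return classes.join(" ");
}
```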
Testing this, I see a bug in how it parses channels.
EDIT: Another bug with the parser. |
|
@Bonuspunkt I understand it's valid, but it's ignoring the |
the behavior is the same as mirc i think i see the problem here, should
be |
Pretty sure that should not be parsed as a channel at all. I wouldn't really say mIRC is the ideal client to base our parser on. |
same behavior as mIRC in HexChat what is your reference? |
Okay for now it might be fine to parse channels with colours, what I'm worried about are the parser bugs producing invalid markup like I showed above. |
oh sorry missed that. also fixed |
Mind looking at Perhaps it's also possible to optimize the generated code if there are same colours? E.g: If not possible or easy to optimize, it should not at least generate empty tags like |
@@ -74,6 +74,14 @@ describe("parse Handlebars helper", () => { | |||
"https://theos.kyriasis.com/~kyrias/stats/archlinux.html" + | |||
"</a>" + | |||
">" | |||
}, { | |||
input: "(http://example.com)", |
Then add a test for links that do have balanced () in them, like http://example.com/Test_(Page).
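One possible way to handle both cases, sketched under the assumption that a URL candidate has already been cut out of the message: strip trailing `)` characters only while the candidate has more closing than opening parentheses. `trimTrailingParens` is a hypothetical helper, not code from this PR.

```javascript
// Hypothetical helper: drops trailing ")" characters that have no matching
// "(" inside the URL candidate, and leaves balanced pairs alone.
function trimTrailingParens(url) {
    let opens = 0;
    let closes = 0;

    for (const ch of url) {
        if (ch === "(") {
            opens++;
        } else if (ch === ")") {
            closes++;
        }
    }

    let end = url.length;

    // Only strip while the last character is an unmatched ")".
    while (end > 0 && url[end - 1] === ")" && closes > opens) {
        end--;
        closes--;
    }

    return url.slice(0, end);
}
```

Under this rule the candidate `http://example.com)` taken from `(http://example.com)` loses its final parenthesis, while `http://example.com/Test_(Page)` is kept whole.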
client/js/libs/handlebars/parse.js
Outdated
@@ -102,6 +102,13 @@ function analyseText(text) { | |||
// NOTE: channel prefixes should be RPL_ISUPPORT.CHANTYPES | |||
var channelPrefixes = ["#", "&"]; | |||
var punctuations = ["'", "\"", ".", ",", "!", "?", "¿", "<", ">", "(", ")", "{", "}"]; | |||
var commonProtocols = [ |
Couple more: mailto, steam, mumble, ts3server, ssh
mailto won't match - url detection is based on ://
never have seen mumble, ts3server urls, are they using ://?
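To make the `://` point concrete, here is a simplified sketch of scheme detection that looks for `://` and then walks backwards to read the scheme. It is an illustration only, not the code in this PR; `findSchemeMatches` and the shortened scheme list are assumptions.

```javascript
// Illustrative subset of schemes; the real list in the diff is longer.
const commonSchemes = ["http", "https", "irc", "ircs", "ssh"];

// Hypothetical helper: returns the scheme and start index of every
// "scheme://" occurrence whose scheme is in the list above.
function findSchemeMatches(text) {
    const matches = [];
    let index = text.indexOf("://");

    while (index !== -1) {
        // Walk backwards over letters and digits to find where the scheme starts.
        let start = index;

        while (start > 0 && /[a-z0-9]/i.test(text[start - 1])) {
            start--;
        }

        const scheme = text.slice(start, index).toLowerCase();

        if (commonSchemes.includes(scheme)) {
            matches.push({scheme: scheme, start: start});
        }

        index = text.indexOf("://", index + 3);
    }

    return matches;
}
```

Because `mailto:me@example.com` contains no `://`, a detector of this kind never considers it, whichever schemes are on the list.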
i'm thinking about replacing the url detection with https://www.npmjs.com/package/autolinker |
bump? would love to see this merged |
client/js/libs/handlebars/parse.js
Outdated
); | ||
|
||
function uri(text) { | ||
return window.URI.withinString(text, function(url) { |
I see you are getting rid of URI.withinString which is, I think, our only use of URI.js (in our repo, official repo). Any specific reason for this or just killing a dependency?
This seems to be a very active project, and they've just released a version that fixes one of the bugs I noticed a long time ago.
The latest release packs quite a lot of changes since the version we use, see medialize/URI.js@v1.14.1...v1.18.4.
Do you want to try running your tests against our version of URI.js and the latest one released?
I know we like to kill dependencies when they are tiny, but I think at that point I'd rather not roll our own URL parser.
i need start & end position of the url to correctly merge it with the styling information.
i'll look at it.
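For context on why positions matter, a minimal sketch of a finder that reports where each link starts and ends, so the ranges can later be combined with styling ranges. The regular expression is deliberately naive and stands in for whatever detector is actually used; `findLinkRanges` is a hypothetical name.

```javascript
// Hypothetical helper: a naive http/https matcher swapped in for a real URL
// detector, used here only to show the start/end bookkeeping.
function findLinkRanges(text) {
    const pattern = /https?:\/\/[^\s]+/g;
    const ranges = [];
    let match;

    while ((match = pattern.exec(text)) !== null) {
        ranges.push({
            start: match.index,                 // index of the first character
            end: match.index + match[0].length, // index just past the last character
            link: match[0],
        });
    }

    return ranges;
}
```

With `start` and `end` available, `text.slice(start, end)` gives back the link, and a merge step can split styled fragments at exactly those offsets.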
tests are against latest version of |
Really solid job so far! However I am getting an error at the current version when sending a link:
|
would have preferred replacing |
I think we will focus on this PR for 2.3.0 after 2.2.0 is shipped. cc @astorije |
I'm not a particular fan of moving parsing stuff into a separate repository we have no access to. |
i have no problem with transfer of repo & npm |
@astorije What do you think? Should the code be kept in the repo, or have this as a separate project under |
I don't think it makes much sense to move it out of Lounge unless we remove Lounge-specific code (like in the merge function)
This parser mis-parses the following URL:
|
@Bonuspunkt, |
rebased to master, it is reviewable. |
I see @Bonuspunkt did a good job with setting this up: nice README, semver-releases, good test coverage. For these reasons, I'm happy to keep it in a separate repo at least at the moment. |
I think it's already better than what we have on master (but still a few comments inline).
Also, 👏 👏 on the rigorous test suite!
Any idea how to fix CI?
client/js/libs/handlebars/parse.js
Outdated
"irc", "ircs", | ||
"svn", "git", | ||
"steam", "mumble", "ts3server", "ssh", | ||
]; |
Wouldn't it quickly become an issue to whitelist schemes here? I feel like this is a well-known problem that should be handled in one of the libraries we use instead of in our code. Thoughts?
client/js/libs/handlebars/parse.js
Outdated
"svn", "git", | ||
"steam", "mumble", "ts3server", "ssh", | ||
]; | ||
const incorrectDetections = ["www"]; |
So, we don't want to parse www.google.com as a URL? Why is that?
that's the fix for matching www. (also covered in the test line126+)
edit: medialize/URI.js#327
yes
currently it's ISC, MIT seems pretty much the same, so i'm fine with it.
node v4 has not enough ES6 support ( http://node.green/ ) |
Cool!
Could you get rid of the default params instead? It's a small library and transpiling only for this is a bit of a shame IMO :) |
wait, the code is run by node out of convenience. when the lounge is running, this code is only executed by the webclient. also it's not worth removing syntactic sugar as i have to readd it. |
So yeah, I think keeping all the code in the client would be better for us. As for the polyfill, you've introduced a lot of unnecessary code that we will have to maintain. Overall, the cleaner solution is just to keep the code here IMO. If you can't be bothered to revert to how it was, we could handle it ourselves (create a new branch in lounge), if you allow it. I really do want to get this into 2.3 release. |
not sure where in time the "how it was" was, but sure, go ahead. if you need help ping me edit: very old might be in this branch https://github.com/Bonuspunkt/lounge/tree/messageParserOld |
By "how it was" I mean when the parser was contained within the lounge code, and not a separate project. |
Closing in favour of #972. I hope to get that PR merged ASAP. @Bonuspunkt, I've used your latest code in the separate package, not |
replaced regexps with parser and processing
fixes #654