-
Hm, it sounds like you want an actual parser. While Pygments' stateful lexers can fulfill many jobs of a recursive parser (in order to correctly highlight constructs of more complex languages), it is not meant for generic parsing. You can certainly model your own code after it - the core |
-
I was looking into your lexer code and found it quite attractive.
I am wondering whether it might be possible to parse Org files (from Emacs Org mode or org-roam) with it. The intention is not syntax highlighting but just parsing the file into its logical components.
For example, the head of such a file might look like this, which should not be hard to parse with a regex-based lexer.
But sometimes context and indentation are very important. Here you see a nested list.
Can this be handled? The expected output should be something like ListItem(text='item 1', level=1). So if one item is one token, I need to know which level that item is on.
Here is another example, where a starting line modifies the context: between the first and last line, the text is verbatim. Can I create a lexer based on yours that is aware of that situation?
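To make the two cases concrete, here is a rough sketch of the kind of stateful, line-based tokenizer I have in mind. It does not use Pygments itself, only the standard library, and it borrows just the idea of switching states. The Org syntax is an assumption for illustration: `-` or `+` list items nested by two spaces per level, and verbatim text between `#+BEGIN_EXAMPLE` and `#+END_EXAMPLE`. The names `Verbatim`, `Line`, `tokenize` and `indent_width` are made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class ListItem:
    text: str
    level: int


@dataclass
class Verbatim:  # hypothetical token type for a verbatim block
    text: str


@dataclass
class Line:  # hypothetical token type for any other non-empty line
    text: str


# Assumed syntax: "- item" / "+ item", nested by leading spaces.
LIST_ITEM = re.compile(r"^(?P<indent> *)[-+] (?P<text>.*)$")
BEGIN = re.compile(r"^\s*#\+BEGIN_EXAMPLE\s*$", re.IGNORECASE)
END = re.compile(r"^\s*#\+END_EXAMPLE\s*$", re.IGNORECASE)


def tokenize(source, indent_width=2):
    """Yield tokens line by line, tracking a 'root' or 'verbatim' state."""
    state = "root"
    verbatim = []
    for line in source.splitlines():
        if state == "verbatim":
            if END.match(line):
                # Closing line: emit the collected block, return to root.
                yield Verbatim("\n".join(verbatim))
                verbatim = []
                state = "root"
            else:
                # Inside the block nothing is interpreted as markup.
                verbatim.append(line)
        elif BEGIN.match(line):
            state = "verbatim"
        elif (m := LIST_ITEM.match(line)):
            # Nesting level is derived from the indentation width.
            level = len(m["indent"]) // indent_width + 1
            yield ListItem(m["text"], level)
        elif line.strip():
            yield Line(line)
    if state == "verbatim":
        # Unterminated block: emit what was collected so far.
        yield Verbatim("\n".join(verbatim))


sample = """\
- item 1
  - item 1.1
- item 2
#+BEGIN_EXAMPLE
- not a list item
#+END_EXAMPLE
"""

tokens = list(tokenize(sample))
for token in tokens:
    print(token)
```

The level comes from the indentation alone, and the line that looks like a list item inside the example block ends up in the `Verbatim` token instead of becoming a `ListItem`. A real Org parser would need far more than this (headings, drawers, other block types), so this is only meant to show the state handling I am asking about.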