Clean up LLVM lexer #1376

Merged
merged 3 commits into from Dec 15, 2019
101 changes: 55 additions & 46 deletions lib/rouge/lexers/llvm.rb
@@ -14,6 +14,48 @@ class LLVM < RegexLexer
string = /"[^"]*?"/
identifier = /([-a-zA-Z$._][-a-zA-Z$._0-9]*|#{string})/

def self.keywords
@keywords ||= Set.new %w(
addrspace addrspacecast alias align alignstack allocsize alwaysinline
appending arcp argmemonly arm_aapcs_vfpcc arm_aapcscc arm_apcscc asm
attributes available_externally begin builtin byval c cc ccc cold
coldcc common constant convergent datalayout dbg declare default
define dllexport dllimport end eq exact extern_weak external false
fast fastcc gc global hidden inaccessiblemem_or_argmemonly
inaccessiblememonly inbounds inlinehint inreg internal jumptable
landingpad linker_private linkonce linkonce_odr minsize module naked
ne nest ninf nnan no-jump-tables noalias nobuiltin nocapture
nocf_check noduplicate noimplicitfloat noinline nonlazybind norecurse
noredzone noredzone noreturn nounwind nsw nsz null nuw oeq oge ogt
ole olt one opaque optforfuzzing optnone optsize ord personality
private protected ptx_device ptx_kernel readnone readonly
returns_twice safestack sanitize_address sanitize_hwaddress
sanitize_memory sanitize_thread section sge sgt shadowcallstack
sideeffect signext sle slt speculatable speculative_load_hardening
sret ssp sspreq sspstrong strictfp tail target thread_local to triple
true type ueq uge ugt ule ult undef une unnamed_addr uno uwtable
volatile weak weak_odr writeonly x x86_fastcallcc x86_stdcallcc
zeroext zeroinitializer
)
end

def self.instructions
@instructions ||= Set.new %w(
add alloca and ashr bitcast br call catch cleanup extractelement
extractvalue fadd fcmp fdiv fmul fpext fptosi fptoui fptrunc free
frem fsub getelementptr getresult icmp insertelement insertvalue
inttoptr invoke load lshr malloc mul or phi ptrtoint resume ret sdiv
select sext shl shufflevector sitofp srem store sub switch trunc udiv
uitofp unreachable unwind urem va_arg xor zext
)
end

def self.types
@types ||= Set.new %w(
double float fp128 half label metadata ppc_fp128 void x86_fp80 x86mmx
)
end

state :basic do
rule %r/;.*?$/, Comment::Single
rule %r/\s+/, Text
@@ -33,55 +75,22 @@ class LLVM < RegexLexer
rule %r/[=<>{}\[\]()*.,!]|x/, Punctuation
end

builtin_types = %w(
void float double half x86_fp80 x86mmx fp128 ppc_fp128 label metadata
)
state :root do
mixin :basic

state :types do
rule %r/i[1-9]\d*/, Keyword::Type
rule %r/#{builtin_types.join('|')}/, Keyword::Type
end

builtin_keywords = %w(
begin end true false declare define global constant alignstack private
landingpad linker_private internal available_externally linkonce_odr
linkonce weak weak_odr appending dllimport dllexport common default
hidden protected extern_weak external thread_local zeroinitializer
undef null to tail target triple datalayout volatile nuw nsw nnan ninf
nsz arcp fast exact inbounds align addrspace section alias module asm
sideeffect gc dbg ccc fastcc coldcc x86_stdcallcc x86_fastcallcc
arm_apcscc arm_aapcscc arm_aapcs_vfpcc ptx_device ptx_kernel cc
c signext zeroext inreg sret nounwind noreturn noalias nocapture byval
nest readnone readonly inlinehint noinline alwaysinline optsize ssp
sspreq noredzone noimplicitfloat naked type opaque eq ne slt sgt sle
sge ult ugt ule uge oeq one olt ogt ole oge ord uno unnamed_addr ueq
une uwtable x personality allocsize builtin cold convergent
inaccessiblememonly inaccessiblemem_or_argmemonly jumptable minsize
no-jump-tables nobuiltin noduplicate nonlazybind noredzone norecurse
optforfuzzing optnone writeonly argmemonly returns_twice safestack
sanitize_address sanitize_memory sanitize_thread sanitize_hwaddress
speculative_load_hardening speculatable sspstrong strictfp nocf_check
shadowcallstack attributes
)

builtin_instructions = %w(
add fadd sub fsub mul fmul udiv sdiv fdiv urem srem frem shl lshr ashr
and or xor icmp fcmp phi call catch trunc zext sext fptrunc fpext
uitofp sitofp fptoui fptosi inttoptr ptrtoint bitcast select va_arg ret
br switch invoke unwind unreachable malloc alloca free load store
getelementptr extractelement insertelement shufflevector getresult
extractvalue insertvalue cleanup resume
)

state :keywords do
rule %r/#{builtin_instructions.join('|')}/, Keyword
rule %r/#{builtin_keywords.join('|')}/, Keyword
end

state :root do
mixin :basic
mixin :keywords
mixin :types
rule %r/\w+/ do |m|
if self.class.types.include? m[0]
token Keyword::Type
elsif self.class.instructions.include? m[0]
token Keyword
elsif self.class.keywords.include? m[0]
token Keyword
else
token Error
end
end
end
end
end
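
The change above swaps unanchored regex alternations built with `join('|')` for a single `\w+` rule that looks the whole word up in a `Set`. A minimal sketch of why that matters, in plain Ruby without Rouge: the word lists are trimmed, and the constants `INSTRUCTIONS`, `KEYWORDS`, and `OLD_INSTRUCTION_RE` and the helpers `old_style_match` and `classify` are hypothetical names used only for illustration, not part of the lexer.

```ruby
require 'set'

# Trimmed, hypothetical word lists; the lexer's real lists are much longer.
INSTRUCTIONS = Set.new %w(add sub mul)
KEYWORDS     = Set.new %w(addrspace addrspacecast to)

# Old style: an unanchored alternation built from the list.
OLD_INSTRUCTION_RE = /#{INSTRUCTIONS.to_a.join('|')}/

# Returns the text an old-style rule would consume at the start of `word`,
# or nil if the alternation does not match at position 0.
def old_style_match(word)
  m = OLD_INSTRUCTION_RE.match(word)
  m && m.begin(0).zero? ? m[0] : nil
end

# New style: consume the whole word first, then classify it by set lookup.
def classify(word)
  m = /\A\w+/.match(word)
  return :error unless m

  if INSTRUCTIONS.include?(m[0])
    :instruction
  elsif KEYWORDS.include?(m[0])
    :keyword
  else
    :error
  end
end

p old_style_match('addrspacecast') # the alternation stops after "add"
p classify('addrspacecast')        # the whole word is looked up instead
p classify('add')
p classify('bogus')
```

Under these assumptions the alternation consumes only the `add` prefix of `addrspacecast`, while the set lookup sees the full word. The `addrspacecast` line added to `spec/visual/samples/llvm` in this pull request appears to exercise the same case.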
2 changes: 2 additions & 0 deletions spec/visual/samples/llvm
@@ -82,3 +82,5 @@ attributes #1 = { "no-sse" }

; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
define void @f() #0 #1 { ... }

%1 = addrspacecast i32* %a to i32 addrspace(1)*