FinNLP/en-lexicon

English Lexicon

Extensible English language lexicon for POS tagging with Emojis and around 110K words

Installation

npm install en-lexicon --save

Usage

const lexicon = require("en-lexicon");

console.log(lexicon.lexicon.faraway);
// "JJ"

// multiple POS tags are separated by "|"
console.log(lexicon.lexicon.acquired);
// "VBN|JJ|VBD"
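
Because a word with several tags comes back as a single "|"-separated string, it can be handy to split it into an array before use. The following is only a minimal sketch: `tagsFor` and `sample` are hypothetical names that are not part of en-lexicon, and `sample` stands in for `require("en-lexicon").lexicon` so the snippet runs without the package installed.

```javascript
// Hypothetical helper (not part of en-lexicon): look up a word in a
// lexicon-shaped object and return its POS tags as an array.
function tagsFor(lexiconMap, word) {
  // Use hasOwnProperty so inherited keys such as "constructor"
  // are not mistaken for lexicon entries.
  if (!Object.prototype.hasOwnProperty.call(lexiconMap, word)) {
    return [];
  }
  return lexiconMap[word].split("|");
}

// Stand-in for require("en-lexicon").lexicon, limited to the two
// entries shown in the usage example above.
const sample = { faraway: "JJ", acquired: "VBN|JJ|VBD" };

console.log(tagsFor(sample, "acquired"));    // [ 'VBN', 'JJ', 'VBD' ]
console.log(tagsFor(sample, "faraway"));     // [ 'JJ' ]
console.log(tagsFor(sample, "unknownword")); // []
```

With the real package, the same call would presumably look like `tagsFor(lexicon.lexicon, "acquired")`.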

Extending

One of the main reasons I wrote my own lexicon module is that I needed it to be extensible.

To extend the lexicon with medical terms, for example:

const lexicon = require("en-lexicon");
lexicon.extend({
	lactate:"VB",
	serum:"NN"
});

// Once you've extended the lexicon with your own terms,
// you get more than just the terms you entered:
// the lexicon will (try to) be smart and
// apply some inflections to those terms

// the term you entered
console.log(lexicon.lexicon.lactate);
// "VB"
console.log(lexicon.lexicon.lactated);
// "VBD|VBN"
console.log(lexicon.lexicon.lactating);
// "VBG"
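
When domain terms are kept as word lists grouped by tag, they need to be turned into the `{ word: "TAG" }` object that `lexicon.extend` receives in the example above. The following is only a sketch under stated assumptions: `buildExtension` is a hypothetical helper, not part of en-lexicon, and joining several tags for one word with "|" simply mirrors the format shown under Usage. Whether `extend` accepts such multi-tag values is an assumption that has not been checked here.

```javascript
// Hypothetical helper (not part of en-lexicon): turn word lists grouped
// by POS tag into a flat { word: "TAG" } object.
function buildExtension(termsByTag) {
  const extension = {};
  for (const [tag, words] of Object.entries(termsByTag)) {
    for (const word of words) {
      if (!Object.prototype.hasOwnProperty.call(extension, word)) {
        extension[word] = tag;
      } else if (!extension[word].split("|").includes(tag)) {
        // Assumption: multiple tags are joined with "|", as in the
        // lexicon's own entries. Not verified against extend().
        extension[word] += "|" + tag;
      }
    }
  }
  return extension;
}

const medicalTerms = buildExtension({
  VB: ["lactate"],
  NN: ["serum"]
});

console.log(medicalTerms); // { lactate: 'VB', serum: 'NN' }

// With the package installed, the result could then be passed on:
// require("en-lexicon").extend(medicalTerms);
```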

Credits

I used Eric Brill's lexicon as a starting point for this project, manually corrected some cases, and expanded it using various corpora (this one and this one, for example).

License

License: The MIT License (MIT) - Copyright (c) 2017 Alex Corvi
