A ruby gem that classifies search query strings as either known-item searches or unknown-item searches

sandbergja/known_item_search_classifier

known_item_search_classifier

A Ruby gem that classifies search query strings as either known-item searches or unknown-item searches. It uses features identified by Min-Yen Kan and Danny C.C. Poo in their 2005 paper [Detecting and Supporting Known Item Queries in Online Public Access Catalogs](http://www.comp.nus.edu.sg/~kanmy/papers/f57-kan.pdf). It uses a Gaussian Naive Bayes algorithm and is about 80% accurate.
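To make the Gaussian Naive Bayes idea concrete, here is a minimal, self-contained sketch in plain Ruby. It is not the gem's implementation. The class `ToyGaussianNB`, the helper `toy_features`, and the two features (word count and share of capitalized words) are assumptions chosen for illustration, and are not necessarily the features from the Kan and Poo paper.

```ruby
# Toy Gaussian Naive Bayes classifier (illustrative only, not the gem's code).
class ToyGaussianNB
  VARIANCE_FLOOR = 1e-6 # keeps variances non-zero when a feature is constant

  def initialize
    @stats = {} # label => { prior:, means:, variances: }
  end

  # rows: array of numeric feature vectors; labels: array of symbols
  def fit(rows, labels)
    rows.zip(labels).group_by { |_, label| label }.each do |label, pairs|
      columns = pairs.map(&:first).transpose
      n = pairs.size.to_f
      means = columns.map { |col| col.sum / n }
      variances = columns.each_with_index.map do |col, i|
        col.sum { |x| (x - means[i])**2 } / n + VARIANCE_FLOOR
      end
      @stats[label] = { prior: n / rows.size, means: means, variances: variances }
    end
    self
  end

  # Returns the label with the highest log posterior for the vector.
  def predict(vector)
    @stats.max_by { |_, s| log_posterior(vector, s) }.first
  end

  private

  # log P(label) + sum of per-feature Gaussian log densities
  def log_posterior(vector, s)
    log_likelihood = vector.each_with_index.sum do |x, i|
      mean = s[:means][i]
      var = s[:variances][i]
      -0.5 * Math.log(2 * Math::PI * var) - ((x - mean)**2) / (2 * var)
    end
    Math.log(s[:prior]) + log_likelihood
  end
end

# Hypothetical features: word count and fraction of capitalized words.
def toy_features(query)
  words = query.split
  return [0.0, 0.0] if words.empty?

  capitalized = words.count { |w| w.match?(/\A[A-Z]/) }
  [words.size.to_f, capitalized / words.size.to_f]
end

TOY_TRAINING = [
  ["The Illiad by Homer", :known],
  ["bugs", :unknown],
  ["Teaching Community: A Pedagogy of Hope", :known],
  ["pre-exam stress", :unknown]
].freeze

model = ToyGaussianNB.new.fit(
  TOY_TRAINING.map { |query, _| toy_features(query) },
  TOY_TRAINING.map { |_, label| label }
)
puts model.predict(toy_features("Horton hears a Who"))
```

With only four training rows and two features, this toy model is far too small to say anything about real accuracy. It only shows the mechanics: per-class means and variances are estimated for each feature, and a query is assigned to the class with the highest log posterior.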

Usage

# Using the default training set
require('known_item_search_classifier')
c = KnownItemSearchClassifier::Classifier.new
c.is_known_item_search? "Horton hears a Who" # => :known

# Using your own training set
require('known_item_search_classifier')
c = KnownItemSearchClassifier::Classifier.new
c.train([
    ["The Illiad by Homer", :known],
    ["bugs", :unknown],
    ["Teaching Community: A Pedagogy of Hope", :known],
    ["pre-exam stress", :unknown]])
c.train_from_csv('training_set.csv') # Each CSV row looks like: "fantastic beasts and where to find them",known
c.is_known_item_search? "Rowling's Harry Potter and the Philosopher Stone" # => :known
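The CSV row format mentioned above maps naturally onto the array-of-pairs shape that `train` accepts. The sketch below uses Ruby's standard `csv` library to show that conversion. The helper `load_training_pairs` is hypothetical and is not part of the gem, and how `train_from_csv` actually parses its file is not confirmed here.

```ruby
require 'csv'

# Hypothetical helper: turns CSV text with rows like
#   "fantastic beasts and where to find them",known
# into [[query, :label], ...] pairs.
def load_training_pairs(csv_text)
  CSV.parse(csv_text).map { |query, label| [query, label.strip.to_sym] }
end

SAMPLE_ROWS = <<~ROWS
  "fantastic beasts and where to find them",known
  bugs,unknown
ROWS

pairs = load_training_pairs(SAMPLE_ROWS)
pairs.each { |query, label| puts "#{label}: #{query}" }
```

Pairs built this way have the same shape as the array literal passed to `c.train` in the example above.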
