WideScale is a full-text indexing and searching engine, written in golang.

anubhavp28/WideScale

WideScale

WideScale is a full-text indexing and searching engine, written in Go. WideScale is solely for educational purposes. It provides a simple API to search for words or groups of words within a large quantity of text spread across multiple documents. Internally, WideScale uses an inverted index, similar to Elasticsearch. For more information, see this article.

Let me know if you guys have any suggestions.

Why use a special data structure (inverted index)?

I came across the inverted index while reading about Elasticsearch. To explain why it is useful, here is an excerpt from the Wikipedia article about it:

When dealing with a small number of documents, it is possible for the full-text-search engine to directly scan the contents of the documents with each query, a strategy called "serial scanning". This is what some tools, such as grep, do when searching.

However, when the number of documents to search is potentially large, or the quantity of search queries to perform is substantial, the problem of full-text search is often divided into two tasks: indexing and searching. The indexing stage will scan the text of all the documents and build a list of search terms (often called an index). In the search stage, when performing a specific query, only the index is referenced, rather than the text of the original documents.

I really don't think I could give a better explanation than that.

Installation

  • Install Go (Instructions). Add the Go installation path to your PATH environment variable.
  • Download mux.
    > go get github.com/gorilla/mux
    
  • Download widescale
    > go get github.com/anubhavp28/WideScale/
    
  • Install widescale
    > go install github.com/anubhavp28/WideScale/
    

Usage

  • To start the server, simply run:

    > cd $(go env GOPATH)/bin
    > widescale <path-to-dir-containing-txt-files-to-index>
    

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
