This repository has been archived by the owner on Apr 3, 2024. It is now read-only.
An application for analysing text documents with natural language processing.

sebkasanzew/fashion-mining

Zalando Textmining

What is this about?

Slides about the purpose and functionality of this project (German)

Installation instructions for Windows

NLTK Installation

  1. Make sure Python is installed.
  2. Run pip install nltk in the terminal.
  3. Download the NLTK data by opening Python and running:
>>> import nltk
>>> nltk.download()
  4. Select C:/nltk_data as the destination folder, select download_all in the list, and click Download.
  5. Have fun.
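
To confirm that the installation worked before starting the application, a small check like the following can help. It is a minimal sketch, assuming Python 3 (it relies on `importlib.util.find_spec`); the helper name `check_packages` is made up for this example and is not part of the project.

```python
import importlib.util


def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "nltk" is the package installed in the steps above.
    for name, found in check_packages(["nltk"]).items():
        print("%-10s %s" % (name, "ok" if found else "MISSING"))
```

The check only looks the package up and does not import it, so it stays fast and has no side effects.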

GENSIM installation (no working guide yet)


Installation instructions for Debian-based Linux (Ubuntu)

NLTK Installation

  1. Open a terminal and run pip install nltk.
  2. Make sure that DEFAULT_URL in /usr/local/lib/python2.7/dist-packages/nltk/downloader.py (line 380) is set to "http://nltk.github.com/nltk_data/".
  3. Download the NLTK data by running this command:
~$ sudo python -m nltk.downloader -d /usr/local/share/nltk_data all
  4. Have fun.
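
After the download, it can be useful to confirm that the data folder actually exists. The sketch below checks the two destination folders named in this README, plus the `NLTK_DATA` environment variable that NLTK also consults. The helper name `find_data_dir` and the candidate list are assumptions for illustration, not part of the project.

```python
import os

# Destination folders mentioned in this README (Windows and Linux).
CANDIDATES = ["C:/nltk_data", "/usr/local/share/nltk_data"]


def find_data_dir(candidates=CANDIDATES, environ=os.environ):
    """Return the first existing data directory, or None if none is found.

    A directory given via the NLTK_DATA environment variable is tried first.
    """
    env_dir = environ.get("NLTK_DATA")
    paths = ([env_dir] if env_dir else []) + list(candidates)
    for path in paths:
        if os.path.isdir(path):
            return path
    return None


if __name__ == "__main__":
    print(find_data_dir() or "no NLTK data directory found")
```

This only checks that a folder is present; it does not verify that every corpus was downloaded completely.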

GENSIM installation

Dependencies

  1. Python >= 2.7
  2. NumPy >= 1.10
  3. SciPy >= 0.16
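
Version strings should be compared numerically, not as text, because "1.9" sorts after "1.10" as a string even though it is the older release. The following is a minimal sketch of such a comparison; `version_tuple` and `meets_minimum` are hypothetical helper names, and the parsing ignores suffixes such as "rc1".

```python
import re


def version_tuple(version):
    """Turn '1.10.4' into (1, 10, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print(meets_minimum("1.9.3", "1.10"))   # False: 9 < 10
    print(meets_minimum("0.16.1", "0.16"))  # True
```

The installed version of a package is usually available as its `__version__` attribute, for example `numpy.__version__`.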

Installation

Open a terminal and run the following commands:

~$ sudo apt-get install libamd2.* libblas3gf libc6 libgcc1 \
libgfortran3 liblapack3gf libumfpack5.* libstdc++6 \
build-essential gfortran python-all-dev \
libatlas-base-dev python-tk

~$ sudo apt-get install python-setuptools

~$ sudo easy_install pip

~$ sudo pip install --upgrade gensim
(NumPy and SciPy are downloaded automatically.)

All dependencies should now be resolved, and GENSIM is ready to use.

Additional Packages

Open a terminal and run the following command:

~$ sudo pip install texttable

Glove setup

To use Glove Word2Vec in the GUI, download the pre-trained word vectors with 840B tokens from http://nlp.stanford.edu/projects/glove/ and save the text file as "common.840B.300d.txt" in a folder named "fashion-mining\data\tmp", which you need to create yourself.
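
To sanity-check the downloaded file, it may help to read a few lines of it. The pre-trained GloVe text files store one token per line, followed by its vector components separated by spaces. The sketch below assumes 300 components per line (as the file name suggests) and splits from the right so that a token containing spaces stays intact. The names `parse_glove_line` and `iter_glove` are made up for this example, and this is not necessarily how the application itself loads the file.

```python
import os

# Path from the instructions above, built with os.path.join so that it
# works with both Windows and Linux path separators.
GLOVE_PATH = os.path.join("fashion-mining", "data", "tmp", "common.840B.300d.txt")


def parse_glove_line(line, dim=300):
    """Split one line into (token, vector).

    Splitting from the right keeps a token that contains spaces in one piece.
    """
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError("expected a token followed by %d values" % dim)
    return parts[0], [float(value) for value in parts[1:]]


def iter_glove(path=GLOVE_PATH, dim=300):
    """Yield (token, vector) pairs lazily, without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            yield parse_glove_line(line, dim)
```

Because `iter_glove` is a generator, the first few entries can be inspected without reading the whole file into memory, which matters for a file of this size.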
