Skip to content

soliverc/Enron-Ham-Spam-Email-FIlter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

Enron-Ham-Spam-Email-FIlter

Before runnning this script:

Download the Enron Email Ham and Spam Dataset: https://drive.google.com/file/d/1c0Zbp4tSRsRPcxbswYDCYzmE66GRGv_L/view?usp=sharing

Extract the files. You will see two folders: Ham Unprocessed and Spam Unprocessed

In the python script, set these directories as:

hamdir = r'....\Ham Unprocessed' (remove 'r' for MacOS) spamdir = r'.....\Spam Unprocessed (remove 'r' for MacOS)

If the google drive link doesn't work you can download the data from the original source:

Go to: http://www2.aueb.gr/users/ion/data/enron-spam/

Download all files under 'Enron-Spam in raw form'

Extract the ham folders to a folder of your choice eg. ..../ham Extract the spam folders to a folder of your choice eg. ..../spam

In the python script, set these directories as:

hamdir = r'....\ham (remove 'r' for MacOS) spamdir = r'.....\spam (remove 'r' for MacOS)

You can now run the script.

note: the n_jobs argument in gridsearch allows it to utilize all cores and 100% of your CPU. If you are experiencing problems with this section remove the n_jobs argument entirely.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published