This project was my Honors Thesis at Fisk University. The goal was to build a model for analyzing bias within news outlets. The results of my research can be seen in the AnalyisisReport_date.json files. I have created working spiders for the New York Times, Fox News, and NPR. The Drudge Report spider does not work because the site is rendered with JavaScript. Libraries such as Splash can overcome this barrier.
Using Scrapy, TextBlob, Newspaper3k, and cron jobs, I built a model for analyzing the biases of news outlets. NPR was the most biased of the four news outlets, and the New York Times was the least. However, the data was collected over only a few weeks, so these rankings should be read as preliminary. The goal of this research was to programmatically build a model for analyzing bias in news outlets.
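As a rough illustration of how per-article scores could be turned into an outlet ranking, here is a minimal sketch. It assumes that bias is approximated by TextBlob's subjectivity score, which may differ from the measure used in the thesis. The function names `score_article` and `rank_outlets`, the placeholder outlet names, and the sample numbers are all hypothetical and are not taken from the project's data.

```python
from collections import defaultdict
from statistics import mean


def score_article(text):
    """Return (polarity, subjectivity) for one article body using TextBlob."""
    # Imported lazily so the aggregation below also runs without TextBlob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


def rank_outlets(records):
    """Average subjectivity per outlet, most subjective first.

    `records` is an iterable of (outlet, subjectivity) pairs.
    """
    by_outlet = defaultdict(list)
    for outlet, subjectivity in records:
        by_outlet[outlet].append(subjectivity)
    averages = {outlet: mean(scores) for outlet, scores in by_outlet.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for placeholder outlets; this is NOT the thesis data.
sample = [
    ("outlet_a", 0.25),
    ("outlet_a", 0.75),
    ("outlet_b", 0.5),
    ("outlet_b", 1.0),
    ("outlet_c", 0.125),
]
ranking = rank_outlets(sample)
print(ranking)
```

In practice, each spider's article text would be passed through `score_article`, and the resulting subjectivity values would be fed to `rank_outlets`. Averaging over more articles and a longer collection window should make the ranking less sensitive to individual stories.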
A copy of my research paper can be found here: https://docs.google.com/document/d/1L5IgNjKwVHu-0N6DU04HQon-F2py-ibDitPpNXevfX8/edit