
Caphace-Ethan/Twitter-Data-Analysis

 
 


Africa Covid19 Twitter-Data Analysis

This project is divided into two parts:

  1. Business understanding & Data Processing
  2. Data Analysis (Topic Modeling and Sentiment Analysis)

1. Business understanding & Data Processing

This involves understanding the requirements for the project, collecting and understanding the data, and preparing the data.

The data source is Twitter. Data preparation involves the following: i. handling NA values, ii. handling missing values, iii. standardizing/scaling the data, iv. feature engineering, v. dimensionality reduction.
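As a rough illustration of the first few preparation steps, the sketch below handles missing values, standardizes numeric columns, and adds one engineered feature using only the Python standard library. The records, the field names (`text`, `retweet_count`, `favorite_count`), and the helper functions are all hypothetical and are not taken from the repository's code; dimensionality reduction is left out.

```python
from statistics import mean, pstdev

# Hypothetical tweet records; the field names are illustrative only.
tweets = [
    {"text": "Stay safe", "retweet_count": 10, "favorite_count": None},
    {"text": None, "retweet_count": 4, "favorite_count": 2},
    {"text": "Vaccines arrive", "retweet_count": None, "favorite_count": 6},
    {"text": "Wash hands", "retweet_count": 1, "favorite_count": 4},
]


def drop_missing_text(rows):
    """Drop rows whose text is missing or empty, since they cannot be analysed."""
    return [row for row in rows if row["text"]]


def fill_missing_numeric(rows, column):
    """Replace missing numeric values with the mean of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = mean(observed) if observed else 0.0
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


def standardize(rows, column):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    values = [row[column] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.0.
        return [{**row, column: 0.0} for row in rows]
    return [{**row, column: (row[column] - mu) / sigma} for row in rows]


def add_text_length(rows):
    """Feature engineering: add the character length of each tweet."""
    return [{**row, "text_length": len(row["text"])} for row in rows]


cleaned = drop_missing_text(tweets)
for col in ("retweet_count", "favorite_count"):
    cleaned = fill_missing_numeric(cleaned, col)
    cleaned = standardize(cleaned, col)
cleaned = add_text_length(cleaned)
```

Mean imputation and z-scoring are only one possible choice here; other strategies (dropping rows, median imputation, min-max scaling) may suit the actual data better.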

The following tasks were carried out to complete Part 1 of this project:

  1. The repository was forked from 10 Academy (https://github.com/10xac/Twitter-Data-Analysis).
  2. A fix_bug branch was created to fix the bugs in fix_clean_tweets_dataframe.py and fix_extract_dataframe.py.
  3. In the fix_bug branch, fix_clean_tweets_dataframe.py and fix_extract_dataframe.py were renamed to clean_tweets_dataframe.py and extract_dataframe.py, respectively.
  4. The bugs in clean_tweets_dataframe.py and extract_dataframe.py were fixed.
  5. Multiple pushes were made while fixing and testing the code; once the fixes were complete, the fix_bug branch was merged into the main and master branches.
  6. A new branch, make_unittest, was created for writing a unit test for extract_dataframe.py.
  7. After the unit test was written, the make_unittest branch was merged into the main branch.
  8. Travis CI was set up for the repository so that the unit tests in tests/*.py run automatically whenever code is pushed or a branch is merged into the main branch.
  9. All tests passed.
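
To give a sense of what a unit test of the kind described in steps 6 to 8 might look like, here is a minimal sketch using Python's built-in unittest module. The function `find_hashtags` and the tweet dictionaries are hypothetical stand-ins; the actual interface of extract_dataframe.py is not shown in this README and may differ.

```python
import unittest


def find_hashtags(tweet):
    """Hypothetical helper: return the hashtag texts of one tweet dict."""
    entities = tweet.get("entities") or {}
    return [tag["text"] for tag in entities.get("hashtags", [])]


class TestFindHashtags(unittest.TestCase):
    def test_returns_hashtag_texts(self):
        tweet = {"entities": {"hashtags": [{"text": "Covid19"}, {"text": "Africa"}]}}
        self.assertEqual(find_hashtags(tweet), ["Covid19", "Africa"])

    def test_missing_entities_gives_empty_list(self):
        # A tweet without an "entities" key should not raise an error.
        self.assertEqual(find_hashtags({"full_text": "no tags"}), [])


def run_tests():
    """Load and run the test case above, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFindHashtags)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` runs both cases programmatically. Saved as a file whose name starts with `test` in the tests folder, a module like this could also be run with `python -m unittest discover tests`, which is one way a CI service could be configured to invoke the suite.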

Screenshots of some outputs from these tasks are included in the results folder.
