YukiDayDreamer/Twitter-Analysis

Twitter-Analysis

Update 2018.06.09: The demo website (http://purduetweets.azurewebsites.net) is no longer available because Microsoft Azure does not provide a free MySQL database anymore 😞


I tried to do some cool stuff 😈 with Twitter data. This is a dashboard-style Twitter analysis platform built with JavaScript and PHP. It uses tweets posted in West Lafayette, home of Purdue University, between 2014 and 2015.

There are two projects in the repository:

  1. Individual User Pattern Analysis
  2. Event Detection

Individual User Pattern Analysis

Demo of Individual Pattern

This project tries to find the spatial, temporal and textual patterns of the most active users on campus.

Dashboard

Dashboard of Individual

Workflow

  1. Group the tweets of an individual user by hour, then apply DBSCAN to detect clusters.

  2. Estimate the probability that the user appears in each cluster, using a method similar to Huang's work, and compute each cluster's center, radius, keywords and other metadata.

  3. Build the tweeting frequency bar chart by gathering clusters of the same type, which shows the structure of the user's activity. If "Frequency" dominates the bar chart, the user is likely a nerd who likes to stay at specific places like an office or apartment (like me 😂). If "Rarely" dominates the bar chart, the user likes to show up at many different places as a social butterfly~ (what I want to be!)
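
Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the code used in this repository: the tweet shape (`lat`, `lon`, `time`), the function names, the choice of UTC hours, and the definition of radius as the largest distance from the mean center are all assumptions made for the example.

```javascript
// Hypothetical tweet shape: { lat, lon, time }, where time is an ISO 8601 string.

// Great-circle distance in meters between two points (haversine formula).
function haversineMeters(a, b) {
  const R = 6371000; // mean Earth radius in meters
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Step 1a: put tweets into 24 buckets by UTC hour of day (assumed convention).
function groupByHour(tweets) {
  const buckets = Array.from({ length: 24 }, () => []);
  for (const t of tweets) {
    buckets[new Date(t.time).getUTCHours()].push(t);
  }
  return buckets;
}

// Step 1b: plain DBSCAN. Returns one label per point:
// -1 for noise, 0..k-1 for cluster membership.
function dbscan(points, epsMeters, minPts) {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels = new Array(points.length).fill(UNVISITED);
  // Brute-force neighborhood query; the point itself counts as a neighbor.
  const neighborsOf = (i) => {
    const out = [];
    for (let j = 0; j < points.length; j++) {
      if (haversineMeters(points[i], points[j]) <= epsMeters) out.push(j);
    }
    return out;
  };
  let cluster = 0;
  for (let i = 0; i < points.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighborsOf(i);
    if (seeds.length < minPts) {
      labels[i] = NOISE;
      continue;
    }
    labels[i] = cluster;
    const queue = seeds.slice();
    while (queue.length > 0) {
      const j = queue.shift();
      if (labels[j] === NOISE) labels[j] = cluster; // border point, not expanded
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const reachable = neighborsOf(j);
      if (reachable.length >= minPts) queue.push(...reachable); // core point
    }
    cluster++;
  }
  return labels;
}

// Step 2 (partial): mean center and radius of one cluster's points.
function summarizeCluster(points) {
  const center = {
    lat: points.reduce((sum, p) => sum + p.lat, 0) / points.length,
    lon: points.reduce((sum, p) => sum + p.lon, 0) / points.length,
  };
  const radius = Math.max(...points.map((p) => haversineMeters(center, p)));
  return { center, radius };
}
```

A call such as `dbscan(groupByHour(tweets)[13], 50, 3)` would then cluster one user's 13:00 tweets with a 50 m neighborhood and at least 3 points per core neighborhood. Both parameter values are placeholders, not the ones used for the dashboard.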

Sample Cluster

Sample Cluster

Frequency Bar Chart

Time Bar

Event Detection

Demo of Event Detection

This project tries to detect events on campus. The idea for event detection is based on this definition of an event:

Some people around a place in a specific period talking about something related to a topic (or topics)

Dashboard

Dashboard of Individual

Workflow

  1. Group tweets by day, then generate a monthly line chart of the number of tweets and users.

  2. Unlike the plain DBSCAN used for the individual pattern, I apply ST-DBSCAN to cluster each day's tweets. This reveals both the spatial and the temporal pattern of each cluster.

  3. Count word frequency, then apply LDA to find potential topics in the cluster and analyze the structure of every tweet. Although some clusters only contain rambling words (even after filtering with a list of stop words), some important events, like the Gunshot at Campus (1.21.2014), the Super Bowl (2.2.2014) and the Graduation Ceremony (5.16.2014 ~ 5.18.2014), stand out clearly in the text. It is also able to detect unknown events.
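
The difference between DBSCAN and ST-DBSCAN in step 2 can be illustrated with a short sketch. It is a simplified version under stated assumptions, not the implementation behind the dashboard: two tweets count as neighbors only when they are close in both space and time, and the further refinements of the published ST-DBSCAN algorithm are left out. The tweet shape (`lat`, `lon`, `ts`), the function names and the thresholds are hypothetical.

```javascript
// Hypothetical tweet shape: { lat, lon, ts }, where ts is a Unix timestamp in seconds.

// Approximate distance in meters using an equirectangular projection,
// which is accurate enough at campus scale.
function distanceMeters(a, b) {
  const R = 6371000; // mean Earth radius in meters
  const toRad = (deg) => (deg * Math.PI) / 180;
  const x = toRad(b.lon - a.lon) * Math.cos(toRad((a.lat + b.lat) / 2));
  const y = toRad(b.lat - a.lat);
  return R * Math.sqrt(x * x + y * y);
}

// A neighbor must be within epsMeters in space AND within epsSeconds in time.
function stNeighbors(points, i, epsMeters, epsSeconds) {
  const out = [];
  for (let j = 0; j < points.length; j++) {
    const nearInTime = Math.abs(points[i].ts - points[j].ts) <= epsSeconds;
    if (nearInTime && distanceMeters(points[i], points[j]) <= epsMeters) {
      out.push(j);
    }
  }
  return out;
}

// DBSCAN-style expansion over the spatio-temporal neighborhood.
// Returns one label per point: -1 for noise, 0..k-1 for clusters.
function stDbscan(points, epsMeters, epsSeconds, minPts) {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels = new Array(points.length).fill(UNVISITED);
  let cluster = 0;
  for (let i = 0; i < points.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = stNeighbors(points, i, epsMeters, epsSeconds);
    if (seeds.length < minPts) {
      labels[i] = NOISE;
      continue;
    }
    labels[i] = cluster;
    const queue = seeds.slice();
    while (queue.length > 0) {
      const j = queue.shift();
      if (labels[j] === NOISE) labels[j] = cluster; // border point, not expanded
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const reachable = stNeighbors(points, j, epsMeters, epsSeconds);
      if (reachable.length >= minPts) queue.push(...reachable); // core point
    }
    cluster++;
  }
  return labels;
}
```

With this neighborhood definition, two groups of tweets posted at the same spot several hours apart end up in separate clusters, whereas a purely spatial DBSCAN would merge them.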

Monthly Pattern

Monthly Pattern
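
The daily counts behind a chart like this could be computed along the following lines. This is only a sketch: the tweet shape (`user`, `time`), the function name `dailyCounts` and the use of UTC calendar days are assumptions, not details taken from the project.

```javascript
// Hypothetical tweet shape: { user, time }, where time is an ISO 8601 string.

// Count tweets and distinct users per UTC calendar day, sorted by date.
function dailyCounts(tweets) {
  const days = new Map(); // "YYYY-MM-DD" -> { tweets: number, users: Set }
  for (const t of tweets) {
    const day = new Date(t.time).toISOString().slice(0, 10);
    if (!days.has(day)) days.set(day, { tweets: 0, users: new Set() });
    const entry = days.get(day);
    entry.tweets += 1;
    entry.users.add(t.user);
  }
  return [...days.entries()]
    .sort((a, b) => a[0].localeCompare(b[0]))
    .map(([day, entry]) => ({
      day,
      tweets: entry.tweets,
      users: entry.users.size,
    }));
}
```

Filtering the result by the `YYYY-MM` prefix of `day` gives the series for one month.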

Pick a cluster

Main Map

Its spatial pattern shown as a heatmap

Heat Map

Its temporal pattern

Temporal Pattern

Dynamic map of the spatial pattern in different periods (it is not playable on GitHub, so I highly recommend having a look at the DEMO!!)

Dynamic Map

Word frequency in descending order

Word Freq
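
A ranking like this can be produced with a simple counter. The sketch below is an illustration only: the tokenizing regular expression, the tiny stop-word list and the function name `wordFrequency` are assumptions, and a real stop-word list would be much longer.

```javascript
// A tiny illustrative stop-word list (assumed, not the list used in the project).
const STOP_WORDS = new Set([
  "a", "an", "the", "is", "at", "in", "on", "to", "of", "and", "i", "my",
]);

// Count words across tweet texts, skip stop words,
// and return [word, count] pairs sorted by count in descending order.
function wordFrequency(texts, stopWords = STOP_WORDS) {
  const counts = new Map();
  for (const text of texts) {
    // Lowercase, then keep runs of letters, digits, #, @ and apostrophes.
    const words = text.toLowerCase().match(/[a-z0-9#@']+/g) || [];
    for (const word of words) {
      if (stopWords.has(word)) continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // Ties are broken alphabetically so the output order is stable.
  return [...counts.entries()].sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
}
```

For example, `wordFrequency(["Graduation at Purdue", "Purdue graduation today", "Go Purdue"])` ranks `purdue` first with a count of 3, followed by `graduation` with 2.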

Sample original texts

Sample Texts

LDA topics and structure of Tweets

LDA Topics

LDA Sentences

Acknowledgement: Thanks to CanvasJS for providing the chart API.

Enjoy! 💥

PS: Fork and Star are reallllllllllly welcome ~~~
