Ffisegydd edited this page Nov 26, 2014 · 1 revision

nidaba

Nidaba is a data analytics project devoted to analysing and making use of the freely available information on Stack Overflow, particularly where it relates to the Python programming language.

Stack Overflow content is licensed under CC BY-SA 3.0, which means we have access to a fantastic wealth of programming knowledge, currently including 300,000+ Python questions. Project Nidaba aims to use this data to help members of the Stack Overflow Python community.

A big part of Nidaba will be the analysis of data. There will be broadly two different kinds of analysis: analysis that is directly actionable (e.g. finding duplicates) and analysis that is instead informative (e.g. trends in the sopython data).

Some ideas we have going forward are:

  • Trends in the Python questions/answers with respect to time;
  • Highlighting famous questions and answers;
  • Finding interesting, hidden gems and shining a spotlight on them;
  • Suggesting possible duplicate questions automatically based on similar content;
  • Predicting the likelihood of closure of questions based on their quality;
  • Identifying spam questions so they can be quickly closed and deleted;
  • FGITW (Fastest Gun in the West) analysis by keeping track of edits made inside the grace period;
  • Looking at "relationships" of people that interact via questions/answers/comments.
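
As an illustration of the duplicate-suggestion idea above, here is a minimal sketch that compares questions by the cosine similarity of their word counts. It is not Nidaba's implementation: the function names, the question ids, and the 0.6 threshold are all assumptions made for the example, and a real system would likely need better text features and a faster way to compare 300,000+ questions than checking every pair.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_duplicates(questions, threshold=0.6):
    """Return (id_a, id_b, score) for pairs scoring at or above threshold.

    `questions` maps a (hypothetical) question id to its text.
    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    vectors = {qid: Counter(tokenize(text)) for qid, text in questions.items()}
    pairs = []
    # Brute-force comparison of every pair; fine for a demo, too slow at scale.
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    sample = {
        1: "How do I reverse a list in Python?",
        2: "How do I reverse a list in python",
        3: "What does the yield keyword do?",
    }
    for id_a, id_b, score in suggest_duplicates(sample):
        print(id_a, id_b, round(score, 2))
```

Pairs flagged this way would only be candidates for a human to review, since plain word overlap cannot tell whether two questions are really asking the same thing.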

Project Nidaba is still a young project and we are constantly looking for ideas, advice, and help. Please contact us by email or pop into the sopython chatroom if you want to chat.

Links
