
A list of topics for Google Summer of Code (GSoC) 2011

GaelVaroquaux edited this page Mar 26, 2011 · 19 revisions


Online learning

Mentor : O. Grisel

Goal : Devise an intuitive yet efficient API dedicated to the incremental fitting of some scikit-learn estimators (on an infinite stream of samples for instance).

See this thread on the mailing list for a discussion of such an API. Design decisions will be made by implementing or adapting three concrete models:

  • text feature extraction
  • online clustering with sequential k-means
  • generalized linear model fitting with Stochastic Gradient Descent (both for regression and classification)
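To make the discussion concrete, here is a minimal sketch of what incremental fitting could look like for the sequential k-means case. It is pure Python and is not scikit-learn code: the class name `SequentialKMeans` and the method name `partial_fit` are assumptions made for illustration, and seeding the centers with the first samples is just one simple choice among several.

```python
class SequentialKMeans:
    """Toy incremental estimator. `partial_fit` is a hypothetical method name."""

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters
        self.centers = []  # one list of floats per cluster
        self.counts = []   # number of samples assigned to each cluster so far

    def partial_fit(self, samples):
        """Update the centers with one mini-batch (any iterable of sequences)."""
        for x in samples:
            x = [float(v) for v in x]
            if len(self.centers) < self.n_clusters:
                # Assumption: seed the centers with the first n_clusters samples.
                self.centers.append(x)
                self.counts.append(1)
                continue
            j = self._nearest(x)
            self.counts[j] += 1
            rate = 1.0 / self.counts[j]
            # Move the winning center toward x, keeping a running mean.
            self.centers[j] = [c + rate * (v - c)
                               for c, v in zip(self.centers[j], x)]
        return self

    def _nearest(self, x):
        # Index of the center with the smallest squared Euclidean distance.
        dists = [sum((c - v) ** 2 for c, v in zip(center, x))
                 for center in self.centers]
        return dists.index(min(dists))

    def predict(self, samples):
        return [self._nearest([float(v) for v in x]) for x in samples]


# The stream is consumed batch by batch; it never has to fit in memory.
model = SequentialKMeans(n_clusters=2)
stream = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]]
for batch in stream:
    model.partial_fit(batch)
```

The point of the sketch is the calling convention: the model keeps only its centers and counts between calls, so the same loop works on a finite dataset read in chunks or on an unbounded stream.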

Dictionary Learning a.k.a. Sparse Coding

Mentors : Gael Varoquaux, Alex Gramfort, Alex Passos

Some useful resources with a compatible license:

Vlad candidate?

Boosting

Mentor : Satra

Manifold learning

Mentor : Fabian Pedregosa

Random forest

Mentor : Satra

(There is already a preliminary implementation in my fork.) I would combine this with boosting/bagging.

Locality Sensitive Hashing

Mentor : Mathieu Blondel?

There is an LSH implementation in pybrain (pybrain/supervised/knn/lsh)

Command line interface

Mentor : ?

Interaction with mldata.org

Mentor : ?
