From edc61a35541c8a1e2a2bbc16a4bf5761c050ddfa Mon Sep 17 00:00:00 2001 From: Olivier Grisel Date: Tue, 31 Aug 2021 15:13:16 +0200 Subject: [PATCH] Create 2021-08-24.md --- 2021/2021-08-24.md | 51 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 2021/2021-08-24.md diff --git a/2021/2021-08-24.md b/2021/2021-08-24.md new file mode 100644 index 0000000..0646893 --- /dev/null +++ b/2021/2021-08-24.md @@ -0,0 +1,51 @@ + +### Gael + +_Off this week_ + +### Olivier + +- Follow-up PRs for feature names consistency checks in meta-estimators: linked at the bottom of [#1810](https://github.com/scikit-learn/scikit-learn/pull/18010) +- Profiled [#20254](https://github.com/scikit-learn/scikit-learn/pull/20254) to understand the discrepancy with scikit-learn-intelex and found bugs in profilers (reported upstream) +- Make viztracer work with joblib/loky: [#299](https://github.com/joblib/loky/pull/299) +- Reading [Uncertainty in Gradient Boosting via Ensembles](https://arxiv.org/abs/2006.10562) pointed to me by Leo from Dataiku: need stochastic gradient boosting + `predict_dist` for regression? +- Next: continue consistency checks +- Next: resume work on MOOC +- Next: resume work on proof of concept numba-dppy + +### Mathis +- sklearn_benchmarks: + - Work on fixing reporting bugs and improve overall reporting + - Ensure deterministic results + - Issue with joblib Memory cache not hit + - Adapt benchmark script to run benchmarks with SLURM + - Run benchmarks on reference datasets using declarative pipeline specification + +### Julien + +_Off this week_ + +### Jérémie + +_Off this month_ + +### Guillaume + +_Off this week_ + +### Loïc + +_Off this week_ + +### Arturo + +- Continued working on the MOOC (finished Module 6). +- Created issues [#439](https://github.com/INRIA/scikit-learn-mooc/issues/439), [#441](https://github.com/INRIA/scikit-learn-mooc/issues/441), [#442](https://github.com/INRIA/scikit-learn-mooc/issues/442) and [#443](https://github.com/INRIA/scikit-learn-mooc/issues/443), as well as PR [#440](https://github.com/INRIA/scikit-learn-mooc/pull/440) +- Made notes (for myself) classifying notebooks depending on the data-set split and whether they use any scoring or not (will result in new issues coming soon) +- Continued cross-linking forum posts with issues in GitHub and [Loïc's notes](https://notes.inria.fr/rgSzYtubR6uSOQIfY9Fpvw#) to find problematic questions and how to improve them (Issue [#432](https://github.com/INRIA/scikit-learn-mooc/issues/432)) + +### TODO Next + +- Work on Loïc's notes (Olivier) +- Resume PoC with numba-dppy (Olivier) +- Create github issues out of forum posts and problematic quizzes (Arturo, Olivier?)