Google Season of Docs 2020 Project Ideas

Maja Gwóźdź edited this page Apr 27, 2020 · 8 revisions

Welcome, and thank you for taking an interest in NumPy! On this page we first provide some context about NumPy and the current state of its documentation, and then describe our project ideas in detail. These ideas are not the only ones we are interested in; we'd love to talk to you if you have your own idea for a project that you're excited about and that you think would help improve NumPy's documentation or online presence.

Please note that Season of Docs is a program for writers who can show previous technical writing experience in their application. If you are a student, please consider Google Summer of Code instead.

About NumPy: NumPy is widely used in nearly every field of science and engineering. Its user base spans from beginning coders to experienced researchers doing state-of-the-art scientific and industrial R&D. NumPy is the universal standard for working with numerical data in Python, and is at the core of the scientific Python and PyData ecosystems. It provides ndarray, a homogeneous n-dimensional array object with methods to efficiently operate on it. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages. The API and concepts are also replicated in deep learning frameworks (e.g. TensorFlow, PyTorch) and in array computing libraries for other programming languages.
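
For readers new to the library, the following is a minimal sketch of what "a homogeneous n-dimensional array object" means in practice. The variable names are illustrative only and are not taken from the NumPy documentation.

```python
import numpy as np

# A 2-D ndarray: every element shares a single dtype (here float64).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Basic metadata carried by the array object.
n_dims, shape, dtype = a.ndim, a.shape, a.dtype

# Operations apply to whole arrays at once, with no explicit Python loop.
doubled = a * 2

# Methods can reduce along a chosen axis; axis=0 sums down each column.
column_sums = a.sum(axis=0)

# Mixed integer and float input is stored under one common dtype.
mixed = np.array([1, 2.5])
```

Here `mixed` ends up as a float64 array, which illustrates the "homogeneous" part of the description: the integer 1 is stored as a float alongside 2.5.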

The state of NumPy's documentation: NumPy was a 100% volunteer project until the first half of 2018 (we now have 2 full-time and a few part-time paid developers). Recently, the project was awarded a grant from the Chan Zuckerberg Initiative (through its Essential Open Source Software for Science (EOSS) program), with the goal of improving the current governance structure and allowing some people involved in the project to focus on documentation and community building. A first step in this direction is NumPy Enhancement Proposal (NEP) 44, which describes our plans for the future of NumPy's documentation and establishes a dedicated Documentation Team; forming that team is currently underway. However, the Documentation Team is not a team of technical writers; rather, its members are developers who split their time between writing code and writing high-quality documentation.

NumPy participated in Google Season of Docs in 2019. This resulted in the excellent NumPy: the absolute basics for beginners tutorial by technical writer Anne Bonner. However, NumPy is a big project and there is still much work to do. We have mostly complete reference documentation for each function and class exposed to users, although some functions are missing a usage example. Also, many explanations are mixed in with the reference documentation, and users would benefit greatly from a dedicated "Explanations" section. We also plan to build a gallery of high-level tutorials and how-tos that gives users a larger pool of resources and addresses different kinds of users and use cases. Improving the quality of NumPy's documentation will be very valuable to our millions of users!

How NumPy's documentation is built: All our documentation and websites are built with Sphinx. Sphinx generates static websites (making them easy to deploy) and provides extensive functionality to transform plain-text reStructuredText documents to HTML, as well as to extract and cross-link documentation automatically from docstrings in Python source code. Reference documentation follows the NumPy docstring standard. A detailed guide on how to document functions, classes and other objects can be found here, and instructions for building the documentation here.
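
To give writers a feel for the docstring layout that Sphinx extracts, here is a small sketch written in the style of the NumPy docstring standard. The function `rescale` is hypothetical and exists only for illustration; it is not part of NumPy, and the section selection shown (Parameters, Returns, Examples) is just a common subset of what the standard allows.

```python
def rescale(values, factor=1.0):
    """
    Multiply each value by a constant factor.

    Parameters
    ----------
    values : sequence of float
        Input numbers to rescale.
    factor : float, optional
        Multiplier applied to every element. Default is 1.0.

    Returns
    -------
    list of float
        The rescaled values, in the same order as the input.

    Examples
    --------
    >>> rescale([1.0, 2.0], factor=3.0)
    [3.0, 6.0]
    """
    # Hypothetical helper: plain Python keeps the example dependency-free.
    return [v * factor for v in values]
```

The Examples section uses interactive-prompt notation, so the usage example doubles as a doctest. This is the kind of usage example that some reference pages are still missing.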

NumPy's approach to documentation work: The documentation and core development teams drive and decide on documentation changes as they are proposed. Documentation tasks and issues are maintained on our GitHub issue tracker. Changes to the documentation are made via pull requests on GitHub and reviewed with our standard review process, which is the same for documentation and code (see our contributing guide). For any new feature added to NumPy, comprehensive reference documentation, including usage examples, must be added at the same time as the code. New educational content, such as Tutorials and How-Tos, can be proposed directly as issues or discussed on the mailing list first.

Current documentation:

Contact

As a community-driven project, we try to have all conversations about NumPy in public. The main venue for discussions related to the development of NumPy (which includes GSoD) is the numpy-discussion mailing list: https://mail.python.org/mailman/listinfo/numpy-discussion. Please register and post to that list to discuss a GSoD proposal or idea. If you want to discuss something in private first, please contact the NumPy GSoD coordinators at numpy-scipy-gsod@googlegroups.com.

Project idea: High level restructuring and end user focus

NumPy serves many kinds of users: students new to programming or Python, educators, researchers, domain experts in one of the areas that NumPy covers, data scientists, library developers, packagers, and more. NumPy's documentation is also huge (the PDF version of the last release is over 1500 pages). The challenge is to provide ways to guide those users to the parts of the documentation most relevant to them. We would love to work with a technical writer who is able to help us address this challenge.

Possible topics include:

  • Creating high-level documentation, such as Tutorials and How-Tos, covering topics that are missing from the official documentation.
  • Creating a new "Explanations" section, organizing the content currently scattered through the reference documentation.
  • Rewriting a section of the User Guide (that can then serve as a template for other/new sections).
  • Adding images or graphics to enhance the textual explanations.
  • Updating out-of-date references and refactoring content to latest best practices.
  • Integrating the documentation more cleanly with the growing body of online literature on scientific computing, data science, resources for learning Python and NumPy, and performance considerations when writing code.
  • Consulting the SciPy user survey (conducted in 2019) to get an overview of the most common features and improvements the community would like to see implemented in the future.

We assume that the technical writer who takes on this project is not yet familiar with NumPy and its community. This means time will need to be built into the start of the project plan for getting familiar with both. Mentors will be able to provide walkthroughs and set up time for discussions or interviews with different kinds of end users and content experts. We are open to writers' suggestions and contributions.

Mentors: Melissa Mendonca Weber, Ralf Gommers, Matti Picus

Project idea: external tutorial content curation and adaptation

A lot of good educational material about NumPy can be found in many places: blogs, websites, courses, and complete, freely available online books. Many of the authors may be interested in finding a wider audience for their content, in contributing it to the official NumPy documentation, or both. Therefore, finding this content and obtaining permission to adapt or rewrite it could be very impactful. This project will require editorial skills as much as technical writing skills. An interest in networking and building connections with other writers and educators will also be very helpful for this project.

Mentors: Melissa Mendonca Weber, Ralf Gommers

Relevant material that is not yet linked above:

(Org application deadline: May 4, 2020; technical writer exploration phase: May 11 - June 8, 2020)