Google Season of Docs 2021: Submitted Project

Melissa Weber Mendonça edited this page Mar 26, 2021 · 2 revisions

HIGH-LEVEL RESTRUCTURING AND END-USER FOCUS - NumPy

About your organization

NumPy is used in nearly every field of science and engineering. Over 32,000 packages on GitHub depend on NumPy, and 6 million users visit our website every month. Its user base ranges from beginner coders to experienced researchers doing state-of-the-art scientific and industrial R&D. NumPy is the universal standard for working with numerical data in Python and is at the core of the scientific Python and PyData ecosystems. It provides ndarray, a homogeneous n-dimensional array object with methods to operate on it efficiently. The NumPy API is used extensively in pandas, SciPy, Matplotlib, scikit-learn, scikit-image, and most other data science and scientific Python packages. The API and concepts are also replicated in deep learning frameworks (e.g., TensorFlow, PyTorch) and in array computing libraries for other programming languages.

About your project

Your project’s problem

NumPy was a 100% volunteer project until the first half of 2018 (we now have a few part-time paid developers). Recently, the project was awarded two grants from the Chan Zuckerberg Initiative through its Essential Open Source Software for Science (EOSS) program. In 2020, the focus was on improving the project's governance structure and allowing some people involved in the project to concentrate on documentation and community building. In 2021, our focus on documentation continues, and we want to solidify the Documentation Team.

Following the ideas outlined in NumPy Enhancement Proposal (NEP) 44, which describes our plans for the future of NumPy's documentation, we consider the documentation to be divided into four parts: Reference, Tutorials, How-tos and Explanations (which we call NumPy Fundamentals). Although we have mostly complete reference documentation for each function and class exposed to users, some of them lack usage examples. Also, many explanations are mixed in with the reference documentation, and users would benefit greatly from an expansion of the NumPy Fundamentals section. There is also some duplication of content in our User Guide, which hinders searchability and usability. Finally, preliminary results from our 2020 User Survey point to documentation as one of the main areas for improvement in NumPy, and there is a clear demand for tutorials and how-tos for users with different experience levels and domain-specific knowledge. Improving the quality of NumPy documentation will be very valuable to our millions of users!

Your project’s scope

NumPy serves many kinds of users: students new to programming or Python, educators, researchers, domain experts in one of the areas that NumPy covers, data scientists, library developers, packagers, and more. NumPy's documentation is also huge (the PDF version of the last release is over 1500 pages). The challenge is to provide ways to guide those users to the parts of the documentation most relevant to them. We would love to work with a technical writer who can help us address this challenge.

Our priorities for this project are:

  • Creating high-level documentation, in the Tutorial or How-to format, covering topics that are missing from the official documentation.
  • Populating the "NumPy Fundamentals" section, organizing the content currently scattered through the reference documentation.
  • Removing duplication from the User Guide to improve searchability and discoverability for users.

Other possible topics include:

  • Identifying the scope of the NumPy documentation shipped with the code.
  • Adding images or graphics to enhance the textual explanations.
  • Consulting the SciPy user survey (conducted in 2019) and the NumPy user survey (conducted in 2020) to get an overview of the features and improvements the community would most like to see implemented in the future.

Familiarity with NumPy and its community is not required for the technical writer who will take on this project. This means we anticipate that some time will be spent on onboarding and on learning the technical processes involved in working on our documentation. Two experienced mentors will be available to provide walkthroughs and to set up time for discussions or interviews with different kinds of end-users and content experts as needed. We are open to writers' suggestions and contributions.

Measuring your project’s success

Currently, NumPy's GitHub issue tracker counts over 100 issues related to documentation. Some of them involve rewriting documentation or making it clearer, but many involve duplicate, missing or misplaced documentation. Similarly, many of the questions about NumPy on Stack Overflow involve content that may be documented but is not easy to find.

A successful project would involve either creating new content based on suggestions from the community (for example, documentation on Internal representation of NumPy arrays or How to navigate the NumPy codebase) or identifying existing content in the Reference documentation that can be moved or reorganized into high-level Explanations or How-tos.

We would consider the project successful if:

  • At least two pages are added to NumPy Fundamentals.
  • Either two How-tos or one How-to and one Tutorial are created.

Project budget

Budget item      | Amount  | Running Total | Notes/justifications
Technical writer | 5000.00 | 5000.00       |
TOTAL            | 5000.00 |               |

Additional information

NumPy participated in Google Season of Docs in 2019 and 2020. Last year, we had two technical writers working with NumPy. You can see their reports here and here. Both writers are now active participants in our Documentation Team.

The mentors for this project, Melissa Mendonça and Ross Barnowski, are both members of the NumPy Documentation Team. Both have experience mentoring and creating documentation, as well as working with the tooling and infrastructure used to build the NumPy documentation. Melissa was also a mentor for last year's Google Season of Docs.

The NumPy documentation and development teams drive and decide on documentation changes as they are proposed. We also welcome ideas around accessibility, usability and inclusion when creating documentation for NumPy. The Documentation Team holds bi-weekly meetings to openly discuss goals and projects with the NumPy community.

Current documentation: