BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml #45

yarikoptic · 2022-09-30T13:54:58Z

To make template better tested and IMHO we should start aiming for type annotation of code in (new) extensions, although datalad itself is not yet annotated.

@jwodder - is it feasible to aim for downstream libraries (like datalad extensions) to be type annotated although only small portion of datalad itself is type annotated ATM?
Could you help finishing up this PR (type annotations for that small datalad_helloworld)?

@mih I also quite like pre-commit with black now as used in dandi-cli and other projects. May be we could/should introduce that to extensions to encourage code/imports consistency etc?

jwodder · 2022-09-30T13:59:01Z

@yarikoptic

is it feasible to aim for downstream libraries (like datalad extensions) to be type annotated although only small portion of datalad itself is type annotated ATM?

The main problem I see is that, as long as datalad lacks a py.typed file, when mypy type-checks a library that uses datalad, any types & values from datalad will be treated as Any, limiting the amount of checking mypy can do. I'm not sure what would happen if datalad gained a py.typed file before it was completely type-annotated, but I doubt it'd be pretty.

pyproject.toml

codecov · 2022-09-30T14:03:56Z

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.34% 🎉

Comparison is base (fac0fda) 92.68% compared to head (cd2a860) 93.02%.
Report is 19 commits behind head on main.

❗ Current head cd2a860 differs from pull request most recent head 60bfa81. Consider uploading reports for the commit 60bfa81 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #45      +/-   ##
==========================================
+ Coverage   92.68%   93.02%   +0.34%     
==========================================
  Files           5        5              
  Lines          41       43       +2     
==========================================
+ Hits           38       40       +2     
  Misses          3        3

Files Changed	Coverage Δ
datalad_helloworld/hello_cmd.py	`88.88% <100.00%> (+0.88%)`	⬆️
datalad_helloworld/tests/test_register.py	`100.00% <100.00%> (ø)`

☔ View full report in Codecov by Sentry.

📢 Have feedback on the report? Share it here.

yarikoptic · 2023-01-30T05:06:37Z

remove typing workflow -- could not figure out how to ignore _version.py . Left possibility to run it though via typing since IMHO we should strive to do typing.

yarikoptic · 2023-01-30T05:08:54Z

smth likely in the _datalad_buildsupport changes I added which now makes

subcommand --help doc output not building

make: Entering directory '/home/runner/work/datalad-extension-template/datalad-extension-template/docs'
python -m sphinx -b html -d build/doctrees  -W source build/html
Running Sphinx v6.1.3
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

error: option --cmdsuite not recognized
making output directory... done
[autosummary] generating autosummary for: index.rst
[autosummary] generating autosummary for: /home/runner/work/datalad-extension-template/datalad-extension-template/docs/source/generated/datalad.api.hello_cmd.rst
loading intersphinx inventory from https://docs.python.org/objects.inv...
intersphinx inventory has moved: https://docs.python.org/objects.inv -> https://docs.python.org/3/objects.inv
building [mo]: targets for 0 po files that are out of date
writing output... 
building [html]: targets for 1 source files that are out of date
updating environment: [new config] 2 added, 0 changed, 0 removed
reading sources... [ 50%] generated/datalad.api.hello_cmd
reading sources... [100%] index


Warning, treated as error:
/home/runner/work/datalad-extension-template/datalad-extension-template/docs/source/index.rst:[24](https://github.com/datalad/datalad-extension-template/actions/runs/4040767962/jobs/6946719935#step:6:25):toctree contains reference to nonexisting document 'generated/man/datalad-hello-cmd'
make: *** [Makefile:55: html] Error 2
make: Leaving directory '/home/runner/work/datalad-extension-template/datalad-extension-template/docs'
Error: Process completed with exit code 2.

yarikoptic · 2023-01-31T16:27:01Z

ah -- filed #51, will filter this branch to not bother about build support for now either.

matrss · 2023-02-03T15:00:09Z

remove typing workflow -- could not figure out how to ignore _version.py .

I ran into this issue as well. Just excluding _version.py is not enough, since the module is imported from other files which are type checked, which in turn means that the _version module will be checked. In addition to the exclude I needed to add an override which allows untyped calls and definitions in this module (and others, like _datalad_buildsupport.*), see:
https://github.com/matrss/datalad-getexec/blob/4847a6937f81f461ec4972aff963a7849f667f8b/pyproject.toml#L43-L50

jwodder · 2023-02-03T15:03:46Z

The way I handled ignoring _version in tinuous was to add the following to the mypy config:

[mypy]
exclude = _version\.py

[mypy-tinuous._version]
follow_imports = skip

matrss · 2023-02-03T15:09:40Z

Yes, that should also work. I wasn't aware that you can specify follow_imports per module instead of just globally.

yarikoptic · 2023-02-03T15:51:42Z

ha -- that last setting did the trick seems to me! thank you @jwodder. Pushing now back with typing CI enabled.

yarikoptic · 2023-02-17T02:34:08Z

if no objections are voiced, I will merge it in 2-3 days. IMHO it would improve extensions

mih

Thanks for the PR. I can say that I like the intent.

What I have issues with is the increasing complexity of the template that is furthered here. For those without expert knowledge it is unclear what is important or critical, what is a not-a-datalad-extension-without-it and what is more of a here-is-something-running-that-you-could-keep-if-liked.

All these little things become annoying quite quickly. If I need to maintain a spell checker configuration now, because it simply does not know something that I nevertheless need to write, the convenience is soured. I will leave brief comments on the diff for things that came to my mind in this regard.

mih · 2023-02-17T07:54:34Z

datalad_helloworld/tests/test_register.py

@@ -1,7 +1,7 @@
 from datalad.tests.utils_pytest import assert_result_count


-def test_register():
+def test_register() -> None:


I don't know why this was done, but if it turns out to be required now to type-annotate any test function with -> None, I'd say this is counterproductive. There is very little (if any) value in this, and there are enough reasons already to not write tests. We should not make it even more complicated.

It should be noted that these two are not equivalent in regards to typing: no annotation means a return type of "Any". Since test_* functions in pytest should always return None this might seem superfluous, but if the test code is to be type-checked then these annotations are necessary. I also do not see how this would be another barrier to writing tests; if someone forgot to add the return type they will still write a test first and be nagged by the typing workflow to add a return type later.

Yes. However, my comment is focused on whether extension developers must write type annotated tests. As you point out, all of them would need this. But which practical problem is solved with this additional investment. I am specifically talking about test functions, not type-annotation in general.

IMHO type annotation in the tests would help with what type annotation helps in general - ensuring that we have expected types/know what types any particular code is working with. To some degree tests type annotation increases the benefit from those tests. So the 'practical problem' being solved is the same as solved by tests themselves -- ensuring that code is working as expected (returns expected types etc).

Somewhat recent versions of pytest emit a warning on return of something other than None from test functions. In the future this will become an error: pytest-dev/pytest#7337. In that regard, type annotations seem a bit redundant, other than making this implicit assumption explicit in the code (which, honestly, is one of the main reasons to type annotate anyway).

There are some test functions in the datalad test suite on maint which still yield values, e.g.: https://github.com/datalad/datalad/blob/6c2a573e9cf75f62b4aa8b03fdd346dca8d29820/datalad/tests/test_tests_utils.py#L134.
These instances could have been found using type annotations and mypy (I think they are fixed on master anyway).

I don't really see a reason not to be consistent and just type annotate everything though.

ha -- cool find. Just FTR: we are ok/safe -- that file is for testing old nose based utils etc, that test is "nose style parametric test", and indeed the file was altogether already removed in master. pytest counterpart is https://github.com/datalad/datalad/blob/HEAD/datalad/tests/test_tests_utils_pytest.py .

These instances could have been found using type annotations and mypy (I think they are fixed on master anyway).

how to do that? i.e. that there is some test_ function which returns something not None

how to do that? i.e. that there is some test_ function which returns something not None

If the yielding function would have been annotated with a return type of None, then mypy would emit this error:

datalad/tests/test_tests_utils.py:133: error: The return type of a generator function should be "Generator" or one of its supertypes [misc]

that is, the annotated return type and the real return type do not match.

mih · 2023-02-17T07:55:14Z

.github/workflows/lint.yml

@@ -0,0 +1,26 @@
+name: Linters


It would be good to communicate what this workflow does, and where to look for answers when it fails.

Suggested change

name: Linters

# This workflow runs tox -e lint to trigger checks such as codespell to help ensuring lint free code and docs

name: Linters

?

mih · 2023-02-17T07:56:10Z

.github/workflows/typing.yml

+# As type hinting is not yet fully and consistently done through out the project only
+# some (randomly) selected files (listed in tox.ini) will be checked.  Feel welcome
+# to add extra files to the list there. See https://github.com/datalad/datalad/issues/6884
+# for the overall goal/progress towards type hints coverage in DataLad.


It is unclear to me how type-annotation progress in datalad is a concern for extension authors.

@jwodder can correct me, but my limited understanding is that without type annotation of datalad interfaces, we simply cannot achieve complete type annotation/checking within the extension which would use those interfaces.

pyproject.toml

mih · 2023-02-17T08:05:02Z

setup.cfg

+# https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
+# datalad-specific setting didn't work so we set for all imports.
+# TODO: remove whenever datalad is typed.
+# [mypy-datalad.*]


I am unfamiliar with this setup, but it looks as if the "need to ignore all datalad imports" might be commented out below? It is not obvious to me how this is done.

mih · 2023-02-17T08:07:21Z

tox.ini

@@ -0,0 +1,75 @@
+[tox]


I suspect that this file would need substantial tailoring when adopted for a specific extension. I would appreciate some comments on what is relevant and what not. Right now this communicates to extension developers that they will have to learn about tox first.

added some in now a4d26c4 . Check out if that clarifies it a bit.

_datalad_buildsupport/formatters.py

jsheunis · 2023-02-17T10:31:58Z

I have zero experience with typing, and minor with linting. But FWIW I agree with the sentiment expressed by @mih, especially the ambiguity of

what is a not-a-datalad-extension-without-it and what is more of a here-is-something-running-the-you-could-keep-if-liked.

I am already quite bad at keeping everything green in the extension that I maintain, and having another red cross will be annoying. I don't mean to say that this PR is not useful, rather that the version that gets merged would ideally be the least complex and least restrictive variation from the perspective of extension developers.

mslw · 2023-02-17T10:39:26Z

On one hand, I like how the extension template encourages extension maintainers to use continuous integration by providing appveyor and github actions configuration. And I like how manpage generation "just works" thanks to the boilerplate code for building docs. On the other hand I also feel that it's helpful to keep the extension template relatively bare-bones, as there is a bit of complexity to be grasped already. Perhaps linting tools are something that can be left up to extension maintainers rather than included in the template (that's not to say they are not useful - I began using black and flake8 in recently started redcap extension, although just through manual triggers python-black-buffer, not precommits or anything).

In summary, I'm on the fence about this one, so just sharing my thoughts.

P.s. I guess left between the lines of my comment and discussion above is who the "extension maintainers" or "authors" really are - I think that the extension template targets both maintainers of extensions under DataLad organization, as well as external DataLad users who want to fine-tune their workflows. In the first case, the template is more of a normative document where it makes sense to establish conventions; in the latter, it's more important to reduce complexity and make entry more accessible.

bpoldrack · 2023-02-17T11:46:39Z

In the first case, the template is more of a normative document where it makes sense to establish conventions; in the latter, it's more important to reduce complexity and make entry more accessible.

Agree. I think the problem here is to mix those two roles of this repo. And I think the template was never meant to become that normative source for the datalad-organization universe. Hence, I lean towards @mih's point of view and to not continue to make this the most expensive hello world imaginable.

Does raise the question, though: Do we have or want a place for such things, where our other repos (in fact not only extensions) could pull such workflows and their updates from?

matrss · 2023-02-17T11:58:48Z

I think that the extension template targets both maintainers of extensions under DataLad organization, as well as external DataLad users who want to fine-tune their workflows. In the first case, the template is more of a normative document where it makes sense to establish conventions; in the latter, it's more important to reduce complexity and make entry more accessible.

This might get somewhat off-topic, but since I have recently build a small extension and am not part of the DataLad Team I want to give some input from my point of view re: complexity.

There are some hurdles when setting up a new extension which go beyond just "instantiating" the template:

obviously, updating metadata
setting up readthedocs
setting up appveyor
codeclimate?

I have setup readthedocs, but skipped the last two since I have never used appveyor before and didn't yet find a good reason to investigate further. Instead I have just setup a tox test suite in a GitHub action. Regarding codeclimate, I took a look at their website and couldn't see what it would provide to me, so I didn't bother either.

With that in mind, I think it would be a good idea to make the template work out-of-the-box as much as possible, which in practice would mean to reduce external services as much as possible. For example, the docbuild workflow could publish to a GitHub Page and make readthedocs redundant. The appveyor pipeline could be converted to a GitHub action if feasible and then require no further setup. Some more automated way of updating the metadata could be nice, but there is #42 for that.

Considering this, I do not see much complexity added here. I think it is useful to have some CI workflows that check best-practices via linting and type-checking, as long as they do not introduce even more setup steps. Maybe these could be self-contained GitHub actions instead of introducing another tool (tox) though.

mih · 2023-02-17T14:22:44Z

@matrss Thank you for expressing your view. I think your comments are also point out the discrepancy between the normative and minimal approach that @mslw

… update pyproject.toml To make template better tested and IMHO we should start aiming for type annotation of code in (new) extensions, although datalad itself is not yet annotated.

…ldsupport

…/fixed copy of buildsupport here === Do not change lines below === { "chain": [], "cmd": "codespell -w", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^

@mih

As @mih mentioned -- downgrading is rarely a good thing. I guess it was just a blind copy-paste and then packaging was removed anyways... so now reverting back so diff must be minimized

yarikoptic · 2023-02-17T17:39:25Z

Thank you @matrss @mih @mslw @jsheunis ! @matrss: Indeed this PR does not introduce any need for extra steps on external services to enable any new added here QA feature, so I expect it only to robustify extensions developed with this set of "better practices", even though indeed it might be an additional red cross to address -- usually much easier if not put into the closet of "will do later".
Indeed there is a balance to strike between being minimal normative "howto/what an extension is" and being a pragmatic template to achieve the best result with minimal effort. But I think for the minimal normative view - we have stepped away from it long ago, and this template became more of the latter. But as a result of current shape of it I see lots of duplication (which I hate) of effort (as e.g. @mslw mentioned - enabling flake8 etc) since good practices are desired by everyone really, but then done different ways since done separately, and that makes it inefficient, and harder to fix/contribute across extensions. So it then boils down to us deciding on some common best practices and how to code for them.
IMHO type setting, guaranteed absence of typos, and consistent testing is something to strive for, and that is what this PR proposes. If we do not want any of those features -- let me know which and I remove its support in this PR.

RE @matrss

Maybe these could be self-contained GitHub actions instead of introducing another tool (tox) though.

Well, they are (lint and typing), they just use tox to streamline dependencies installation and possible invocation centralization, so could be done exactly the same way in CI and locally. Having said that, I do feel odd in me hiding codespell within tox's lint, and often I just add codespell workflow which uses codespell action. I can do it that other way too if we decide to proceed here at all.

yarikoptic · 2023-02-17T17:52:33Z

re @mih

If I need to maintain a spell checker configuration now, because it simply does not know something that I nevertheless need to write, the convenience is soured.

In my experience with codespell through the past weeks where I submitted PRs to a number of projects, in some touching over a hundred of files (see e.g CDLUC3/dmptool#428 and cvmfs/cvmfs#3183) I am genuinely surprised on how well codespell has performed (kudos to many contributors who developed it)! With added config file it is trivial to ignore that rare to hit "false positive" and hopefully soon inline annotation would make it more specific. To that extent -- we had codespell CI enabled in datalad core since Sun Apr 25 18:12:03 2021 , and I don't remember much complaints about it severing the convenience. But I think we did not hit much of typos recently, did we?

yarikoptic · 2023-03-08T14:06:45Z

ok, let me ask

do we want type annotations and checking for their correctness in the extensions?
do we want to tolerate typos in our code/docs?

I think knowing answers would help me since I kinda lost track on what you are fighting against here dear @datalad/developers . If not in the template -- everyone would need eventually to introduce proposed checks one way or another into their code.

.github/workflows/typing.yml

jwodder reviewed Sep 30, 2022

View reviewed changes

pyproject.toml Outdated Show resolved Hide resolved

yarikoptic changed the title ~~BK: Add tox.ini with typing,lint (codespell); add workflows for them; update pyproject.toml~~ BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml Jan 30, 2023

yarikoptic mentioned this pull request Jan 31, 2023

Update _datalad_buildsupport #51

Open

yarikoptic force-pushed the enh-tox-etc branch from 743fbf4 to 97258df Compare January 31, 2023 16:29

yarikoptic marked this pull request as ready for review January 31, 2023 16:29

yarikoptic force-pushed the enh-tox-etc branch from 43ceb82 to 83f52fc Compare February 3, 2023 15:51

mih requested changes Feb 17, 2023

View reviewed changes

adswa reviewed Feb 17, 2023

View reviewed changes

_datalad_buildsupport/formatters.py Outdated Show resolved Hide resolved

yarikoptic and others added 7 commits February 17, 2023 12:21

BK: Add tox.ini with typing,lint (codespell); add workflows for them;…

027e188

… update pyproject.toml To make template better tested and IMHO we should start aiming for type annotation of code in (new) extensions, although datalad itself is not yet annotated.

Annotate the two functions

ff0f93d

Fix the only typing error we can do anything about

ca67a1a

Add .codespellrc with our skips etc

131d0cd

git ignore .idea/ and venvs/

4575281

Remove empty and no longer needed/used .codespell-ignore*

6cbde9d

Tried to make mypy exclude _version.py -- did not work

537b98b

yarikoptic added 5 commits February 17, 2023 12:21

packaging is not needed yet since I postponed upgrade of _datalad_bui…

6853d5f

…ldsupport

Add follow_imports skip for _version to resolve its issue

afe874f

Revert back to require setuptools >= 43.0.0 with original formatting etc

a1d9e60

As @mih mentioned -- downgrading is rarely a good thing. I guess it was just a blind copy-paste and then packaging was removed anyways... so now reverting back so diff must be minimized

Document tox.ini and its use/tuneup

cd2a860

yarikoptic force-pushed the enh-tox-etc branch from a4d26c4 to cd2a860 Compare February 17, 2023 17:39

This was referenced Feb 18, 2023

Add black and isort with pre-commit? datalad/datalad-container#196

Open

Codespell: adding config, workflow and fixing some typos datalad/datalad-metalad#336

Merged

yarikoptic mentioned this pull request Mar 22, 2023

codespell: config, dedicated workflow + 1 typo fix #58

Merged

it must have been inconsistent

60bfa81

yarikoptic commented Sep 8, 2023

View reviewed changes

.github/workflows/typing.yml Outdated Show resolved Hide resolved

python 3.8 now

fb628c9

adswa mentioned this pull request Oct 9, 2023

Include sphinx_copybutton extension for docs #85

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml #45

BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml #45

yarikoptic commented Sep 30, 2022

jwodder commented Sep 30, 2022

codecov bot commented Sep 30, 2022 •

edited

yarikoptic commented Jan 30, 2023

yarikoptic commented Jan 30, 2023

yarikoptic commented Jan 31, 2023

matrss commented Feb 3, 2023 •

edited

jwodder commented Feb 3, 2023

matrss commented Feb 3, 2023

yarikoptic commented Feb 3, 2023

yarikoptic commented Feb 17, 2023

mih left a comment •

edited

mih Feb 17, 2023

matrss Feb 17, 2023

mih Feb 17, 2023

yarikoptic Feb 17, 2023

matrss Feb 22, 2023

yarikoptic Feb 22, 2023

matrss Feb 23, 2023

mih Feb 17, 2023

yarikoptic Sep 8, 2023

mih Feb 17, 2023

yarikoptic Feb 17, 2023

mih Feb 17, 2023

mih Feb 17, 2023

yarikoptic Feb 17, 2023

jsheunis commented Feb 17, 2023

mslw commented Feb 17, 2023

bpoldrack commented Feb 17, 2023

matrss commented Feb 17, 2023

mih commented Feb 17, 2023 •

edited

yarikoptic commented Feb 17, 2023

yarikoptic commented Feb 17, 2023

yarikoptic commented Mar 8, 2023

	name: Linters
	# This workflow runs tox -e lint to trigger checks such as codespell to help ensuring lint free code and docs
	name: Linters

BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml #45

Are you sure you want to change the base?

BK: Add tox.ini with typing,lint (codespell); add workflows for codespell; update pyproject.toml #45

Conversation

yarikoptic commented Sep 30, 2022

jwodder commented Sep 30, 2022

codecov bot commented Sep 30, 2022 • edited

Codecov Report

yarikoptic commented Jan 30, 2023

yarikoptic commented Jan 30, 2023

yarikoptic commented Jan 31, 2023

matrss commented Feb 3, 2023 • edited

jwodder commented Feb 3, 2023

matrss commented Feb 3, 2023

yarikoptic commented Feb 3, 2023

yarikoptic commented Feb 17, 2023

mih left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsheunis commented Feb 17, 2023

mslw commented Feb 17, 2023

bpoldrack commented Feb 17, 2023

matrss commented Feb 17, 2023

mih commented Feb 17, 2023 • edited

yarikoptic commented Feb 17, 2023

yarikoptic commented Feb 17, 2023

yarikoptic commented Mar 8, 2023

codecov bot commented Sep 30, 2022 •

edited

matrss commented Feb 3, 2023 •

edited

mih left a comment •

edited

mih commented Feb 17, 2023 •

edited