[Question] Hints on best practices / what to avoid #369

Penaz91 · 2022-08-24T15:11:08Z

Greetings,

I have been a user of Pylint-Django for about a year and I really appreciate how this tool gives me a hand when it comes to code quality.

I wanted to ask if it would be possible to implement some suggestions on best practices and how to avoid some common pitfalls that may come from being in a rush or just plain carelessness (I am guilty of both).

For instance:

Creating/updating models inside a for loop can perform badly; bulk_create/bulk_update could work better;
Querying models inside a for loop should be avoided; a single query using the __in lookup could perform better.

These are a couple of examples off the top of my head that gave me a bit of a headache when debugging performance issues in software I was writing. Having a tool remind me about these possible performance hogs could have saved me a lot of effort.

Thank you for your attention.


atodorov · 2022-08-24T19:36:22Z

Off the top of my head, I think these are definitely doable from a technical POV. However, I think most of the maintainers and contributors are too busy ATM to devote significant time to new features. Although the patterns seem relatively simple, implementing them and managing tests & corner cases (in order to actually have confidence that it works) becomes non-trivial.

However, you may find the following resources helpful if you want to give it a try yourself:

From what I can see your pattern is:

For 1) for loop statement, then calling .save() inside of it - both easy to match. Bonus points for checking if the variable we call .save() on inherits from Model.

For 2) For loop + QuerySet - that's a bit tricky b/c there are many methods which return a QuerySet and/or it could be coming in outside the loop as a variable, result of a function, etc. A hack could be to search for .filter indiscriminately inside the loop body.
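As a rough illustration of both patterns, here is a minimal sketch that uses Python's standard-library `ast` module rather than astroid (which is what pylint-django actually builds on). The names `LoopCallFinder`, `find_loop_calls` and `SUSPICIOUS_METHODS` are made up for this example. It matches method names indiscriminately, as in the hack described above, and makes no attempt to check whether the receiver is really a Model or a QuerySet.

```python
import ast

# Hypothetical default: method names treated as suspicious inside a loop body.
SUSPICIOUS_METHODS = frozenset({"save", "filter"})


class LoopCallFinder(ast.NodeVisitor):
    """Collect (line, method) pairs for suspicious method calls inside for loops."""

    def __init__(self, methods=SUSPICIOUS_METHODS):
        self.methods = methods
        self.loop_depth = 0
        self.hits = []

    def visit_For(self, node):
        # The iterable is evaluated once, so visit it outside the loop context.
        self.visit(node.target)
        self.visit(node.iter)
        self.loop_depth += 1
        for child in node.body:
            self.visit(child)
        self.loop_depth -= 1
        for child in node.orelse:
            self.visit(child)

    def visit_Call(self, node):
        func = node.func
        if (
            self.loop_depth > 0
            and isinstance(func, ast.Attribute)
            and func.attr in self.methods
        ):
            self.hits.append((node.lineno, func.attr))
        # Keep walking so chained or nested calls are seen too.
        self.generic_visit(node)


def find_loop_calls(source, methods=SUSPICIOUS_METHODS):
    """Parse source code and return sorted (line, method) hits."""
    finder = LoopCallFinder(methods)
    finder.visit(ast.parse(source))
    return sorted(finder.hits)


SAMPLE = """
for author in authors:
    author.name = author.name.title()
    author.save()
    books = Book.objects.filter(author=author)

Book.objects.filter(published=True)
"""

print(find_loop_calls(SAMPLE))  # [(4, 'save'), (5, 'filter')]
```

The `.filter()` call outside the loop is not reported, and neither is a `.filter()` used as the loop's iterable, since that runs only once. Passing a different set, for example `find_loop_calls(source, {"get", "create"})`, would look for other methods. A real checker would need type inference on top of this to avoid false positives on unrelated objects that happen to have a `save` or `filter` method.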

Penaz91 · 2022-08-25T09:28:32Z

Thanks for your response, no worries about being busy: it's understandable.

I will keep these suggestions in mind for a rainy day. It could be fun to delve into ASTs and write something others may be able to use.

carlio · 2022-08-25T09:36:11Z

For future reference, I'll add that the "is it a queryset" detection is a bit fuzzy - https://github.com/PyCQA/pylint-django/blob/master/pylint_django/augmentations/__init__.py#L550 - because as @atodorov says, it's not super easy to know for sure that it is a Queryset.

The AST parsing is all done by astroid, so the docs there would also help with understanding the Node class and similar.

Conceptually, "am I in a loop, am I calling .get or .create a lot on something that looks like a queryset" is very doable.

Penaz91 · 2022-09-01T13:07:32Z

I'm slowly studying how PyLint checkers are written and in the meantime I got another couple of ideas while working at my day job:

Using queryset.all() in a for loop may be slower than using queryset.iterator()
Using queryset.all() in a list comprehension may be slower than using queryset.iterator()
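These two ideas could be matched syntactically along the following lines. This is only a sketch using the standard-library `ast` module (not astroid), and the helper names `_is_all_call` and `find_all_iterations` are invented for the example. It flags any `.all()` call used directly as the iterable of a for loop or a comprehension, without knowing whether the object is actually a queryset.

```python
import ast


def _is_all_call(node):
    """True for an expression of the form <something>.all() with no arguments."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "all"
        and not node.args
        and not node.keywords
    )


def find_all_iterations(source):
    """Return (line, kind) for each loop or comprehension iterating over .all()."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor)) and _is_all_call(node.iter):
            hits.append((node.iter.lineno, "for"))
        elif isinstance(node, ast.comprehension) and _is_all_call(node.iter):
            # ast.comprehension covers list, set, dict and generator forms.
            hits.append((node.iter.lineno, "comprehension"))
    return sorted(hits)


SAMPLE = """
for book in Book.objects.all():
    print(book.title)

titles = [b.title for b in Book.objects.all()]
recent = Book.objects.filter(year=2022)
"""

print(find_all_iterations(SAMPLE))  # [(2, 'for'), (5, 'comprehension')]
```

Whether `.iterator()` is the better choice depends on the situation, so a check like this would probably be best reported as a hint rather than a warning.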

Hopefully I'll be able to integrate Pylint with a "git version" of Pylint-Django without too many headaches.

Thank you all for the resources and helpful tips!
