Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the scraper-generator template. Inherit from abc.ABC in AbstractScraper #664

Merged
merged 7 commits into from Dec 16, 2022

Conversation

jayaddison
Copy link
Collaborator

@jayaddison jayaddison commented Oct 21, 2022

Perhaps we should only include 'core' recipe fields within the auto-generated scrapers. This pull request omits originally omitted a few fields that I'd consider more on the optional side, related to recipes. Inspired by discussion in #663 (cc @bickerdyke).

It's always possible for people to implement the additional fields - and that should happen naturally over time when people decide that they want that information.

This would reduce the work to implement scrapers from scratch, particularly time spent putting together test case data and implementing BeautifulSoup queries.

@jayaddison
Copy link
Collaborator Author

Hmm. Or perhaps we should match the fields to the default-exception-handling field list.

@jayaddison
Copy link
Collaborator Author

Hmm. Or perhaps we should match the fields to the default-exception-handling field list.

Nope, I don't think so after all. A few of those fields are definitely optional (and in fact, ratings is in there too).

@jayaddison jayaddison changed the base branch from main to issue-617/migrate-opengraphimage-plugin October 21, 2022 17:19
@jayaddison jayaddison force-pushed the issue-617/reduce-template-methods branch from 2f5e78b to 10da8ad Compare October 21, 2022 17:21
@jayaddison
Copy link
Collaborator Author

A related idea: we could make AbstractScraper inherit from ABC (Python's abstract base class), and then require that some of the methods are implemented (by tagging them as @abstractmethod).

As usual, I'm not sure whether it's worthwhile. It'd be stricter - but is that good for developers and the library?

And what are the required elements of a recipe? (ingredients and instructions, certainly - anything else?)

@hhursev
Copy link
Owner

hhursev commented Oct 23, 2022

I'm on board with the idea of inheriting abc.ABC + abstractmethod. It'll be good for developers and the library imo.

The 'core' recipe fields are: host, title, total_time, image, ingredients, instructions.

@jayaddison jayaddison force-pushed the issue-617/reduce-template-methods branch from 3886a67 to f502afa Compare October 27, 2022 09:42
@jayaddison jayaddison changed the title Reduce the set of methods included in the scraper-generator template Update the scraper-generator template Oct 27, 2022
@jayaddison jayaddison changed the base branch from issue-617/migrate-opengraphimage-plugin to main October 27, 2022 09:44
@@ -12,6 +12,10 @@ def host(cls):
def title(self):
return self.soup.find("meta", {"property": "og:title"})["content"]

def total_time(self):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case where site does not have time information for their recipes:

raise RecipeScrapersExceptions(f"{self.host} does not provide time information.")

or sth along these lines to set an example.

@hhursev hhursev changed the title Update the scraper-generator template Update the scraper-generator template. Inherit from abc.ABC in AbstractScraper Oct 28, 2022
@hhursev
Copy link
Owner

hhursev commented Oct 28, 2022

I really like what I see here! Why the CI didn't run 🤔 Is it a glitch or I'm missing something

@jayaddison
Copy link
Collaborator Author

Thanks for the review! Ah, CI didn't run here because the branch changed (originally I had this stacked onto some of the decorator-related field retrieval). Should be fixed when I push a commit for the exception change.

@jayaddison
Copy link
Collaborator Author

Should we consider this a breaking/major version change? (I hadn't thought of this until now, but it could cause breakage for anyone developing private scrapers inherited from AbstractScraper)

@hhursev
Copy link
Owner

hhursev commented Oct 28, 2022

Ah nice thinking .. do not merge into main for now please. Let's create a v15 branch and merge it there instead for the time being.

This would be a major version bump indeed and I'd like us to unite it with other things (removal of the settings module, some scrapers updates due to the health check thing and I probably forget other things we'll introduce too).

Not sure what's the best way to notify current package users. Ideally minor version bump with message during install 🤔 (not sure how this can be done). Simply a message when recipe_scrapers is imported might work too.

@jayaddison jayaddison changed the base branch from main to v15 October 28, 2022 14:12
@jayaddison
Copy link
Collaborator Author

Ok, the v15 development branch now exists (with unit testing and linting enabled.. as should be visible in this pull request for any subsequent commits).

@jayaddison jayaddison added the v15 label Oct 28, 2022
@jayaddison
Copy link
Collaborator Author

It could be worth figuring out what (if anything) to do about the disallow_untyped_defs=False directives across the codebase as we start on two parallel branches.

I feel like those could cause a bunch of merge conflicts between main and v15 if we're not careful.

(allow-untyped-defs seems much clearer to me than disallow_untyped_defs=False)

@jayaddison jayaddison force-pushed the issue-617/reduce-template-methods branch from f502afa to 3ea825e Compare October 28, 2022 15:45
@jayaddison jayaddison merged commit cb64e51 into v15 Dec 16, 2022
@jayaddison jayaddison deleted the issue-617/reduce-template-methods branch December 16, 2022 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants