Fixup: repair sallys-blog to match updated website design #744

jayaddison · 2023-03-14T14:05:25Z

Resolves #739.

* Added support for sallys-blog.de
* Added missing test for sallys-blog.de total time
* Fixed formatting and string handling of sallys-blog.de scraper
* Removed unused dependency from sallys-blog.de scraper

(cherry picked from commit 516e7f4)

…ngraph image property is found on the page

hhursev · 2023-03-14T16:11:24Z

recipe_scrapers/plugins/opengraph_image_fetch.py

@@ -47,6 +47,6 @@ def decorated_method_wrapper(self, *args, **kwargs):
                image = self.soup.find(
                    "meta", {"property": "og:image", "content": True}
                )
-                return image.get("content")
+                return image.get("content") if image else None
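The guard above matters because a lookup for the `og:image` meta tag yields nothing on pages that do not declare one, and calling `.get("content")` on that missing result would raise. Below is a minimal standalone sketch of the same "return the URL or `None`" behaviour. It uses the standard library's `html.parser` as a stand-in for the plugin's BeautifulSoup lookup, and the names `find_opengraph_image` and `_OpenGraphImageParser` are hypothetical, not part of recipe_scrapers.

```python
from html.parser import HTMLParser
from typing import Optional


class _OpenGraphImageParser(HTMLParser):
    """Records the content of the first <meta property="og:image"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.image_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching tag is kept; later ones are ignored.
        if tag != "meta" or self.image_url is not None:
            return
        attributes = dict(attrs)
        # Require a non-empty content attribute, mirroring {"content": True}.
        if attributes.get("property") == "og:image" and attributes.get("content"):
            self.image_url = attributes["content"]


def find_opengraph_image(html: str) -> Optional[str]:
    """Hypothetical helper: return the og:image URL, or None when absent."""
    parser = _OpenGraphImageParser()
    parser.feed(html)
    parser.close()
    return parser.image_url
```

Callers can then treat `None` as "no image advertised" rather than handling an `AttributeError`.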


hhursev · 2023-03-14T16:41:13Z

recipe_scrapers/sallysblog.py

+        return normalize_string(self.soup.head.find("title").get_text())
+
+    def image(self):
+        raise NotImplementedError()  # todo: probably better to return URLs than base64 content


fwiw I'm ok with something (kinda-ugly) like:

    from urllib.parse import unquote
    .....
    def image(self):
        image_element = self.soup.find("div", {"class": "images-wrap"}).findAll("img", {"sizes": "100vw"})[0]
        image_src = image_element["src"]
        image_url = unquote(image_src).split("url=")[1].split("&")[0]
        return image_url

Nice! I hadn't noticed those image links. That sounds like a good approach.
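For reference, the URL-extraction step in the suggestion above could also be sketched with `urllib.parse.parse_qs` instead of manual string splitting. This is only an illustration: it assumes the `src` attribute carries the original image address in a percent-encoded `url=` query parameter (as the suggested snippet implies), and `extract_image_url` is a hypothetical helper name, not something in the scraper.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_image_url(image_src: str) -> Optional[str]:
    """Hypothetical helper: pull the `url` query parameter out of an image src.

    Assumes a src shaped like "/some/path?url=<percent-encoded URL>&w=...".
    Returns None when no `url` parameter is present.
    """
    query = urlsplit(image_src).query
    # parse_qs splits on "&" first and percent-decodes each value afterwards.
    values = parse_qs(query).get("url")
    return values[0] if values else None
```

One difference from the unquote-then-split approach: because the query string is split before decoding, an embedded URL that itself contains an encoded `&` (`%26`) is returned whole rather than cut at that character.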

janecker and others added 3 commits March 14, 2023 12:50

Added support for sallys-blog.de (#442)

e914e9c

* Added support for sallys-blog.de
* Added missing test for sallys-blog.de total time
* Fixed formatting and string handling of sallys-blog.de scraper
* Removed unused dependency from sallys-blog.de scraper

(cherry picked from commit 516e7f4)

opengraph plugin: don't attempt to retrieve image content when no opengraph image property is found on the page

81d2f0e

Fixup: repair sallys-blog to match updated website design

fb8085f

hhursev reviewed Mar 14, 2023


jayaddison merged commit ec73224 into main Mar 14, 2023

jayaddison deleted the issue-739/sallys-blog-update branch March 14, 2023 16:22

hhursev reviewed Mar 14, 2023


jayaddison mentioned this pull request Mar 14, 2023

sallys-blog: add support for image link retrieval #745

Merged
