DEV: Improved error handling. New Instagram oEmbed endpoint. New blank value for Twitter metadata. (#437)

* DEV: Expose hash of missing data elements

* FEATURE: Use new Instagram oEmbed endpoint (requires access token)

* DEV: Correctly reference facebook_app_access_token

* DEV: title should be truncated

* FIX: “0 minutes” is the equivalent of a blank value for Twitter metadata

* DEV: error message should be assigned as array element
jbrw committed Nov 16, 2020
1 parent dc76458 commit 1cfbb2b
Showing 11 changed files with 103 additions and 749 deletions.
2 changes: 2 additions & 0 deletions lib/onebox/engine.rb
@@ -30,6 +30,7 @@ def self.origins_to_regexes(origins)

attr_reader :url, :uri
attr_reader :timeout
attr :errors

DEFAULT = {}
def options
@@ -44,6 +45,7 @@ def options=(opt)
end

def initialize(link, timeout = nil)
@errors = {}
@options = DEFAULT
class_name = self.class.name.split("::").last.to_s

9 changes: 9 additions & 0 deletions lib/onebox/engine/allowlisted_generic_onebox.rb
@@ -256,6 +256,15 @@ def data
d[:data_1] = Onebox::Helpers.truncate("#{d[:price_currency].strip} #{d[:price_amount].strip}")
end

skip_missing_tags = [:video]
d.each do |k, v|
next if skip_missing_tags.include?(k)
if v == nil || v == ''
errors[k] ||= []
errors[k] << 'is blank'
end
end

d
end
end
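
The blank-value check added to `data` above can be illustrated in isolation. This is a minimal sketch rather than the gem's code: `collect_blank_errors` is a hypothetical helper name, and the sample hash is made up for illustration. Only the skip list and the `'is blank'` message are taken from the diff.

```ruby
# Hypothetical standalone helper; the commit inlines this logic in #data
# and writes into the engine's `errors` hash instead of returning a new one.
def collect_blank_errors(data, skip: [:video])
  errors = {}
  data.each do |key, value|
    next if skip.include?(key)        # tags that are allowed to be missing
    if value.nil? || value == ''
      errors[key] ||= []              # one array of messages per key
      errors[key] << 'is blank'
    end
  end
  errors
end

# Illustrative input: :video is blank but skipped, :title is present.
errors = collect_blank_errors({ title: 'A title', description: '', image: nil, video: nil })
```

Keeping an array per key leaves room for more than one message per field, which appears to be the intent of the "error message should be assigned as array element" item in the commit message.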
37 changes: 16 additions & 21 deletions lib/onebox/engine/instagram_onebox.rb
@@ -15,35 +15,30 @@ def clean_url
end

def data
og = get_opengraph

# There are at least two different versions of the description. e.g.
# - "3,227 Likes, 88 Comments - An Account (@user.name) on Instagram: “Look at my picture!”"
# - "@user.name posted on their Instagram profile: “Look at my picture!”"

m = og.description.match(/\(@([\w\.]+)\) on Instagram/)
author_name = m[1] if m

author_name ||= begin
m = og.description.match(/^\@([\w\.]+)\ posted/)
m[1] if m
end

raise "Author username not found for post #{clean_url}" unless author_name

permalink = clean_url.gsub("/#{author_name}/", "/")
oembed = get_oembed
permalink = clean_url.gsub("/#{oembed.author_name}/", "/")

{ link: permalink,
title: "@#{author_name}",
image: og.image,
description: Onebox::Helpers.truncate(og.title, 250)
title: "@#{oembed.author_name}",
image: oembed.thumbnail_url,
description: Onebox::Helpers.truncate(oembed.title, 250),
}

end

protected

def access_token
(options[:facebook_app_access_token] || Onebox.options.facebook_app_access_token).to_s
end

def get_oembed_url
oembed_url = "https://api.instagram.com/oembed/?url=#{clean_url}"
if access_token != ''
oembed_url = "https://graph.facebook.com/v9.0/instagram_oembed?url=#{clean_url}&access_token=#{access_token}"
else
# The following is officially deprecated by Instagram, but works in some limited circumstances.
oembed_url = "https://api.instagram.com/oembed/?url=#{clean_url}"
end
end
end
end
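
The endpoint selection in `get_oembed_url` can be sketched as a standalone function. This is an illustration under assumptions, not the engine's method: `instagram_oembed_url` is a hypothetical name, and the access token is passed in as an argument instead of being read from `options`. The two URLs are copied from the diff, and like the diff the sketch does no URL-encoding.

```ruby
# Hypothetical helper mirroring the branch in get_oembed_url.
def instagram_oembed_url(clean_url, access_token)
  token = access_token.to_s           # nil becomes "", as in #access_token
  if token != ''
    # Graph API endpoint, used when an app access token is configured.
    "https://graph.facebook.com/v9.0/instagram_oembed?url=#{clean_url}&access_token=#{token}"
  else
    # Legacy endpoint; the diff notes it is officially deprecated.
    "https://api.instagram.com/oembed/?url=#{clean_url}"
  end
end

link = "https://www.instagram.com/p/CARbvuYDm3Q"
with_token    = instagram_oembed_url(link, 'abc123')
without_token = instagram_oembed_url(link, nil)
```

The two results correspond to the `api_link` values that the spec stubs in its 'with access token' and 'without access token' contexts.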
2 changes: 1 addition & 1 deletion lib/onebox/engine/standard_embed.rb
@@ -91,7 +91,7 @@ def get_twitter
html_doc.css('meta').each do |m|
if (m["property"] && m["property"][/^twitter:(.+)$/i]) || (m["name"] && m["name"][/^twitter:(.+)$/i])
value = (m["content"] || m["value"]).to_s
twitter[$1.tr('-:' , '_').to_sym] ||= value unless Onebox::Helpers::blank?(value)
twitter[$1.tr('-:' , '_').to_sym] ||= value unless (Onebox::Helpers::blank?(value) || value == "0 minutes")
end
end
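
The effect of treating "0 minutes" as blank can be shown with a small self-contained sketch. Everything here is an assumption made for illustration: `collect_twitter_meta` and `simple_blank?` are hypothetical names, `simple_blank?` only approximates `Onebox::Helpers.blank?` (whose body is not shown in this diff), and the input is a plain array of name/content pairs rather than parsed HTML.

```ruby
# Stand-in for Onebox::Helpers.blank? (assumption: nil or whitespace-only is blank).
def simple_blank?(value)
  value.nil? || value.to_s.strip.empty?
end

# Hypothetical helper: collects twitter:* metadata from [name, content] pairs,
# skipping blank values and the "0 minutes" placeholder.
def collect_twitter_meta(pairs)
  twitter = {}
  pairs.each do |name, content|
    match = name.to_s.match(/^twitter:(.+)$/i)
    next unless match
    value = content.to_s
    next if simple_blank?(value) || value == "0 minutes"
    twitter[match[1].tr('-:', '_').to_sym] ||= value   # first non-blank value wins
  end
  twitter
end

meta = collect_twitter_meta([
  ["twitter:label1", "Reading time"],
  ["twitter:data1", "0 minutes"],
  ["twitter:title", "Hello"],
  ["og:title", "Ignored"]
])
```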

1 change: 1 addition & 0 deletions lib/onebox/helpers.rb
@@ -156,6 +156,7 @@ def self.blank?(value)
end

def self.truncate(string, length = 50)
return string if string.nil?
string.size > length ? string[0...(string.rindex(" ", length) || length)] + "..." : string
end
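
A short sketch of how the nil guard changes `truncate`. The method body is copied from the diff but defined at top level here so it runs without the gem; the sample strings are made up for illustration.

```ruby
# Same logic as Onebox::Helpers.truncate in the diff, defined standalone.
def truncate(string, length = 50)
  return string if string.nil?        # new guard: nil passes through untouched
  string.size > length ? string[0...(string.rindex(" ", length) || length)] + "..." : string
end

nil_result   = truncate(nil)                          # previously raised NoMethodError
short_result = truncate("short")                      # under the limit, returned as-is
word_result  = truncate("hello wonderful world", 10)  # cut at the last space at or before index 10
hard_result  = truncate("abcdefghijklmnop", 5)        # no space found, cut at the limit
```

This matters for the Instagram engine because an oEmbed response may omit the title, in which case `truncate` would receive `nil` for `oembed.title`.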

10 changes: 10 additions & 0 deletions lib/onebox/preview.rb
@@ -31,6 +31,16 @@ def placeholder_html
""
end

def errors
return {} unless engine
engine.errors
end

def data
return {} unless engine
engine.data
end

def options
OpenStruct.new(@options)
end
363 changes: 12 additions & 351 deletions spec/fixtures/instagram.response

Large diffs are not rendered by default.

347 changes: 0 additions & 347 deletions spec/fixtures/instagram_alternative.response

This file was deleted.

17 changes: 17 additions & 0 deletions spec/fixtures/instagram_old.response
@@ -0,0 +1,17 @@

{
"version": "1.0",
"title": "Photo by Pete McBride @pedromcbride | For the first time in three decades, inhabitants of northern India are able to see the Himalaya\u2014thanks to reduced air pollution over the last few weeks. Considering that India experiences some of the worst pollution in the world, this is a literal breath of fresh air. When I was there, the air was so thick you could taste the smoke and fumes.\n\nThe coronavirus pandemic that has led to India's temporary reduction in pollutants has also put the country on the world's largest lockdown, and it's too soon to tell what impact that has had on curbing the disease\u2014as well as what the long-term effects will be on attitudes toward fresh air once the population returns to business as usual. For more on India and the environment, follow @pedromcbride. #india #himalaya #covid19 #pollution",
"author_name": "natgeo",
"author_url": "https://www.instagram.com/natgeo",
"author_id": 787132, "media_id": "2310750110684704208_787132",
"provider_name": "Instagram",
"provider_url": "https://www.instagram.com",
"type": "rich",
"width": 658,
"height": null,
"html": "\u003cblockquote class=\"instagram-media\" data-instgrm-captioned data-instgrm-permalink=\"https://www.instagram.com/p/CARbvuYDm3Q/?utm_source=ig_embed\u0026amp;utm_campaign=loading\" data-instgrm-version=\"13\" style=\" background:#FFF; border:0; border-radius:3px; box-shadow:0 0 1px 0 rgba(0,0,0,0.5),0 1px 10px 0 rgba(0,0,0,0.15); margin: 1px; max-width:658px; min-width:326px; padding:0; width:99.375%; width:-webkit-calc(100% - 2px); width:calc(100% - 2px);\"\u003e\u003cdiv style=\"padding:16px;\"\u003e \u003ca href=\"https://www.instagram.com/p/CARbvuYDm3Q/?utm_source=ig_embed\u0026amp;utm_campaign=loading\" style=\" background:#FFFFFF; line-height:0; padding:0 0; text-align:center; text-decoration:none; width:100%;\" target=\"_blank\"\u003e \u003cdiv style=\" display: flex; flex-direction: row; align-items: center;\"\u003e \u003cdiv style=\"background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 40px; margin-right: 14px; width: 40px;\"\u003e\u003c/div\u003e \u003cdiv style=\"display: flex; flex-direction: column; flex-grow: 1; justify-content: center;\"\u003e \u003cdiv style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; margin-bottom: 6px; width: 100px;\"\u003e\u003c/div\u003e \u003cdiv style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; width: 60px;\"\u003e\u003c/div\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv style=\"padding: 19% 0;\"\u003e\u003c/div\u003e \u003cdiv style=\"display:block; height:50px; margin:0 auto 12px; width:50px;\"\u003e\u003csvg width=\"50px\" height=\"50px\" viewBox=\"0 0 60 60\" version=\"1.1\" xmlns=\"https://www.w3.org/2000/svg\" xmlns:xlink=\"https://www.w3.org/1999/xlink\"\u003e\u003cg stroke=\"none\" stroke-width=\"1\" fill=\"none\" fill-rule=\"evenodd\"\u003e\u003cg transform=\"translate(-511.000000, -20.000000)\" fill=\"#000000\"\u003e\u003cg\u003e\u003cpath d=\"M556.869,30.41 C554.814,30.41 553.148,32.076 553.148,34.131 C553.148,36.186 
554.814,37.852 556.869,37.852 C558.924,37.852 560.59,36.186 560.59,34.131 C560.59,32.076 558.924,30.41 556.869,30.41 M541,60.657 C535.114,60.657 530.342,55.887 530.342,50 C530.342,44.114 535.114,39.342 541,39.342 C546.887,39.342 551.658,44.114 551.658,50 C551.658,55.887 546.887,60.657 541,60.657 M541,33.886 C532.1,33.886 524.886,41.1 524.886,50 C524.886,58.899 532.1,66.113 541,66.113 C549.9,66.113 557.115,58.899 557.115,50 C557.115,41.1 549.9,33.886 541,33.886 M565.378,62.101 C565.244,65.022 564.756,66.606 564.346,67.663 C563.803,69.06 563.154,70.057 562.106,71.106 C561.058,72.155 560.06,72.803 558.662,73.347 C557.607,73.757 556.021,74.244 553.102,74.378 C549.944,74.521 548.997,74.552 541,74.552 C533.003,74.552 532.056,74.521 528.898,74.378 C525.979,74.244 524.393,73.757 523.338,73.347 C521.94,72.803 520.942,72.155 519.894,71.106 C518.846,70.057 518.197,69.06 517.654,67.663 C517.244,66.606 516.755,65.022 516.623,62.101 C516.479,58.943 516.448,57.996 516.448,50 C516.448,42.003 516.479,41.056 516.623,37.899 C516.755,34.978 517.244,33.391 517.654,32.338 C518.197,30.938 518.846,29.942 519.894,28.894 C520.942,27.846 521.94,27.196 523.338,26.654 C524.393,26.244 525.979,25.756 528.898,25.623 C532.057,25.479 533.004,25.448 541,25.448 C548.997,25.448 549.943,25.479 553.102,25.623 C556.021,25.756 557.607,26.244 558.662,26.654 C560.06,27.196 561.058,27.846 562.106,28.894 C563.154,29.942 563.803,30.938 564.346,32.338 C564.756,33.391 565.244,34.978 565.378,37.899 C565.522,41.056 565.552,42.003 565.552,50 C565.552,57.996 565.522,58.943 565.378,62.101 M570.82,37.631 C570.674,34.438 570.167,32.258 569.425,30.349 C568.659,28.377 567.633,26.702 565.965,25.035 C564.297,23.368 562.623,22.342 560.652,21.575 C558.743,20.834 556.562,20.326 553.369,20.18 C550.169,20.033 549.148,20 541,20 C532.853,20 531.831,20.033 528.631,20.18 C525.438,20.326 523.257,20.834 521.349,21.575 C519.376,22.342 517.703,23.368 516.035,25.035 C514.368,26.702 513.342,28.377 512.574,30.349 C511.834,32.258 
511.326,34.438 511.181,37.631 C511.035,40.831 511,41.851 511,50 C511,58.147 511.035,59.17 511.181,62.369 C511.326,65.562 511.834,67.743 512.574,69.651 C513.342,71.625 514.368,73.296 516.035,74.965 C517.703,76.634 519.376,77.658 521.349,78.425 C523.257,79.167 525.438,79.673 528.631,79.82 C531.831,79.965 532.853,80.001 541,80.001 C549.148,80.001 550.169,79.965 553.369,79.82 C556.562,79.673 558.743,79.167 560.652,78.425 C562.623,77.658 564.297,76.634 565.965,74.965 C567.633,73.296 568.659,71.625 569.425,69.651 C570.167,67.743 570.674,65.562 570.82,62.369 C570.966,59.17 571,58.147 571,50 C571,41.851 570.966,40.831 570.82,37.631\"\u003e\u003c/path\u003e\u003c/g\u003e\u003c/g\u003e\u003c/g\u003e\u003c/svg\u003e\u003c/div\u003e\u003cdiv style=\"padding-top: 8px;\"\u003e \u003cdiv style=\" color:#3897f0; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:550; line-height:18px;\"\u003e View this post on Instagram\u003c/div\u003e\u003c/div\u003e\u003cdiv style=\"padding: 12.5% 0;\"\u003e\u003c/div\u003e \u003cdiv style=\"display: flex; flex-direction: row; margin-bottom: 14px; align-items: center;\"\u003e\u003cdiv\u003e \u003cdiv style=\"background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(0px) translateY(7px);\"\u003e\u003c/div\u003e \u003cdiv style=\"background-color: #F4F4F4; height: 12.5px; transform: rotate(-45deg) translateX(3px) translateY(1px); width: 12.5px; flex-grow: 0; margin-right: 14px; margin-left: 2px;\"\u003e\u003c/div\u003e \u003cdiv style=\"background-color: #F4F4F4; border-radius: 50%; height: 12.5px; width: 12.5px; transform: translateX(9px) translateY(-18px);\"\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv style=\"margin-left: 8px;\"\u003e \u003cdiv style=\" background-color: #F4F4F4; border-radius: 50%; flex-grow: 0; height: 20px; width: 20px;\"\u003e\u003c/div\u003e \u003cdiv style=\" width: 0; height: 0; border-top: 2px solid transparent; border-left: 6px solid #f4f4f4; 
border-bottom: 2px solid transparent; transform: translateX(16px) translateY(-4px) rotate(30deg)\"\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv style=\"margin-left: auto;\"\u003e \u003cdiv style=\" width: 0px; border-top: 8px solid #F4F4F4; border-right: 8px solid transparent; transform: translateY(16px);\"\u003e\u003c/div\u003e \u003cdiv style=\" background-color: #F4F4F4; flex-grow: 0; height: 12px; width: 16px; transform: translateY(-4px);\"\u003e\u003c/div\u003e \u003cdiv style=\" width: 0; height: 0; border-top: 8px solid #F4F4F4; border-left: 8px solid transparent; transform: translateY(-4px) translateX(8px);\"\u003e\u003c/div\u003e\u003c/div\u003e\u003c/div\u003e \u003cdiv style=\"display: flex; flex-direction: column; flex-grow: 1; justify-content: center; margin-bottom: 24px;\"\u003e \u003cdiv style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; margin-bottom: 6px; width: 224px;\"\u003e\u003c/div\u003e \u003cdiv style=\" background-color: #F4F4F4; border-radius: 4px; flex-grow: 0; height: 14px; width: 144px;\"\u003e\u003c/div\u003e\u003c/div\u003e\u003c/a\u003e\u003cp style=\" color:#c9c8cd; font-family:Arial,sans-serif; font-size:14px; line-height:17px; margin-bottom:0; margin-top:8px; overflow:hidden; padding:8px 0 7px; text-align:center; text-overflow:ellipsis; white-space:nowrap;\"\u003e\u003ca href=\"https://www.instagram.com/p/CARbvuYDm3Q/?utm_source=ig_embed\u0026amp;utm_campaign=loading\" style=\" color:#c9c8cd; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:normal; line-height:17px; text-decoration:none;\" target=\"_blank\"\u003eA post shared by National Geographic (@natgeo)\u003c/a\u003e\u003c/p\u003e\u003c/div\u003e\u003c/blockquote\u003e\n\u003cscript async src=\"//www.instagram.com/embed.js\"\u003e\u003c/script\u003e",
"thumbnail_url": "https://scontent-yyz1-1.cdninstagram.com/v/t51.2885-15/sh0.08/e35/s640x640/97565241_163250548553285_9172168193050746487_n.jpg?_nc_ht=scontent-yyz1-1.cdninstagram.com\u0026_nc_cat=105\u0026_nc_ohc=dnXCQ6urT_gAX99AO01\u0026_nc_tp=24\u0026oh=32b676a618164ab0248e2726767dae14\u0026oe=5FDD8836",
"thumbnail_width": 640,
"thumbnail_height": 427
}
60 changes: 32 additions & 28 deletions spec/lib/onebox/engine/instagram_onebox_spec.rb
@@ -3,24 +3,8 @@
require "spec_helper"

describe Onebox::Engine::InstagramOnebox do
let(:link) { "https://www.instagram.com/p/CARbvuYDm3Q/" }
let(:html) { described_class.new(link).to_html }

before do
fake(link, response("instagram"))
end

it "includes title" do
expect(html).to include('<a href="https://www.instagram.com/p/CARbvuYDm3Q" target="_blank" rel="noopener">@natgeo</a>')
end

it "includes image" do
expect(html).to include("https://scontent.cdninstagram.com/v/t51.2885-15/fr/e15/s1080x1080/97565241_163250548553285_9172168193050746487_n.jpg?_nc_ht=scontent.cdninstagram.com&amp;_nc_cat=105&amp;_nc_ohc=dN9OLDXIp88AX8OhjJy&amp;oh=fe23f001b0997b3a73f72fae3e0ef91f&amp;oe=5FBA2690")
end

it "includes description" do
expect(html).to include("National Geographic on Instagram: “Photo by Pete McBride @pedromcbride")
end
let(:access_token) { 'abc123' }
let(:link) { "https://www.instagram.com/p/CARbvuYDm3Q" }

it 'oneboxes links that include the username' do
link_with_profile = 'https://www.instagram.com/bennyblood24/p/CARbvuYDm3Q/'
@@ -34,26 +18,46 @@
expect(onebox_klass.name).to eq(described_class.name)
end

# Sometimes Instagram sends back responses with the `description` in a different format.
# Perhaps some form of A/B testing? Make sure we handle those cases.
context 'alternate response' do
let(:link) { "https://www.instagram.com/p/CByPkaHAhaA/" }
let(:html) { described_class.new(link).to_html }
context 'with access token' do
let(:api_link) { "https://graph.facebook.com/v9.0/instagram_oembed?url=#{link}&access_token=#{access_token}" }

before do
fake(link, response("instagram_alternative"))
fake(api_link, response("instagram"))
end

after(:each) do
Onebox.options = { facebook_app_access_token: nil }
end

it "includes title" do
expect(html).to include('<a href="https://www.instagram.com/p/CByPkaHAhaA" target="_blank" rel="noopener">@picturesontv</a>')
Onebox.options = { facebook_app_access_token: access_token }
html = described_class.new(link).to_html

expect(html).to include('<a href="https://www.instagram.com/p/CARbvuYDm3Q" target="_blank" rel="noopener">@natgeo</a>')
end

it "includes image" do
expect(html).to include("https://instagram.fykz2-1.fna.fbcdn.net/v/t51.2885-15/e35/s1080x1080/104690885_607568746536223_3426942535883552192_n.jpg?_nc_ht=instagram.fykz2-1.fna.fbcdn.net&amp;_nc_cat=107&amp;_nc_ohc=2fS_olBgk34AX_eyFqt&amp;_nc_tp=15&amp;oh=d4364e8f3476a3d6065f67f374aa26b1&amp;oe=5FCA29BA")
Onebox.options = { facebook_app_access_token: access_token }
html = described_class.new(link).to_html

expect(html).to include("https://scontent.cdninstagram.com/v/t51.2885-15/sh0.08/e35/s640x640/97565241_163250548553285_9172168193050746487_n.jpg")
end
end

context 'without access token' do
let(:api_link) { "https://api.instagram.com/oembed/?url=#{link}" }
let(:html) { described_class.new(link).to_html }

it "includes description" do
expect(html).to include("@picturesontv on Instagram: “Every day is a day of new opportunities....")
before do
fake(api_link, response("instagram_old"))
end

it "includes title" do
expect(html).to include('<a href="https://www.instagram.com/p/CARbvuYDm3Q" target="_blank" rel="noopener">@natgeo</a>')
end

it "includes image" do
expect(html).to include("https://scontent-yyz1-1.cdninstagram.com/v/t51.2885-15/sh0.08/e35/s640x640/97565241_163250548553285_9172168193050746487_n.jpg")
end
end
end
4 changes: 3 additions & 1 deletion templates/instagram.mustache
@@ -8,4 +8,6 @@
</div>
{{/image}}

<div class="instagram-description">{{description}}</div>
{{#description}}
<div class="instagram-description">{{description}}</div>
{{/description}}
