[MRG+1] Import scurl if found installed #110

malloxpb · 2018-07-24T17:59:33Z

Hey @lopuhin, I made this PR to run scurl test on w3lib itself. Let me know if there's something that can be improved here 😄

malloxpb · 2018-07-24T18:47:11Z

I noticed that there's an error on py33 in the Travis build. I think it's because tox does not support py33 anymore (https://tox.readthedocs.io/en/latest/install.html). Can you take a look at this for me when you have a chance, @kmike? 😄

codecov · 2018-07-24T19:21:57Z

Codecov Report

Merging #110 into master will decrease coverage by 0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #110      +/-   ##
==========================================
- Coverage   95.43%   95.35%   -0.08%     
==========================================
  Files           7        7              
  Lines         482      474       -8     
  Branches       98       95       -3     
==========================================
- Hits          460      452       -8     
  Misses         15       15              
  Partials        7        7

Impacted Files	Coverage Δ
w3lib/url.py	`97.96% <100%> (-0.08%)`	⬇️
w3lib/encoding.py	`100% <0%> (ø)`	⬆️

malloxpb · 2018-07-24T20:20:14Z

Travis also gives the same error on the master branch: https://travis-ci.org/nctl144/w3lib

lopuhin

Hey @nctl144, great progress on this, please have a look at the comments. I'll also kindly ask @kmike to have a look at the PR :)

lopuhin · 2018-07-25T07:02:11Z

w3lib/url.py

-                       query,
-                       fragment))
+try:
+    from scurl import canonicalize_url


Another option would be to leave the canonicalize_url definition unchanged, and have

    try:
        from scurl import canonicalize_url
    except ImportError:
        pass

after it, or to define def _canonicalize_url and set canonicalize_url = _canonicalize_url instead of pass. This would let us get rid of the extra indent level and simplify git annotate, but I'd defer the judgment to @kmike here.

Hey @lopuhin , I just modified the code to go with the approach which changes canonicalize_url to _canonicalize_url. But we'll see how @kmike reviews this 😄

lopuhin · 2018-07-25T07:12:06Z

.travis.yml

@@ -1,5 +1,6 @@
 language: python
-sudo: false
+sudo: required


I think it would be better to add a separate environment where scurl is installed instead of adding it to all environments, because adding scurl makes all builds several minutes slower, and also because now we're always using canonicalize_url from scurl (since it's installed everywhere), while we want to test both implementations. It's possible to override sudo, dist and other options just for one environment. It would also be good to add a comment somewhere in .travis.yml saying that this environment is added to test scurl's implementation of canonicalize_url.

I just added the env for scurl only. However, do we need to run it on both py36 and py27 @lopuhin ? 🤔

lopuhin · 2018-07-25T07:25:23Z

w3lib/url.py

-                       fragment))
+try:
+    from scurl import canonicalize_url
+except ImportError as e:


I wonder if there is some way to make sure that we are in fact running the tests with scurl? It's not obvious from the coverage report (https://codecov.io/gh/scrapy/w3lib/pull/110/diff?src=pr&el=tree#D1-396): for some reason it shows the Python implementation of canonicalize_url as covered, while we installed scurl in all environments, so it should not be covered, right? We could at least run python -c "from scurl import canonicalize_url" before the tests, but maybe there are better ways to check it.

if we run python -c "from scurl import canonicalize_url" in tox then probably it will print out ImportError for those envs that don't have scurl installed right?

> it will print out ImportError for those envs that don't have scurl installed right?

Yes, I meant to do this only when running scurl tests, so that we know that scurl at least imports, and it's supposed to be used (would be great to know that for sure though).

what if we use echo to print out the message we need in tox, @lopuhin ? 😄

@nctl144 I see what you mean by echo now, this is not what I had in mind :)
Right now we have this bit in w3lib.url:

    try:
        from scurl import canonicalize_url
    except ImportError as e:
        canonicalize_url = _canonicalize_url

Now suppose that in the scurl environment the import from scurl import canonicalize_url fails for some reason: e.g. the scurl library is installed but fails to import due to some .so issues, or we moved canonicalize_url, or some other reason. We won't notice this in the tests as they stand now, because w3lib will fall back to the Python implementation and all tests will pass. So I would like the tests in the scurl environment to fail if we fall back to pure-Python w3lib. One way to do it is to add python -c "from scurl import canonicalize_url" to the scurl environments. Even better might be to make sure that w3lib.url.canonicalize_url is coming from scurl, but that will probably need a separate small script.

oh I see your point now @lopuhin , I will just add the command python -c "from scurl import canonicalize_url" for now since it's the shortest way to detect import failure. But we will see what people's opinions on this are 😄
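
The "separate small script" idea mentioned above could be sketched as an identity check between the two modules. This is only an illustration and not something that was added in this PR; the helper name `same_object` is hypothetical.

```python
import importlib


def same_object(consumer_module, provider_module, name):
    """Return True if both modules expose the very same object under `name`.

    Hypothetical helper: an ImportError from either module propagates,
    so a broken install fails loudly instead of being hidden by a fallback.
    """
    consumer = importlib.import_module(consumer_module)
    provider = importlib.import_module(provider_module)
    return getattr(consumer, name) is getattr(provider, name)
```

In a scurl-only test environment, one could call `same_object("w3lib.url", "scurl", "canonicalize_url")` and exit with a non-zero status when it returns False. Unlike the plain `python -c` import, this would also catch the case where scurl imports fine but w3lib still ends up using its own implementation.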

lopuhin

I like how the code is structured now, the only bit I'd like to get resolved before the merge would be some way to make sure we are using scurl (or that scurl at least can be imported): #110 (comment)

malloxpb · 2018-07-26T16:31:53Z

Hey @lopuhin , I just added the echo command in the scurl envs so that we know that it's supposed to be installed and tested in SCURL testing env. :-) Let me know what you think!

lopuhin · 2018-07-31T07:35:29Z

> I just added the echo command in the scurl envs so that we know that it's supposed to be installed and tested in SCURL testing env. :-) Let me know what you think!

@nctl144 sorry that I didn't respond right away, please see #110 (comment)

malloxpb · 2018-07-31T22:17:27Z

@lopuhin I just changed the echo command to python -c "from scurl import canonicalize_url" to make sure it's imported successfully 😄

lopuhin

Looks good to me, thanks @nctl144 !

Gallaecio · 2019-08-14T08:48:07Z

@nctl144 I’m sorry that it has been a year. Do you think you will have time to resolve the current conflicts?

malloxpb · 2019-09-03T14:18:30Z

Hey @Gallaecio , I will try to take a look into it and I will let you know as soon as I can :) Sorry for the delay in response

yozachar · 2022-07-20T06:45:29Z

Bumping to close outdated PR.

import scurl if it's installed

6e41c6f

add scurl to tox

0eec6dd

malloxpb added 2 commits July 24, 2018 14:28

move scurl to default env

d3584e8

update travis to support cython compile

bc2fa07

lopuhin reviewed Jul 25, 2018

lopuhin requested a review from kmike July 25, 2018 07:27

malloxpb added 4 commits July 25, 2018 12:34

keep function indented

157fe9d

add scurl as a separate env

1f855d5

add default deps to scurl env

63fee06

run travis for scurl on py27

700960d

lopuhin reviewed Jul 26, 2018

print out a message says scurl is being tested

3721c2c

import scurl in tox to see if it's imported

0d5743f

malloxpb mentioned this pull request Aug 1, 2018

Integrate scurl into w3lib and scrapy scrapy/scurl#16

Open

lopuhin approved these changes Aug 1, 2018

lopuhin changed the title ~~Import scurl if found installed~~ [MRG+1] Import scurl if found installed Aug 1, 2018

malloxpb added 2 commits August 1, 2018 14:11

change the package location

c71d83d

remove Cython from the deps test

4b8f1b9

malloxpb mentioned this pull request Oct 1, 2018

Make sure that canonicalize_url is not different from that of w3lib scrapy/scurl#30

Closed

Merge branch 'master' into scurl

346722a
