Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Import scurl if found installed #110

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

malloxpb
Copy link
Member

Hey @lopuhin, I made this PR to run scurl test on w3lib itself. Let me know if there's something that can be improved here 馃槃

@malloxpb
Copy link
Member Author

malloxpb commented Jul 24, 2018

I noticed that there's an error on py33 in travis build. I think it's because tox does not support py33 anymore (https://tox.readthedocs.io/en/latest/install.html) Can you take a look at this for me when you have a chance @kmike ? 馃槃

@codecov
Copy link

codecov bot commented Jul 24, 2018

Codecov Report

Merging #110 into master will decrease coverage by 0.07%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #110      +/-   ##
==========================================
- Coverage   95.43%   95.35%   -0.08%     
==========================================
  Files           7        7              
  Lines         482      474       -8     
  Branches       98       95       -3     
==========================================
- Hits          460      452       -8     
  Misses         15       15              
  Partials        7        7
Impacted Files Coverage 螖
w3lib/url.py 97.96% <100%> (-0.08%) 猬囷笍
w3lib/encoding.py 100% <0%> (酶) 猬嗭笍

@malloxpb
Copy link
Member Author

travis also gives the same error on master branch: https://travis-ci.org/nctl144/w3lib

Copy link
Member

@lopuhin lopuhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @nctl144 great progress on this, please have a look at comments. I'll also kindly ask @kmike to have a look at the PR :)

query,
fragment))
try:
from scurl import canonicalize_url
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be to leave canonicalize_url definition unchanged, and have

try:
    from scurl import canonicalize_url
except ImportError:
    pass

after it, or having def _canonicalize_url and canonicalize_url = _canonicalize_url instead of pass. This would allow to get rid of extra indent level and simplify git annotate, but I'd defer the judgment to @kmike here.

Copy link
Member Author

@malloxpb malloxpb Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lopuhin , I just modified the code to go with the approach which changes canonicalize_url to _canonicalize_url. But we'll see how @kmike reviews this 馃槃

.travis.yml Outdated
@@ -1,5 +1,6 @@
language: python
sudo: false
sudo: required
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to add a separate environment where scurl is installed instead of adding it to all environments, because adding scurl makes all builds several minutes slower, and also because now we're always using canonicalize_url from scurl (since it's installed everywhere), while we want to test both implementations. It's possible to override sudo, dist and other options just for one environment. Would be also good to add a comment that this environments is added to test scurl's implementation of canonicalize_url somewhere in the travis.yml.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added the env for scurl only. However, do we need to run it on both py36 and py27 @lopuhin ? 馃

fragment))
try:
from scurl import canonicalize_url
except ImportError as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is some way to make sure that we are in fact running tests with scurl? It's not obvious from the coverage report (https://codecov.io/gh/scrapy/w3lib/pull/110/diff?src=pr&el=tree#D1-396) - for some reason it shows python implementation of canonicalize_url as covered, while we installed scurl in all environments, so it should not be covered, right? We could at least run python -c "from scurl import canonicalize_url" before the tests, but maybe there are better ways to check it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we run python -c "from scurl import canonicalize_url" in tox then probably it will print out ImportError for those envs that don't have scurl installed right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will print out ImportError for those envs that don't have scurl installed right?

Yes, I meant to do this only when running scurl tests, so that we know that scurl at least imports, and it's supposed to be used (would be great to know that for sure though).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we use echo to print out the message we need in tox, @lopuhin ? 馃槃

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nctl144 I see what you mean by echo now, this is not what I had in mind :)
Right now we have this bit in w3lib.url:

try:
    from scurl import canonicalize_url
except ImportError as e:
    canonicalize_url = _canonicalize_url

Now suppose that in scurl environment this import from scurl import canonicalize_url fails for some reason: e.g. the scurl library installed but failed to import due to some .so issues, or we moved the canonicalize_url, or some other reason. But we won't notice this in the tests as they stand now, because w3lib will fall back to the python implementation, and all tests will pass. So I would like for tests in scurl environment to fail if we fall back to pure python w3lib. One way to do it is to add python -c "from scurl import canonicalize_url" to the scurl environments. Or maybe even better would be to make sure that w3lib.url.canonicalize_url is coming from scurl, but will probably need a separate small script.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see your point now @lopuhin , I will just add the command python -c "from scurl import canonicalize_url" for now since it's the shortest way to detect import failure. But we will see what people's opinions on this are 馃槃

@lopuhin lopuhin requested a review from kmike July 25, 2018 07:27
Copy link
Member

@lopuhin lopuhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how the code is structured now, the only bit I'd like to get resolved before the merge would be some way to make sure we are using scurl (or that scurl at least can be imported): #110 (comment)

@malloxpb
Copy link
Member Author

Hey @lopuhin , I just added the echo command in the scurl envs so that we know that it's supposed to be installed and tested in SCURL testing env. :-) Let me know what you think!

@lopuhin
Copy link
Member

lopuhin commented Jul 31, 2018

I just added the echo command in the scurl envs so that we know that it's supposed to be installed and tested in SCURL testing env. :-) Let me know what you think!

@nctl144 sorry that I didn't respond right away, please see #110 (comment)

@malloxpb
Copy link
Member Author

@lopuhin I just changed the echo command to python -c "from scurl import canonicalize_url" to make sure it's imported successfully 馃槃

Copy link
Member

@lopuhin lopuhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @nctl144 !

@lopuhin lopuhin changed the title Import scurl if found installed [MRG+1] Import scurl if found installed Aug 1, 2018
@Gallaecio
Copy link
Member

@nctl144 I鈥檓 sorry that it has been a year. Do you think you will have time to resolve the current conflicts?

@malloxpb
Copy link
Member Author

malloxpb commented Sep 3, 2019

Hey @Gallaecio , I will try to take a look into it and I will let you know as soon as I can :) Sorry for the delay in response

@yozachar
Copy link

Bumping to close outdated PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants