Why not use https://github.com/andymccurdy/redis-py? #2

Closed
foxx opened this issue Jul 1, 2016 · 4 comments

Comments

@foxx

foxx commented Jul 1, 2016

Is there a reason we need another Python Redis library? The de facto standard is https://github.com/andymccurdy/redis-py

@schlitzered
Owner

schlitzered commented Jul 1, 2016

hey, the above lib has some design flaws:

  1. it uses the select system call to check whether there is data to read.
    using select here is problematic. the main problem is that select cannot handle a file descriptor numbered 1024 or higher (the usual FD_SETSIZE limit); any FD beyond that makes select throw an exception.
    see these issues:
    Update connection polling to use something other than select.select() redis/redis-py#486
    redis-py is not compatible with select.select() redis/redis-py#419
    the other problem is that select is simply not required here; setting a timeout on the socket would do the same thing.

with redis cluster & many clients, reaching this 1024 FD limit will become a serious problem.
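
As a minimal sketch of the socket-timeout alternative mentioned above: a blocking read with a timeout set on the socket itself, with no call to select.select() in client code. This is not code from redis-py or pyredis, and `recv_or_none` is a hypothetical helper name chosen for illustration.

```python
import socket


def recv_or_none(sock, timeout, bufsize=4096):
    """Hypothetical helper: read from sock, or return None on timeout.

    Uses the socket's own timeout instead of calling select.select(),
    so client code never builds an fd_set limited to FD_SETSIZE entries.
    """
    sock.settimeout(timeout)
    try:
        return sock.recv(bufsize)
    except socket.timeout:
        return None


# Demo on a local socket pair (no redis server needed).
a, b = socket.socketpair()
nothing_yet = recv_or_none(a, 0.05)  # nothing was sent yet, so this times out
b.sendall(b"+PONG\r\n")
reply = recv_or_none(a, 0.05)        # data is waiting, so it is returned
a.close()
b.close()
```

A real client would still need to buffer partial reads and parse complete replies; the sketch only shows the "is there data within N seconds" check.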

  2. it implements pipeline mode wrong.
    what redis-py is doing:
      1. create a pipeline
      2. throw many requests into a list
      3. execute the query -> push all requests in that list to the wire
      4. fetch all results

this delays execution of the redis commands until the pipeline gets "executed".

if you check the redis pipelining documentation here:
http://redis.io/topics/pipelining

you will find that pipelining should work like this:

you send a number of requests directly to redis and delay fetching the results. you should also take care not to have too many commands in the pipe, no more than 100 or so.

so you have no real pipelining with redis-py. at least, not how it was intended to work.
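
A rough sketch of the pipelining style described above, under some loud assumptions: `send` (writes bytes to the connection) and `read_reply` (reads one reply) are hypothetical caller-supplied callables, and `run_pipelined` is an illustrative name, not the API of redis-py or pyredis. Each command goes to the wire immediately, and replies are only fetched once `max_in_flight` commands are outstanding.

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def run_pipelined(commands, send, read_reply, max_in_flight=100):
    """Send commands right away; delay reading replies, but cap the backlog."""
    pending = 0
    replies = []
    for command in commands:
        send(encode_command(*command))  # written to the wire immediately
        pending += 1
        if pending >= max_in_flight:    # too many unanswered commands: drain one
            replies.append(read_reply())
            pending -= 1
    while pending:                      # collect whatever is still outstanding
        replies.append(read_reply())
        pending -= 1
    return replies


# Stand-in for a real connection: it only records the order of sends and reads.
events = []
replies = run_pipelined(
    [("SET", "a", 1), ("SET", "b", 2), ("GET", "a")],
    send=lambda payload: events.append("send"),
    read_reply=lambda: events.append("read") or "OK",
    max_in_flight=2,
)
```

With a real socket, `send` could be `sock.sendall` and `read_reply` a reply parser. The contrast with the list-then-flush approach is that nothing waits in client memory for an explicit "execute" step.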

  3. redis-py does not support redis cluster.
    to be honest, i only started this lib because i needed redis cluster support. i tried to add redis cluster support to redis-py and quickly ran into the flaw with the select system call.

there were also some other things that did not look appealing to me. as far as i remember, there were some strange decisions in the PubSub client that i did not feel comfortable with.

the other thing is that i was curious how hard it could be to write a redis client implementation.

besides this, this client is also twice as fast doing simple set & get commands, but i guess speed will rarely be a problem in practice.

hope this answers your questions.

@foxx
Author

foxx commented Jul 1, 2016

Interesting, I wasn't aware of those issues. Thank you for taking the time to respond in detail!

foxx closed this as completed Jul 1, 2016

@yohanboniface

> this client is also twice as fast doing simple set & get commands

Can you share the benchmark? I'm evaluating this lib as a replacement for redis-py and was thinking of doing a small speed test, but it seems you have already done one ;)

@schlitzered
Owner

hey, seems like this is something i have to correct. at least i am not able to reproduce the 2x improvement (not even close):

here is the setup: redis running on the same host as the client, doing 1000000 sets, gets & deletes, using python 3.5.1:

without hiredis:
redis plain set: time taken 29.69188094139099
redis plain get: time taken 28.177817583084106
redis plain del: time taken 28.794076681137085
pyredis plain set: time taken 26.752699851989746
pyredis plain get: time taken 25.01215624809265
pyredis plain del: time taken 23.110271692276

with hiredis:
redis plain set: time taken 27.971192598342896
redis plain get: time taken 23.039212226867676
redis plain del: time taken 22.663814067840576
pyredis plain set: time taken 20.268144607543945
pyredis plain get: time taken 19.419324159622192
pyredis plain del: time taken 18.260671138763428

but at least it is still a little bit faster.
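
To put "a little bit faster" into numbers, the timings listed above can be turned into ratios. This is only arithmetic on the figures reported in this comment (rounded to four decimals). It assumes the lines labelled "redis" refer to redis-py, and the dictionary layout and names are made up for illustration.

```python
# (redis-py time, pyredis time) as reported above, rounded to 4 decimals.
# Assumption: the "redis" label in the output means redis-py.
timings = {
    "without hiredis": {
        "set": (29.6919, 26.7527),
        "get": (28.1778, 25.0122),
        "del": (28.7941, 23.1103),
    },
    "with hiredis": {
        "set": (27.9712, 20.2681),
        "get": (23.0392, 19.4193),
        "del": (22.6638, 18.2607),
    },
}

# Ratio > 1 means pyredis finished the same workload faster.
speedups = {
    (mode, op): redis_py / pyredis
    for mode, ops in timings.items()
    for op, (redis_py, pyredis) in ops.items()
}

for (mode, op), ratio in sorted(speedups.items()):
    print("{:16} {:3}  redis-py / pyredis = {:.2f}".format(mode, op, ratio))
```

All six ratios fall between roughly 1.1 and 1.4, so well short of 2x for this particular workload.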

what i also noticed is that on my machine the redis-py client was at around 90% CPU most of the time, while pyredis was at around 80% CPU usage.

but feel free to test for yourself; speed will vary depending on key length, data length and which commands are used.

here is the script i used for doing the benchmark:

https://gist.github.com/schlitzered/191be6cba050d77d283db366abc47a73
