
simplify typecasting #215

Closed
atdt opened this issue Nov 21, 2011 · 2 comments
atdt commented Nov 21, 2011

The current mechanism for specifying how redis responses should be cast, set_response_callback, makes it needlessly difficult to implement serialization.

To illustrate, consider the redis commands SDIFF and SINTER. The first thing to note is that it'd be very improbable for someone to want a different casting function for each of them. It's far likelier that anyone who wants to control how SDIFF is cast wants the same implementation for SDIFF, SINTER, SMEMBERS and SUNION. In the off chance that someone does need that kind of granularity, it seems more sensible for them to subclass the client and override the method whose behavior they want to modify.
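For that rare case, the subclass-and-override route might look like the sketch below. FakeRedis and parse_item are stand-ins I've made up so the example is self-contained; they are not redis-py's actual classes.

```python
# Hypothetical sketch of subclass-and-override for per-command casting.
# FakeRedis stands in for redis-py's Redis client so the example runs
# standalone.
class FakeRedis:
    def sdiff(self, *keys):
        # stand-in for a real server response
        return [b'a', b'b']

def parse_item(item):
    # hypothetical per-element transform
    return item.decode()

class MyRedis(FakeRedis):
    def sdiff(self, *keys):
        # override only SDIFF; SINTER, SMEMBERS, SUNION keep their defaults
        return {parse_item(i) for i in super().sdiff(*keys)}
```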

Next, consider that someone writing a casting function for set commands is probably writing something like this:

def set_callback(resp):
    return {parse_item(item) for item in resp}

In other words, they're probably not interested in changing the collection datatype: redis sets map naturally onto Python sets. They're probably simply interested in transforming the value of individual elements. In that case, they're likely to want the same behavior for redis lists, but they can't reuse the function above. They need to write another function, using a list comprehension rather than a set comprehension.
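To make the duplication concrete, here is a minimal sketch (parse_item is a hypothetical per-element transform): the two callbacks differ only in which comprehension they use.

```python
def parse_item(item):
    # hypothetical per-element transform: parse a raw redis reply as an int
    return int(item)

def set_callback(resp):
    return {parse_item(item) for item in resp}

def list_callback(resp):
    # nearly identical logic, but it must be rewritten as a list comprehension
    return [parse_item(item) for item in resp]
```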

Next, consider that if someone is interested in transforming the value of individual elements, they probably also want that transformation applied to the response of commands that return a single item, like GET, HGET, SPOP, etc. But annoyingly these don't appear in response_callbacks, because (IIRC) redis-py has an implicit rule that the response to any command that doesn't have a callback specified is returned as-is.

So it seems to me that set_response_callback is excessively granular in allowing you to define behavior for one command, whereas redis commands fall into natural groupings.

I think it'd be better to separate configuration of redis<->python type mapping into two levels:
#1. Individual values

For example, the init method could accept to_python and to_redis params, which are expected to be callables that redis-py will call with each value to transform it. This way you'd be able to do something like:

import json
from redis import Redis

redis = Redis(to_python=json.loads, to_redis=json.dumps)
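A thin wrapper can illustrate the intended semantics of those two hooks. This is a sketch of the proposal, not an existing redis-py API; ValueCodec is a made-up name.

```python
import json

class ValueCodec:
    # models the proposed to_python/to_redis pair; identity by default
    def __init__(self, to_python=None, to_redis=None):
        self.to_python = to_python or (lambda v: v)
        self.to_redis = to_redis or (lambda v: v)

codec = ValueCodec(to_python=json.loads, to_redis=json.dumps)
stored = codec.to_redis({'a': 1})    # what would be written to redis
restored = codec.to_python(stored)   # what the client would hand back
```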

#2. Collections

These wouldn't be kwargs to init, because I think it's important and natural to stress convention over configuration here. Hashes are dicts; sets are sets; lists are lists. But if someone is dead-set on changing this mapping, there is a way: the class has a collection_map attribute that is initialized to this:

class Redis:
    # ...
    collection_map = {
        'set': set,
        'hash': string_pairs_to_dict,
        'sortedset': ordereddict_from_sortedset,
        # ..etc...
    }

The keys of collection_map are the canonical names for redis's data types, and the values are callables that take an iterable and produce a Python object of the right type.
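As a sketch of that contract, the hash entry could be a callable like the one below; string_pairs_to_dict here is a hypothetical implementation of the name used above.

```python
def string_pairs_to_dict(resp):
    # hypothetical: fold a flat [k1, v1, k2, v2, ...] reply into a dict
    it = iter(resp)
    return dict(zip(it, it))

collection_map = {
    'set': set,
    'hash': string_pairs_to_dict,
}

# each value is a callable from an iterable reply to a Python object
hash_value = collection_map['hash']([b'field', b'value'])
```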

@andymccurdy (Contributor) commented

Thanks for the thoughts.

The callbacks were required to support pipelines (casting delayed results). Since it was easy, I made that system extensible in case anyone wanted to tweak things. I personally don't tweak any of the response callbacks in any of my application code. So I agree that they're not incredibly useful for customization, but they need to be defined somewhere, and if leaving them exposed makes someone's life easier, great.
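The pipeline constraint can be sketched in a few lines: because results only arrive at execute() time, each queued command has to remember which callback to apply. This is a simplified model, not redis-py's actual internals.

```python
class MiniPipeline:
    # simplified model of delayed response casting in a pipeline
    def __init__(self, response_callbacks):
        self.response_callbacks = response_callbacks
        self.queue = []

    def queue_command(self, name, raw_result):
        # in a real pipeline the raw result arrives later, from the server
        self.queue.append((name, raw_result))

    def execute(self):
        # casting is deferred until all responses are in
        return [
            self.response_callbacks.get(name, lambda r: r)(raw)
            for name, raw in self.queue
        ]

pipe = MiniPipeline({'SMEMBERS': set})
pipe.queue_command('SMEMBERS', [b'a', b'a'])
pipe.queue_command('GET', b'x')   # no callback: returned as-is
results = pipe.execute()
```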

I do agree that it'd be nice to make the individual data items easier to serialize/deserialize. I see a number of subtle issues, however.

  • Values that aren't encoded. I might have a bunch of data that's JSON-encoded, but I probably also have other pieces of unencoded data. Do I need a separate client instance to get the raw data?
  • How do you deal with data pointers? For instance, maybe I have a ZSET or SET or LIST of "pointers" to other keys. Those other keys store JSON data, or are hashes, but I only store the ID in the collections. Do I get the list of IDs using one client and feed those IDs to another client to get the real data? That seems unintuitive.

My first inclination is that this kind of logic should be defined in a layer above the client. Perhaps some kind of light-weight object mapper where you can define your objects with fields that know how to serialize/deserialize values. Something like Django's model definitions without any of the relational stuff.
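A minimal sketch of that idea, with hypothetical names throughout: fields declare how their values deserialize, and a model applies them to a raw hash reply.

```python
import json

class Field:
    # default field: pass raw values through untouched
    def to_python(self, raw):
        return raw

class JSONField(Field):
    # field whose values are stored as JSON strings
    def to_python(self, raw):
        return json.loads(raw)

class Model:
    # subclasses declare their fields, Django-model-style
    fields = {}

    @classmethod
    def load(cls, raw_hash):
        # deserialize each declared field of a raw redis hash reply
        return {name: f.to_python(raw_hash[name])
                for name, f in cls.fields.items()}

class Event(Model):
    fields = {'kind': Field(), 'payload': JSONField()}

event = Event.load({'kind': 'click', 'payload': '{"x": 1}'})
```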

Thoughts?

github-actions bot commented Sep 1, 2020

This issue is marked stale. It will be closed in 30 days if it is not updated.

@github-actions github-actions bot added the Stale label Sep 1, 2020
@github-actions github-actions bot closed this as completed Oct 1, 2020