Should we merge the Stream and Channel interfaces? #959

Open

njsmith opened this issue Mar 1, 2019 · 13 comments

njsmith commented Mar 1, 2019

This is a weird/radical idea, but @oremanj's comment here (in response to @Badg) raised some red flags for me:

And the channel interface is nicer than the stream one for incremental processing -- async for chunk in channel: rather than

while True:
    chunk = await stream.receive_some(ARBITRARILY_CHOSEN_POWER_OF_TWO)
    if not chunk:
        break
    ...

At the conceptual level, the output from a process is exactly a Stream (or ReceiveStream or whatever), and if people are trying to jump through hoops to avoid using our Stream ABC to represent the Stream concept, then that seems like a bad sign!

So, let's at least go through the thought experiment: what if we got rid of Stream and used Channel[bytes] instead?

Basic usability

Remembering which is a "stream" and which is a "channel" is super annoying. Merging them would eliminate this problem. Also annoying: constantly going through the pointless ritual of inventing a made-up buffer size (@oremanj's ARBITRARILY_CHOSEN_POWER_OF_TWO). And writing that while True loop over and over is also annoying. Making it a plain await channel.receive() or async for chunk in channel: would eliminate these annoyances.
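For concreteness, here's roughly what the two styles look like side by side. The stream half is today's API; the channel half is the hypothetical merged API, not anything that exists now:

```python
# Today's Stream API: invent a buffer size and loop by hand.
async def pump_stream(stream, sink):
    while True:
        chunk = await stream.receive_some(16384)  # arbitrarily chosen power of two
        if not chunk:
            break
        await sink.send_all(chunk)

# Hypothetical merged Channel[bytes] API: no buffer size, iteration just works.
async def pump_channel(channel, sink):
    async for chunk in channel:
        await sink.send(chunk)
```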

Conceptual level

For me, the major conceptual difference is that I've thought of Channel as inherently preserving object boundaries, i.e., whatever I pass to send is what comes out of receive. In this way of thinking, a Stream is equivalent to a Channel[single_byte], but since handling single bytes individually would be inefficient, it uses batched-send and batched-receive operations. If we do decide to merge Stream and Channel, then we'd have to change this, and start saying that some Channels don't preserve the 1-to-1 mapping between send and receive.
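A quick illustration of the boundary-preserving property, using trio's memory channels next to an arbitrary byte stream (send_stream/receive_stream stand in for the two ends of any connected Stream implementation):

```python
import trio

async def boundary_demo(send_stream, receive_stream):
    # A Channel preserves what you send, object for object:
    send_ch, recv_ch = trio.open_memory_channel(10)
    await send_ch.send(b"hello")
    await send_ch.send(b"world")
    assert await recv_ch.receive() == b"hello"  # exactly the object that was sent

    # A Stream makes no such promise: the two sends below may come back as
    # b"helloworld", or b"hel" then b"loworld", or any other re-chunking.
    await send_stream.send_all(b"hello")
    await send_stream.send_all(b"world")
    first_chunk = await receive_stream.receive_some(1024)
    return first_chunk
```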

I'm not sure how I feel about this. It's certainly doable on a technical level. But conceptually – it feels weird to say that a websocket and a TCP socket are both Channel[bytes], given that one is framed and the other isn't – that's a fundamental difference in their usage. (Right now one is Channel[bytes] and the other is Stream.) It would mean Uint32Framing adaptor doesn't convert a Stream into a Channel[bytes], it converts a Channel[bytes] into another Channel[bytes]. And that a TCP socket and a UDP socket have the same type. Intuitively this feels weird. It seems like this is a distinction you want to expose, and emphasize, on the type level.

An interesting partial counter-example would be h11: an h11.Connection object is essentially a Channel[h11.Event]: you send a sequence of objects like h11.Request, h11.Data, h11.EndOfMessage, and then receive a sequence of similar objects. Sometimes, the objects on the sender and receiver sides match 1-to-1, like Request and EndOfMessage. But sometimes they don't, like Data, which might be arbitrarily rechunked! So if you want to treat h11.Connection as a Channel[h11.Event], it's sort of simultaneously a 1-to-1 Channel and also a re-chunking Channel.
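A rough sketch of what "h11.Connection as a Channel[h11.Event]" might look like when glued to a Trio stream. The class and its methods are invented for illustration; only the h11 calls (send, receive_data, next_event) are the real library API, and PAUSED/error handling is omitted:

```python
import h11
import trio

class H11EventChannel:
    """Treat an h11.Connection plus a byte Stream as a Channel[h11.Event]."""

    def __init__(self, stream, our_role):
        self._stream = stream
        self._conn = h11.Connection(our_role=our_role)

    async def send(self, event):
        # 1-to-1 on the sender side: one event in, some bytes out on the wire.
        data = self._conn.send(event)
        if data:
            await self._stream.send_all(data)

    async def receive(self):
        # Not necessarily 1-to-1 on the receiver side: Data events may have
        # been re-chunked arbitrarily by the transport.
        while True:
            event = self._conn.next_event()
            if event is h11.NEED_DATA:
                self._conn.receive_data(await self._stream.receive_some(16384))
                continue
            return event
```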

One possibility is to distinguish them somehow at the type level, but make them more consistent, or identical, in terms of the operations they happen to implement.

In Liskov terms, a 1-to-1/framed Channel[bytes] IS-A rechunking/unframed Channel[bytes] – all it does is add stronger guarantees. So merging them would at least have that going for it. But I don't put a huge amount of weight on that – in practice they're used very differently. Actually, they are not Liskov-compatible – see below!

Technical level

Currently we have:

  • Stream: send_all, wait_send_all_might_not_block, send_eof, receive_some

  • Channel: send, send_nowait, receive, receive_nowait, clone, iteration

Problematic bits:

So all those might get sorted out? And send_all and send are already basically the same. So that just leaves async def receive() versus async def receive_some(max_nbytes). And max_nbytes is also the obstacle to having iteration. ...basically this is THE core distinction between the two APIs. So, what do we think about max_nbytes?

Specifying max_nbytes manually all the time is tiresome and annoying, as noted above.

Also, I note that Twisted/Asyncio/libuv always handle max_nbytes internally, and the user just deals with whatever they get.

Most Stream users basically want to read everything, and the only thing max_nbytes affects is efficiency, not correctness. In practice it's almost always set arbitrarily. I've never seen anyone even benchmark different values, except in extreme cases like trying to transmit multiple gigabytes/second through python. For SocketStream, there's some penalty for setting it too big – Python has to first allocate a max_nbytes-sized buffer, then realloc it down to size (see). And of course if you set it too small then you pay some overhead from doing lots of small recvs instead of one big one. So you want some kind of "not too big, not too little" setting.

For other Stream implementations, this doesn't apply – for example, SSLStream.receive_some forces you to pass max_nbytes, and that controls how many bytes it reads out of its internal decrypted data buffer at any one time, but this has no effect at all on how much data it reads at a time from the underlying socket, when it needs to refill its buffer. That's controlled by the constructor argument SSLStream(max_refill_bytes=...).

There are also cases where there is a "natural" size to return from receive_some. For example:

Given that most people don't tune it at all, I bet if we did a bit of benchmarking then we could pick a default SocketStream recv size that would work better than 99% of what people currently do. And I guess we'd make this an argument to the SocketStream / SocketChannel constructor, exactly like how SSLStream currently works, so people could override it if they want. This could complicate code where the stream is constructed implicitly though, like p = Process(..., stdout=PIPE) – if you don't want p.stdout to use the default max_nbytes setting, then how do you specify something different? Some options:

  • We could simply set the default and tell everyone to live with it.

  • We could add some way to pass this through, like Process(..., stdout=NewPipe(max_nbytes=...)).

  • We could provide some API to mutate it, like process.stdout.max_nbytes = new_value.

  • We could tell people with this unusual requirement that they should create their own pipe with whatever settings they want (this functionality is somewhat needed anyway, see Add support for talking to our stdin/stdout/stderr as streams #174, support for windows named pipes #824), then pass in one end by hand (see the sketch below).
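Here's a hedged, Unix-only sketch of that last option, using today's trio.lowlevel.open_process and trio.lowlevel.FdStream. These helpers aren't part of the proposal above; they're just one way to show the shape of the workaround:

```python
import os
import trio

async def read_child_output(command):
    # Make our own pipe so we control how we read from it, hand the write end
    # to the child, and wrap the read end ourselves.
    read_fd, write_fd = os.pipe()
    proc = await trio.lowlevel.open_process(command, stdout=write_fd)
    os.close(write_fd)  # drop the parent's copy so EOF arrives when the child exits
    async with trio.lowlevel.FdStream(read_fd) as stdout_stream:
        while True:
            chunk = await stdout_stream.receive_some(4096)  # our own choice of size
            if not chunk:
                break
            print(chunk)
    return await proc.wait()
```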

What about cases where correctness does depend on setting max_nbytes? It can never be the case that setting max_nbytes too small affects correctness, because Stream is already free to truncate max_nbytes to some smaller value if it wants to – no guarantees. But, we do make a guarantee that we won't return more than max_nbytes.

That... actually is important for correctness in some cases. For example, from this comment:

4. We should provide a trio.input, that's like builtins.input but async and routed through our stdio-handling machinery. Probably it just calls receive_some(1) a bunch of times until it sees a newline.

This is why we can't quite think of our current Channel[bytes] as being a sub-interface of Stream – in this one very specific case, Stream genuinely has slightly more functionality.

This is probably a rare occurrence in practice. Most protocols need an internal buffer anyway, so any over-reads just go into the buffer for next time. And sometimes you want to hand-off between protocols, e.g. SSLStream.unwrap, or something like switching from HTTP/1.1 to Websocket... but in those cases we generally don't try to avoid over-reading from the underlying stream. Instead, we just accept that some over-read may have happened, and give it to the user to deal with (example 1, example 2). And in many cases, it's actually impossible to avoid this in any efficient way – e.g. if you have a newline-delimited protocol, then you have no idea where the next line boundary will be, so the only way to avoid over-read is to read one-byte-at-a-time, which is way too inefficient. In theory we could avoid it for TLS (which is length-prefixed), or for other length-prefixed protocols (like Uint32FramedChannel), but it doesn't seem worth it in most cases.

The special thing about trio.input is that it's sharing the process's stdin with who-knows-what-else, so we can't coordinate our buffer usage with other users, and are reduced to this kind of stone-age receive_some(1) technique.
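A minimal sketch of that stone-age technique, i.e. roughly what a trio.input built on receive_some(1) might look like (the function and its arguments are hypothetical, not a real API):

```python
async def trio_input(stdin_stream, prompt=""):
    # Read one byte at a time so we never consume past the newline -- anything
    # we over-read would be lost to other users of the shared stdin.
    print(prompt, end="", flush=True)
    line = bytearray()
    while True:
        byte = await stdin_stream.receive_some(1)
        if not byte or byte == b"\n":  # EOF or end of line
            break
        line += byte
    return line.decode()
```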

Some options here:

  • Treat this as a special case for trio.input, and implement it using some specialized tools. E.g., make sure that open_stdin can safely be called repeatedly within a single program and returns different handles that don't interfere with each other, and then have trio.input do async with trio.open_stdin(max_nbytes=1): .... Or provide some low-level receive_some_from_stdin function, or something.

  • Have some Channel[bytes] implementations where receive takes an optional max_nbytes argument, as a matter of convention.

  • Same as previous point, but also formalize this convention as a named sub-interface – though I'm having trouble thinking of a good name! This might help with our problem up above, about wanting some more informative way to describe the type of Uint32Framing? But of course proliferating names always has its own cost, especially if the names are awkward.

    Also, naming the interface creates an interesting challenge: how do you type StapledChannel? You want StapledChannel.receive to have the same signature as StapledChannel.receive_channel.receive, and at runtime this is easy – just use *args, **kwargs. But if we name this sub-interface, then the proper static type for StapledChannel depends on the static type of its ReceiveChannel. I'm not sure whether giving StapledChannel the right static type matters or not.
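To make the typing question concrete, here's the runtime side of that delegation (a sketch; the static signature a type checker would want for receive is exactly the open question):

```python
class StapledChannel:
    """Glue a SendChannel and a ReceiveChannel into one bidirectional object."""

    def __init__(self, send_channel, receive_channel):
        self.send_channel = send_channel
        self.receive_channel = receive_channel

    async def send(self, value):
        await self.send_channel.send(value)

    async def receive(self, *args, **kwargs):
        # At runtime this trivially forwards whatever optional max_nbytes-style
        # arguments the wrapped channel happens to accept; statically, the right
        # signature depends on which sub-interface receive_channel implements.
        return await self.receive_channel.receive(*args, **kwargs)
```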

smurfix commented Mar 1, 2019

I quite agree about restricting clone and _nowait to MemoryChannels. We don't want that in our generic interface, it's super complex to set up. If you need that, writing a MemoryChannel that works as a clonable and _nowaitable frontispiece of any other channel is simple enough.

I'd argue that the distinguishing element of Channel vs. Stream isn't that one has message boundaries and the other doesn't, but the fact that one transports a multitude of whatever-it-is-you-transport (i.e. not arbitrary-length byte arrays but a whole lot of single bytes, for which Py3 doesn't even have a distinct type) and the other transports exactly one message (of whatever type) at a time. The intermediate stage, i.e. the single chunk of bytes which encapsulates one MsgPack or protobuf message, or an LF-terminated run of bytes we call a line, or an LF-terminated run of Unicode characters we also call a line, is often hidden inside the StreamChanneler (i.e. the thing that encodes a channel's message to a byte stream).

You never want to read a single byte from a stream (even though external reality sometimes forces you to), and you never want to read multiple objects at a time from a channel (except maybe for super-high performance, like in the old UnboundedQueue, or as an atomic "give me any outstanding messages and then die" operation). The same reasoning applies to sending.

Yes, Uint32Framing converts a stream into a channel. So does LFdelimitedLine. This is by design, and far easier to ensure correct usage of than "if you use a StapledChannel-transmitting-bytes on top of a Stream-of-bytes [i.e. without interposing a Uint32Framing filter] then your code works fine while testing but you'll get screwed as soon as you use it in the real world". It's also far easier to ensure type safety for, since we don't need to overload bytes to mean two different things depending on some nebulous context.
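For reference, a sketch of the LFdelimitedLine idea: a small adapter that turns a byte Stream into a Channel[bytes] of lines. The class name and details are invented for illustration; note the unavoidable over-read it buffers internally, as discussed above:

```python
import trio

class LineChannel:
    """Wrap a byte Stream and expose it as a channel of newline-delimited lines."""

    def __init__(self, stream):
        self._stream = stream
        self._buf = bytearray()

    async def send(self, line: bytes):
        await self._stream.send_all(line + b"\n")

    async def receive(self) -> bytes:
        # We can't know where the next newline is, so whatever we read past it
        # stays in our buffer for the next call.
        while b"\n" not in self._buf:
            data = await self._stream.receive_some(4096)
            if not data:
                raise trio.EndOfChannel
            self._buf += data
        line, _, rest = bytes(self._buf).partition(b"\n")
        self._buf = bytearray(rest)
        return line

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return await self.receive()
        except trio.EndOfChannel:
            raise StopAsyncIteration
```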

oremanj commented Mar 1, 2019

Interesting idea! My initial thought is that people (beginners especially) seem to often have trouble understanding that, no really, TCP/TLS/etc doesn't preserve message boundaries, not even a little bit. Having that distinction be obvious in the type system and in the names of the functions you call (especially receive_some) seems like a substantial win for teachability. I'm worried about the "it works on localhost and fails once I put real data through it" possibility that @smurfix mentions.

I think it probably is true that if you want to iterate through "messages" on a stream, correctness-focus says you should have to specify how those messages are delimited. If we have an interface/mechanism/library for that, it becomes trivial to have an ArbitraryChunksFramer (assuming we refer to this concept as a Framer - I sort of like Protocol too but that might make people think of asyncio/twisted too much) that implements the receive_some loop discussed above. We could separately make the argument to receive_some() optional, and put some thought into choosing a good default.

A compromise approach might be to make Stream mostly duck-type-compatible with Channel (rename send_all to send, receive_some to receive, make the max_bytes argument optional, and add async iteration) but not have either one be a subtype of the other.

oremanj commented Mar 2, 2019

Another factor to consider: how would all of this interact with passing credentials or FDs over a UNIX domain socket? These show up at a specific byte offset in the stream (though it gets fuzzy if they're sent alongside more than one byte of normal data -- from experimentation on Linux, if you send the control message alongside multiple bytes of data, the receiver will get it in the recvmsg() call that consumes the first byte of that data, and a single call to recvmsg() won't bridge the gap between the last byte of that data and the first byte after it). I guess there's a similar consideration with TCP urgent data, though I don't know if anyone actually uses that.

njsmith commented Mar 4, 2019

@njsmith

Most Stream users basically want to read everything, and the only thing max_nbytes affects is efficiency, not correctness.

On further thought, there is another wrinkle to setting the max_nbytes size that I didn't mention up above: large values → more buffer → less fine-grained backpressure and more bufferbloat. For many applications, a few tens of kilobytes of buffering are negligible, but it's certainly possible to construct cases where it matters.

@smurfix

I'd argue that the distinguishing element of Channel vs. Stream isn't that one has message boundaries and the other doesn't, but the fact that one transports a multitude of whatever-it-is-you-transport (i.e. not arbitrary-length bytes arrays but a whole lot of single bytes, for which Py3 doesn't even have a distinct type) and the other transports exactly one message (of whatever type) at a time.

Right, that's how we think of it right now. If we switched to using Channel for things like TCP streams, then we would have to switch to thinking in terms of boundaries instead.

@oremanj

Another factor to consider: how would all of this interact with passing credentials or FDs over a UNIX domain socket? [...] I guess there's a similar consideration with TCP urgent data, though I don't know if anyone actually uses that.

I think these are outside the scope of the abstract stream/channel/whatever interface? Certainly the way Trio works right now, you can do those things, but not using SocketStream – you have to drop down to the full-fledged trio.socket layer instead.

I guess the case where this might be tricky is if you want to use SocketStream most of the time, and only drop down to trio.socket when you have to, AND if there's some reason why you have to use a bounded recv call in order to manage that switch-over between the high-level and low-level interfaces. I don't know enough about the SCM_* and URG APIs to even make a guess about that...


So I think we can divide the stuff in this thread into two major ideas.

First major idea

Maybe our bytestream interface would be more friendly if we made max_nbytes optional, and implemented __aiter__.
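A minimal sketch of that shape (illustrative only, not the actual ABC; it arbitrarily picks one of the two signature options discussed next):

```python
class FriendlyReceiveStream:
    async def receive_some(self, max_nbytes=None):
        raise NotImplementedError  # concrete implementations pick a sensible default

    def __aiter__(self):
        return self

    async def __anext__(self):
        chunk = await self.receive_some()
        if not chunk:               # b"" still means EOF here
            raise StopAsyncIteration
        return chunk
```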

If we do this, then there's an open question about whether we should make max_nbytes optional for the consumer but mandatory for the stream implementor (the ABC's signature is receive_some(max_nbytes=None)), or make it optional for both (the ABC's signature is receive_some(), but some concrete implementations add a max_nbytes=None argument). One place where this matters:

Second major idea

Maybe we should somehow connect streams and channels more closely in terms of names/types/concepts.

Along these lines, here's another possibility to think about: rename Stream → ByteChannel, while keeping roughly the same API as it has now. Document it as "conceptually, it's a specialized channel where you send and receive individual bytes, but for efficiency and convenience the API is built around batched-send and batched-receive".

SendByteChannel/ReceiveByteChannel are kinda awkward names, but I guess most people will barely ever encounter those.

If we go this way, then should EOF be indicated by receive_some returning b"", or by raising EndOfChannel?

smurfix commented Mar 4, 2019 via email

njsmith commented Mar 4, 2019

Well … I'm still convinced that a clean separation of "one thing at a time" and "multiple things at a time, without a boundary between them" makes a lot of sense, conceptually as well as for type safety and whatnot.

What do you think of ByteChannel vs Channel, as a way to emphasize both the similarities and differences?

Frankly I consider that a Unix wart. I mean, no data available on a stream raises an error (EAGAIN), but a closed stream returns an empty string??

EAGAIN is a weird quirk from retrofitting non-blocking operation onto an originally blocking model... also you can't exactly use b"" to replace EAGAIN in like, a connect call :-).

But anyway, for blocking operations, which is what Trio uses as a model, Unix and C and Python are all consistent about using b"" to indicate EOF. I guess this comes from files (where it makes total sense that calling read when the file pointer is at EOF returns b""). For streaming data, it is a bit strange, but it's such a strong tradition that I hesitate to break from it...

I guess the other question is, which approach leads to more convenient code in common cases. Supporting async for would erase a lot of the difference, so I guess we're looking specifically for cases where you wouldn't use async for? The one that comes to mind is in wrappers, like SSLStream or sans-io adapter code, where the pattern is that occasionally while doing some other operation you have to call some sort of refill_buffer helper.

Also, on a packetized bytestream (which Unix doesn't have, historically), how do you distinguish between EOF and an empty packet?

Heh, I was just looking at this. Linux actually has two not-quite-standard packetized byte streams: Unix domain sockets with SOCK_SEQPACKET and pipes with O_DIRECT. With SOCK_SEQPACKET, you can send a zero-byte packet and AFAICT for the receiver it's reported exactly the same way as EOF, whoops. With O_DIRECT pipes, zero-byte sends are documented as forbidden, and in practice they're silently discarded.
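For anyone who wants to see the SOCK_SEQPACKET quirk for themselves, a small Linux-only snippet (an untested sketch that just restates the observation above):

```python
import socket

# Linux-only: SOCK_SEQPACKET preserves packet boundaries over an AF_UNIX pair.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
a.send(b"")          # a zero-byte packet...
print(b.recv(4096))  # ...shows up as b"", indistinguishable from EOF
a.close()
print(b.recv(4096))  # b"" again, this time from the real EOF
b.close()
```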

njsmith commented Mar 4, 2019

But anyway, for blocking operations, which is what Trio uses as a model, Unix and C and Python are all consistent about using b"" to indicate EOF. I guess this comes from files (where it makes total sense that calling read when the file pointer is at EOF returns b""). For streaming data, it is a bit strange, but it's such a strong tradition that I hesitate to break from it...

Thinking about this more, I realized that there's also a very straightforward conceptual justification for ByteChannel.receive_some using b"" to indicate end-of-channel.

Think of ByteChannel.receive_some as being a loop that makes multiple calls to Channel[single_byte].receive. What should it do if it's looping along and then Channel[single_byte].receive raises EndOfChannel? Obviously it can't let the exception escape, because then it will lose all of the bytes it gathered on previous passes. Really the only sensible thing to do is to swallow the exception, break out of the loop, and return all the bytes that it gathered before getting EndOfChannel.

OK, so then what happens if we take this logic, and apply it to a case where the first call to Channel[single_byte].receive raises EndOfChannel? We automatically get receive_some returning b"".

Another way to think of it: receive_some returns an iterable of single bytes. The guarantee is that the iterable you get is a somewhat-greedy sub-iterable of the underlying Channel[single_byte]. So the relevant thing isn't the behavior of Channel.receive, it's the behavior of async for _ in channel_obj. And when iterating a Channel, end-of-channel is indicated by terminating the iteration.
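Spelling the argument out as code, with an imaginary Channel[single_byte] that has the usual receive/receive_nowait pair (sketch only):

```python
import trio

async def receive_some(byte_channel, max_nbytes):
    # Conceptual ByteChannel.receive_some built on a hypothetical
    # Channel[single_byte]; "somewhat greedy" and EOF-as-b"" both fall out.
    chunk = bytearray()
    try:
        chunk += await byte_channel.receive()        # block for the first byte
        while len(chunk) < max_nbytes:
            chunk += byte_channel.receive_nowait()   # grab whatever else is ready
    except trio.EndOfChannel:
        pass   # swallow it and return what we have -- possibly b"", if it was
               # raised on the very first receive()
    except trio.WouldBlock:
        pass   # nothing more immediately available
    return bytes(chunk)
```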

@JefffHofffman

Hiya,
I'm new to trio and GitHub. I'm competent in writing Twisted apps, but kinda shaky on internals. While trying to make a naive ConsoleStream (stdin, stdout) I ran into a Stream/Channel mixup and wound up here. Since smurfix already brought up half of the topics in my head, I figured this is the place to chime in. If my noob question belongs elsewhere, please let me know.

I'm unable to make a LoopbackStream that connects to my ConsoleStream. The desired result is a simple echo server in memory without TCP. Type in a line, hit enter, line is redisplayed, repeat. I can wire up Console to do this by itself but my goal is to later replace the 'server' logic with something non-trivial. Seems desirable to have server logic for a stream be independent of what pumps the stream. Is there an easy solution I missed? I'm about to force Memory*Channel into a StapledStream and that doesn't feel right.

a Channel carries objects while a Stream carries bytes.

Totally missed it. First thing I did with trio was make a Stream do the frame problem, receive many type As and send many type Bs, kinda like Twisted's bytes->string NetstringReceiver. I'd like trio to have one base class so I can easily stack these transforms (Protocols). I can see how one (Streams?) was dominated by getting multiple bytes to/from the OS and the other (Channel) dealt with single chunks (objects). So yes, it seems that Sockets, Process, Files and MemoryQueues can all share top-level calls. Not sure if making all options possible reaps the benefits of sharing.

What fits my head is an app-facing baseclass with receive_some(N) that only returns with 1 to N things, possibly bytes or unichars or dicts or .... The lower layers seem a better place to deal with if not data -> OkNotYet or DisconnError. (Why not lean on trio's excellent exception handling?)
send_all(iterable) works like any other iterable in that send_all([msg]) is used for one multi-byte message. Only the implementation determines when/if for i in iterable is sensible. No reason to bake it into the ABC.

Anyhoo, I'm rambling. I'm way excited about what trio has and can accomplish. Once I get out of the kiddie pool I'll attempt to contribute more than freshman opinions.

njsmith commented Mar 11, 2019

@JefffHofffman Hey, welcome to Trio! 👋

There's a basic philosophical difference between Twisted and Trio that I suspect might be tripping you up. In Twisted, usually it's Twisted that takes charge of making things actually happen. Your job is to build the car, and then Twisted drives it. In Trio, it's much more like "regular" Python, where if you want something to happen, you write a loop, or call a function, or something – you have to drive your car yourself :-).

This doesn't mean you can't separate your protocol parsing logic from your stream pumping logic – see #796, and sans-io.readthedocs.io. But you will generally have some loop somewhere.

In Trio's approach, "connecting" two streams doesn't make a lot of sense... you could write a loop to proxy between them, but for an echo server it'd be a lot simpler to just proxy the original stream's input to its output directly :-).

The main way we do abstraction is composition: if you want to add TLS encryption to a stream, you wrap it in an SSLStream object, and then when you call the send_all or receive_some methods on the SSLStream, it performs the appropriate operations on the underlying transport stream. If you want to speak websocket over a stream, you can wrap a trio_websocket.WebSocketConnection object around it, and then you call methods like websocket_obj.send_message, websocket_obj.get_message. For the equivalent of twisted's NetstringReceiver, you want an object that wraps around a Stream, and has methods to send/receive individual frames. (This is discussed more in #796.)

BTW, if you're interested in console I/O, you might want to check out #174, which is our tracking issue for console I/O support in Trio. (The last comment in particular has a summary of what needs to happen.)

What fits my head is an app-facing baseclass with receive_some(N) that only returns with 1 to N things, possibly bytes or unichars or dicts or .... The lower layers seem a better place to deal with if not data -> OkNotYet or DisconnError. (Why not lean on trio's excellent exception handling?)

Hmm, adding batched send_all and receive_some to channels in general is an interesting idea. But, I don't think it would help with unifying streams and channels... for a Stream, the return type of receive_some is a single bytes object. For a Channel[bytes], the return type of receive_some would be a list of bytes objects, like [b"message 1", b"message 2", ...].
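Roughly, the return-type mismatch looks like this (the channel-side receive_some here is the hypothetical batched method, not an existing API):

```python
async def compare(stream, channel):
    # Stream: one call returns one blob of bytes.
    chunk = await stream.receive_some(4096)   # e.g. b"message 1message 2"
    # Hypothetical batched Channel[bytes]: one call returns a list of messages.
    batch = await channel.receive_some(10)    # e.g. [b"message 1", b"message 2"]
    return chunk, batch
```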

@JefffHofffman

Excellent links. Thx! I think I hit the framing problem early on and attempted to make an internal Stream do send_all(b"line\n") and receive_some(1) -> str("line"). I thought it could be used not only as a two pronged plug into the OS, but also as an internal wire to do conversion (Queue with different in/out types). Probably not the intended usage.

I had no difficulties with pump-it-yourself approach. I think I was aiming for composition with a more functional style pump = dot(S1.f, S2.g). Composing by wrapping/subclassing is another pump = lambda x: S1.f(S2.g(x)) a la decorators and super().

Is "blah" one string or a list-y iterable? Seems like Stream says it's a list and uses extend() type calls. Channel says it's one and uses append() style on a list of strings. I'm used to Python letting me read/write either way on the same thing. There can be specialized code to deal with nested lists for desired char/word/sentence logic, but the underlying object at each level (iterable list) is the same so there's not a different read and write mechanism for every level.

Perhaps since Channel can do nesting, it's not a big deal. I'll look at the framing topic on how to convert.

smurfix commented Mar 13, 2019

Another thought (which I meant to post a week ago but forgot to):

Separating BytesChannel and Channel does imply that a filter that transcodes arbitrary strings to bytes can't be used as a filter that translates Unicode lines to byte lines. However, I'd argue that this is a non-problem because we obviously want to preserve framing when reading lines, while bytes-chunk boundaries may split a UTF-8 sequence. Translating lines takes a five-line class – byte buffers, on the other hand, require a sans-IO-encapsulation of a codecs.IncrementalDecoder.
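A small sketch of that asymmetry: per-line decoding is trivial because framing is preserved, while decoding an unframed byte stream needs state carried across chunks, which is what codecs.IncrementalDecoder provides:

```python
import codecs

# Framed case: each received line is a complete unit, so decoding is one call.
def decode_line(line: bytes) -> str:
    return line.decode("utf-8")

# Unframed case: a receive_some() chunk may end in the middle of a UTF-8
# sequence, so the decoder has to keep state between chunks.
decoder = codecs.getincrementaldecoder("utf-8")()

def decode_chunk(chunk: bytes, final: bool = False) -> str:
    return decoder.decode(chunk, final)
```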

Yes it's very convenient to be able to use the same(-sounding) methods for both BytesChannel and Channel but IMHO they're conceptually different and thus should have different names. They also have way different typical usages. For a BytesChannel (aka Stream), receive_some and send_all are the essential basic methods for data transfer, while on a Channel these would be "receive/send multiple objects at a time" convenience methods which I won't need or use – I receive messages with async for … and I (almost) never have more than one ready to send anyway.

@JefffHofffman

Occasionally I work with a Subscriber that receives many small messages. Like the rationale for not wanting to go through await for each byte, there's a use case to process these messages in a batch from one await. Having *_many() convenience methods that await each one might negate the benefits of batch processing. My guess is *_one() convenience methods have far less impact. But yeah, I can see how a tad extra for the majority of cases might be a bad move.

I'll try out Channel[list of msg]. I'm just afraid it won't play nice with all the handy modules dominated by send_one and Channel[msg].

esnyder commented May 30, 2019

Hey all,

Raw newbie to trio here. (Hi Nathaniel, long time since monotone :) )

FWIW, and since I didn't see anyone else directly respond to the first idea / second idea suggestion here, I really like the first and am lukewarm at best about the second.

For the first, it feels like the perfect little extra affordance; most of the time I don't have any idea what max_nbytes should be and am extremely unlikely to do the testing to figure it out. Having the async for loop just work and take that decision off my hands is perfect.

As for major idea two; I think there is too much painful history and collective knowledge around all the arcana of unix socket behavior that the existing streams interface is heir to. Turning everything into channels makes it that much harder for people to reason about how it matches up under the covers.

Trio looks super nice; thanks to all the contributors!

@njsmith
...(snip)...
So I think we can divide the stuff in this thread into two major ideas.

First major idea

Maybe our bytestream interface would be more friendly if we made max_nbytes optional, and implemented __aiter__.

If we do this, then there's an open question about whether we should make max_nbytes optional for the consumer but mandatory for the stream implementor (the ABC's signature is receive_some(max_nbytes=None)), or make it optional for both (the ABC's signature is receive_some(), but some concrete implementations add a max_nbytes=None argument). One place where this matters:

Second major idea

Maybe we should somehow connect streams and channels more closely in terms of names/types/concepts.
