Thoughts on Postcard and writing a network protocol

2024-05-14

ℹ️ These are misc notes and thoughts towards figuring out how to handle the diverse transports that postcard-rpc may traverse. As of 2024-05-14, USB is the most robustly supported transport, but we'll want to expand this to support serial transports, and particularly to support cases where one side (the client) isn't necessarily on a std based OS.

One way to approach this is the traditional-ish OSI or TCP/IP layered model (flavored in my personal opinion):

bonus:

Thoughts

We need a "pick and mix" approach to layering. Some transports provide many of the necessary pieces "out of the box", but still need gaps such as framing filled in.

TCP uses a sliding window and connection oriented synchronization.

https://en.wikipedia.org/wiki/Sliding_window_protocol is a good overview of possible techniques we could use.
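As a rough illustration of one technique from that family, here is a minimal sketch of Go-Back-N sender bookkeeping. It assumes 8-bit wrapping sequence numbers and cumulative ACKs, tracks sequence numbers only (no payload storage or timers), and the `WindowSender` name and its methods are made up for this note rather than taken from any existing crate.

```rust
/// Go-Back-N style sender state. Only sequence numbers are tracked here;
/// payload buffering and timers are left out of this sketch.
#[derive(Debug)]
pub struct WindowSender {
    base: u8,   // oldest unacknowledged sequence number
    next: u8,   // next sequence number to hand out
    window: u8, // maximum number of frames in flight
}

impl WindowSender {
    pub fn new(window: u8) -> Self {
        // Keep the window well below the 256-value sequence space so that
        // wrapping distances stay unambiguous.
        assert!(window > 0 && window < 128);
        Self { base: 0, next: 0, window }
    }

    /// Number of frames sent but not yet acknowledged.
    pub fn in_flight(&self) -> u8 {
        self.next.wrapping_sub(self.base)
    }

    /// Reserve the next sequence number, or `None` if the window is full.
    pub fn try_send(&mut self) -> Option<u8> {
        if self.in_flight() >= self.window {
            return None;
        }
        let seq = self.next;
        self.next = self.next.wrapping_add(1);
        Some(seq)
    }

    /// Cumulative ACK: everything up to and including `seq` arrived.
    /// Returns how many frames were newly acknowledged (0 if stale).
    pub fn on_ack(&mut self, seq: u8) -> u8 {
        let acked = seq.wrapping_sub(self.base).wrapping_add(1);
        if acked == 0 || acked > self.in_flight() {
            return 0;
        }
        self.base = self.base.wrapping_add(acked);
        acked
    }

    /// On timeout, Go-Back-N resends every unacknowledged frame in order.
    pub fn on_timeout(&self) -> impl Iterator<Item = u8> {
        let base = self.base;
        (0..self.in_flight()).map(move |i| base.wrapping_add(i))
    }
}
```

Even this small version hints at the buffer question: to retransmit, the sender has to keep up to `window` frames around somewhere.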

It would also be interesting to know what UDP-based protocols like H3 do for guaranteed delivery over a packet medium as well.

https://bsky.app/profile/jamesmunns.com/post/3ksgn5l55wc2s - asking if anyone knows anything about HTTP3 flow control.

Really, I could build something totally separate to postcard-rpc, which would be a frame oriented network protocol. The goal would be to have "stubbable" and "pick and mix" layers that could expand to the necessary depth.
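To make "stubbable" a bit more concrete, here is a minimal sketch of what composable layers could look like. Everything in it is hypothetical: the `Layer` trait, the `Stub` and `Checksum` layers, and the `Stack` combinator are invented for illustration, and it uses `Vec<u8>` from std for brevity, which a no-alloc version would need to replace.

```rust
/// Hypothetical layer interface: each layer wraps outgoing payloads and
/// peels its own wrapping off incoming frames.
pub trait Layer {
    fn wrap(&mut self, payload: &[u8], out: &mut Vec<u8>);
    /// Returns the inner payload, or `None` if the frame is rejected.
    fn peel<'a>(&mut self, frame: &'a [u8]) -> Option<&'a [u8]>;
}

/// A stubbed-out layer, for transports that already provide this piece.
pub struct Stub;

impl Layer for Stub {
    fn wrap(&mut self, payload: &[u8], out: &mut Vec<u8>) {
        out.extend_from_slice(payload);
    }
    fn peel<'a>(&mut self, frame: &'a [u8]) -> Option<&'a [u8]> {
        Some(frame)
    }
}

/// A toy integrity layer: appends a one-byte additive checksum.
pub struct Checksum;

fn sum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

impl Layer for Checksum {
    fn wrap(&mut self, payload: &[u8], out: &mut Vec<u8>) {
        out.extend_from_slice(payload);
        out.push(sum(payload));
    }
    fn peel<'a>(&mut self, frame: &'a [u8]) -> Option<&'a [u8]> {
        let (last, body) = frame.split_last()?;
        (sum(body) == *last).then_some(body)
    }
}

/// Two layers composed: `Outer` sits closer to the wire than `Inner`.
pub struct Stack<Outer, Inner>(pub Outer, pub Inner);

impl<O: Layer, I: Layer> Layer for Stack<O, I> {
    fn wrap(&mut self, payload: &[u8], out: &mut Vec<u8>) {
        let mut tmp = Vec::new();
        self.1.wrap(payload, &mut tmp); // inner layer first
        self.0.wrap(&tmp, out); // then the outer layer
    }
    fn peel<'a>(&mut self, frame: &'a [u8]) -> Option<&'a [u8]> {
        let inner = self.0.peel(frame)?; // outer layer first
        self.1.peel(inner)
    }
}
```

A transport that already has integrity checking would use `Stack(Stub, ...)` in that position, while a bare UART might use `Stack(Checksum, ...)`.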

Interfaces

Rather than officially layering everything, it might be reasonable to define a specific client and server API working on sockets + frames. Postcard-rpc is currently a session-less protocol. It would be possible to refactor this to use keys as essentially ports, but I think ports and endpoints tend to have overlapping responsibilities.

I'm unsure how we separate "session-ful" and "session-less" interfaces, as flow control is likely coupled to session state. It isn't like postcard-rpc is REALLY session-less, but rather "implicit single session". This is necessary when postcard-rpc has to bring all of its own semantics, but perhaps not necessary otherwise.

Issues with frames

One issue with frames in the absence of an allocator is the lifetime of variable sized data. On the desktop it is very easy to think of frames as Vec<u8> downstream of ser/de, but this is more challenging on embedded systems.

The two ways I have approached this are:

  1. "callback style" handling of incoming messages, where all handling and capture of the borrowed frame are handled downstream of a single borrow. We decode the frame, then pass a &[u8] or similar, never allowing the caller to own the data.
    • PROS: There is little wasted space, and you only ever need a single message frame size as scratch memory — e.g. if your largest frame is 256 bytes, you only need a single 256 byte buffer to hold "in flight" incoming messages.
    • CONS: This forces a certain usage pattern, and is one reason we need to handle dispatching on the server side in a macro that expands to a single function where we use callbacks - callback style dispatching.
  2. "pseudo alloc" handling of incoming messages, using a slab allocator to provide N ownable Vec<u8>-alikes, backed by some max frame size.
    • PROS: This allows for much more flexible handling, particularly in the face of routing, where we might want to pass on the buffer to another interface.
    • CONS: This requires many multiples of the max frame size, and in cases where most messages are smaller than the max frame size (e.g. 32 bytes of 256 bytes max), the utilization ratio is even worse, as even used frames are only partially used. Any attempt to solve this brings the "pseudo alloc" solution to "general allocator" complexity.
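The space cost of the second approach can be made concrete with a small sketch. This is a simplified, hypothetical fixed-slot pool (`FramePool`, with plain slot indices as handles instead of owning smart pointers), not how any particular slab allocator crate works. It uses the 256 byte max frame size from the example above.

```rust
pub const MAX_FRAME: usize = 256;

/// Hypothetical fixed-slot frame pool: N slots, each sized for the
/// largest possible frame. Handles are plain slot indices to keep the
/// sketch short; a real version would hand out owning handles.
pub struct FramePool<const N: usize> {
    storage: [[u8; MAX_FRAME]; N],
    len: [usize; N],
    used: [bool; N],
}

impl<const N: usize> FramePool<N> {
    pub const fn new() -> Self {
        Self {
            storage: [[0; MAX_FRAME]; N],
            len: [0; N],
            used: [false; N],
        }
    }

    /// Copy a frame into a free slot, returning its index.
    pub fn alloc(&mut self, data: &[u8]) -> Option<usize> {
        if data.len() > MAX_FRAME {
            return None;
        }
        let slot = self.used.iter().position(|u| !*u)?;
        self.storage[slot][..data.len()].copy_from_slice(data);
        self.len[slot] = data.len();
        self.used[slot] = true;
        Some(slot)
    }

    /// Borrow the frame stored in `slot`, if that slot is in use.
    pub fn get(&self, slot: usize) -> Option<&[u8]> {
        if !*self.used.get(slot)? {
            return None;
        }
        Some(&self.storage[slot][..self.len[slot]])
    }

    /// Release a slot so it can be reused.
    pub fn free(&mut self, slot: usize) {
        if let Some(u) = self.used.get_mut(slot) {
            *u = false;
        }
    }

    /// Memory reserved regardless of how much is actually in use.
    pub fn reserved_bytes(&self) -> usize {
        N * MAX_FRAME
    }

    /// Bytes of real frame data currently held.
    pub fn payload_bytes(&self) -> usize {
        (0..N).filter(|&i| self.used[i]).map(|i| self.len[i]).sum()
    }
}
```

With four slots, holding two 32 byte frames means 1024 bytes reserved for 64 bytes of payload, and even the two occupied slots are only one eighth full.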

Additionally, this doesn't consider the potential of retransmissions - if this is necessary it might require additional buffer space as well.

Interfaces again

I think it's worth thinking about the difference between "channel style" interfaces vs. "callback style" interfaces.

If we use "channel style" interfaces, it might be necessary to borrow the receiver, likely meaning we need some sort of "split" interface. It's unclear how this would handle concurrency with the network interface, which might need to concurrently modify a different part of the buffer.

I believe this is how smoltcp works, specifically with the RxToken and TxToken interfaces, which borrow internal buffers: https://docs.rs/smoltcp/latest/smoltcp/phy/index.html.
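A minimal sketch of that token shape, to show where the borrow lives. This is only loosely modeled on the "consume the token with a closure" idea; the `Device`, `RxToken`, `inject`, and `receive` names and signatures below are hypothetical and are not smoltcp's API.

```rust
/// Hypothetical device that owns a single receive buffer.
pub struct Device {
    rx_buf: [u8; 64],
    rx_len: usize,
}

/// A token that borrows the device's buffer until it is consumed.
pub struct RxToken<'a> {
    frame: &'a [u8],
}

impl<'a> RxToken<'a> {
    /// Lend the frame to `f`; the token (and its borrow) ends here.
    pub fn consume<R>(self, f: impl FnOnce(&[u8]) -> R) -> R {
        f(self.frame)
    }
}

impl Device {
    pub fn new() -> Self {
        Self { rx_buf: [0; 64], rx_len: 0 }
    }

    /// Stand-in for hardware placing a received frame in the buffer.
    /// Frames longer than the buffer are truncated in this sketch.
    pub fn inject(&mut self, data: &[u8]) {
        let n = data.len().min(self.rx_buf.len());
        self.rx_buf[..n].copy_from_slice(&data[..n]);
        self.rx_len = n;
    }

    /// Hand out a token if a frame is waiting. While the token is alive
    /// the device is mutably borrowed, so nothing else can touch it.
    pub fn receive(&mut self) -> Option<RxToken<'_>> {
        if self.rx_len == 0 {
            return None;
        }
        let len = core::mem::take(&mut self.rx_len);
        Some(RxToken { frame: &self.rx_buf[..len] })
    }
}
```

The concurrency question shows up directly: as long as an `RxToken` exists, `inject` cannot be called, so a real interface would need separate buffers or a split receive half.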

If we use "callback style" interfaces, like the pingora-proxy trait, it might be possible to express message layer handling much like how we currently handle the dispatcher, with a slightly expanded scope, as the dispatcher already uses callback style dispatching.
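A minimal sketch of the callback shape, assuming a made-up one-byte key at the front of each frame. The `Handler` trait, `Receiver`, and `SumHandler` are hypothetical names for this note and do not reflect the pingora-proxy trait or the current postcard-rpc dispatcher.

```rust
/// Hypothetical callback-style handler: it only sees the frame for the
/// duration of the call and can never own the buffer.
pub trait Handler {
    fn on_frame(&mut self, key: u8, body: &[u8]);
}

/// Receiver with a single scratch buffer sized for the largest frame.
pub struct Receiver<const N: usize> {
    scratch: [u8; N],
}

impl<const N: usize> Receiver<N> {
    pub fn new() -> Self {
        Self { scratch: [0; N] }
    }

    /// Copy the wire bytes into scratch (a stand-in for real decoding),
    /// then lend the frame to the handler. Returns false if the input
    /// is empty or does not fit in the scratch buffer.
    pub fn feed<H: Handler>(&mut self, wire: &[u8], handler: &mut H) -> bool {
        if wire.is_empty() || wire.len() > N {
            return false;
        }
        self.scratch[..wire.len()].copy_from_slice(wire);
        let frame = &self.scratch[..wire.len()];
        handler.on_frame(frame[0], &frame[1..]);
        true
    }
}

/// Example handler: anything it wants to keep must be copied out of the
/// borrowed body before returning.
pub struct SumHandler {
    pub last_key: Option<u8>,
    pub total: u32,
}

impl Handler for SumHandler {
    fn on_frame(&mut self, key: u8, body: &[u8]) {
        match key {
            // Key 1: add up the body bytes.
            1 => self.total += body.iter().map(|b| u32::from(*b)).sum::<u32>(),
            // Unknown keys are ignored in this sketch.
            _ => {}
        }
        self.last_key = Some(key);
    }
}
```

The receiver needs exactly one `N` byte buffer no matter how many handlers exist, which is the main appeal of this style.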

In the channel style, I think the "stack" would look something like this:

I actually think this shape is reasonable for both callback and socket based APIs, and even in the callback style, we likely end up getting a frame in one call, and passing it to the next. However, this might be problematic wrt lifetimes, if we need to borrow self for the lifetime of Frame<'_>, e.g.:

let frame = self.interface.recv().await?;
// is self already borrowed by `frame` here?
self.dispatch(frame).await?;

This could be handled in concrete types by splitting lifetimes, but we may need to have separate interface and dispatch types, something like:

let PostcardRpc { interface, dispatcher, .. } = self;

// frame borrows from interface (SocketReceiver)
let frame = interface.recv().await?;

// dispatch is its own thing
dispatcher.dispatch(frame).await?;

This would force my hand to have a concrete type like this:

pub struct PostcardRpc<I, D>
where
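    // `'_` isn't accepted in a where clause; assuming the bounds should
    // hold for any borrow lifetime, a higher-ranked bound is one spelling
    I: for<'a> Interface<'a>,
    D: for<'a> Dispatcher<'a>,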
{
    interface: I,
    dispatcher: D,
}

which is probably reasonable, though is less coherent than asking a user to provide a single trait implementation. However, I can't think of a way of allowing the user to provide handlers in a way that ISN'T this rough callback shape.
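Here is a small, synchronous sketch of the split-borrow version that compiles on its own. It swaps `async` for plain calls so it needs no runtime, and it puts the lifetime on the trait methods rather than on the traits, which is just one possible spelling. `Frame`, `OneShot`, `Count`, and `poll` are hypothetical names for illustration.

```rust
/// A frame that borrows its body from whoever received it.
pub struct Frame<'a> {
    pub key: u8,
    pub body: &'a [u8],
}

pub trait Interface {
    /// The returned frame borrows from the interface's own buffer.
    fn recv(&mut self) -> Option<Frame<'_>>;
}

pub trait Dispatcher {
    fn dispatch(&mut self, frame: Frame<'_>);
}

pub struct PostcardRpc<I, D> {
    interface: I,
    dispatcher: D,
}

impl<I: Interface, D: Dispatcher> PostcardRpc<I, D> {
    pub fn new(interface: I, dispatcher: D) -> Self {
        Self { interface, dispatcher }
    }

    /// Receive at most one frame and dispatch it.
    pub fn poll(&mut self) -> bool {
        // Destructure so the borrow checker sees two disjoint fields:
        // `frame` borrows `interface`, leaving `dispatcher` free.
        let PostcardRpc { interface, dispatcher } = self;
        match interface.recv() {
            Some(frame) => {
                dispatcher.dispatch(frame);
                true
            }
            None => false,
        }
    }

    pub fn dispatcher(&self) -> &D {
        &self.dispatcher
    }
}

/// Toy interface that yields its buffer once.
pub struct OneShot {
    pub buf: [u8; 4],
    pub ready: bool,
}

impl Interface for OneShot {
    fn recv(&mut self) -> Option<Frame<'_>> {
        if !core::mem::take(&mut self.ready) {
            return None;
        }
        Some(Frame { key: self.buf[0], body: &self.buf[1..] })
    }
}

/// Toy dispatcher that counts frames and body bytes.
#[derive(Default)]
pub struct Count {
    pub frames: usize,
    pub bytes: usize,
}

impl Dispatcher for Count {
    fn dispatch(&mut self, frame: Frame<'_>) {
        self.frames += 1;
        self.bytes += frame.body.len();
    }
}
```

Calling a `&mut self` method such as `self.dispatch(frame)` while `frame` still borrows `self.interface` is rejected by the borrow checker, which is why the two halves end up as separate fields here.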

Counterpoint: Just use TCP

One thing to consider as a baseline: does building a bespoke stack ACTUALLY provide any material resource benefit (CPU, RAM, Code Size) or UX benefit over "just" using TCP?

TCP doesn't usually run over "any given interface", like USB or UARTs or SPI or I2C, but projects like SatCat5 show that the approach of encapsulating standard TCP over various interfaces is likely tenable, and could be possible by augmenting smoltcp and by extension embassy-net or similar crates.

This should be considered as a reasonable and possibly preferable alternative, when making a case for a new network stack to support postcard-rpc's transport.

Also worth considering is something like 6LoWPAN, a specialization of a subset of IPv6 designed for lightweight devices, which could also potentially serve as a reasonable transport even over wired interfaces.