Streaming Support in WebSocket#

Tags: websockets, streaming

In my clone of the websocket-sharp project I wanted to add support for infinite streams as allowed by the websocket RFC (RFC 6455). The original project would eagerly consume any received message in order to provide the payload either as a string or as a byte array.

This works well with small payloads and has the advantage that the received message can be multicast to many event subscribers without worrying about one subscriber corrupting the data seen by another. The downside, of course, is that it doesn't work well with large payloads, because memory has to be allocated to hold the entire payload.

Websockets transmit messages in one or more frames. Despite the terms message and frame, the websocket protocol is essentially a streaming protocol. Each frame can transfer up to 9,223,372,036,854,775,807 bytes (because the protocol allows for a 63-bit length indicator). Each message can be made up of an unbounded number of frames. This makes an event-based websockets API, in which each event signals a complete new message, a somewhat awkward fit. If there are multiple subscribers to the event, then it becomes necessary to buffer the message to avoid subscribers corrupting each other's event data (as described above).
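To make the 63-bit limit concrete, here is a minimal sketch of how the payload length can be decoded from a frame header as laid out in RFC 6455. It is written in Python for brevity and is not taken from the WebSocket# code; the function name `decode_payload_length` is made up for this illustration.

```python
import struct
from typing import Tuple


def decode_payload_length(header: bytes) -> Tuple[int, int]:
    """Return (payload_length, header_bytes_used) for a frame header.

    `header` must start at the first byte of the frame and contain at
    least the bytes that make up the length field.
    """
    short_len = header[1] & 0x7F  # low 7 bits of the second byte
    if short_len < 126:
        return short_len, 2       # the length fits in the 7 bits themselves
    if short_len == 126:
        (length,) = struct.unpack_from("!H", header, 2)  # 16-bit unsigned
        return length, 4
    (length,) = struct.unpack_from("!Q", header, 2)      # 64-bit unsigned
    if length >> 63:
        # The RFC requires the most significant bit to be 0,
        # which leaves 63 usable bits.
        raise ValueError("most significant bit of a 64-bit length must be 0")
    return length, 10
```

The largest value that passes the final check is 2^63 - 1, which is the figure quoted above.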

If one accepts that messages can potentially be of infinite length (an unbounded number of finite-length frames), then one should consider whether a websocket should allow more than one subscriber, because in reality the event should pass a websocket message carrying a stream of data instead of buffered data.

From the RFC:

   The primary purpose of fragmentation is to allow sending a message
   that is of unknown size when the message is started without having to
   buffer that message.  If messages couldn't be fragmented, then an
   endpoint would have to buffer the entire message so its length could
   be counted before first byte is sent.  With fragmentation, a server
   or intermediary may choose a reasonable size buffer, and when the
   buffer is full write a fragment to the network.
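
The sending side described in that passage could look roughly like the sketch below. It is an illustration in Python, not code from WebSocket#; the `fragment` function and its tuple layout are assumptions made for the example, while the opcode values (0x2 for binary, 0x0 for continuation) come from the RFC.

```python
from typing import Iterable, Iterator, Tuple

OP_CONTINUATION = 0x0
OP_BINARY = 0x2


def fragment(chunks: Iterable[bytes],
             buffer_size: int) -> Iterator[Tuple[bool, int, bytes]]:
    """Yield (fin, opcode, payload) fragments for a message of unknown length."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    buffer = bytearray()
    opcode = OP_BINARY  # only the first fragment carries the real opcode
    for chunk in chunks:
        buffer.extend(chunk)
        # Whenever the buffer is full, write a non-final fragment.
        while len(buffer) >= buffer_size:
            yield False, opcode, bytes(buffer[:buffer_size])
            del buffer[:buffer_size]
            opcode = OP_CONTINUATION
    # The source is exhausted: flush what is left as the final fragment.
    # This fragment may be empty, which the protocol permits.
    yield True, opcode, bytes(buffer)
```

At no point does the sender hold more than one buffer's worth of data plus the chunk it has just received, and it never needs to know the total message length.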

In my current changes to the WebSocket# code I created a WebSocketDataStream class which exposes the payload from a continuous sequence of frames as a single .NET Stream. The benefit is that the class respects the potentially infinite length of a websocket message while still keeping a low memory footprint. The project currently uses a 10 kB buffer, but this will probably be made configurable.

The WebSocketDataStream class is expected to be instantiated when a stream reader, in this case a WebSocketStreamReader, identifies that there is a message which has a payload. The length of the initial payload is passed to the stream's constructor along with a delegate to load the payload from consecutive frames. When the stream is read it will consume the payload of a frame as long as there is data and then move on to the next frame's payload, until a final frame is signalled.

In this way it becomes transparent to the caller that data is stitched together from multiple frames. The memory footprint is also kept low because there is no need to buffer the payload data.
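
The idea can be sketched as follows. This is a Python analogue built on `io.RawIOBase`, not the actual WebSocketDataStream class; the name `FrameStitchingStream` and the two callables `read_payload` and `next_header` are hypothetical stand-ins for the delegate mentioned above.

```python
import io


class FrameStitchingStream(io.RawIOBase):
    """Expose the payloads of consecutive frames as one readable stream."""

    def __init__(self, first_length, first_is_final, read_payload, next_header):
        super().__init__()
        self._remaining = first_length     # unread bytes in the current frame
        self._final = first_is_final       # True if the current frame ends the message
        self._read_payload = read_payload  # read_payload(n) -> up to n bytes off the wire
        self._next_header = next_header    # next_header() -> (length, is_final)

    def readable(self):
        return True

    def readinto(self, b):
        if len(b) == 0:
            return 0
        # Move on to the next frame whenever the current one is used up.
        while self._remaining == 0:
            if self._final:
                return 0                   # end of message
            self._remaining, self._final = self._next_header()
        data = self._read_payload(min(len(b), self._remaining))
        if not data:
            raise EOFError("connection closed in the middle of a frame")
        b[:len(data)] = data
        self._remaining -= len(data)
        return len(data)


# Usage: three frames carrying 6, 3 and 2 payload bytes read back as one stream.
wire = io.BytesIO(b"hello world")          # stands in for the socket
headers = iter([(3, False), (2, True)])    # headers of the 2nd and 3rd frame
stream = FrameStitchingStream(6, False, wire.read, lambda: next(headers))
message = stream.read()                    # b"hello world"
```

Only the caller's own read buffer is ever filled, so memory use does not grow with the size of the message.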

The downside is that if there are multiple subscribers to the socket, then they will all read from the same stream, thus corrupting each other's data. Another downside is that an event subscriber could potentially receive a message but never consume the payload data, in which case it has effectively blocked further reading from the websocket stream.
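
The shared-stream problem is easy to reproduce with any forward-only stream. The snippet below is a toy illustration in Python, with `io.BytesIO` standing in for the incoming message stream and a made-up `subscriber` function standing in for an event handler.

```python
import io

payload = io.BytesIO(b"0123456789")  # stands in for the single incoming message stream


def subscriber(stream, size):
    """A handler that assumes it owns the stream and reads its first `size` bytes."""
    return stream.read(size)


first = subscriber(payload, 4)   # b"0123"
second = subscriber(payload, 4)  # b"4567", not the start of the message
```

Each handler advances the same read position, so the second one never sees the bytes the first one consumed.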

However, if one accepts that there can only be one subscriber for each websocket's events, then there will be no shared stream reading, and the subscriber would only be preventing itself from receiving further messages by not reading forward in the incoming stream. So the API will probably change to restrict each websocket to a single subscriber. I'd appreciate comments, opinions or suggestions about this through the project.
