Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WebSocketsSansIOProtocol #2540

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Add WebSocketsSansIOProtocol #2540

wants to merge 7 commits into from

Conversation

Kludex
Copy link
Member

@Kludex Kludex commented Dec 14, 2024

No description provided.

@Kludex
Copy link
Member Author

Kludex commented Dec 14, 2024

@aaugustin Do you have availability to review this in the next days?

@Kludex
Copy link
Member Author

Kludex commented Dec 14, 2024

Seems like I need to drop Python 3.8 to make my life easier. 👀

@aaugustin
Copy link
Contributor

I'm making a note to look into it tomorrow. If I don't find time tomorrow, then I'll do over the Christmas break.

requirements.txt Outdated Show resolved Hide resolved
Comment on lines +144 to +145
if event.opcode == Opcode.CONT:
self.handle_cont(event)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommendation on how to test this? The other implementations don't have the continuation frame 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens when a message is fragmented. See https://datatracker.ietf.org/doc/html/rfc6455#section-5.4 for context. In short:

  • the first frame is TEXT or BINARY
  • every subsequent frame is CONT
  • the last frame has the FIN bit set

This is correctly implemented in handle_cont. As far as I can tell, ASGI doesn't support fragmented WebSocket messages so you have to buffer and reassemble.


For comparison, websockets provides both options — buffer and reassemble the message (if you don't care) or receive it frame by frame (if you need streaming). If you're curious, it's implemented here.

Copy link
Contributor

@aaugustin aaugustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very familiar with ASGI & uvicorn so I focused my review on the WebSocket protocol.

Your usage of websockets is straightforward.

I noticed that:

  • You aren't detecting the case when data_to_send() returns the empty bytestring to tell you to half-close the connection; that could be OK if you're confident that you're closing the transport in every scenario where the connection is terminating.
  • You aren't using the close_expected API to check whether you should close the TCP connection; that may be harmless on the server side because the server is expected to initiate the TCP closing handshake.

@@ -104,7 +106,7 @@ async def websocket_app(scope: Scope, receive: ASGIReceiveCallable, send: ASGISe
elif message["type"] == "websocket.disconnect":
break

async def open_connection(url):
async def open_connection(url: str):
async with websockets.client.connect(url) as websocket:
return websocket.open
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return websocket.state is websockets.protocol.State.OPEN

Cf. https://websockets.readthedocs.io/en/stable/howto/upgrade.html#open-and-closed

Comment on lines +144 to +145
if event.opcode == Opcode.CONT:
self.handle_cont(event)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens when a message is fragmented. See https://datatracker.ietf.org/doc/html/rfc6455#section-5.4 for context. In short:

  • the first frame is TEXT or BINARY
  • every subsequent frame is CONT
  • the last frame has the FIN bit set

This is correctly implemented in handle_cont. As far as I can tell, ASGI doesn't support fragmented WebSocket messages so you have to buffer and reassemble.


For comparison, websockets provides both options — buffer and reassemble the message (if you don't care) or receive it frame by frame (if you need streaming). If you're curious, it's implemented here.

from typing import Any, Literal, cast
from urllib.parse import unquote

from websockets import InvalidState
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from websockets.exceptions import InvalidState

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rationale: I recommend that you avoid shurtcut imports going forwards.


extensions = []
if self.config.ws_per_message_deflate:
extensions = [ServerPerMessageDeflateFactory()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default settings lead to high-memory usage with marginal performance benefits e.g. 316KiB RAM per connection instead of 64KiB for an additional 10% compression in a basic benchmark.

FYI websockets defaults to:

            ServerPerMessageDeflateFactory(
                server_max_window_bits=12,
                client_max_window_bits=12,
                compress_settings={"memLevel": 5},
            )

See here for much more than you ever wanted to know.

extensions=extensions,
max_size=self.config.ws_max_size,
logger=logging.getLogger("uvicorn.error"),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you checking the Origin header elsewhere in uvicorn? If not, supporting it in the config would be good, as it helps defend against Cross-Site WebSocket Hijacking attacks. (I have forgotten about the details of this attack.)


def connection_lost(self, exc: Exception | None) -> None:
code = 1005 if self.handshake_complete else 1006
self.queue.put_nowait({"type": "websocket.disconnect", "code": code})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a normal closing handshake scenario where you generate a "websocket.disconnect" event when you receive a close frame, it looks like this will generete a second "websocket.disconnect" event when the TCP connection terminates, which sounds incorrect. Ideally, you'd generate that event only if you haven't generated one already.

self.transport.close()

def eof_received(self) -> None:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect self.conn.receive_eof() here.

data_type = self.curr_msg_data_type
msg: WebSocketReceiveEvent
if data_type == "text":
msg = {"type": "websocket.receive", data_type: self.bytes.decode()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to handle the case when decode() raises an exception. In that case, websockets closes the connection with code 1007 (invalid data).

Invalid UTF-8 is relatively uncommon so you could choose to ignore it. I forgot this case in the new asyncio implementation and it took some time to detect the bug.


def handle_ping(self, event: Frame) -> None:
output = self.conn.data_to_send()
self.transport.write(b"".join(output))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest that you always do this in handle_events() rather than only in handle_xxx() methods where you believe that websockets sent an automatic response. I'm not sure how much of a different it will make in practice but it feels cleaner.

@Kludex Kludex mentioned this pull request Dec 26, 2024
11 tasks
@ilyakamens
Copy link

Hi! I'm just leaving a comment to say that I'm running into the issue this fixes, and I'm greatly anticipating this being merged. Let me know if there's anything I can do to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants