Fix 4006 reconnect#536
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes voice reconnect behavior when Discord rejects Resume with close code 4006 by falling back to Identify, and ensures the audio pipeline restarts correctly when a new UDP socket/session is established after reconnect.
Changes:
- Voice gateway reconnect: treat 4006 as retryable and inject
open()viaopenFuncfor testability. - UDP transport: close any prior UDP
net.Connbefore dialing a new one to prevent socket leaks and unblock stale readers. - Audio pipeline: restart sender/receiver on
SessionDescriptionwhen previously configured; adds focused unit tests for both gateway and conn behaviors.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
voice/gateway.go |
Allows reconnect retry on 4006 and introduces injectable openFunc. |
voice/gateway_test.go |
Adds tests verifying 4006 retry behavior and non-reconnectable close codes. |
voice/udp_conn.go |
Closes previous UDP connection before re-dialing to avoid leaks/stale reads. |
voice/conn.go |
Restarts audio sender/receiver on SessionDescription when configured. |
voice/conn_test.go |
Adds tests validating audio pipeline restart behavior on SessionDescription. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| c.audioReceiver.Open() | ||
| } | ||
|
|
||
| c.openedChan <- struct{}{} |
There was a problem hiding this comment.
handleMessage always sends to openedChan on every SessionDescription. Since openedChan is only read once by Conn.Open, subsequent reconnects (Identify path) will leave the channel full and the next c.openedChan <- struct{}{} will block the gateway message handler, potentially deadlocking voice reconnects. Consider making this send non-blocking (e.g. select { case c.openedChan <- struct{}{}: default: }) or otherwise draining/resetting the channel on reconnect.
| c.openedChan <- struct{}{} | |
| select { | |
| case c.openedChan <- struct{}{}: | |
| default: | |
| } |
| // Close any existing connection so that readers (e.g. audio receiver) | ||
| // get an error and can be restarted on the new connection. | ||
| if u.conn != nil { | ||
| _ = u.conn.Close() | ||
| } |
There was a problem hiding this comment.
After closing an existing u.conn during Open, consider also setting u.conn = nil and making UDPConn.Close()/deadline/addr methods nil-safe. If DialContext (or the IP discovery write/read) fails, callers may still invoke Close() (e.g. during teardown), and udpConnImpl.Close() currently dereferences u.conn unconditionally, which can panic when u.conn is nil.
|
this PR isn't ready as pointed out by Copilot. Needs more testing around shared state in different goroutines |
|
possible better fix: #537 |
When a voice gateway drops and the reconnect's Resume is rejected with close code 4006 ("Session is no longer valid"), doReconnect treated it as non-reconnectable and gave up. This tore down the entire voice connection and the bot stopped receiving audio.
Gateway fix (voice/gateway.go): Exempt 4006 from the non-reconnectable bail-out in doReconnect. Since open() already clears resume state (ssrc/seq) on failure, the retry naturally sends Identify instead of Resume. Also made open() injectable (openFunc) for testability.
Audio pipeline fix (voice/conn.go, voice/udp_conn.go): The Identify path produces a new Ready → new UDP socket, but the audio sender/receiver goroutines were still reading from the old one. Two changes: