Server Crash Mitigation
Posted: 2019-01-02 23:15
I've noticed that recently (on HOG at least), server crashes occur almost exclusively on map changes. At the same time, the server seems to recover almost immediately and load the starting map in the rotation.
Fundamentally, there are two problems with a server crash on map load. First, clients are disconnected and have to reconnect manually. Second, a different map than intended is loaded (the rotation's starting map). Both cause players to quit out of frustration, prematurely killing the server for the day.
I know the root cause of server crashes is unknown, but would it be possible to have the client and server work together to mitigate them? For example, here is how I assume the current system works:
Normal Transition
- Win condition reached; server sends clients round-complete signal.
- Server sends clients next-map info.
- Clients begin loading next-map.
- Server begins loading next-map.
- Clients finish loading next-map and reconnect to server.
Crash Transition
- Win condition reached; server sends clients round-complete signal.
- Server sends clients next-map info.
- Clients begin loading next-map.
- Server crashes, restarts, begins loading starting-map.
- Clients finish loading next-map and try to reconnect to server.
- Server is in a state unexpected by the client (different map), or maybe there's a queued disconnect message, so the client disconnects from the server.
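To make the difference between the normal and crash cases concrete, here is a tiny Python model of the behaviour assumed above. It is only a sketch of the guess in this post, not the game's actual logic: the function names and the map names (`map_a`, `map_b`) are hypothetical, and the "queued disconnect message" possibility is not modelled.

```python
def map_after_change(next_map: str, starting_map: str, crashed: bool) -> str:
    """Map the server ends up on after a map change.

    Assumption from the post: a crash makes the server restart on the
    rotation's starting map instead of the intended next map.
    """
    return starting_map if crashed else next_map


def reconnect(client_map: str, server_map: str) -> str:
    """Assumed client behaviour: join only if the server is on the map the client loaded."""
    return "joined" if client_map == server_map else "disconnected"


if __name__ == "__main__":
    # Hypothetical rotation: starts at map_a, the round just played leads to map_b.
    for crashed in (False, True):
        server_map = map_after_change("map_b", "map_a", crashed)
        print(f"crashed={crashed}: client {reconnect('map_b', server_map)}")
```

In this model the only difference between the two outcomes is which map the server comes back on, which is the part the proposal below tries to control.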
Proposed Transition
- Win condition reached; server sends clients round-complete signal.
- Server sends clients new soft-disconnect signal.
- Server sets starting-map to next-map, and intentionally restarts.
- Clients stay at round-complete screen and silently query for server-available.
- On server-available, clients connect to the newly restarted server as fresh clients. On timeout, the client does a full disconnect.
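The steps above could be sketched roughly as follows, in Python for illustration only. Everything here is an assumption rather than the game's real code: `rotation_starting_at` stands in for "server sets starting-map to next-map", `wait_for_server` stands in for the client's silent polling, and `probe` is a hypothetical callback that returns True once the server answers. The timeout and polling interval are made-up defaults.

```python
import time
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    RECONNECTED = "reconnected"          # server came back; join as a fresh client
    FULL_DISCONNECT = "full_disconnect"  # gave up waiting; drop to the menu


def rotation_starting_at(rotation: List[str], next_map: str) -> List[str]:
    """Server side: reorder the rotation so the restart loads next_map first."""
    i = rotation.index(next_map)  # raises ValueError if next_map is not in the rotation
    return rotation[i:] + rotation[:i]


def wait_for_server(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Outcome:
    """Client side: stay on the round-complete screen and poll until the
    server is available or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return Outcome.RECONNECTED
        if clock() >= deadline:
            return Outcome.FULL_DISCONNECT
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical rotation; the round just played leads to "map_b".
    print(rotation_starting_at(["map_a", "map_b", "map_c"], "map_b"))

    # Simulate a server that answers on the third query, without real waiting.
    answers = iter([False, False, True])
    print(wait_for_server(lambda: next(answers), timeout_s=10, sleep=lambda _: None))
```

The clock and sleep functions are passed in so the polling loop can be exercised without waiting; a real client would issue its normal server query inside `probe`.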