We’ve spent a considerable amount of time over the last half year preparing to migrate our chat system to AWS EC2 servers. We’re finally ready to start the transition (and identify any misconfigurations / new scaling concerns in the new environment).
As this will affect third-party developers significantly, I want to
a) make you aware
b) ask for suggestions to make this transition less painful for you
Our current rollout plans are as follows:
Migrate channels one by one on a whitelist basis, starting with partners / staff who opt in to the risks of using the new cluster
As our confidence increases that things are working correctly, slowly add more Twitch partners to the whitelist
Once all Twitch partners are on the whitelist, transition all channels to use the new servers
I recognize that for many of you, querying the servers list API on initialization of your bot may be a burden (particularly for very large bots). We could consider returning an invalid cluster message in response to JOIN if you’ve joined a channel on the wrong cluster.
Another thing you may have noticed is that I did not mention anything about event chat. Our hope is that moving to EC2 will allow us to maintain a single chat cluster for all channels. This is a hypothesis we’re planning to test before we migrate the very largest channels to the new servers, so hopefully that works out.
There are a handful (~10) of staff accounts which are on the new cluster this weekend. We’re hoping to compile a list of issues, resolve them, and start reaching out to partners early next week to see if any are willing to opt-in to test the new servers.
I suspect this week it will happen fairly slowly, probably fewer than 50 channels in total and probably not all at once. The following week we’ll probably move significantly faster, but this all depends on what issues we find and how difficult they are to resolve.
That currently won’t happen, but one of my goals for this thread is to identify any changes we should make to help bots deal with this transition more easily.
FWIW, assuming our script works, we shouldn’t be transitioning a channel’s chat cluster while they are actively streaming, only while they are offline.
The people opting in will then simply not be able to use the bot. Having things on separate clusters just creates a ton of issues, especially when you have to query it on a per-channel basis, for hundreds of thousands of channels. And that’s not even taking into account the amount of code that must be in place to manage two different clusters. (It’s a ton.)
More importantly, what’s the timeframe here? How long will this be in place? And are you saying the partners will not be able to opt-out?
The goal will be to get to a single cluster as quickly as possible. We don’t want to distribute the load to the new servers too quickly, or we risk the stability of chat. I’d guess it’d take 2 weeks to migrate all channels.
At some point every channel is moving to these new servers (you can’t opt-out), though at that point we hope there is only a single cluster.
You should refer to the link that @brildum posted to check which server you should connect to for your target channel(s), since I imagine connecting to the new IP just results in no chat for the channels you join.
shrug my bots do the call at boot and pick one of the available servers… So makes no real difference to me (that said mine are generally single channel kids)
Maybe a compromise? An IRC command that tells you the optimal IP to connect to?
Connect to a regular server
USER > SERVER: INQUIRE #channel (or w/e you want the command to be)
SERVER > USER: IP|DNS
APPLICATION: (Checks if IP/DNS is correct, if not switches connection.)
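To make the proposed flow concrete, here is a rough client-side sketch of the last step. Nothing here exists on the servers today: `INQUIRE` is only the suggestion above, and the reply format (one address, or several separated by `|`) and the function names are assumptions made for illustration.

```python
def parse_inquire_reply(reply: str) -> list[str]:
    """Split a hypothetical INQUIRE reply into candidate hosts.

    Assumes the reply is a single address or several joined by '|',
    e.g. "192.0.2.10|chat.example.net". This format is not confirmed.
    """
    return [part.strip().lower() for part in reply.split("|") if part.strip()]


def needs_reconnect(current_host: str, reply: str) -> bool:
    """Return True if the current connection is not one the server suggested."""
    hosts = parse_inquire_reply(reply)
    # An empty or unparseable reply gives no reason to switch.
    if not hosts:
        return False
    return current_host.strip().lower() not in hosts
```

A bot would call `needs_reconnect` with the host it is currently connected to and, if it returns True, open a connection to one of the suggested hosts before joining the channel.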
What is this, JTVIRC all over again? remembers the horror that was JTVIRC server REDIRECTs
Joking aside, as long as this is just temporary I would be fine with logic like this. Just include in the SERVERCHANGE command the cluster it’s swapping to (and maybe change the command to CLUSTERCHANGE, because I sure do hope we’re not going back to channels being on individual servers [a JTVIRC horror story in the making]). Ideally I could have all clusters open such that I could swap at whim between them without needing to fetch server lists per channel when given a cluster change command.
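For what it’s worth, the bookkeeping I have in mind is roughly the sketch below. It assumes a made-up message shape of `CLUSTERCHANGE #channel cluster-name`; no such command exists yet, and the class and cluster names are hypothetical.

```python
from typing import Optional


class ClusterRouter:
    """Tracks which cluster each channel lives on (illustrative only)."""

    def __init__(self, default_cluster: str) -> None:
        self.default_cluster = default_cluster
        self.channel_cluster: dict[str, str] = {}

    def cluster_for(self, channel: str) -> str:
        # Channels we have heard nothing about stay on the default cluster.
        return self.channel_cluster.get(channel, self.default_cluster)

    def handle_line(self, line: str) -> Optional[tuple[str, str, str]]:
        """Return (channel, old_cluster, new_cluster) if a move is needed."""
        parts = line.split()
        # Assumed format: "CLUSTERCHANGE #channel cluster-name".
        if len(parts) != 3 or parts[0] != "CLUSTERCHANGE":
            return None
        _, channel, new_cluster = parts
        old_cluster = self.cluster_for(channel)
        if old_cluster == new_cluster:
            return None
        self.channel_cluster[channel] = new_cluster
        return (channel, old_cluster, new_cluster)
```

With a connection already open to every cluster, the caller only has to PART on the old cluster’s connection and JOIN on the new one whenever `handle_line` returns a tuple, with no server list fetch per channel.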
Regarding the mass move once testing is complete, please do it in small batches so that JOIN/PART IP rate limits are not a problem. Thanks.
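On the bot side, the same concern can be softened by pacing rejoins. A minimal sketch, assuming the bot already has a `send` function for raw IRC lines; the batch size and pause are placeholders, not the actual rate limits.

```python
import time
from typing import Callable, Iterable


def join_in_batches(
    channels: Iterable[str],
    send: Callable[[str], None],
    batch_size: int = 20,          # placeholder, not a documented limit
    pause_seconds: float = 10.0,   # placeholder, not a documented limit
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send JOINs in fixed-size batches, pausing between batches."""
    pending = list(channels)
    sent = 0
    for start in range(0, len(pending), batch_size):
        # Pause before every batch except the first.
        if start > 0:
            sleep(pause_seconds)
        for channel in pending[start:start + batch_size]:
            send(f"JOIN {channel}")
            sent += 1
    return sent
```

The `sleep` parameter is injectable only so the pacing can be tested without actually waiting.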