So I’m looking to create a multi-channel bot, and I have some basic questions:
In developing IRC bots before, I’ve found that most networks throttle outbound messages. So for example, if you wanted to make a bot that greeted users as they came in the channel, you would be fine as long as X users per second was not exceeded, where X is the maximum messages allowed per second.
Inbound traffic seems like it could be a bit of a nightmare for bots like Moobot. Am I overthinking that? This is a bot that is in thousands of channels, with some channels having tens of thousands of users. Moobot has to parse each and every one of those messages for valid input, right? That seems overwhelming from a networking perspective, but perhaps I am not giving enough credit to modern networking.
Are there any recommended frameworks to start with, or do most people tend to start programming a bot from socket networking up (utilizing socket libraries for your language of choice, of course)?
I’m sure more questions will come to me, but thank you for taking the time to read and address these.
In regards to my first question, I was able to dig this up from the Twitch API:
If you send more than 20 commands or messages to the server within a 30 second period, you will get locked out for 8 hours automatically. These are not lifted so please be careful when working with IRC!
How on earth does Moobot/NightBot get around such a limitation?
Okay, so this is really only accurate if your bot is running in a single channel. Per that same thread, a Twitch staffer responds by stating that if a bot is modded in all channels it is in, it can send 100 messages per 30 seconds across all channels, so if that bot is in 100 channels, that means 1 message per channel per 30 seconds.
From the looks of it, multi-channel bots solve this with multiple connections. That is a bit mind-boggling, but perhaps there is a reason behind it that I’m missing.
I’m pretty sure that some of the larger bots don’t necessarily use the same connection for reading and writing to a single room.
Since you can write to any room from any connection, a load-balancing system could spread out all messages on all connections, which means if one room exceeds the message limit, other connections will start emitting those messages.
Sending to rooms that the connection is not joined in would mean that you have to store mod status somewhere for each channel in order to use the 100-cap and not hit the 20-cap for any one connection.
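To make the mod-status bookkeeping concrete, here is a rough Python sketch. The 100 and 20 figures are the ones quoted earlier in this thread; `ModStatusStore` and `limit_for` are names I made up for illustration, and the rule "a connection only gets the higher cap if every channel it writes to has the bot modded" is my reading of the staff reply, not something I have verified against Twitch.

```python
MOD_LIMIT = 100      # messages per 30 s when modded everywhere (figure from this thread)
DEFAULT_LIMIT = 20   # messages per 30 s otherwise (figure from this thread)


class ModStatusStore:
    """Hypothetical in-memory record of where the bot is a moderator."""

    def __init__(self):
        self._modded = {}

    def set_mod(self, channel, is_mod):
        # Channel names are case-insensitive, so normalise before storing.
        self._modded[channel.lower()] = is_mod

    def is_mod(self, channel):
        # Unknown channels are treated as "not modded" to stay on the safe side.
        return self._modded.get(channel.lower(), False)


def limit_for(channels, store):
    """Return the cap to assume for a connection that writes to `channels`.

    Assumption: the higher cap applies only if the bot is modded in every
    channel the connection sends to; an empty list gets the lower cap.
    """
    if channels and all(store.is_mod(channel) for channel in channels):
        return MOD_LIMIT
    return DEFAULT_LIMIT


store = ModStatusStore()
store.set_mod("#a", True)
store.set_mod("#b", False)
print(limit_for(["#a"], store), limit_for(["#a", "#b"], store))
```

In a real bot the store would be updated from whatever mod-status signal you trust; a plain dict is just the smallest thing that shows the idea.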
If I made a really large bot, I think I would build it as a cluster of connection nodes and a cluster of message-parsing nodes, then connect them with something like an MQ and/or a load-balancing system (either spawning actual new programs that communicate through an external MQ, or at least running multiple threads/coroutines, maybe both).
That way, we could spread messages and join/parts to all of the connections.
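As a toy version of that split, here is a single-process Python sketch where `queue.Queue` stands in for the MQ and threads stand in for the parsing nodes. Everything here is an assumption for illustration: the `!ping` command, the function names, and the simplified line format (it ignores IRCv3 tags, so lines starting with `@` are skipped).

```python
import queue
import threading


def parse_privmsg(line):
    """Parse ':nick!user@host PRIVMSG #channel :text' into a tuple, else None."""
    if not line.startswith(":"):
        return None
    prefix, _, rest = line[1:].partition(" ")
    command, _, params = rest.partition(" ")
    if command != "PRIVMSG":
        return None
    channel, _, text = params.partition(" :")
    nick = prefix.split("!", 1)[0]
    return nick, channel, text


def parser_worker(inbound, outbound):
    """Consume raw lines until a None sentinel arrives; queue any replies."""
    while True:
        line = inbound.get()
        if line is None:
            break
        parsed = parse_privmsg(line)
        if parsed and parsed[2].startswith("!ping"):  # hypothetical command
            outbound.put((parsed[1], "pong"))


def run_pipeline(lines, workers=2):
    """Feed raw lines to `workers` parser threads and collect their replies."""
    inbound, outbound = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=parser_worker, args=(inbound, outbound))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for line in lines:
        inbound.put(line)
    for _ in threads:
        inbound.put(None)  # one stop sentinel per worker
    for thread in threads:
        thread.join()
    replies = []
    while not outbound.empty():
        replies.append(outbound.get())
    return replies
```

The connection nodes would be whatever reads from the sockets and calls `inbound.put(line)`; the outbound queue is what a sending side would drain, subject to the rate limits.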
If I have loads of connections, I could spread out messages sent to the larger channels throughout all the connections, using round-robin or something more advanced. That way we could even send much more than the 100-cap to a single heavily trafficked channel.
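The plain round-robin variant could look something like this in Python. `FakeConnection` is a stand-in I invented so the example runs without a network; a real connection object would write to a socket instead of a list.

```python
from itertools import cycle


class FakeConnection:
    """Stand-in for a real IRC connection; records what it was asked to send."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def send(self, channel, text):
        self.sent.append((channel, text))


class RoundRobinSender:
    """Hand each outgoing message to the next connection in turn."""

    def __init__(self, connections):
        self._cycle = cycle(connections)

    def send(self, channel, text):
        connection = next(self._cycle)
        connection.send(channel, text)
        return connection


connections = [FakeConnection(f"conn-{i}") for i in range(3)]
sender = RoundRobinSender(connections)
for n in range(6):
    sender.send("#bigchannel", f"message {n}")
```

After the loop each of the three fake connections holds two of the six messages. Round-robin ignores how loaded each connection already is, which is where "something more advanced" would come in.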
I don’t really have the need for making another large-scale bot though. I’ll never reach feature parity with any of the big ones, so my projects will mostly be niche and/or self-hosted.
Great answer! I think I can avoid a lot of the hoop jumping by making a bot that only needs to send messages, which is all I really need to do at this point anyhow. I’m currently looking at pydle or node irc (perhaps one of the twitch-specific projects). If I ever get to a point where I need to create a load-balancing system, I’ll consider my endeavor a huge win.
My guess is you would only need a bunch of sockets, not threads or program instances. Keep track of the load on each socket and distribute it accordingly.
For my current single outgoing socket implementation I keep a list of timestamps, one for each sent message. When a timestamp is older than 30 seconds I can discard it. If the list holds fewer than 100 timestamps (or whatever the limit is), I know I can send another message. I intend to scale this by adding spare sockets/connections that will be used when the primary is at its limit.
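Roughly, the timestamp-list approach looks like this in Python. This is a sketch of the idea rather than my exact code: the class and function names are made up, the clock is injectable so it can be tested without sleeping, and `pick_connection` is one simple way the "fall back to a spare" step might work.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` sends within any `window` seconds."""

    def __init__(self, limit=100, window=30.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recent sends, oldest first

    def _discard_expired(self, now):
        # Drop timestamps that are older than the window.
        while self.sent and now - self.sent[0] > self.window:
            self.sent.popleft()

    def try_acquire(self):
        """Record a send and return True if under the limit, else False."""
        now = self.clock()
        self._discard_expired(now)
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False


def pick_connection(limiters):
    """Return the index of the first connection with capacity, or None.

    Index 0 is the primary; later entries are spares used only when the
    earlier ones are at their limit.
    """
    for index, limiter in enumerate(limiters):
        if limiter.try_acquire():
            return index
    return None
```

If `pick_connection` returns `None`, every connection is at its limit and the message has to wait or be dropped; which of those is right depends on the bot.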