Open-sourcing WormKit modules

kubasienki · April 26, 2025, 07:23 PM

Hi everyone,
Really nice community you have here. I was honestly shocked (in a good way!) to see that Worms Armageddon is still alive and kicking, even more so than some of the newer titles in the series.

I'm a Reinforcement Learning practitioner and I work on plugging different games into neural networks. This time, I wanted to try it with WA. I came across some great modules by nizikawa, and now I'm writing my own module to connect an AI agent to the game. It exports the game state via sockets, pauses the game while waiting for the neural network's response, and can run the game at the highest speed that stays stable.
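A minimal sketch of what the agent side of such a lockstep loop could look like. The wire format (one JSON object per line), the field names (`frame`, `worms`, `target_x`) and the action strings are all assumptions made for illustration, not the module's actual protocol. The "game" here is a local socket pair standing in for the real module.

```python
import json
import socket
import threading


def send_msg(sock, obj):
    """Send one JSON object terminated by a newline (assumed wire format)."""
    sock.sendall(json.dumps(obj).encode("utf-8") + b"\n")


def recv_msg(reader):
    """Read one newline-delimited JSON object; return None on EOF."""
    line = reader.readline()
    return json.loads(line) if line else None


def run_agent(sock, policy):
    """Answer every state message with exactly one action until the peer closes.

    The game side is assumed to stay paused until it has read the reply.
    """
    steps = 0
    with sock.makefile("rb") as reader:
        while True:
            state = recv_msg(reader)
            if state is None:
                break
            send_msg(sock, {"frame": state["frame"], "action": policy(state)})
            steps += 1
    return steps


def demo():
    """Drive run_agent with a fake game over a local socket pair."""
    game, agent = socket.socketpair()
    # Hypothetical policy: walk towards a target x coordinate.
    policy = lambda s: "left" if s["worms"][0]["x"] > s["target_x"] else "right"
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(steps=run_agent(agent, policy))
    )
    worker.start()
    replies = []
    with game.makefile("rb") as reader:
        for frame, x in enumerate([10, 50]):
            send_msg(game, {"frame": frame, "worms": [{"x": x, "y": 0}], "target_x": 30})
            replies.append(recv_msg(reader))  # the fake game blocks here, i.e. is "paused"
    game.close()  # EOF tells the agent loop to stop
    worker.join()
    agent.close()
    return replies, result["steps"]
```

Because each state is answered before the next one is sent, the slow side (the network) sets the pace, which is what makes it safe to speed the game up.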

Originally, I planned to release it on GitHub and develop it as open source. But after reading some posts here, I noticed that some WormKit modules are kept closed-source to prevent cheating. Since my project gives programmatic access to the game state and controls, it could theoretically be used for cheating, though it would require some actual coding effort to do so. (I'm exporting a list of worms' positions, so basic aim assist is a high school math level problem away.)

I would prefer to release it as open source, especially to get help with development.
What do you think I should do?

TheKomodo · April 26, 2025, 11:56 PM

Speak to Deadcode and CyberShadow. Those are the 2 active developers.

They may be able to give you some good guidance here.

Impossible · April 29, 2025, 10:46 PM

hi there, kubasienki. I'm an AI developer and a fan of reinforcement learning projects too.

what part of the game are you thinking about?

the game is very diverse. there are classic schemes like Normal/Intermediate/Elite, which are purely skill-based. there are a few probabilistic ones with different ratios of RNG elements.

T17 scheme depends on how good the weapons in the crates are. there are a few successful examples of AI in RNG games, like DeepStack beating people in poker, but generally, RL seems to perform worse in RNG games. perhaps it makes sense to avoid such schemes, at least for the first attempts?

and the problem with schemes like Normal is that they are too complicated. too many weapons are available to the agent. learning to use them all at a high level is probably not feasible, at least i don't think so. mastering a single weapon, like the rope, would require a very advanced agent by itself. I don't know your experience, perhaps you're training very big and complicated AIs and can pull that off, but my default assumption is that it makes sense to avoid them for the first attempts

you mentioned aim assist, so I assume you wanted to use AI on a scheme like BnG?
BnG is probably the ideal scheme for training an agent in terms of how simple it is. the only problem is that it would be kinda boring. the agent would just throw perfect grenades that land exactly on target, perhaps even plop opponents right away, if the agent is sophisticated enough. it's also a good fit for training because it's the only scheme where the in-game bots are actually decent. once an agent can beat a few 5-UP level bots (like in a 1vs3 worms situation), it'll be close to a superhuman level

then there are schemes that depend on one niche skill, like roping. an agent that can rope would be amazing to watch, but it's hard to train. for example, in TTRR, there's no way of telling how close you are to the finish, so there's no real way to give clues to the agent. the other schemes that require high-level roping, like wxw, usually depend on rules, like touching the walls or collecting crates before attacking.

perhaps the best scheme is something like hysteria, which is simple enough for training, yet it would be interesting to watch the agent, because there's also room for strategizing, like killing your own worms to get turn advantage or darksiding. perhaps AI would discover interesting strategies? 1 sec turn time would also help with training, and this is the scheme in which the agent would look the most spectacular (people can't do in 1 second what an agent can - like walk, use multiple utilities at once, fly, throw weapons, knock, and so on)

what are your thoughts? maybe you wanted an agent for the game missions and i'm throwing all this stuff at you xD

btw, there are quite a few tool-assisted replays of the game, in different schemes. they are useful if you want to see what absolutely optimal actions look like and what is actually possible in the game. might be useful to know in case the agent gets stuck in a suboptimal state

nizikawa · April 30, 2025, 07:41 AM

1. scrape replays from tus league games. they are already neatly categorized by scheme names. filter them by top players
2. build a module that will extract game state. you can easily extract worm positions, speed, state, targeting angle, health, ammo, turngame info and whatever you like. dump pclandscape collision bitmap. you can group it into 8x8 pixel blocks and store it as 0/1 value for each block of the map to reduce data size. (check wkjellyworm for reference)
3. build a dataset - play all gathered replays. for each logic frame, dump your game state and the taskmessage (player action) read from the replay fifo
4. clean up your dataset - discard losing players, bad or boring turns. you can extend it with some pathfinding, ballistics solver etc
5. train a classifier - given the current state of the game (and maybe n last frames) select which taskmessage (player action) is most likely to be used
6. embed your classifier into a wormkit module. during your turns dump the gamestate, classify it using your classifier and insert taskmessage into the input fifo (check wkrealtime for reference)
7. publish it on github like a based man

cheers
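A rough sketch of the 8x8 block reduction from step 2, in plain Python. It assumes the collision bitmap has already been read out as a list of rows of 0/1 values. How a block is collapsed to a single bit is also an assumption here (a block counts as solid if any pixel in it is solid); other rules, such as a majority vote, would work too.

```python
def downsample_collision(bitmap, block=8):
    """Reduce a 2D 0/1 collision bitmap to one 0/1 value per block x block tile.

    Assumption: a tile is solid (1) if any pixel inside it is solid.
    Tiles on the right and bottom edges may cover fewer pixels.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    grid = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            solid = any(
                bitmap[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            )
            row.append(1 if solid else 0)
        grid.append(row)
    return grid


# Usage: a 16x16 map with a single solid pixel in the top-right tile.
example = [[0] * 16 for _ in range(16)]
example[3][12] = 1
coarse = downsample_collision(example)  # 2x2 grid of tiles
```

With 8x8 tiles the number of cells per frame drops by a factor of 64, which matters when the state is dumped for every logic frame.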

kubasienki · April 30, 2025, 05:52 PM

Quote from: Impossible on April 29, 2025, 10:46 PM
hi there, kubasienki. I'm AI developer and a fan of reinforcement learning projects too. [...]

I would aim for a deathmatch with some default scheme, but the tasks will differ depending on the experiment. I'll definitely start with simple shooting/navigation.

From the RL standpoint, I want to run diverse experiments. Worms is a very complex game for RL, but at the same time it is deterministic within a single round. I feel it's a nice setup for testing transformer world model techniques like IRIS.

Great to hear about those assist tools - can you name them so I can look into them?

kubasienki · April 30, 2025, 06:03 PM

Quote from: nizikawa on April 30, 2025, 07:41 AM
1. scrape replays from tus league games. [...]

I'm already working in clones of your repos.

As for training, I'm looking into pretraining and scraping replays, but I will start with unsupervised learning. I have the implementation ready to go, so I'll run it and expand later.

Right now, I'm struggling to find the time to finish extracting the collision data. I had just enough time to see where the struct is located and test the pointers in it. They do not seem to point to map data. I managed to find the data once and then lost it :P

I assume the easiest way to find it now is to debug which memory area is written to when the functions that fill in the terrain are called?
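One way to narrow this down without a write breakpoint is to diff two memory snapshots, one taken before and one after the terrain-filling call. The sketch below only does the diffing. It assumes you already have both snapshots as equal-length `bytes` (for example dumped from a debugger), and the `max_gap` merging threshold is an arbitrary illustrative choice.

```python
def changed_regions(before, after, max_gap=16):
    """Return (start, end) offsets, end exclusive, where two snapshots differ.

    Differences separated by at most max_gap unchanged bytes are merged into
    one region, so a large buffer that was mostly rewritten shows up as one hit.
    """
    if len(before) != len(after):
        raise ValueError("snapshots must have the same length")
    regions = []
    start = last = None
    for i, (a, b) in enumerate(zip(before, after)):
        if a == b:
            continue
        if start is None:
            start = i
        elif i - last - 1 > max_gap:
            regions.append((start, last + 1))
            start = i
        last = i
    if start is not None:
        regions.append((start, last + 1))
    return regions


# Usage with synthetic data: two nearby writes and one far away.
snap_before = bytes(100)
snap_after = bytearray(snap_before)
snap_after[10] = 1
snap_after[12] = 1
snap_after[80] = 1
hits = changed_regions(snap_before, bytes(snap_after), max_gap=4)
```

Large regions that change only when the terrain is filled are the candidates worth checking against the pointers in the struct.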

(I'm very new to modding, not very new to coding.)

kubasienki · May 11, 2025, 12:41 PM

I've got the collision data extraction and the communication with the game working.

Now I'm generating pretraining data for locomotion AI.
