Hacking the Cortex: Thousands of Steps, Dozens of Falls.


A lot of the accepted wisdom with regard to learning holds that certain skills, such as language and walking, are learned because a “module” in the brain activates and allows us to learn that skill quickly and effectively. The assumption made in these models is that the brain has evolved to contain a set of rules which constrain that learning and accelerate the process. For example, with language, we’re thought to have a Language Acquisition Device in the brain that contains a master set of rules for a universal grammar. How else, they argue, could we learn something as complex as language so quickly? How else, they argue, can we explain how the overwhelming majority of children learn to form grammatically correct sentences without explicit instruction in grammar?

Those theorists are flat out wrong. Learning occurs in tiny increments, with many, many errors along the way. Much of it happens unconsciously, so it seems “innate” to observers and to our own subjective experience. Because we don’t see explicit lessons, we believe that there are no lessons.

Children between the ages of 12 and 19 months average a total of 2,368 steps per hour. The physical design of human legs limits their range of motion to the point where learning to walk is relatively easy, as there are only so many ways you can move your legs. But even with that advantage, it takes millions of learning experiences to become proficient at toddling, let alone running, jumping, climbing, sprinting, and navigating obstacles.

The important statistic I want to get at for this article is that within those 2,368 steps, a child also averages 17 falls per hour.

That’s a lot of falling.

Anyone who has played Warmachine for any length of time knows how many falls are involved in learning this game. It is the single most complex and nuanced wargame I’ve ever had the pleasure of playing, but the learning curve is steep. The first few games you play are training wheels games, in which the demoing player goes easy on their opponent and allows them to learn what their army actually does. We slowly introduce the basic mechanics of the game. But once the training wheels come off, we warn them – “You’re going to lose a lot of games”. And they do.

You have to learn about Molik missiles and Snipe-Feat-Go, about Overrun and Goad angles. You need to learn about the true brutality of Cryxian debuffs and overwhelming hordes of infantry. And there’s always more, always another combo.

And that’s just learning what other armies do.

You need to learn the nuances and subtleties of how your own army plays, about how a tiny rules interaction can be exploited to turn your troops up to 11 or to really ruin your day. You have to get blasted apart by Dire Troll Bombers before you learn to spread your troops out, then someone shows you how spray angles can surprise you even then.

There is so very much to learn. And as we established in the previous article, you learn via practice. But there’s more to it than that.

The core process which governs all of human learning is Operant Conditioning.

If you ask about 80% of Psychologists (totally made up number, but a large majority) they will tell you that B. F. Skinner was “discredited” or that his findings have been “discarded”. The truth is that the rules of operant conditioning are some of the most well researched, well established effects in all of psychology. I’d personally argue that it’s the closest thing we’ve got to true First Principles, and that it’s going to wind up explaining a lot more than anyone ever thought it would. But that’s a whoooole other tangent and I’m getting off topic. The point is, this Skinner guy was legit.

Gaze upon his forehead, ye mighty, and despair.

Operant Conditioning works on the simple principle that behaviour is shaped by the consequences of our actions. Learning is selectionistic in the same way that evolution is selectionistic, but instead of selecting for reproductive success, learning selects for behaviours that are reinforcing.

In simple terms (because this is my psychological wheelhouse so I’m having to restrain myself from Extremely Unnecessary Detail), when we do a behaviour that is reinforced, the probability of us doing that behaviour again is increased. When we do a behaviour that is punished, the probability is decreased.
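If it helps to see that rule as arithmetic, here is a minimal toy sketch in Python. The update formula and the `rate` value are illustrative assumptions of mine, not anything taken from Skinner or the research literature; the only thing it is meant to show is the direction of change.

```python
def update_probability(p, consequence, rate=0.2):
    """Nudge the probability of repeating a behaviour after one consequence.

    p           -- current probability of the behaviour (0.0 to 1.0)
    consequence -- "reinforced", "punished", or anything else for no change
    rate        -- hypothetical learning rate, chosen purely for illustration
    """
    if consequence == "reinforced":
        return p + rate * (1.0 - p)  # move part of the way towards 1
    if consequence == "punished":
        return p - rate * p          # move part of the way towards 0
    return p                         # no consequence, no change


# Two behaviours that start out equally likely.
p_reinforced = 0.5
p_punished = 0.5

# Five repetitions, each followed by the same kind of consequence.
for _ in range(5):
    p_reinforced = update_probability(p_reinforced, "reinforced")
    p_punished = update_probability(p_punished, "punished")

print(round(p_reinforced, 3), round(p_punished, 3))
```

After five repetitions the reinforced behaviour has drifted well above its starting probability and the punished one well below it, which is all the "simple terms" version claims.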

How is this relevant to Warmachine? On the surface level, it’s pretty obvious – when we do something that loses us a game, that behaviour is punished. Get too close to Molik? Bam, instantly punished. We learn not to do that pretty quickly.

The problem is that our basic learning processes are pretty bad at judging cause and effect. The stimulus that happens right before the consequence is taken to be the cause. Getting to a higher level of Warmachine play requires the ability to analyse potential causes that happen long before the consequence of losing. In a straight up assassination game, the behaviour that caused the consequence is often obvious – you put your caster inside the threat range of a model that could kill them. That’s why learning threat vectors tends to happen early on, and smart learners learn to ask “what’s the threat range of X?” or “do you have any effects which increase charge ranges?” when they encounter a new army. But in games that go to attrition or scenario, big causes can get missed. You can lose an attrition game because of how far you ran on your first turn, but still not take enough losses to realise that it was the only real mistake you made, because the consequence of losing the game comes so much later. You can lose a scenario because you failed to grab a point top of two that you didn’t see was on, and then lose five turns later.
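To make that recency bias concrete, here is a rough Python sketch of a learner that hands out blame for a loss mostly to whatever happened last. The function name, the `decay` factor and the example game are all hypothetical, made up for illustration rather than drawn from any real model of learning.

```python
def blame_by_recency(moves, decay=0.5):
    """Split the blame for a loss across moves, favouring the most recent.

    moves -- move descriptions in the order they were played (must be unique)
    decay -- hypothetical factor: each step further back in time gets this
             fraction of the blame given to the move that followed it
    """
    last = len(moves) - 1
    weights = [decay ** (last - i) for i in range(len(moves))]
    total = sum(weights)
    return {move: w / total for move, w in zip(moves, weights)}


# An invented game where the real mistake happened on turn 1.
game = [
    "turn 1: ran too far forward",
    "turn 2: missed the open scenario point",
    "turn 3: traded pieces evenly",
    "turn 4: caster stepped into threat range",
]

blame = blame_by_recency(game)
for move, share in blame.items():
    print(f"{share:.0%}  {move}")
```

With these made-up numbers, more than half the blame lands on the final move and well under a tenth on the turn 1 run, even if the turn 1 run was what actually set up the loss.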

This topic is on my mind at the moment because I feel that’s one of the areas where I need to “level up” my game – to notice the early game errors which are making things more difficult for me to win later.

You can “fix” these kinds of flaws in your learning processes by being aware of their particular biases towards immediate causes. You can “rewire” your automatic learning by talking about what happened in the game, by being mindful of the effects of your early moves on the game.

The second way that Operant Conditioning folds into Warmachine is on the issue of takebacks. I’m all for takebacks when you’re teaching the game, or learning a new caster (for the first few games at least) – the objective is to learn the rules, not to learn to win, at that point. But consider takebacks in the context of Selection By Consequences.

Antecedent: You screw up

Behaviour: You ask for a Takeback, which is given.

Consequence: Your screwup no longer impacts the game (Reinforcing consequence)

And when a behaviour has a reinforcing consequence, it becomes more probable that it will be your response in that situation in future. You become more likely to ask for takebacks in future. If you change the above situation and remove the takeback, then the screwup has a punishing consequence in some way, shape, or form. What happens? The screwup (and you knew it was one, because you asked to take it back) becomes less likely to happen again in future.

Win win. Less likely to screw up, rather than more likely to ask for a takeback – which is a behaviour that can be bad in a tournament setting, as some people don’t like to give them, and that only leads to hurt feelings. Particularly if it’s a game ending screwup (and people are way less likely to give takebacks when you’ve just punted the game, even if a couple of minor ones have been given before. Where do you draw the line?)

From an operant conditioning perspective, takebacks are bad news in serious practice. I was on the side of allowing them in practice (to let the game unfold and really test the list) but as soon as I started to think about Operant Conditioning and Warmachine I realised my position was objectively less good than a “no takebacks” rule when serious practice was being engaged in. You want to reinforce good habits, and punish bad habits (Disclaimer: Punishment is far less effective in shaping behaviour than reinforcement. They’ll take away my behaviour analyst card if I don’t point that out very clearly. The best thing to do is to simply Not Reinforce bad habits, rather than actively punish them).

Good Night, and Good Luck

-I_Avian

P.S. I was planning to talk about some of the potential pitfalls of practice and expertise here, but I felt I needed to talk a little about conditioning first, and of course it got away from me… Next week, I promise!

P.P.S. I think I need a new sign off. The only article I’ve written after nightfall has been my first, so it just feels weird to type…

 

Read more from I_Avian and the rest of the Overload Online crew at threediceoverload.wordpress.com.
We’re also on Facebook and Twitter.

Author: I_Avian

Anthony began his Warmachine journey on the raggedy edge between Mark 1 and Mark 2, playing just enough Mk1 to be certain that Mk2 was a good thing, and just enough field test models to lament what might have been if Mulg had remained at 11pts and Stalkers could still Leap. Some of his early trials and tribulations were documented on Lost Hemisphere, which was also home to a short “Storytime with I_Avian” series which now continues on Overload Online. Anthony channels his constant urge to talk about Psychology into a series of articles about “the mind game” aspect of Warmachine and Hordes. For a brief moment in time, he was a Hunters Grim player, but WTC duties have brought him back into the cold, cold, embrace of Cryx.
