I am one of a growing army of people trying to bring positive reinforcement training to the horse world. The most familiar picture of positive reinforcement training is that of clicker training. Many people have heard of clicker training with dogs; what’s less well known is how useful clicker training is with horses.
One reason I’m so passionate about bringing clicker training to horses is that I am a convert myself. I first dismissed positive reinforcement training as a treat-inspired gimmick promoted by trainers without talent. Because these trainers were not adept at motivating the animal themselves, I figured, they relied upon motivating with clicks and treats. Clicks were a crutch for trainers with poor timing. Treats were basically used as bait to get the animal to do what clumsy trainers couldn’t convince them to do otherwise.
Furthermore, I objected to the mistaken implication that “positive” reinforcement meant that it was good, better, morally superior.
This pervasive misunderstanding of the “positive” in positive reinforcement is slyly exploited in any marketing of this training method. This exploitation drives me nuts, because I think it prevents people from truly understanding what the method is all about.
Positive reinforcement is not better, but it is different. Is a hammer better at building a house than a screwdriver? No, it’s not better. It’s just different. The builder with both tools in his box will build a stronger house, faster, than the builder who needs to rely upon just one.
Positive, and negative, reinforcement are for the trainer what hammer and screwdriver are to the builder. They are two fundamental tools, and being skilled with both allows the trainer to do her job more efficiently. Efficient, effective methods mean a lot less frustration for trainer and animal alike.
So here’s the deal with the naming of reinforcement types: “Positive” and “negative” are taken from scientific jargon, and they carry no more value judgment than positive and negative numbers on a number line. “Positive” means something is added. “Negative” means something is subtracted. One is not better than the other, nor is one necessarily preferred by the animal.
Many “Natural Horsemanship” trainers would be appalled to learn that what they practice every day, whether they call it pressure-and-release, the language of Equus, or something else, is what the animal behavior scientist calls “negative reinforcement.”
What? I thought negative meant that the animal was getting punished, turned away from something?
No. Negative reinforcement means that when we see a behavior we want to see more of, we reinforce (encourage) that behavior by removing something from the equation. In the case of “Natural Horsemanship,” the thing removed is pressure.
Positive reinforcement means that when we see a behavior we want to see more of, we reinforce that behavior by adding something. Very often, this addition is a food treat.
Clicker training just goes one step further by adding a secondary reinforcer to the equation. Food is the primary reinforcer. The click means nothing to the animal in and of itself, but over time we “load” the clicker with meaning by conditioning the animal to expect food after every click.
Remember Pavlov’s dogs? How Pavlov rang a bell just before feeding time every meal, and eventually the dogs would begin to salivate just from the sound of the bell? In slang behavior-speak we would say the bell became “loaded” with meaning; it began to mean “food is coming.” In proper behavior-speak, we say (as Pavlov demonstrated) that the animal was classically conditioned to expect the treat after the sound of the bell.
Once you have an animal conditioned to expect a food reward after every click, it becomes much easier to use the click, rather than the food, to mark the desired behavior. Food is clumsy. We fumble for it in our pockets. We drop it. We take too long between seeing the behavior we like and reinforcing that behavior with food.
Conversely, the clicker is precise. It is exact, succinct, immediate. Whether we use a hand-held clicker or make an audible click with our mouths, it is easy and repeatably fast to mark the behavior we like with the click.
Then, of course, once the click has been suitably loaded, that classical conditioning buys us a little time between saying “Yes!” to (clicking) the behavior we want, and getting that treat to the animal. We have time to be imprecise and fumble with the food, because the click has already delivered the message of exactly which behavior we liked.
It’s kind of like the difference between getting handed a paycheck for doing our work (the paycheck is the click) and then cashing that check at the bank (or depositing it into our 401k if we’re prudent!). We get the actual cash reward once we’re at the bank, but we know that we are not getting rewarded (paid) for going to the bank. We’re getting rewarded (paid) for having done the work. That’s a bit of an analogy stretch, but worth considering.
There is oodles more to be written in this blog about positive reinforcement and secondary reinforcers and how clicker training works. But for the sake of getting this post out and leaving you with a thought to chew on: positive reinforcement isn’t better–but it is fascinating!!