Horse learning is a complex thing, and natural horsemanship trainers often simplify their description of the approach in order to distinguish natural horsemanship from “old school”. An example of that sort of simplification is suggesting that there is never a role for moderately strong negative experiences – that is, something actually uncomfortable, as opposed to the technical “negative reinforcement”, which is the application of pressure followed by the release that tells a horse they have correctly responded to that pressure.
I understand and agree that the vast majority of learning experiences for a horse need to be as positive as we can make them and the pressures we use can and should be very, very light. Like any other living being, a horse depends on an overall emotion of successful living (efficacy) to decide whether its behavior in general is beneficial. Too much negative experience for a horse (or human) will lead to conflict behaviors and potentially even learned helplessness. At the same time, no learning creature, horse or human, spends every moment of its life comfortable – discomfort is often a key driver for learning.
We use negative reinforcement (pressure / release) rather than positive reinforcement because pressure suggests the correct action in a way that is both intimate and rapid, using a very simple grammar – move away from pressure applied in certain places in certain repeatable ways. We intentionally use the lightest pressures possible and release them as quickly as possible when something even approaching the correct response emerges, and converge on a more correct response as training progresses, but we do use pressure – a very mild sensation in the fully trained horse, sometimes much more uncomfortable in the early phases of training – to signal that a change is needed. When the change is underway or complete, depending on the nature of the desired response, we release the pressure as a reward. This is not so different from how nature works – the horse escapes the discomfort of the bramble bush by pushing forward or backing up; hurls the predator from its back by bucking or rearing; finds grass to release the pressure of hunger; or finds water to release the pressure of thirst.
Correction is an important part of learning. Horses do not know what we want or what goals we have, and they go through all sorts of periods in their learning, including those where they inappropriately attempt to use things they are learning in the wrong context. A great example of this is when the horse is doing lead change work and the trainer is changing between leads in the center of the arena with some consistency. As context-sensitive learners, horses often connect the lead change with the location, rather than with the application of the aids (the operant or classically conditioned cue). This is because location is the ultimate light cue (“I am approaching the center of the arena, and I am likely to feel the pressure to change leads – why not just change leads before the pressure arrives?”). In nature, too, location often signals an appropriate behavior: for instance, being wary near cliffs or rocks that might conceal predators.
But when the horse attempts to change leads based on location, it needs to be corrected – for instance by pushing it back onto the correct lead with a second lead change, or by an increase of pressure on the outside leg on approach to the center or a touch of the spur from the outside leg as the anticipatory muscle set up by the horse is felt by the rider or trainer.
Riders have to correct all the time. “No, don’t turn that direction”, “pay attention to me, not to the open door”, “move your hip over NOW”. It’s especially important during the “obedience” phase of learning to correct the horse when it is not reacting quickly enough and when it is not self-carrying a gait or maneuver. Stopping within the right number of strides, departing within the right number of strides, maintaining rate – all of these are the outcome of gradually reducing the frequency of correction in order to shape the desired behavior. Thus, corrections are a form of negative reinforcement, but the pressure to be released is the repetition of the correction, and the time scale is minutes, not moments.
Corrections should be as light as possible. But sometimes a horse is distracted, sometimes the mare you are riding is in heat, and when that happens, the strength of a correction needs to be raised above the threshold of the noise represented by the distraction. That can cause noticeable discomfort for the horse, and there is a point where the trainer has to make a judgement to find a positive thing to do and end the session because the corrections are just getting too loud and are still non-productive.
Sometimes the correction needs to be loud to discourage the horse as rapidly and completely as possible from offering a certain response. I call this “discouragement” rather than “correction” because it is about an instant application of a fairly large discomfort to signal that a certain behavior is unacceptable and, ideally, to drive it almost immediately to extinction. These are things like the open-handed slap on the neck of the horse when it attempts to bite, a yelling and threatening advance when the horse attempts to climb onto the trainer (this is not funny, though it sounds funny – it is very frightening, and yes, it happened to me once, and never again), or an intense spur when the horse is attempting to scrape the rider’s legs off on the wall. Like a normal correction, this should be as light as possible, but it needs to be loud enough to immediately extinguish the behavior.
The volume of a discouragement is not easy to set. For one thing, you are often discouraging the horse in a moment of potentially dangerous extremity, and there isn’t a lot of time to measure your response precisely. For another, “over-discouraging” the horse can lead to an explosive hyperreactivity. Making the judgement between a discouragement that is too quiet and one that is too loud requires a sense of feel that is based on experience in general, experience with the specific horse, and a sense of which way things are going when the horse is being discouraged. If the horse ceases the behavior – mission accomplished. If it is escalating with the application of the discouragement – wrong thing to do, stop now, try another way.
Horses loudly correct each other all the time – when a conspecific fails to move quickly enough off a desired resource (space, hay, water), for instance, pinning of ears, baring of teeth, snaking of neck, kick threats or actual kicks are used. Watching horses form a domestic herd, you can see emerging leaders “overreact” during the “obedience” phase of teaching their herd mates the way they are expected to behave as a result of an approach or a gesture.
Nature discourages horses from doing things that hurt – like walking into pricker bushes or going down overly steep or loose slopes. Cuts, scrapes and injuries are painful, but they also quickly teach the horse to avoid certain locations or actions.
There is a place for this in training. We, as trainers, have an obligation to be aware of our role as surrogate nature: to use discouragement in the right place, to quickly modulate to correction, and then to leave the darn horse alone when the right behavior (or the reasonably right shapable behavior) is being offered, only using the lightest of aids and the most rapid possible release of pressure.
And then finally we have punishment. Punishment is an aggressive action by the trainer outside the “causal window” of the horse’s intellect (that is, roughly ten seconds or more after the event). While our larger causal window allows us to understand the connection between an event ten months ago and a negative consequence now, that connection is impossible for the horse’s mind to assemble. A horse will not even understand a discomfort generated in response to an action that happened many seconds ago.
Punishment is a worse-than-useless action. It actively shuts down the horse by creating a fear relationship with the trainer or rider that disconnects the human from being a predictable vessel for learning and instead begins to push the human into the category of a threat. The trainer has now become unpredictably aggressive in the eyes of the horse. This can create all sorts of negative stress and conflict behaviors as the horse tries to make sense of what was predictable now becoming unpredictable – as would happen to you if your spouse suddenly and unpredictably hit you.
Categories like “correction”, “discouragement” and “punishment” have some overlap. It’s critical to try to keep the separating line as clear and bright as possible so that the horse can have the best and most successful possible learning experience. But remember that correction and discouragement have their place, and spend some time to determine where they should be used, why they are being used, and how you can quickly de-escalate to operant or classical negative reinforcement so that the work of training and learning can quickly return to its positive side.