The laws of learning have been broadly grouped into two main categories: operant conditioning and classical conditioning. Ken focused on operant conditioning not only because it is more easily observed and understood by beginners, but also because operant conditioning depends on the animal to think and make choices. Classical conditioning works on a much more instinctive level, and does not result in animals who are actively participating in the training process.
Now, if you’ve ever been to a basic training seminar, or even read a book on the topic, you’ve probably been exposed to the four quadrants. Despite this almost universal approach to explaining the basics of training, Ken didn’t even mention them. This was by design; not only are the quadrants a bit difficult to wrap your brain around at first, but they are also somewhat unnecessary.
What you really need to know about the laws of learning can be summed up in Thorndike’s Law of Effect: behaviors which result in a satisfactory outcome will be repeated, while those that result in discomfort will not. Or, to put it simply, behavior is a function of past consequences.
Consequences come in opposing pairs:
Reinforcing or Punishing
Positive or Negative
Unconditioned (inherent) or Conditioned (learned)
Proximate (immediate) or Distal (in the future)
Beluga whale receiving reinforcement. Photo by Kate Mornement.
Ken believes that the best consequences are the first of every pair; it is far better to have positive reinforcement which is immediate and inherently satisfying. And of all of those consequences, the most important is the use of reinforcement, no matter what form it takes. If you reinforce the behaviors you like, Thorndike’s law tells us that you will see more of those behaviors, which is the ultimate goal of training.
There are three main things to keep in mind when using reinforcement. First, you need to be sure that you are not mixing up the idea of a reward with reinforcement. Rewards are things we provide that we believe will be an incentive to perform a behavior, but that may or may not actually be something the animal finds desirable. For example, most people consider chocolate to be a great reward, but it gives me headaches, so I would not change my behavior to get some. Second, while inherently reinforcing consequences like food are best, we can certainly teach animals to enjoy and even work for things like petting or praise. Finally, and perhaps most importantly, the timing of reinforcement is the key to successful training; reinforcers should be given as soon after the behavior as possible. Reinforcing in a timely manner with a low-value item will yield better results than poorly timed reinforcers, even if they are very highly desired.
In future posts, we will discuss some of the questions that arise when considering these pairs of consequences: How do you ensure that your timing is good? How do you teach an animal to enjoy something so much that it can be reinforcing? How do you elicit the behaviors you want… and what do you do when the animal doesn’t do what you want? And if you have specific questions about the basics of operant conditioning, please ask in the comments!
2 comments:
This was by design; not only are the quadrants a bit difficult to wrap your brain around at first, but it is also somewhat unnecessary.
Ha! I like this no-nonsense approach.
Agreed. :)