
Sunday, October 6, 2013

Denise Fenzi Seminar: Rewarding Your Dog

When Denise talks about rewards in dog training, they seem to fall into one of two categories: things or activities.

Rewarding with things is very common in the dog training world; these are the rewards that require you to plan ahead and have food or toys on hand. Both food and toys are important rewards. Denise prefers to use toys when she’s working on happy, enthusiastic performances and food when she’s working on precision, but when it comes down to it, she believes attitude is more important than precision.

Where Denise really shines, though, is rewarding with activities. These rewards don’t come with anything tangible; instead, the reward lies in doing something. For example, Denise does a lot of personal play with her dogs. This is different from toy play. Instead, it’s about the dog and handler interacting together in a fun way.

You know what's fun? Running!
Denise also talks a lot about making activities in and of themselves rewarding. She told us about a study (sorry Science Geeks, I don’t have a citation) where researchers split kids into two groups. Both groups were told that they were studying some new puzzles, and they wanted the kids to play with them and then answer some questions. The first group was told they would be paid at the end, and the other was not paid. After the kids played with the puzzle for a specified period of time, they were told the researcher would be in to ask them some questions. While the kids were waiting, the group of kids who were not paid continued to play with the puzzles, while the paid group did not.

What does this have to do with dog training? Well, as this study demonstrated (and as many of us know from experience), activities done as volunteers often yield more satisfaction than those done for pay. In other words: we enjoy work more when we find it intrinsically rewarding. Dogs are the same. We shouldn’t need to pay them for things that are fun… and training can be fun! Many dogs naturally enjoy retrieving or jumping or running.

Of course, you have to make the work interesting for the dog. Make it an exciting privilege for your dog, like a child getting to go to Disney World. Teach your dog that if he’s going to work, it needs to be at his full capacity. Or, to paraphrase Master Yoda: Work or don’t work. There is no halfway.

Be sure that your dog receives one unit of reward for one unit of effort. Denise and Deb talk quite a bit about this in their upcoming book, but basically, if your dog tries to do something that is very difficult for him, compensate him fairly for it- kind of like hazard pay. As your dog gets better at that same thing, you can reduce the amount of reward he receives because it doesn’t require as much effort any more. Doing so often naturally leads to the reduced reward schedule so necessary in trials.

Finally, don’t be afraid of making mistakes. Denise told us that the more things your dog does wrong, the better. Mistakes help a dog understand what won’t be rewarded, meaning that in the long run, he will have a better idea of what will. If you feed your dog so much that he never fails, all he learns to do is to eat, not work. Teach your dog to work.

How do you reward your dog? Personally, I tend to be a bit dependent on things. A lot of this is because I compete mainly in venues that allow me to take food in the ring, so I don’t have a ton of incentive to develop activity rewards. For the few times I do compete somewhere I can’t use food, I’m fortunate enough that Maisy does find my smiles and praise rewarding.


Sunday, August 4, 2013

Kathy Sdao Seminar: The R in Dog Training

To Kathy, the most essential thing to understand about dog training is that consequences drive behavior. Period, end of story. What happens after a behavior is the best predictor of whether or not that behavior will happen again. There are other important things, of course, and in fact, Kathy has an acronym for them: “Get SMART,” which stands for See, Mark, and Reward/Reinforce Training. (There’s actually a second S- set up- which I’ll talk about in a separate post because there’s a lot of great material there to apply to reactivity.) But the most important is the “R,” so that’s what I’m going to write about today.

The camera caught me mid-reinforcement!
Let’s start with the difference between reinforcement and rewards. Although it might appear that she’s using the two words interchangeably, she’s not. They aren’t the same thing. Rewards are given to an animal; they’re something he earned. Rewards don’t necessarily affect behavior (although they can create good will and enthusiasm). Behaviors, on the other hand, are reinforced. Reinforcement both causes the behavior to be repeated or occur more often and is contingent on that behavior happening. Reinforcement, Kathy says, is the trainer’s responsibility, not the animal’s.

Obviously, the more reinforcers you have, the better, and the number of things you can use as a reinforcer is limited only by your own creativity. Classical conditioning will allow you to create a reinforcer. Or, you can use things your dog is distracted by as a reinforcer (this is basically the Premack Principle, and it’s very potent). And, as I’ve discussed on this blog before, cues can also be reinforcing.

I’ve always found this last bit fascinating, if a bit confusing. The truth is, while it’s awesome that using cues as reinforcers gives you a lot more options, there are some downsides. You have to put a bit of work into cues-as-reinforcers; the cue must be familiar and the behavior must be fluent. It also needs to have been taught with positive reinforcement only. And then, if you’ve been lucky enough to create reinforcing cues, you need to be careful. If you give them simultaneously with bad behavior- such as when we try to redirect a behavior we dislike- it can reinforce that bad behavior. Oops.

This isn’t the only way reinforcement can go wrong. Remember how Pavlov’s dogs were conditioned to feel happy when they heard a bell that was followed by food? This kind of association can form between any two events. If two events happen sequentially, the way the dog feels about the second event can go backwards in time and contaminate the first. Sometimes this is awesome; dogs learn to love their clickers because they’ve been followed by treats. Sometimes, not so much. Kathy told us that if you reinforce a dog immediately after you’ve punished him, that punishment will become a reinforcer.

Say what? But… yeah, it can happen. It’s just two events getting associated with one another. For example, if you yell at your dog and then immediately praise him for making a better choice, the dog can learn to anticipate being praised after you yell. Or if you give him a collar correction for pulling on leash and then click and treat for heeling, collar corrections can become an opportunity to earn food. If this happens, every time you try to punish your dog by yelling at him or using a collar correction, you’ll actually be reinforcing the behavior and therefore causing it to happen more often!

This also works the other way around. If something bad happens immediately after you’ve offered your dog a toy or some food, then the bad thing can contaminate the good one. This can create a dog that “isn’t food motivated”- not because he doesn’t like food, but rather, because he’s afraid of what it predicts. And this doesn’t have to be punishment. If you try to help a dog get over his fears by luring him into the situation (for example, luring him to you to get a nail trim or to step on a wobble board), you’ll actually make things worse.

But don’t let all this scare you away from using reinforcement! For one thing, reinforcement is impossible to avoid, even if you aren’t a clicker trainer (anything that increases a behavior is reinforcement). Instead, avoid the pitfalls by simply separating reinforcement (good things) and punishment (anything scary or bad) with a pause long enough that the dog doesn’t associate the two.

Okay, so you’re ready to reinforce behaviors. You know how to avoid poisoning your treats. So… how do you give them? Experienced trainers know that the way you deliver reinforcement influences the final behavior. Using a marker (like a clicker) will reduce the impact of food delivery because the marker is what says that’s the behavior. Even so, that marker becomes a sort of cue in itself: it tells the dog that he has earned his reinforcer and that he should go to the location where it will likely be delivered. Don’t fool yourself into thinking that only a clicker will tell the dog this; my Maisy has discovered that praise or even just a smile from me means that she should look for her treat. This is why, whether you use a marker or not, the place you give the treat matters so much.

There are three main places to give the treat: in position (while lying down, in heel position, etc.), in a spot that sets up the next repetition of the behavior (for example, tossing the treat away from the dog’s mat when teaching “go to bed”), or with “direction sliding” (where you move the dog to the correct location, either to fix a problem like forging in heel or to further the dog’s learning, as when teaching a spin). The option you choose will depend on both the stage of learning your dog is in and your final goal. And you may even switch back and forth between locations!


So that’s the down and dirty on reinforcement, AKA, the most important part of dog training. What have you learned about reinforcement? Worse yet, what did you learn the hard way?

Wednesday, June 12, 2013

The 100 Treat Philosophy

Imagine that I told you that you could fix your dog’s behavior problems with 100 treats. It will take 100 days and all you have to do is give him a treat once a day at a specified time. That sounds pretty good, right?

Now imagine that I told you that you could cut that time in half. You will use your 100 treats by giving your dog two treats a day for 50 days. Sounds even better, right?

But what if I told you that you could shorten that time all the way down to one day? You’d need to give your dog all 100 treats throughout that day, and it would probably take quite a bit of concentration to make sure you did everything correctly… but the effort would be worth it, right?

So which option would you choose? 100 days, 50 days, or 1 day?

"I would like to eat these now, please."
The numbers in this scenario are not realistic. If they were, I’m pretty sure I’d be a millionaire. Still, the idea behind them IS realistic. When I teach reactive dog classes, the one thing that I consistently notice is that the dogs who make the most progress are the ones who have the most generous people.

But having a high rate of reinforcement is hard for most of us, even when we’re fully on board with using food to train. I know that I personally struggled with that when Maisy and I were first working on her reactivity. I was afraid she’d become dependent on the treats. I was afraid that I’d have to carry food with me everywhere I went. I was afraid I’d never get her back in competitions because food is either not allowed or is very limited in the ring.

The truth is, though, food is like a foundation. Just as an abundant supply of bricks or concrete will make a better base for your house than a small number of logs, using a lot of high value treats in the early stages of training gives you a better chance of getting the results you want. You’ll also get those results sooner because you won’t have to constantly rebuild after every little storm.

When I finally began to reward Maisy for as many good choices as I could, even if that seemed like “too many” treats, she began to make PHENOMENAL progress. In fact, these days she’s practically normal. While there are situations where I do still use a lot of treats- like when she’s acting as a neutral dog for one of my reactive dog clients- there are also times when I don’t use any. I walk Maisy at least twice a day, and I rarely take treats with me. Last week we made a spur-of-the-moment stop at a pet store, and even though I didn’t have any treats with me, it didn’t matter. She was just fine without them.

Now, of course we all know that there are no guarantees when it comes to dog training. There are things you just can’t control when trying to work on a dog’s behavior problems. Notably, a dog’s genetics will limit the amount of progress possible, so not every dog can be “fixed.” (See this post for an in-depth discussion on why this is.) And although you can affect how long it takes to help your dog reach his full potential, there is no way to know in advance how long that will take.

So the next time you’re worried that you’re giving your dog too many treats, remember that it won’t be like that forever. In fact, by being generous, you’re making things easier for both you and your dog. Still, it’s up to you. Use your 100 treats wisely.

Monday, May 13, 2013

Shedd Animal Training Seminar: Aversives, Punishment, and You!

I don’t know why, but I always enjoy discussions on punishment. In some ways, it feels like a “forbidden fruit.” I very rarely use punishment with my dog or my clients’ dogs, and if you try to discuss it- even theoretically- online, it can cause a lot of controversy. So my opportunities to talk about it are rare.

During the Shedd seminar, Ken talked about the advanced concepts of punishment, negative reinforcement, and aversive stimuli. These are three distinctly different concepts that are often confused, misused, and misunderstood. Still, the definitions are quite simple, and if you plan to use any of these techniques, you really do need to understand them.

An aversive stimulus is something that the animal wants to avoid. There is no definitive list of what makes something aversive; each animal will have different feelings about this. For example, some dogs hate being squirted in the face with water, but Maisy thinks it’s AWESOME.

A reinforcer is anything that increases the behavior it follows. Positive means something was added to make that behavior increase, while negative means something was removed. A negative reinforcer happens when something is removed, and as a result, a behavior increases in the future. This can happen for two reasons. First, the behavior may increase due to avoidance; an aversive isn’t actually applied, it’s simply threatened. The animal acts in order to prevent it from happening. Or, the behavior may be the result of escape. This happens when the aversive is actually applied and then removed when the desired behavior occurs. Either way, negative reinforcement is at play. It’s important to note that negative reinforcement can be both humane and effective if it’s done correctly.

A punisher is something that decreases the behavior it follows. This, too, can come in the positive or the negative variety. One way punishment can be used humanely is through deprivation; a reinforcer is withheld (negative) so that the animal will not perform the incorrect behavior again (punishment). Ken pointed out that this is why it’s so important to have multiple reinforcers available: it allows you to withhold certain reinforcers without depriving the animal of his full diet.

With that said, you really do need to know your audience when you use these terms. A trainer will punish a behavior; she wants a particular action to stop. But the public tends to punish the animal. That is, the punishment happens well after the fact, such as grounding a child for a bad report card or putting someone in jail for a crime they committed. In both cases, the actual behavior is so far removed from the consequence that it’s probably not being affected much.

So, while Ken does use punishment, he does not use it as the public understands it.

Ken talked about the use of conditioned punishers, as well. These are things that become aversive by association. Just as a clicker is a conditioned reinforcer because it predicts good things, there are also things that will predict bad things.

A delta signal, which is a warning to the animal that an aversive is about to be applied, can sometimes be used as a last chance to get things right. “Stop doing that or else,” it tells the animal. Your mom using your full name can be a delta signal; it tells you that you need to stop pulling your sister’s hair or face her wrath. The problem with deltas is that it can be very easy for an emotional trainer to escalate the use of punishment.

Ken also told us that a no reward marker acts as a punisher. This is the opposite of a bridge; it marks the moment when a behavior is wrong so the animal won’t do it again. These are typically quite mild, but can still cause frustration in the animal. So, while a skilled trainer can use no reward markers effectively and humanely, Ken thinks the potential for misuse is high.

I think my favorite part of this section was Ken’s discussion on how trainers use punishment versus how the public does. I appreciated the focus on behavior, not whether the animal is being “good” or “bad,” “cooperative” or “stubborn” (a word that always makes me crazy).

But what do you think? Anything intriguing here?

Monday, May 6, 2013

Shedd Animal Training Seminar: Advanced Concepts in Reinforcement

Okay, gang, I’m back with the Shedd Animal Training Seminar recaps. It’s been awhile, but thankfully I left off at a pretty good breaking point because we’ve come to the section on advanced concepts.

Ken defined advanced concepts as those that require experience in order to apply them. This is any training that ventures past the basics of “reward behaviors you want and ignore the ones you don’t.” You know you’re ready to start dabbling in some of these concepts when you understand training theory well enough to know when to ask for help (seriously, all good trainers get in over their heads sometimes) and you have some good mechanical skills (able to use a marker with good timing, able to deliver reinforcers efficiently and effectively).

That said, just because YOU are ready to use an advanced concept does not mean that your animal (or your human client) is ready for the concept. So you also need to know when it’s appropriate to use one of these concepts, and when to stick with the basics.

A great example of this is the concept of defining criteria for a behavior. In the early stages, we think of behavior as a black-or-white kind of thing: either the behavior was 100% correct, or it was wrong. Except… there IS a gray area in training. This happens fairly often when a behavior is still in training, especially when you’re shaping a behavior with a series of approximations. Sometimes the animal gives you something you weren’t looking for or expecting, and you need to make a quick judgment call about whether or not to mark it.

With that out of the way, let’s talk a bit about when Ken considers reinforcement to be an advanced concept.

Being sprayed by a water bottle is a secondary reinforcer for this dolphin.

One situation in which using reinforcement requires an experienced trainer is when a secondary reinforcer is being used. Also called a conditioned reinforcer, this is something that the animal is taught to value. The most common example is a clicker or marker, but it can be anything the animal has learned to accept as a reinforcer. Secondary reinforcers can be indispensable when an animal is sick and refusing to eat, but you still need to give him medications or reward him for a behavior.

Ken notes that your relationship with the animal is critical when you’re using a secondary reinforcer; while a kiss from your significant other may be welcomed, a kiss from your boss probably won’t be. For a more in-depth discussion of secondary reinforcers, please see this post: http://reactivechampion.blogspot.com/2011/08/ken-ramirez-seminar-non-food.html

Another reinforcement technique that Ken considers to be an advanced concept is the use of variable reinforcement. Ken likes to look at reinforcement schedules simply. Instead of all the technical terms like CRF, FI, FR, VI, VR, etc., he tends to see them as either continuous and consistent or variable and intermittent. Of course, he readily agrees that understanding the technical terms can be helpful, but said that it really isn’t necessary in most situations.

Variable reinforcement happens when an animal does not get a reinforcer for each and every behavior. It’s often used in training because it makes a behavior more resistant to extinction. This allows you to have the animal do a number of behaviors for only one reinforcer. However, it does need to be carefully introduced or it can lead to frustration in your animal.

Although there are many ways to introduce a variable schedule of reinforcement, Ken shared how the Shedd staff do it. First, every new trainer AND every new animal begins with a continuous, fixed schedule of reinforcement, though the staff do provide variety in the types of reinforcers. Then, they condition and establish secondary reinforcers (see the post linked above for more details on this). Next, they start using those secondary reinforcers so that correct responses are not always followed by a primary reinforcer. Finally, they use other well-established behaviors as reinforcers. This entire process generally takes four to six weeks with an experienced trainer AND an experienced animal. With a naïve trainer and animal combo, it can take several years.
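
For those who like to see the progression laid out, here is a minimal sketch in Python of how that staged move from continuous to variable reinforcement might be modeled. The stage names and probabilities are my own illustrative assumptions, not Shedd’s actual numbers.

```python
import random

# Illustrative stages only; the probabilities are assumptions, not Shedd's numbers.
# Each stage: (name, chance of a primary reinforcer, chance of a secondary only).
STAGES = [
    ("continuous",             1.00, 0.00),  # every correct response earns food
    ("secondaries_paired",     1.00, 0.00),  # secondaries conditioned, still always backed by food
    ("secondaries_substitute", 0.80, 0.20),  # occasionally the secondary stands alone
    ("variable",               0.60, 0.25),  # some repetitions earn nothing at all
]

def consequence(stage: int) -> str:
    """Return what a correct response earns at the given stage."""
    _, p_primary, p_secondary = STAGES[stage]
    roll = random.random()
    if roll < p_primary:
        return "primary (food)"
    if roll < p_primary + p_secondary:
        return "secondary only (clap, well-known behavior, etc.)"
    return "nothing this repetition"

# Ten correct responses at the final stage.
print([consequence(3) for _ in range(10)])
```

The point of the sketch is simply that the animal never jumps straight from “food every time” to “food only sometimes”; the conditioned secondaries bridge the gap.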

There is one more advanced concept in regards to reinforcement that Ken discussed: negative reinforcement. However, I decided it makes more sense to present it with the seminar summary on aversives and punishment. Keep an eye out for the next installment in the Shedd Animal Training Seminar series!

Monday, September 24, 2012

Shedd Animal Training Seminar: Basic Operant Conditioning

In the last hundred years or so, science has learned a lot about animal training. In fact, we have learned so much that Ken stated definitively that training is a technology. That is, the laws of learning are always true, no matter what species we are working with. In that sense, training can be compared to the laws of gravity: no matter what you drop, it will fall downward. Of course training, like gravity, can be influenced by outside factors. If you drop a pen during a tornado, it may fly sideways or appear to hover in the air, but that’s not because gravity has ceased to work. Likewise, the laws behind training are still at work, even when the results are unexpected.

The laws of learning have been broadly grouped into two main categories: operant conditioning and classical conditioning. Ken focused on operant conditioning not only because it is more easily observed and understood by beginners, but also because operant conditioning depends on the animal to think and make choices. Classical conditioning works on a much more instinctive level, and does not result in animals who are actively participating in the training process.

Now, if you’ve ever been to a basic training seminar- or even read a book on the topic- you’ve probably been exposed to the four quadrants. Despite this almost universal approach to explaining the basics of training, Ken didn’t even mention them. This was by design; not only are the quadrants a bit difficult to wrap your brain around at first, but they are also somewhat unnecessary.

What you really need to know about the laws of learning can be summed up in Thorndike’s Law of Effect: behaviors which result in a satisfactory outcome will be repeated, while those that result in discomfort will not. Or, to put it simply, behavior is a function of past consequences.

Consequences come in opposing pairs:
Reinforcing or Punishing
Positive or Negative
Unconditioned (inherent) or Conditioned (learned)
Proximate (immediate) or Distal (in the future)

Beluga whale receiving reinforcement. Photo by Kate Mornement.

Ken believes that the best consequences are the first of every pair; it is far better to have positive reinforcement which is immediate and inherently satisfying. And of all of those consequences, the most important is the use of reinforcement, no matter what form it takes. If you reinforce the behaviors you like, Thorndike’s law tells us that you will see more of those behaviors, which is the ultimate goal of training.

There are three main things to keep in mind when using reinforcement. First, you need to be sure that you are not mixing up the idea of a reward with reinforcement. Rewards are things we provide that we believe will be an incentive to perform a behavior, but that may or may not actually be something the animal finds desirable. For example, most people consider chocolate to be a great reward, but it gives me headaches, so I would not change my behavior to get some. Second, while inherently reinforcing consequences like food are best, we can certainly teach animals to enjoy and even work for things like petting or praise. Finally, and perhaps most importantly, the timing of reinforcement is the key to successful training; reinforcers should be given as soon after the behavior as possible. Reinforcing in a timely manner with a low-value item will yield better results than poorly timed reinforcers, even if they are very highly desired.

In future posts, we will discuss some of the questions that arise when considering these pairs of consequences: How do you ensure that your timing is good? How do you teach an animal to enjoy something so much that it can be reinforcing? How do you elicit the behaviors you want… and what do you do when the animal doesn’t do what you want? And if you have specific questions about the basics of operant conditioning, please ask in the comments!

Tuesday, February 28, 2012

Jane Killion Seminar: Reinforcement Training

Jane Killion is a reinforcement trainer.

Notice the words I chose there, because she doesn't call herself a positive trainer. Why? Well, in behavior science, all “positive” means is “add.” You add something in order to manipulate future behavior. Sometimes you add something in order to increase a behavior, and sometimes you add something in order to decrease a behavior. Technically speaking, this label can easily refer to some wildly divergent training methods. But Jane also feels the term “positive” can be divisive. Since so many people misunderstand it to mean upbeat, kind, or virtuous, it implies that people who use other methods aren't. What's more, it also implies that anyone who is nice must automatically be positive, which is misleading.

As a result, Jane favors the term “reinforcement training.” Her methods are based on providing reinforcement to the dog, whether that's by adding something desirable to increase a behavior (positive reinforcement), or by removing something unpleasant in order to increase a behavior (negative reinforcement). What she avoids whenever possible is the use of punishment, as all that does is suppress behavior, not teach the dog what you want him to do.

Training under the umbrella of reinforcement gives a trainer a variety of things to use to create behavior. Without a doubt, the most popular- and Jane's favorite- is to use food. Not only is it a primary reinforcer, but it's also easy to dispense, easy to control, and it tends to tap into the “thinking” part of the dog's brain.

Of course, anyone who has worked with dogs knows that sometimes training with treats just doesn't work. Jane identified two main reasons this happens: emotional interference (ie, stress), or competing reinforcers that are more relevant or interesting. After all, when it comes down to a choice between eating a bit of hot dog and chasing a squirrel, most dogs will choose the squirrel. And if you're working with a “Pigs Fly” dog, the chances that he will choose the environment over you are large.

That doesn't mean that you can't use positive reinforcement with a dog who isn't interested in food. You can, it just takes a bit of creativity. Enter Premack, a principle which states that a high probability behavior can be used to reinforce a low probability behavior. If your eyes just glazed over, don't worry; Jane has come up with a very clever way of describing the Premack principle. She calls it ICE: Identify “hot” reinforcers (anything your dog wants, which might include sniffing the ground, rolling in something stinky, or peeing on a bush), Control them, and then Exchange a behavior for the hot reinforcer.

Jane showed us an excellent video of her working with a very distracted young dog; all he really wanted to do was explore and sniff, which was incompatible with his owner's desire that he walk nicely on the leash. In the video, Jane identified the hot reinforcer (sniffing), controlled that reinforcer by shortening the leash so he couldn't sniff, and then waited for an opportunity to exchange a behavior for sniffing. As soon as the dog looked at her, she let out the slack in the leash and told him to go sniff. After a few moments, she shortened the leash again, and the process repeated. Soon, he was willingly paying attention to her while they walked through the field together. Pretty cool.

As the video demonstrated, Premack is a powerful thing, and I wish she would have demonstrated it live during the seminar. I can understand why she didn't- we had limited time, and Premack takes longer than handing out a cookie- but I was still a bit disappointed. There was even a great opportunity to demonstrate Premack/her ICE system since one of the working dogs had a medical problem that made it difficult to use food for training, but instead Jane had that handler switch out for a different dog.

Another way to train with reinforcement is through negative reinforcement. This often-ignored method has us remove something aversive in order to provide relief to the dog. This is a powerful thing, and dogs and people alike will perform behaviors that bring them relief. Jane was clear that she will not add the aversive herself, but that she will use unpleasant things that are already present; a shiny floor, perhaps, or the presence of the judge leaning over the dog during the stand for exam. And, Jane said, the equipment in agility is often a stressful thing for dogs.

To that end, she showed us another excellent video demonstrating how reward placement can provide negative reinforcement. On screen, we saw a dog who was hesitant to jump on the table, an agility obstacle that requires the dog to stay on a platform for several seconds before moving on to the next obstacle. The dog in question was slow and reluctant to get on it and lie down. Jane shaped the dog in several steps to move towards, get on, and then lie down on the table, each time throwing the treat away from the table, which allowed the dog relief from the piece of equipment he found unpleasant. Soon, she had a dog eagerly offering the desired behavior.

Again, I was disappointed that she didn't demonstrate the use of negative reinforcement during the seminar, even though there was actually a training problem that could have been set up to take advantage of it. One of the working dogs was having trouble performing recalls in the presence of other people. Jane chose to have people stand in two rows, then instructed the handler to call the dog to run between them. I would have loved to see Jane use some negative reinforcement in this situation through the relief of social pressure by having the people move back when the dog responded. This could have been a powerful reward for this dog, but instead, the dog got a treat when she came.


Now, there's nothing wrong with this. The approach worked; the dog was able to improve her ability to come despite the presence of people, but it was stressful for the dog. Jane was okay with this; learning is stressful, she said. She's right, of course, but after my experiences with Maisy- a dog with clinical anxiety- I have far less tolerance for stress during training. As a result, I just wasn't comfortable watching the amount of stress the working dog endured- especially when Jane could have used negative reinforcement to relieve the dog's stress while still accomplishing her goal.

As these two examples demonstrate, most of what Jane showed us over the course of the weekend was straight positive-reinforcement-with-food. More to the point, Jane has a tendency to shape everything. At times, it felt like this was the only tool she had, but then, she's so good at it that she doesn't really need anything else.

I doubt that I will ever attain her level of skill in observation, setting criteria, and timing, but I am hopeful that having the chance to watch her will improve my own abilities. I look forward to telling you about some of what I learned... But it will have to be in the next post, as this one has gotten far too long already.

Tuesday, January 17, 2012

The Pleasure of Anticipation

Last spring, I wrote about how cues can be reinforcing for dogs. If the cue predicts a good outcome (a click and treat, for example), then the dog will find the cue exciting. More talented trainers than I have taken advantage of that by reinforcing a dog’s response with another cue.

Some readers met this with skepticism. Maybe my explanations made sense, maybe they didn’t, but let’s be honest: logic and anecdotes alone are not always convincing. That’s fine; I don’t expect everyone to agree with me, and in fact, I would find that rather boring. But when one of those skeptics found this hour-long lecture, she remembered my post and emailed me.

The lecture, given by neurobiologist Robert Sapolsky, explored what makes humans unique. His entire talk is fabulous, and I urge you to watch the entire thing. Personally, I really enjoyed his discussion of how language affects our perceptions of others because of the insular cortex, but what’s relevant today is what he shares about dopamine (starts about 30 minutes in).

Throw it... throooow iiiiiitttttttt.....
Dopamine is a neurotransmitter that helps control the brain’s pleasure and reward centers. For many years, it was believed that when someone (human or animal, it doesn’t matter- dopamine is present in all mammalian brains) received a reward, their brain would release dopamine. In turn, this would result in a pleasurable feeling.

However, when scientists actually studied what was going on, they found something very different. Sapolsky described an experiment in which chimps could receive a food reward if they pressed a lever when a light turned on. The dopamine levels in the chimps’ brains increased not when they completed the task, but rather when the light went on.

In other words, what the chimps found pleasurable was the opportunity to receive a reward, not the reward itself. After they pressed the lever, their brains quit releasing dopamine, even before they received the reward. Anticipating the reward was better than the reward itself.

That light signified an opportunity to receive a reward; press the lever now, it said, and you will be reinforced. This is exactly what we do in dog training. I say “sit,” and if my dog does, she’ll get a treat. So the light was acting as a cue. The study Sapolsky cited says that it was the cue that made dopamine levels rise, which means that my dog will feel good when I say “sit,” not when I give her the treat. The cue is reinforcing.

I suspect that clickers work the same way, although Sapolsky didn’t address that directly. He did, however, say that dopamine is about the anticipation of the reward, not the reward itself. If cues can cause that anticipation, it seems that a sound could, too. Can a click cause dopamine levels to increase because the dog is now expecting to receive his reward? I don’t see why not.

What scientists found even more remarkable, however, was that when the food was given in response to the correct behavior only half the time, the chimps’ dopamine levels went through the roof. This wasn’t exactly surprising to me; dog trainers often talk about how a variable schedule of reinforcement creates stronger, more durable behaviors than when the dog gets a treat for every correct behavior. B.F. Skinner and his students proved that over and over again in the lab, although of course they couldn’t know that it was the result of dopamine. As Sapolsky put it, “maybe” is addictive like nothing else.

Finally, the scientists also found that if they blocked dopamine production in the chimps’ brains, when the light came on, the chimps didn’t care. Instead of eagerly pressing the lever, they sort of shrugged it off. The chimps knew they’d get a reward if they did, but they just didn’t seem to care. Could this be a possible explanation for why a dog doesn’t respond to a cue? Maybe. But I'd point out that there are many, many other reasons dogs don’t perform a behavior, and most of them are probably more logical. Still, it is fun to think about.

I found all of this really interesting. Not only did it support the concept of cues being reinforcing- something I find pretty fascinating in and of itself- but it also suggests that there is more at play in clicker training than just the food. In fact, it would seem that anticipation is what's truly powerful, an idea I find amusing since trainers often get upset when their dogs anticipate what's coming next.

To be fair, having the dog act before we ask them to can be a problem. Still, is that indicative of a corresponding spike in dopamine? And if so... how can we use this to our advantage? What can we do to harness our dog's natural brain chemistry to create a more favorable training outcome? I'll admit, I don't have an answer here, so I turn it over to you: have you ever used the power of anticipation to your advantage? And if so, how?

Friday, August 26, 2011

Ken Ramirez Seminar: Non-Food Reinforcers

 
One of the big objections people have to clicker training is “all that food.” They always want to know when they can stop using it, an attitude that used to baffle me. I mean, I get that people who are active in dog sports need their dog to perform many behaviors for a single treat, but when there are no rules, what's the problem? It's not hard to stick a handful of kibble in your pocket, after all.

Well. Leave it to Ken to not only be entertaining, but to also convince me that non-food reinforcers are both valuable and necessary (mostly because it is much easier to perform husbandry behaviors on a sick animal who is refusing to eat when you have a non-food reinforcer available). He also presented a very thorough method for creating non-food reinforcers, and gave us some tips on how we should and shouldn't use them once they've been established.

Let's start at the beginning: what are non-food reinforcers? Well, obviously, they're not food, but Ken was a bit more scientific than that. When Ken talked about reinforcers, he broke them down into two categories: primary reinforcers and secondary reinforcers. A primary reinforcer is something that is inherently reinforcing; the animal doesn't need to have any experience with it to understand that it is a good thing. Typically, these reinforcers satisfy biological needs, and food is the ultimate primary reinforcer (that's why it is so useful in training). By contrast, secondary reinforcers are something the animal needs to learn is desirable.

Despite the fact that secondary reinforcers are learned, Ken made the point that secondary reinforcers can be very, very powerful. In fact, they can sometimes be more powerful than primaries because of what they represent. For example, money is a secondary reinforcer- the paper itself has no inherent value. However, society teaches us that money is a desirable thing because of what it can buy, and this association is so strong that humans will do some very boring or unpleasant tasks in order to obtain it. In fact, we are more likely to take a job that pays money than one that provides food and shelter.

(As a side note, play can be looked at as both a primary and a secondary reinforcer. Often the act of playing- running or chasing, for example- is innate, making it a primary reinforcer. However, the objects used in play, like balls or tug toys, are secondaries because the dog needs to learn what they are used for. A ball that is not thrown is neither interesting nor reinforcing to most dogs.)

Another way to think of secondary reinforcers is as a “reinforcement substitute,” which emphasizes the fact that secondary reinforcers only become powerful through conditioning. Ken is very, very systematic in the way that he creates secondary reinforcers. His approach is so thorough and slow, in fact, that I suspect some readers will be turned off by it. This is partly because he's found that the more time you spend conditioning them, the more powerful they will be, but also because he believes that if you use secondary reinforcers improperly, it can lead to a lot of frustration. Since frustration is sometimes inherent in training, he tries to minimize it whenever possible, something which is both kind to the animal and practical when working with wild animals who are less tolerant of human mistakes than the domesticated dog.

Ken creates secondary reinforcers in almost exactly the same way he trains a behavior. He starts by choosing a stimulus to act as a reinforcer. This stimulus should be one that is useful- that is, it is easily accessible, and not overly cumbersome to implement. He also thinks it works well to use something that is novel to the animal; choosing something the animal has habituated to and now ignores is going to make things much more difficult. One of his favorite secondary reinforcers is clapping.

Then he does straight-up classical conditioning: he presents the chosen stimulus, and then immediately follows it with a primary reinforcer. So, he claps, and hands over a bit of food. Clap, food, clap, food, until the animal seems to understand that the clapping predicts the food. This shouldn't take long at all unless the animal finds the stimulus aversive (if the animal is sound-sensitive, for example), or if your primary isn't that exciting.

Then he asks the animal for an easy, well-established behavior. This is something the animal already knows well, and has a very strong reinforcement history for. In dogs, the behavior of sit is often a good choice. When the animal does the behavior, the trainer will present the new stimulus, and then give a primary. For example, the dog sits, the trainer claps, and then he gives a treat. Ken will do this daily for several weeks, although the length of time will vary based on the animal, his relationship to the trainer, and his past reinforcement history.

The next step is to cue the same easy, well-established behavior, and then reward with only the new stimulus. Here, it is truly acting like a reinforcement substitute, as the dog will sit and receive only clapping as his reward. Ken will do this a maximum of three times during a training session, and he'll spread it out so that the animal is also getting primary reinforcers for other correct responses in between. Again, he'll stay at this step for several weeks.

This cycle repeats, except now Ken will cue a harder behavior, though it should still be well-established. When the animal responds, he'll give the new stimulus, and follow it by a primary. So, he'll cue, for example, a roll over or a stay, clap, and then give a treat. He stays at this level for several weeks before cuing the harder behavior and using the stimulus as a reinforcement substitute. Again, he continues doing this for several weeks.

Once this process has been completed, you're ready to use your new reinforcement substitute in training... but Ken has a few rules before you do. The most important is the 80/20 rule, which is actually more of a guideline, but basically, he says that you should use primaries approximately 80% of the time, and secondaries approximately 20% of the time. He never uses the same secondary reinforcer twice in a row, although he might use two separate secondaries in a row. He always treats the new secondary as a behavior and occasionally “recharges” it so that it retains its strength. Finally, he recommended that novice trainers use secondaries only to maintain existing behaviors, and not to create new ones.
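
And for the data nerds, here is a minimal sketch in Python, with made-up reinforcer names, of a selection helper that follows those guidelines: primaries roughly 80% of the time, secondaries roughly 20%, and never the same secondary twice in a row.

```python
import random

PRIMARIES = ["food"]                                   # inherently reinforcing
SECONDARIES = ["clap", "tongue scratch", "go sniff"]   # conditioned substitutes (illustrative)

def pick_reinforcer(last_secondary=None):
    """Pick a reinforcer for a correct response, honoring the 80/20 guideline."""
    if random.random() < 0.8:                          # ~80% of the time, use a primary
        return random.choice(PRIMARIES), None          # a primary resets the "in a row" tracking
    options = [s for s in SECONDARIES if s != last_secondary]
    choice = random.choice(options)                    # never the same secondary twice in a row
    return choice, choice

# Example: reinforce twenty correct responses in a session.
last = None
for _ in range(20):
    reinforcer, last = pick_reinforcer(last)
    print(reinforcer)
```

Obviously no one trains with a script in hand; the sketch is only meant to show how mechanical the guideline really is once the secondaries have been properly conditioned.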

Yes, this is a very regimented way of moving away from food reinforcers, but Ken has a very convincing story to support the importance of being so systematic. It involves a new trainer trying to use tongue scratches to reward a killer whale, with a very poor result. I won't spoil the story for you (and Ken tells it so much better than me anyway), but trust me when I say that I completely understand why Ken is so thorough. His advice is that we never take any reinforcer for granted, and to work to build up as many reinforcers, both primary and secondary, as possible.

If you've found any of this even remotely interesting, you totally need to see Ken speak about it. Imagine this information, only peppered with incredibly funny and informative stories about dolphins, seals, penguins, and yes, even some dogs. He's also got illustrative videos, and I always enjoy seeing familiar concepts used with exotic creatures.

What about you guys? Has anyone used such an in-depth process to create secondaries? Do you see value in doing it with your dogs? What types of non-food reinforcers do you use... or would you like to start using? I'd love to hear your thoughts!

Tuesday, July 5, 2011

Hard Work Remains

Maisy is doing great these days. It is wonderful to be able to take her places and do things with her without having to worry that she's going to have a reactive outburst. It's nice to be able to relax and enjoy our daily walk. It's absolutely amazing that she can handle just sitting outside a pet store, watching the other dogs come and go.

Maisy with Fritz and my super awesome friend, Megan.

And yet... I've come to the conclusion that we are possibly at the most critical part of her rehabilitation. Instead of enjoying the fruits of our labor, now is the time to redouble our efforts and work even harder. This realization hit me pretty hard, but I have two main reasons for believing it's true.

First, Maisy is inarguably awesome these days. In fact, she's so awesome that it is easy to forget that she has issues. It is easy to believe they are in the past, and as a result, to take her good behavior for granted. You wouldn't think that after only six or seven months I'd forget where we started, but when you see something day in and day out, it's easy to get used to. Why else would the difference between these videos surprise me so much?

At any rate, it is easy to forget that this new state of being is still tentative. Although Maisy's had several months of excellent behavior, she has several years of lunging, barking and growling to overcome. It's going to take time to turn these new neural pathways from sparsely populated jungle trails into well-traveled interstate highways. I need to make sure I'm reinforcing Maisy's good choices every chance I get instead of relaxing and going easy on the cookies.

Second, it's much harder to tell when Maisy is feeling anxious now. This is because the way she expresses stress has changed- gone are the days of impossible-to-miss, over-the-top reactive displays. This is great, of course, but it also means that I must pay much closer attention to her body language now. When your dog is lunging and barking at the end of her leash, it's obvious that she's not feeling so great. But when you have to rely on subtle indicators such as a change in muscle tension, breathing patterns, or tailset... well, it requires a lot more effort.

This effort is critical, though. If I can't see these much smaller signs of stress, I won't be able to adjust the environment for her. I won't be able to change my expectations, nor will I be able to increase the rate of reinforcement for the good choices she makes in order to really help cement those new roadways we're building. In short, I will be setting her up for failure instead of success.

So what's the new plan of action? Well, there isn't one, not really. Instead, I'm just holding the course, doing all the same desensitization and counter-conditioning and Control Unleashed techniques that I've done all along. The big thing right now is being mentally present with my dog. I need to remain in tune with how she's feeling, and what she's telling me. I need to continually remind myself that no matter how well she's doing, hard work remains. And I am confident that if I do, good things await us.

Thursday, April 14, 2011

Musings on Why Cues May Not Be Reinforcing

In my last post, I tried to explain why cues can be reinforcers, mostly because I think it's mind-blowingly awesome: how great is it that I can take a legal reward in the ring? Or that I'll always have something for those times that I run out of treats? The concept is awesome, but it's also confusing. One of the cardinal rules of reinforcement is that the receiver gets to decide if something is reinforcing, so just because a cue can be reinforcing doesn't mean it will be.

Of course, this raises the question: why not? What happens to prevent a cue from becoming reinforcing? At Clicker Expo, Kathy Sdao identified one reason: the use of punishment in training. This absolutely makes sense to me. If the reason a cue becomes reinforcing is because the dog sees the cue as an opportunity to earn reinforcement, then it makes perfect sense that punishment would interfere with that. But I don't train with punishment, and not all of my cues are reinforcing. What gives? Am I doing something wrong, or is there something else at work here?

In search of answers, I went right to the source: Kathy Sdao herself. Last week, I had a twenty minute phone conversation with her, and the ultimate answer was: we don't really know. But we came up with some pretty good ideas. Now mind you, there really isn't a lot of science on it yet, so what follows should not be taken as certain truth. These are theories, so if (when) I write something illogical or silly, don't blame it on Kathy! She was so incredibly nice and generous with her time that I'd hate for someone to attribute one of my dumb ideas to her.

Before we go any further, I need to pick on myself first. The reason some of my cues aren't reinforcing is probably because Maisy just doesn't understand them. I'm really not very good at teaching cues (I find it terribly boring), so Maisy seems to guess which behavior I'm asking for most of the time. She's a pretty good guesser, mind you, but it is clear to me that she has trouble discriminating between even simple things like “sit” and “down.” All that uncertainty no doubt colors her perception of the cues.

But what about other dogs? You know- the ones who are lucky enough to have a good handler and have been trained to understand what cues mean and respond reliably. Why is it that a cue may not be reinforcing for a dog who's been trained with positive methods? This is the question I explored with Kathy, and the one I want to write about today.

First, it is entirely possible that the trainer only thinks she's training positively. Now, I'm not talking about trainers who are suffering from some misconception about what positive training entails. I'm talking about trainers who understand learning theory and know how to use clickers and treats, yet unintentionally do something that the dog finds aversive. Maybe she intimidates the dog by leaning over him unconsciously, or maybe her voice has just an edge of frustration to it. There are many ways our body language can affect the way our dogs respond, and I doubt most of us recognize that we're even doing it!

And then there is luring. Could luring interfere with the end result? Don't get me wrong- I use luring when it makes sense- but as Kathy put it, there is a “continuum of coercion” at play when we lure behaviors. She didn't use the word coercion to imply that it's evil or wrong; instead she defines it as the dog having no choice in how to respond. Isn't it possible that the dog is so into the lure that he isn't really thinking about what he's doing? And by extension, if he's not thinking, can he really be making a choice about what to do? Instinctively following the lure may mean that the dog is being coerced, no matter how nicely. If so, it's possible that the cue for the lured behavior won't be reinforcing. Of course, since it's a continuum, it's equally possible that the dog is thinking, that dog does have a choice, and that the cue could end up being reinforcing as a result. But it's interesting to consider that even methods that are widely considered "positive" could have a downside.

Speaking of how things are taught, Kathy and I talked about the idea that something could happen in the acquisition stage of the behavior that would affect the eventual cue. This could be something as simple (and as common!) as the trainer's criteria being too high or the rate of reinforcement too low- to the point that the dog felt frustrated with the behavior, or was left with some lingering confusion. Or maybe the trainer's timing is so poor that the dog can't quite figure out what he's supposed to be doing? Couldn't this affect a dog's perception of the behavior, and ultimately, how he feels about the cue? It might... it might not... it would depend on the dog, of course, but Kathy said it seemed plausible.

Speaking of early stages of learning, maybe something went so wrong that it forever tainted the dog's view of that behavior. I'm specifically thinking of single-event learning, wherein a dog forms a long-lasting and negative association with something in just one trial. If a dog inadvertently received punishment from the environment while learning a new behavior- for example, a socially shy dog learning something in an overwhelming group class- it seems possible that the dog could associate those initial bad feelings with the behavior. It seems to me that this might ultimately affect the cue as well.

Although this seems more likely during the critical early stages of learning, I imagine that single-event learning could also happen later on, too. Perhaps a loud clap of thunder happened during a training session with a storm-phobic dog, or there was a lot of static electricity on a given day, resulting in accidental little shocks each time the trainer fed the dog a treat. Depending on how aversive the dog found this, couldn't this affect the behavior (and the cue) too?

But let's put aside blame for a minute and pretend the trainer did everything perfectly. This hypothetical trainer is a brilliant shaper who is great at adjusting criteria so that the dog receives a high rate of reinforcement with minimal frustration, does a great job of taking environmental factors into consideration, and is very conscious about her body language so that it doesn't interfere with her training. Is it possible her dog wouldn't find cues reinforcing?

I think so, if the behavior was “self-punishing.” This would be the opposite of self-reinforcing behaviors, where the dog gets some kind of internal relief or inherent joy from engaging in a behavior (typically one the trainer doesn't want). A self-punishing behavior would be one that is painful or unpleasant to perform. It might happen when a dog has some kind of physical condition like arthritis or hip dysplasia that makes it physically painful to do something. Or it might happen when the behavior itself is scary. For example, Maisy doesn't like to step on things that move, like wobble boards, and yet she will do it over and over again, simply because I am asking her to (and if I'm honest, because there are treats involved). I have no idea why she persists- it's clear that it scares her each and every time- but she continues to do it anyway. For her, the cue “step up” will probably never be reinforcing.

Finally, I would be remiss if I didn't mention a plain old lack of reinforcement. Cues become reinforcers because of conditioning- they've been frequently paired with treats or other awesome things. If there aren't enough treats following the behavior to make it exciting, then the effect certainly won't carry over to the cue. This doesn't mean that the behavior needs to be followed by a treat every time- Kathy pointed out that intermittent reinforcement is stronger than continuous reinforcement, and she thinks this would apply to cues as well- but the odds do need to work in the dog's favor.

Anyway, those are just some of my musings about why cues may not act as reinforcers. Are they right? Maybe. Maybe not. I don't know- like I said, they're just ideas we kicked around. And now it's your turn! Poke holes in my theories. Point out the stuff I'm not taking into account. Let me know how your experiences stack up against my speculation. If you've seen it, share some research on the topic with me (I know some exists about “poisoned cues”). Or share your own brilliant ideas. Maybe you've got a theory of your own! I'd love to hear any and all of it!

Tuesday, April 12, 2011

Why Cues Can Be Reinforcing

Last week, while summarizing Kathy Sdao's session at Clicker Expo on cues, I wrote that cues can be used as reinforcers. I was pleased that several of you commented with examples of how you do this, and was intrigued that most of your cues-as-reinforcers were for behaviors the dog really enjoys, like retrieving or sniffing. Of course, this raises the question: if the behavior is inherently pleasurable, is it really the cue that's reinforcing?

 The cue "go get it" is exciting, sure, but not because I conditioned it that way.

Well, yes. The cue allows the behavior to happen. Ask any newly-licensed sixteen-year-old: permission to borrow the car is incredibly reinforcing. I use cues this way all the time. When Maisy and I are out, if she walks nicely instead of dragging me towards a smelly fire hydrant, I'll give her the “go sniff” cue as a reward. Of course, that behavior- the sniffing- is not something I taught her to enjoy, and the cue is simply letting her do something she wants anyway... but it's still reinforcing.

Kathy was talking about so much more, though. She said that any cue, for any behavior, can be reinforcing. This is a powerful concept because if we can make the cue for an obedience exercise reinforcing, we will be able to reward our dogs in the ring. Of course, that means the cue for a rather innocuous behavior, like sit or down, must be reinforcing, which is a bit harder to think about. I mean, sure, dogs like to sniff, but what's so exciting about sitting?

The answer is that it's exciting because the trainer has taught the dog to find it exciting. This is done by repeatedly pairing the sitting action with a great outcome, like treats. As a result, if the trainer does her job well, the dog will become quite excited to sit. Karen Pryor talks about this in her book Reaching the Animal Mind, and calls the cue “a promise of happy outcomes.” She says the cue tells the dog “if you understand what I'm saying, and you carry it out correctly, you will definitely win” (page 35). In this way, she says, the cue becomes another kind of conditioned reinforcer.

Kathy called it a “tertiary reinforcer,” because it's the third one out. The cue precedes the click, which precedes the treat. Of course, the cue isn't as strong as a click. While a click always results in a treat (or at least, it should), a behavior may not always result in a click. We don't click good tries, after all; we only click behaviors which meet our criteria. The click is contingent upon the dog performing the behavior correctly, so the cue offers an opportunity rather than a guarantee... and opportunities can be reinforcing, too.

Let me explain what I mean by expanding upon Kathy's metaphor of traffic lights. A green traffic light is a cue for someone to drive. It tells the driver that pressing the gas pedal is allowed now. For people who like to drive, this will be reinforcing in and of itself because the green light is a cue that gives permission to do something they find pleasurable.

For some of us, though, it's not so simple. Although I don't hate driving, I don't find the action all that thrilling, either. Still, it gets me places I want to go, so despite my ambivalence towards the act itself, I like to drive because of the results. This works with our dogs, too. They may not find sitting all that exciting, but it gets them a tasty treat or a toss of the tennis ball. If this happens enough times, it can transform a relatively mundane action into something associated with a good outcome.

The cue also becomes associated with that good outcome. I have had a long history of driving after seeing a green light, and then ending up at my destination. This has happened often enough that now I want to see the green light, not because it will let me drive, but because it predicts that I will end up at my desired location. Similarly, our dogs get excited when they hear our cue, not because they want to sit, but because they want the treat that they think will come when they do. Thus, my green light- and my dog's cue- becomes reinforcing because of the opportunity it signals.

Of course, things can go wrong. I might get in an accident, and the dog might respond to his cue incorrectly. In both cases, neither of us will get what we want. If this happened too often, neither of us would find our respective cues very exciting anymore. Thankfully, these occurrences are rare- or at least they should be. Just as I should be able to safely drive my car before I get in it, the dog should be able to perform the behavior before the trainer gives the cue. This is why Kathy had three entire sessions at Clicker Expo on how and when to attach cues to behaviors.

As fascinating as all this is, I really think it's only half the story. If we want to use cues to their fullest potential by using them as reinforcers, I think it's important to understand why cues sometimes fail to be reinforcing. Because it's true: while cues can be reinforcing, that doesn't mean they will be. Kathy identified one reason (the use of punishment in training) at Clicker Expo, but I think it's more complicated than that. In my next post, I'll share some of my ideas with you.

In the meantime, though, I'd love to hear what you guys think. Does my explanation make sense? Do you have any reinforcing cues that were conditioned instead of taking advantage of your dog's innate loves? How and when do you use them? Let me know!

Tuesday, April 5, 2011

Clicker Expo 2011 (Chicago): Kathy Sdao- What a Cue Can Do, Part 1

The fabulous Sara and Layla demonstrate "breathe."
Believe it or not, Sara actually taught Layla to take a deep breath when cued. 

Kathy’s session on developing cueing skills changed my life. Okay, maybe that’s a little dramatic, but it was possibly the best session I attended all weekend. I have to admit, I suck at getting behaviors on cue, so I knew going into it that this would be full of great information. Even so, I was not expecting to feel so devastated about my subpar skills. In a panic, I begged my way into her lab on cueing skills so I could see it in action. I'm glad I did!

In my last entry, we talked about how to get behaviors. That is the first job of a trainer, and despite my feelings to the contrary, Kathy said it is the most difficult part. Personally, I think getting the behavior itself is far more fun than the tedious process of getting the behavior on cue. This is probably why Maisy has virtually nothing on cue (at least not reliably). That’s not her fault, of course- as Kathy pointed out, the reliability of the dog reflects the reliability of the trainer.

Despite the not-fun-ness of it all, it’s important to develop reliable cues so that you can get the behavior you want, when you want it, and also so you aren’t getting the behavior when you don't want it. (This is also called "stimulus control.") Who hasn’t seen the clicker dog offering behaviors willy-nilly? It’s kind of exhausting to watch, and having one of those dogs myself, it is also kind of frustrating. Still, I created this monster, so I can’t get mad at her. It’s time to fix (um, okay, get) those cues!

Let’s start with some information on cues in general. Cues don’t make the dog do the behavior; astute readers will remember that in Kathy’s talk on shaping, she said consequences drive behavior. Those consequences- the reinforcers- provide the motivation for the dog to perform the behavior. Cues simply provide the clarity of “now would be a good time to try that behavior.”

Throughout her talk, Kathy compared the idea of cues to a green traffic light. If you’re sitting in a car at a green light, you don’t go because the light makes you, you go because you want to, and the green light is a cue that gives you permission. What’s more, you can’t go at a green light if you don’t know how to drive, so you always have to get the behavior first, and then attach the cue.

This is different from how people used to train. In the past, trainers would say a word like sit and then make the dog do it by physically manipulating or luring him into position. Kathy calls these words “commands,” which she distinguishes from “cues.” Commands carry an implicit threat: do it or I’ll make you. Cues are simply an opportunity; if the dog doesn't do the behavior, he won't be forced. However, he needs to do it if he wants to earn reinforcement, and assuming the dog has been adequately reinforced in the past, he should be excited for that opportunity.

It is this last point that absolutely fascinated me. If cues are an opportunity to earn reinforcement, then they should be pretty awesome things, right? Trained well, cues should be like release words. They tell the dog: you no longer have to wait, you can do that behavior now, and when you do, I’ll click and treat you.

Cues therefore become reinforcing in and of themselves- they act as what’s called a tertiary reinforcer. Primary reinforcers are things that the dog likes inherently; dogs don’t need to be taught to like hot dogs, they just do. Secondary reinforcers are things that predict a primary reinforcer: through the process of classical conditioning, clicks or marker words come to tell the dog that a piece of hot dog is on its way. The tertiary reinforcer predicts the secondary reinforcer, which predicts the primary reinforcer. The cue predicts a click, which predicts a piece of hot dog. Cool, huh?
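If it helps to see that chain laid out, here's a tiny toy sketch in Python- my own illustration, not anything Kathy showed, and the specific cue and treat are just placeholders:

```python
# Each link in the chain is valuable because of what it predicts.
chain = {
    "cue 'sit'": "click",    # tertiary reinforcer: predicts the click
    "click": "hot dog",      # secondary reinforcer: predicts the treat
    "hot dog": None,         # primary reinforcer: valuable all on its own
}

def why_is_it_reinforcing(item):
    """Trace the chain of predictions down to the primary reinforcer."""
    path = [item]
    while chain.get(item) is not None:
        item = chain[item]
        path.append(item)
    return " -> predicts -> ".join(path)

print(why_is_it_reinforcing("cue 'sit'"))
# cue 'sit' -> predicts -> click -> predicts -> hot dog
```

The point of the sketch is simply that the cue's value is borrowed, link by link, from the hot dog at the end of the chain.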

Anyone who wants their dog to do complex behaviors, or a chain of behaviors, will recognize the value in having a cue act as a reinforcer. Agility or obedience dogs often need to do a series of behaviors with no primary (food) reinforcer. Being able to reinforce a behavior simply by giving the cue for the next exercise or obstacle will help sustain motivation and ensure that the dog continues to perform over the long-term.

However, there is a catch- in order for a cue to act as a reinforcer, that cue must always predict a good thing. If you mix in corrections, the cue is sometimes associated with unpleasant things, and as a result it no longer acts as a reinforcer. Therefore, Kathy said that training positively is not about fairness to the dog- it’s about giving yourself more tools. Losing a reinforcer by using punishment is a big price to pay.

Whew! Who knew there was so much to say about cues? But there’s so much that Kathy talked about them for 90 minutes, and then ran two 90-minute labs about them! I only attended one of those labs, but in my next post, I’ll share her cue tips with you, as well as information about how to add a cue to a behavior. In the meantime, I can't wait to hear what you guys think. Have you ever used a cue as a reinforcer, or does this concept just blow your mind? Let me know!

Sunday, April 3, 2011

Clicker Expo 2011 (Chicago): Kathy Sdao- You're in Great Shape!

You guys, I loved, loved, loved Kathy Sdao. In fact, I loved her so much that I went to four of her sessions (the equivalent of a third of the weekend). I skipped sessions that I’d planned on seeing in order to attend her sessions. And I wanted more. Someone bring her to the Midwest, please; I’m dying for a working spot with her.

Why did I love Kathy Sdao so much? Well, she’s wicked smart and has had tons of practical experience, including training dolphins to do defense-related open-ocean work for the US Navy. How cool is that? Her lectures are just packed full of great information and fascinating stories. She’s energetic and engaging and enthusiastic and entertaining! Go see her… and take me with you.

Okay, enough gushing. Dog training is all about changing behaviors, right? The thing is, though, we can’t manipulate the behavior directly- the behavior belongs to the dog. So what we need to do is manipulate what happens both before and after the behavior.

There are many ways to get a behavior to happen from the front end, everything from physically prompting or luring the dog to capturing or shaping the behavior. The method you choose is up to you, and Kathy said that you should understand the science so that you can choose the method for yourself instead of letting someone else decide for you. However, no matter how you decide to train, remember that the fundamental law of behavior is that consequences drive behavior. No matter what you do on the front end, your power as a trainer comes from being a reinforcer, not a commander.

In fact, that’s so important that I’m going to repeat it: Your power as a trainer comes from being a reinforcer, not a commander. If you are labeling your dog in some way, be careful! Labels like “distractible” and “stubborn” say more about you than about your dog. Specifically, they say that you don’t reinforce him enough! Awesome dogs come from awesome reinforcement histories.

Although there are lots of ways to get the behavior, as the title implies, this session was about shaping. Kathy defined shaping as “teaching new behaviors by use of differential reinforcement, systematically reinforcing successive approximations toward the goal behavior.” Or, to put it simply, teaching a dog to do something one small step at a time.

Kathy demonstrates shaping.

Shaping can take place through either reinforcement or punishment, but it should go without saying that Kathy focused on the reinforcement side. Click, then treat. You don’t need to punish wrong responses, you don’t even need to mark them with a No Reward Marker. The opposite of reinforcement is not punishment, after all- the dichotomy is reinforcement versus no reinforcement. You’re either clicking the dog for getting it right, or you’re not clicking.

So, why shape? Well, of course it’s based on the science of operant conditioning, which means that it puts consequences to work in your favor. But it also makes the dog an active participant- there are two of you present, so why not use both brains? What’s more, it gives the dog a sense of control, and can help him make sense of a seemingly random and punishing world. It allows him to find a way to make the world work in his favor, thus creating a sense of internal motivation and desire to interact with you. And of course, sometimes it’s the only way to get a complex behavior the dog wouldn’t offer otherwise.

Once you’ve decided to shape a behavior, there are a few prerequisites. First and most importantly, you need to know what your goal behavior is. Be specific- does “come” simply mean “move towards me,” or does it imply that the dog will come as soon as you call, quickly and directly? And what should he do when he arrives- stick around or dash off again? Know what you want.

Once you know what the completed behavior is, you need to know what the first criterion is. What sliver of behavior will you be clicking for in the first 60 seconds? That sixty seconds thing? Yeah, she means it. Your job as a trainer is to choose the criteria, do a brief bit of training, click when it happens (and withhold the click when it doesn’t), and then stop and ask how it went. In fact, that was sort of the refrain throughout the weekend. Many of the presenters I saw recommended working in really short bursts with lots of thinking time in between.

What are you thinking about? Well, you need to assess your dog’s progress and then plan the next 60 seconds. You’re trying to figure out what your criteria will be the next time around. Kathy recommended making the job easier when your dog is getting it wrong half the time, or if he’s getting fewer than 8 clicks and treats a minute. That’s one click every 8 to 9 seconds- not much time! But it helps underscore just how small your steps should be during shaping. If your dog is doing well- getting 8 to 11 clicks per minute and getting things right somewhere between 60% and 80% of the time- you’re on the right track. Continue working on that criterion. And if your dog is getting it right more than 80% of the time, or is getting clicked every 4 to 5 seconds, it’s time to move on to the next step!

Incidentally, if you’re working on a duration behavior and can’t get in enough repetitions to achieve a high rate of reinforcement, you should increase your density of reinforcement so that the dog is getting the same number of treats- 12 to 15 per minute’s worth of behavior. Deliver them however you want- all at once or one at a time- but make it worth the dog’s while to do that behavior.
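For anyone who, like me, has trouble holding those numbers in her head mid-session, here's how I'd scribble the last two paragraphs down as a little Python sketch. The function names, the exact cutoffs, and the way I've combined the rules are my own shorthand, not Kathy's- treat it as a memory aid, not a formula:

```python
def next_step(clicks_per_minute, success_rate):
    """A rough memory aid for the rates described above; the boundaries are
    fuzzy judgment calls, not hard rules.

    clicks_per_minute: clicks earned during a roughly 60-second burst.
    success_rate: fraction of attempts that met criteria (0.0 to 1.0).
    """
    if clicks_per_minute < 8 or success_rate <= 0.5:
        return "make it easier (lower your criteria)"
    if clicks_per_minute > 11 or success_rate > 0.8:
        return "raise your criteria (move to the next step)"
    return "stay at this criterion a little longer"

def treats_for_duration(seconds_of_behavior, per_minute=12):
    """For duration behaviors, keep the density at roughly 12-15 treats per
    minute's worth of behavior, delivered however you like."""
    return round(per_minute * seconds_of_behavior / 60)

print(next_step(clicks_per_minute=6, success_rate=0.4))    # make it easier
print(next_step(clicks_per_minute=10, success_rate=0.7))   # stay here
print(next_step(clicks_per_minute=14, success_rate=0.9))   # raise criteria
print(treats_for_duration(seconds_of_behavior=30))         # about 6 treats
```

In other words: too few clicks, make it easier; clicks coming every few seconds, make it harder.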

Kathy also covered Karen Pryor’s ten laws of shaping, paying special attention to number 5 (stay ahead of your subject by having a plan) and number 8 (don’t interrupt the session gratuitously, which includes talking- don’t be more distracted than you expect your dog to be!). She also noted that Karen herself says that number 3 (put the current response on a variable ratio schedule of reinforcement before raising criteria) isn’t necessary with dogs, which I thought was fascinating- and a relief, since I’ve never done it!

Remember that when you’re shaping, your main job is to sit back and watch your dog. You should not be doing the moving, your dog should. If your dog stalls out and gets stuck, your response should not be to help him out. Instead, have faith in your reinforcement history. He can figure it out. If need be, go ahead and reduce your criteria, but resist the impulse to help him out by pointing or luring him.

And that’s shaping in a nutshell. I feel like I glossed over so much- her sessions really are jam-packed with great information- but this should give you a good idea of what she covered. Even though I shape a lot (I find it addictive, even if I’ll never use the resulting behavior), I still took a lot away from this session.

I have to admit, I’m not very good at the planning/assessing piece. I just sort of… sit down and click. I usually don’t end up where I’m planning to go, either. That’s fine for casual sessions, but it impedes my progress on tricks and competition behaviors. But between this session and Kathy’s presentation on cueing skills, I think I’ve figured out what I need to do! In fact, Kathy’s cue seminar was amazing, and I cannot wait to share it with you.

Tuesday, February 8, 2011

Is it ever necessary to use pain or fear in training?

One of the things that I love about blogging is its interactive nature. My last post spawned a great conversation about training methods, the “right” way to train, and the role of punishment in dog training. The thread really made me think about my training philosophy, so much so that I felt the discussion deserved its own post.

Let me start by saying that I think it's impossible to use 100% positive reinforcement. Personally, I probably use approximately 85% positive reinforcement, 10% negative punishment, and 5% positive punishment. (If you're not familiar with these terms, this link does a nice job of discussing them.) This makes me a decidedly lopsided trainer, and while I do use primarily clicks and treats, I'm not afraid to use the occasional “correction.” My “corrections” tend to be pretty mild because of who Maisy is- sensitive and anxious- but also because I believe that it is possible to train without using pain or fear.

Does this mean it's wrong if someone trains differently than me? No, of course not. If a given training method is getting the desired results, if it's fair and consistent, and if it's improving your relationship with your dog, then who am I to say that it's wrong? However, I stand by my statement: I believe you can train a dog without using pain or fear.

This does not mean that I think training should be one-size-fits-all. Frankly, I think that's impossible. Every person, dog and situation is different, and as such, needs a different approach. Further, because the dog defines what's reinforcing and what's punishing, it's impossible to make blanket statements about whether or not a technique is acceptable. For example, some dogs find being sprayed in the face with water incredibly aversive. Maisy happens to love it. If I were to try to stop her from doing something by using a squirt bottle, it probably wouldn't be very effective.

The cool thing about positive training is that there are many different ways to teach the same behavior. If I want my dog to go lie on a mat, I can shape her to do that by clicking small movements in the right direction, I can lure her to do it by tossing a cookie on the mat, or I can capture it by waiting for her to go to the mat herself. More than that, though, I can also manipulate the environment to make it more likely that she'll go to the mat by placing it in the hallway and waiting for her to walk down it, or putting it in her favorite napping spot, or putting it in her crate... the possibilities are limited only by your creativity.

I definitely believe that teaching new behaviors can be done with almost completely positive reinforcement. However, stopping an already existing and unwanted behavior is much more difficult. Personally, I call this type of training “behavior modification” but perhaps that's splitting hairs. Regardless, my approach is to try positive methods first, and if those fail, use punishment in a thoughtful, fair, and consistent manner. The punishment used should be the least invasive and minimally aversive option possible.

I think we can find pain- and fear-free punishments that will stop a behavior; however, I will concede that there are situations where we might not have the time to figure out how to do that. Life-and-death situations like a dog trying to eat something poisonous or running towards a busy highway are no time to futz with a clicker. You do what you have to do in that moment, even if it hurts or scares your dog. Of course, I don't think that's training so much as damage control, and once your dog is safe, you teach a stronger "leave it" or a better recall to prevent the situation from occurring again. It should go without saying that I would do that training without pain or fear.

This leads to the question of whether it's ever necessary to use pain or fear in training. Philosophically, I'd say no, but I will grudgingly admit that there are times where using pain or fear in training is the lesser of two evils. For example, if a dog is going to lose his home because he's barking too much while his owner is gone, a bark/shock collar is probably the better option. Do I like this? Of course not. In fact, it makes me really uncomfortable to say it, because I believe there's always a “positive” solution. Unfortunately, people don't always have the time, money or knowledge to find it.

All of which is to say that while I think some training methods are better than others, I recognize that those other methods not only work, but might also make sense in a given situation. I may not like those methods, I may think they're unnecessary, but I'm trying hard to avoid judging people who use them. I haven't walked in their shoes, and I don't know what they're up against. I will speak out against abuse when I see it, but the rest of the time, I will offer support or suggestions when appropriate. This only makes sense. After all, my goal is to be as positive with people as I am with my dog. By my own calculations, that requires offering them positive reinforcement 85% of the time!

So, let me take this opportunity to positively reinforce everyone who comments on my blog. I appreciate you all, even those of you who disagree with me. Perhaps especially those of you who disagree with me, because it forces me to examine what I've said. Sometimes it strengthens my convictions, and sometimes it causes me to rethink my position. Either way, I learn and grow as a person and a trainer, and for that, I am grateful.