Thursday, March 4, 2010

Clicker Theory: No Reward Markers and Keep Going Signals

I apologize in advance to any readers who are not familiar with clicker training, or who are just beginning to learn about the learning theory behind it, as today’s post concerns more sophisticated clicker concepts.

About a week ago, someone on a mailing list I belong to posed a very interesting question: If the click means “yes,” then what does no click mean?

The poster, a teacher, mentioned that when she has her students play the “clicker game” in class, they initially learn faster if they receive feedback for both yes, that’s what I want you to do, and no, you’re going in the wrong direction. In other words, a no reward marker. She went on to say that once her students understood the game, they learned that the absence of a click or verbal marker was basically the same thing as being told no. Once they figured that out, they could figure out the task just as quickly with only the positive marker.

She wondered: do our dogs understand the absence of a click the same way? Do they interpret silence as “no”? If so, why do they keep working in trial settings, where they receive neither clicks nor encouraging verbal feedback? Wouldn’t the silence inherent in a trial tell our dogs that they are doing it wrong? If so, this would have dire consequences for our performances.

The general response was that silence should not- cannot- imply that the dog made an error. Instead, we must teach our dogs that silence is a keep going signal- that they are on the right track, and that if they keep up with what they are doing, they will earn reinforcement. That is the only way that our trial performances will hold up.

So, if silence means “keep going,” then how do we tell our dogs they’re going off track? As Clicker Trainers, we don’t use corrections (defined here as anything that causes the dog pain or stress) to tell the dog they’re wrong. The logical response would be the use of a no reward marker- an emotionally neutral way of saying no, try something different… but some people on the list argued that this would actually slow learning down.

I disagreed. I shared with the group that when Maisy begins to get off track during a shaping session, I tell her “Nope! Try Again!” in a cheery voice. I wrote that I felt my dog learns faster this way, but that even if she doesn’t, it helps me feel better to be giving the feedback.

Still, in light of the conversation, I decided that I would test my theory, so I sat down with Maisy to work on a shaping project. First, I just worked with her like normal, not really thinking about what I say or when I say it. Although I did say “Nope! Try Again!” perhaps three or four times in the course of five minutes, I found that I said it more as conversation and less as information. Interestingly, I discovered that I was saying it at times when we were in the midst of a long period of silence. That “nope!” served to fill the silence until she finally got the click for doing what I wanted.

Next, I worked with her, but remained silent. I didn’t speak; I simply clicked or didn’t. Maisy continued on, doing well until we hit one of those long periods of silence. She kept trying things, but after about thirty seconds of neither a click nor a “nope!”, she lay down and looked at me as if she wasn’t sure what she was supposed to do.

Finally, I tried using the no reward marker more regularly. We continued shaping, but I tried to think in terms of right and wrong. I clicked when she got it right, and said “Nope!” when she got it wrong. This led to rapid-fire clicks and “nopes,” and after she got three “nopes” in the space of about ten seconds, Maisy again lay down with her chin on the floor. This time, though, I had to encourage her quite a bit to re-engage with the shaping game. But when she again got several more “nopes,” she lay down and refused to play any more.

I began to feel frustrated; this is not how it’s supposed to work! She’s supposed to want to play! My frustration came out in my voice, and I began to tell her to get up with an edgy tone. When she didn’t, my feelings of frustration gave way to anger. Since I didn’t want to take that out on her, I ended the session to evaluate what had just happened.

The first thing that I decided was that I was wrong: Maisy does not learn faster with a no reward marker. In fact, she gave up so quickly, and was so difficult to persuade to re-engage with the task, that I believe she found it punishing. True, she also gave up when the silence went on too long in the second scenario, but she worked approximately three times longer, and was much more willing to re-engage when I asked. As a result, I think she found the lack of any feedback confusing, but not aversive.

Still, I concluded that the complete lack of any kind of feedback was also not the best way to help Maisy learn. Instead, her learning is most efficient when she gets lots of reinforcement over a short period of time. This means my job is to break the shaping task at hand down into as many pieces as possible so it is easier for her to progress through each step of the task. However, sometimes it is difficult to figure out how to break a task down any further. As a result, if I cannot figure out how to make the task easier, and if it’s been fifteen to twenty seconds without a click, I need to give Maisy a “gimme” click- reverting to the previous level of criteria for a few moments before trying the higher criteria again.

I also suspect that my initial use of “nope!” wasn’t actually serving as a no reward marker. Given the way Maisy responded, I think it actually served the purpose of a keep going signal for her. This means that for tasks that haven’t had a sufficient amount of duration built in yet, she depends on verbal encouragement to know that she’s doing what I want. (Interestingly, though completely off topic, I haven’t been very good at building duration past 30 seconds or so, which was Maisy’s threshold for silence during these tests. It makes me wonder if my inability to build more duration in her behaviors is due to her threshold, or if she’s developed that threshold because I have neglected to put in the work necessary to build more duration. On second thought, I’m pretty sure I know the answer to that.)

Finally, and perhaps most importantly, I learned that I don’t like it when I have to tell Maisy she’s wrong. I became frustrated and then angry as she continued to fail, even though that “failure” was behaviorally no different than when we did silence only, or when I used the keep going signal. Maisy was going about the shaping task in the exact same way in each scenario. She wasn’t any more wrong when I told her she was than when I didn’t. In other words: focusing on the wrong behavior rather than the right one changed the way I viewed and felt about the training session, and it took all of the fun and joy out of playing the shaping game with Maisy.

In the end, doing this experiment not only taught me that my initial supposition was wrong, but it also reaffirmed my commitment to positive training. Focusing on what I want her to do helps Maisy learn faster, but it also makes us both feel better.


Kristen said...

Megan thought I would enjoy this, and I did. I love how you thought things through and experimented.

My comments: Shaping and competition are two different things. In shaping, the absence of the click indicates the dog needs to try something else (...essentially a 'no, not that'). But in competition, if we have prepared our dogs (and ourselves!), we are not shaping, and we are using our cues as markers/reinforcers for previous behaviors. There is no real pause or time where the dog should be thinking about other options. Heel. Sit. Stay. Front. Finish. Heel. Sit. Stand. Stay...... Heel. But that's another whole topic to think about!

Laura, Lance, and Vito said...

I still have a hard time wrapping my head around this theory. I'm not sure how dogs see the absence of a click but I think a lot of it does depend on the setting. In a shaping session I find it hard to believe that not hearing the click means they are on the right track. It is in some sense a no cue.

When I first started trialing with Lance a big mistake I made was to not work on heeling with long periods of silence. When in the ring I think Lance stressed and left me to heel by myself because he wasn't getting any feedback and wasn't sure if what he was doing was right. I have since worked on training like I trial- heeling with a tight smile and silent (at least when I'm not working on shaping something specific). Hopefully Lance is learning that silence IS a keep going signal. But this is still out of the context of a free shaping session which looks very different.

In shaping, Lance does not do well with a no reward marker. He shuts down very easily and hates to be wrong. I'm human so I still use some "oops" in a happy voice, but I think Lance sees it more of a keep trying cue. I try to wait the silences out but it's hard. Talking to Lance gives him encouragement to try again. But I can't make it seem like he's doing something wrong or he quits on me. I'm not sure how Vito reacts yet. He's also sensitive but I don't see the same worry in him that Lance has about being wrong.

Crystal said...

Kristen- I'm glad you enjoyed it! I agree that there is a vast difference between shaping a new behavior and performing a behavior which is (hopefully!) under stimulus control, but like you said, that's a whole different topic. I really do want to write about that more in the future- I am not great at building duration on behaviors, and that's what needs to be done for behaviors to hold up in competition. It would seem that, in the context of a behavior under stimulus control, the dog understands that silence simply means to continue the activity until cued otherwise.

I also like that you bring up using cues to reinforce behaviors. I've read about this, but the idea is still new to me. I'll need to test that out with Maisy some time.

Crystal said...

Laura- I also find this all hard to figure out. I don't think the absence of a click acts as a keep going signal during shaping, but that it might when the dog is performing a behavior under stimulus control. In that case, the absence of the click would simply indicate that the duration criterion has not yet been met.

I find your comments about Lance intriguing. I often hear people with clicker trained dogs say that their dogs hate to be wrong. What's more, I've never heard compulsive trainers say something similar about their dogs. Logically, this shouldn't be- if there is no correction applied, there should be nothing to worry about for the clicker dogs, whereas the compulsively trained dogs do receive a correction. Shouldn't they hate being wrong more? I know that punishment and reinforcement are defined by the receiver, so it makes me wonder what it is about operant dogs that makes them so hate that absence of a click. Just how do they perceive it?

I'm pretty sure I could go on and on about this... clicker training has far more depth than people give it credit for!

Megan said...

I haven't replied yet because I'm still thinking. I like your execution of the plan, and your results were what I was expecting... so I don't know why this is hard for me to process. Ha!

I would say that Bailey hates to be wrong too, but I'm itching when I say that because there are other reasons for it. Annelise notes it a lot, when it shows, but that means I need to A) train in more environments, B) train more behaviors to fluency before doing multiple repetitions in a not-so-familiar place, and C) help my dog understand that repeating a behavior is NOT a bad thing. In that same vein, I think compulsory trained dogs get "higher" whereas Bailey gets "lower" when she doesn't understand anything. I know people will disagree with me, but they are trained to see the absence of feedback as a positive. That seems to be a stronger response, in a dog with less training (taken here to mean, fewer behaviors to fluency), than a (seeming to be) typical clicker dog's response.

When I first started clicker training, I was skeptical, but followed the rules to a figurative "T." I don't usually talk to my dog during a shaping session (but I talk to Jane a lot when I do it at AGDN, so I don't know if that has an impact), other than to make mental notes out loud. I'm nowhere near perfect (just saying that makes me laugh!), but my shaping skills have improved drastically since I first started, and even more so in the last year. Shaping is about a high rate of reinforcement, to keep the dog working and actively thinking, and I'm getting to be okay with ending a session that isn't going well, not on a "good note" which used to strike me with horror! I'm also (getting) better at breaking the chains down further, while in a session. Neither of which I could have/would have done not too long ago.

I'm intrigued and fascinated by clicker training, but boy does it get complicated! All these things I have to do and remember... instead of just being able to pop my dog (said tongue in cheek, for anyone else reading this). I never showed (or trained, for that matter) either dog successfully before I "discovered" clicker training, so I'm impressed, to say the least. I'm impressed by my "crossover dogs" but even more so by the "completely positive puppies" I see. Man oh man can they DO things!

Crystal said...

So, Megan, when did you discover clicker training, and how had you trained your dogs before?

I took Maisy to a lure-reward class when she was a puppy, and only discovered shaping about a year ago. I really love shaping now, although I found it difficult at first. Like you said- it can be complicated! Getting the rate of reinforcement high enough and my criteria low enough has been difficult, but sometimes I think I struggle with not knowing when to raise criteria and when to drop back a bit.

Megan said...

I was taught to use jerk and pop training methods in 4-H when I started with Buzz, almost eleven years ago now. When I had no relationship with my dog and he was running AWAY from me given the chance, I needed something different. Annelise was the only positive trainer who actually followed through with positive, when I searched. I've been taking agility lessons from her for six years now I think (maybe seven, I should look into that). At the same time I bought Morgan Spector's book and read everything I could online.

I still haven't taken a "clicker class" or even an obedience class. I've done a lot of self teaching and have found a great sounding board. This does not mean I don't think I should be in classes, it means I can't find classes that meet my criteria.

Blah blah blah, so, I started shaping when I learned about clicker training six years ago, but it wasn't good!

Anonymous said...

Interesting post! I also think that for the dog, certain cues (e.g. silence) have different meanings depending on the context. Faith understands what we're doing when I pull out my clicker and bowl of kibble and look at her expectantly, and she starts throwing stuff at me till something gets a click. In that case, my failure to click means she's wrong, and she keeps trying. She also understands what we're doing when I tell her to heel (or whatever) and then remain silent. In that case, no click means nothing at all--or if anything that she's right.

Faith has a cue that I suspect is like your "Nope!" which is "Go on!" This means (to me and I think to Faith) "Keep trying!" If she is on the wrong track in a training game and gives up (stops offering whatever behavior), a "go on!" elicits a fresh attempt to earn a click.

I have used a no reward marker in the past ("whoops!") but for the most part felt that Faith did not learn from it and that it was dampening her enthusiasm. I do still use it in two cases I can think of offhand--first when Faith misses or falls off her contacts and also when she misses a weave pole entry. I am still not sure that it's useful in those contexts, though, but I also don't think it's hurting, either.

Crystal said...

Hi! Glad to see you here! :)

I think you make an interesting point about context... in fact, I plan to make a post about this tomorrow or Tuesday, the gist of it being that I think our dogs are smart enough to understand the difference between silence during shaping and silence as duration.

Denise said...

Hmm. Maybe it's better not to think in terms of wrong and right. It's just a click and treat or no treat. I would never allow a silent period of more than 10 seconds....I would get involved in some way. I might say a cheerful "good girl!" to keep the dog in the game, or I might move the dog to reset, or something...but no way I'd sit there staring at my dog staring at me. I also remember what Bob Bailey says....set up the situation so the dog is bound to stumble upon the answer quickly. Sometimes it takes a lot of thought to set up a problem so the dog will be right, but often if I take a break I can come up with an answer. I am quite sure if my puppy were without feedback for more than 5 or 10 seconds that she would quit and find something better to do.

Crystal Thompson said...

Denise, I don't disagree with a single thing you say!

This post is so old that I had to go back and reread it because I couldn't remember it anymore, and I was thinking "Really, 30 seconds with no click? What the heck was I thinking?!" You're absolutely right that even 10 seconds is a long time. No wonder Maisy quit on me. The game was no longer fun.

Sometimes I re-read my old posts and feel a bit embarrassed because I do things or think about things differently now, but what I really like about this post was my realization about how I felt (frustrated, angry) when I told Maisy she was wrong vs. how I felt when she was right.