I apologize in advance to any readers who are not familiar with clicker training, or who are just beginning to learn about the learning theory behind it, as today’s post concerns more sophisticated clicker concepts.
About a week ago, someone on a mailing list I belong to posed a very interesting question: If the click means “yes,” then what does no click mean?
The poster, a teacher, mentioned that when she has her students play the “clicker game” in class, they initially learn faster if they receive feedback for both “yes, that’s what I want you to do” and “no, you’re going in the wrong direction.” In other words, a no reward marker. She went on to say that once her students understood the game, they learned that the absence of a click or verbal marker was basically the same thing as being told no. Once they figured that out, they could figure out the task just as quickly with only the positive marker.
She wondered: do our dogs understand the absence of a click the same way? Do they interpret silence as “no”? If so, why do they keep working in trial settings, where they receive neither clicks nor encouraging verbal feedback? Wouldn’t the silence inherent in a trial tell our dogs that they are doing it wrong? If so, this would have dire consequences for our performances.
The general response was that silence should not- cannot- imply that the dog made an error. Instead, we must teach our dogs that silence is a keep going signal- that they are on the right track, and that if they keep up with what they are doing, they will earn reinforcement. That is the only way that our trial performances will hold up.
So, if silence means “keep going,” then how do we tell our dogs they’re going off track? As Clicker Trainers, we don’t use corrections (defined here as anything that causes the dog pain or stress) to tell the dog they’re wrong. The logical response would be the use of a no reward marker- an emotionally neutral way of saying no, try something different… but some people on the list argued that this would actually slow learning down.
I disagreed. I shared with the group that when Maisy begins to get off track during a shaping session, I tell her “Nope! Try Again!” in a cheery voice. I wrote that I felt my dog learns faster this way, but that even if she doesn’t, it helps me feel better to be giving the feedback.
Still, in light of the conversation, I decided that I would test my theory, so I sat down with Maisy to work on a shaping project. First, I just worked with her like normal, not really thinking about what I said or when I said it. Although I did say “Nope! Try Again!” perhaps three or four times in the course of five minutes, I found that I said it more as conversation and less as information. Interestingly, I discovered that I was saying it at times when we were in the midst of a long period of silence. That “nope!” served to fill the silence until she finally got the click for doing what I wanted.
Next, I worked with her, but remained silent. I didn’t speak; I simply clicked or didn’t. Maisy continued on, doing well until we hit one of those long periods of silence. She kept trying things, but after about thirty seconds of neither a click nor a “nope!”, she lay down and looked at me as if she wasn’t sure what she was supposed to do.
Finally, I tried using the no reward marker more regularly. We continued shaping, but I tried to think in terms of right and wrong. I clicked when she got it right, and said “Nope!” when she got it wrong. This led to rapid-fire clicks and “nopes,” and after she got three “nopes” in the space of about ten seconds, Maisy again lay down, this time with her chin on the floor, and I had to encourage her quite a bit to re-engage with the shaping game. But when she then got several more “nopes,” she lay down and refused to play anymore.
I began to feel frustrated; this is not how it’s supposed to work! She’s supposed to want to play! My frustration came out in my voice, and I began to tell her to get up with an edgy tone. When she didn’t, my feelings of frustration gave way to anger. Since I didn’t want to take that out on her, I ended the session to evaluate what had just happened.
The first thing that I decided was that I was wrong: Maisy does not learn faster with a no reward marker. In fact, she gave up so quickly, and was so difficult to persuade to re-engage with the task, that I believe she found it punishing. True, she also gave up when the silence went on too long in the second scenario, but she worked approximately three times longer, and was much more willing to re-engage when I asked. As a result, I think she found the lack of any feedback confusing, but not aversive.
Still, I concluded that the complete lack of any kind of feedback was also not the best way to help Maisy learn. Instead, her learning is most efficient when she gets lots of reinforcement over a short period of time. This means my job is to break the shaping task at hand down into as many pieces as possible so it is easier for her to progress through each step of the task. However, sometimes it is difficult to figure out how to break a task down any further. As a result, if I cannot figure out how to make the task easier, and if it’s been fifteen to twenty seconds without a click, I need to give Maisy a “gimme” click- reverting to the previous level of criteria for a few moments before trying the higher criteria again.
I also suspect that my initial use of “nope!” wasn’t actually serving as a no reward marker. Given the way Maisy responded, I think it actually served the purpose of a keep going signal for her. This means that for tasks that haven’t had a sufficient amount of duration built in yet, she depends on verbal encouragement to know that she’s doing what I want. (Interestingly, though completely off topic, I haven’t been very good at building duration past 30 seconds or so, which was Maisy’s threshold for silence during these tests. It makes me wonder if my inability to build more duration in her behaviors is due to her threshold, or if she’s developed that threshold because I have neglected to put in the work necessary to build more duration. On second thought, I’m pretty sure I know the answer to that.)
Finally, and perhaps most importantly, I learned that I don’t like it when I have to tell Maisy she’s wrong. I became frustrated and then angry as she continued to fail, even though that “failure” was behaviorally no different than when we did silence only, or when I used the keep going signal. Maisy was going about the shaping task in the exact same way in each scenario. She wasn’t any more wrong when I told her she was than when I didn’t. In other words: focusing on the wrong behavior rather than the right one changed the way I viewed and felt about the training session, and it took all of the fun and joy out of playing the shaping game with Maisy.
In the end, doing this experiment not only taught me that my initial supposition was wrong, but it also reaffirmed my commitment to positive training. Focusing on what I want her to do helps Maisy learn faster, but it also makes us both feel better.