My Unique Clicker Approach

by **Andromeda** » Fri Mar 15, 2013 3:16 pm

Michael wrote:I took a look at the article and see where it is coming from about VR not being necessary for the basic dog owner teaching tricks to their dog. There is a substantial difference when it comes to parrots that makes me disagree that VR isn't necessary for most pet owners. Parrots are naturally wild animals so except for the positive reinforcement taming and training that we give them, they are difficult to handle. In order to get our parrots to step up reliably, come out of the cage, flight recall to us, not bite us, etc we cannot keep giving treats every single time or the parrot will be stuffed with treats in the first 5 minutes it spends with us. The more good behavior that we can move to a diluted variable ratio reinforcement schedule, the more domesticated the parrot's behavior appears to be.

This is a good point: training a wild animal (such as a parrot) is very different from training a domesticated animal such as a dog. A dog is more likely to want to please its owner, whereas a parrot is not motivated to "people please"; unless it is hungry enough to be motivated by treats, it loses interest.

You definitely get a lot more out of a parrot when rewarding on a VR schedule. For example, if I am doing a session with my GCC and he gets full, he just flies away. I try to anticipate this and end the session before he loses interest, but sometimes I miscalculate and he just flies off. You're not really going to have that problem with a dog since it doesn't have wings, but good luck chasing down a flighted bird and trying to get it to train once it's full.

Michael wrote:As a side effect it has also taught my parrot to try harder when training new behavior. Sometimes the behavior I am soliciting in training something new just isn't click-worthy for a while. Since my parrot is used to clicks confirming all correct behaviors, it makes her realize she is doing things wrong and makes her keep trying to get it right. Better yet, since she is so accustomed to training in this manner, she even varies the behaviors she offers until it's right enough to earn a click.

I've experienced something like this when training my Poi. When I was training "wings" I used capturing so at first it was just him lifting his wings at the shoulder, folded. I wanted him to extend his wings but after a month of working on it I wasn't seeing any progress whatsoever. I decided I'd withhold the click until I got even a tiny amount more. When he didn't get a click after a few tries he varied the behavior and really surprised me with a complete lateral wing stretch. I was only looking for a tiny stretch so that was far more than I was expecting! I clicked and made a big deal out of it and offered him a jackpot reward. When I cued "eagle" again he initially offered the folded shoulder lift but when that didn't get a click after he tried it twice, he fully stretched his wings out laterally again on the third try. The next time I cued eagle he immediately did the lateral stretch.

I know you could get similar results in shaping with a bird that wasn't used to the clicker on CRF and food on VR, but I don't know if you could get them as quickly, nor do I think the bird would vary its behavior so readily.

by **KaratParrot** » Sun Mar 17, 2013 8:15 am

Michael wrote:I took a look at the article and see where it is coming from about VR not being necessary for the basic dog owner teaching tricks to their dog. There is a substantial difference when it comes to parrots that makes me disagree that VR isn't necessary for most pet owners. Parrots are naturally wild animals so except for the positive reinforcement taming and training that we give them, they are difficult to handle.

I think that's where some confusion comes in, Michael. The Baileys are trying to say that new dog trainers don't need to go as far as VR. This makes sense, because experienced trainers and dogs certainly do need VR for complex behaviors when working as a service animal, show dog or police dog. New and hobbyist parrot owners are the same way. It's about the trainer and their skill level, not how "wild" an animal species is. New owners of parrots tend to still use CRF, just like dog owners.

Michael wrote:As for the clicker, yes I use the clicker on a continuous reinforcement schedule.

Yes! When a click is given for every response it is considered CRF (continuous reinforcement). Yet a CRF schedule is only used when we want to teach a new behavior, as stated by every Behavior Modification teacher and researcher out there (just one example is Raymond G. Miltenberger). Can you think why that is?

Just for the animal training world I can think of a few good practical reasons. Clicking for every behavior is time consuming and can be seen as annoying when presented onstage. It certainly isn't a necessary part of performances, even in shows with parrots.

Michael wrote:I also give praise and attention on a pretty continuous reinforcement schedule. These are mild reinforcers and mostly secondary.

Many trainers get confused about primary and secondary reinforcement. They often state that primary reinforcers are stronger than secondary ones. This of course is untrue. A primary reinforcer is anything a species needs to stay alive: food, shelter, air, water, sex, even companionship, depending on the species. Steve Martin and Dr. Friedman go so far as to say that the ability to manipulate one's environment, also known as choice, is a primary reinforcer. Secondary reinforcers are anything that an individual of a species finds reinforcing through repeated pairing with one of the aforementioned primary reinforcers. This distinction hasn't often been made clear to the audience of the parrot forum, and it should be made more often.

Michael wrote:The treats - which are the predominant thing the parrot is working for - are on a variable ratio reinforcement schedule. Thus I am successfully playing the best of both worlds where I can differentiate good from bad demonstration of behavior in a long chain or sequence yet get the parrot to complete all of that behavior for infrequent food rewards.

This is the important part. Variable Reinforcement and Continuous Reinforcement cannot be mixed. This is because the entire schedule of reinforcement becomes CRF! It's just like you said, sometimes they get a click (as a secondary reinforcer) and sometimes they get food (as a primary reinforcer). This is still CRF, only the type of reinforcement is changing, which adds variety of course! It's always good to add excitement by adding variety.

So why do the teachers and researchers of Behavior Modification recommend using CRF only for teaching new behaviors?

by **Pralina** » Sat Mar 23, 2013 12:08 pm

Doesn't the bird get confused by using primary and secondary reinforcements in variable sequences and ratios?
From Michael's example:

Michael wrote:For example I can execute the following sequence with the following outcomes:
cue/response/reinforcement
wave/good/click
shake/good/click
wings/poor/no click
wave/good/click
wings/better/click
turn around/incomplete/no click
wings/good/click & treat
etc

In this manner the parrot learns success from failure throughout the sequence of behaviors (note they are not a chain because they are performed in random order and not repeated this way again) and continues trying for the sake of earning the more desirable food rewards on that variable ratio schedule.

It appears to me that the bird might get confused about the tricks that are incomplete and not successfully executed... wouldn't it gradually perform THAT single trick worse and worse, because it would become unclear to him in the sequence (as he gets his food reward at the end anyway, and as he gets clicks for the other successful tricks anyway) which trick he missed and why?
I do agree that they are intelligent animals.
However, how precisely do you think they can discern, from a multitude of tricks, the one they misperformed, to what extent they got it wrong, and to what level they should perform better NEXT time?

Also, wouldn't you think that this way, gradually with time, the bird will start misperforming more and more tricks out of the sequence?

Unless in such a case, you would go back and re-train / practice that one trick only, with a one click / one reward technique, until the bird gets it systematically right every time.

by **Michael** » Sat Mar 23, 2013 8:49 pm

Pralina wrote:Doesn't the bird get confused by using primary and secondary reinforcements in variable sequences and ratios?

Nope.

Pralina wrote:It appears to me that the bird might get confused about the tricks that are incomplete and not successfully executed... wouldn't it gradually perform THAT single trick worse and worse, because it would become unclear to him in the sequence (as he gets his food reward at the end anyway, and as he gets clicks for the other successful tricks anyway) which trick he missed and why?

No. The parrot realizes that the poor instance of the trick didn't even earn a click but the better one did. It still strives for repeating the better variants because the variants that always receive clicks, occasionally receive treats. The variants that never receive clicks, never receive treats. Therefore it is always worthwhile to demonstrate click-worthy behavior. I'm not making this up. Take a look at the videos demonstrating variable ratio reinforcement behavior. In reality it works even better than in the videos. Those are condensed and simplified just to make it easier to share. I could probably get Kili to run through her entire 20 tricks routine now for just a single treat. Since she gets clicks throughout the process, she both realizes she is doing things right and is encouraged to keep trying. If you try to get the animal to perform 10-20 behaviors without offering any feedback, there is a greater tendency to just figure it is doing it wrong and give up. Whereas if the clicks keep coming and the parrot is certain that some clicks must eventually earn treats, then it is worthwhile to continue and keep trying.
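The rule described above (every click-worthy response earns a click, and only some clicks are followed by a treat) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `run_session`, the quality labels, and the way the treat requirement is drawn at random around `mean_ratio` are all made up for the sketch and are not a description of how any actual session is run.

```python
import random

# Assumed labels for responses that meet the criterion (taken from the example sequence).
CLICK_WORTHY = {"good", "better"}

def run_session(responses, mean_ratio=4, seed=None):
    """Click every click-worthy response; give a treat after a variable number of clicks.

    responses: list of (cue, quality) pairs.
    mean_ratio: assumed average number of clicked responses per treat.
    Returns a list of (cue, quality, click, treat) tuples.
    """
    rng = random.Random(seed)

    def draw_requirement():
        # Uniform on 1..(2*mean_ratio - 1), so the average requirement is mean_ratio.
        return rng.randint(1, 2 * mean_ratio - 1)

    needed = draw_requirement()
    clicks_since_treat = 0
    log = []
    for cue, quality in responses:
        click = quality in CLICK_WORTHY  # the click itself is continuous
        treat = False
        if click:
            clicks_since_treat += 1
            if clicks_since_treat >= needed:  # the treat is on a variable ratio
                treat = True
                clicks_since_treat = 0
                needed = draw_requirement()
        log.append((cue, quality, click, treat))
    return log

session = [
    ("wave", "good"), ("shake", "good"), ("wings", "poor"), ("wave", "good"),
    ("wings", "better"), ("turn around", "incomplete"), ("wings", "good"),
]
for row in run_session(session, mean_ratio=4, seed=1):
    print(row)
```

In this sketch a response that earns no click can never earn a treat, which is the property the argument relies on.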

Pralina wrote:However, how precisely do you think they can discern, from a multitude of tricks, the one they misperformed, to what extent they got it wrong, and to what level they should perform better NEXT time?

Very precisely. However, not in the scope of "next time." This works extremely well in the long term. Obviously the trick must be carefully and precisely taught first to a satisfactory standard. However, some improvement could always be made. It's always better to achieve longer, further, higher, stronger, better. These finishing touches often take months or years to achieve and are not in the scope of the initial training. My approach continues to stimulate long term improvement while soliciting the greatest number of trials for practice. The behavior becomes extremely robust and resistant to extinction. Because my parrots get to practice the tricks thousands of times, they can't forget them. I can pull out tricks we haven't done in a year and they know immediately what to do on the first shot and do it well. This is because of the endless practice we have undergone. But how can you practice a routine of over 30 tricks (plus a lot of flight recall for exercise) on a continuous reinforcement schedule!? That would require 30 treats just to practice each trick once! That's ridiculous! Yet when I get 10 tricks for the price of 1 treat, 10 treats and 100 clicks reward the practice (and exercise, yes exercise: lifting feet, rolling over, etc. takes exercise to be able to do!) and give more chances to learn if it is being done right or not.

Actually, come to think of it, NOT using the clicker every time (when applying VR) would lead to diminishing performance. THEN the parrot tries to get away with a sloppy job and, since treats come on a VR anyway, it still feels like it was occasionally rewarded for sloppy behavior. My approach ensures that the parrot knows which good or improving cases it is being rewarded for and which ones it isn't.

Pralina wrote:Also, wouldn't you think that this way, gradually with time, the bird will start misperforming more and more tricks out of the sequence?

No. That makes no sense. They get better over time. They get to practice the behavior many more times. Also they are used to working for a thinner reinforcement margin. In other words, it takes less motivation to get a parrot to perform when it is used to getting a single treat for 10 tricks than a parrot that gets one every time. This allows for some great differential reinforcement and makes continuous reinforcement even more effective!

Pralina wrote:Unless in such a case, you would go back and re-train / practice that one trick only, with a one click / one reward technique, until the bird gets it systematically right every time.

Well there's no doubt the bird has to be doing it right in the first place. But I really can't think of many cases where I've ever had to go back to continuous reinforcement for a trick that was put on VR.

Look, it's one thing to talk about it and postulate that it won't work and another to actually see it happening. It's not a method I recommend for most people, but when used properly it is by far the most effective of everything I've come across. If you just think about VR, you'd also come to the logical conclusion that fewer treats means less performance, but when you actually apply it, you realize this is not the case.

by **Pralina** » Sun Mar 24, 2013 4:28 pm

Thanks for your reply. I really do understand the point you make.

However, don't get me wrong: the reason I'm asking so many questions is that I am reading a lot of other people's theories as well, and I came across this article right before you created this post and your blog entry. So I find it interesting how... contradictory both approaches are.

The article I am talking about is Blazing Clickers by Steve Martin and Susan Friedman that was brought to my attention by another parrot trainer.
http://www.naturalencounters.com/docume ... sFINAL.pdf

When persistence is required, the best approach is to first teach the new behavior with continuous reinforcement (click-treat) for the clearest communication of the behavior-consequence contingency. Next, gradually thin the reinforcers over time (known as stretching the reinforcement ratio) to the desired variable schedule changing the amount of behavior unpredictably while increasing the amount of behavior required for reinforcement overall. For example, if a trainer wants a lion to make several trips to a public viewing window each day, a variable ratio schedule of reinforcement (i.e., the click-treat together, no solo clicks!) would be the right tool.

Therefore, and I'm just asking here, doesn't it make more sense to define the variable ratio by BOTH the schedule of the click and the treat, instead of ONLY the schedule of the treat?

Because in the end, wouldn't you want the parrot to actually DO a behavior when you ask for it (cue it), and therefore understand the cue without necessarily having to receive a click or treat for it every time?
What I mean is, you want to be able to say "Kili" and have her fly to you no matter what, right? Not only in a training session, but just about all the time, without having to earn a click or a treat. So wouldn't it make more sense to withhold BOTH when using variable ratio training?

by **rebcart** » Thu Mar 28, 2013 7:56 am

Michael wrote:For example I can execute the following sequence with the following outcomes:
cue/response/reinforcement
wave/good/click
shake/good/click
wings/poor/no click
wave/good/click
wings/better/click
turn around/incomplete/no click
wings/good/click & treat

You know, come to think of it, seems to me that you're using differential continuous reinforcement the whole time. After all, every single correct response that meets your criteria gets a click, yes? And the reward from the click is either a treat, or the opportunity to do another trick (which, considering how consistently and often you train, should probably also be considered a secondary reinforcer).

If you were REALLY using VR, it would look something like this:

trick/good/click
trick/good/no click
trick/bad/no click
trick/good/no click
trick/better/click
trick/good/click
trick/bad/no click
trick/good/click
trick/better/no click

etc.

Where even some correct performances no longer get a click.
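To make the distinction concrete, here is a small Python sketch that tallies the two example sequences from this thread. It is only an illustration: the tuple layout and the helper names `click_rate` and `treat_rate` are made up for the sketch, and "good" and "better" are assumed to be the responses that meet criteria.

```python
# Responses assumed to meet the trainer's criteria.
CORRECT = {"good", "better"}

# (quality, click, treat) for Michael's example sequence.
michael = [
    ("good", True, False), ("good", True, False), ("poor", False, False),
    ("good", True, False), ("better", True, False),
    ("incomplete", False, False), ("good", True, True),
]

# (quality, click) for the sequence where some correct responses get no click.
thinned = [
    ("good", True), ("good", False), ("bad", False), ("good", False),
    ("better", True), ("good", True), ("bad", False), ("good", True),
    ("better", False),
]

def click_rate(log):
    """Fraction of correct responses that received a click."""
    correct = [row for row in log if row[0] in CORRECT]
    return sum(1 for row in correct if row[1]) / len(correct)

def treat_rate(log):
    """Fraction of correct responses that received a treat."""
    correct = [row for row in log if row[0] in CORRECT]
    return sum(1 for row in correct if row[2]) / len(correct)

print("Michael, clicks per correct response:", click_rate(michael))
print("Michael, treats per correct response:", treat_rate(michael))
print("Thinned, clicks per correct response:", round(click_rate(thinned), 2))
```

In the first sequence every correct response is clicked (a rate of 1.0) while only one in five is treated; in the second, only 4 of the 7 correct responses are clicked, so the click itself is intermittent.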

by **KaratParrot** » Mon Apr 01, 2013 12:18 am

Pralina - You bring up some excellent points! You pointed out that birds will not always know when they did a behavior "poorly" when on a Variable Ratio (VR) reinforcement schedule! Because we know it takes a lot longer to teach a new behavior, like teaching a bird how to wave, by using VR, we stick to a Continuous Reinforcement (CRF) schedule until the trick is exactly how we like it. So as you said, this means that if a behavior was performed poorly we go off VR and back onto CRF until the "poor" behavior is eliminated by extinction (this is known as Differential Reinforcement of Alternative behavior, DRA). Right on the money!

Michael doesn't realize it but this is what he is doing when he does not click for the "poor" response. The only hard and fast rule is that once the good response does occur you always want to reinforce it so that the "poor" behavior goes on extinction via a DRA procedure.

You said that a good idea would be to re-teach the behavior on CRF in a separate training session. And indeed that can be done if the behavior is REALLY BAD! But say that an Amazon says the wrong word during a show, what do you do? You wait until he says the right word and then give the treat! It cannot be emphasized enough to always, always, always give a reinforcement when the bird says the right word, when before he didn't. Again, this is called Differential Reinforcement of an Alternative Behavior.

Michael -

Michael wrote:I could probably get Kili to run through her entire 20 tricks routine now for just a single treat. Since she gets clicks throughout the process, she both realizes she is doing things right and is encouraged to keep trying. If you try to get the animal to perform 10-20 behaviors without offering any feedback, there is a greater tendency to just figure it is doing it wrong and give up.

You keep touting that this idea is unique and has never been done before. It has and continues to be used by cetacean trainers. Remember those whistles that they blow every few seconds? That is the equivalent of your click. Yeah, the whales don't get a fish when they are breaching and swimming either, just like your birds don't get a treat every time.

The thing is cetacean trainers are criticized for doing this. Why?

1) They claim it to be Variable Reinforcement.
2) They run into problems down the road. (slower less enthusiastic behaviors, "refusal" to do a behavior)
3) Those whistles are annoyingly unnecessary. Maybe it makes them feel important blowing a whistle every five seconds?

As already explained further up this thread, variable and continuous reinforcement cannot be combined together, it has to be one or the other. If a trainer tries to combine it the schedule of reinforcement always comes out to be a CRF! There is no avoiding it.

According to researchers John M. Pearce, Edward S. Redhead, and Aydan Aydin from the University of Wales in a paper titled "Partial Reinforcement in Appetitive Pavlovian Conditioning with Rats" the theory of Extinction applies a little differently when used with partial vs. continuous reinforcement.

What they found (and what had been shown even before this paper) is that if extinction is applied to a behavior on CRF, that behavior breaks down really quickly. But if extinction is applied to a behavior on partial reinforcement, that behavior breaks down really slowly.

So what does this mean for you? Think about how strong Kili's behavior is. You relate that to the clicker being on a CRF. But is it really? With the above knowledge I can come up with a test for Kili. Test to see what happens when the click stops occurring and you only give a food reward on a VR schedule as usual. If the behavior breaks down quickly then the click is a CRF. If not then something else is going on here.

My bet is that the bird is currently "tuning out" the click as extraneous information. The bird's cue has likely evolved into your hand moving towards Kili for the delivery of the treat.

But by stating that "By using my approach....The behavior becomes extremely robust and resistant to extinction." you have already outed the answer to my test. Your clicker is NOT on Continuous Reinforcement even though it looks like it is. And this can mean only one thing, it's not acting as a reinforcer anymore. Your bird is tuning it out.

It's not magic, it's science. We know that when a clicker (a secondary reinforcer) stops being paired with treats (a primary reinforcer) the click no longer stays a secondary reinforcer.

Those crazy researchers! What will they think of next!? :budgie:
