The piece Eric and I recently posted on Robert Covington’s streakiness — or lack thereof — led to an excellent comment thread covering many interesting issues. I thought I’d try to summarize and address some of those issues here.
As a reminder, in the piece we tested Robert Covington versus other players who shoot three-pointers about as often and as accurately as Cov. In separate tests of streakiness on an across-game, across-month, and across-year basis, we found no evidence that Cov is an especially streaky shooter. Yet in our poll of thousands of fans, we found that 92% perceived Cov as someone who shoots streakily.
Before I get into the Q&A, I want to make one big-picture point. Many of the commenters had questions about specific details of the tests we ran. What if we’d left out ballhandlers like Harden? What if we’d looked at shorter or longer intervals? Etc. These are all perfectly reasonable questions. But here’s the thing: either Cov is a lot streakier than other shooters, or he’s a tiny bit streakier, or he isn’t any streakier. From everything we’ve read, from fans and pundits alike, the consensus is that Cov is a lot streakier than the norm. And our point is this: if Robert Covington were a lot streakier than others, it would have shown up on these tests! He could be a little bit streakier and our tests could have missed it. Maybe if you only include small forwards in the test, you’d find Cov is a little above the norm for streakiness; I’d bet against that hypothesis, but it’s not impossible. But there’s just no way that Cov is the streakiness standout that people claimed, given our results.
Do you see my point? If we didn’t have any data on taking charges, and your eye test told you that Ersan Ilyasova was really good at taking charges, and you got some data and tested that hypothesis using three different tests that are just as imperfect as the ones we used, do you know what you’d find? You’d find that Ersan Ilyasova takes a tremendous number of charges! There is almost no chance that you’d come up with plausible-sounding tests and then, because you included too many guys at the wrong positions, or chose the wrong cutoffs, or looked by month instead of by week, end up wrongly concluding Ersan is ordinary at taking charges. Statistical tests just don’t work that way. When someone is a standout, unless you pick exceptionally cruddy tests, they will, um, stand out! Cov absolutely does not stand out for his streakiness. So if you believe that a better set of tests would show Cov is a bit streakier than the norm, well, I can’t say I know you’re wrong.
But if you think that a better set of tests will show that Cov stands out among NBA players for his streakiness, I can tell you there’s almost no chance that’s the case. If you’re not convinced yet on that, see below for more. If you’re still not convinced after that, perhaps we’ll have to agree to disagree, at least for now. Let’s all keep open minds and we’ll see what the future teaches us!
OK, on to the questions!
Q: Why do you guys refer to dinky little studies like this as “science,” calling your podcast “Sixers Science” and all that? Isn’t that presumptuous?
A: Quite the reverse. I believe this really is science, social science of course, but science nevertheless. And our point in referring to it that way is not, “we are big shots who do science, and others aren’t.” Our point is that anyone can do science, even jokers like us, and you can do science on any subject, even something as simple and trivial as shooting a basketball. What makes something science is not that you have a fancy lab or a fancy title. Nor does “science” require that you use 1000 comps over 100 years to make your point. If you pick a topic to study, think in a clear and unbiased way about some ways to study that question, run the tests, report the results, and stay open to changing your mind if others find something different... if you do all that, that’s science. I learned to do science from the baseball writer Bill James, and that’s how he did it. Later I was privileged to get to study with a respected scholar, and he did it the same way. In one of his best-known papers, he broke his sample into three groups, top 30%, middle 40%, and bottom 30%. When people asked whether it should instead have been 1/3-1/3-1/3 or 25%-50%-25% or something else, he said he picked a way that seemed reasonable and then he stuck with it for consistency. A quarter-century later most researchers are still using 30-40-30 for that particular experiment, because that way everyone knows they didn’t cherry-pick it. And that’s science.
Q: In the original post you talked about the idea that folks who see things differently should run and report their own tests to see if their critiques hold up. Aren’t you just being lazy and obnoxious when you say that? Aren’t you just demanding others do your work for you? I mean, obviously readers aren’t going to create their own studies, right?
A: In no way was I trying to be lazy or obnoxious when I wrote about that! Instead I was expressing my openness. Readers on this site do small studies all the time for Fanposts, or even for comments on Fanposts. Doing a comprehensive study is hard, but doing an add-on to an existing study may not be. For example, we reported on Cov relative to other shooters of similar volume and accuracy but we didn’t include only forwards; it would be easy to look at our list of shooters, drop all the non-forwards, and see if Cov is suddenly a streakiness outlier; that would take less than 2 minutes, and is a perfectly reasonable thing to do. And our point is just this: if someone did that, and found that Cov is indeed super-streaky when compared to forwards, that would be an interesting finding and would change my view of the original question! Science, especially social science that uses widely-available data such as NBA data, is a game everyone can play!
Q: Cov feels streaky to most observers -- why should anyone care what the data says about this?
A: If you’re reading this, it’s probable that you care a lot about whether the Sixers win. I sure do! I’m human too, and so like everyone else I “sense” streakiness in shooters, including Cov. The question is, what should Brett Brown, and Cov himself, do with that feeling? If “normal” Cov is a 37% shooter, but “slumping” Cov is a 27% shooter, then when Cov has not shot well lately, the coach should probably bench him, tell him not to shoot, or tell him not to shoot unless he’s super-duper wide open. If, on the other hand, Cov is still right around 37% when he’s missed a bunch lately -- if each shot is independent, and there’s no streakiness -- then Cov should play his game the same way whether he’s been making or missing. What we find here is that Lord Covington does not have an especially strong tendency to shoot poorly when he is “on a cold streak.” That has real implications for Sixer strategy. So I think it’s worth our while to understand which is more reflective of reality: the eye test, which says that when Cov is cold he should play less, shoot less, or both, or the data, which as far as we can tell says we’ll win more if we just let Cov be Cov.
Q: What difference does it make if Cov is streakier than other shooters? Maybe all shooters are streaky!
A: Maybe they are! In the original article I mentioned that while studies traditionally found little to no evidence of streakiness for the average player, recent evidence had suggested there was probably some, though still very little. I was referring to the seminal academic work of Gilovich, Vallone and Tversky and the recent response of Miller and Sanjurjo. In response to some smart commenters, especially the always-acute Paul Moo, I revisited Miller and Sanjurjo’s work, and I am no longer comfortable making the statement that average NBA streakiness is tiny. That may in fact be true, indeed I still kind of suspect it is true, but I need to dig deeper before I can report that with confidence. I’ll post more on this another time as it’s a fascinating question. For now I’ll just say this: our article presented no evidence on the average streakiness of NBA players. The only question we asked was whether Cov was similarly streaky to others, more streaky, or less streaky. We find he is entirely ordinary in this regard, not an especially streaky shooter as many perceive him to be.
Q: OK, you were focused on streakiness relative to other shooters. But what if you just look at Cov’s own shooting, what does that show?
A: I haven’t done the work yet. But there’s some good stuff in the comments from NJNS and ftaok, both of whom seem to find no special streakiness. I can’t vouch for their counts, though. Also, this is a tricky area: the whole point of the Miller and Sanjurjo paper mentioned above is that this kind of streak analysis has to be done carefully -- one reason Eric and I took the approach we did, which I believe does not suffer from the Miller-Sanjurjo critique. Finally, we need to agree on what we call streakiness. Suppose we find that when playing excellent wing defenders like those on the Spurs Cov shoots 32% from 3, but while playing the Nets he shoots 44% from distance. And suppose that’s the norm for all shooters facing those teams. Would we then want to say he was “hot” in his 4-for-9 night against Brooklyn and “cold” when he went 2-for-6 against San Antonio? Or that he was quite consistent, and just faced a better D in SA? There’s no “correct” answer, we can use words however we wish, but I’d say the latter is closer to my own view of what it means to be “streaky.”
Q: How could it possibly be true that shooters don’t have nights when they are shooting better and thus more likely to hit?
A: There’s a lot going on here and I’ll go in depth in a future post. As discussed in the original article, even if a player never feels “hot” or “cold,” we might still expect streakiness because some nights you have a weak defender on you, some nights your teammates are injured so you get fewer good looks, some nights your wrist is sore, some nights you have had more rest, etc. etc. These effects will clearly create better and worse nights for players. Then add on that your confidence grows when you’ve been hitting, and confidence seems as though it should help shooters be more accurate. And on top of all that, we have the possibility of some form of “being hot” that goes beyond just “Nik Stauskas is guarding you tonight, instead of Andre Roberson!” So what we might expect is that players would have lots of 6-for-7 and 7-for-7 nights, certainly far more than would be predicted if shots were totally independent.
Against all that we have the possibility that players attempt tougher shots when they have been hitting (the so-called “heat check”), plus the fact that defenders guard a player tougher after he’s hit some shots; indeed they may put a different defender on the shooter or double-team or stop going under screens, etc. There’s really no way to predict how all these effects will net out; it absolutely could go one way, or the other, or pretty much be a break-even. Most research to date suggests it’s the last of these: the effects mostly balance, and so players who have been hitting in the first half, for example, shoot about as well as their norm in the second half, rather than far better as the streakiness hypothesis would suggest. Now, we aren’t presenting evidence on this; that’s for another day. All I’m hoping for now is that you’ll agree it’s totally plausible that the anti-streaky effects could be similar in size to the pro-streaky effects, leaving not much streakiness in the end for the typical player (just as it’s also plausible that net streakiness or anti-streakiness IS consequential!).
Q: If Cov isn’t streaky, what kind of shooter is he?
A: I’d say Cov is a good-to-very-good high-volume three-point shooter who seems streakier than he is because there aren’t all that many people shooting 7 threes a game and some of them are truly amazing 3-point shooters like Steph and Klay and JJ and Korver. Cov isn’t any streakier than those guys, as far as I’ve been able to tell, but he is less good than they are at shooting, so he suffers by comparison (as far as shooting goes; once we include defense he fares a lot better!). Ideally we’d keep separate the concept of how “good” someone is from how “consistent” they are, but understandably these lines are often blurred in casual conversation. When doing analysis, it’s worth the effort to separate “B is a less-good shooter than A” from “B is less consistent or more streaky than A.”
Q: Not really about streakiness, but you suggest that all-time-great three-point shooter JJ Redick may sometimes err by passing up a 32% triple (0.96 expected points) for a 45% two-pointer (0.90). Do you really think a hoops savant like JJ, of all people, would fail to recognize the value of the three?
A: I certainly can’t say for sure that JJ makes that mistake, as it’s impossible to know the exact probabilities for any given attempt. And even if we knew that, it would be possible he needs to sink some long twos so as to keep defenses from knowing what he’s going to do. It’s a complex, repeated strategy game! So let’s just say that I worry that players may sometimes pass up a modest-probability three for a higher-probability two due to instinct overcoming optimality. JJ is probably better than most at avoiding such errors, and maybe he hardly ever makes them; but since he’s human like the rest of us my guess is he does sometimes fall into the trap!
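For anyone who wants to check the expected-points arithmetic in the question, here is a minimal sketch. The function name `expected_points` is made up for illustration; the 32% and 45% figures are the ones from the question, and real shot probabilities are of course unknowable for any single attempt.

```python
def expected_points(make_probability: float, shot_value: int) -> float:
    """Expected points for one attempt: chance of making it times its value."""
    return make_probability * shot_value


# The two hypothetical shots from the question above.
three = expected_points(0.32, 3)
two = expected_points(0.45, 2)
print(f"32% three: {three:.2f} expected points")
print(f"45% two:   {two:.2f} expected points")

# Two-point accuracy that would exactly match the 32% three.
break_even_two = three / 2
print(f"break-even two-point accuracy: {break_even_two:.0%}")
```

Under these numbers the long two would need to go in about 48% of the time just to break even with the three, which is why passing up the triple can be costly.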
OK, that’s enough on general streakiness. What about our work, which compares Cov to others?
Q: Why didn’t you include more players in your comparison groups? Weren’t your samples too exclusive?
I’ll answer this together with another popular question.
Q: Why did you include so many players in your sample who are so different in play style from Cov? Aren’t your samples too inclusive?
A: Both these critiques are valid! Ideally we’d have a sample with hundreds of players, each of whom played in the same seasons as Cov, played the same style, shot as often as Cov and as accurately. Unfortunately such samples are not available; there’s only one Lord Covington! In choosing our samples we prioritized three things:
1) Use a method that has no subjectivity. If I said that Jae Crowder should be in because he’s a Cov-style player but Trevor Ariza shouldn’t because he isn’t, then people would rightly be concerned that I made that choice because I wanted the results to come out a certain way. So we used simple rules and stuck to them. That way our research is replicable.
2) Choose only shooters who shoot in similar volume to Cov. There are a lot more players who take 4 threes per 36 than there are who take 7. So if you don’t demand shooters with similar volume to Cov, you’re going to get a whole lot of guys who only shoot 4 a game. But then the games in which they shoot 7 threes are their outlier games, which could create a skewed sample in any number of ways. Our view was and is that apples to apples demands similar volume of shooting.
3) Require similar accuracy to Cov. If one guy hits 34% and one hits 41%, you need to devise extremely careful statistical tests to do a proper comparison of their hot, medium, and cold games. I could come up with a methodology, but I bet if I did it would be biased in some important way. Maybe you are smart enough to devise the proper statistical test and be sure it’s clean, and if so, my hat is off to you, and I look forward to seeing the results of your study. For me, if I don’t match on accuracy, I’m not sure what to do next.
Once you put in those three rules, you basically have one final set of choices, which is how wide to set the range around Cov’s level. If Cov shoots 36.5%, you could include only people between 36 and 37, or you could have everyone between 33 and 40, or whatever. What we did was, we tried something, and it got what seemed to us a reasonable number of comps, so we stuck with it. We feel pretty confident the results won’t change if we widen the ranges.
Now, we could have done other things that would have been objective, like say we’ll only take players whose Basketball Reference pages say they play “Small Forward.” Of course then you’d have a tiny sample unless you widen the ranges and include guys who shoot a lot more or less than Cov, or a lot better or worse. As explained above, I think that leads to more serious problems.
I guess my general point is this: no study is perfect. But here are two ways in which a study could be flawed:
1) The study uses not-perfectly-comparable comps and has a less-than-infinite number of them.
2) The study uses a comp set that, for clear and logical reasons, is biased toward finding the conclusion that it reaches.
I would urge people who evaluate and comment on studies to distinguish these two cases. If the situation is the latter, the study may well be useless, so explain the source of the bias and we’ll know to adjust or even ignore the interpretation accordingly. But if the problem is of type 1), well, that’s true of literally every empirical study ever done. It’s fine to note the imperfection, but if said imperfection is just as likely to strengthen the study’s claim as weaken it, that point should be noted by the person raising the issue.
In other words: for any study you read, you can always say “here’s another way it could have been done.” Always true. But that does not meaningfully undermine the results of the study. To do that you either need to show the study was biased, or show a different approach gets a different result.
One last note: if we use 12 comps, and Cov is right around the middle of them, it’s a very weak critique to say we should have used more comps. Do you see why? Suppose we’d picked 20 more guys. What would have to happen for that to change our conclusion? Well, what would have to happen is, almost all 20 of those people would have to be on the same side of the middle of the original 12. But the odds of that happening, statistically, are close to nil unless our sample is horribly biased! See, if I show you 12 comps and Cov finishes first by some measure then, yes, it is a valid critique to suggest he might not be #1, or even top 10%, if we had a larger comp set. But middles don’t work that way. There are mathematical proofs of this kind of claim, but really you just have to play around with data a bit to see how true this is. If you do so you’ll see that if a guy is in the middle of a small but appropriate comp set, he’s essentially never going to end up near the extreme of a larger, also-appropriate comp set.
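If you’d like to play around with this yourself, here is a rough simulation sketch. It is not the analysis from our study: it assumes every comp’s streakiness score is an independent draw from the same population (an unbiased comp set), places a hypothetical player exactly at the median of 12 comps, and then counts how often at least 18 of 20 new comps land on the same side of him. The function name `lopsided_rate` and the 18-of-20 threshold are choices made for illustration.

```python
import random
import statistics


def lopsided_rate(n_original=12, n_new=20, min_same_side=18,
                  trials=20_000, seed=0):
    """Share of trials in which nearly all new comps fall on one side of a
    player who sat at the median of the original comp set."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    lopsided = 0
    for _ in range(trials):
        # Scores are population percentiles, so uniform(0, 1) is enough.
        original = [rng.random() for _ in range(n_original)]
        player = statistics.median(original)  # player is mid-pack by design
        below = sum(rng.random() < player for _ in range(n_new))
        above = n_new - below
        if max(below, above) >= min_same_side:
            lopsided += 1
    return lopsided / trials


rate = lopsided_rate()
print(f"share of trials where 18+ of 20 new comps land on one side: {rate:.1%}")
```

Changing `n_new` or `min_same_side` shows how quickly the lopsided outcome becomes rare when the comp set is unbiased; a biased comp set is a different story, which is exactly the distinction drawn above.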
Q: OK, but isn’t it still wrong to compare a spot-up shooter like Cov to a guy like Lillard or Harden who has so many offensive responsibilities?
A: If we were comparing them in terms of their offensive contribution, yes, that would be totally unfair. Just because Cov shoots as many threes as those guys, and as well, does not in any way make him comparably valuable on offense.
But for testing streakiness... I have to admit I just don’t see the problem. Remember, the finding here is that Cov isn’t meaningfully streakier than Harden or Lillard (or Marco, or others). We’re not saying anything about contribution. To suggest that Harden’s greater offensive role makes the study invalid you’d have to say that being a ballhandler makes Harden extra-streaky, and so the fact that Cov is no streakier than him might still leave Cov very streaky. Man, I’m starting to tire of typing that word! Anyway, I just don’t get it. I never hear people say Harden is especially streaky, whether because of his role or his psychology or for any other reason. And Cov isn’t any streakier than Harden and the rest of the group. And... that’s it! Cov is about as streaky as other high-volume, good-not-great three-point shooters.
Since there were like 100 comments that went down this rabbit hole, I’ll just say one more thing: we presented the streakiness levels for the comp groups, which included spot-up guys and also off-the-dribble guys. If Cov were way streakier than all the other spot-up guys, and less streaky than the Harden-Lillard types, you’d see it right there in the tables. But it isn’t there, because the current evidence is telling us that no such effect exists. Indeed if you throw out Harden and Lillard from the tests, Cov looks even LESS streaky, as those are two of the least streaky shooters in the comp group. So, yes, sure, in theory Cov seeming not to be streaky could have been a weird consequence of including all the matches, but as anyone can see it isn’t! Cov is not streaky for an NBA player, and he’s not streaky for a catch-and-shoot player, and he’s not streaky for a forward. At least as far as we can tell from this set of tests.
Q: What about other choices you made, like the cutoffs for a hot-medium-cold night? Weren’t they arbitrary?
A: Yes, these were arbitrary, and deliberately so! Cov shoots around 37.5% (37.7 as I write this), so 25 and 50 percent are about equidistant from his average. Thus it seemed to us the obvious choice to use 0-25 as cold, 25-50 as medium, and 50+ as hot. We think it’s important to make the obvious choice when one is available so people can see we didn’t mine the data to find the one choice that gives a result we wanted. In this case the obvious choice yielded a situation where most players had around 30% each hot and cold nights and about 40% in the middle, so that felt right. You wouldn’t want to define the middle so narrowly that almost every night was declared either “hot” or “cold.”
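Here is a small sketch of how the hot-medium-cold bucketing could be coded up. The function name `classify_night` is hypothetical, the handling of nights that land exactly on 25% or 50% is an assumption (the cutoffs above don’t settle it), and the game lines are made up for illustration rather than taken from any box score.

```python
from collections import Counter


def classify_night(makes: int, attempts: int) -> str:
    """Label one game by three-point percentage using the 25% / 50% cutoffs."""
    if attempts <= 0:
        raise ValueError("need at least one attempt to classify a night")
    pct = makes / attempts
    if pct < 0.25:
        return "cold"
    if pct < 0.50:   # assumption: exactly 25% counts as medium
        return "medium"
    return "hot"     # assumption: exactly 50% counts as hot


# Made-up (makes, attempts) lines, for illustration only.
sample_games = [(1, 7), (3, 8), (4, 7), (2, 6), (0, 5), (5, 9), (3, 7), (2, 9)]
labels = Counter(classify_night(m, a) for m, a in sample_games)
for label in ("cold", "medium", "hot"):
    print(f"{label}: {labels[label] / len(sample_games):.0%}")
```

Running the same function over a real game log for Cov and for each comp would give the share of cold, medium, and hot nights per player, which is the kind of comparison the across-game test relies on.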
We did three studies; in truth there are an infinitude of studies one could do on even a question as simple as “is Cov streaky?” Having done these three studies, and having played around with the data a bit since, we believe that other approaches would lead to the same non-streaky conclusion, unless one literally makes an effort to find the opposite result.
Q: You look across games but not within games. Would the result be different if you looked at streaks within games?
A: First, I think looking across games is a great way to search for within-game streaks. If a player who hits three in a row is extremely likely to hit three of his next four, you’re going to see a lot more 6-for-7 games from that guy than from someone for whom that isn’t true.
Second, looking at streaks within games turns out to be a much trickier thing to do correctly than one might think, as Miller and Sanjurjo show.
Third, for those who want to play with that data, Eric posted a tool that lets you see every three Cov has taken this season and look for streaks. As I mentioned above, some commenters did so, but I myself have not looked closely at that data.
And fourth, I’ll just note that within-game is the form of streakiness most likely to be a simple result of playing against a bad defender or a bad team defense. Suppose that after hitting 55% over the previous two weeks Cov is just as likely to go 0-for-5 as when he’s been hitting in the 20s lately. So, that sounds like he’s not streaky at all. But then suppose we find that he shoots for a high percentage against terrible defenses and for a low percentage against good ones, so that if he’s hit his last three shots tonight he’s probably not up against a great D and consequently his 3P% on his next shot is above his norm. That would be worth knowing, but would we, Archimedes-style, declare “Eureka -- we have found the elusive beast known as CovStreakiness”? Seems like a pretty minor discovery to me that shooters shoot better when facing lousy defenders. Still, I’d be interested to know what the data says about that, so we’ll look if we find the time.
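To illustrate the first point in this answer, that within-game streakiness should show up as extra 6-for-7 nights, here is a toy calculation. It is a sketch under loudly stated assumptions, not our actual test: the “streaky” shooter is modeled as hitting 50% after a make and 30% after a miss, numbers picked so his long-run average is still 37.5%, and `prob_at_least` is a made-up helper that simply enumerates every possible 7-shot sequence.

```python
from itertools import product


def prob_at_least(makes_needed, shots, p_first, p_after_make, p_after_miss):
    """Exact chance of at least `makes_needed` makes in `shots` attempts when
    each shot's make probability depends only on the previous shot."""
    total = 0.0
    for seq in product((True, False), repeat=shots):
        if sum(seq) < makes_needed:
            continue
        prob = p_first if seq[0] else 1 - p_first
        for prev, cur in zip(seq, seq[1:]):
            p_make = p_after_make if prev else p_after_miss
            prob *= p_make if cur else 1 - p_make
        total += prob
    return total


P = 0.375  # both shooters average 37.5% on every shot

# Independent shooter: the previous shot tells you nothing.
independent = prob_at_least(6, 7, P, P, P)
# Assumed streaky shooter: 50% after a make, 30% after a miss.
streaky = prob_at_least(6, 7, P, 0.50, 0.30)

print(f"independent shooter, 6+ of 7: {independent:.2%}")
print(f"streaky shooter,     6+ of 7: {streaky:.2%}")
```

With these made-up parameters the streaky shooter has well over twice as many 6-of-7-or-better nights (roughly 3.5% versus 1.3%), even though both shooters have the same overall percentage. That is why counting hot and cold games is a reasonable way to look for within-game streaks.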
Q: Is Liberty Ballers being overrun by articles saying Cov is terrific? Is this the new Prokafor-Nokafor?
A: As it happens, our fearless leader KFL is 100% correct that Cov is terrific. I will probably write a piece at some point reinforcing his evidence on this subject. Back in the 1980s Musician magazine had a cover story on the Australian band Midnight Oil with the headline (IIRC) “Midnight Oil: We Are Going To Keep Putting Them On The Cover Until You Start Buying Their Albums!” I reserve the right to keep writing about Cov until everyone appreciates him properly. That said, this article was not about Cov being good. I don’t think it’s obvious at all that being “streaky” is a particularly bad thing. Maybe it’s better to have a 37.5% three-point shooter who completely takes over the game once a week and is lousy the other nights than it is to have a 37.5% shooter who goes 3-for-8 from deep every night. Or maybe being streaky is worse, but just by a tiny bit, not worth worrying about; in truth that would be my off-the-cuff guess. I honestly don’t know. So while I do stand for Cov, this piece was not intended as support for Cov. We just wanted to know if he is, as so many claim, unusually streaky, so we looked and it turned out he wasn’t.
Q: Does this relate to the question of “clutchness”?
A: Not directly. But for what it’s worth, clutchness, like streakiness, is something that turns up a lot less in the data than it does in the writings of sports pundits. Eric and I are working on some research on the clutchness question; the NBA is doing a good job making available some cool relevant data these days.
Q: Last one: is there any recent evidence that sheds light on the Cov streakiness question?
A: Actually, yes. The whole streakiness thing with Cov has been out there a long time, but it came to a head recently due to Cov shooting 29% in February after hitting over 46% in October followed by three months in the mid-to-high 30s. Exacerbating the streakiness concern was the fact that each month his 3P% was lower than the month before. So if Cov drops to 26% in March, or even stays down around 29, that would suggest he is “cold.” With less than a week left in March, Cov is shooting 43% from 3. Here’s a split we didn’t look at in the original set of studies: pre- and post All-Star break. This year the numbers are:
Note I’m using this season, the season where at the break a zillion people said Cov was done playing well because he was complacent after the big contract, or injured, or just wasn’t good, or something. But unless you sift through the months and cherry-pick the one bad one, he’s been a model of consistency. We’ve played six calendar months so far, and in five of them Cov has either been right around his career 3P%, or better. Go look at other shooters and you’ll see it’s just totally normal to have one month out of six that’s well below their average. Everybody does it; the by-month sample sizes just aren’t large enough to avoid it except by luck.
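To get a feel for how often a perfectly steady shooter would post one ugly month, here is a back-of-the-envelope binomial sketch. Everything in it is an assumption chosen for illustration: a true 37.5% shooter, 90 three-point attempts per month, every shot independent, and a “cold month” defined as 27 makes or fewer (30% or worse). The helper name `prob_at_most` is made up.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Binomial chance of k or fewer makes in n independent attempts."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


TRUE_PCT = 0.375         # assumed true accuracy, identical on every shot
ATTEMPTS_PER_MONTH = 90  # assumption: roughly 7 threes a game over 13 games
COLD_MONTH_MAKES = 27    # assumption: 27 of 90 (30%) or worse is a cold month
MONTHS = 6

p_cold_month = prob_at_most(COLD_MONTH_MAKES, ATTEMPTS_PER_MONTH, TRUE_PCT)
p_any_cold_month = 1 - (1 - p_cold_month) ** MONTHS

print(f"chance a given month is cold:        {p_cold_month:.1%}")
print(f"chance of a cold month in six tries: {p_any_cold_month:.1%}")
```

Under these assumptions a single bad month across a season is far from a rare event, even for a shooter with no streakiness at all. Different attempt counts or cutoffs will move the numbers, so treat this as a sanity check rather than a result.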
One problem with using the eye test to study streakiness is confirmation bias. If Cov hits 40%+ the next few weeks, we can tell ourselves he “got hot.” If he hits in the 20s, we’ll say he’s in a prolonged slump that had a few good games in the middle. It’s easy to see what we expect to see! Here’s what I propose to folks who are not yet persuaded on the streakiness issue. Before each game, write down whether you think Cov is hot or cold these days. Then afterward look and see if he hit for above or below 37%. See how well your forecasts do. I think you’ll find yourself right around 50-50 over time. Indeed I think you’ll find that most nights, you’re not even sure whether to expect cold Cov or hot. In retrospect it always seems obvious, but as anyone who has ever tried to predict the stock market knows, it’s unbelievably difficult to make good predictions in advance. But try it and see; if I’m wrong and it’s easy to predict Cov’s hot and cold nights, we’ll have learned something.
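If you want to keep score on the forecasting exercise, something like the following sketch would do. The function name `forecast_hit_rate` and the log entries are invented for illustration; the only thing taken from the proposal above is the rule that a night counts as “hot” when Cov shoots above 37% and “cold” otherwise.

```python
def forecast_hit_rate(log, baseline=0.37):
    """Share of pregame hot/cold calls that matched the night's shooting.

    `log` holds (forecast, makes, attempts) tuples, where forecast is
    "hot" or "cold". Nights with no attempts are skipped.
    """
    correct = 0
    scored = 0
    for forecast, makes, attempts in log:
        if attempts == 0:
            continue
        shot_hot = makes / attempts > baseline
        correct += (forecast == "hot") == shot_hot
        scored += 1
    return correct / scored if scored else float("nan")


# Invented entries, for illustration only; write yours down BEFORE tip-off.
sample_log = [("hot", 3, 7), ("cold", 1, 6), ("hot", 2, 8), ("cold", 4, 9)]
hit_rate = forecast_hit_rate(sample_log)
print(f"forecast hit rate: {hit_rate:.0%}")
```

The key design choice is that the forecast goes in the log before the game; filling it in afterward would reintroduce exactly the confirmation bias the exercise is meant to expose.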
Thanks to everyone for all the great comments and questions!