On Teaching Evaluations and Corr Sent Policy

--Mike Connelly

“Do the Best Professors Get the Worst Ratings?”

“Elite Athletes Often Shine Sooner or Later—But Not Both”

Brief diversion before we get started on what these two articles have in common and why they apply to Corrections Sentencing policymakers and policymaking. One of the reasons I left my tenured position at my small state university (boy, is my wife still happy), one of several, let me make clear, was the increasing importance of student evaluations in making decisions on faculty promotions and retention. It’s not that I didn’t use student evaluations. I checked every semester for consistency in my scores on the various items and to make sure that I could relate changes to differences in things I had done and either keep doing them or stop. IOW, the internal feedback function of evaluations. No problem with that then or now.

But, as I mentioned, we were seeing more and more evaluation of the actual quality of our work being based on the decisions of primarily 18-21-year-olds who had no idea whether I “knew my subject matter,” “used class time well,” or “explained material clearly” (to sleeping or clueless students forced to take a mandatory US gov’t class???). The camel broke when the fad-tuned administrators (but I repeat myself) decided that a key question should be “I would recommend this class to a friend.”


The value and utility of a course from anyone may not be known for years or even decades. My first college English teacher gave me a C+ on my first essay. What a moron. She had the world’s next Ralph Waldo Hemingway sitting in her room and she’s giving him C+’s? They pay her?? . . . Except years later, whenever I would start planning an essay, an article, a dissertation, bleeping across my synapses would be “start with an inverted pyramid structure with a broad beginning subject statement narrowing into your thesis statement, then provide the details to back up your thesis, and finish with a traditional pyramid structure restating the thesis, branching into the key points to support it and ending with a broad general statement of the subject’s importance to the reader/community.” What a moron.

The absolute worst time to ask a student about a class is at the end of that class, pre-finals or not. Even when they graduate is poor. Five years later? Maybe. Ten, better. Retirement? Probably. Death bed? That would be best. “One last thing before you go, Connelly’s government class on a scale of 1 to 7, with 1 being best possible . . . .”

The only measure of whether a government class has been effective is whether the student performs well as a citizen, such as making intelligent decisions in their choice of voting (or not voting) for officials and policies. It would be hard to argue that I or any other political science professors as a whole would measure out well on that indicator. This insistence we have on judging quality K-12 education by scores on a test on items that the same students won’t remember 3 years later? “Can YOU score better than a 4th-Grader??” A quality school system is shown by communities and states that generate quality citizens, organizations, interactions, products, not by how many of them star on “Jeopardy.”

The “Jeopardy” definition of “smart” has warped so much of how we judge education and the students who go through it. Being a good test-taker will get you into quality schools which then connect you to a world of more likely success so we mistakenly believe that the test-taking is indicative of the person and not the system. Far too much of what gets done in policymaking is done by people who tested well, not by people who actually know anything important about how to get policy done well. Want proof? Just look at those highly credentialed, fad-tuned college administrators who think student evaluations are indicative of the education being provided rather than the personal interactions of varyingly mature individuals who may or may not be acting in good faith but certainly do not know what they’re talking about regarding the quality of what they just received, except at the very furthest extremes either way . . . perhaps.

Fortunately for us (actually for you if you aren’t buying any of this yet), some new research shows exactly what I’ve just described. The first article above talks about a couple of recent experiments that demonstrated how the glitz of the classroom (how “nice” and/or (more importantly) attractive the professor is, how much the student “enjoyed” the class) dominates the evaluations, whereas later examination of what got learned, retained, and used tended to be at odds with those evaluations. Evaluations tell one thing, later demonstrated learning tells another. Big shock to those of us who moved on to other professions.

The results
When you measure performance in the courses the professors taught (i.e., how intro students did in intro), the less experienced and less qualified professors produced the best performance. They also got the highest student evaluation scores. But more experienced and qualified professors' students did best in follow-on courses (i.e., their intro students did best in advanced classes).

The authors speculate that the more experienced professors tend to "broaden the curriculum and produce students with a deeper understanding of the material." (p. 430) That is, because they don't teach directly to the test, they do worse in the short run but better in the long run.

To summarize the findings: because they didn't teach to the test, the professors who instilled the deepest learning in their students came out looking the worst in terms of student evaluations and initial exam performance. To me, these results were staggering, and I don't say that lightly.

Bottom line? Student evaluations are of questionable value.


The second piece shows the problems of timing evaluation of performance in another way. Another digression. When I was on the school board in my small college town, one of the board members proposed that the middle school adopt an “everyone plays” policy for its sports programs. His point was to spread the long-lauded, rarely-proven advantage of what you learn about teamwork, competition, yada, yada, from playing sports to all students rather than just those who had already been proclaimed “stars.” You would have thought we were discussing live demonstrations of sex education. Not even the battle we had with the so-called Christian Coalition types trying to take over school policy had that unique level of emotion and stupidity on display at one time. Oddly, one of the few talking sense was the high school football coach. He sorta liked the idea of seeing what else he might have out there in terms of talent than what was getting filtered through the less-than-high-test-scoring middle school coaches. Of course, he just “sorta” supported the idea so all that got lost in the turmoil. (In case you’re wondering, no, we did not adopt the policy.)

But the only reason I really was interested in that policy change was precisely because of his logic. I had heard all the stories about the Walter Paytons and Michael Jordans whose star talents did not show up until later in high school, and I knew of plenty of baseball types I’d grown up with who “blossomed” later while the early stars “fizzled” in less-than-nova-like splendor. I also had done enough community coaching and taking the training to know that one of the big fears of actual professionals in athletics was how many early “stars” burned out and got out of their sports waayyy too early in the professional opinion. Why not pull in some of the not-yet-“stars,” give them training, up their interest, create better conditions for that “blossoming” to occur at some point in the future?

You’re shaking your head. You see the problem. Applying that logic thing and looking long-term again. Better to think only short-term, not to look down the road at outcomes, not to plan ahead to make sure you max out your performance at the end rather than the beginning. No amount of evidence would ever have persuaded that very large crowd in our board room that evening of exactly what the research in the second article found:

Here are some of the findings:
• Senior athletes performed best at a significantly later age than their junior counterparts in all four men's event groups and three of four women's event groups.
• Compared to the star junior athletes, the senior athletes showed a significantly greater percentage of improvement in lifetime best performance compared to their best performances as junior athletes in six of eight groups.
• 23.6 percent of the junior athletes studied went on to medal in the Olympics.
• 29.9 percent of the Olympians studied won medals earlier in their career while competing in the Junior World Championships.

Variability in maturation rates and potential differences in performance as athletes age can pose a challenge for recruiting coaches. Coaches anecdotally have known this was an issue, Chapman said, but the IU study bolsters it with data. He said the findings also are relevant in light of how sports organizations and national sport governing bodies budget their limited funds. Focusing their spending on junior athletes will not necessarily result in Olympic champions as the juniors age.

Who knows what Walter Payton we missed because of that decision that night? And, no, none of the middle school “stars” that night ever showed up on your TV even in college games.

By now, long-time readers will already have seen where we’re going with this for Corrections Sentencing. Judging success on how impressive the presentation is and not on the long-term outcomes of the actions. Reliance on conventional wisdom and short-term outcomes to guide attention and responses. Steadfast denial that the status quo could be less functional than change and development of new approaches. Basing judgments of performance on individual outcomes rather than overall impacts on the communities/states being served. The elevation of superficial process and vaguely tied outputs to prominence at the expense of deep institutionalization and firmly grounded impacts. Unthought student evaluation, premature assessment of “star” athletes, or passage of legislation as the key impact rather than actual changes in crime and public safety in the affected communities/states? Actually, “yes” to them all.

Premature evaluation is a problem always inherent in the process, whether it’s of the performance of professors, athletes, or Corr Sent policy. This is NOT an argument that it’s too early to judge Corr Sent Reform 1.0, whether Justice Reinvestment or other forms, at this point, since more than enough years have gone by for some states and since the individual “reforms” have outcomes known over even longer periods. But it IS an argument to say again that the plaudits and celebration given by the Reform 1.0 folks and the funding decisions of their grantors for nothing more than input and output measures should be recognized for the vapor upon which they are based.

What we have right now is a process too much like student evaluations, superficially tied to informed judgment but factually ungrounded and tethered only to inputs and outputs with weak ties to the outcomes and impacts desired. It is put in contrast to an insufficient and wasteful status quo founded on practitioners’ denials of the inadequacies of their certainties, equivalent to those of the college administrators and middle school coaches who believe (and have short-term incentives to do so) the bromides they spout without scientific basis in Reality, as the articles show. The articles above at the very least should put everyone in policy areas with similar ideas and practices on notice.

It’s always nice when research verifies the qualitative conclusions you’ve reached regarding what should and shouldn’t be done where you’re working. Just think. If the student evaluation research had been around and accepted 20 years ago, I might still be tenured and teaching somewhere quiet and peaceful (“I still can’t believe you gave up tenure!!”) and you’d never have to listen to these daily moans, I mean, insights. You can add your own qualitative assessment to that result.
