Matching Tracksuits

fun in fours

education and teaching

Testing

Today was the first day of state standardized testing, and it was, as I expected, a mess. The company that our state pays to do the testing is DRC Insight. I'm not sure why: we've never had a smooth testing experience with them. We've staggered starts by grade; we've staggered by grade and then hall; we've staggered by grade, and then hall, and then room -- nothing has ever produced a simple experience where all students get logged on immediately and start the test without issue.

How many millions of dollars are we spending for this substandard, time-wasting torture?

For my part, it's hellish because I'm not allowed to do anything other than watch the students test. We don't want them cheating, you see. But the truth is, students know this test really has no impact on their lives, and while they usually do their best, they're not overly worried about it.

And this led me once again to cynicism: as I walked around the room, I crafted a sentence. I took a moment and jotted it down, then continued walking around the room, looking at the tops of students' heads. I thought of edits and changed the sentence. I repeated the process until I'd eliminated all unnecessary words to express the simple truth of standardized testing:

Standardized testing quantifies students and teachers to provide politicians scapegoats for their failed education policies.

Lit Circles

Kids are ending the year with lit circles, which gives them a lot of independence and an opportunity to show themselves (and others) how well they can handle such responsibilities. Unlike the rest of the year, for this work I allowed them to choose their own groups. Several of the Latino students decided to work together. Their English ability is a wide spectrum: one boy has just moved to America and speaks no English at all; another boy just moved to America and speaks intermediate-level English. One girl has been living in the States for a number of years but still has some difficulties with English.

I told them to do their best to stick to English, to help each other out as they're working. They've been doing just that.

These kids have a very special place in my world right now: I know, to a slight degree, the struggles they're going through. I often remind them about how much they've improved this year, and I tell them how proud I am of them and more importantly, how proud they should be of themselves.

"And just between us, teachers aren't supposed to have favorites, but I so enjoy working with you guys," I told them. "You're not my favorite, because I'm not supposed to have them, but you're close," I added, with a wink.

"We know," one of the girls laughed.

Poster Day

For about four years now, each of my classes during the book fair has picked out a poster that seems uniquely out of character for me, which they then all sign, and I hang it on the wall.

Previous years' posters include two BTS posters, a Riverdale poster, and several kitty posters.

Today was our day in the book fair, so all classes picked a poster. They'll be signing it tomorrow, and they should be on the wall by the end of the week.

This year, more kids seemed more interested in picking the poster. Usually, it's just a handful of students in each class; this year, the whole class at times was inspecting the poster and making suggestions about which one to buy.

It made me feel exceptionally good.

Review

One of my classes is working on clauses -- recognizing them, using them, transforming them. Today we had a lower-than-usual attendance because of a Junior Beta Club field trip, so it was a somewhat relaxed day. We finished with a game of Kahoot, an online learning site that gamifies quiz-type reviews. We played a variant called Submarine Squad. According to the site,

You and your crew are stuck in the deep blue! A hungry fish is quickly approaching. Answer questions correctly to boost your submarine and follow instructions to escape.

Kahoot usually encourages a bit of competition; this particular variant encourages teamwork and cooperation. They have to work together to escape, and as the beast approaches, opens its jaws, and slowly begins to crunch down on the ship, the encouragement to each other increases.

When they do manage to escape (they didn't make it the first round), they all jump out of their chairs and cheer, giving each other high fives and doing silly little happy dances.

It's one of the reasons I love teaching eighth grade: they're still kids at heart.

Still More Testing

The results of this test will pass into mysterious silence: the students would get more feedback about their writing from a random stranger on the street than they will from this test. Other than the practice they gain from writing yet another analytic piece about a text (which was likely painfully boring and irrelevant to them), this test is an utter waste of time.

Yet I admire these kids for the effort they are putting into a largely Sisyphean task, for even those who’d had their heads down at one point complete the test and appear to do their best. This shows a perseverance and maturity that I, in my increasingly cynical fifties, seem to lack. Were I taking this test, my temptation would be to submit a rebellious, snarky response: “We are completely sick of all this testing, and I for one refuse to participate in this charade.”

What would be the reaction of the evaluator? Would she nod in agreement, lowering her head a bit in shame at her admittedly-minimal role in the process? Would he grow indignant, frustrated that the student didn’t see the value of a test he regards so highly, angry at the student’s teacher who so obviously neglected to impress upon the student the critical nature of the test? In short, just how much faith do the creators and evaluators of these tests have in the tests themselves? It’s hard to imagine how they could see any value in today’s test that will be unevaluated and provide students with absolutely no feedback, so we’re all left wondering just why we did it. We all, teachers and students alike, develop the sense that testing in general is just a tool to provide numbers to some group of bureaucrats so they can create for educators arbitrary comparisons and goals to provide these bureaucrats with a false sense of effort and accomplishment. We’ve recognized the problem, determined its scope, and created (or rather, ordered the creation of) a set of tests sure to solve the problem. And if they do not solve the problem, we can always create still more tests and metrics that ignore the actual issues but can serve as a balm for our consciences.

This all assumes that something will be done with the test. For all we know, the responses could simply be saved on some computer somewhere, completely forgotten soon enough and totally meaningless as a result.

The Change

This year, we are trying a new approach to scheduling. Different times of the day affect kids in different ways. Medications that in one class were effectively helping kids control their impulses have declined in effectiveness. The energy levels of some kids increase through the day and of other kids decrease. Some periods during the day are closer to times when kids have eaten, and this can make them sleepy as the blood starts to shift to their digestive systems or tired as they run out of fuel to help them move and think. Some teachers have more patience at the beginning of the day; some teachers have more energy at the end of the day. And all these facts interact with each other, affecting students’ learning and teachers’ effectiveness. All this means that some kids will learn better earlier in the day and some kids will learn better later in the day. Recognizing this, our principal enacted a shifting schedule this year: at the start of each new quarter, the order of the core classes (math, English, science, and social studies) flips -- the last class becomes the first class, and the first the last.

The difference for my inclusion class, which is a mix of regular education students and special education students, is remarkable. When they were my final class, they were my most challenging class. They were tired; I was tired. Their meds had worn off; I was hoping for any kind of medication myself. Now that I see them first, they are a different class. A joy to work with.

Diary

My English 8 students are beginning my favorite unit of the year: The Diary of Anne Frank. We'll be acting out the play in class (at least some of it), reading actual diary entries and comparing them to the play, looking at how the authors use only dialog to develop the characters for the audience.

One of the things I love about this unit is that a lot of kids who might not otherwise be so eager to participate become a little more engaged, a little more focused.

And it's just fun...

First Day Back

The view outside my window would make it clear even if my mental state didn't:

Today was the first day back from spring break, and while I was pleased to see my students, I wasn't as happy about having a regimented schedule.

Classes went well: everyone started a new unit (The Diary of Anne Frank and To Kill a Mockingbird), and everyone was relatively focused.

A good day, overall.

Mastery Disconnect

How the District’s Over-Reliance on Mastery Connect Frustrates Teachers and Harms Students

Over the last few years, Greenville County Schools has become increasingly dependent on Mastery Connect (MC) as its primary summative assessment tool. Over the last two years, GCS has been pushing for increased use of MC as its primary formative assessment tool as well. While I am often an early adopter of new technology, and although I am an advocate for technology-assisted assessment, I cannot share GCS’s enthusiastic support of Mastery Connect, and I have serious concerns about its effectiveness as an assessment tool: because of its poor design and ineffectively vetted questions, the program, instead of helping students and teachers, only ends up frustrating and harming them.

UI Design

Using Mastery Connect with my students is, quite honestly and without hyperbole, the worst online experience I have ever had. I am not an advocate of the ever-increasing amount of testing we are required to implement, but having to do it on such a poorly-designed platform as Mastery Connect makes it even more difficult and only adds to my frustration.

I don’t know where MC’s developers learned User Interface (UI) design. It is as if they looked at a compilation of all the best practices of UI design of the last two decades and developed ideas in complete contradiction to them. If there were awards for the worst UI design in the industry, Mastery Connect would win in every possible category.

Basic Design Problems

Note in the above image that there are more students in this particular class (or “Tracker,” as MC so cleverly calls it) than the screen shows. However, the only scroll bar on the right controls the scrolling of the whole screen. To get to the secondary scroll bar, I have to use the bottom scroll bar to move all the way over to the right before I can see it. This means that to see data for students at the bottom of my list, I have to first scroll all the way to the right, then scroll down, then scroll back to the left to return to my original position. Not only that, but the secondary vertical scroll bar is almost invisible. My only alternative is to change the sorting order to see the students at the bottom.

This is such a ridiculous UI design choice that it seems more to be a literal textbook example of how not to create a user interface than a serious effort to create a useful and easily navigable tool for teachers.

Ambiguous Naming

When we get to the assessment creation screen, we see even more poor design. The information Mastery Connect provides me about a given text is a collection of completely useless numbers:

There is no indication about the text for a given question, no indication about what the actual question is. Instead, it’s a series of seemingly arbitrary numbers.

Scroll Bar Design Reliance

Finally, once I try to look at the completed assessment, I have even more challenges.

To navigate to see the students’ work, I have to deal with four scroll bars. Four! If there were an award for most inept UI design, this would have to be the hands-down winner. For twenty years now, basic UI design best practice has been to limit the number of scroll bars because the more there are on a page, the more inefficient and frustrating the user experience.

The fact that the district has made Mastery Connect the center of its assessment protocol given this bad design that makes it difficult for me even to access information is so frustrating as to make me think that perhaps it wasn’t the program itself that sold the district on spending this much money on such a horrific program.

Assessment Results and Screen Real Estate

When I want to look at the results of the assessment, there’s another significant problem: the vast majority of the screen is frozen while only a small corner (less than 25% of the screen) scrolls.

The results of the assessment are visible in the non-shaded portion of the screen. The rest of the screen stays stationary, as if there is a small screen in a screen. Not only that, but that portion of the screen does not include any indication that it does scroll: a user only happens to discover this if the cursor is in the lower portion of the screen and then the user presses the arrow-down button. Otherwise, a user is just going to face increasing frustration at not being able to view past the first seven students in a given tracker.

A further problem arises when trying to view assessments grouped by standard. In the screen shot below, it’s clear that while the scroll bar is at its lowest extreme, there is still material not visible on the page. How one can access that information remains a mystery to me.

Assessment Creation and Previewing

When creating a CFA or even deciding on which standard to assess for the CFA, it would be useful to be able to browse questions without first creating an assessment. This, however, is not possible. Browsing for content seems like such a basic function of any website that we take it for granted, but it’s not available in MC.

Organizational and Content Concerns

Question Organization

Many of the questions are two-part queries, with the second part usually dealing with standard 5.1, which for both literary and informational texts is the same: “Cite the evidence that most strongly supports an analysis of what the text says explicitly as well as inferences drawn from the text.” The problem is, when one is creating a formative assessment to target that one standard and one uses the filter view to restrict Mastery Connect to that single standard, the “Part B” question shows up with no indication of what “Part A” might be:

One teacher explained to me that she found a way to fiddle with the question number to trick the program into showing more questions about a given text. Again we’re working to overcome the deficiencies of the program.

This is especially problematic for a CFA because we are trying to focus on a single standard. Standard 5.1 is arguably the most foundational of all standards, and this is even reflected in the number of questions per standard: RL-5.1 and RI-5.1 have vastly more questions than any other standard, yet they are usually tied to another question, and it is all but impossible to figure out what that question is.

There’s a certain irony in the fact that the most foundational standard is all but impossible to assess in isolation.

Finally, some standards are grouped together into their parent standard, and this can create significant issues when trying to create an assessment that includes one standard but excludes another. For instance, questions about RI-11.1 and RI-11.2 are combined into RI-11.

  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.2 Analyze and evaluate the argument and specific claims in a text, assessing whether the reasoning is sound and the evidence is relevant and sufficient; recognize when irrelevant evidence is introduced.

RI-11.1 is about text structure. (Strangely, so, too, is RI-8.2, a fact districts have been pointing out to the state department since the standards were released. If MC were the kind of program the district touts it to be, the programmers would have realized this and dealt with it within the program in some efficient manner.) Standard RI-11.2 deals with evaluating an argument. It says a lot about the state department that they grouped these two ideas together under a main standard of RI-11, but it says even more about MC that they then take these two standards, which are radically different, and drop them into the same category. This means that when trying to make a CFA that deals with evaluating an argument, one has to deal with the fact that many of the questions MC provides in its search/filter results will have to do with a topic completely unrelated.

Question Length and Effective Use of Class Time

There are almost no ELA questions in the MC database that are not text-based. Even a standard like RI-8.2 (“Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic,” which, again, is identical, word for word, to RI-11.1) has entire essay-length questions. If one wants ten questions about standard RI-8.2/RI-11.1, instead of delivering ten single-paragraph questions asking students to identify the text structure in play, creating such an assessment would likely result in eight multi-paragraph texts for ten questions. This means that such an assessment would take an entire class period for students to complete.

Indeed, it has taken entire class periods. Below are data regarding the time required to implement CFAs during the third quarter of the 2023/2024 school year. Periods 4 and 5 were English I Honors classes; periods 6 and 7 were English 8 Studies classes.

Period   Total     Students   Man-Minutes/Hours
4        0:26:00   24         10:24:00
5        0:25:00   29         12:05:00
6        1:46:00   29         51:14:00
7        1:43:00   29         49:47:00
Total                         123:30:00

The roughly one hour and forty minutes per English 8 class amounts to two entire class periods spent on Mastery Connect. That is two class periods of instruction lost because of the assessments Mastery Connect creates. Multiplied out for all students, it comes to an astounding 123 man-hours spent on Mastery Connect. This does not take into consideration other, district-mandated testing such as benchmarks and TDAs, all done through MC.

It’s difficult to comprehend just what an impact this use of time has on students’ learning. The 123 man-hours spent on MC equates to three weeks of eight-hour-a-day work. That is a ridiculous and unacceptable waste of student time, and that is just for one team’s students. If we take an average of the time spent per class, that comes to 1:05. Multiply that across the school, and we arrive at a jaw-dropping 975 man-hours spent in the whole school just for English MC work. If those numbers carry over to the other classes required to use MC for its assessments, the total time spent on MC assessments borders on ludicrous: 3,900 man-hours. That is just for CFAs. Factor in benchmarks and TDAs administered during the third quarter, and that number likely exceeds 10,000 man-hours spent on Mastery Connect.
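For anyone who wants to check the arithmetic, here is a rough sketch in Python that rebuilds these figures from the table above. The per-period times and enrollments come straight from the table; the school enrollment of 900 and the count of four core subjects are my assumptions (900 is simply the number that makes a 1:05 average work out to 975 hours), and the names in the code are made up for illustration.

```python
# Minutes of class time spent on CFAs and number of students, per period,
# taken from the third-quarter table.
periods = {
    4: (26, 24),
    5: (25, 29),
    6: (106, 29),
    7: (103, 29),
}


def hmm(minutes):
    """Format a whole number of minutes as h:mm."""
    return f"{minutes // 60}:{minutes % 60:02d}"


# Man-minutes per period = minutes spent x students in the room.
man_minutes = {p: t * n for p, (t, n) in periods.items()}
team_total = sum(man_minutes.values())

# Average time one class spent, in minutes.
avg_class_minutes = sum(t for t, _ in periods.values()) / len(periods)

# ASSUMPTIONS, not stated in the text: about 900 students in the school,
# four core subjects using MC, and an eight-hour working day.
SCHOOL_STUDENTS = 900
CORE_SUBJECTS = 4
WORKDAY_HOURS = 8

school_english_hours = avg_class_minutes * SCHOOL_STUDENTS / 60
all_subject_hours = school_english_hours * CORE_SUBJECTS
workdays = team_total / 60 / WORKDAY_HOURS

print("Team total (h:mm):", hmm(team_total))
print("Average per class (h:mm):", hmm(int(avg_class_minutes)))
print("School-wide English hours:", school_english_hours)
print("All core subjects, hours:", all_subject_hours)
print("Eight-hour workdays for one team:", round(workdays, 1))
```

Change the assumed enrollment and the school-wide totals scale with it; the team total depends only on the table.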

A common contention among virtuosos of any given craft is that it takes about 10,000 hours to master that skill, whether it be playing the violin or painting pictures. That’s the amount of time we’re spending in our school for one quarter just to assess students, and Mastery Connect’s inefficiency only compounds the problem.

This excessive time spent on Mastery Connect skews the data as well as wastes time. Students positively dread using Mastery Connect: on hearing that we are about to complete another CFA, students groan and complain. There is no way data collected under such conditions can possibly be high quality. Add to it the overall lack of quality in the questions themselves and I am left wondering why our district is spending so much time and money on a program of such dubious quality.

Quality of Questions

Most worrying for me is the baffling fact that many of the district-approved, supposedly-vetted questions about a given standard have nothing whatsoever to do with the given standard.

As an example, consider the following questions Mastery Connect has classified as having to do with text structure. The standards in question read:

  • RI-8.2 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.

Accurate Questions

Some of the questions are clearly measuring text structures:

Acceptable Questions

Other questions have only a tenuous connection at best:

The above question has, at its heart, an implicit text structure, but it is not a direct question about text structures but instead deals with simply reading the text and figuring out which comes first, second or third.

The questions above deal with the purpose of the given passage, and text structure is certainly connected to the purpose of a given passage, and that purpose will change as the text structure changes.

Unacceptable Questions

Some questions, however, have absolutely nothing to do with the standard:

The preceding question has nothing to do with text structures, and it is in no way connected to or dependent upon an understanding of text structures and their role in a given text. It is at best a DOK 2 question asking students to infer from a given passage, thus making it an application of RI-5.1; at worst it is a DOK 1 question connected to no standard.

The question above is another example that has nothing at all to do with text structure. Instead, this might be a question about RI-9.1 (Determine the meaning of a word or phrase using the overall meaning of a text or a word’s position or function) or perhaps RI-9.2 (Determine or clarify the meaning of a word or phrase using knowledge of word patterns, origins, bases, and affixes). It cannot be, in any real sense, a question about text structures, and it requires no knowledge of text structures to answer.

The preceding question has more to do with DOK 3-level inferences than text structure.

Multiple Question Banks

Teachers are told to use this bank and not that bank, but why should the question bank matter? If the program is worth the money the district is paying for it, we shouldn’t have to concern ourselves about the question bank. A question about standard X should be a question about standard X. If one question bank is better than another, that speaks more to Mastery Connect’s quality control and vetting process than anything else.

Accessibility Concerns

Due to the poor UI design and the generally poor quality of images used within questions, students with visual impairments are at a severe disadvantage when using Mastery Connect. Images can be blurry and hard to read, and the magnification tool is inadequate for students with profound vision impairment.

Output Concerns

Incompatible Data

I have significant concerns about the efficacy and wisdom of using Mastery Connect as an assessment tool. It’s bad enough that it’s so poorly designed that it appears the developers tried to create a good example of bad user interface design: working with MC is, from a practical standpoint, a nightmare. Add to it the fact that the data it produces for one assessment is often incompatible with data from a different assessment and one would have to wonder why any district would choose to use this program let alone pay to use the program. Neither of those two issues, though, is the most significant concern I have.

When comparing CFAs to benchmarks, we are comparing apples to hub caps. The CFAs measure mastery of a given standard or group of standards. The resulting data are not presented as a percentage correct but rather as a scale: Mastery, Near-Mastery, Remediation. However, the benchmarks do operate on a percentage correct, so we’re comparing a verbal scale to a percentage when comparing benchmarks and CFAs. These data are completely incompatible with each other, and it renders moot the entire exercise of data analysis. Analytics only makes sense when comparing compatible data. To compare a percentage to a verbal scale makes no sense because it is literally comparing a number to a word. Any “insights” derived from such a comparison would be spurious at best. To suggest that teachers should use this “data” to guide educational choices is absurd. It would be akin to asking a traveler to use a map produced by a candle maker.

Proprietary Data

While my concerns about proprietary data are tangential to the larger issue of the district’s self-imposed dependency on Mastery Connect, they do constitute a significant concern I have about Mastery Connect’s data in general. Because a for-profit private company creates the benchmark questions, the questions are inaccessible to teachers. The only information we teachers receive about a given question is the DOK and the standard. Often, the data is relatively useless because of the broad nature of the standards.

I’ve already mentioned the amalgamation of standards RI-11.1 and RI-11.2:

  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.2 Analyze and evaluate the argument and specific claims in a text, assessing whether the reasoning is sound and the evidence is relevant and sufficient; recognize when irrelevant evidence is introduced.

If I see that a high percentage of students missed a question on RI-11, I have no clear idea of what the students didn’t understand. It could be a text structure question; it could be a question about evaluating a claim; it might be a question about assessing evidence; it could be recognizing irrelevant evidence. This information is useless.

Occasionally, the results of a given question are simply puzzling. In a recent benchmark there were two questions about standard 8-L.4.1.b which is that students will “form and use verbs in the active and passive voice.” At the time of the benchmark, I had not covered the topic in any class, yet the results were puzzling:

             P4 (English I Honors)   P5 (English I Honors)   P6 (English 8)
Question 1   72                      69                      36
Question 2   72                      76                      52

For the second question about active/passive, the results among the English 8 students were vastly better, especially in my inclusion class (period 7). However, among English I students, the results were consistently high despite having never covered that standard. It would have been useful to see why the students did so much better on one question than another given the fact that no class had covered the standard.

For the want of a sentence

One sentence -- one single, simple sentence, the contents of which students already had planned in class. It was merely a matter of taking two phrases and generalizing. That was one class's homework last night. In fulfillment of one of the many instructional standards for the eighth-grade language arts curriculum, students were working to write a single sentence that expressed the main idea of a multi-paragraph non-fiction text. We'd examined the text in class. Student effort hadn't been stellar, but the majority fulfilled the most basic criteria for the small project. We had in place everything we needed to write that sentence, but we'd run out of time. The homework was something like leftovers: we didn't have time, do it at home.

One sentence, probably no longer than ten words. Out of a class of twenty-five, four did the "work" in its entirety, one had begun writing the sentence but was less than half finished, three had the presence of mind to jot the sentence on a piece of paper as I was checking that other students had completed the work, and the rest did nothing.

One sentence, and sixty-six percent of the class was too lazy, too unmotivated to do it. "I had better things to do with my time," one "student" said. "I forgot," another said. "I just didn't want to do it," a third explained.

In a flash, I saw the possible future, and it was terrifying. Students in the second world -- countries like China, Brazil, and India -- see what we have, and they want it. Their parents see it, and they want their children to have it. And so they work for it. They work for the education that will give them the job that will allow them to buy that smartphone, that flat-screen television, that car -- their little version of the American dream, exported and translated.

"Yet we already have it -- we've won. We've got nothing to worry about," replies the prevailing (often unacknowledged or even unrealized) consumer "wisdom." True, we won. In the Cold War, we came out on top. What spurred us? A moment like the one we're facing now, a moment where we realize our ascendancy is being eclipsed. We've grown complacent, though, and most feel our current reality could never truly disappear.

Yet looking at the standings of US students among those from the rest of the world, it certainly does appear that they want education -- and all that that brings with it -- more than we in the already-ascended West.