How the District’s Over-Reliance on Mastery Connect Frustrates Teachers and Harms Students

Over the last few years, Greenville County Schools has become increasingly dependent on Mastery Connect (MC) as its primary summative assessment tool, and over the last two years, GCS has pushed for increased use of MC as its primary formative assessment tool as well. While I am often an early adopter of new technology, and although I am an advocate for technology-assisted assessment, I cannot share GCS’s enthusiasm for Mastery Connect, and I have serious concerns about its effectiveness as an assessment tool: because of its poor design and ineffectively vetted questions, the program, instead of helping students and teachers, only ends up frustrating and harming them.

UI Design

Using Mastery Connect with my students is, quite honestly and without hyperbole, the worst online experience I have ever had. I am not an advocate of the ever-increasing amount of testing we are required to implement, but having to do it on a platform as poorly designed as Mastery Connect makes it even more difficult and only adds to my frustration.

I don’t know where MC’s developers learned User Interface (UI) design. It is as if they looked at a compilation of all the best practices of UI design of the last two decades and developed ideas in complete contradiction to them. If there were awards for the worst UI design in the industry, Mastery Connect would win in every possible category.

Basic Design Problems

Note in the above image that there are more students in this particular class (or “Tracker,” as Mastery Connect so cleverly calls it) than are visible on the screen. However, the only scroll bar on the right controls the scrolling of the whole screen. To get to the secondary scroll bar, I have to use the bottom scroll bar to move all the way to the right before I can even see it. This means that to see data for students at the bottom of my list, I have to first scroll all the way to the right, then scroll down, then scroll back to the left to return to my original position. Not only that, but the secondary vertical scroll bar is almost invisible. My only other alternative is to change the sorting order to see the students at the bottom.

This is such a ridiculous UI design choice that it seems less a serious effort to create a useful, easily navigable tool for teachers than a textbook example of how not to build a user interface.

Ambiguous Naming

When we get to the assessment creation screen, we see even more poor design. The information Mastery Connect provides me about a given text is a collection of completely useless numbers:

There is no indication of which text a given question refers to and no indication of what the actual question is. Instead, there is only a series of seemingly arbitrary numbers.

Scroll Bar Design Reliance

Finally, once I try to look at the completed assessment, I face even more challenges.

To see students’ work, I have to deal with four scroll bars. Four! If there were an award for the most inept UI design, this would be the hands-down winner. For twenty years now, basic UI best practice has been to limit the number of scroll bars on a page, because the more there are, the more inefficient and frustrating the user experience becomes.

The fact that the district has made Mastery Connect the center of its assessment protocol despite a design this bad, one that makes it difficult for me even to access information, is so frustrating that I have to wonder whether it was really the program itself that sold the district on spending this much money on such a horrific product.

Assessment Results and Screen Real Estate

When I want to look at the results of the assessment, there’s another significant problem: the vast majority of the screen is frozen while only a small corner (less than 25% of the screen) scrolls.

The results of the assessment are visible in the non-shaded portion of the screen. The rest of the screen stays stationary, as if there were a small screen within the screen. Not only that, but that portion of the screen gives no indication that it scrolls at all: a user only discovers this by happening to have the cursor in the lower portion of the screen and pressing the down-arrow key. Otherwise, a user simply faces mounting frustration at not being able to view past the first seven students in a given tracker.

A further problem arises when trying to view assessments grouped by standard. In the screenshot below, it’s clear that even with the scroll bar at its lowest extreme, there is still material not visible on the page. How one can access that information remains a mystery to me.

Assessment Creation and Previewing

When creating a CFA, or even deciding which standard to assess for the CFA, it would be useful to be able to browse questions without first creating an assessment. This, however, is not possible. Browsing for content is such a basic function, one so common on websites that we take it for granted, yet it is simply not available in MC.

Organizational and Content Concerns

Question Organization

Many of the questions are two-part queries, with the second part usually dealing with standard 5.1, which for both literary and informational texts is the same: “Cite the evidence that most strongly supports an analysis of what the text says explicitly as well as inferences drawn from the text.” The problem is that when one creates a formative assessment targeting that one standard and uses the filter view to restrict Mastery Connect to that single standard, the “Part B” question shows up with no indication of what “Part A” might be:

One teacher explained to me that she found a way to fiddle with the question number to trick the program into showing more questions about a given text. Once again, we are left working around the deficiencies of the program.

This is especially problematic for a CFA because we are trying to focus on a single standard. Standard 5.1 is arguably the most foundational of all standards, and this is even reflected in the number of questions per standard: RL-5.1 and RI-5.1 have vastly more questions than any other standard, yet they are usually tied to another question, and it is all but impossible to figure out what that question is.

There’s a certain irony in the fact that the most foundational standard is all but impossible to assess in isolation.

Finally, some standards are grouped together into their parent standard, and this can create significant issues when trying to create an assessment that includes one standard but excludes another. For instance, questions about RI-11.1 and RI-11.2 are combined into RI-11.

  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.2 Analyze and evaluate the argument and specific claims in a text, assessing whether the reasoning is sound and the evidence is relevant and sufficient; recognize when irrelevant evidence is introduced.

RI-11.1 is about text structure. (Strangely, so, too, is RI-8.2, a fact districts have been pointing out to the state department since the standards were released. If MC were the kind of program the district touts it to be, the programmers would have realized this and dealt with it within the program in some efficient manner.) Standard RI-11.2 deals with evaluating an argument. It says a lot about the state department that it grouped these two ideas together under the main standard RI-11, but it says even more about MC that it then takes these two radically different standards and drops them into the same category. This means that when trying to make a CFA that deals with evaluating an argument, one has to deal with the fact that many of the questions MC provides in its search/filter results will address a completely unrelated topic.

Question Length and Effective Use of Class Time

There are almost no ELA questions in the MC database that are not text-based. Even a standard like RI-8.2 (“Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic,” which, again, is identical, word for word, to RI-11.1) comes with entire essay-length passages. If one wants ten questions about standard RI-8.2/RI-11.1, instead of delivering ten single-paragraph questions asking students to identify the text structure in play, MC will likely produce eight multi-paragraph texts for those ten questions. This means that such an assessment would take an entire class period for students to complete.

Indeed, it has taken entire class periods. Below are data regarding the time required to implement CFAs during the third quarter of the 2023/2024 school year. Periods 4 and 5 were English I Honors classes; periods 6 and 7 were English 8 Studies classes.

Period    Time per Class    Students    Man-Hours (h:mm:ss)
4         0:26:00           24          10:24:00
5         0:25:00           29          12:05:00
6         1:46:00           29          51:14:00
7         1:43:00           29          49:47:00
Total                                   123:30:00

The roughly one hour and forty minutes per English 8 class amounts to two entire class periods spent on Mastery Connect. That is two class periods of instruction lost to the assessments Mastery Connect creates. Multiplied out for all students, it comes to an astounding 123 man-hours spent on Mastery Connect. This does not take into consideration other, district-mandated testing such as benchmarks and TDAs, all done through MC.

It’s difficult to comprehend just what an impact this use of time has on students’ learning. The 123 man-hours spent on MC equates to three weeks of eight-hour-a-day work. That is a ridiculous and unacceptable waste of student time, and that is just for one team’s students. If we take the average time spent per class, it comes to 1:05. Multiply that across the school, and we arrive at a jaw-dropping 975 man-hours spent in the whole school just for English MC work. If those numbers carry over to the other classes required to use MC for their assessments, the total time spent on MC assessments borders on the ludicrous: 3,900 man-hours. That is just for CFAs. Factor in the benchmarks and TDAs administered during the third quarter, and that number likely exceeds 10,000 man-hours spent on Mastery Connect.
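To make the arithmetic explicit, here is a minimal sketch of those calculations. The per-period times and student counts come from the table above; the roughly 900-student school population and the four MC-using subject areas are assumptions chosen only to match the 975 and 3,900 figures, not official counts.

    # A sketch (Python) of the man-hour arithmetic described above.
    # Per-period times (minutes) and student counts come from the table;
    # the ~900-student school size and the four MC-using subject areas are
    # assumptions inferred from the 975 and 3,900 figures, not official counts.

    periods = {          # period number: (minutes per student, students)
        4: (26, 24),
        5: (25, 29),
        6: (106, 29),
        7: (103, 29),
    }

    team_man_minutes = sum(mins * students for mins, students in periods.values())
    print(f"Team total: {team_man_minutes / 60:.1f} man-hours")        # ~123.5

    avg_minutes_per_class = sum(m for m, _ in periods.values()) / len(periods)
    print(f"Average per class: {avg_minutes_per_class:.0f} minutes")   # 65, i.e. 1:05

    school_students = 900        # assumption
    mc_subject_areas = 4         # assumption
    english_hours = school_students * avg_minutes_per_class / 60
    print(f"School-wide English CFAs: {english_hours:.0f} man-hours")  # ~975
    print(f"All MC subjects: {english_hours * mc_subject_areas:,.0f} man-hours")  # ~3,900

The point of the sketch is only that the order of magnitude follows directly from the per-class times; whatever the exact school-wide enrollment, a single quarter of CFAs alone consumes hundreds of student-hours before benchmarks and TDAs are added.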

A common contention among virtuosos of any given craft is that it takes about 10,000 hours to master a skill, whether it be playing the violin or painting pictures. That is the amount of time we are spending in our school in a single quarter just to assess students, and Mastery Connect’s inefficiency only compounds the problem.

This excessive time spent on Mastery Connect not only wastes instructional time but also skews the data. Students positively dread using Mastery Connect: on hearing that we are about to complete another CFA, they groan and complain. There is no way data collected under such conditions can be high quality. Add to that the overall lack of quality in the questions themselves, and I am left wondering why our district is spending so much time and money on a program of such dubious quality.

Quality of Questions

Most worrying for me is the baffling fact that many of the district-approved, supposedly vetted questions about a given standard have nothing whatsoever to do with that standard.

As an example, consider the following questions Mastery Connect has classified as having to do with text structure. The standards in question read:

  • RI-8.2 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.

Accurate Questions

Some of the questions are clearly measuring text structures:

Acceptable Questions

Other questions have only a tenuous connection at best:

The above question has, at its heart, an implicit text structure, but it is not a direct question about text structures; it simply asks students to read the text and figure out what comes first, second, and third.

The questions above deal with the purpose of the given passage. Text structure is certainly connected to a passage’s purpose, and that purpose will change as the text structure changes.

Unacceptable Questions

Some questions, however, have absolutely nothing to do with the standard:

The preceding question has nothing to do with text structures, and it is in no way connected to or dependent upon an understanding of text structures and their role in a given text. It is at best a DOK 2 question asking students to infer from a given passage, thus making it an application of RI-5.1; at worst it is a DOK 1 question connected to no standard.

The question above is another example that has nothing at all to do with text structure. Instead, this might be a question about RI-9.1 (Determine the meaning of a word or phrase using the overall meaning of a text or a word’s position or function) or perhaps RI-9.2 (Determine or clarify the meaning of a word or phrase using knowledge of word patterns, origins, bases, and affixes). It cannot be, in any real sense, a question about text structures, and it requires no knowledge of text structures to answer.

The preceding question has more to do with DOK 3-level inferences than text structure.

Multiple Question Banks

Teachers are told to use this bank and not that bank, but why should the question bank matter? If the program is worth the money the district is paying for it, we shouldn’t have to concern ourselves with which question bank a question comes from. A question about standard X should be a question about standard X. If one question bank is better than another, that speaks more to Mastery Connect’s quality control and vetting process than anything else.

Accessibility Concerns

Due to the poor UI design and the generally poor quality of images used within questions, students with visual impairments are at a severe disadvantage when using Mastery Connect. Images can be blurry and hard to read, and the magnification tool is inadequate for students with profound vision impairment.

Output Concerns

Incompatible Data

I have significant concerns about the efficacy and wisdom of using Mastery Connect as an assessment tool. It is bad enough that the program is so poorly designed it appears the developers set out to create a showcase of bad user interface design: working with MC is, from a practical standpoint, a nightmare. Add to that the fact that the data it produces for one assessment is often incompatible with the data from a different assessment, and one has to wonder why any district would choose to use this program, let alone pay for it. Neither of those two issues, though, is the most significant concern I have.

When comparing CFAs to benchmarks, we are comparing apples to hubcaps. The CFAs measure mastery of a given standard or group of standards, and the resulting data are presented not as a percentage correct but on a verbal scale: Mastery, Near-Mastery, Remediation. The benchmarks, however, do operate on a percentage correct, so comparing benchmarks and CFAs means comparing a verbal scale to a percentage. These data are completely incompatible with each other, which renders the entire exercise of data analysis moot. Analytics only makes sense when comparing compatible data; comparing a percentage to a verbal scale is literally comparing a number to a word. Any “insights” derived from such a comparison would be spurious at best. To suggest that teachers should use this “data” to guide educational choices is absurd. It would be akin to asking a traveler to use a map produced by a candle maker.
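To illustrate the mismatch concretely, here is a minimal sketch. The mastery labels and the percentage reflect the two data shapes described above; the cutoff values in the mapping are invented for illustration and are not Mastery Connect’s actual thresholds.

    # A sketch (Python) of why CFA and benchmark results cannot be compared directly.
    # The CFA reports a verbal scale; the benchmark reports a percentage correct.
    cfa_result = "Near-Mastery"     # categorical label from a CFA
    benchmark_result = 72           # percentage correct from a benchmark

    # Any comparison forces an invented mapping like the one below.
    # These cutoffs are illustrative assumptions, not Mastery Connect's thresholds.
    assumed_cutoffs = {"Mastery": 80, "Near-Mastery": 60, "Remediation": 0}

    implied_floor = assumed_cutoffs[cfa_result]
    trend = "growth" if benchmark_result >= implied_floor else "decline"
    # The "trend" depends entirely on the cutoffs we just made up,
    # not on anything the two assessments measured in common.
    print(f"CFA label '{cfa_result}' vs. benchmark {benchmark_result}%: apparent {trend}")

Whatever cutoffs one picks, the conclusion changes with the mapping rather than with the students, which is exactly the problem with treating these two data sets as comparable.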

Proprietary Data

While the issue of proprietary data is tangential to the larger problem of the district’s self-imposed dependency on Mastery Connect, it constitutes a significant concern about Mastery Connect’s data in general. Because a for-profit private company creates the benchmark questions, the questions themselves are inaccessible to teachers. The only information we teachers receive about a given question is its DOK level and its standard. Often, that data is relatively useless because of the broad nature of the standards.

I’ve already mentioned the amalgamation of standards RI-11.1 and RI-11.2:

  • RI-11.1 Analyze the impact of text features and structures on authors’ similar ideas or claims about the same topic.
  • RI-11.2 Analyze and evaluate the argument and specific claims in a text, assessing whether the reasoning is sound and the evidence is relevant and sufficient; recognize when irrelevant evidence is introduced.

If I see that a high percentage of students missed a question on RI-11, I have no clear idea of what the students didn’t understand. It could be a text structure question; it could be a question about evaluating a claim; it might be a question about assessing evidence; it could be a question about recognizing irrelevant evidence. This information is useless.

Occasionally, the results of a given question are simply puzzling. In a recent benchmark, there were two questions about standard 8-L.4.1.b, which requires that students “form and use verbs in the active and passive voice.” At the time of the benchmark, I had not covered the topic in any class, yet these were the results:

Question      P4 (English I Honors)    P5 (English I Honors)    P6 (English 8)
Question 1    72                       69                       36
Question 2    72                       76                       52

For the second question about active/passive voice, the results among the English 8 students were vastly better, especially in my inclusion class (period 7). Among the English I students, however, the results were consistently high despite our never having covered that standard. It would have been useful to see why students did so much better on one question than the other, given that no class had covered the standard.