Sturm and Guinier present a persuasive case for why testing fails as a fair and credible gauge of merit. In their new approach to selection and inclusion in the workplace, they focus on actual performance on the job as “the best evidence of the ability to perform.” They propose that a period of demonstrating competence and growing into a position should give women and minorities greater opportunities to display their merits and should avert the biases, scars, and resentments that surround tests. At the Center for Gender in Organizations, we study the workplace to see how the dynamics of difference play out for women and minorities, in ways that impact both equity and the effectiveness with which work gets done. We explore what the experience of the “extremely interactive and extended selection process” might be like and how it might still be a test, though a test of a different sort.

Consider the case of “Bernice,” the woman who, as Sturm and Guinier explain, became general counsel of a corporation after sharing the job with two other lawyers for nine months on a temporary basis. She networked with top executives who might not otherwise have seen her in action, showed her mettle by handling crises effectively, and discovered and honed her skills as a team player. To her surprise, she made the “final cut,” and was offered the position.

This vignette has several features of interest. All eyes are on Bernice. This pressure might not bring out the best in some candidates (indeed, the fact that Bernice does not really perceive this period as an audition and is surprised at her “win” might be a significant reason for her successful performance). She has to prove herself in a difficult situation in which the criteria are unclear and emergent. The audition has some high stakes. In essence, “it’s still a test.”

And Bernice could trip up on this test in a number of ways. Knowing that she is being watched, she may be concerned about performance, not just her own but also that of the group she manages. In our research, we have noticed how being “under the microscope” can cause people to micromanage and tightly control work. Bernice might engage in this behavior, unrepresentative of her true style, as a form of self-protection. But it could be read by others as evidence that she lacks the delegation skills critical for a leader. She will likely be conservative in her actions and avoid taking risks because she is being watched closely. Thus, she might be rated as failing to demonstrate the risk-taking that is regarded as the mark of leadership.

It’s also possible that Bernice will actually demonstrate the special kinds of skills that she can bring to the job, based on her different socialization and standpoint. For example, if she does not know that she is being observed for a promotion, she might practice her own style of coach-like and team-oriented leadership. But top executives, especially if they are a relatively homogeneous group, might not see all the benefits this style brings to her work, and might be looking for a traditional top-down leadership style instead. In addition, the rules of the game can shift along the way. If there is ambiguity about what constitutes high-level job performance, the pressures to conform spill over into the dress, speech, and demeanor that the contenders under scrutiny are expected to display. If Bernice does not play golf, are her job prospects subtly diminished during this extended test?

Finally, after attaining a position, a candidate from a different background might start to act more like herself, and perhaps make changes that enable the organization to appreciate different and needed types of merit or encourage other women and minorities. Or she might feel compelled to follow the norms of her predecessors. For a lone member of a minority group, the test is never really over. And if the two men who lost this contest to Bernice still bear the familiar resentment that she got the job because she was a woman, then the scrutiny and conformity pressures might feel even more intense. Under these circumstances, it will not be easy to change the organization from the top down.

Rethinking the use of tests is at the heart of this provocative and important essay. We agree, and suggest rethinking even the subtle, elongated, and high-pressure tests that might arise from the proposed alternative. All such tests are still anchored in a world of ranking, contests, and pyramid-shaped hierarchies, with fewer and fewer plum jobs as one approaches the top. Making the tests fairer and more closely linked to actual performance is a first-order fix.

A deeper fix is to rethink tests in all their various forms and the very shape and assumptions of structures that require such testing and selection games. For example, at Bernice’s organization, the nine-month period when three people shared the general counsel job suggests a promising alternative. Job sharing and teams of equals diminish some of the pressures to pick one winner. They enable a mix of different work styles and types of merit to be developed and appreciated without the undercurrent of competition. Job sharing and teamwork can also promote other important ends, like giving employees more time to spend in their family and civic life. The prospects for such changes should not be oversimplified. But they suggest a method of looking at how work gets done, taking the counterproductive politics of competition into account, and using opportunistic moments, like the formation of the three-person general counsel office prompted by a merger, as triggers for small wins that could lead to bigger organizational changes.