A new tech tool uncovers ways organizations can eliminate bias during the hiring process.
By Judd B. Kessler and Corinne Low
A growing body of evidence suggests that hiring managers and recruiters display bias against underrepresented minorities. These findings have come from a research method called a "resume audit." The idea is simple.
Researchers send fake resumes that vary only in the candidate's name to a large sample of employers, revealing clues about the candidate's race and gender. A randomly selected group of employers receives a resume with a male or female name from one demographic group, while another group of employers receives a resume with a name from a different demographic group. Researchers then analyze differences in response rates between demographic groups.
A prominent example is a 2004 study of racial bias titled "Are Emily and Greg More Employable Than Lakisha and Jamal?" by Marianne Bertrand and Sendhil Mullainathan. The researchers asked whether the resume with the name Jamal got fewer voicemails inviting him for interviews than the same resume with the name Greg. Indeed, it did.
The resume audit method hasn't changed much since 2004. But despite its longevity and success at uncovering discrimination, this approach has a few major limitations. The first is that it requires deception. Hiring managers and recruiters do not know they are participating in a research study in which fake resumes are being passed off as real ones. It is hard enough to sort through hundreds or thousands, or even hundreds of thousands, of resumes to identify potential recruits. That task is made even more arduous when some of those resumes are fake.
The second limitation is that researchers can only observe the decision to "call back" a candidate, that is, to follow up with a candidate for an interview. But making it to the interview stage may not be a perfect measure of how much a firm likes a candidate. Since no one wants to waste time and resources pursuing a candidate who would not end up taking the job, calling a candidate back reflects both how desirable a candidate is and whether that candidate is realistically "gettable."
As a case in point, a recent resume audit study looked at the effect of unemployment on callback rates (here, employment status was varied on the resume instead of the name) and found that firms called back unemployed candidates at higher rates than employed ones. Presumably, experience has taught hiring managers that unemployed candidates are more likely to be responsive to job offers, and thus better targets for their recruitment energy.
The third limitation of the resume audit method is that researchers can only study the hiring practices of firms that respond to unsolicited resumes. Many firms hire through partnerships with schools or other feeder organizations, automatically reducing the sample size of potential study participants. Researchers cannot send too many fake resumes to one firm for fear of being discovered, and so cannot learn all that much from any single company.
But now, along with Colin Sullivan, a postdoctoral candidate at Stanford, we have developed a new approach to measuring firm preferences and detecting bias in hiring practices: incentivized resume rating (IRR). Rather than putting the interests of firms and researchers in conflict, IRR combines those interests. Employers are invited to evaluate a set of resumes that they know to be hypothetical. However, there's good reason for hiring managers and recruiters to evaluate these hypothetical candidates carefully: Their responses are used to match them with real candidates.
IRR sifts through piles of resumes using the data generated from each firm's ratings of 40 hypothetical candidates. It does this by pairing the hypothetical resumes with a machine learning algorithm that identifies which characteristics each firm is looking for in job candidates (for example, a prestigious summer experience, a high GPA, or specific majors) and then finds the best matches among available candidates. The more carefully hiring managers and recruiters evaluate the hypothetical resumes, the better the algorithm will be at finding real candidates who are likely to be a match.
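As a minimal sketch of the matching step described above (the weights, candidates, and feature names here are invented for illustration, not the study's actual model), each real candidate can be scored against the characteristics a firm's ratings revealed it values:

```python
# Hypothetical per-characteristic weights a firm's ratings might imply,
# e.g. recovered by fitting a model to its 40 resume ratings.
weights = {"gpa": 2.0, "prestigious_internship": 1.5, "cs_major": 1.0}

# Made-up real candidates described by the same characteristics.
candidates = [
    {"name": "A", "gpa": 3.9, "prestigious_internship": 1, "cs_major": 0},
    {"name": "B", "gpa": 3.6, "prestigious_internship": 1, "cs_major": 1},
    {"name": "C", "gpa": 3.8, "prestigious_internship": 0, "cs_major": 1},
]

def score(cand):
    # Weighted sum over the characteristics the firm values;
    # the candidate's name (and hence any race or gender cue)
    # never enters the score.
    return sum(weights[k] * cand.get(k, 0) for k in weights)

# Recommend candidates in descending order of predicted fit.
ranked = sorted(candidates, key=score, reverse=True)
print([c["name"] for c in ranked])  # prints ['B', 'A', 'C']
```

The point of the design is visible in `score`: only the characteristics the firm's own ratings flagged as valuable feed the ranking.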
This first implementation of IRR invited firms engaging in on-campus recruiting at the University of Pennsylvania to evaluate hypothetical resumes created from real resume components (GPA, major, internships, extracurricular activities, and skills drawn from the resumes of real Penn students) along with names that indicated race and gender. These components could be randomly combined in hundreds of millions of ways, allowing IRR to identify the influence of individual candidate characteristics on employer preferences. The responses were then used to recommend real Penn graduating seniors who fit each organization's needs.
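The "hundreds of millions" figure comes from simple combinatorics: the number of distinct hypothetical resumes is the product of the option counts for each component. With invented option counts (the study's actual counts are not given here), the arithmetic looks like this:

```python
import math

# Hypothetical counts of options per resume component
# (illustrative numbers only, not the study's).
options = {
    "name": 40,            # names signaling race and gender
    "gpa": 30,
    "major": 20,
    "internship": 50,
    "extracurricular": 30,
    "skills": 15,
}

# Distinct resumes = product of the option counts.
combinations = math.prod(options.values())
print(f"{combinations:,}")  # prints 540,000,000
```

Even modest per-component variety multiplies into an enormous design space, which is what lets randomization isolate the effect of any single characteristic.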
What did the research find? Through surveys, the firms recruiting at Penn expressed a seemingly genuine desire to hire diverse candidates. And yet, IRR identified ways they might be handicapping their own efforts.
- Firms hiring in STEM fields rated candidates with female and minority names significantly lower than candidates with white male names. In fact, a female or minority candidate with a 4.0 GPA received the same rating as a white man with a 3.75 GPA. In addition, female and minority candidates received less credit for prestigious internships from firms in all fields. The results suggest that these biases are most likely subconscious and become more pronounced as employers grow fatigued from rating many resumes in a row.
- The IRR diagnostic tool asked about a firm's interest in hypothetical candidates and the likelihood those candidates would accept a job if offered. IRR found that firms expect women to be harder to hire.
- IRR findings also show that firms were not actively pursuing women and minorities. At best, firms looking to hire in the social science and business fields displayed no gender preference for diverse candidates, while firms hiring STEM candidates displayed a bias against them.
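The GPA-equivalence framing in the first finding above can be reconstructed with simple arithmetic (the coefficients below are invented for illustration, not the study's estimates): if ratings rise by some amount per GPA point, a demographic rating penalty can be converted into GPA units by dividing the two.

```python
# Illustrative numbers only: suppose a fitted model implies each full
# GPA point is worth 2.0 rating points, and female or minority names
# carry a 0.5-rating-point penalty among STEM-hiring firms.
rating_points_per_gpa_point = 2.0
demographic_penalty = 0.5

# Express the penalty in GPA units: the extra GPA a penalized candidate
# needs in order to receive the same rating.
gpa_equivalent = demographic_penalty / rating_points_per_gpa_point
print(gpa_equivalent)  # prints 0.25, i.e. a 4.0 rated like a 3.75
```

This is how a statement like "a 4.0 from a female or minority candidate rates like a 3.75 from a white man" is derived from rating data.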
Given this potential mismatch between the goals of the firm and the reality in the trenches, IRR can deliver big benefits for organizations interested in investigating, and improving, their hiring practices. IRR could serve as a useful diagnostic for firms internally checking whether individuals display subconscious bias in their evaluation of resumes.
In particular, firms could have their hiring managers use the diagnostic tool and then work with researchers to analyze the resulting IRR data to identify whether bias is present. But they could also go beyond diagnosis and use the tool to help correct any bias, by using the preference data elicited by IRR to screen real candidates in a way that is blind to race and gender. As was done in the study at Penn, the tool could identify which characteristics of candidates (such as education, work experience, or skills) the firm particularly values and then screen real candidates for those characteristics while ignoring names and any other indicators of race and gender that might trigger bias.
Even when there is no evidence of bias, IRR can be a useful tool for firms to learn whether leadership priorities have been properly communicated to hiring managers and recruiters. Having both executives and front-line staff members evaluate resumes using the IRR diagnostic tool allows the firm to identify whether the two groups place the same value on candidate characteristics such as education, work experience, or skills. Finding that the groups' preferences diverge would allow the firm to improve alignment in its recruiting.
IRR allows a peek under the hood at firms engaging in on-campus recruiting. These big, prestigious firms value diversity, but major roadblocks remain.
Corinne Low and Judd B. Kessler are professors of business economics and public policy at the University of Pennsylvania's Wharton School.