There are so many examples of artificial intelligence going rogue, misbehaving, or simply hallucinating that any AI application developer must pause before deployment. At the highest level, AI failures can be easy to spot. Ask almost any generative AI model to create an image of a clock and it will reliably produce a timepiece showing 10:10, likely because that pose dominates the advertising images in its training data; generative AI today cannot reason about or calculate time. In the same vein, it struggles to render hands and fingers and cannot differentiate left-handedness from right. These small discrepancies do not necessarily affect a business's bottom line, but they should raise warning flags for organizations looking to deploy AI models in customer-facing or revenue-generating environments.
Over the coming weeks we’ll explore a few other instances of AI trust failures and highlight ways in which the Tumeryk AI Trust Score™ could have been deployed proactively to mitigate many of these inherent AI issues.
Let’s start with a fairly straightforward and more easily mitigated AI implementation: using AI to identify the best candidates for job openings. Amazon, one of the world’s largest employers, attempted to use AI to sort through thousands of job applicants and provide a first pass on candidates to speed the hiring process. In 2014, the company developed an AI-powered recruiting tool that scored candidates on a star rating system and was trained on 10 years’ worth of résumés. However, that training data skewed heavily toward men. As a result, the system downgraded women, résumés referencing women’s studies, and even graduates of women’s colleges.
Amazon insisted all along that it did not rely solely on the system to vet candidates, and it did attempt to neutralize the system’s bias. All to no avail: as Reuters reported, Amazon had discontinued the project by 2018.
This is a clear example of an AI failure that could have been identified and mitigated by an AI Trust Score™. A Trust Score would have quickly surfaced the biases in the model and provided the mitigation capabilities to resolve them before deployment.
Tumeryk offers an AI Trust Score™, modeled after the FICO® Credit Score, for enterprise AI application developers. This tool helps identify underperforming AI models and establishes automated guardrails to assess, monitor, and mitigate these models before they affect consumers or the public. The Tumeryk AI Trust Score™ is the ultimate in AI Risk Assessment, Mitigation, and Management.