Google’s Gemini AI Raises Accuracy Concerns with New Rating Guidelines

Generative AI technologies, such as Google’s Gemini, may appear magical, but developing these systems requires a significant amount of human labor. Companies such as Google and OpenAI employ teams of “prompt engineers” and analysts who review AI responses to improve accuracy. However, a new Google policy is raising questions about Gemini’s credibility, particularly on sensitive topics such as health care.

According to TechCrunch, Google has directed contractors working on Gemini to evaluate AI responses even when the topics fall well outside their expertise. These contractors work for GlobalLogic, which is owned by Hitachi, and rate responses on criteria such as truthfulness. Previously, contractors could skip tasks for which they lacked the necessary qualifications, such as reviewing technical or scientific material.

The previous policy allowed contractors to skip tasks if they lacked “critical expertise” in areas such as coding or medicine. A recent change, however, requires contractors to rate all prompts regardless of their knowledge. The new guidelines direct them to evaluate the parts of the response they understand and to leave a note if they lack expertise in the subject.

The change has raised concern among several contractors. Evaluating technical responses in specialist areas, such as rare diseases, without sufficient understanding could lead to inaccurate ratings. One contractor questioned the move, writing, “Wasn’t skipping meant to increase accuracy by assigning tasks to more qualified people?”

Under the new rules, contractors can skip a task only if information is missing (such as an absent prompt or response) or if the task contains harmful content that requires special clearance to review.

Critics argue that this approach may undermine Gemini’s ability to deliver reliable information, especially in critical sectors such as health care. As of now, Google has not responded to TechCrunch’s request for comment on the policy change.

This article is based on reporting by Charles Rollet for TechCrunch on December 18, 2024. You can check out the full article here.

I’m Voss Xolani, and I’m deeply passionate about exploring AI software and tools. From cutting-edge machine learning platforms to powerful automation systems, I’m always on the lookout for the latest innovations that push the boundaries of what AI can do. I love experimenting with new AI tools, discovering how they can improve efficiency and open up new possibilities. With a keen eye for software that’s shaping the future, I’m excited to share with you the tools that are transforming industries and everyday life.