when algorithms fail: building safer AI with human intelligence

It's 11:23 PM, and photojournalist Marie Dubois is back home in Barcelona after a long, exhausting day covering climate protests in Brussels. Tired as she is, she uploads her latest images to the platform, excited to share what she saw and heard. The pictures she captured give her hope. It is, after all, democracy in action: protesters in a standoff with the police.

But before she can even type a caption, a message fills her screen: the content has been removed. Why? Because the platform's AI judged the content to be sensitive. It detected raised weapons (83% confidence), violent imagery (91% confidence), and distressing content (96% confidence). Case closed. Post deleted.

But here's what the algorithm couldn't see. Marie carries a verified journalist's badge. The image's metadata shows it was taken in an officially sanctioned press area. And this image, or ones like it, would appear in major news outlets within hours.

At 11:51 PM, Marie appeals the decision. This time, a human moderator named Sofia in Dublin reviews the case. She sees the context that AI missed. Within minutes, the post is restored with a sensitivity screen. The public gets to see what happened. Truth survives the algorithm.

While this is an imagined scenario, it shows what is at stake: without human-in-the-loop systems, the story would have ended at the deletion.

Let us be real here. AI can make fast and mostly accurate decisions in milliseconds, but it can and will miss nuance, especially cultural nuance it has not been trained on. And, as the scenario above shows, it can fail to understand and correctly interpret context. That is why human-in-the-loop (HITL) systems offer an optimal balance between scalability and accuracy.

As AI becomes increasingly embedded in workflows, the HITL approach becomes more relevant. In this approach, humans are embedded in the workflow to participate in decision making, supervision, and correction. In sensitive areas like content moderation, HITL fits naturally: AI performs rapid initial filtering, while ambiguous or risky decisions are routed to humans for context-sensitive review. The key principle is leveraging AI's speed and scalability while ensuring human insight preserves nuance, ethics, and accountability.


why AI needs a second opinion

HITL is increasingly used in AI deployments because pure automation often falls short on sensitive decisions. The numbers tell the story. A study by the Center for Democracy & Technology (CDT) found that AI moderation systems have a 5–10% error rate in identifying harmful content, producing false positives that translate into incorrect removals and false negatives that translate into missed threats. It is also notable that 25% of AI-flagged content disputes are overturned by human reviewers.

Here are some of the reasons AI gets it wrong:

  • context blindness: AI struggles with sarcasm, cultural nuances, and situational appropriateness.
  • edge cases: AI also struggles to differentiate news reporting from glorification, or educational content from harmful content.
  • algorithmic bias: Bias can creep in through the data the AI has been trained on.

the HITL framework

The National Institute of Standards and Technology (NIST) describes modular workflows for implementing HITL: a combination of supervised learning, active feedback, and human verification to minimize bias and errors. Stanford's adaptive HITL systems encourage active human involvement whenever the algorithm encounters uncertainty or high-stakes decisions.


the four-step framework

  • AI screening: The automated first pass filters out obvious violations and clearly acceptable content. This can be implemented with a trigger system based on confidence thresholds (a minimal sketch follows this list). For example:
      ◦ confidence >90%: the system acts on its own
      ◦ confidence 70–90%: additional information or context is sought
      ◦ confidence <70%: human intervention is triggered

  • human review: Flagged ambiguous cases are escalated to trained moderators.
  • expert oversight: Complex or precedent-setting cases are reviewed by senior specialists.
  • feedback loop: Human decisions retrain AI models for improved accuracy.
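
To make the screening step and the feedback loop concrete, here is a minimal Python sketch of confidence-based routing. The threshold values, names, and the retraining queue are illustrative assumptions, not a reference implementation; real systems tune thresholds per platform and per content category.

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative thresholds matching the framework above (assumed values).
AUTO_ACTION_THRESHOLD = 0.90
HUMAN_REVIEW_THRESHOLD = 0.70

class Route(Enum):
    ACT_AUTOMATICALLY = "act"       # >90%: the system acts on its own
    GATHER_CONTEXT = "gather"       # 70-90%: seek additional signals
    ESCALATE_TO_HUMAN = "escalate"  # <70%: a trained moderator reviews

@dataclass
class Flag:
    label: str         # e.g. "violent_imagery"
    confidence: float  # model confidence in [0, 1]

def route(flag: Flag) -> Route:
    """Step 1: route an AI verdict by its confidence score."""
    if flag.confidence > AUTO_ACTION_THRESHOLD:
        return Route.ACT_AUTOMATICALLY
    if flag.confidence >= HUMAN_REVIEW_THRESHOLD:
        return Route.GATHER_CONTEXT
    return Route.ESCALATE_TO_HUMAN

# Step 4, the feedback loop: every human decision is stored as a
# labelled example so the model can later be retrained on its mistakes.
retraining_queue: list[tuple[Flag, str]] = []

def record_human_decision(flag: Flag, decision: str) -> None:
    retraining_queue.append((flag, decision))

# The 83% "raised weapons" flag from the opening scenario would be
# routed to context gathering rather than automatic removal.
print(route(Flag("raised_weapons", 0.83)).name)  # GATHER_CONTEXT
```

Routing by confidence is only the entry point: the context-gathering branch is where signals like Marie's press credentials and the image's metadata would be consulted before any action is taken.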

best practices for implementation

Building an effective HITL system requires more than plugging humans into an automated workflow; it takes intentional design and ongoing commitment.

  • define clear escalation criteria: Establish measurable confidence thresholds that determine when a case moves to human review.
  • invest in moderator training: Ensure consistency and protect moderator wellbeing.
  • build diverse review teams: Include multiple perspectives to reduce bias.
  • create robust guidelines: Maintain living documents that evolve with emerging challenges.
  • monitor and audit: Regularly review both AI and human decision quality (a minimal audit sketch follows this list).
  • offer a transparent appeals process: Allow users to challenge decisions.
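
As one example of the monitor-and-audit practice, the sketch below computes the share of appealed AI decisions that human reviewers overturn. The alert level is a hypothetical value loosely tied to the ~25% overturn figure cited earlier, not an industry standard.

```python
from dataclasses import dataclass

@dataclass
class AppealOutcome:
    case_id: str
    ai_decision: str     # e.g. "remove"
    human_decision: str  # e.g. "restore_with_screen"

def overturn_rate(appeals: list[AppealOutcome]) -> float:
    """Share of appealed AI decisions that human reviewers reversed."""
    if not appeals:
        return 0.0
    overturned = sum(a.human_decision != a.ai_decision for a in appeals)
    return overturned / len(appeals)

# Hypothetical alert level: a sustained overturn rate well above the
# ~25% figure cited earlier suggests threshold or training-data drift.
ALERT_LEVEL = 0.25

def audit(appeals: list[AppealOutcome]) -> str:
    rate = overturn_rate(appeals)
    status = ("ALERT: review thresholds and training data"
              if rate > ALERT_LEVEL else "within expected range")
    return f"overturn rate: {rate:.0%} ({status})"
```

A rising overturn rate is exactly the kind of signal the feedback loop in the four-step framework should feed back into model retraining.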

The verdict is in: HITL systems deliver the balance between speed and accuracy that sensitive decisions demand. Organizations that want to scale while building trust must choose this path to grow. As more governments roll out regulatory frameworks like the EU AI Act, HITL is likely to become a compliance necessity rather than merely a competitive advantage. Industries handling sensitive decisions, such as social media platforms, e-commerce marketplaces, healthcare providers, and financial services, are leading the charge as they recognize that responsible AI is not just good ethics; it is good business.

The AI story that is being written today is not about choosing between humans and machines. It is about designing systems where both thrive together.


how Infosys BPM can help

Infosys BPM's Trust and Safety Services help enterprises protect users, assets, and brand reputation through comprehensive content moderation and proactive threat management solutions. From policy design to AI-powered detection and human oversight, Infosys BPM brings proven frameworks across the e-commerce, gaming, media, BFSI, healthcare, and travel sectors to ensure compliance and foster user confidence.