The Black Box Problem: Tackling the explainability challenge in GenAI

Generative AI (GenAI) models like GPT-4o and DALL-E 3 have captured the public's imagination with their ability to create human-like text, images, and even code from simple prompts. However, a major problem comes with these powerful AI systems: they are essentially "black boxes."

The Black Box Problem

Most state-of-the-art GenAI models are extremely complex, with millions or even billions of parameters connecting their artificial neurons. It is incredibly difficult, if not impossible, to fully understand how these models work under the hood and arrive at their outputs. This lack of explainability poses serious challenges. For instance, in 2020, a healthcare algorithm was found to be biased against Black patients, producing less accurate diagnostic results for them than for white patients.

The opacity of these AI systems also makes them vulnerable to manipulation. Researchers have shown that it's possible to subtly change the inputs to make the models generate harmful or deceptive outputs. Without transparency, it's difficult to detect and defend against these kinds of adversarial attacks.

Approaches to Explainable AI

To tackle the black box problem, organisations interested in creating transparent, secure and ethically robust GenAI solutions can adopt the following strategies:

Interpretable Model Design

Building GenAI models from the ground up to be more transparent, using simpler machine learning techniques or incorporating human-understandable rules. For example, a generative model that creates product descriptions by applying a set of predefined templates rather than a complex neural network.
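A template-based generator of this kind can be sketched in a few lines. The templates and product fields below are hypothetical examples, not a real product catalogue; the point is that every output traces back to one template and a fixed set of fields, so the system's behaviour is fully inspectable.

```python
import random

# Hypothetical, human-readable templates -- the entire "model".
TEMPLATES = [
    "The {name} is a {adjective} choice for {use_case}.",
    "Looking for {use_case}? The {name} offers a {adjective} option.",
]

def describe(product: dict) -> str:
    """Fill a predefined template with product attributes.

    Unlike a neural text generator, the reason for every word in the
    output can be pointed to: either a template or an input field.
    """
    template = random.choice(TEMPLATES)
    return template.format(**product)

print(describe({"name": "AeroMug", "adjective": "lightweight",
                "use_case": "daily commuting"}))
```

The trade-off, of course, is expressiveness: such a system can only ever say what its templates allow, which is exactly why it is easy to audit.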

Explainable AI Techniques

To make existing black box models more understandable, researchers develop methods that identify which input features (like words or data points) are most important for the model's decisions or show similar examples from the training data that influenced the model's output.

  • LIME (Local Interpretable Model-agnostic Explanations): LIME explains predictions by approximating the original model with a simpler, interpretable one, highlighting which parts of the input were most influential.
  • SHAP (SHapley Additive exPlanations): SHAP values provide a clear picture of how much each feature contributes to the final prediction, based on game theory principles.
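As a toy illustration of the Shapley idea behind SHAP (computed by brute force here, not with the `shap` library), exact Shapley values can be worked out for a hypothetical model with only a few features: average, over all orderings, the change in the model's output when each feature is switched from a baseline value to its actual value.

```python
from itertools import permutations

def model(x):
    # Hypothetical toy model: features 0 and 1 interact, feature 2 is unused.
    return 2 * x[0] + 3 * x[1] + 4 * x[0] * x[1]

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature ordering."""
    n = len(x)
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        current = list(baseline)
        for i in order:
            before = model(current)
            current[i] = x[i]            # reveal feature i
            phi[i] += model(current) - before
    return [p / len(perms) for p in phi]

phi = shapley_values(model, [1, 1, 1], [0, 0, 0])
print(phi)  # [4.0, 5.0, 0.0] -- the interaction's credit is split evenly
```

Note that the attributions sum to the difference between the prediction and the baseline (9 − 0), and the unused feature correctly receives zero credit; this enumeration is only feasible for a handful of features, which is why the real SHAP library uses approximations.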

For example, using LIME, you might discover that a model generating news article summaries relies heavily on the article's title and first paragraph. This helps you understand which parts of the article are most influential in the summary generation process.
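The perturb-and-compare intuition can be sketched without the `lime` library using a simplified leave-one-out variant: remove one word at a time and measure how much a scoring function drops. The scoring function below is a hypothetical stand-in for a black-box model; real LIME instead fits a weighted linear model over many random perturbations.

```python
def relevance_score(words):
    # Hypothetical stand-in for a black-box model's output score.
    keywords = {"breakthrough": 3.0, "vaccine": 2.0, "announced": 1.0}
    return sum(keywords.get(w.lower(), 0.1) for w in words)

def explain(sentence):
    """Attribute importance to each word by deleting it and re-scoring."""
    words = sentence.split()
    base = relevance_score(words)
    importance = {}
    for i, w in enumerate(words):
        perturbed = words[:i] + words[i + 1:]
        importance[w] = base - relevance_score(perturbed)
    return sorted(importance.items(), key=lambda kv: -kv[1])

for word, weight in explain("Scientists announced a vaccine breakthrough today"):
    print(f"{word:12s} {weight:+.1f}")
```

Running this ranks "breakthrough" and "vaccine" as the most influential words, mirroring how LIME would surface the title and first paragraph as the drivers of a generated summary.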

Proactive Transparency

Rather than explaining the model itself, focus on disclosing the data, assumptions, and processes used to develop and deploy the AI system. For instance, providing detailed documentation about the dataset used to train a generative model for writing product reviews.
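One lightweight way to operationalise this kind of disclosure is a structured "model card" published alongside the system. Every field and value below is a hypothetical placeholder, not documentation of a real model:

```python
# A minimal sketch of proactive transparency: a machine-readable model
# card describing the data and assumptions behind a generative model.
model_card = {
    "model_name": "review-writer-v1",  # hypothetical
    "intended_use": "Drafting product reviews for human editing",
    "training_data": {
        "source": "Licensed retail review corpus",  # hypothetical
        "size": "1.2M reviews",
        "known_gaps": ["few non-English reviews", "skew toward electronics"],
    },
    "limitations": ["may overstate product quality",
                    "not evaluated for regulated product categories"],
    "human_oversight": "All outputs reviewed by an editor before publishing",
}

def render(card, indent=0):
    """Print the card as readable documentation."""
    for key, value in card.items():
        if isinstance(value, dict):
            print("  " * indent + f"{key}:")
            render(value, indent + 1)
        else:
            print("  " * indent + f"{key}: {value}")

render(model_card)
```

Because the card is plain data, it can be versioned alongside the model and checked in review, so the disclosure stays in step with what was actually deployed.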


Human-in-the-Loop Oversight

Having human experts work alongside the AI system, providing oversight, asking questions, and challenging the model's reasoning adds a layer of interpretability. For example, a team of editors reviewing and validating the outputs of a generative model writing marketing copy before publishing.
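Such an editorial gate can be sketched as a simple review queue in which nothing is published without an explicit editor decision. The generator output and editor notes below are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewItem:
    """A piece of generated copy held for human review."""
    draft: str
    approved: bool = False
    notes: list = field(default_factory=list)

def review(item: ReviewItem, editor_decision: bool, note: str = "") -> ReviewItem:
    """Record an editor's decision; drafts stay unpublished by default."""
    item.approved = editor_decision
    if note:
        item.notes.append(note)
    return item

queue = [ReviewItem("Introducing our lightest running shoe yet.")]
review(queue[0], editor_decision=True, note="Checked weight claim")
published = [item.draft for item in queue if item.approved]
print(published)  # only approved drafts are published
```

The audit trail of notes is as valuable as the gate itself: it records which claims a human actually checked, which supports the accountability this section describes.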

Ethical and Social Implications

The ethical implications of the black box problem are significant. AI systems that lack transparency can perpetuate biases and discrimination.

In May 2023, Johns Hopkins Bloomberg School of Public Health held a Q&A about how healthcare algorithms and AI can help and harm, featuring Kadija Ferryman, PhD, assistant professor of Health Policy and Management and core faculty at the Berman Institute of Bioethics. Among other insights on the subject, one exchange illustrates the ethical implications of the black box problem particularly well (the following excerpt is reproduced verbatim):

"Q: You and your colleagues recently wrote about an example of an algorithm that's widely used when patients report having pain.

A: We wrote about the NARX score—a risk score used to assess an individual's risk for opioid misuse. We raised questions like, what data points are used to develop this risk score? Could some of those data points be unintentionally making that algorithm biased against certain groups? We wrote about wanting to be cautious about a tool that sounds, on the surface, very necessary and helpful, but could have biases embedded within. 

Many algorithms are proprietary. We don't know the exact formulation of the NARX score. But based on other evidence we've gathered, a data point like criminality, for example, is likely to be included. There's racial bias embedded in data on criminal history because of histories of over-policing in certain communities. Often, because of the black box nature of algorithms, a clinician may not know the reasons behind the NARX score their patient had received, and the patient may not know how the algorithm determined their risk level. "

The Road Ahead

While these approaches offer promising solutions, tackling the explainability challenge in GenAI is complex. Businesses must overcome technical limitations, data governance gaps, regulatory uncertainty, and talent shortages to make their AI systems more transparent and accountable. By prioritising explainability, organisations can unlock the full potential of generative AI to augment and empower human intelligence while ensuring these systems remain safe, fair, and trustworthy.

If reading the above has you thinking about how complex it is to integrate GenAI into your organisation successfully, we are here to help. Calls9's AI Fast Lane Programme addresses these challenges by offering tools and expertise to overcome technical and data governance issues. The programme includes robust data management frameworks, ensuring high-quality, unbiased data, which is crucial for transparency.

We provide detailed documentation and proactive transparency, aligning with best practices in explainable AI. Our human-in-the-loop approach integrates expert oversight, enhancing accountability and reducing bias.

We also offer compliance support and training programmes, addressing regulatory uncertainty and talent shortages. This ensures that your organisation can achieve transparent and accountable AI systems.

By partnering with Calls9, you can implement successful AI systems, making them safer, fairer, and more trustworthy. For more information, check out the AI Fast Lane programme and book a free consultation with our experts.