
A year after OpenAI publicly released ChatGPT, the artificial intelligence language model that captured the world's imagination about the possibilities of AI, the European Union and the White House are working to establish standards and safety measures to address the risks of the technology's rapid proliferation. The Blueprint for an AI Bill of Rights, issued by the Office of Science and Technology Policy over the summer, followed by the recent Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence, places the onus on AI developers and companies to ensure their programs are transparent and trustworthy and cannot be used in ways that jeopardize citizens' privacy or safety.
But making a program that is capable of sifting through unimaginable volumes of data in seconds also explain what it's doing in a way that human users can understand is just the first of many challenges facing developers as they expand the applications of AI, according to Rosina Weber, PhD, an information science professor in Drexel University's College of Computing & Informatics who studies explainable artificial intelligence. Weber's research looks at ways AI technology can be designed for transparency in its decision making, which enables it to support users managing challenging problems.
She recently shared some of her insights on what it will take to achieve the goals of the executive order and to safely unlock the potential of artificial intelligence.
As the use of artificial intelligence programs becomes more widespread, why is it important for users to have an understanding of how it works?
It’s important for users to have a baseline knowledge of the technology primarily to avoid misuse and to avoid being caught or tricked by malicious threats. When the public knows the limits of technology, they will not be easily misled or fooled by scammers.
What things should a user know about an AI program before they decide to use it?
The same as with any other technology, it’s important for the user to understand their intended goal, whether the technology can be helpful, and what the consequences of using or trusting it might be.
For example, if you use a route-planning tool to provide a route for you to drive to work, you may want to know how often it updates the route based on traffic and how serious the consequences are if you arrive late. Or if you ask Google for someone’s address, you should know that it might be outdated before you send them mail.
Now that tools like ChatGPT are available, it is important that people understand that the tool behaves as if it could understand natural language, but it doesn't. It is just extremely fast at reusing sentences and making them look natural and eloquent. But those answers cannot be assumed to be true.
For example, I asked ChatGPT to create an assignment to evaluate a student’s comprehension of a chapter from a book. It created several questions of multiple types, such as true/false, narrative, problems, and multiple choice. Then I asked the same for a text that I had created, and it could not do a good job. The reason it did well with the book chapter is because that book has an accompanying bank of resources, which is also available online.
What does “transparency” mean when it comes to artificial intelligence?
The most frequent interpretation of transparency, also called interpretability, is the extent to which an AI method reveals its inner workings. Do you need to comprehend how a watch works to trust its time? Probably not. But if a bank that uses AI to assess loan applications rejects yours, you should ask for the reasons. The reasons for rejection may not come from the inner workings of the AI method; they may come from the data the bank used. But it's important that the program is transparent enough to show how it arrived at its recommendation.
It’s important to understand that data-oriented AI algorithms, also called machine learning, can only learn from the data they are provided with. If that data is inaccurate, incomplete, or fails to account for extenuating factors, then the program will internalize these shortcomings and its performance will be limited.
When we talk about AI “bias,” this is typically a reflection of bias in data collection, or decisions made by the institutions using the programs. AI algorithms do not add bias or prejudice, but they can amplify existing biases — so transparency is crucial for pinpointing where these problems are entering the system. The more transparent the AI method, the easier it is to guarantee it is safe, reliable, and secure.
What is a “black box” program?
The term "black box" typically describes models that are opaque or inscrutable with respect to their inner workings. It became popular as a description of deep learning networks, which can have hundreds of layers and millions, or even billions, of parameters. It is easy to grasp a simple equation with three variables; now imagine billions of them. Such models are called black boxes because it is not feasible for humans to process that volume of information, just as it is difficult for us to comprehend theoretical concepts that we cannot visualize and follow.
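To give a sense of the scale involved, here is a short sketch counting the parameters of a small, hypothetical fully connected network; the layer sizes are invented for illustration, and real deep networks multiply this figure into the billions.

```python
# Count the parameters of a small fully connected network.
# Layer sizes are arbitrary, chosen only to show how quickly
# parameters accumulate even in a tiny model.
layer_sizes = [784, 512, 256, 10]  # input, two hidden layers, output

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total += n_in * n_out + n_out  # weights plus biases per layer

print(total)  # 535818 parameters, even for this small network
```

Each parameter is one number to inspect; no human can trace how hundreds of thousands of them, let alone billions, jointly produce a single answer.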
To clarify, consider the opposite: models that are referred to as "interpretable." A simple model based on a chain of, say, 100 rules can be used for loan underwriting, and every decision it makes can be traced back to the rules that produced it.
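A minimal sketch of such a rule chain, with invented field names and thresholds, shows why this kind of model is interpretable by construction: the rule that fired is itself the explanation.

```python
# A hypothetical interpretable underwriting model: an ordered chain
# of if/then rules. The thresholds and field names are invented for
# illustration only, not drawn from any real lending policy.
RULES = [
    ("credit score below 580", lambda a: a["credit_score"] < 580, "reject"),
    ("debt-to-income above 45%", lambda a: a["dti"] > 0.45, "reject"),
    ("income at least 3x payment", lambda a: a["income"] >= 3 * a["payment"], "approve"),
]

def underwrite(applicant):
    # Walk the chain in order; the first rule that fires decides.
    for reason, condition, outcome in RULES:
        if condition(applicant):
            return outcome, reason  # the fired rule IS the explanation
    return "manual review", "no rule matched"

decision, reason = underwrite(
    {"credit_score": 550, "dti": 0.30, "income": 60000, "payment": 1500}
)
print(decision, "-", reason)  # reject - credit score below 580
```

A rejected applicant can be told exactly which rule caused the rejection, which is precisely what a black-box model cannot offer.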
Note that some algorithms that made the news under misleading claims of being black boxes, such as COMPAS, are considered black boxes not because of the AI method they use, but because it is proprietary and no information about it is given. So, a program may be considered “black box” based on the AI method, or other circumstances.
The Executive Order calls for developing standards to ensure that AI systems are "safe, secure and trustworthy." How are systems tested to ensure they meet that standard?
To guarantee that systems are safe, it is important that they be predictable; predictability is how safety can be guaranteed. This field is based on theoretical principles: the idea is to be able to guarantee that a result will not go above or below established thresholds.
These methods have advanced substantially, but they are constantly trying to catch up with the advances in AI models. Right now, it is still not possible to guarantee predictable behavior in large language models (LLMs) such as ChatGPT.
Is there a way that the programs could be used to spot deficiencies in their training data?
Absolutely. There is a class of explainability methods, called instance attribution, that produces the instances most relevant to a decision. Where feature-attribution methods indicate the contributions of individual features, instance attribution provides lists of training instances. These results can help spot errors in data labels and even exposure to adversarial attacks.
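One simple setting where instance attribution is exact is a nearest-neighbor classifier: the training instances that voted for a prediction are the explanation. The sketch below uses invented, synthetic data; a mislabeled or adversarial training point would surface directly in the returned list.

```python
# A minimal sketch of instance attribution with a nearest-neighbor
# classifier. The training data below is synthetic, invented purely
# for illustration.
def nearest_instances(x, training_data, k=3):
    # Rank training instances by squared distance to the query point.
    by_distance = sorted(
        training_data,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )
    return by_distance[:k]  # these instances ARE the explanation

train = [((1.0, 1.0), "approve"), ((1.2, 0.9), "approve"),
         ((5.0, 5.1), "reject"), ((4.8, 5.3), "reject")]

neighbors = nearest_instances((1.1, 1.0), train)
print([label for _, label in neighbors])  # ['approve', 'approve', 'reject']
```

Inspecting the returned instances lets a reviewer ask whether each one really deserves its label, which is how labeling errors in the training set get caught.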
Is there anything that can be done to foresee and prevent “alignment problems” — such as when parameters are not set in a careful/detailed enough way to generate responses in line with the values of the user/society (for example, providing an answer that suggests the user do something illegal)?
This would require a system to be aware of the local culture, norms, and laws. Such awareness has been proposed and devised as an additional step, not as something that would be part of the training data.
Most of the real-world challenges we discuss these days would require new architectures that combine commonsense or domain knowledge with data; without them, current models cannot do much better than they already do. The current best practice for fairness is to adopt it throughout the entire software development life cycle, so that ethical considerations are raised before it is too late.
Is it possible to make AI programs more accessible/transparent without slowing them down?
This depends on the type of data used and the amount of data available. It is possible to build a deep-learning architecture that follows the reasoning principles of case-based reasoning. Such an architecture has been shown to become more accurate as the reasons for its conclusions are aligned with domain knowledge from experts.
For example, a financial institution using a conventional deep learning model on anonymized data sold by the three main credit score companies does not even know how the model makes its decisions. Using a prototypical part network ("ProtoPNet"), a form of deep case-based reasoning, the institution can instead align its decisions with organizational principles, guarantee fairness, and know which decisions it is making.
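The core idea behind such prototype-based classification can be sketched very simply: a decision is scored by similarity to stored prototype cases, so each score traces back to a concrete, inspectable example. This is only a loose illustration of the principle, not ProtoPNet itself (which learns prototypical image parts inside a deep network); all vectors here are invented.

```python
# A hypothetical sketch of prototype-based (case-based) classification
# in the spirit of ProtoPNet. Prototype vectors are invented for
# illustration; a real system would learn them from data.
import math

PROTOTYPES = {
    "approve": [(0.9, 0.1), (0.8, 0.2)],  # prototypical approved cases
    "reject":  [(0.1, 0.9), (0.2, 0.8)],  # prototypical rejected cases
}

def classify(x):
    scores = {}
    for label, protos in PROTOTYPES.items():
        # Similarity = inverse distance to the closest prototype case.
        d = min(math.dist(x, p) for p in protos)
        scores[label] = 1.0 / (1.0 + d)
    best = max(scores, key=scores.get)
    return best, scores  # each score points back to an inspectable case

label, scores = classify((0.85, 0.15))
print(label)  # approve
```

Because the decision cites the prototype cases it most resembles, an institution can audit exactly which precedents drove each outcome.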
Reporters who would like to speak with Weber should contact Britt Faulstick, bef29@drexel.edu or 215.895.2617.

