Q+A: Have AI Chatbots Earned Our Trust?

AI chat prompt

Millions of people are using AI-driven large language model (LLM) programs, like ChatGPT, Gemini, LLaMA, Copilot and Claude, for everything from punching up writing, to fact-checking stories, to writing computer code. In a very short period of time, these programs have taken on many of the functions of trusted personal assistants — even taking on data and cybersecurity tasks. But have the programs — and the companies behind them — done enough to earn our trust? That’s the subject of a recently published survey of research by a group of security and privacy researchers from Drexel University’s College of Computing & Informatics.

Looking at “the good, the bad and the ugly” when it comes to using LLMs for enhancing privacy and security, the research suggests that while these programs have the potential to be helpful for functions like reviewing computer code for vulnerabilities and supporting cybersecurity monitoring efforts, they likely have not faced as much scrutiny as other privacy and security technology at this point and should be used with caution.

“On the positive side, LLMs have significantly contributed to enhancing code and data security, while their versatile nature also opens the door to malicious applications,” they wrote.

The team, led by Yue Zhang, PhD, and Eric Sun, PhD, who head up the College’s Security and Privacy Analytics Laboratory, reviewed 281 papers on the subject of LLMs, privacy and security and noted a majority of them (144), mostly in the past year, are focused on vulnerabilities and weaknesses within LLMs and the security and privacy risks they could pose — suggesting that this is an area of growing concern.

Zhang, recently shared some insights with the News Blog about LLM security and what people should know about the technology before using it.

What is the biggest misconception users have about the security of their personal information and the information they are sharing when using LLMs?

There are a number of misconceptions or deficiencies of understanding that I think people have when they use large language models (LLMs). Here are a few of them:

  • Lack of awareness about data usage: Users may not fully understand how their data is used to train and improve large language models. This data can include sensitive information that, if improperly handled, could lead to privacy violations. Many users believe that their data is secure as long as vendors do not share or steal it. However, third parties can launch various attacks using the model. For example, a membership inference attack (MIA) can identify whether a specific data record was used to train the model. If your data is used to train the model, an MIA could potentially reveal this, compromising your privacy. Google’s updated privacy policy doubles down on using your data for training AI.
  • Ignorance of third-party access: Users may not realize that their data can be accessed by third-party entities, either through partnerships, data sharing agreements, or security vulnerabilities. This can expose their personal information to a wider audience than intended. For example, some LLM services enable plugins developed by third parties, not by the LLM providers themselves. These third-party providers could be malicious and might steal your data.
  • Underestimation of data aggregation risks: Users often do not realize that even seemingly innocuous data points can be aggregated and analyzed to build comprehensive profiles about them. This aggregated data can lead to significant privacy risks. For example, over time, multiple mentions of locations can reveal the user’s daily routines, favorite spots, home address and workplace. This comprehensive location profile can be used for targeted advertising, but it could also be exploited for stalking or other malicious purposes.

Overlooking the value of their data: Users may underestimate the value of their data to attackers, not recognizing that even mundane information can be valuable in the wrong hands. For example, users frequently share their birthdays on social media and online forms. As such, it is normal for them to share their birthdays to LLMs. However, birthdates are crucial for identity verification and can be used to answer security questions, reset passwords, or commit identity theft.

How concerned should we be about the security of information being shared with an LLM in the course of using it? Should users assume that any information input into an open-source LLM will become public — as it could be used to train the LLM and could subsequently surface as an output to someone else’s query and/or be extracted by the company that created the LLM?

First of all, privacy concerns always exist, whether the LLM is open-sourced or not. Even for closed-source models, attackers can use membership inference to obtain sensitive information. Currently, if the data is truly trained on users’ inputs, there is no perfect solution to this issue, although there are multiple methods to mitigate it.

The key to addressing this concern is for vendors not to use users’ data, and for users to maintain high standards when inputting their data. Users should exercise caution and assume that any information shared with an LLM could potentially be used for training purposes and might not remain private. By following best practices and being aware of the data policies of the LLM services they use, users can better protect their information. For highly sensitive data, alternative secure methods of handling should be considered.

Given LLMs’ constant demand for new training data, how easy would it be to manipulate the outputs of an LLM through techniques like data poisoning, or injecting malicious training data?

Manipulating the outputs of an LLM through techniques like data poisoning or injecting malicious training data can be a serious concern, given the LLMs’ constant demand for new training data. 

If an attacker has access to the training pipeline or can inject data into the training set, data poisoning becomes relatively straightforward. Open-source models or models with open data contribution systems are more vulnerable. However, LLM providers with strict controls, security measures, and closed datasets make it harder for malicious data to be introduced. The good news is that LLMs are typically trained on massive datasets, so a small amount of poisoned data might not have a significant impact unless it is highly targeted, or the proportion of poisoned data is large enough to influence the model.

What are some techniques being employed to improve/ensure the quality of training data?

There are multiple types of techniques throughout the data lifecycle. During data collection, it is crucial to source data from diverse and reliable sources, and verify the integrity of the data (e.g., make sure that there are no attackers who modify the data). In the preprocessing stage, removing noise, normalizing text and deduplicating entries are essential. Accurate data annotation and labeling can be achieved through expert annotators and crowdsourcing platforms. Data augmentation techniques, such as synthetic data generation and balancing, help maintain diversity. Automated quality checks using language models and statistical methods detect grammatical errors and inconsistencies.

What other privacy and information security concerns should folks be aware of when using LLMs?

The new security concern we should really care about is jailbreak. LLM jailbreak refers to techniques used to bypass the safety and ethical guidelines built into large language models, allowing them to generate harmful, inappropriate, or restricted content.

This concept is relatively new and impactful because LLMs are designed to follow ethical guidelines to prevent misuse, such as generating hate speech, providing dangerous instructions, or violating privacy. However, sophisticated users can exploit vulnerabilities in the models to override these safeguards, which poses significant risks. The ability to jailbreak LLMs undermines trust in these technologies, exposing users and organizations to potential harm and legal issues.  For example, an attacker might attempt to jailbreak an LLM used in a chatbot or automated assistant to execute malware.

What precautions should be taken to protect user and data privacy when using LLMs? Is there a safe way to use an LLM while ensuring the security of information being shared with the program?

First of all, LLMs are similar to traditional software, so any sensitive information you wouldn’t share with traditional software should also not be shared with LLMs. Both LLMs and traditional software can face risks, such as data breaches, unauthorized access and misuse of stored user information. Both LLMs and traditional software can have security vulnerabilities that attackers might exploit, including input manipulation, unauthorized access and injection attacks.

The same methods used to protect sensitive information in traditional software, such as authentication, encryption and access control, are also effective for LLMs. From this perspective, there’s no need to panic. By applying best practices for data security, you can protect your information when using LLMs just as you would with traditional software. Our group also reviewed research on protection mechanisms for LLMs to further tailor your security measures based on your specific needs, which offers some helpful insights on further securing your sensitive information.

What guidelines or oversight could be put in place to limit the extent to which LLMs could be used for malicious intents, such as creating malware, misinformation and phishing attacks?

There are several efforts we can undertake to enhance security. At the user level, we should strive to prevent attackers from accessing the system by conducting thorough risk assessments, clearly defining and enforcing permissible use cases, and designing robust access control measures. This includes limiting access to vetted users, implementing strong user authentication and monitoring interactions.

At the content level, implementing content filtering systems to screen for harmful content is crucial. Human moderators can review flagged outputs to ensure safety. Additionally, legal frameworks, compliance audits, red teaming exercises and adversarial training can significantly improve security.

Technical safeguards such as usage limits, response shaping and enhancing transparency and explainability also play vital roles in mitigating misuse risks. Combining these strategies fosters a safer and more ethical AI ecosystem.

Drexel provides guidance and policies for its employees regarding the use of AI technology here: https://drexel.edu/it/security/policies-regulations/ai-guidance/

Reporters interested in speaking with Zhang should contact Britt Faulstick, executive director of News & Media Relations, bef29@drexel.edu or 215.895.2617.

Tagged with: