Is It a “Fair Game” to Trade Your Privacy for Benefits from ChatGPT?

Our study found that users constantly face trade-offs between privacy, utility, and convenience when using LLM-based CAs (e.g., ChatGPT). Many users felt they had to pay the price of privacy to get benefits from ChatGPT and considered the trade-off a “fair game”. However, the erroneous mental models we observed and the difficulty of exercising privacy controls due to dark patterns suggest the game is far from fair.

“It’s a Fair Game”, or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents

¹Zhiping Zhang, ²Michelle Jia, ²Hao-Ping (Hank) Lee, ³Bingsheng Yao, ²Sauvik Das, ¹Ada Lerner, ¹Dakuo Wang, ¹Tianshi Li*

¹Northeastern University, ²Carnegie Mellon University, ³Rensselaer Polytechnic Institute 

LLM-based conversational agents (CAs), such as ChatGPT, are increasingly being incorporated into high-stakes application domains including healthcare, finance, and personal counseling. To access this functionality, users must often disclose their private medical records, payslips, or personal trauma.

The benefits of disclosing personal data to these CAs are often concrete to users, in that the CA can help them more directly with a task, while the risks are more abstract and difficult to reason about. This asymmetry can lead users to divulge information that increasingly exposes them to privacy risks over time, perhaps unwittingly.

A fictional example of sensitive disclosure to ChatGPT, inspired by real-world use cases: a user shared their doctor’s email and ICD-10-CM diagnosis results with ChatGPT upon its request. ChatGPT then interpreted the codes, indicating the user had multiple diseases. The example demonstrates three issues: 1. the user disclosed PII (a name) and non-identifiable but sensitive information (diagnosis results); 2. the user disclosed another person’s information (the doctor’s); 3. ChatGPT actively requested detailed information from the user, which encouraged further disclosure.

So we conducted a study to learn:

🎯 What sensitive disclosure behaviors emerge in the real-world use of LLM-based CAs?

 

🎯 Why do users have sensitive disclosure behaviors, and when and why do users refrain from using LLM-based CAs because of privacy concerns?

 

🎯 To what extent are users aware of, motivated, and equipped to protect their privacy when using LLM-based CAs?

How to Use This Study

  • Privacy-preserving Tool/Model/System Designer or Developer: Dive into our comprehensive analysis of sensitive disclosures in ChatGPT, offering a user-centered perspective to understand the trade-offs between privacy, utility, and convenience. Our findings, particularly on dark patterns and user mental models, can guide you in developing more ethical and user-friendly LLM-based CAs.

  • LLM-related Privacy Specialists: Our research sheds light on the intricate dynamics of user behavior with LLM-based CAs, highlighting areas where policy and regulations can play a vital role. By understanding users' navigation of privacy trade-offs and their perception of the "fair game" myth, you can better frame future research and policy decisions in the domain.

  • Other curious readers: Explore the fascinating world of Large Language Models and understand the privacy concerns that arise from human-like interactions with these chatbots. Our in-depth analysis and interviews provide a clear picture of users' experiences, giving you a deeper insight into the trade-offs and challenges they face daily.

Methodology

  • Mixed Methods Dataset Analysis: To learn what sensitive disclosure behaviors emerge in the real-world use of LLM-based CAs, we qualitatively examined a sample of real-world ChatGPT chat histories from the ShareGPT52K dataset (200 sessions containing 10,380 messages); see the sketch after this list for a rough illustration of the sampling step.

  • Semi-structured interviews: To deeply understand when and why users have sensitive disclosure behaviors as well as how users perceive and handle privacy risks, we conducted semi-structured interviews with 19 users of LLM-based CAs covering 38 types of use cases.
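
As a rough illustration of the dataset-analysis step, the sketch below shows how one might draw a random sample of chat sessions from a local copy of the publicly released ShareGPT52K data for manual qualitative coding. This is not the study’s exact pipeline; the file name and the field names ("conversations", etc.) are assumptions about the dataset layout and should be adjusted to match the files you actually download.

import json
import random

# Minimal sketch (not the study's exact pipeline): sample N chat sessions from a
# local ShareGPT52K JSON dump so they can be read and coded by hand.
# The field names ("conversations", etc.) are assumptions about the dataset layout.
N_SESSIONS = 200
random.seed(42)  # fixed seed so the same sample can be re-drawn later

with open("sharegpt52k.json", "r", encoding="utf-8") as f:  # hypothetical local file name
    sessions = json.load(f)  # expected: a list of session dicts

sample = random.sample(sessions, k=min(N_SESSIONS, len(sessions)))
total_messages = sum(len(s.get("conversations", [])) for s in sample)
print(f"Sampled {len(sample)} sessions containing {total_messages} messages")

# Write the sample out for coders to annotate.
with open("sampled_sessions.json", "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False, indent=2)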

Highlighted Findings

👯  Users disclose PII and personal experiences, not only their own but also other people’s.

📈  The use of interactive prompt strategies led to a gradual increase in disclosure, sometimes prompted by ChatGPT.

🙌  Many of the interviewees were pessimistic about having both utility and privacy. Yet nearly all took privacy-protective measures.

🧠  Flawed mental models could hinder users' awareness and comprehension of emerging privacy risks in LLM-based CAs, such as memorization risks.

👣  Dark pattern: ChatGPT users' data is used for model training by default, increasing memorization risks, with opt-out options being obscure and often tied to reduced functionality.

😥  Users fear being discovered using AI, worrying that others will question their abilities.

We’ll dive deeper into these takeaways below. 

👯  Users disclose PII and personal experiences, not only their own but also other people’s.

There are not only institutional privacy issues but also interdependent privacy issues in LLM-based CAs. Disclosure of other individuals’ data was common when users used ChatGPT to handle tasks that involved other people (e.g., friends, colleagues, clients) in both personal and work-related scenarios. For example, a user shared email conversations regarding complaints about their living conditions and asked ChatGPT about further steps to resolve the matter. The conversation included email communication between the user and a staff member responsible for handling the issue, containing both individuals’ email addresses, names, and phone numbers.

In the interviews, we noticed a similar phenomenon, although participants held different opinions about it:

  • “That’s not my decision to make.” Some people expressed heightened caution when sharing data related to others, even more so than their own data. They voiced concerns regarding data ownership and the ethics of sharing information without the original owner’s consent.

  • “A fair exchange”: Conversely, others expressed less concern about sharing information about others, viewing it as a fair exchange.

  • Interestingly, one user deemed it acceptable to share extensive research data containing details of various individuals, including full names, demographic specifics, personal experiences, and feedback, justifying her actions as non-commercial in intention. However, she expressed discomfort with the idea of her own data being shared by others, worrying about “how AI will summarize or do what kind of judgment for me.”

📈  The use of interactive prompt strategies led to a gradual increase in disclosure, sometimes prompted by ChatGPT.

We developed a multidimensional typology to characterize the disclosure scenarios of ChatGPT from dataset analysis. The typology consists of four dimensions: Context, Topic, Purpose, and Prompt strategy.

Prompt strategy captures the tactical approaches users took when interacting with ChatGPT to achieve their goals, including Direct Command, Interactively Defining the Task, Role-Playing, and Jail-Breaking.

Interactively Defining the Task refers to scenarios where users engage in multi-round interactions with ChatGPT. In these scenarios, users gave ChatGPT instructions and adjusted their instructions based on ChatGPT’s response. This interactive process led users to gradually reveal more information as the conversation progressed; sometimes, this progressive disclosure was driven by ChatGPT.

For example, a user set up ChatGPT to act as a therapist. The user conversed with ChatGPT about their experiences and sought advice through multiple rounds of conversation. As ChatGPT was giving advice, the user would respond with more specific details, similar to the flow of a conversation between real people.

🙌 Many of the interviewees were pessimistic about having both utility and privacy. Yet nearly all took privacy-protective measures.

Participants appeared to be pessimistic about accessing the benefits of LLM-based CAs while also preserving privacy.

“There is a price for getting the benefits of using this application... It’s a fair game” 

“you can’t have it both ways”

“Let’s say I need some advice about resume. If I don’t provide those contents that contain a lot of my private things, ChatGPT won’t work.” 

Yet nearly all of them took privacy-protective measures, e.g., censoring or falsifying sensitive information, sanitizing inputs copied from elsewhere, and seeking only general advice.
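
As a small, hedged illustration of what “sanitizing inputs” can look like in practice (this is not a tool from the study, and the regular expressions are deliberately simplistic), the following Python sketch strips obvious email addresses and phone numbers from a draft before it is pasted into a conversational agent. Real PII removal is much harder, and patterns like these will miss many cases.

import re

# Illustrative sketch only: crude input sanitization in the spirit of the
# protective measures participants described. The patterns below catch only
# obvious emails and US-style phone numbers and will miss many kinds of PII.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")

def sanitize(text: str) -> str:
    """Replace obvious email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

if __name__ == "__main__":
    draft = "Hi, you can reach me at jane.doe@example.com or 617-555-0123 about the lease."
    print(sanitize(draft))
    # -> Hi, you can reach me at [EMAIL] or [PHONE] about the lease.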

🧠 Flawed mental models could hinder users' awareness and comprehension of emerging privacy risks in LLM-based CAs, such as memorization risks.

We summarize users’ mental models for two processes.

1. Response generation process: how do users understand the way their data is used to generate responses?

2. Improvement and training process: how do users understand the way their data is used to improve and train the AI behind LLM-based CAs?

Model D: User input is a quality indicator; Model E: User input is training data

The results indicate that users’ mental models reflected overly simplified or flawed understandings of how their data was used and how LLM-based CAs worked, which could hinder users' awareness and comprehension of emerging privacy risks in LLM-based CAs. For example, participants with mental model D were less able to anticipate and understand memorization risks. P1 could not see how the personal data mentioned in her prompt, such as a zip code, could be used in generating future responses.

Memorization risks related to LLMs: the model may unintentionally memorize portions of its training data, which can include user input data and personally identifiable information; this memorized content might then be reproduced in the generated output.

👣 Dark pattern: ChatGPT users' data is used for model training by default, increasing memorization risks, with opt-out options being obscure and often tied to reduced functionality.

Opt-out dark pattern in ChatGPT

Dark patterns are another issue. By default, ChatGPT users allow OpenAI to use their data for model training, exposing them to memorization risks. ChatGPT offers two ways for a user to opt out of having their data used for model training:

  • The control in the user settings is easier to discover (all participants but P15 found it), but it bundles the model-training opt-out with the chat history feature, so a user who wants to opt out of model training must turn off chat history as well.

  • Users can also submit a form to opt out of training while keeping their chat history, but this option is mentioned only in an FAQ article that is harder to discover (none of our participants found it).

The opt-out interfaces unnecessarily link privacy with reduced functionality, and the more flexible control is hard to find and use.

😥 Users fear being discovered using AI, worrying that others will question their abilities.

In the dataset, we found several users who expressed concern in their prompts that others would find out they had used AI for the task at hand. For instance, one user who was writing a book provided ChatGPT with a list of content-generation tasks and explicitly wrote in the prompt: “do something for me so that no one will find out this book was written by AI.”

In the interviews, P8 expressed concern about others discovering her reliance on AI for tasks like schoolwork and email writing.

“I hope they (my professors) will never know I used AI to do that (write emails).” 

@article{zhang2023s,
  title={``It's a Fair Game'', or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents},
  author={Zhang, Zhiping and Jia, Michelle and Lee, Hao-Ping Hank and Yao, Bingsheng and Das, Sauvik and Lerner, Ada and Wang, Dakuo and Li, Tianshi},
  journal={arXiv preprint arXiv:2309.11653},
  year={2023}
}

Updated on October 10, 2024

© 2023 by Zhiping (Arya) Zhang
