Doctoral Student Liying Qiu Studies AI, Consumer Behavior, and Market Dynamics
By John Miller
Liying Qiu is a fifth-year PhD student in business technologies at the Tepper School of Business. Her research examines how algorithmic decision-making influences social welfare, including consumer surplus and firm profitability.
In her paper, “Consumer Risk Preferences Elicitation from Large Language Models,” Qiu examines the ability of large language models (LLMs) to mirror human behavior in consumer decision-making. These models, capable of generating human-like responses, have shown promising results when applied to market research. However, do these algorithms truly capture the nuances of our choices?

Working with Tepper School faculty Dr. Param Vir Singh and Dr. Kannan Srinivasan, Qiu evaluated how well GPT-4-turbo (a model that is faster and uses fewer computing resources than GPT-4) predicts home insurance selections. Their goal was to determine whether AI could mimic consumer behavior while also aligning with theories that explain how we weigh potential losses and gains.
Qiu and her co-authors tested GPT-4-turbo against a dataset of nearly 6,000 real-world home insurance policies with deductibles ranging from $100 to $1,000 and premiums varying from a few hundred dollars to over $2,000, representing a diverse range of consumer decisions involving deductibles and premiums. They then compared the model’s predictions with actual consumer choices.
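The basic setup can be pictured with a short sketch. The following is a minimal, hypothetical illustration of such a prediction loop; the prompt wording, the persona field, and the use of the OpenAI chat API are assumptions for illustration, not the authors’ actual pipeline:

```python
# Minimal, hypothetical sketch of an LLM choice-prediction loop; the prompt
# format and persona field are illustrative, not the authors' pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def predict_choice(deductibles, premiums, persona):
    """Ask the model to pick one policy from a menu and return its number."""
    menu = "\n".join(
        f"Option {i + 1}: ${d} deductible, ${p}/year premium"
        for i, (d, p) in enumerate(zip(deductibles, premiums))
    )
    prompt = (
        f"You are a homeowner. {persona}\n"
        "Choose exactly one policy. Answer with the option number only.\n"
        f"{menu}"
    )
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # Real code would parse the reply defensively; the model may not
    # return a bare integer even when instructed to.
    return int(response.choices[0].message.content.strip())
```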
At the aggregate level, the LLM’s predictions mirrored consumer trends, particularly for mid-range deductible options. However, a closer look revealed a more complex picture: when the researchers assessed decisions at the individual level, the model’s accuracy plummeted, with fewer than half of its predictions matching actual consumer choices.
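The gap between those two readings of accuracy is worth making concrete. A minimal sketch, assuming predicted and actual choices are stored as simple lists of option labels:

```python
from collections import Counter

def individual_accuracy(predicted, actual):
    """Share of households whose predicted choice matches their actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def aggregate_shares(choices):
    """Market share of each deductible option across all households."""
    counts = Counter(choices)
    return {option: n / len(choices) for option, n in counts.items()}

# A model can reproduce aggregate_shares(predicted) ~ aggregate_shares(actual)
# quite closely while individual_accuracy(predicted, actual) stays below 0.5,
# which is the pattern the study reports.
```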
This discrepancy led the researchers to probe the model’s reasoning. They used a technique called theory-based chain of verification (in essence, asking the LLM to check and justify its own answers) to assess whether the model’s selections aligned with prospect theory, a cornerstone of behavioral economics. Prospect theory posits that individuals evaluate decisions in terms of potential gains and losses relative to a reference point, with a tendency to weigh losses more heavily, a concept known as loss aversion.
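For reference, the standard Tversky–Kahneman (1992) formulation of prospect theory captures these ideas with a value function v and a probability-weighting function w; the paper’s exact parameterization is not reproduced here:

```latex
% Standard Tversky--Kahneman (1992) forms; the paper's exact
% parameterization may differ.
v(x) =
\begin{cases}
  x^{\alpha}               & x \ge 0, \\
  -\lambda \, (-x)^{\beta} & x < 0,
\end{cases}
\qquad
w(p) = \frac{p^{\gamma}}{\bigl(p^{\gamma} + (1 - p)^{\gamma}\bigr)^{1/\gamma}}
```

Here λ > 1 encodes loss aversion (losses loom larger than equal-sized gains), and γ < 1 makes the weighting function overweight small probabilities, precisely the two parameters the verification step probes.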
The study revealed that, in scenarios with multiple insurance options, GPT-4-turbo tended to avoid extreme choices, such as the highest or lowest deductible, regardless of the underlying risk preferences. This tendency obscured whether the model truly grasps how consumers balance potential losses and gains.
However, when the researchers simplified the choice to two insurance options, GPT-4-turbo demonstrated some capacity to apply risk preferences, revealing measurable parameters like loss aversion and probability weighting. Yet, even here, the model's parameters deviated significantly from those observed in human behavior. For example, the model underestimated loss aversion and exaggerated the weighting of low-probability events, indicating a divergence from established consumer behavior patterns.
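A toy calculation shows why those parameter values matter. In the sketch below, the premiums, deductibles, and claim probability are illustrative numbers, not figures from the study; changing only the probability-weighting parameter flips which deductible looks better. Loss aversion is omitted because, in a menu of pure losses, it scales both options equally:

```python
def weight(p, gamma):
    """Tversky-Kahneman probability weighting function."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

def policy_value(premium, deductible, claim_prob, alpha=0.88, gamma=0.69):
    """Prospect-theory value of a policy: the premium is a sure loss, the
    deductible a loss incurred with the (weighted) claim probability."""
    return -(premium**alpha + weight(claim_prob, gamma) * deductible**alpha)

# Illustrative menu: $650/yr with a $100 deductible vs. $600/yr with a
# $1,000 deductible, and a 5% annual claim probability.
for gamma in (0.69, 1.0):  # typical human overweighting vs. linear weighting
    low = policy_value(650, 100, 0.05, gamma=gamma)
    high = policy_value(600, 1000, 0.05, gamma=gamma)
    print(f"gamma={gamma}: low-deductible={low:.1f}, high-deductible={high:.1f}")
# With gamma=0.69 the low-deductible policy scores higher (about -305 vs.
# -327); with gamma=1.0 the ranking flips. Mis-estimated parameters therefore
# translate directly into different predicted choices.
```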
Qiu and her co-authors also explored various ways to improve the model’s accuracy. They experimented with different LLM architectures and even fine-tuned the model on the insurance dataset. However, these efforts yielded only marginal gains, suggesting that the limitations in modeling risk-sensitive behavior may be inherent to the current generation of LLMs.
This study has significant implications for using LLMs in market research and beyond. While LLMs can provide insight into aggregate consumer trends, their limitations in capturing individual risk preferences raise concerns about their reliability in high-stakes decision-making contexts. The authors caution against assuming LLMs possess a deep understanding of human behavior, emphasizing the need for a more nuanced approach that combines statistical mimicry with behavioral realism.
For industries increasingly reliant on AI, this study is a reminder that LLMs are powerful tools, but they may not replicate human thought, reasoning, or decision-making. As we continue to integrate LLMs into decision-making roles, a critical approach grounded in both empirical evidence and theoretical rigor will be essential to ensure that these tools serve us well.
Liying Qiu will continue researching the evolving roles of agentic AI (AI that can plan and carry out tasks autonomously, without step-by-step prompting), examining how consumers and firms increasingly delegate decision-making to intelligent agents and how these dynamics shape business outcomes in market equilibrium.
Read more of Qiu’s research in “Personalization, Consumer Search and Algorithmic Pricing,” published in Marketing Science (in collaboration with Yan Huang, Singh, and Srinivasan). In this paper, the focus shifts to firms, studying how adopting generative algorithms can make firms more profitable without eliciting consumer preferences, while still harming consumer welfare.