Research
Carnegie Mellon’s Software and Societal Systems Department (S3D) hosts an active research group with a highly interdisciplinary approach to software engineering. Indeed, we believe that interdisciplinary work is inherent to software engineering. The field of software engineering (SE) is built on computer science fundamentals, drawing from areas such as algorithms, programming languages, compilers, and machine learning. At the same time, SE is an engineering discipline: both the practice of SE and SE research problems revolve around technical solutions that successfully resolve conflicting constraints. As such, trade-offs between costs and benefits are an integral part of evaluating the effectiveness of methods and tools.
Emerging problems in privacy, security, and mobility pose many of the challenges faced by today's software engineers and motivate new solutions in SE research. Because software is built by people, SE is also a human discipline, and research in the field therefore draws on psychology and other social sciences as well. Carnegie Mellon faculty bring expertise from all of these disciplines to bear on their research, and we emphasize this interdisciplinary approach in our REU Site.
Below, you'll find projects we are planning for summer 2024. Check back frequently as we continue to add new projects.
Accelerated Software Testing - NaNofuzz
Mentors: Joshua Sunshine and Brad Myers
Description and Significance
Generating a robust test suite is often considered one of the most difficult tasks in software engineering. In the United States alone, software testing labor is estimated to cost at least $48 billion per year. Despite this high cost, and despite widespread automation in test execution and other areas of software engineering, test suites continue to be created manually by software engineers. Automatic Test sUite Generation (ATUG) tools have shown promising results in non-human experiments, but they are not widely adopted in industry.
Prior research provides clues that traditional ATUG tools may not be well-aligned to the process human software engineers use to generate test suites. For instance, some tools generate incorrect or hard-to-understand test suites while others obscure important information that would help the software engineer evaluate the quality of the test suite. Often these problems are evident only by observing software engineers using these tools.
NaNofuzz was recently featured on the Hacker News homepage. Learn more about NaNofuzz at its GitHub repository: https://github.com/nanofuzz/nanofuzz
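To make the idea concrete, here is a minimal sketch, in Python rather than NaNofuzz's TypeScript, of the kind of random-input exploration an ATUG tool performs: probe the function under test with generated inputs (including boundary values) and surface crashes or NaN results as candidate test cases for a human to review. The function `safe_ratio` is a made-up example, not part of NaNofuzz.

```python
import math
import random

BOUNDARY_VALUES = [0.0, -0.0, 1.0, -1.0, float("inf"), float("-inf"), float("nan"), 1e308]

def safe_ratio(a: float, b: float) -> float:
    """Hypothetical function under test."""
    return a / b if b != 0 else float("nan")

def random_arg(lo: float = -1e6, hi: float = 1e6) -> float:
    """Mix uniform random values with boundary values, as fuzzers typically do."""
    return random.choice(BOUNDARY_VALUES) if random.random() < 0.2 else random.uniform(lo, hi)

def generate_candidate_tests(fn, n_args: int, trials: int = 1000):
    """Probe `fn` with generated inputs and collect suspicious behaviors.

    This mimics, at a very high level, what an ATUG tool does: explore the
    input space and surface candidate test cases for a human to review.
    """
    candidates = []
    for _ in range(trials):
        args = [random_arg() for _ in range(n_args)]
        try:
            result = fn(*args)
            if isinstance(result, float) and math.isnan(result):
                candidates.append((args, "returned NaN"))
        except Exception as exc:  # crashes are prime test-case material
            candidates.append((args, f"raised {type(exc).__name__}"))
    return candidates

if __name__ == "__main__":
    for args, reason in generate_candidate_tests(safe_ratio, n_args=2)[:5]:
        print(f"safe_ratio({args[0]!r}, {args[1]!r}) -> {reason}")
```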
Student Involvement
This research initiative will flex your research muscles: you will approach the problem of test suite generation using a comprehensive mix of theory, human observation, programming languages (PL), human-computer interaction (HCI), and prototype engineering. You will gain insights from our emerging theory of test suite generation and from innovative prototype tools, such as NaNofuzz, that you will help build. This comprehensive approach will expand your research skill set and help you discover new science-based solutions that may make testing easier and more enjoyable for the estimated 26+ million software engineers on Earth today.
Advanced Requirements Learning
Mentor: Travis Breaux
Description and Significance
Software requirements describe the normative behaviors that users and other stakeholders come to expect from the systems they use. Government laws, regulations, and policies increasingly overlap with software requirements, especially as software becomes a key component in the realization of process automation and artificial intelligence. How we guide companies in the design and development of products and services that respect human and societal values, such as safety and privacy, is a key challenge.
In this project, we aim to address this challenge by combining techniques from psychology, logic, and natural language processing to build models that learn whether systems satisfy legal and policy requirements, while enabling design space exploration when systems are poorly described or exhibit behaviors that conflict with requirements. This project combines theory about human language understanding and legal reasoning with advanced prompt engineering techniques using large language models. Students will learn about state-of-the-art prompting strategies and applications, including Chain-of-Thought reasoning (Kojima et al., 2022) and ontology construction (Wu et al., 2023). Students will also participate in the design of experimental pipelines to train, generate, and evaluate dialogic systems (Ouyang et al., 2022) for the analysis and validation of software requirements.
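As a rough illustration only (not the project's actual pipeline), the sketch below shows how a zero-shot Chain-of-Thought prompt in the style of Kojima et al. (2022) might be assembled to ask a language model whether a described system behavior satisfies a stated requirement. The `call_llm` function is a placeholder for whichever model API is used, and the requirement and system description are invented.

```python
from textwrap import dedent

def build_cot_prompt(requirement: str, system_description: str) -> str:
    """Assemble a zero-shot chain-of-thought prompt asking whether a
    described system behavior satisfies a requirement."""
    return dedent(f"""
        Requirement:
        {requirement}

        System description:
        {system_description}

        Question: Does the described system satisfy the requirement?
        Answer with SATISFIED, VIOLATED, or UNDERSPECIFIED, and justify
        each step of your reasoning against the text above.

        Let's think step by step.
    """).strip()

def call_llm(prompt: str) -> str:
    """Placeholder for a real model API call (not implemented here)."""
    raise NotImplementedError

if __name__ == "__main__":
    prompt = build_cot_prompt(
        requirement="Location data may be retained for at most 30 days.",
        system_description="The app stores GPS traces indefinitely to improve routing.",
    )
    print(prompt)  # send via call_llm(prompt) once a model API is wired in
```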
References
Kojima et al. "Large Language Models are Zero-Shot Reasoners." NeurIPS 2022.
Ouyang et al. "Training Language Models to Follow Instructions with Human Feedback." NeurIPS 2022.
Wu et al. "Do PLMs Know and Understand Ontological Knowledge?" ACL 2023.
Language-Agnostic Resilience Engineering with Wasm
Mentors: Ben Titzer and Heather Miller
Description and Significance
Engineering a resilient distributed system is difficult due to the complexities of partial failure. One promising approach to improving resilience is fault injection testing, but all tools to date are either manually configured or tied to a specific programming language. Wasm, as a bytecode that runs on a virtual machine, provides a portable, universal platform for distributed systems and has the potential for great developer tooling, since it is a compilation target for many programming languages and supports important debugging functionality. We are building a tool that automatically injects faults into distributed systems during local testing by compiling the distributed system's components to Wasm and instrumenting the bytecode. Thus far, we have successfully injected faults into nodes running on Dfinity's Internet Computer platform, and we hope to reach a new domain in this project!
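For intuition, the sketch below illustrates fault injection at the level of a single remote call in plain Python: with some probability the call is replaced by an injected failure, so tests exercise the failure-handling paths. Our tool performs the analogous interposition by instrumenting Wasm bytecode rather than by wrapping application code by hand; the `fetch_balance` example and fault types here are hypothetical.

```python
import random

class InjectedFault(Exception):
    """Raised in place of a real failure during fault-injection testing."""

def with_fault_injection(call, fault_rate=0.3, faults=("timeout", "connection_reset")):
    """Wrap a remote call so that it sometimes fails on purpose.

    `call` is any zero-argument function standing in for an RPC to another
    component of the distributed system.
    """
    def wrapped():
        if random.random() < fault_rate:
            raise InjectedFault(random.choice(faults))
        return call()
    return wrapped

# A fake RPC plus the resilience logic under test (a simple retry loop).
def fetch_balance():
    return 42  # stand-in for a real network request

def fetch_balance_with_retry(rpc, attempts=3):
    for i in range(attempts):
        try:
            return rpc()
        except InjectedFault as fault:
            print(f"attempt {i + 1} failed: {fault}")
    raise RuntimeError("service unavailable after retries")

if __name__ == "__main__":
    flaky = with_fault_injection(fetch_balance)
    print("balance:", fetch_balance_with_retry(flaky))
```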
Student Involvement
If you’re interested in research that builds developer tooling to make distributed systems more fault tolerant, you should apply! Students working on this project will extend and leverage our tool to inject faults in a new context: microservices. Students will gain knowledge in many different domains such as bytecode instrumentation, resilience engineering, and microservices. They will also gain skills in working with several different languages like Rust, Virgil, a new DSL specifically for instrumentation (similar to dtrace’s D language), and even Wasm bytecode!
Machine Learning in Production
Mentor: Christian Kästner
Description and Significance
Advances in machine learning (ML) have stimulated widespread interest in integrating AI capabilities into various software products and services. As a result, today's software development teams often include both data scientists and software engineers, who tend to have different roles. An ML pipeline generally has two phases: an exploratory phase and a production phase. Data scientists commonly work in the exploratory phase to train an offline ML model (often in computational notebooks) and then deliver it to software engineers, who work in the production phase to integrate the model into the production codebase. However, data scientists tend to focus on improving ML algorithms to achieve better prediction results, often without thinking enough about the production environment; software engineers therefore sometimes need to redo some of the exploratory work in order to integrate it into production code successfully. In this project, we want to analyze collaboration between data scientists and software engineers, at both technical and social levels, in open source and in industry.
Student Involvement
We want to study how data scientists and software engineers collaborate. To this end, we will identify open source projects that use machine learning in production systems (e.g., Ubuntu's face recognition login) and study public artifacts, or we will interview participants in production ML projects. This research involves interviews and analysis of software artifacts. We may also develop exploratory tools to define and document expectations and tests at the interface between different roles in a project. The project can be tailored to the students' interests, but interest or a background in empirical methods would be useful. Familiarity with machine learning is a plus but not required. Note, this is not a data science/AI project, but a project on understanding *software engineering* practices relevant to data scientists.
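As one small, purely illustrative example of the kind of interface artifact such tools might capture (all names below are made up), a team could pin down the agreed input schema and a behavioral expectation at the point where a data scientist's model enters production, so that either side notices when the contract is broken:

```python
"""Illustrative contract at the data-scientist / software-engineer boundary."""

EXPECTED_COLUMNS = {"age": float, "income": float, "num_purchases": int}  # agreed schema

def validate_input(row: dict) -> None:
    """Reject rows that do not match the schema the model was trained on."""
    missing = EXPECTED_COLUMNS.keys() - row.keys()
    if missing:
        raise ValueError(f"missing features: {sorted(missing)}")
    for name, expected_type in EXPECTED_COLUMNS.items():
        if not isinstance(row[name], expected_type):
            raise TypeError(f"feature {name!r} must be {expected_type.__name__}")

def predict(model, row: dict) -> float:
    """Production entry point: validate first, then defer to the model."""
    validate_input(row)
    return model.predict(row)

def test_prediction_is_probability():
    """A behavioral expectation both roles signed off on: outputs lie in [0, 1]."""
    class StubModel:  # stands in for the data scientist's trained model
        def predict(self, row):
            return 0.7
    score = predict(StubModel(), {"age": 34.0, "income": 52_000.0, "num_purchases": 3})
    assert 0.0 <= score <= 1.0
```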
Privacy and AI Threat Modeling
Mentors: Norman Sadeh, Hana Habib and Lorrie Cranor
Description and Significance
The goal of the Privacy & AI Threat Modeling Project is to develop and evaluate methodologies and design patterns to help companies systematically identify and mitigate potential privacy and AI threats as they design and develop new products and services. The project focuses on threats associated with the need to provide users with adequate notices and controls over the technologies with which they interact.
Addressing privacy threats and threats associated with the development of AI solutions is becoming an increasingly significant challenge for industry. Companies are in desperate need of frameworks and methodologies that can help them systematically identify threats and approaches to mitigate them (e.g., providing notices and controls that are easily accessible, informative and understandable, and non-manipulative).
Student Involvement
Students will learn to model new products and services and will help evaluate and refine methodologies developed as part of this project. This work will likely involve conducting human subjects studies designed to evaluate the usability of design patterns for notices and controls. This project will involve working under the supervision of several faculty and a postdoctoral researcher and may involve collaborating with other students as well.
Related Publications
Y. Feng, Y. Yao, and N. Sadeh. "A Design Space for Privacy Choices: Towards Meaningful Privacy Control in the Internet of Things." Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI 2021).
H. Habib and L. F. Cranor. "Evaluating the Usability of Privacy Choice Mechanisms." Eighteenth Symposium on Usable Privacy and Security (SOUPS 2022), 273-289.
Privacy and Security Question Answering
Mentor: Norman Sadeh
Description and Significance
Security and privacy are becoming increasingly complex for everyday users to manage. The need for assistants capable of effectively helping users in this area has never been greater.
Increasingly, users are turning to chatbots to answer a variety of everyday security and privacy questions. The objective of this project is to develop GenAI assistants capable of providing users with personalized answers that are not just accurate but also reflect their level of expertise, are understandable and actionable, and motivate them to heed the assistant's recommendations.
Student Involvement
Students working on this project will learn to systematically evaluate and refine GenAI technologies in the context of security and privacy questions. This will include work designed to increase the accuracy of the answers provided as well as work designed to make those answers more effective.
Recent Publications
A. Ravichander, A. Black, T. Norton, S. Wilson, and N. Sadeh. "Breaking Down Walls of Text: How Can NLP Benefit Consumer Privacy?" ACL 2021.
A. Ravichander, A. W. Black, S. Wilson, T. Norton, and N. Sadeh. "Question Answering for Privacy Policies: Combining Computational and Legal Perspectives." EMNLP 2019. arXiv:1911.00841.
Privacy Infrastructure for the Internet of Things
Mentor: Norman Sadeh
Description and Significance
With the increasingly widespread deployment of sensors recording and interpreting data about our every move and activity, it has never been more important to develop technology that enables people to retain some level of awareness and control over the collection and use of their data. While doing so as users browse the Web or interact with their smartphones is already proving to be daunting, it is even more challenging when data collection takes place through sensors such as cameras, microphones, and other technologies users are unlikely to even notice. CMU's Privacy Infrastructure for the Internet of Things is designed to remedy this situation. It consists of a portal that enables owners of sensors to declare the presence of their devices, describe the data they collect, and, if they want to, provide people with access to controls that may enable them to restrict how much data is collected about them and for what purpose. The infrastructure comes with an IoT Assistant mobile app. The app enables people to discover sensors around them and access information about these sensors, including any available settings that might enable them to restrict the collection and use of their data. Deployed in Spring 2020, the infrastructure already hosts descriptions of well over 100,000 sensors in 27 different countries, and the IoT Assistant app has been downloaded by tens of thousands of users. The objective of this project is to extend and refine some of the infrastructure's functionality.
Privacy-Preserving Machine Learning
Mentor: Steven Wu
Description and Significance
Many modern applications of machine learning (ML) rely on datasets that may contain sensitive personal information, including medical records, browsing history, and geographic locations. To protect the private information of individual citizens, many ML systems now train their models subject to the constraint of differential privacy (DP), which informally requires that no individual training example has a significant influence on the trained model. After well over a decade of intense theoretical study, DP has recently been deployed by many organizations, including Microsoft, Google, Apple, LinkedIn, and, more recently, the US Census Bureau for the 2020 Census. However, the majority of existing practical deployments still focus on rather simple data analysis tasks (e.g., releasing simple counts and histogram statistics). To put DP into practice for more complex machine learning tasks, this project will study new differentially private training methods for deep learning that improve on existing state-of-the-art methods. We will also study how to use DP deep learning techniques to train deep generative models, which can generate privacy-preserving synthetic data—a collection of “fake” data that preserve important statistical properties of the original private dataset. This, in turn, will enable privacy-preserving data sharing.
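To give a flavor of how the DP constraint enters training, the sketch below shows the core DP-SGD step on a toy linear model: clip each example's gradient to bound its influence, then add Gaussian noise before averaging. This is a bare-bones illustration rather than the project's methods, and the clipping norm, noise multiplier, and data are arbitrary.

```python
import numpy as np

def dp_sgd_step(weights, X_batch, y_batch, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
    """One differentially private SGD step on a linear regression model.

    Each per-example gradient is clipped to `clip_norm` so no single example
    can dominate, then Gaussian noise scaled by `noise_multiplier * clip_norm`
    is added to the summed gradient (the standard DP-SGD recipe).
    """
    per_example_grads = []
    for x, y in zip(X_batch, y_batch):
        grad = 2 * (x @ weights - y) * x           # gradient of squared error
        norm = np.linalg.norm(grad)
        grad = grad / max(1.0, norm / clip_norm)   # clip to bound influence
        per_example_grads.append(grad)

    summed = np.sum(per_example_grads, axis=0)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (summed + noise) / len(X_batch)
    return weights - lr * noisy_mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, true_w = rng.normal(size=(256, 3)), np.array([0.5, -1.0, 2.0])
    y = X @ true_w + rng.normal(scale=0.1, size=256)
    w = np.zeros(3)
    for _ in range(200):
        idx = rng.choice(256, size=32, replace=False)
        w = dp_sgd_step(w, X[idx], y[idx])
    print("estimated weights (noisy):", np.round(w, 2))
```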
Scrolling Technique Library
Mentor: Brad Myers
Description and Significance
We have developed a new way to test how well a scrolling technique works, and we need to re-implement some older techniques to see how they compare. For example, the original Macintosh scrollbars from 1984 had arrows at the top and bottom and a draggable indicator in the middle. Even earlier scrollbars worked entirely differently. I am hoping to recruit one good programmer to help recreate some old scrolling techniques, and possibly try out some brand-new ones, such as for virtual reality applications, to test how well they perform compared to regular scrolling techniques like using two fingers on a touchpad or smartphone screen. If there is time, the project will include running user tests on the implemented techniques and writing a CHI paper based in part on the results.
Student Involvement
The student on this project will implement all of the techniques as web applications. The student must be an excellent programmer in JavaScript or TypeScript, preferably with expertise in React or another web framework. Experience with running user studies would be a plus.
References
Paper draft: Chaoran Chen, Brad A Myers, Cem Ergin, Emily Porat, Sijia Li, Chun Wang, "ScrollTest: Evaluating Scrolling Speed and Accuracy." arXiv:2210.00735 https://arxiv.org/abs/2210.00735.
Existing test: https://charliecrchen.github.io/scrolling-test/
Software Supply Chain Security
Mentor: Christian Kästner
Description and Significance
Essentially all software uses open source libraries and benefits enormously from this publicly available infrastructure. However, reusing libraries also brings risks. Libraries may contain bugs and vulnerabilities and are sometimes abandoned; worse, malicious actors are increasingly attacking software systems by hijacking libraries and injecting malicious code (e.g., see event-stream, SolarWinds, and ua-parser-js). Most projects use many libraries, those libraries have dependencies of their own, and we also depend on all kinds of infrastructure, such as compilers and test frameworks, all of which could be attacked. Detected software supply chain attacks increased by 650% in 2021, after a 430% increase in 2020. This has gotten to the point that the government has stepped in and requires software companies to build a “Software Bill of Materials (SBoM)” as a first step toward identifying which libraries are actually used.
So how can we trust this *software supply chain*, even though we have no contractual relations with the developers of all those libraries? Research might involve studying how developers build trust, when trust is justified, what attacks can be automatically detected and mitigated (e.g., with sandboxing and reproducible builds), and what actual attacks in the real world look like. There is a large range of possible research directions from code analysis to empirical studies of developers and their relationships, each of which can help to secure open source supply chains.
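As a tiny illustration of why an SBoM is a natural starting point, the sketch below inventories everything installed in an npm project by reading its lockfile (assuming the flat "packages" map used by npm lockfileVersion 2 and 3); even a small project typically pulls in hundreds of transitive packages, each of which is part of the attack surface.

```python
import json
from collections import Counter
from pathlib import Path

def inventory_from_lockfile(lockfile="package-lock.json"):
    """List every installed package and version from an npm lockfile.

    Assumes lockfileVersion 2 or 3, where installed packages appear in a
    flat "packages" map keyed by their node_modules path.
    """
    data = json.loads(Path(lockfile).read_text())
    inventory = Counter()
    for path, info in data.get("packages", {}).items():
        if not path:            # the empty key is the root project itself
            continue
        name = path.split("node_modules/")[-1]
        inventory[(name, info.get("version", "unknown"))] += 1
    return inventory

if __name__ == "__main__":
    packages = inventory_from_lockfile()
    print(f"{len(packages)} distinct package versions installed")
    for (name, version), _ in sorted(packages.items())[:10]:
        print(f"  {name}@{version}")
```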
Student Involvement
Depending on student interest, we will investigate different ideas around software supply chains. For example, we could study how the concept of “trust” translates from organizational science to software security in an open source context and how open source maintainers make decisions about security risks (literature analysis, theory building, interviews/surveys); see [1] on trust in a different context. We could build tools that automatically sandbox JavaScript dependencies and evaluate the overhead of doing so; see [2] for related prior work. We could study packages removed from npm to identify what typical supply chain attacks look like in practice. The ideal student for this project is interested in open source and software security.
References
[1] Jacovi, Alon, Ana Marasović, Tim Miller, and Yoav Goldberg. "Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in AI." Proc. FAccT (2021).
[2] Gabriel Ferreira, Limin Jia, Joshua Sunshine, and Christian Kästner. Containing Malicious Package Updates in npm with a Lightweight Permission System. In Proceedings of the 43rd International Conference on Software Engineering (ICSE), pages 1334--1346, Los Alamitos, CA: IEEE Computer Society, May 2021.
Sustainable Open Source Communities
Mentors: Bogdan Vasilescu and Christian Kästner
Description and Significance
Reuse of open source artifacts in software ecosystems has enabled significant advances in development efficiency, as developers can now build on substantial infrastructure and develop apps or server applications in days rather than months or years. However, despite its importance, maintenance of this open source infrastructure is often left to a few volunteers with little funding or recognition, threatening the sustainability of individual artifacts, such as OpenSSL, or entire software ecosystems. Reports of stress and burnout among open source developers are increasing. The teams of Dr. Kästner and Dr. Vasilescu have explored dynamics in software ecosystems to expose differences, understand practices, and plan interventions [1,2,3,4]. Results indicate that different ecosystems have very different practices and that interventions should be planned accordingly [1], but also that signaling based on underlying analyses can be a strong means to guide developer attention and effect change [2]. This research will further explore sustainability challenges in open source, with particular attention to the interaction between paid and volunteer contributors and to stress and the resulting turnover.
Student Involvement
Students will empirically study sustainability problems and interventions, using interviews, surveys, and statistical analysis of archival data (e.g., regression modeling, time series analysis for causal inference). What are the main reasons for volunteer contributors to drop out of open source projects? In what situations do volunteer contributors experience stress? In which projects will other contributors step up and continue maintenance when the main contributors leave? Which past interventions, such as contribution guidelines and codes of conduct, have been successful in retaining contributors and easing transitions? How can we identify subcommunities within software ecosystems that share common practices, and how do communities and subcommunities learn from each other? Students will investigate these questions by exploring archival data of open source development traces (ghtorrent.org), designing interviews or surveys, applying statistical modeling techniques, building and testing theories, and conducting literature surveys. Students will learn state-of-the-art research methods in empirical software engineering and apply them to specific sustainability challenges of great importance. Students will actively engage with the open source communities and will learn to communicate their results to both academic and nonacademic audiences.
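To make the archival-analysis side concrete, here is a minimal sketch of the kind of model a student might fit; the columns and data are synthetic stand-ins, since the real analysis would draw on mined repository histories and surveys rather than invented numbers.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic stand-in for archival data mined from repositories; in the real
# project these columns would come from commit/issue histories and surveys.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "monthly_commits": rng.poisson(12, n),
    "is_paid": rng.integers(0, 2, n),
    "projects_maintained": rng.poisson(3, n),
})
# Simulated ground truth: paid contributors drop out less, overloaded ones more.
logit_of_dropout = -1.0 - 0.8 * df["is_paid"] + 0.25 * df["projects_maintained"]
df["dropped_out"] = rng.random(n) < 1 / (1 + np.exp(-logit_of_dropout))

# Logistic regression: which factors are associated with disengagement?
X = sm.add_constant(df[["monthly_commits", "is_paid", "projects_maintained"]])
model = sm.Logit(df["dropped_out"].astype(int), X).fit(disp=False)
print(model.summary())
```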
References
[1] Christopher Bogart and Christian Kästner and James Herbsleb and Ferdian Thung. How to Break an API: Cost Negotiation and Community Values in Three Software Ecosystems. In Proc. Symposium on the Foundations of Software Engineering (FSE), 2016.
[2] Asher Trockman, Shurui Zhou, Christian Kästner, and Bogdan Vasilescu. Adding sparkle to social coding: an empirical study of repository badges in the npm ecosystem. In Proc. International Conference on Software Engineering (ICSE), 2018.
[3] Bogdan Vasilescu, Kelly Blincoe, Qi Xuan, Casey Casalnuovo, Daniela Damian, Premkumar Devanbu, and Vladimir Filkov. The sky is not the limit: multitasking across github projects. In Proc. International Conference on Software Engineering (ICSE), 2016.
[4] Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark GJ van den Brand, Alexander Serebrenik, Premkumar Devanbu, and Vladimir Filkov. Gender and tenure diversity in GitHub teams. In Proc. ACM Conference on Human Factors in Computing Systems (CHI), 2015.
Teen Online Safety
Mentor: Lorrie Cranor
Description and Significance
Teenagers (children aged 13 to 17) increasingly use computing technology for education and entertainment [1]. While they may be technically competent, a lack of life experience and a still-developing brain may place them at increased risk of falling victim to online safety threats [2]. In particular, teens generally have a higher level of impulsivity and sensation-seeking that may lead to increased risk-taking online [3, pp. 528-530]. Much prior work in the child safety literature has focused on issues such as cyberbullying [4] or online sexual exploitation [5]. In this project, we will explore how teenagers handle more traditional computer security challenges (e.g., avoiding fraud, maintaining secure authentication, etc.). This research could also involve developing interventions to improve teenagers’ understanding of online security. Ultimately, we hope to contribute to making the Internet a safer place for all people.
Student Involvement
Students will work with a graduate student to conduct a user study related to teens' security behavior. Through this process, they will learn about how HCI research methods (e.g., interviews, surveys, etc.) are applied to computer security and privacy issues. Based on student interest and the results of ongoing research, students may be involved in all stages of the research process, including design, execution, and analysis.
References
[1] Emily A. Vogels, Risa Gelles-Watnick, and Navid Massarat. “Teens, Social Media and Technology 2022.” Pew Research Center. https://www.pewresearch.org/internet/2022/08/10/teens-social-media-and-technology-2022/
[2] Diana Freed, Natalie N. Bazarova, Sunny Consolvo, Eunice J. Han, Patrick Gage Kelley, Kurt Thomas, and Dan Cosley. (2023). Understanding Digital-Safety Experiences of Youth in the U.S. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 191, 1–15. https://doi.org/10.1145/3544548.3581128
[3] Laura Berk and Adena Meyers. Infants, Children, and Adolescents, 8th Ed. (2016). Pearson.
[4] Olweus, Dan, and Susan P. Limber. (2018). "Some problems with cyberbullying research." Current Opinion in Psychology 19.
[5] Whittle, Helen, Catherine Hamilton-Giachritsis, Anthony Beech, and Guy Collings. (2013). "A review of online grooming: Characteristics and concerns." Aggression and Violent Behavior 18, no. 1.
Usable Policy Languages for Distributed Confidential Computing
Mentor: Lorrie Cranor
Description and Significance
The goal of Distributed Confidential Computing (DCC) is to enable scalable data-in-use protections for cloud and edge systems, such as home IoT. An important aspect of DCC is ensuring that data use adheres to specific policies. Typically, these policies are written in technical languages, such as formal logic, which makes it impractical for non-experts to write their own policies. The goal of this project is to (1) determine what kinds of policies home IoT users would want to express for their own data, and (2) determine how they can communicate these policies in a form that the DCC technology can verify without being too technical for users to understand.
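Purely as a thought experiment rather than a proposed design (all field names below are invented), the sketch contrasts a plain-language preference a home IoT user might express with the structured, machine-checkable form a DCC system needs; bridging this gap in a usable way is exactly the project's question.

```python
from dataclasses import dataclass

# What a user might say: "My smart speaker may use audio to answer questions,
# but never share it with third parties, and delete it within a day."

@dataclass
class DataUsePolicy:
    """A machine-checkable rendering of that preference (hypothetical schema)."""
    device: str
    data_type: str
    allowed_purposes: tuple
    share_with_third_parties: bool
    max_retention_hours: int

@dataclass
class DataUseRequest:
    device: str
    data_type: str
    purpose: str
    shares_externally: bool
    retention_hours: int

def permits(policy: DataUsePolicy, request: DataUseRequest) -> bool:
    """Would this concrete data use be allowed under the user's policy?"""
    return (
        request.device == policy.device
        and request.data_type == policy.data_type
        and request.purpose in policy.allowed_purposes
        and (policy.share_with_third_parties or not request.shares_externally)
        and request.retention_hours <= policy.max_retention_hours
    )

if __name__ == "__main__":
    policy = DataUsePolicy("smart speaker", "audio", ("answer questions",), False, 24)
    request = DataUseRequest("smart speaker", "audio", "ad targeting", True, 720)
    print("permitted:", permits(policy, request))  # -> permitted: False
```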
Student Involvement
Students will learn how to conduct research in usable privacy and security by working with a postdoc to conduct a user study that will identify privacy preferences of home IoT users. Based on this data, the team will design and implement prototype policy interfaces, which can be evaluated by another user study. Depending on interests and project needs, the student may help set up online surveys and collect data on a crowd worker platform, perform qualitative and/or quantitative data analysis, or design and implement prototypes.
References
Center for Distributed Confidential Computing. https://nsf-cdcc.org/
Hana Habib and Lorrie Faith Cranor. Evaluating the Usability of Privacy Choice Mechanisms. SOUPS '22. https://www.usenix.org/system/files/soups2022-habib.pdf
McKenna McCall, Eric Zeng, Faysal Hossain Shezan, Mitchell Yang, Lujo Bauer, Abhishek Bichhawat, Camille Cobb, Limin Jia, and Yuan Tian. Towards Usable Security Analysis Tools for Trigger-Action Programming. SOUPS '23. https://www.usenix.org/system/files/soups2023-mccall.pdf
Verifying Rust Code
Mentor: Bryan Parno
Description and Significance
Rust is already a rapidly growing mainstream language (with users including Amazon, Google, Microsoft, Mozilla, and the Linux kernel) designed to produce "more correct" low-level systems code. Rust supports writing fast systems code, with no runtime or garbage collection, while its powerful type system and ownership model guarantee memory and thread safety. This alone rules out a large swath of common vulnerabilities. However, it does nothing to rule out higher-level vulnerabilities, such as SQL injection, incorrect cryptography usage, or logical errors.
Hence, we are developing a language and tool called Verus, which allows Rust developers to annotate their code with logical specifications of the code's behavior and automates the process of mathematically proving that the code meets those specifications. This means we can guarantee the code's correctness, reliability, and/or security at compile time.
Student Involvement
In this project, students will learn more about software verification, write code in Verus and prove it correct, and work to make Verus even easier for developers to use.
References
Verus: https://github.com/verus-lang/verus
Rust: https://www.rust-lang.org/
Our Research Lab: https://www.andrew.cmu.edu/user/bparno/research.html