17-712: Fantastic Bugs and How to Find Them

Course Description

This advanced course studies the nature of software bugs and security vulnerabilities arising in complex application domains and surveys specialized program analysis and automated testing techniques for identifying such issues proactively. The course will take a tour of various domains such as mobile systems, databases, web browsers, distributed and networked systems, autonomous vehicles, smart contracts, and generative AI. For each domain, the class will discuss state-of-the-art research techniques that aim to uncover a special class of software bugs automatically. Apart from the literature review, students will engage significantly with software system design and engineering via a semester-long project, which will involve working with real-world applications and analysis tools for one or more domains.

Logistics

Spring 2026, 12 units
Class: Tue/Thu 11:00am-12:20pm in GHC 4102

Professor Rohan Padhye
Office hours: Tuesdays 1:30-2:30pm in TCS 325
Email: rohanpadhye@cmu.edu

TA Michael McLoughlin
Office hours: Thursdays 10:00-11:00am in CIC 2117
Email: mcloughlin@cmu.edu

Should I take this course?

Note: This is not a traditional lecture-based course. Classes will often consist of group discussions and student-led presentations, based on assigned readings of research papers, online articles, or case studies on open-source software.

For students doing research in the areas of programming languages, software engineering, computer systems, or security: this course will provide exposure to (a) a number of new application domains and the challenges of reasoning about software in those domains, and (b) techniques for leveraging domain-specific assumptions in order to apply their research to new problems.
For students targeting careers in security, software quality, or as domain experts: this course will (a) provide an introduction to a wide array of techniques for highly specialized software analysis and bug finding, and (b) help develop a knack for acquiring knowledge about state-of-the-art techniques from academic literature and prototyping with associated tools and artifacts.
For students with a general interest in program analysis and security, this course will provide an opportunity to learn about and discuss a variety of different approaches to automated bug finding, as well as to engage in hands-on tool building through the course project.

Prerequisites

This course is aimed at PhD and Masters students, as well as undergraduates interested in advanced research. The course assumes some background in understanding the source of common software bugs (e.g., buffer overflows) and dealing with program representations (e.g., abstract syntax trees) or automated testing tools (e.g., fuzzing). Any one of the following courses serves as a sufficient prerequisite: 18-335/732 (Secure Software Systems), 14-735 (Secure Coding), 17-355/665/819 (Program Analysis), 15-411/611 (Compiler Design), 17-770 (Virtual Machines and Managed Runtimes), 15-414 (Bug Catching), 15-330/18-330/18-730 (Intro to Computer Security). 14-741/18-631 (Intro to Information Security) may also be sufficient, depending on background or related coursework. If you have taken a course equivalent to any of the listed prerequisites at a different institution, or if you have had other relevant experiences (e.g., participating in CTFs or working in industry), please register and contact the instructor via email. You should expect to have the following background:

Basic understanding of build systems and program execution: compilers, interpreters, type checkers, bytecode, threads, system calls, virtual machines, inter-process communication, client-server architecture.
Comfort working with large-ish code-bases (10K+ LoC) in C and Java.
Ability to discover resources from the web to quickly learn unfamiliar programming languages, build systems, virtual machine setups, etc.
Basic understanding of foundational algorithms and data-structures such as hash-maps, trees, and graph traversal.
Basic understanding of discrete mathematics (e.g., set theory) and fluency in first-order logic notation. Non-trivial formulas using the following symbols should make sense: {∀, ∃, ⇒, ⇔, ∅, ⊆}.
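As a rough self-check for the last item, formulas of the following flavor should be readable without much effort. These are illustrative examples only, not taken from the course materials; the predicates crashes and report are hypothetical names standing for "program p crashes on input i" and "the set of warnings an analysis reports for p".

```latex
% Definition of subset, and the fact that the empty set is a subset of every set.
\[
  A \subseteq B \;\Leftrightarrow\; \forall x.\, (x \in A \Rightarrow x \in B)
\]
\[
  \forall A.\; \emptyset \subseteq A
\]
% A hypothetical "no missed bugs" property of a bug finder:
% if some input crashes program p, then the report for p is non-empty.
\[
  \forall p.\; \bigl(\exists i.\; \mathit{crashes}(p, i)\bigr)
  \Rightarrow \mathit{report}(p) \neq \emptyset
\]
```

The second fact follows from the first because the implication is vacuously true when no x is in the empty set.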

Degree Requirements Fulfilled

Masters: Contact the instructor to request.

PhD students: Satisfies the ENG requirement of the Software Engineering PhD program. Contact the instructor to request others.

Learning Objectives

Students completing this course should be able to:

Identify practical challenges of applying well-known program analysis techniques to a variety of application domains.
Formulate and leverage domain-specific assumptions for making program analysis tractable and useful in a specialized setting.
Build practical tools for improving software quality in real-world systems.

Course Topics

Overview of general techniques for finding software bugs (static analysis, fuzzing, symbolic execution, formal methods)
Program analysis techniques for various domains, including:
- Database systems
- Operating systems
- Mobile applications
- Web applications
- Compilers
- Web browsers
- Distributed systems
- Network protocols
- Machine learning
- Generative AI
- Smart contracts
Considerations in industrial adoption of automated bug-finding tools.

Assessments

20% pre-class reading responses
20% in-class paper presentations
20% participation
15% two assignments (5% + 10%)
25% final project

Schedule

Note: The list of topics below is tentative and subject to change.

Date	Topic	Assigned Reading (Guide and PDFs on Canvas)	Artifacts/Due Dates
Jan 13	Introduction (slides)	Meta: How to Read a Paper	Assignment 1 Released
Jan 15	General: Static Analysis	Paper: Finding Bugs is Easy Tools: SpotBugs, cppcheck Case Study: X11 privilege escalation bug
Jan 20	Real-World Case Studies	YouTube video: 25 crazy software bugs explained	Assignment 1 Due
Jan 22	Real-World Case Studies	YouTube video: 25 crazy software bugs explained
Jan 27	General: Symbolic Execution	Papers: EXE: Automatically Generating Inputs of Death, KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs Tool: KLEE	Assignment 2 Released
Jan 29	General: Fuzz Testing	Papers: Coverage-based Greybox Fuzzing as Markov Chain, Semantic Fuzzing with Zest Tools: AFL++, JQF Case Study: Heartbleed bug in OpenSSL
Feb 3	Database Systems	Article: How SQLite is tested Paper: Squirrel: Testing Database Management Systems with Language Validity and Coverage Feedback Tool: Squirrel	Assignment 2 Checkpoint
Feb 5	Database Systems	Papers: Testing Database Engines via Pivoted Query Synthesis, Finding Bugs in Database Systems via Query Partitioning Tool: SQLancer
Feb 10	Web Applications	Articles: OWASP Top 10 and SQL Injection Case Studies: SQL Injection Hall of Shame Paper: AMNESIA: Analysis and MoNitoring for NEutralizing SQL-Injection Attacks
Feb 12	Web Applications	Case Study: GitLab privilege escalation bug Papers: Static Detection of Second-Order Vulnerabilities in Web Applications, RESTler: Stateful REST API Fuzzing Tool: RESTler
Feb 17	Programming Languages (Functional)	Papers: QuickCheck: a lightweight tool for random testing of Haskell programs, Property-Based Testing in Practice Tool: Hypothesis
Feb 19	Programming Languages (Concurrency)	Paper: Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM Tool: Fray	Assignment 2 Due
Feb 24	Distributed Systems	Articles: Chaos Engineering, What's the big deal about Deterministic Simulation Testing? Tool: Jepsen
Feb 26	Distributed Systems	Paper: SAMC: Semantic-Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems
Mar 3 & 5	Spring break; no class
Mar 10	Operating Systems (Testing)	Tool: Syzkaller Case Study: DCCP privilege escalation bug (repro, exploit, fix) Paper: HEALER: Relation Learning Guided Kernel Fuzzing
Mar 12	Operating Systems (Verification)	Papers: Hyperkernel: Push-Button Verification of an OS Kernel, Scaling symbolic evaluation for automated verification of systems code with Serval	Project Proposal Due
Mar 17	Compilers (Testing)	Papers: Finding and understanding bugs in C compilers, Compiler Validation via Equivalence Modulo Inputs Tool: Csmith
Mar 19	Compilers (Verification)	Papers: CompCert: Practical Experience on Integrating and Qualifying a Formally Verified Optimizing Compiler, Alive2: Bounded Translation Validation for LLVM Project Pages: CompCert, Alive2 (Demo)
Mar 24	AI: Autonomous Vehicles	Papers: DeepTest: automated testing of deep-neural-network-driven autonomous cars, DeepRoad: GAN-Based Metamorphic Testing and Input Validation Framework for Autonomous Driving Systems Video: Attack on a Stop Sign using Black/White Art Stickers (source)
Mar 26	AI: Autonomous Vehicles	Paper: Doppelgänger Test Generation for Revealing Bugs in Autonomous Driving Software
Mar 31	AI: Large Language Models	Papers: Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection, Imprompter: Tricking LLM Agents into Improper Tool Use Demos: Imprompter
Apr 2	AI: TBD (Systems?)	TBD
Apr 7	Mobile Applications	Paper: Search-Based Energy Testing of Android or IccTA: Detecting Inter-Component Privacy Leaks in Android Apps
Apr 9	Spring carnival; no class
Apr 14	IoT	Papers: Jetset: Targeted Firmware Rehosting for Embedded Systems, Protecting Smart Homes from Unintended Application Actions
Apr 16	Smart Contracts	Background: Security Vulnerabilities in Ethereum Smart Contracts, Article: Step by Step Towards Creating a Safe Smart Contract: Lessons and Insights from a Cryptocurrency Lab
Apr 21	Industrial Adoption	A few billion lines of code later: using static analysis to find bugs in the real world
Apr 23	Final Project Presentations
May 1

Course Logistics and Policies

Technology Requirements

Here is the technology that students may need to use during the semester. If you have any trouble using any of these tools, please talk to the instructor so that we can figure out an accommodation.

Canvas: For reading material, pre-class quizzes, uploading class presentations.
Access to a computer or VM with a UNIX-based operating system (e.g., Linux, macOS): Most code artifacts that we will encounter run best on Unix environments. Windows users should be able to use WSL.
A GitHub account: To interact with source code repositories of analysis tools and target programs, as well as for certain class activities.
Laptop or tablet for in-class presentations: Our classrooms should be equipped with a projector connected via HDMI.

Accessing the Reading Material

When we assign readings, we will provide a link to a web resource containing the official publication. For published academic papers, we will usually reference the Digital Object Identifier (DOI) that takes you to the online proceedings. For other resources such as blog posts, news articles, and software repositories, we will provide a URL to the primary source. For articles behind a paywall, we will provide a PDF via Canvas for internal classroom use only. Please do not distribute these PDFs publicly as doing so may infringe on copyright.

Pre-class Readings

For most classes with assigned readings or tutorials, a quiz will be assigned on Canvas. These quizzes are to be completed individually before the start of class. The quizzes will be based on the assigned reading and will be graded leniently; they are intended to be a checkpoint to ensure that everyone is prepared and on the same page before coming to class. Late submissions will not be accepted, since that would defeat the purpose of the quiz. The lowest four scores (including zeros) will be dropped.

Class Presentations

Depending on the class size, you will be expected to be the discussion lead 1-2 times in the semester. As the discussion lead, you should read the paper carefully (complete all three passes of Keshav’s three-pass approach) and prepare a presentation for the paper along with points to seed the discussions afterwards. In some cases, you may be able to find a video of the authors' own presentation at the conference or their slides, which you are welcome to use. However, the lead is still required to prepare slides for their own view on the work and the paper, including seeding and leading the discussion around the work and any other context necessary.

In class presentations, be sure to avoid infringing on copyright. Most publishers including ACM and IEEE allow using parts of the paper (such as figures) for internal classroom use. If using images sourced from the internet at large, look for works in the public domain or those that allow reuse (e.g., via a Creative Commons License). Provide proper attribution where required. When in doubt, make your own drawings. You are also welcome to use the whiteboard in class in lieu of making complicated custom diagrams on slides.

Participation

A portion of the grade is reserved for class participation, which has both an objective and a subjective component. Partial credit is objectively assigned to class attendance (see also the absence policy below). The remaining credit is subjectively assigned based on active involvement in technical discussions in class, such as asking questions, providing clarifications, and sharing opinions or experiences. The instructor will track student participation through the semester and holistically assign participation grades (low/medium/high) at the end of the semester.

Course Project

The most significant component of the course is a semester-long project. Students may work individually or in groups of two or three (the expectations of project scope scale with team size). Projects should be related to analyzing software in some specific domain and must have a concrete implementation component, but otherwise the topic can be of the students' choosing. PhD students are expected to pick a project topic that explores an open research question, usually aligning with their own thesis work. Masters students are welcome to perform research, but can also pick an engineering-oriented project as long as it engages with large-scale real-world software: either the software-analysis tooling or the target applications should be in regular widespread use. The teaching staff will help students refine their project scope to ensure it meets learning objectives while being appropriately sized for completion within the semester.

In the last week of the semester, all project teams will give a presentation of their project outcomes in class. Additionally, project teams are expected to write up a report of their project in a conference/workshop-style short paper, which will be due in Finals week. Projects are graded on contributions and presented insights. More details will be released closer to the end of the semester.

Class Absence Policy

A significant portion of the final grade is dependent on regular participation in class activities. However, we understand that unexpected life events (e.g., health or family issues) and other professional obligations (e.g., conference travel or university-level athletics events) can cause students to miss a small number of classes. To account for any and all absences or lapses of any type, every student will automatically get full points for the lowest four scores (including zeros) in both the pre-class reading quizzes and class attendance/participation. Students need not inform the instructor ahead of time. This policy will account for lapses of all types, including simply forgetting to submit on time, registering for the class late, illness, conference travel, etc. No other make-up provision will be made for reading responses and class attendance, with the exception of official disability-related accommodations. Please do not email the instructor asking for excuses beyond this policy.

For unexpected contingencies affecting scheduled class presentations and final project presentations, please contact the instructor ASAP; these will be handled on a case-by-case basis.

Collaboration and Academic Integrity

Since this is a discussion-oriented and advanced topics course, collaboration is expected. However, each student is expected to submit pre-class reading responses individually. Course projects and in-class activities may be performed in teams. Any contribution by members outside the team (e.g., assistance provided by open-source software developers) should be explicitly credited. In general, we will follow the standard CMU Academic Integrity Policy.

AI Policy

Generative AI is a powerful technology that can aid in data discovery, software development, and content creation. However, research has shown that the use of AI to "short-cut" educational activities can also hinder learning. Since this class is aimed at graduate students preparing for a career and developing specialized skills, we will embrace the use of generative AI to the extent that it does not hinder learning objectives. Grading will be based in large part on in-person discussions and presentations, which reflect learning better than written content. The following policy is designed with this goal in mind.

You are welcome to use generative AI tools in preparing your presentations, assignments, or in developing your project. However: (1) You must acknowledge the extent of the usage (e.g., “figure XXX generated by Gemini” or “researched case studies using ChatGPT” or “developed parser and visualization UI using Claude Code”), and (2) You are responsible for ownership of the generated content, as well as demonstrating your understanding of the output; you may be penalized if you are not able to discuss/argue in favor of whatever you present.

You are not allowed to use AI tools to compose or edit pre-class reading responses on Canvas. If we detect AI-generated writing, we will consider it a violation of Academic Integrity. Remember that your in-class participation/discussion will reflect your understanding of the reading.

Editorial note: The real test is life. The point of this class is to prepare you for your career/research. This is far more important than the letter grade on your transcript. Nobody is forcing you to take this class. Use your time and money / opportunity cost wisely.

Statement of Support for Students’ Health and Well-being

Grad school isn't easy, so please take care of yourself. Your health matters. Do your best to maintain a healthy lifestyle this semester, including eating well, getting enough sleep, and taking time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at https://www.cmu.edu/counseling. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.