Course Description

This course studies the nature of software bugs and security vulnerabilities arising in complex application domains and surveys specialized program analysis and automated testing techniques for identifying such issues proactively. The course will take a tour of various domains such as mobile systems, databases, web browsers, distributed and networked systems, autonomous vehicles, and smart contracts. For each domain, the class will discuss state-of-the-art research techniques that aim to uncover a special class of software bugs automatically. Apart from the literature review, students will engage significantly with software system design and engineering via a semester-long project, which will involve working with real-world applications and analysis tools for one or more domains.

Logistics

Spring 2023, 12 units
Class: Tue/Thu 5:00pm-6:20pm in GHC 4101

Professor Rohan Padhye
Office hours: Wednesdays 5:30–6:30pm in TCS 325
Email: rohanpadhye@cmu.edu
TA Ao Li
Office hours: Mondays 11am–12noon in TCS 417
Email: aoli@cs.cmu.edu

Should I take this course?

Note: This is not a traditional lecture-based course. Classes will often consist of group discussions and student-led presentations, based on assigned readings of research papers, online articles, or case studies on open-source software.

Prerequisites

This course is open to PhD and Masters students interested in software engineering, program analysis, and/or security. The course assumes some background in understanding the source of common software bugs (e.g., buffer overflows) and dealing with program representations (e.g., abstract syntax trees) or automated testing tools (e.g., fuzzing). Any one of the following courses serves as a sufficient prerequisite: 18-335/732 (Secure Software Systems), 14-735 (Secure Coding), 17-355/665/819 (Program Analysis), 15-411/611 (Compiler Design), 15-414 (Bug Catching), 15-330/18-330/18-730 (Intro to Computer Security). 14-741/18-631 (Intro to Information Security) may also be sufficient, depending on background or related coursework. If you have taken a course equivalent to any of the listed prerequisites at a different institution, or if you have had other relevant experiences (e.g., participating in CTFs or working in industry), please register and contact the instructor via email. You should expect to have the following background:

Degree Requirements Fulfilled

Masters: Contact the instructor to request.

PhD students: Satisfies the ENG requirement of the Software Engineering PhD program. Contact the instructor to request others.

Learning Objectives

Students completing this course should be able to:

Course Topics

Assessments

Schedule

The following schedule of topics is tentative and will be updated in real time during the semester.

Date Topic Assigned Reading (Guide and PDFs on Canvas) Artifacts/Due Dates
Jan 17 Introduction (slides) Meta: How to Read a Paper
Jan 19 General: Static Analysis Paper: Finding Bugs is Easy
Tools: SpotBugs, cppcheck
Case Study: X11 privilege escalation bug
Jan 24 General: Symbolic Execution Papers: EXE: Automatically Generating Inputs of Death,
KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs
Tool: KLEE
Jan 26 General: Fuzz Testing Papers: Coverage-based Greybox Fuzzing as Markov Chain, Semantic Fuzzing with Zest
Tools: AFL++, JQF
Case Study: Heartbleed bug in OpenSSL
Assignment Release
Jan 31 Database Systems Article: How SQLite is tested
Paper: Squirrel: Testing Database Management Systems with Language Validity and Coverage Feedback
Tool: Squirrel
Feb 2 Database Systems Papers: Testing Database Engines via Pivoted Query Synthesis, Finding Bugs in Database Systems via Query Partitioning
Tool: SQLancer
Assignment Checkpoint
Feb 7 Web Applications Articles: OWASP Top 10 and SQL Injection
Case Studies: SQL Injection Hall of Shame
Paper: AMNESIA: Analysis and MoNitoring for NEutralizing SQL-Injection Attacks
Feb 9 Web Applications Case Study: GitLab privilege escalation bug
Papers: Static Detection of Second-Order Vulnerabilities in Web Applications, RESTler: Stateful REST API Fuzzing
Tool: RESTler
Feb 14 Operating Systems Tool: Syzkaller
Case Study: DCCP privilege escalation bug (repro, exploit, fix)
Paper: HEALER: Relation Learning Guided Kernel Fuzzing
Feb 16 Operating Systems Papers: Hyperkernel: Push-Button Verification of an OS Kernel, Scaling symbolic evaluation for automated verification of systems code with Serval (the morning paper article)
Feb 21 Network Protocols Papers: SNOOZE: Toward a Stateful NetwOrk prOtocol fuzZEr, Polyglot: automatic extraction of protocol message format using dynamic binary analysis
Tool: AFLNet
Feb 23 Software-Defined Networks Paper: SPIDER: A Practical Fuzzing Framework to Uncover Stateful Performance Issues in SDN Controllers
Assignment Due
Feb 28 Machine Learning Papers: DeepTest: automated testing of deep-neural-network-driven autonomous cars,
DeepRoad: GAN-Based Metamorphic Testing and Input Validation Framework for Autonomous Driving Systems
Video: Attack on a Stop Sign using Black/White Art Stickers (source)
Mar 2 Machine Learning Papers: Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source, Fuzzing Deep-Learning Libraries via Large Language Models
Mar 7 & 9 Spring break; no class
Mar 14 Mobile Applications Paper: Search-Based Energy Testing of Android
Mar 16 Mobile Applications Paper: TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones
Project page (a bit old): appanalysis.org
Project Proposal Due
Mar 21 Compilers Papers: Fuzzing the Rust Typechecker Using CLP (T),
Compiler Validation via Equivalence Modulo Inputs
Tool: Csmith
Mar 23 Compilers Paper: CompCert: Practical Experience on Integrating and Qualifying a Formally Verified Optimizing Compiler
Project Page: CompCert
Mar 28 Web Browsers
Guest Lecture: Fraser Brown
Paper: Towards a verified range analysis for JavaScript JITs
Tool: Vera
Mar 30 Microservices
Guest Lecture: Chris Meiklejohn
Background Video: Chaos Engineering: A Step Towards Resilience
Papers: Service-level Fault Injection Testing,
Method overloading the circuit
Tool: Filibuster
Project Checkpoint 1
Apr 4 Concurrency Paper: Efficient scalable thread-safety-violation detection: finding thousands of concurrency bugs during testing
Tool: TSVD
Apr 6 Cyber-Physical Systems: Robots Paper: PGFUZZ: Policy-Guided Fuzzing for Robotic Vehicles
Apr 11 Cyber-Physical Systems: Autonomous Vehicles Paper: Neural Network Guided Evolutionary Fuzzing for Finding Traffic Violations of Autonomous Vehicles
Software: ADFuzz
Apr 13 Spring carnival; no class
Project Checkpoint 2
Apr 18 IoT and Embedded Systems Background: A Large-Scale Analysis of the Security of Embedded Firmwares,
Paper: FIE on Firmware: Finding Vulnerabilities in Embedded Systems using Symbolic Execution
Apr 20 Smart Contracts Background: Security Vulnerabilities in Ethereum Smart Contracts,
Article: Step by Step Towards Creating a Safe Smart Contract: Lessons and Insights from a Cryptocurrency Lab
Ethereum attacks collection: Consensys website
Apr 25 Industrial Adoption A few billion lines of code later: using static analysis to find bugs in the real world
Apr 27 Final Project Presentations
May 5 Project Reports Due

Course Logistics and Policies

Technology Requirements

Here is the technology that students may need to use during the semester. If you have any trouble using any of these tools, please talk to the instructor so that we can figure out an accommodation.

Accessing the Reading Material

When we assign readings, we will provide a link to a web resource containing the official publication. For published academic papers, we will usually reference the Digital Object Identifier (DOI) that takes you to the online proceedings. For other resources such as blog posts, news articles, and software repositories, we will provide a URL to the primary source. For articles behind a paywall, we will provide a PDF via Canvas for internal classroom use only. Please do not distribute these PDFs publicly as doing so may infringe on copyright.

Pre-class Readings

For most classes with assigned readings or tutorials, a quiz will be assigned on Canvas. These quizzes are to be completed individually before the start of class. The quizzes will be based on the assigned reading and will be graded leniently; they are intended to be a checkpoint to ensure that everyone is prepared and on the same page before coming to class. Late submissions will not be accepted, since that would defeat the purpose of the quiz. However, see the absence policy below.

Class Presentations

Depending on the class size, you will be expected to be the discussion lead once or twice in the semester. As the discussion lead, you should read the paper carefully (complete all three passes of Keshav’s three-pass approach) and prepare a presentation on the paper, along with points to seed the discussion afterwards. In some cases, you may be able to find the authors' slides or a video of their own presentation at the conference, which you are welcome to use. However, the lead is still required to prepare slides presenting their own view of the work and the paper, to provide any other necessary context, and to seed and lead the discussion around the work.

In class presentations, be sure to avoid infringing on copyright. Most publishers including ACM and IEEE allow using parts of the paper (such as figures) for internal classroom use. If using images sourced from the internet at large, look for works in the public domain or those that allow reuse (e.g., via a Creative Commons License). Provide proper attribution where required. When in doubt, make your own drawings. You are also welcome to use the whiteboard in class in lieu of making complicated custom diagrams on slides.

Participation

A portion of the grade is reserved for class participation, which has both an objective and a subjective component. Partial credit is objectively assigned to class attendance (see also the absence policy below). The remaining credit is subjectively assigned based on active involvement in technical discussions, such as asking questions, providing clarifications, and sharing opinions or experiences, either in class or on Piazza. The instructor will track student participation through the semester and holistically assign participation grades (low/medium/high) at the end of the semester.

Course Project

The most significant component of the course is a semester-long project. Students may work individually or in groups of two or three (the expectations of project scope scale with team size). Projects should be related to analyzing software in some specific domain and must have a concrete implementation component, but otherwise the topic can be of the students' choosing. PhD students are expected to pick a project topic that explores an open research question, usually aligning with their own thesis work. Masters students are welcome to perform research, but can also pick an engineering-oriented project as long as it engages with large-scale real-world software: either the software-analysis tooling or the target applications should be in regular widespread use. The teaching staff will help students refine their project scope to ensure it meets learning objectives while being appropriately sized for completion within the semester.

In the last week of the semester, all project teams will give a presentation of their project outcomes in class. Additionally, project teams are expected to write up a report of their project in a conference/workshop-style short paper, which will be due in Finals week. Projects are graded on contributions and presented insights. More details will be released closer to the end of the semester.

Class Absence Policy

A significant portion of the final grade is dependent on regular participation in class activities. However, we understand that unexpected life events (e.g., health or family issues) and other professional obligations (e.g., conference travel or university-level athletics events) can cause students to miss a small number of classes. To account for such absences, every student will automatically get full points for up to 4 absences (~15%) in both the pre-class reading quizzes and class attendance. Students need not inform the instructor ahead of time. This policy will account for lapses of all types, including simply forgetting to submit on time, registering for the class late, etc. No other make-up provision will be made for reading responses and class attendance, with the exception of explicit disability-related accommodations.

For unexpected contingencies affecting scheduled class presentations and final project presentations, please contact the instructor ASAP; these will be handled on a case-by-case basis.

Collaboration and Academic Integrity

Since this is a discussion-oriented and advanced topics course, collaboration is expected. However, each student is expected to submit pre-class reading responses individually. Course projects and in-class activities may be performed in teams. Any contribution by members outside the team (e.g., assistance provided by open-source software developers) should be explicitly credited. In general, we will follow the standard CMU Academic Integrity Policy.

Statement of Support for Students’ Health and Well-being

Grad school isn't easy, so please take care of yourself. Your health matters. Do your best to maintain a healthy lifestyle this semester, including eating well, getting enough sleep, and taking time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful.

If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at https://www.cmu.edu/counseling. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.