Behind the scenes of science

Introduction

In May 2018, I got some pleasant news: an academic paper accepted to USENIX Security 2018 (full version here, summary here).

This notification was thrilling for two reasons:

  1. This was the first paper I had owned from start to finish.

This post presents the saga of the paper, and includes the different stages of the manuscript and the reviews each version received. I conclude with some reflections about the process.

My intention in writing the post is to give a behind-the-scenes look at the life of an oft-rejected paper. I have heard rumors of such works before, but I am not aware of a description of the process. I hope this post is interesting to current and future graduate students, both as a “fossil record” and as an encouragement when dealing with rejection.

General idea of the paper

Suppose a server handles many clients on one thread. If a client can convince the server to spend longer than expected handling a request, the client can use this to carry out a Denial of Service (DoS) attack. My paper describes this type of attack as Event Handler Poisoning (EHP), since such servers typically use the Event-Driven Architecture (EDA).

The most prominent example of a server architecture like this is Node.js, in which a single Event Loop (thread) handles client interactions, with support from a small fixed-size Worker Pool (threadpool). If a request might cause the Event Loop or a Worker to block, then a DoS attack can result, since while a thread is blocked on one client the other clients will starve.
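To make the starvation effect concrete, here is a minimal sketch of a single-threaded server that handles requests strictly one after another. The function name `completionTimes` and the millisecond costs are assumptions chosen for illustration; this is a toy model, not code from the paper or from Node.js.

```typescript
// Toy model of one thread serving a queue of requests.
// All requests arrive at time 0 and are handled in order, one at a time.
// handlerCostsMs[i] is how long the thread spends on request i.
// Returns the time (in ms) at which each request's response is sent.
function completionTimes(handlerCostsMs: number[]): number[] {
  let clock = 0;
  return handlerCostsMs.map((cost) => {
    clock += cost; // the thread is busy with this request for `cost` ms
    return clock;  // every later request waits at least this long
  });
}
```

With three cheap requests, `completionTimes([1, 1, 1])` gives `[1, 2, 3]`. If the first request is a poisoned one that costs 5000 ms, `completionTimes([5000, 1, 1])` gives `[5000, 5001, 5002]`: the two honest clients each need 1 ms of work but wait over five seconds for it.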

Our workshop paper sketched the attack. Our four conference submissions described the attack in more detail and proposed a defense: first C-WCET Partitioning (rejected from CCS’17 and NDSS’18) and then First-Class Timeouts (rejected from S&P’18, conditionally-then-fully accepted at USENIX Security’18).

A brief timeline

  1. At EuroSec’17 (a workshop) we described the Event Handler Poisoning attack and sketched some possible defenses. That paper is available here.

Collaborators

The EuroSec’17 workshop paper was done in collaboration with Gregor Kildow and my advisor, Dr. Dongyoon Lee. After the workshop paper, Gregor left the project.

For CCS’17 and NDSS’18 my friend Ayaan Kazerouni helped with some analysis of the npm ecosystem in support of our C-WCET Partitioning concept. After that direction dead-ended, Ayaan bowed out to pursue his own projects and requested to be removed from the author list if we changed our defense to the timeout approach suggested by the NDSS referees.

In Fall 2017 I implemented the First-Class Timeouts prototype with help from Eric R. Williamson. He and I rewrote the second half of the manuscript to describe the new defense.

The saga in detail

EuroSec 2017

This paper began life as a project in Dr. Daphne Yao’s Fall 2016 Security course. After completing the course project with my partner Gregor, I wrote it up more formally and it was accepted to EuroSec’17.

Artifacts

CCS 2017

In conjunction with the EuroSec’17 workshop, I prepared a conference-length submission to CCS’17. We extended the contributions of the EuroSec’17 paper and implemented defenses against one CPU-bound EHP attack (ReDoS) and one I/O-bound EHP attack (Slow Files) in Node.js.

Our defenses were based on our proposed “C-WCET Partitioning” design principle. In Constant Worst-Case Execution Time (C-WCET) Partitioning, all operations performed by the Event Loop or Worker Pool are partitioned into constant-time pieces, with state preserved across partitions using a technique like baton passing or closures. An EDA-based server partitioned in this manner will serve small requests promptly, while arbitrarily expensive (malicious) requests will regularly (in constant time) defer to the small requests.
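The partitioning idea can be sketched with closures and a round-robin queue. Everything here is an assumption made for illustration: the `Task` shape, the helper names `makeSumTask` and `runRoundRobin`, and the use of a summation as the "expensive" work. It is a simplified stand-in for the Event Loop, not the prototype described in the submission.

```typescript
// A partitioned task: each call to step() does a bounded amount of work
// and returns true once the whole task has finished.
type Task = { name: string; step: () => boolean };

// Hypothetical example task: sum 1..n using at most `chunk` additions per step.
// The loop state (i, total) is preserved across partitions by the closure.
function makeSumTask(
  name: string,
  n: number,
  chunk: number,
  results: Record<string, number>
): Task {
  let i = 1;
  let total = 0;
  return {
    name,
    step: () => {
      const end = Math.min(n, i + chunk - 1);
      for (; i <= end; i++) total += i;
      if (i > n) {
        results[name] = total; // publish the result when the task completes
        return true;
      }
      return false; // more partitions remain; yield to other tasks
    },
  };
}

// Round-robin scheduler standing in for the Event Loop.
// Returns task names in the order in which the tasks completed.
function runRoundRobin(tasks: Task[]): string[] {
  const queue = tasks.slice();
  const finished: string[] = [];
  while (queue.length > 0) {
    const task = queue.shift() as Task;
    if (task.step()) finished.push(task.name);
    else queue.push(task); // unfinished work goes to the back of the queue
  }
  return finished;
}
```

If a task summing 1..1000000 is queued ahead of a task summing 1..10, both with a chunk size of 1000, the small task still completes first: the large task gives up the thread after each bounded partition instead of running to completion.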

  • Our ReDoS defense used a hybrid regex engine. When possible we evaluated regexes using Russ Cox’s linear-time regex engine, RE2.

Neither of these defenses was complete. Our ReDoS defense was an O(n)-partitioning only for regex evaluations that RE2 supports; if the regex was unsupported by RE2, we fell back to Node’s built-in exponential-time regex engine, Irregexp. Similarly, KAIO (kernel asynchronous I/O) on Linux only supports regular files, so a read from a slow device file like /dev/random would still be offloaded to the Worker Pool, where it would block.
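The fallback decision in a hybrid regex engine can be sketched as a check for features that a linear-time engine cannot handle. RE2 does not support backreferences or lookaround assertions, so this sketch scans for those two features. The names `needsBacktracking` and `chooseEngine` are hypothetical, the scan is a conservative heuristic rather than a full regex parser, and it is not how our prototype made the decision.

```typescript
type Engine = "linear" | "backtracking";

// Heuristic scan: does the pattern use backreferences or lookaround?
// Conservative by design: a false positive only means we fall back to
// the slower engine, which is safe for correctness.
function needsBacktracking(pattern: string): boolean {
  let inClass = false; // true while inside a [...] character class
  for (let i = 0; i < pattern.length; i++) {
    const c = pattern[i];
    const next = pattern[i + 1];
    if (c === "\\") {
      if (!inClass && next !== undefined) {
        const numbered = next >= "1" && next <= "9";        // e.g. \1
        const named = next === "k" && pattern[i + 2] === "<"; // e.g. \k<name>
        if (numbered || named) return true;
      }
      i++; // skip the escaped character
    } else if (inClass) {
      if (c === "]") inClass = false;
    } else if (c === "[") {
      inClass = true;
    } else if (c === "(" && next === "?") {
      const kind = pattern.slice(i + 2, i + 4);
      // (?=...) (?!...) (?<=...) (?<!...) are lookaround assertions
      if (kind[0] === "=" || kind[0] === "!" || kind === "<=" || kind === "<!") {
        return true;
      }
    }
  }
  return false;
}

// Route the pattern to the linear-time engine when possible.
function chooseEngine(pattern: string): Engine {
  return needsBacktracking(pattern) ? "backtracking" : "linear";
}
```

Under this sketch, the classic ReDoS pattern `^(a+)+$` is routed to the linear-time engine, while `(a)\1` and `a(?=b)` fall back to the backtracking engine and remain exposed to worst-case behavior.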

Artifacts

NDSS 2018

We did not change our prototype in this submission. Instead we focused our time on rewriting the manuscript to clarify our findings in the hopes that a clearer presentation of our ideas and their strengths and weaknesses would be acceptable.

It was not. Encouragingly, we had one referee rate the manuscript a “strong accept” and argue in favor of the work, but this referee was persuaded by the other referees to reject the manuscript during the PC discussion.

Artifacts

Oakland 2018

Based on the criticisms of the CCS and NDSS reviewers, we decided to explore timeouts as a solution to Event Handler Poisoning attacks. We implemented the Timeout Approach, modifying our instantiation (Node.js) to throw TimeoutExceptions if JavaScript (on the Event Loop) or Tasks (on the Worker Pool) take too long. We kept the first half of the paper but rewrote the second half to describe the new concept and prototype. By this time I had read Lamport’s “State the Problem Before Describing the Solution” paper and incorporated his suggestion to describe correctness conditions before presenting the solution.
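The flavor of a timeout-based defense can be approximated at the application level. Our prototype enforced timeouts inside Node.js itself; the sketch below instead uses a swapped-in, cooperative technique in which the handler must call a checkpoint function periodically. The names `runWithBudget` and `makeTimeoutError`, the error name "TimeoutError", and the injectable clock are all assumptions for illustration.

```typescript
// Hypothetical error factory; the name "TimeoutError" is an assumption.
function makeTimeoutError(budgetMs: number): Error {
  const err = new Error("handler exceeded its " + budgetMs + " ms budget");
  err.name = "TimeoutError";
  return err;
}

// Run `handler` with a time budget. The handler receives a `checkpoint`
// function and is expected to call it regularly; once the budget is spent,
// the next checkpoint call throws. This is cooperative: a handler that
// never calls checkpoint() cannot be interrupted by this sketch.
function runWithBudget<T>(
  budgetMs: number,
  handler: (checkpoint: () => void) => T,
  now: () => number = Date.now // injectable clock, useful for testing
): T {
  const deadline = now() + budgetMs;
  const checkpoint = () => {
    if (now() > deadline) throw makeTimeoutError(budgetMs);
  };
  return handler(checkpoint);
}
```

A caller would wrap each request handler in `runWithBudget` and treat a thrown "TimeoutError" like any other exception, for example by aborting the request and returning an error response. The cooperative limitation is the reason the real defense has to live in the runtime rather than in application code.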

The referees criticized the novelty of the work and rejected the paper.

Artifacts

USENIX Security 2018

Similar to what we did between CCS and NDSS, after Oakland we did not touch our prototype but instead focused on effectively communicating our ideas. In a major rhetorical shift, we changed the language from “Timeout Approach” to “First-Class Timeouts” to better emphasize the novelty of our proposal. We didn’t want the referees to think we were just proposing timeouts, since First-Class Timeouts really require re-thinking how EDA-based servers should be written.

Interestingly, as at NDSS, we had one referee rate the manuscript a “strong accept” and argue in favor of the work. This time our champion persuaded the other referees to accept the manuscript during the PC discussion, though the referees attached conditions to our acceptance and assigned us to a shepherd before final acceptance. We rewrote the manuscript to address the concerns of the referees, and the shepherd agreed to accept it to USENIX Security.

Artifacts

Reflections

  1. Don’t give up! This paper was submitted to all four of the top security conferences and was rejected from three of them.

I am a professor in ECE@Purdue. I hold a PhD in computer science from Virginia Tech. I try to summarize my research findings in practitioner-friendly ways.