This is a brief for the research paper “A Partial Replication of ‘DeepBugs: A Learning Approach to Name-based Bug Detection’”, published in the artifact track of ESEC/FSE 2021. This paper resulted from a course project in my course ECE 595: Advanced Software Engineering at Purdue University. The team members contributed equally during the semester, but Jordan M. Winkler stayed up late one night to trim the report into a 2-page abstract and so he is the first author on the artifact.
In 2018, Pradel & Sen published a paper called DeepBugs that described a software defect detection tool.
Follow-up note on 19 May 2021: This post was written concurrently with discussions across the cybersecurity research community. Since then: The authors withdrew their paper; the conference chairs described significant changes for the next edition of the conference; the Linux community issued a statement. There is also a related comment from Ted Ts’o (Linux contributor) at the bottom of this blog post.
In April 2021, the Linux developer community issued a blanket ban on contributions from the University of Minnesota. This remarkable outcome occurred as a result of a research project by a team at UMN. The headlines from the…
Power tools are helpful — but use them safely.
This blog post describes an anti-pattern in how some aspiring software engineers use the Internet. My observations are my own, and I have made little effort to connect them to scientific studies. Nevertheless, I hope they are helpful to someone.
As a professor of Computer Engineering, I teach and mentor many aspiring software engineers. These proto-engineers face a temptation that I did not have to struggle with when I was a student: Stack Overflow and its brethren.
Most programming-themed Internet help sites emerged while I was in college, and my friends…
This is a brief for the research paper Using Selective Memoization to Defeat Regular Expression Denial of Service (ReDoS), published at IEEE S&P 2021. I led the work, with help from Francisco Servant and Dongyoon Lee.
In this article I use the word “regex” as shorthand for “regular expression”.
Attackers can use regex-based denial of service (ReDoS) attacks to damage vulnerable web services. These attacks take advantage of the slow algorithm used by regex engines to evaluate regexes. We present novel optimizations to provably improve the worst-case behavior of these engines to linear time. Nothing in life is free, so these…
This is a brief for the research paper A Principled Approach to GraphQL Query Cost Analysis, published at ESEC/FSE 2020. Alan Cha led the work, with help from Erik Wittern, Guillaume Baudart, me, Louis Mandel, and Jim Laredo. Most of these authors are affiliated with IBM Research or IBM’s product teams, as part of IBM’s ongoing involvement with GraphQL.
This project is a follow-up to our previous work studying GraphQL schemas.
This post is intended as a “technical two-pager” to summarize a security vulnerability called Regex-based Denial of Service (AKA Regex DoS, ReDoS). There are a variety of write-ups about ReDoS, but I’m not aware of a good one-stop shop with a higher-level treatment of all aspects of the subject. I have included links at the end to more detailed treatments.
I have used headings liberally to help you navigate to your issue.
A regular expression (regex) is a tool that your engineering team uses to manipulate strings. They probably use it to impose some kind of order on unstructured input, e.g…
My wife Kirsten Davis and I just finished up a two-person academic job search. We were successful!
This essay shares our experiences solving the dreaded “two-body problem”. I hope that it helps another couple in the future.
One note before we begin: My wife studies Engineering Education, and I study Computer Science. The job market in 2020 was pretty good for both of these fields, with a “large” number of openings relative to applicants. This afforded us some luxuries that may not be available to couples in other disciplines.
My wife Kirsten Davis and I were both interested in tenure-track…
This notification was thrilling for two reasons:
This post presents the saga of the paper, and includes the different stages of the manuscript and the reviews each version received. I conclude with some reflections about the process.
My intention in writing the post is to give a behind-the-scenes look at the life of an oft-rejected paper. I have…
This is a brief for the research paper An Empirical Study of GraphQL Schemas, presented at ICSOC 2019. Erik Wittern led the work, with help from Alan Cha (implementation), myself (experimental design), Guillaume Baudart (theoretical analysis), and Louis Mandel (theoretical analysis). Most of these authors are affiliated with IBM, reflecting IBM’s ongoing involvement with GraphQL through the GraphQL Foundation.
Since GraphQL is unfamiliar to many readers, I’ve included a bit more introductory material and illustrations than I usually do.
GraphQL is a query language for data that can be represented as a graph, and reportedly offers…
This is a brief for the research paper Regexes are Hard: Decision-making, Difficulties, and Risks in Programming Regular Expressions, presented at ASE 2019. Mischa Michael led this project, with support from James Donohue, myself, Dongyoon Lee, and Francisco Servant.
In this article, I use the word “regex” as shorthand for “regular expression”.
This paper describes the first large-scale qualitative examination of the ways software engineers interact with regexes. We surveyed 279 professional developers and conducted 17 interviews. …
I am a professor in ECE@Purdue. I hold a PhD in computer science from Virginia Tech. I try to summarize my research findings in practitioner-friendly ways.