EdgeWise: A Better Stream Processing Engine for the Edge


We wanted to be able to perform streaming data analysis (i.e. low-latency, medium-throughput computation) on Edge-class devices (i.e. few-core, small-memory, everything a server isn’t). We took a look at existing stream processing engines and identified a design decision that limits their effectiveness on Edge-class devices. These engines assume they are being used on non-resource-constrained systems, and allocate many threads regardless of the number of physical cores on the system. We showed using a mix of theory and experiments that tailoring the number of threads to the number of cores can yield large latency gains without a loss in throughput, provided you add in an intelligent scheduler to manage resources wisely.

Motivation and Background

Mechanical Engineering researchers at Virginia Tech have worked to instrument Goodwin Hall, a new engineering building on campus, with hundreds of sensors that collect gigabytes of data every day. There’s no particular reason to stop at hundreds of sensors and gigabytes of data, of course, other than limitations on the ability to analyze that data.

Edge Computing

You can solve a lot of problems using Cloud technologies. The Cloud is great for building “big” applications. It can automatically scale up to handle bursty traffic, offer unlimited storage for your big database o’ customer data, and mask hardware failures from your customers. But for IoT sensor data, shipping everything to the Cloud has two costs:

  1. You have to pay ($) for the network bandwidth to ship data to the Cloud and to ingest it. Perhaps your sensors generate enough data that it would be cost-prohibitive to ship, even after sampling, compression, etc.
  2. You have to pay (time) for the network latency to and from the Cloud. Perhaps the analysis you want to run won’t take that long, so the travel time will dominate the total time spent to analyze the data. If your sensor is connected to an Internet backbone, the travel time won’t be that long. If your sensor is in the middle of a desert and bounces radio waves off of satellites to call home, it might be another story.

Stream processing engines

OK, so we’ve decided to perform data analysis at the Edge, on a network of resource-constrained Gateway devices, rather than going all the way to the Cloud. We turn to stream processing as a natural way to describe the analyses.

The Gateways at the Edge connect Things to the Cloud, and can perform local data stream processing. The analysis is expressed as a topology: a directed acyclic graph (DAG) of operators through which the data flows. There is a low-latency connection between the Things and the Edge, a high-latency connection between the Edge and the Cloud, and we assume only modest computational resources at the Edge.
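To make the topology-as-DAG idea concrete, here is a toy sketch in Python. It is not any real engine's API; the operator names and functions are invented for illustration. Each operator applies a function to each incoming tuple and forwards the result to its downstream operators:

```python
class Operator:
    """A node in the topology: applies a function to each tuple
    and forwards the result to its downstream operators."""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.downstream = []

    def connect(self, op):
        """Add a downstream edge in the DAG; returns op to allow chaining."""
        self.downstream.append(op)
        return op

    def process(self, item):
        out = self.fn(item)
        if out is None:      # sinks produce no output
            return
        for op in self.downstream:
            op.process(out)

# A tiny linear topology: parse -> scale -> sink.
results = []
parse = Operator("parse", lambda line: float(line))
scale = Operator("scale", lambda x: x * 2)
sink = Operator("sink", lambda x: results.append(x))
parse.connect(scale).connect(sink)

for line in ["1.5", "2.0"]:  # simulated sensor readings
    parse.process(line)
# results is now [3.0, 4.0]
```

In a real engine each operator would run concurrently and the DAG edges would be queues rather than direct calls; the point here is only the structure of the analysis.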

Research plan

  1. Study existing stream processing engines to see how suitable they are in our context.
  2. If unsuitable, identify the problem and design a solution.

The (un-)suitability of stream processing engines

We looked at the major stream processing engines: Apache Storm, Apache Flink, and Apache Heron. All of them allocate many threads, without regard for the number of physical cores, and rely on the OS to schedule those threads. On Edge-class devices this causes two problems:

  1. Contention. Contention between the threads is pure overhead, and since we have relatively few processors, the time lost to contention can be significant for larger analysis topologies.
  2. Memory limits. We have limited memory and we can’t magic new processors into existence, so if the system becomes overloaded, imbalanced queue lengths can leave us with nowhere convenient to put incoming data.

Redesigning stream processing engines for the Edge

Based on our analysis so far, we set out to modify existing stream processing engines to fit the Edge context. The result was our system, EdgeWise: a fixed-size worker pool, sized to the number of cores, paired with a congestion-aware scheduler that decides which operator each worker runs next.
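To illustrate the idea, here is a simplified sketch, not EdgeWise's actual implementation. A fixed-size worker pool pulls work from per-operator queues, and a congestion-aware policy, approximated here as "run the operator with the longest input queue," decides what each worker does next. For determinism the sketch drives scheduling from a single loop; in the real system one such loop would run per core:

```python
from collections import deque

class CongestionAwareScheduler:
    """Toy congestion-aware scheduler: each scheduling step runs
    the operator with the most pending input, one tuple at a time."""
    def __init__(self):
        self.ops = {}      # name -> (fn, downstream name or None)
        self.queues = {}   # name -> deque of pending input tuples

    def add_operator(self, name, fn, downstream=None):
        self.ops[name] = (fn, downstream)
        self.queues[name] = deque()

    def submit(self, name, item):
        self.queues[name].append(item)

    def step(self):
        """One scheduling decision: run the most congested operator once."""
        name = max(self.queues, key=lambda n: len(self.queues[n]))
        if not self.queues[name]:
            return False   # nothing pending anywhere
        fn, downstream = self.ops[name]
        out = fn(self.queues[name].popleft())
        if downstream is not None:
            self.queues[downstream].append(out)
        return True

# Two-operator pipeline: 'scale' feeds 'sink'; results collected in a list.
results = []
sched = CongestionAwareScheduler()
sched.add_operator("scale", lambda x: x * 10, downstream="sink")
sched.add_operator("sink", lambda x: results.append(x))
for i in range(3):
    sched.submit("scale", i)
while sched.step():
    pass
# results is now [0, 10, 20]
```

Always draining the most congested operator keeps queue lengths balanced, which is what addresses the memory-pressure problem described above.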


Experimental design

  • We used Raspberry Pi v3 devices as representative Gateway-class devices.
  • We evaluated four analysis topologies from the RIoTBench benchmark suite [3].
  • We evaluated each topology five times in each configuration. We did not take further measurements because we observed low variance in each configuration.
  • We conducted both single-node and clustered experiments to see the effect of our implementation at finer and coarser granularities.


Here is one figure from our single-node experiment:

Throughput-latency curve for the PRED topology from RIoTBench, evaluated at varying input rates on different stream processing engines. Storm is Apache Storm v1.1.0. The “WP+Random” line shows the effect of using a worker pool (threadpool) and relying on the OS to schedule the threads, like Storm does. The other “WP+X” lines show implementations of schedulers from the database literature. The EdgeWise curve shows the performance of a worker pool with our scheduler. EdgeWise’s dynamic scheduler performs as well as the WP+MinLat scheduler, with no need for profiling prior to deployment.
And here is one from our clustered experiment:

Throughput as we vary the number of nodes. Each point shows the highest throughput we could achieve without exceeding a latency threshold of 100 ms (yes, a somewhat arbitrary threshold). EdgeWise outperforms Storm in every configuration.
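Each point in the clustered figure can be derived mechanically: sweep the input rate, record end-to-end latency at each rate, and report the highest rate whose latency stays under the threshold. A toy sketch; the measurements below are invented for illustration:

```python
# Hypothetical (input_rate_tuples_per_sec, latency_ms) pairs from a rate sweep.
measurements = [
    (1000, 18.0),
    (2000, 35.5),
    (4000, 82.1),
    (8000, 410.0),  # overloaded: latency blows past the threshold
]

def max_throughput(measurements, latency_threshold_ms=100.0):
    """Highest input rate whose observed latency stays within the threshold."""
    ok = [rate for rate, latency in measurements if latency <= latency_threshold_ms]
    return max(ok) if ok else None

print(max_throughput(measurements))  # prints 4000
```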

Conclusions and future work

In this paper we explore the implications of the Fog/Edge Computing paradigm in a promising use case: stream processing. We found that existing stream processing engines were designed for the Cloud and behave poorly in the Edge context. Using a congestion-aware scheduler and a fixed-size worker pool, we showed that these engines can obtain higher throughput and lower latency at the Edge.


My favorite part about this paper is the final sentence: “Sometimes the answers in system design lie not in the future but in the past.” Our observation about threadpools was not new; we found it articulated in research papers from the early 2000s, and it predates those as well. The concept of “Let’s use a dedicated scheduler” is hardly new either. But the application to stream processing engines in the Edge context added enough new wrinkles that our paper could stand on its own. I enjoy the barefaced acknowledgment, however, that the novelty is less about the solution and more about the interpretation. It’s a distinction that I’m glad the computer science research community appreciates.

More information

  1. The full paper is available here.
  2. The PowerPoint slides and presentation video are available here.
  3. The research artifact is available here.


[1] Bonomi et al., 2014. Fog computing: A platform for internet of things and analytics.



James Davis

I am a professor in ECE@Purdue. I hold a PhD in computer science from Virginia Tech. I blog about my research findings and share tips for engineering students.