Two EIT Digital Master School students developed a feature during their thesis research that significantly reduces the up- and downscaling time of Apache Flink. The work proved so relevant that they were selected to present their solution at two global conferences: Flink Forward, organised by Ververica, the original creators of Apache Flink, and Beam Summit, which is organised by Google. Their solution can reduce up- and downscaling time from hours to seconds. This saves companies a lot of money and reduces energy waste, making the computing industry more sustainable.
Apache Flink is an open-source framework and distributed processing engine, designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. The framework is used by big companies such as Alibaba, WWS, Uber, Zalando, eBay, LinkedIn, Spotify, Ericsson and Huawei, to name a few.
Muhammad Haseeb Asif and Sruthi Sree Kumar, who both study Cloud Computing and Services at the EIT Digital Master School (now called Cloud and Network Infrastructures), learnt about Apache Flink during a guest lecture by Assistant Professor Paris Carbone in the second year of their double-degree master’s at the EIT Digital partner university KTH; they spent their first year at TU Berlin. “We were very interested in the topic”, explain the students. They discussed writing their thesis about Apache Flink with Carbone, who is also lead researcher at RISE Research Institutes of Sweden.
In January they started working as researchers at RISE for their internship and thesis, which is titled: FlinkNDB - Skyrocketing the stateful capabilities of Apache Flink. During their thesis, they developed FlinkNDB on top of MySQL Cluster Engine. This is a major feature that improves the up- and downscaling functionality of Apache Flink. Haseeb Asif explains: “Scaling a cluster in and out in no time is one of the active areas of research in distributed data processing. When you process a lot of data you need a lot of computing resources; conversely, less data needs fewer resources. But scaling between big and small amounts of resources might take hours. With our application, we reduce those hours for switching to seconds.”
Up- and downscaling within seconds instead of hours has a huge impact, says Sree Kumar. “A lot of companies are not downscaling when less data is being processed. That takes too much time and thus money. So, they keep all their CPUs running while only using a fraction of them when the amount of data processing is small. This is not sustainable. With our solution, we only use the resources that are needed. There is no need to waste energy anymore and that saves companies a lot of money.” The solution can also be used for quick system recovery after a crash.
While talking about this, their supervisors within the EIT Digital network suggested applying for a conference. “We applied for Flink Forward, the annual global conference for the Apache Flink community. All big companies gather here to learn from each other about different types of innovations. And we were invited”, says Sree Kumar. Flink Forward was held from 19 to 22 October. They were listed as speakers alongside people from Uber, Netflix, Amazon, Bol.com, Yelp, Spotify, Intel, Ververica, Intuit, Microsoft, and Alibaba. “It was an amazing experience”, says Asif. “People were tweeting about our project; someone said that our presentation was the best talk of the day.”
Google itself approached the students to speak at the Beam Summit 2020, a conference for the worldwide community of Apache Beam users and contributors that was held from 24 to 28 August. Google spotted the students because they were active within the open-source community around their work. They were using Apache Beam, an open-source project from Google, to test their Apache Flink feature. The title of their talk was NEXMark-Beam: Your Best Companion For Testing And Benchmarking New Core Stream Processing Libraries.
The students felt comfortable giving the presentations. After all, at the EIT Digital Master School they are trained to do so. “We have learned how to present, how to sell. All those things helped us a lot. All of this would not be possible without the education within EIT Digital Master School and the encouraging EIT Digital network.”
Their contributions to the conferences and the feedback they received give the students the motivation to move on. “When people say that we have developed something that they were looking for, it gives us inspiration to deliver”, says Sree Kumar.
The project they started working on is open source. The students are now considering turning it into a product in its own right and building a business around it. Before that, they want to consult the EIT Digital network on how to get funding, and to finish their thesis, which is planned to be finalised by the end of the year.
Interested in the students’ work? Read their blog on Medium.