Video Coverage: Big Data Technology Warsaw Summit 2018

Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; Krzysztof Zarzycki, GetInData

Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange Polska

Getting more out of your data in 2018 - panel discussion

Data: Big Potential in Business Practice


The Big Data hype is behind us. Business people have no doubt that data has great business value and great transformational potential. Now the time has come for hard work and a true assessment of the real possibilities opened up by large datasets, advanced analytical solutions, and related technologies. At the same time, existing limitations must be better understood. From the business point of view, the greatest challenge is acquiring and activating data for key business projects and providing the skilled people to implement them. All of this (trends, challenges, technologies, solutions, and practical applications) was discussed at Big Data Tech Warsaw 2018 at the end of February by over 550 specialists from 20 countries. Without exaggeration, one could say that for one day Warsaw became the European capital of Big Data.

One of the key trends that will intensify in the coming months and years is the convergence of technologies related to Big Data, machine learning, and artificial intelligence. At the same time, the boundaries between structured and unstructured data sources are blurring. Thanks to this, business people can make decisions faster and more easily. For now, however, it seems that machines will not be able to replace human imagination and intuition for a long time.

“In our bank we use Big Data everywhere. We apply it in a lot of places to either improve customer services or to detect frauds and for risk modelling. It is part of our daily life. On the other side we can obviously serve our customers better,” said Bolke de Bruin, Head of Advanced Analytics Technology at ING.

How does it look in practice? Specialized algorithms analyze large data sets with high efficiency. This makes it possible to issue warnings before events occur, including events that could bankrupt a single company or even shake up an entire industry. When a customer's business gets into trouble, you can ask that customer to settle its bills early. The bank also helps clients find out what the sales volume will be in the coming months, which product accessories will sell best, or how to optimize the supply chain. Other solutions help clients enter the market faster or build and test new products at express speed.

“In the last few years, the most important change in Big Data and artificial intelligence is that we are better able to understand where we can create value with artificial intelligence, Big Data or advanced analytics. We know better where the upsides and downsides are. We are a bit past the hype and it is a good thing. We can actually start seeing the real benefits of that. We know that everything will not be perfect. Some of the things that we do need to be executed by people. That is the biggest change, but there is another thing: within companies it has grown beyond the technical side. So people from the business side are better able to understand what kind of questions to ask the technical guys,” said Bolke de Bruin.

There are still barriers that will be difficult to overcome in the near future. One of them seems to be natural language processing. “We can do quite a lot in this area, but definitely not everything. In the foreseeable future, at least what I can foresee, the human component is going to be a part of a lot of things that we do, especially in the banking area,” argued Bolke de Bruin.

Ready for IoT scale

The largest telecommunications operator in Kazakhstan had to work with GetInData to develop specialized tools for handling Big Data, because the company could no longer operate using traditional solutions. “The previous traditional system did not work as we would like it. It did not offer the necessary scalability (it was able to handle at most 2,000 events per second) and was not reliable. The new solution allows us to handle up to 160,000 events per second, 22 TB of data per month and provide services to 10 million subscribers,” said Alexey Brodovshuk of Kcell.


The problem is even more serious when we look at the amounts of data generated by IoT (Internet of Things) devices. There are more and more solutions to meet the challenges associated with it. Sending all data to centralized data centers becomes inefficient, which is why data processing and filtering must start at the edge.



Ernst Kratky, Solutions Sales Lead Big Data Analytics at Cisco Systems, and Michał Kudelski, Senior Business Solutions Manager at SAS Institute, presented one such solution. The edge-to-edge analytical architecture developed jointly by both companies enables real-time data analysis and decision-making at the edge. Only filtered data is sent to the data center for historical analysis.
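The idea of analyzing at the edge and forwarding only filtered data can be illustrated with a minimal Python sketch. This is not the Cisco and SAS implementation; the `Reading` type, the `process_at_edge` function, and the simple threshold rule are all hypothetical and stand in for whatever analytics a real platform would run.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Reading:
    """One sensor measurement (hypothetical structure, for illustration only)."""
    sensor_id: str
    value: float


def process_at_edge(
    readings: Iterable[Reading], threshold: float
) -> Tuple[List[str], List[Reading]]:
    """Return (actions taken locally, readings forwarded to the data center)."""
    actions: List[str] = []
    forwarded: List[Reading] = []
    for reading in readings:
        if reading.value > threshold:
            # The decision is made locally, without a round trip to the data center.
            actions.append(f"alert:{reading.sensor_id}")
            # Only readings that triggered an event are kept for historical analysis.
            forwarded.append(reading)
    return actions, forwarded


# Assumed sample data: three readings, one of which exceeds the threshold.
stream = [Reading("s1", 20.0), Reading("s2", 95.5), Reading("s3", 21.3)]
actions, forwarded = process_at_edge(stream, threshold=90.0)
print(actions)
print(len(forwarded))
```

In this sketch two of the three readings never leave the edge device, which is the source of the storage and latency savings the speakers describe.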

“A traditional model in which, after gaining access to data, ETL processes are carried out and then the data is analyzed, worked well, but only up to a point. In the new reality, it is insufficient. Data storage costs are too high and response times are too slow. That is why there was a need to get closer to the sources of data, the need to act directly in real time on the data stream,” said Michał Kudelski.

The platform developed by Cisco and SAS Institute is currently being tested by energy companies, among others. With its help, they want to solve the main problem related to the development of a modern power grid: guaranteeing its stability. “The goal is to identify events that can affect the stability of the power grid. The analytical solution allows detecting such events, categorizing them, taking direct action in relation to a specific event and then obtaining data for analysis after the event,” explained Ernst Kratky.

There is also a proof-of-concept project in which real-time data from trucks on the road is collected and analyzed. The complete solution, which was built with standard Cisco components, collects information from 60 sensors placed in each vehicle. The results are impressive. Failures can be predicted with 90% accuracy 30 days before they actually occur. Among other things, this means a 30% increase in vehicle working time. For truck owners, this is a key parameter that translates directly into revenue. The system also reduced the cost of the extended warranty by approximately 20 percent.

Open world

Experts and attendees of the conference agreed that Big Data solutions will continue to be mostly open source. Ecosystems form around these types of projects, and commercial companies are part of them. It seems that, as a whole, the open source model should remain dominant. One example is Yahoo, currently part of the Oath group, which has open-sourced its Vespa technology. This is a powerful platform for processing Big Data in the Apache ecosystem and for serving data to end users.


“Open sourcing this platform makes all the big data maturity level three, i.e. highest level, features available to everyone. It allows automated, data-driven decision-making in real time. An example of such an application may be automatic blocking of fraudulent credit card transactions or personalized movie recommendations computed when needed by the user,” said Jon Bratseth, Distinguished Architect at Yahoo. He said that Vespa has hundreds of applications at Oath. It supports billions of users, over 200,000 queries per second, and over one billion content items. It can be seen as a technology complementary to Hadoop.


However, we cannot assume that Big Data tools will be limited to open source solutions only. Commercial solutions will appear in some specialist areas. “A clear trend is focusing on applications, high-level tools that allow you to handle programs without coding. Tool platforms for managing large data sets will appear. Many of these tools will be created in the open source model. However, a part will focus on the high-level tools themselves,” explained Joey Frazee, Solutions Engineer at Hortonworks.

The development of tools that do not require coding skills will clearly take off. It can be expected that applications that let you analyze data through friendly, natural-language interfaces will soon become standard. They will provide visual results in real time. “High-level tools will solve some of the problems related to the lack of big data specialists in the market. Intuitive, easy to use tools will reduce the barrier to entry into the Big Data world,” added Joey Frazee.

Big Data as a Service

During Big Data Tech Warsaw 2018, many experts also said that the tools themselves will increasingly be provided as a service. One example is the Relativity One platform. “We are a company operating in an area known as legal tech. Our technology helps lawyers to manage and analyze large datasets relevant to litigation and investigations,” said Elise Tropiano, Senior Technical Product Manager at Relativity.

When a company is taken to court, the process of finding the documents needed for its defense, such as e-mails that could affect the case, is extremely costly. It’s not just about the cost of searching and paying lawyers. Failing to meet the obligation to locate relevant documents can also result in penalties.

The costs are usually so high that even organizations that do not consider themselves at fault usually seek a settlement. The costs associated with a hearing are so severe that an out-of-court agreement is more cost-effective. “Relativity provides a solution in the Software as a Service model, which comprehensively allows you to handle the entire process of searching and analyzing relevant information. The largest case that our platform served was around 750 million documents,” said Elise Tropiano. With a traditional approach, preparation costs can reach USD 1.5 million. Using the platform can limit them to tens of thousands of dollars and get the work done in a few days.

We invite you to the next edition of Big Data Technology Warsaw Summit in February 2019!