Our own ELK, or how we “reinvented the wheel” for statistics gathering and project analysis

Hello! My name is Ivan Melnichuk, and I’m the head of the programming department at Nexteum. In this article, I’ll tell you about an interesting case. For the business, it was a breath of fresh air and, at the same time, an ambitious goal and an opportunity to scale up. For the development team, it was a real challenge: within a few months we had to launch 8 business directions on one online platform, each of which was supposed to sell, attract new customers and work smoothly.

What were the main challenges?

First of all, we had to please the customer. In 2017, the business decided to actively scale the online store. Our goal was to create 8 trading directions on one online platform. That meant a rapid expansion of the assortment, more content and a sharp rise in traffic.

Since our web platforms are high-load online stores, the main task was to keep the website stable under heavy traffic and to keep page response times short.

On short notice, we needed not only to increase the number of catalogues on one platform, but also to solve the analytics and statistics problem, because Google Analytics hides some data and sometimes gives only 5% of reliable information.

What were the problems and why?

As the online platform expanded, there came a moment when everything failed at once: the database went down, the cache broke, and the website became unstable or slow. The reasons were fairly obvious. Our website had become a multifunctional catalogue with very different loads, and its life had changed.

The culprits were:

  • a huge amount of content;
  • growth of organic traffic;
  • omnipresent bots.

How did we solve the problem?

In two days, we designed and configured our own ELK setup, which gave us access to reliable statistics and became an excellent tool for comprehensive tracking of the project’s status.

Visually, it looks like this:

You can display whatever you want on this graph. But the most important indicators for me were:

  • the diff from the previous week;
  • percentiles of response time;
  • the average number of requests;
  • the number of cache connections;
  • cache timing.
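As a rough illustration of how indicators like these can be derived from raw samples, here is a minimal Python sketch. It is not our production code: the function names, the nearest-rank percentile method and the sample numbers are all assumptions made for the example (in a real ELK setup these aggregations would normally be computed by Elasticsearch and Kibana).

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def week_over_week_diff(current, previous):
    """Relative change, in percent, against the previous week's value."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100


# Hypothetical response times in milliseconds (made-up sample data).
response_times_ms = [120, 95, 310, 101, 87, 1450, 99, 130, 110, 105]

summary = {
    "p50_ms": percentile(response_times_ms, 50),
    "p90_ms": percentile(response_times_ms, 90),
    "avg_ms": mean(response_times_ms),
    # Hypothetical request counts: this week vs. the previous week.
    "requests_diff_pct": week_over_week_diff(1200, 1000),
}
print(summary)
```

The single slow request (1450 ms) pulls the average far above the median, which is why percentiles of response time are more useful on a graph than the average alone.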

This is what the flowchart of our ELK looks like:

Once all the data has been collected in syslog-ng, it is quickly parsed in Logstash. As soon as the data appears in Elasticsearch, Kibana displays it on the graph.
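To give an idea of what the parsing step does, here is a small Python sketch that turns one access-log line into a structured event of the kind that could be indexed in Elasticsearch. In our setup this work is done by Logstash, not by Python; the log layout (a combined-style line with the Nginx `$request_time` value appended at the end), the field names and the sample line are assumptions for the sake of the example.

```python
import json
import re

# Assumed layout: combined access log with $request_time appended.
LINE_RE = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'(?P<request_time>\d+\.\d+)'
)


def parse_access_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Convert numeric fields so they can be aggregated later.
    event["status"] = int(event["status"])
    event["bytes_sent"] = int(event["bytes_sent"])
    event["request_time"] = float(event["request_time"])
    return event


# Made-up sample line in the assumed format.
sample = (
    '203.0.113.7 - - [12/Mar/2018:10:15:32 +0200] '
    '"GET /catalog/phones HTTP/1.1" 200 5120 "-" "Mozilla/5.0" 0.134'
)

event = parse_access_line(sample)
print(json.dumps(event, indent=2))
```

Lines that do not match return `None` instead of raising, so a malformed entry can be skipped or counted without stopping the stream.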

Pros and additional features of this solution

This system is very easy to implement and quick to put into operation (as I said, we needed only two days to launch it). It is also flexible thanks to the log variables of Apache and Nginx, and it can log third-party services as well.

Our ELK gives access to many useful features. With its help, you can log timings of:

  • memcached;
  • response to requests;
  • connection to the database.
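One simple way to collect such timings on the application side is to wrap each operation in a timer and write the result into the log line. The Python sketch below shows the idea only: the helper names `timed` and `format_timings`, the key names and the `time.sleep` calls standing in for real cache and database calls are all hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timings, name):
    """Measure the wrapped block and add its duration (ms) to timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        timings[name] = timings.get(name, 0.0) + elapsed_ms


def format_timings(timings):
    """Render timings as key=value pairs that a log parser can split easily."""
    return " ".join(
        f"{name}={value:.1f}ms" for name, value in sorted(timings.items())
    )


timings = {}

with timed(timings, "memcached"):
    time.sleep(0.002)  # stand-in for a real cache lookup

with timed(timings, "db_connect"):
    time.sleep(0.005)  # stand-in for opening a database connection

print(format_timings(timings))
```

Because the durations are accumulated per name, several cache calls within one request add up to a single `memcached` value in the resulting log line.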

 

Instead of an epilogue

Our ELK has made everyone happy: the customer, who received reliable statistics; the users, who get uninterrupted service; our team leaders, who can control the situation; and, of course, our SEO department, which got answers to all its questions about traffic.