Monitoring complex stacks and detecting when things go haywire is hard. Big Data Europe offers a component which monitors Docker’s network traffic in order to debug erroneous states.
The HTTP logger monitors network traffic, transforms its HTTP contents into a JSON format, stores it in Elasticsearch, and visualizes it in Kibana.
In this post, we sketch the context and indicate how to use the component. Further posts will explain how this is technically implemented on top of Docker networks.
A word about monitoring
Looking at the problem of monitoring from a high level, anything structured can be logged and introspected, as long as you can make sense of said structure.
Applications can be monitored at various levels. The most common approaches are checking the system load (covering CPU and memory usage) and watching the application’s logs. If the system load spikes, the application is presumably executing a complex operation. Next up are the system logs, which contain custom output generated at various steps of the process; the logic for these logs is written during the development of the application. The former approach provides fewer insights but requires no manual input. A somewhat less used alternative exists in the form of custom user interfaces showing the current state of the system. These user interfaces are easy to interpret and can provide good insights, but they are even more expensive to develop.
Each of the systems mentioned above interprets the information it has on the system.
The approaches listed here all offer ways of monitoring services, at different costs. We set out to see whether other forms of monitoring were possible. For each level of monitoring, the base format of communication needs to be known. When showing the load of the system, we assume a system with a CPU and memory. When showing the logs of a system, we assume a developer outputs text statements at various log levels throughout the application. When writing a custom app to introspect the status of an application, we assume that application exposes its state in a format the custom app understands.
What output to expect
The HTTP logger assumes HTTP traffic is sent over a Docker Network. Any stack which responds to HTTP traffic can be monitored.
Information on the HTTP calls is stored in Elasticsearch, which can be visualized with Kibana. What can be extracted from the interface largely depends on which statistics you retrieve with Kibana, and on what information we currently store about the HTTP requests.
The HTTP logger runs separately from any specific stack, hence comparisons between stacks can be made. Only HTTP traffic is taken into account. From each HTTP call, extracted & enriched information is stored. Enrichment takes the form of the stack’s name, the container’s name, and the extraction of some headers. You can expect to retrieve information on the timings, the headers, and possibly the content of JSON calls. In fact, we use the HAR (https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/HAR/Overview.html) format as our base for information. With the right headers set, you can track how calls propagate between services.
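To sketch what such a stored record might look like, the snippet below builds a minimal HAR-style entry and pulls out the basics. The request, response, time, and timings fields follow the HAR specification; the `_stack` and `_container` enrichment fields are hypothetical names for the kind of stack and container metadata the logger adds.

```python
import json

# A minimal HAR-style entry as the logger might store it.
# "_stack" and "_container" are hypothetical enrichment field names;
# the remaining fields follow the HAR specification.
entry_json = """
{
  "startedDateTime": "2017-03-01T10:15:00.000Z",
  "time": 254,
  "request": {"method": "GET", "url": "http://backend/resource",
              "headers": [{"name": "Accept", "value": "application/json"}]},
  "response": {"status": 200, "content": {"mimeType": "application/json"}},
  "timings": {"send": 1, "wait": 230, "receive": 23},
  "_stack": "webcat",
  "_container": "webcat_backend_1"
}
"""

entry = json.loads(entry_json)

# Summarize the call: method, URL, response status, and total time.
summary = "{method} {url} -> {status} in {time}ms".format(
    method=entry["request"]["method"],
    url=entry["request"]["url"],
    status=entry["response"]["status"],
    time=entry["time"],
)
print(summary)  # GET http://backend/resource -> 200 in 254ms
```

Because every record shares this shape, tools like Kibana can aggregate over fields such as `time` or `response.status` across all monitored stacks.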
Kibana dashboards are interactive and show the state of a system. An example is shown below:
The example shows the queries over time for all monitored systems. The pie chart on the right shows the stacks in the center, the containers of each stack in the ring around it, and the types of calls in the outermost ring. Selecting any of these updates all other charts to take that constraint into account. For instance, selecting ‘GET’ updates all charts to show only GET requests. Selecting a different time range updates all charts to that range. Selecting a stack shows only the requests for that stack. Constraints can also be entered manually or be shared across dashboards.
The chart below shows the slow requests. For example, all requests which took between 50ms and 250ms to complete (including data transfer) are shown in green. The higher up the chart, the larger the total time these requests consumed. The blue dot at the top represents eight slow requests which took between two and ten seconds to complete. By selecting these requests, we can find the exact calls that caused this slow response. Great for debugging!
Lower still are the unique IPs per hour of the day, the requests per hour of the day, and the unique IPs over time. The bottom-left chart shows the evolution of the response time over time (shown in percentiles). The bottom-right chart shows how the response time evolves with the response size. This last one can come in handy to quickly detect slow users downloading large files.
Much more can be derived from this information. The most interesting sessions involve searching for slow requests in the system. A good approach is zooming in using the dashboard, and then drilling down to the individual requests. Which dashboards you need for that may depend on the specific problem you’re facing.
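Kibana’s query bar also lets you narrow a dashboard with Lucene-style filters. The exact field names depend on what the logger stores (the ones below assume HAR-style fields such as `time` in milliseconds and `response.status`), so treat these as sketches:

```
time:>2000
response.status:[500 TO 599]
request.method:GET AND time:>250
```

The first would isolate the slow requests discussed above; the second surfaces server errors; the third combines constraints.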
Each of these charts, as well as the dashboard itself, were created through the Kibana interface. You can create similar dashboards and charts yourself, depending on your monitoring needs.
How to run
The HTTP logger runs like most other stacks: clone it, run it, and see the results. As this component gathers input from other stacks, we also need to indicate which components to monitor.
Starting the HTTP logger
The logger can be found at https://github.com/big-data-europe/app-http-logger. Clone this repository, and launch the stack.
```shell
git clone https://github.com/big-data-europe/app-http-logger.git
cd app-http-logger
docker-compose up
```
After downloading, the stack will be up and running. The interface is available on port 5601; open your web browser at http://localhost:5601 to see it. Kibana will ask you to configure an index pattern, which we can do after some data has been generated.
Generate data by adding the ‘logging’ label to components that have HTTP traffic. A common example of this is webCAT (http://github.com/big-data-europe/webcat).
In the case of webCAT, the frontend’s service description would become:
```yaml
webcat:
  image: tenforce/webcat:0.2.3
  links:
    - identifier:backend
  ports:
    - "80:80"
  labels:
    - "logging=true"
```
After making these changes, start (or restart) webCAT so the HTTP logger can pick up the changes. Then click around in the application in order to generate some data.
Play around with the data
Go back to Kibana and set the index to ‘har*’. This will make Kibana pick up the right content. You can inspect the content, generate charts, and create dashboards. The best way to get a feel for the application and its possibilities is simply to click around in Kibana.
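If Elasticsearch itself is reachable (for instance when its default port 9200 is exposed, which depends on your setup), you can also query the ‘har*’ indices directly. A sketch of a query body that would list the ten slowest GET requests, assuming HAR-style field names such as `request.method` and `time`:

```json
{
  "query": {
    "bool": {
      "must": [
        { "term":  { "request.method": "GET" } },
        { "range": { "time": { "gte": 2000 } } }
      ]
    }
  },
  "sort": [ { "time": "desc" } ],
  "size": 10
}
```

Such a body would be sent to the `_search` endpoint of the ‘har*’ indices; adjust field names to match what you see in Kibana’s document view.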
The road ahead
Several extensions to our current approach would add value, and we will also provide more information on the current implementation.
This approach to logging can be extended to understand more types of network traffic, making more than just HTTP traffic visible. Another valuable extension is to lift the current HAR files into a semantic model, which would allow the traffic to be analyzed with the SANSA (http://sansa-stack.net) stack.
We will share more information on our implementation of this logging component. In the meantime, you can use it, or check the sources at https://github.com/big-data-europe/app-http-logger.