We’re excited to announce the release of Coroot v1.4!
Along with various UI improvements, this update brings a new feature: network traffic monitoring. Now, you can easily see how much data is being transferred between your applications and, more importantly, how much it costs.
In this post, we’ll walk through the enhancements and new features included in this release.
Most of Coroot’s features start with gathering telemetry data, and this new feature is no exception. Before this release, Coroot already had extensive knowledge about application communications:
However, it lacked information on the amount of data transferred between applications. There were two main reasons for this. Firstly, bandwidth-related issues typically only arise in extremely high-load systems, so we initially focused on detecting more common network failure scenarios. Secondly, collecting such data at the eBPF level is quite challenging.
In the end, we found a reliable way to count both inbound and outbound traffic for each TCP connection using eBPF without adding significant overhead. However, our approach has several limitations worth keeping in mind:
Now, using the gathered metrics, Coroot can show traffic between any services on the Service Map:
The network metrics on the Service Map are aggregated by application pairs. However, when you look at a specific application, you can see detailed instance-to-instance metrics:
To track application-to-application traffic over time, we added a chart to the Network inspection report:
Anyone who runs applications in the cloud knows that computing (VM) costs are just the tip of the iceberg. For instance, if you operate a highly available application across multiple availability zones and replicate data between them, you’ll incur charges for data transfers even within the same region. In fact, data transfer costs between applications and to the internet can easily make up more than 30% of your cloud bill.
At Coroot, we believe that FinOps is an essential part of observability. To truly understand your cloud costs, you need to analyze application-level metrics. However, cloud providers typically only offer IP-to-IP breakdowns for data transfer costs. Coroot overcomes this by building a comprehensive model of your system, translating low-level statistics into application-to-application metrics, making them more understandable for engineers.
Let me explain how Coroot uses models of distributed systems with an example:
Having all that data allows Coroot to show you the costs of communication between any specific services.
The most exciting thing about this approach is that it’s not just for Kubernetes! If your app on an EC2 instance communicates with an RDS cluster in a different AZ, you’ll see the associated costs as well.
As an engineer, I believe that it’s impossible to optimize anything without measuring it. Now, with Coroot, you can start optimizing your data transfer costs with all the necessary data.
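To make the idea of translating IP-to-IP statistics into application-level costs more concrete, here is a minimal Python sketch. It is not Coroot's implementation: the function name, the `ip_to_app` and `ip_to_zone` lookup tables, and the price per GB are all hypothetical, and the model bills only traffic that crosses an availability zone boundary. Real pricing varies by provider, region, and traffic direction.

```python
from collections import defaultdict

# Hypothetical price for illustration only; check your provider's actual rates.
CROSS_AZ_PRICE_PER_GB = 0.01


def app_traffic_costs(flows, ip_to_app, ip_to_zone, price_per_gb=CROSS_AZ_PRICE_PER_GB):
    """Aggregate IP-to-IP byte counts into app-to-app cross-AZ costs.

    flows: iterable of (src_ip, dst_ip, bytes_transferred) tuples.
    ip_to_app / ip_to_zone: dicts standing in for a model of the system.
    """
    costs = defaultdict(float)
    for src_ip, dst_ip, nbytes in flows:
        src_zone = ip_to_zone.get(src_ip)
        dst_zone = ip_to_zone.get(dst_ip)
        # In this simplified model, same-zone or unknown-zone traffic is free.
        if src_zone is None or dst_zone is None or src_zone == dst_zone:
            continue
        src_app = ip_to_app.get(src_ip, "unknown")
        dst_app = ip_to_app.get(dst_ip, "unknown")
        costs[(src_app, dst_app)] += nbytes / 1e9 * price_per_gb
    return dict(costs)


if __name__ == "__main__":
    flows = [
        ("10.0.0.1", "10.0.1.1", 5_000_000_000),  # api -> db, crosses zones
        ("10.0.0.1", "10.0.0.2", 1_000_000_000),  # api -> cache, same zone
    ]
    ip_to_app = {"10.0.0.1": "api", "10.0.0.2": "cache", "10.0.1.1": "db"}
    ip_to_zone = {"10.0.0.1": "zone-a", "10.0.0.2": "zone-a", "10.0.1.1": "zone-b"}
    print(app_traffic_costs(flows, ip_to_app, ip_to_zone))
```

The key point is the join: byte counters alone say nothing about cost until each endpoint is mapped to an application and a zone.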
Coroot groups individual containers into applications using the following approach:
This default approach works well in most cases. However, since no one knows your system better than you do, Coroot allows you to manually adjust application groupings to better fit your specific needs. You can regroup any non-Kubernetes applications:
A custom application in Coroot is a name and a set of patterns for application instances.
That’s it. Now, these two instances are part of our custom application, custom-ssh. It appears as a dedicated app on the Service Map, Application Health Summary, and other related views.
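Conceptually, a custom application is a mapping from a name to a set of instance patterns. The sketch below illustrates that idea in Python, assuming glob-style patterns. The instance names, the patterns, and the `resolve_application` helper are hypothetical and are not taken from Coroot's code or configuration format.

```python
from fnmatch import fnmatchcase


def resolve_application(instance, custom_apps, default_app):
    """Return the custom application whose pattern matches the instance.

    custom_apps: dict mapping an application name to a list of glob patterns.
    default_app: the application chosen by the default grouping rules.
    """
    for app_name, patterns in custom_apps.items():
        if any(fnmatchcase(instance, pattern) for pattern in patterns):
            return app_name
    # No custom pattern matched, so keep the default grouping.
    return default_app


if __name__ == "__main__":
    # Hypothetical patterns: group every sshd instance into one application.
    custom_apps = {"custom-ssh": ["sshd-*"]}
    print(resolve_application("sshd-node1", custom_apps, "sshd-node1"))
    print(resolve_application("nginx-node1", custom_apps, "nginx"))
```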
Coroot now uses the I/O load metric (total I/O latency) to identify storage performance issues. Previously, we relied on I/O utilization, which measures the share of time during which the disk is handling at least one request. This metric was accurate for spinning HDDs but is less informative for modern SSDs, which can handle multiple requests simultaneously.
To address this, Coroot sets a default threshold of 5 seconds/second for I/O load. In other words, if a disk handles more than 5 I/O requests in parallel on average, it is flagged as having high I/O load. This threshold suits most typical SSDs on the market. However, if you use higher-performance storage, you can easily adjust this threshold for specific applications, databases, or entire projects.
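The unit "seconds/second" is easier to read with a small worked example. The Python sketch below assumes a cumulative counter of total I/O latency in seconds, sampled at two points in time. The function names and sample values are hypothetical; only the default threshold of 5 comes from this post.

```python
# Default threshold named in this post: seconds of I/O latency per wall-clock second.
DEFAULT_IO_LOAD_THRESHOLD = 5.0


def io_load(total_latency_start_s, total_latency_end_s, interval_s):
    """Average I/O load over an interval.

    Dividing accumulated I/O latency by elapsed wall-clock time gives the
    average number of requests that were in flight during the interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (total_latency_end_s - total_latency_start_s) / interval_s


def is_high_io_load(total_latency_start_s, total_latency_end_s, interval_s,
                    threshold=DEFAULT_IO_LOAD_THRESHOLD):
    """Flag the disk when its average I/O load exceeds the threshold."""
    return io_load(total_latency_start_s, total_latency_end_s, interval_s) > threshold


if __name__ == "__main__":
    # 240 seconds of accumulated latency within a 30-second window.
    print(io_load(100.0, 340.0, 30.0))
    print(is_high_io_load(100.0, 340.0, 30.0))
```

A disk in this example could show 100% utilization whether it serves one request at a time or eight, which is why the load value is the more useful signal for SSDs.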
Before v1.4, Coroot didn’t show storage volumes attached to a node on its page. In Kubernetes, most applications use dedicated Persistent Volumes (PVs), which are usually network-attached rather than physical node disks. From the first release, Coroot has shown these PVs in the context of the application, not the node.
As we added support for VMs and bare-metal servers, more users began using Coroot in non-Kubernetes or hybrid environments. So, we decided to add disk performance statistics to the node page, whether the disk is local or network-attached.
Even before v1.4, Coroot’s network monitoring capabilities were significantly better than other observability tools on the market. However, there were some missing statistics that can be extremely useful in certain cases.
One key metric is TCP connection latency. Before v1.4, you could easily detect if connections to a particular service failed to establish or if the network round-trip time between a service and its database was higher than usual. However, these two metrics could miss scenarios where a service or database is slow to accept connections. To address this, we added the TCP connection latency chart.
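To illustrate what TCP connection latency captures, here is a minimal Python sketch that times a connection attempt from the client side. This is only an illustration of the metric: Coroot gathers its data with eBPF rather than by opening probe connections, and the `tcp_connect_latency` helper and the demo listener are hypothetical.

```python
import socket
import time


def tcp_connect_latency(host, port, timeout=3.0):
    """Return the seconds taken to establish a TCP connection to host:port.

    Raises OSError (for example a timeout or a refused connection) if the
    connection cannot be established.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        return time.perf_counter() - start


if __name__ == "__main__":
    # Demo against a throwaway local listener on an OS-assigned port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    print(f"connect latency: {tcp_connect_latency('127.0.0.1', port):.6f}s")
    server.close()
```

A low round-trip time combined with a high connect latency points at the accepting side rather than the network.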
Re-establishing a TCP connection for each HTTP request or Postgres query is usually inefficient, as it adds latency and consumes compute resources. That’s why most applications use connection pools of long-lived TCP connections. With Coroot, you can now easily see how many active connections your apps use to communicate with each service or database, and how often they open new connections.
We know that in every environment, some applications are more important than others. Engineers and teams often want to focus on specific applications while giving less attention to others. For example, the platform team needs to monitor the control plane, while DBAs focus on the databases.
Coroot has always allowed users to define ‘Categories’ for applications, enabling them to group and manage these applications separately on the Service Map or Application Health Summary. Our anonymous usage statistics show this feature is popular. In v1.4, we’ve made customizing categories even easier. Now, you can set up a new category or add an application to an existing one directly from the Service Map.
With Coroot v1.4, we’ve taken significant steps to enhance your observability and cost management capabilities. By introducing network traffic monitoring, you can now better understand and optimize your data transfer costs. Our improvements in TCP connection latency monitoring and I/O performance analysis ensure that you have precise and actionable insights into your system’s performance.
We’ve also made it easier to manage and focus on the applications that matter most to you, whether you’re using Kubernetes, VMs, or a hybrid environment. The enhanced customization of application categories ensures that you can tailor Coroot to your specific needs.
Start exploring Coroot v1.4 today and take full control of your application’s performance and costs. We look forward to continuing to support your observability needs.