Presentation: Top Node.js Metrics to Watch

Fresh from Germany’s largest Node.js Meetup, hosted by Wikimedia in Berlin, is the latest presentation from Sematext DevOps Evangelist Stefan Thies — “Top Node.js Metrics To Watch”. The event was shared with the Node.js Meetup in London via video-live stream, the full recording is available on YouTube.

Stefan’s talk goes through challenges of developing Node.js monitoring solutions, such as SPM for Node.js  and key metrics including examples from monitoring Kibana’s Node.js Server

Here is the video:

and the slides:

Please note, the article “ Top Node.js Metrics to Watch” was originally published by O’Reilly Radar.

Have a look at our other Node.js posts — there is a lot more interesting material to discover, like  MongoDB monitoring made with Node.js.

Questions or Feedback?
If you have any questions or feedback for us, please contact us by email or hit us on Twitter.  We love talking about performance monitoring — and log management!

 

Elasticsearch Training in London

3 Elasticsearch Classes in London

 

es-training-240x187

Elasticsearch for Developers ……. April 4-5

Elasticsearch for Logging ……… April 6

Elasticsearch Operations …….  April 6

All classes cover Elasticsearch 2.x

Hands-on — lab exercises follow each class section

Early bird pricing until February 29

Add a second seat for 50% off

Register_Now_2

Course overviews are on our Elasticsearch Training page.

Want a training in your city or on-site?  Let us know!

Attendees in all three workshops will go through several sequences of short lectures followed by interactive, group, hands-on exercises. There will be Q&A sessions in each workshop after each such lecture-practicum block.

Got any questions or suggestions for the course? Just drop us a line or hit us @sematext!

Lastly, if you can’t make it…watch this space or follow @sematext — we’ll be adding more Elasticsearch training workshops in the US, Europe and possibly other locations in the coming months.  We are also known worldwide for Elasticsearch Consulting Services, and Elasticsearch Production Support.
We hope to see you in London in April!

MongoDB Monitoring Support

For many of us in the DevOps field, MongoDB is a critical part of our IT stack.  With today’s acquisition of WiredTiger, MongoDB is further establishing itself as the NoSQL DB built to support massive data processing and storage.  It would be an understatement to say that MongoDB does a lot, with many organizations using it as their backend storage framework, analytics backend, and so on.

So your MongoDB cluster really, really needs to be in tip-top shape.  All the time.  And if it’s not then you need to know asap — or better yet — prevent problems before they kick in and make your life difficult.  That’s where SPM comes in — with MongoDB monitoring, alerting and anomaly detection.  MongoDB exposes a boatload of metrics, but instead of just throwing all of them on endless charts, we’ve taken the time to cherry pick what we think are the top 50 most valuable MongoDB metrics to monitor. We have furthermore made it possible to filter the MongoDB metrics by server, as well as a database and table where possible.

The key metric groups we track are:

  • Database Operations
  • Database Memory
  • Database Storage
  • Documents
  • Locks
  • Network
  • Database Journal
  • Background Flushes

The Overview chart below provides 9 charts with MongoDB key metrics:

  • Row 1 displays CPU, Memory and Disk Metrics
  • Row 2 displays Database Operations, Database Memory and Database Storage Metrics.
  • Row 3 adds Collection/Document Metrics, Locks, and wait times; followed by Network Metrics for MongoDB

MongoDB_Overview

SPM for MongoDB Overview

In case you monitor a MongoDB cluster, the Server Tab provides a quick overview for the Health of each node:

MongoDB_Server_view

SPM Server View

The Reports on the left side of the screen below provide detailed information for each group of metrics. Let’s have a quick look at them.

MongoDB_CPU_details

OS Metrics: CPU Metrics, Memory Usage, Disk Space and I/O

Below is an example of some of the key MongoDB Metrics found in SPM:

  • Database Operations: Counters for Queries, Insert, Update, Delete and other commands for the main database plus replica operations
  • Database Memory: Resident-, Virtual-, Mapped-, and Journal Memory
  • Database Storage: Size of Data Files, Namespace Files, DB Files etc., plus Size of Objects, Number of Collections and Objects

MongoDB_Storage

MongoDB Storage & Collections

The screenshot below shows:

  • Documents: Counters for Documents inserted, updated or returned by queries
  • Locks: Lock counters and lock acquisition wait times for Global, Database, Collection and Journal level. Since MongoDB 3.x Locks are not always global. SPM shows a breakdown for all lock types. These metrics are good candidates for alerting, when anomalies are detected.  Simply add an alert from the menu in the top-left corner in each chart.

MongoDB_Locks

Metrics for all MongoDB Locks

Other key MongoDB metrics that SPM displays are:

  • Network: Number of client connections, Received and transmitted data, Request rate
  • Database Journal: Commits, Early Commits,  Commit times and lock times

MongoDB_Journal_Commits

MongoDB Journal Metrics

In case you like to see MongoDB metrics together with the Top Node.js Metrics, you might like the idea of putting MongoDB and Node.js metrics from SPM for Node.js in a custom dashboard:

MongoDB_Locks-Node.js_Loop

SPM Custom Dashboard with MongoDB Locks and Node.js Event Loop Latency

We hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email or @sematext.

Not using SPM yet? Check out the free 30-day trial by registering here.  There’s no commitment and no credit card required.  Even better — combine SPM with Logsene to make the integration of performance metrics, logs, events and anomalies more robust for those looking for a single pane of glass.

Introducing Top Database Operations

If you run Elasticsearch, Solr, or any backend you communicate with using SQL (via JDBC), like SparkSQL, Apache Cassandra (CQL), Apache Impala, Apache Drill, MySQL, PostgreSQL, etc., you’ll like what we’ve just added to SPM.  We call it Database Operations and in SPM you can find it in the new Database report:

If you didn’t watch the video, here’s what Database Operations gives you:

  • Top 5 operation types across all your data stores or filtered to a specific data store type
  • Top 5 operation types by speed, throughput, or simply their volume
  • Time-series reports for volume, throughput, and latency broken down by operation type
  • Ability to view all collected operations, not just the slowest ones, filter by database type or by operation type, sorted by average or total duration, or throughput
  • Sparklines that show last 5 minute values and trends
  • Top 10 slowest individual operations and drill-in details

Integration with Transaction Tracing, so you can correlate slow data store operations with the actual transaction/request that triggered slow operations

Important:

  • To get this information add SPM agent to the application that is talking to a data store (e.g. Solr or Elasticsearch or MySQL or …). This is because the SPM agent captures operations at that client layer, not in the server itself.
  • To start capturing this information enable Transaction Tracing in your SPM agents

This, including Distributed Transaction Tracing, works for all Java applications

Database_ops_1

——-

Database_ops_graphic

Don’t forget – when you enable Database Operations you will also automatically get Transaction Tracing, as well as the cool AppMaps – enjoy! 🙂

Got ideas how we could make Database Operations better and more useful to you?  Let us know via comments, email or @sematext.

Grab a free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or educational institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.

Kafka Real-time Stream Multi-topic Catch up Trick

Half of the world, Sematext included, seems to be using Kafka.

Kafka is the spinal cord that connects various components in SPM, Site Search Analytics, and Logsene.  If Kafka breaks, we’re in trouble (but we have anomaly detection all over the place to catch issues early).  In many Kafka deployments, ours included, the most recent data is the most valuable.  Consider the case of Kafka in SPM, which processes massive amounts of performance metrics for monitoring applications and servers.  Clearly, in a performance monitoring system you primarily care about current performance numbers.  Thus, if SPM’s Kafka pipeline were to break and we restore it, what we’d really like to avoid is processing all data sequentially, oldest to newest.  What we’d prefer is processing new metrics data first and then processing older data using any spare capacity we have in order to “fill the gap” caused by Kafka downtime.

Here’s a very quick “video” that show this in action:

Kafka Catch Up
Kafka Catch Up

 

How does this work?

We asked about it back in 2013, but didn’t really get good tips.  Shortly after that we implemented the following logic that’s been working well for us, as you can see in the animation above.

The catch up logic assumes having multiple topics to consume from and one of these topics being the “active” topic to which producer is publishing messages. Consumer sets which topic is active, although Producer can also set it if it has not already been set. The active topic is set in ZooKeeper.

Consumer looks at the lag by looking at the timestamp that Producer adds to each message published to Kafka. If the lag is over N minutes then Consumer starts paying attention to the offset.  If the offset starts getting smaller and keeps getting smaller M times in a row, then Consumer knows we are able to keep up (i.e. the offset is not getting bigger) and sets another topic as active. This signals to Producer to switch publishing to this new topic, while Consumer keeps consuming from all topics.

As the result, Consumer is able to consume both new data and the delayed/old data and avoid not having fresh data while we are in catch-up mode busy processing the backlog.  Consuming from one topic is what causes new data to be processed (this corresponds to the right-most part of the chart above “moving forward”), and consuming from the other topic is where we get data for filling in the gap.

If you run Kafka and want a good monitoring tool for Kafka, check out SPM for Kafka monitoring.

 

Docker + Elasticsearch: How to Monitor the Official Elasticsearch Image on Docker

Update: Elasticsearch 2.x requires a setup with standalone monitors. Why? A very restricitve Java security policy got implemented in Elasticsearch 2.x. This security policy  forbids loading of 3rd party libraries – including in-process monitoring libraries from SPM. Here is an example with Docker Compose and SPM-Client standalone monitor, working with Elasticsearch 2.x.

The official Elasticsearch Image on Docker Hub has already generated more than 1.6 million pulls. It is probably the easiest way to get a development setup — which includes Elasticsearch — to the application stack. The reason for this crazy number? A rapidly growing number of organizations are using Elasticsearch and Docker in production. Needless to say, monitoring Elasticsearch is essential in production, and you can find a detailed analysis of this topic (including the “top 10 Elasticsearch metrics to watch”) in the free eBook: ElasticsearchMonitoring Essentials. Docker is disruptive in many ways, and there are many things that are slightly different and worth mentioning.  These include:

  1. Changed deployment for Elasticsearch and its monitoring tools using Dockerfile, Docker Compose or various Orchestration Tools
  2. There is a new Layer to monitor: Container Metrics and Events, see: Docker Events and Metrics monitoring and SPM for Docker
  3. Logging has changed: containers log to the console and logs needs to be retrieved from Docker-Daemon instead getting them from the Elasticsearch log file.  Check out our post on the subject: Innovative Docker Log Management
  4. Official Images may not provide options for monitoring (such as JMX).  However, the official Image for Elasticsearch provides an option to pass parameters to the Java Runtime Environment.  We we will use this option for Elasticsearch monitoring in this post. You should also be aware that the official Elasticsearch Image does not include any plugins, and commercial monitoring from Elastic can’t be distributed in this Image for licensing reasons.  Our monitoring tool of choice is SPM.  If you are not familiar with SPM — but have heard of it — or if you use Marvel, have a look at Marvel vs. SPM.

Next, I’m going to demonstrate a setup to monitor multiple Elasticsearch nodes on a single Docker Host. The final setup will provide the full Monitoring and Logging package:

  • Detailed Application Metrics for Elasticsearch, deployed on Docker
  • Detailed Container Metrics and Docker Events  
  • Centralized Logs for all Containers by SPM for Docker

So let’s first decide on one of the following options to monitor Elasticsearch on Docker.  You can:

  1. Build your own Elasticsearch container with the included monitoring components. I’m not going to go into details about this option today; rather, I’m going to focus on the official / trusted build.
  2. Use a standalone agent, which queries metrics from the Elasticsearch container. This requires a setup for JMX and Docker networking configurations for the monitor and Elasticsearch. The metrics, gathered by remote agents, are limited and, in the Docker context, running an external monitoring process plus Elasticsearch processes consumes more resources.  And the next option …
  3. Inject an SPM in-process monitoring agent into Elasticsearch. This option has the lowest resource usage and has support for advanced monitoring functions like Transaction Tracing and AppMap.

I chose to implement Option #3 in this blog post because it provides the best insights into Elasticsearch. This means the Elasticsearch container needs file-system access to the SPM monitoring agent. Sematext provides the SPM Client (which includes the monitoring agent and metrics sender) pre-installed in a Docker Image, referred as “SPM Client Image/Container” in the following instructions and published on Docker Hub as “sematext/spm-client”.  The main trick here is to mount a volume from SPM-Client Container into Elasticsearch Containers in order to load the monitoring library.

Let’s have a look at the desired setup and how to get there:

Elasticsearch-Monitoring-On-Docker

Monitoring Setup for Elasticsearch on Docker

Continue reading “Docker + Elasticsearch: How to Monitor the Official Elasticsearch Image on Docker”

Docker Monitoring Webinar – Video and Slides

The recent Docker Monitoring webinar is ready for consuming!  Our DevOps Evangelist, Stefan Thies, took attendees on a tour of Docker monitoring basics, including a number of different Docker monitoring options and their pros and cons, solutions for Docker monitoring, and a brief Q&A session.

If you use Docker you know that these deployments can be very dynamic, not to mention all the ways there are to monitor Docker containers, collect logs from them, etc. etc.  And if you didn’t know these things, well, you’ve come to the right place!

Here is the video recording:

And here are the slides:

Docker Monitoring Tools and Resources

Once you’ve checked out some of the Docker content sign up for a free 30-day trial (no credit card required) of SPM for Docker or request a demo to see how easy it is to get up and running.

There’s a good chance you will also like Logsene, our centralized logging solution that, like SPM, offers alerting and anomaly detection on top of all the other benefits.  We’re even offering a 20% discount on SPM and Logsene to webinar viewers.  Just use these codes when creating new SPM or Logsene apps:

  • SPM: 201509WNR20S
  • Logsene: 201509WNR20L

Docker Logging Webinar & Slides

Speaking of logs…for those of you with an interest in Docker Logging, we held a webinar on that subject as well.  Click here to access the webinar video recording and slides.

Questions or Feedback?

If any questions have come up since the webinar, or if you have some feedback for us, please contact us by email or hit us on Twitter.

Docker Monitoring Webinar on October 6

[ Note: Click here for the Docker Monitoring webinar video recording and slides. And click here for the Docker Logging webinar video recording and slides. ]

——-

Good news for Docker fans: we’re running a third Docker Monitoring webinar on Tuesday, October 6 at 2:00 pm Eastern Time / 11:00 am Pacific Time.

If you use Docker you know that these deployments can be very dynamic, not to mention all the ways there are to monitor Docker containers, collect logs from them, etc. etc.  And if you didn’t know these things, well, you’ve come to the right place!

Sematext has been on the forefront of Docker monitoring, along with Docker event collection, charting, and correlation.  The same goes for CoreOS monitoring and CoreOS centralized log management.  So it’s only natural that we’d like to share our experiences and how-to knowledge with the growing Docker and container community.  During the webinar we’ll go through a number of different Docker monitoring options, point out their pros and cons, and offer solutions for Docker monitoring.

The webinar will be presented by Stefan Thies, our DevOps Evangelist, deeply involved in Sematext’s activities around monitoring and logging in Docker and CoreOS.  A post-webinar Q&A will take place — in addition to the encouraged attendee interaction during the webinar.

Date/Time

Tuesday, October 6 @ 2:00 pm to 3:00 pm Eastern Time / 11:00 am to 12:00 pm Pacific Time.

Register_Now_2

“Show, Don’t Tell”

The infographic below will give you a good idea of what Stefan will be showing and discussing in the webinar.

Docker_webinar_infographic

Got Questions, or topics you’d like Stefan to address?

Leave a comment, ping @sematext or send us an email — we’ll all ears.

Whether you’re using Docker or not, we hope you join us for the webinar.  Docker is hot — let us help you take advantage of it!

Introducing Akka Monitoring

Akka is a toolkit and runtime for building highly concurrent, distributed and resilient message-driven applications on the JVM. It’s a part of Scala’s standard distribution for the implementation of the “actor model”.

How Akka Works

Messages between Actors are exchanged in Mailbox queues and Dispatcher provides various concurrency models, while Routers manage the message flow between Actors. That’s quite a lot Akka is doing for developers!

But how does one find bottlenecks in distributed Akka applications? Well, many Akka users already use the great Kamon Open-Source Monitoring Tool, which makes it easy to collect Akka metrics.  However — and this is important! — predefined visualizations, dashboards, anomaly detection, alerts and role-based access controls for the DevOps team are out of scope for Kamon, which is focused on metrics collection only.  To overcome this challenge, Kamon’s design makes it possible to integrate Kamon with other monitoring tools.

Needless to say, Sematext has embraced this philosophy and contributed the Kamon backend to SPM.  This gives Akka users the option to use detailed Metrics from Kamon along with the visualization, alerting, anomaly detection, and team collaboration functionalities offered by SPM.

The latest release of Kamon 0.5.x includes kamon-spm module and was announced on August 17th, 2015 on the Kamon blog.  Here’s an excerpt:

Pavel Zalunin from Sematext contributed the new kamon-spm module, which as you might guess allows you to push metrics data to the Sematext Performance Monitor platform. This contribution is particularly special to us, given the fact that this is the first time that a commercial entity in the performance monitoring sector takes the first step to integrate with Kamon, and they did it so cleanly that we didn’t even have to ask any changes to the PR, it was just perfect. We sincerely hope that more companies follow the steps of Sematext in this matter.

Now let’s take a look at the result of this integration work:

  • Metrics pushed to SPM are displayed in predefined reports, including:
    • An overview of all key Akka metrics
    • Metrics for Actors, Dispatchers and Routers
    • Common metrics for CPU, Memory, Network, I/O,  JVM and Garbage Collection
  • Each chart has the “Action” menu to:
    • Define criteria for anomaly detection and alerts
    • Create scheduled email reports
    • Securely share charts with read-only links
    • Embed charts into custom dashboards
  • A single SPM App can take metrics from multiple hosts to monitor a whole cluster; filters by Host, Actor, Dispatcher, and Router make it easy to drill down to the relevant piece of information.
  • All other SPM features, are available for Akka users, too.  For example:

Akka_overview

Akka Metrics Overview

Actor Metrics

Actors send and receive messages, therefore the key metrics for Actors are:

  • Time in Mailbox
    Messages are waiting to be processed in the Mailbox – high Time in Mailbox values indicate potential delays in processing.
  • Processing Time
    This is the time Actors need to process the received messages – use this to discover slow Actors
  • Mailbox Size
    Large Mailbox Size could indicate pending operations, e.g. when it is constantly growing.

Each of the above metrics is presented in aggregate for all Actors, but one can also use SPM filtering feature to view all Actors’ metrics separately or select one or more specific Actors and visualize only their metrics.  Filtering by Host is also possible, as show below.

Akka_actors

Akka Actors

Dispatcher Metrics

In Akka a Dispatcher is what makes Actors ‘tick’. Each Actor is associated with a particular Dispatcher (default one is used if no explicit Dispatcher is set). Each Dispatcher is associated with a particular Executor – Thread Pool or Fork Join Pool. The SPM Dispatcher report shows information about Executors:

  • Fork Join Pool
  • Thread Pool Executor

All metrics can be filtered by Host and Dispatcher.

Akka_dispatchers

Akka Dispatchers

Router Metrics

Routers can be used to efficiently route messages to destination Actors, called Routees.

  • Routing Time – Time to route message to selected destination
  • Time In Mailbox – Time spent in routees mailbox.
  • Processing Time – Time spent by routee actor to process routed messages
  • Errors Count – Errors count during processing messages by routee

For all these metrics, lower values are better, of course.

Akka_routers

Akka Routers

You can set Alerts and enable Anomaly Detection for any Akka or OS metrics you see in SPM and you can create custom Dashboards with any combination of charts, whether from your Akka apps or other apps monitored by SPM.

We hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or education institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Cassandra, Hadoop, Spark, Node.js (open-source), Docker (get open-source Docker image), CoreOS, RancherOS and more.

Tomcat Monitoring SPM Integration

This old cat, Apache Tomcat, has been around for ages, but it’s still very much alive!  It’s at version 8, with version 7.x still being maintained, while the new development is happening on version 9.0.  We’ve just added support for Tomcat monitoring to the growing list of SPM integrations the other day, so if you run Tomcat and want to take a peek at all its juicy metrics, give SPM for Tomcat a go!  Note that SPM supports both Tomcat 7.x and 8.x.

Before you jump to the screenshot below, read this: you may reeeeeally want to enable Transaction Tracing in the SPM agent running on your Tomcat boxes.  Why?  Because that will help you find bottlenecks in your web application by tracing transactions (think HTTP requests + method call + network calls + DB calls + ….).  It will also build a whole map of all your applications talking to each other with information about latency, request rate, error and exception rate between all component!  Check this out (and just click it to enlarge):

AppMap

Everyone loves maps.  It’s human. How about charts? Some of us have a thing for charts. Here are some charts with various Tomcat metrics, courtesy of SPM:

Overview  (click to enlarge)

Tomcat_overview_2

Session Counters  (click to enlarge)

Tomcat_Session_Counters

Cache Usage  (click to enlarge)

Tomcat_Sessions_4

Threads (Threads/Connections)  (click to enlarge)

Tomcat_threads_4

Requests  (click to enlarge)

Tomcat_Requests

Hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or education institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Hadoop, Spark, Node.js & io.js (open-source), Docker (get open-source Docker image), CoreOS, and more.