Elasticsearch “Big Picture” – A Creative Flow Chart and Poster

There are many ways to look at Elasticsearch, but here at Sematext we’re pretty confident that you haven’t seen anything like this flowchart to demonstrate how it works:

ES_poster

Download a copy and print your own Elasticsearch poster!

If you’re looking for something unique to show off your Elasticsearch chops, then download a copy today and print your own.  We have files for US letter, A4, Ledger (11”x17”) and Poster (24”x36”) sizes.


 

Sematext is your “one-stop shop” for all things Elasticsearch: Expert Consulting, Production Support, Elasticsearch Training, Elasticsearch Monitoring, even Hosted ELK!

Doing Centralized Logging with ELK?  We Can Help There, Too

If your log analysis and management leave something to be desired, then we’ve got you covered there as well.  There’s our centralized logging solution, Logsene, which you can think of as your “Managed ELK Stack in the Cloud.”  It’s also available as an On Premises deployment.  Lastly, we offer Logging Consulting should you require more in-depth support.

Questions or Feedback?

If you have any questions or feedback for us, please contact us by email or hit us on Twitter.

Presentation: Large Scale Log Analytics with Solr

In this presentation from Lucene/Solr Revolution 2015, Sematext engineers — and Solr and centralized logging experts — Radu Gheorghe and Rafal Kuć talk about searching and analyzing time-based data at scale.

Documents ranging from blog posts and social media to application logs and metrics generated by smartwatches and other “smart” things share a similar pattern: timestamps among their fields, few or no updates after indexing, and deletion once they become obsolete. Because this kind of data is so voluminous, it often causes scaling and performance challenges.

In this talk, Radu and Rafal focus on these challenges, including: properly designing collections architecture, indexing data fast and without documents waiting in queues for processing, being able to run queries that include time-based sorting and faceting on enormous amounts of indexed data (without killing Solr!), and many more.

Here is the video:

…and here are the slides:

 

Here’s a Taste of What You’ll See

How do Logstash, rsyslog, Redis, and fast-food-hating zombies (?!) relate? You’ll have to check out the presentation to find out…

LR_zombie_slide

Solr “One-stop Shop”

Sematext is your “one-stop shop” for all things Solr: Expert Consulting, Production Support, Solr Training, and Solr Monitoring with SPM.

Log Analytics – We Can Help

If your log analysis and management leave something to be desired, then we’ve got you covered there as well.  There’s our centralized logging solution, Logsene.  And we also offer Logging Consulting should you require more in-depth support.

Questions or Feedback?

If you have any questions or feedback for us, please contact us by email or hit us on Twitter.  We love talking Solr — and logs!

 

Presentation: Log Analysis with Elasticsearch

Fresh from the Velocity NYC conference is the latest presentation from Sematext engineers Rafal Kuć and Radu Gheorghe: “From zero to production hero: Log Analysis with Elasticsearch.”

The talk goes through the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. They cover:

  • Time-based indices and index templates to efficiently slice your data
  • Different node tiers to de-couple reading from writing, heavy traffic from low traffic
  • Tuning various Elasticsearch and OS settings to maximize throughput and search performance
  • Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
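To make the first bullet concrete, here is a minimal sketch (ours, in Python, not from the talk itself) of time-based index naming and an index template body. The `logs-*` pattern, shard count, and refresh interval are illustrative values, not the presenters’ recommendations:

```python
from datetime import datetime

def daily_index_name(prefix, day):
    """Build a time-based index name, e.g. 'logs-2015.10.14'.
    Writing each day's logs to its own index makes retention cheap:
    dropping an old day is a whole-index delete, not a doc-by-doc purge."""
    return "{0}-{1:%Y.%m.%d}".format(prefix, day)

# An index template body: Elasticsearch applies these settings to any
# newly created index whose name matches the pattern, so tomorrow's
# index is configured before the first log line arrives.
logs_template = {
    "template": "logs-*",                # name pattern the template matches
    "settings": {
        "number_of_shards": 2,           # illustrative; tune per data volume
        "index.refresh_interval": "5s",  # trade freshness for indexing throughput
    },
}

print(daily_index_name("logs", datetime(2015, 10, 14)))  # logs-2015.10.14
```

Together these two ideas give you the “slicing” from the first bullet: new data always lands in today’s index, and the template guarantees each new index is born with the right settings.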

Here is part 1 of the Video:

Here is part 2 of the Video:

Here are the slides:

 

And here are the Commands and Demo used in the presentation: https://github.com/sematext/velocity

Continue reading “Presentation: Log Analysis with Elasticsearch”

Docker Logging Webinar – Video and Slides

Docker Logging has been a very popular topic of late in our internal and external discussions.  So much so that we decided to hold webinars on the topic (and Docker Monitoring as well) and now we’re making them available to everyone.

The webinars were presented by Sematext’s DevOps Evangelist, Stefan Thies.  Stefan discussed Docker logging basics, including: the different log collection options Docker users have; the pros and cons of each option; specific and existing Docker logging solutions; log shipping to ELK Stack; and more.

Here is the video recording:

And here are the slides:

Docker Logging Resources:

Start Managing Docker Logs Now (Monitoring, too!)

Once you’ve checked out some of the Docker content, sign up for a free 30-day trial (no credit card required) of Logsene or request a demo to see how easy it is to get up and running.

There’s a good chance you will also like SPM, our performance monitoring solution, which, like Logsene, offers alerting and anomaly detection on top of all the other benefits.  We’re even offering a 20% discount on SPM and Logsene to webinar viewers.  Just use these codes when creating new SPM and Logsene apps:

  • SPM: 201509WNR20S
  • Logsene: 201509WNR20L

Docker Monitoring Webinar & Slides

Speaking of metrics…for those of you with an interest in Docker Monitoring, we held a webinar on that subject as well. Click here to access the webinar video recording and slides.

Questions or Feedback?

If any questions have come up since the webinar, or if you have some feedback for us, please contact us by email or hit us on Twitter.

Docker Monitoring Webinar – Video and Slides

The recent Docker Monitoring webinar is ready for consuming!  Our DevOps Evangelist, Stefan Thies, took attendees on a tour of Docker monitoring basics, including a number of different Docker monitoring options and their pros and cons, solutions for Docker monitoring, and a brief Q&A session.

If you use Docker you know that these deployments can be very dynamic, not to mention all the ways there are to monitor Docker containers, collect logs from them, etc. etc.  And if you didn’t know these things, well, you’ve come to the right place!

Here is the video recording:

And here are the slides:

Docker Monitoring Tools and Resources

Once you’ve checked out some of the Docker content, sign up for a free 30-day trial (no credit card required) of SPM for Docker or request a demo to see how easy it is to get up and running.

There’s a good chance you will also like Logsene, our centralized logging solution that, like SPM, offers alerting and anomaly detection on top of all the other benefits.  We’re even offering a 20% discount on SPM and Logsene to webinar viewers.  Just use these codes when creating new SPM or Logsene apps:

  • SPM: 201509WNR20S
  • Logsene: 201509WNR20L

Docker Logging Webinar & Slides

Speaking of logs…for those of you with an interest in Docker Logging, we held a webinar on that subject as well.  Click here to access the webinar video recording and slides.

Questions or Feedback?

If any questions have come up since the webinar, or if you have some feedback for us, please contact us by email or hit us on Twitter.

Poll Results: Log Shipping Formats

The results for the log shipping formats poll are in.  Thanks to everyone who took the time to vote!

The distribution pie chart is below, but we can summarize it for you here:

  • JSON won pretty handily with 31.7% of votes, which was not totally unexpected. If anything, we expected to see more people shipping logs in JSON.  One person pointed out GELF, but GELF is really just a specific JSON structure over Syslog/HTTP, so GELF falls in this JSON bucket, too.
  • Plain-text / line-oriented log shipping is still popular, clocking in with 25.6% of votes.  It would be interesting to see how that will change in the next year or two.  Any guesses?  For those who are using Logstash for shipping line-oriented logs, but have to deal with occasional multi-line log events, such as exception stack traces, we’ve blogged about how to ship multi-line logs with Logstash.
  • Syslog RFC5424 (the newer one, with structured data in it) barely edged out its older brother, RFC3164 (unstructured data).  Did this surprise anyone?  Maybe people don’t care for structured logs as much as one might think?  Well, structure is important, as we’ll show later today in our Docker Logging webinar, because without it you’re limited to mostly “supergrepping” your logs rather than getting insight from more analytical queries.  That said, the two syslog formats together add up to 25%!  Talk about ancient specs holding their ground against newcomers!
  • There are still some people out there who aren’t shipping logs! That’s a bit scary! 🙂 Fortunately, there are a lot of options available today, from expensive On Premises Splunk or a DIY ELK Stack, to the awesome Logsene, which is sort of like the ELK Stack on steroids.  Look at log shipping info to see just how easy it is to get your logs off of your local disks, so you can stop grepping them.  If you can’t live without the console, you can always use logsene-cli!
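The poll buckets above map directly onto how a receiving system has to treat each line. Here is a tiny illustrative sketch (ours, not part of the poll) of why the distinction matters: JSON lines keep their field structure end to end, while plain-text lines arrive as one opaque `message` field, which is exactly what limits you to “supergrepping”:

```python
import json

def parse_log_line(line):
    """Best-effort parse of one shipped log line: a JSON log keeps its
    field structure; anything else falls back to a flat 'message'."""
    try:
        doc = json.loads(line)
        if isinstance(doc, dict):
            return doc
    except ValueError:
        pass
    return {"message": line}

structured = parse_log_line('{"severity": "error", "msg": "disk full"}')
plain = parse_log_line("Oct  6 14:02:11 web1 app[123]: disk full")

print(sorted(structured))  # ['msg', 'severity'] – queryable fields
print(sorted(plain))       # ['message'] – grep is all you've got
```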

Log_shipper_poll_4

Similarly, if your organization falls in the “Don’t ship them” camp (and maybe even “None of the above” as well, depending on what you are or are not doing) then — if you haven’t done so already — you should give some thought to trying a centralized logging service, whether running within your organization or a logging SaaS like Logsene, or at least DIY ELK.

Docker Monitoring Webinar on October 6

[ Note: Click here for the Docker Monitoring webinar video recording and slides. And click here for the Docker Logging webinar video recording and slides. ]

——-

Good news for Docker fans: we’re running a third Docker Monitoring webinar on Tuesday, October 6 at 2:00 pm Eastern Time / 11:00 am Pacific Time.

If you use Docker you know that these deployments can be very dynamic, not to mention all the ways there are to monitor Docker containers, collect logs from them, etc. etc.  And if you didn’t know these things, well, you’ve come to the right place!

Sematext has been at the forefront of Docker monitoring, along with Docker event collection, charting, and correlation.  The same goes for CoreOS monitoring and CoreOS centralized log management.  So it’s only natural that we’d like to share our experience and how-to knowledge with the growing Docker and container community.  During the webinar we’ll go through a number of different Docker monitoring options, point out their pros and cons, and offer solutions for Docker monitoring.

The webinar will be presented by Stefan Thies, our DevOps Evangelist, deeply involved in Sematext’s activities around monitoring and logging in Docker and CoreOS.  A post-webinar Q&A will take place — in addition to the encouraged attendee interaction during the webinar.

Date/Time

Tuesday, October 6 @ 2:00 pm to 3:00 pm Eastern Time / 11:00 am to 12:00 pm Pacific Time.

Register_Now_2

“Show, Don’t Tell”

The infographic below will give you a good idea of what Stefan will be showing and discussing in the webinar.

Docker_webinar_infographic

Got questions or topics you’d like Stefan to address?

Leave a comment, ping @sematext or send us an email — we’re all ears.

Whether you’re using Docker or not, we hope you join us for the webinar.  Docker is hot — let us help you take advantage of it!

Top 10 Elasticsearch Mistakes


  1. Upgrading to the new major version right after its release without waiting for the inevitable .1 release
  2. Remembering that you said, “We don’t need backups, we have shard replicas” to your manager during an 8-hour cluster recovery session
  3. Not running dedicated masters and wondering why your whole cluster becomes unresponsive during high load
  4. Suggesting, in a room full of Elasticsearch fans, that Elasticsearch should use ZooKeeper like SolrCloud does and avoid split-brain
  5. Running a single master and wondering why it takes the whole cluster down with it
  6. Running a significant terms aggregation on an analyzed field and wondering where all the memory/heap went
  7. Not using G1 GC with large heaps because Robert Muir claims G1 and Lucene/Elasticsearch don’t get along (just kidding, Robert!)
  8. Giving Elasticsearch JVM 32 GB heap and thinking you’re so clever ‘cause you’re still using CompressedOops. Tip: you ain’t
  9. Restarting multiple nodes too fast without waiting for the cluster to go green between node restarts
  10. …and last but not least: not taking Sematext Elasticsearch guru @radu0gheorghe’s upcoming Elasticsearch / ELK Stack Training course in October in NYC!  [Note: since this workshop has already taken place, stay up to date with future workshops at our Elasticsearch training page]
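Mistake #8 deserves a number: HotSpot can only use compressed object pointers (CompressedOops) while the heap stays under roughly 32 GB, so a 32 GB `-Xmx` actually buys you *less* addressable object headroom than ~31 GB. A quick back-of-the-envelope check (our sketch; the exact cutoff varies by JVM version and flags, so verify on your own JVM):

```python
GB = 1024 ** 3

def compressed_oops_likely(heap_bytes):
    """CompressedOops lets the JVM address heap objects with 32-bit
    pointers, but only while the heap fits under ~32 GB; ~31 GB is the
    commonly cited safe ceiling (check -XX:+PrintFlagsFinal to be sure)."""
    return heap_bytes <= 31 * GB

for xmx in (30, 31, 32):
    verdict = "compressed oops" if compressed_oops_likely(xmx * GB) else "fat 64-bit pointers"
    print(xmx, "GB heap ->", verdict)
```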

 

 

SolrCloud Rebalance API

This is a post about the work done at BloomReach on smarter index & data management in SolrCloud.

Authors: Nitin Sharma – Search Platform Engineer & Suruchi Shah –  Engineering Intern

 

Nitin_intro

Introduction

In a multi-tenant search architecture, as the size of data grows, manual management of collections and ranking/search configurations becomes non-trivial and cumbersome. This blog post describes an innovative approach we implemented at BloomReach that provides effective index management and a dynamic config management system for massive multi-tenant search infrastructure in SolrCloud.

Problem

The inability to have granular control over index and config management for Solr collections introduces complexities in geographically spanned, massive multi-tenant architectures. Some common scenarios, involving adding and removing nodes, growing collections and their configs, make cluster management a significant challenge. Currently, Solr doesn’t offer a scaling framework to enable any of these operations. Although there are some basic Solr APIs to do trivial core manipulation, they don’t satisfy the scaling requirements at BloomReach.

Innovative Data Management in SolrCloud

To address the scaling and index management issues, we have designed and implemented the Rebalance API, as shown in Figure 1. This API allows robust index and config manipulation in SolrCloud, while guaranteeing zero downtime using various scaling and allocation strategies. It has  two dimensions:

Nitin_strategy

The seven scaling strategies are as follows:

  1. Auto Shard allows re-sharding an entire collection to any number of destination shards. The process includes re-distributing the index and configs consistently across the new shards, while avoiding any heavy re-indexing processes.  It also offers the following flavors:
    • Flip Alias Flag controls whether or not the alias name of a collection (if it already had an alias) should automatically switch to the new collection.
    • Size-based sharding allows the user to specify the desired size of the destination shards for the collection. As a result, the system defines the final number of shards depending on the total index size.
  2. Redistribute enables distribution of cores/replicas across unused nodes. Oftentimes, the cores are concentrated within a few nodes. Redistribute allows load sharing by balancing the replicas across all nodes.
  3. Replace allows migrating all the cores from a source node to a destination node. It is useful in cases requiring replacement of an entire node.
  4. Scale Up adds new replicas for a shard. The default allocation strategy for scaling up is unused nodes. Scale Up can also replicate additional custom per-merchant configs alongside the index replication (as an extension to the existing replication handler, which only syncs the index files).
  5. Scale Down removes the given number of replicas from a shard.
  6. Remove Dead Nodes is an extension of Scale Down, which allows removal of the replicas/shards from dead nodes for a given collection. In the process, the logic unregisters the replicas from ZooKeeper. This in turn saves a lot of back-and-forth communication between Solr and ZooKeeper in their constant attempt to find the replicas on dead nodes.
  7. Discovery-based Redistribution allows distribution of all collections as new nodes are introduced into a cluster. Currently, when a node is added to a cluster, no operations take place by default. With redistribution, we introduce the ability to rearrange the existing collections across all the nodes evenly.
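The size-based flavor of Auto Shard (strategy 1 above) comes down to simple arithmetic. This hypothetical sketch is ours, not BloomReach’s code; it only illustrates how a desired per-shard size determines the destination shard count:

```python
import math

def shards_for_size(total_index_bytes, desired_shard_bytes):
    """Size-based sharding: derive the destination shard count from the
    collection's total index size and the user's desired shard size."""
    if desired_shard_bytes <= 0:
        raise ValueError("desired shard size must be positive")
    return max(1, math.ceil(total_index_bytes / desired_shard_bytes))

GB = 1024 ** 3
# A 100 GB collection with a 30 GB per-shard target re-shards into 4 shards.
print(shards_for_size(100 * GB, 30 * GB))  # 4
```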

Continue reading “SolrCloud Rebalance API”

Introducing Akka Monitoring

Akka is a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM. It is part of Scala’s standard distribution as its implementation of the “actor model”.

How Akka Works

Messages between Actors are exchanged via Mailbox queues, Dispatchers provide various concurrency models, and Routers manage the message flow between Actors. That’s quite a lot Akka is doing for developers!

But how does one find bottlenecks in distributed Akka applications? Well, many Akka users already use the great Kamon Open-Source Monitoring Tool, which makes it easy to collect Akka metrics.  However — and this is important! — predefined visualizations, dashboards, anomaly detection, alerts and role-based access controls for the DevOps team are out of scope for Kamon, which is focused on metrics collection only.  To overcome this challenge, Kamon’s design makes it possible to integrate Kamon with other monitoring tools.

Needless to say, Sematext has embraced this philosophy and contributed the Kamon backend to SPM.  This gives Akka users the option to use detailed Metrics from Kamon along with the visualization, alerting, anomaly detection, and team collaboration functionalities offered by SPM.

The latest Kamon 0.5.x release includes the kamon-spm module and was announced on August 17th, 2015 on the Kamon blog.  Here’s an excerpt:

Pavel Zalunin from Sematext contributed the new kamon-spm module, which as you might guess allows you to push metrics data to the Sematext Performance Monitor platform. This contribution is particularly special to us, given the fact that this is the first time that a commercial entity in the performance monitoring sector takes the first step to integrate with Kamon, and they did it so cleanly that we didn’t even have to ask any changes to the PR, it was just perfect. We sincerely hope that more companies follow the steps of Sematext in this matter.

Now let’s take a look at the result of this integration work:

  • Metrics pushed to SPM are displayed in predefined reports, including:
    • An overview of all key Akka metrics
    • Metrics for Actors, Dispatchers and Routers
    • Common metrics for CPU, Memory, Network, I/O,  JVM and Garbage Collection
  • Each chart has the “Action” menu to:
    • Define criteria for anomaly detection and alerts
    • Create scheduled email reports
    • Securely share charts with read-only links
    • Embed charts into custom dashboards
  • A single SPM App can take metrics from multiple hosts to monitor a whole cluster; filters by Host, Actor, Dispatcher, and Router make it easy to drill down to the relevant piece of information.
  • All other SPM features are available for Akka users, too.  For example:

Akka_overview

Akka Metrics Overview

Actor Metrics

Actors send and receive messages, therefore the key metrics for Actors are:

  • Time in Mailbox
    Messages wait in the Mailbox to be processed – high Time in Mailbox values indicate potential delays in processing.
  • Processing Time
    This is the time Actors need to process the received messages – use it to discover slow Actors.
  • Mailbox Size
    A large Mailbox Size can indicate a backlog of pending operations, especially when it is constantly growing.
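The three metrics above relate through simple message-lifecycle timestamps. Here is a hypothetical sketch (ours, not Kamon’s internals) of how they could be derived per message, using millisecond timestamps:

```python
def message_timings(enqueued_ms, dequeued_ms, finished_ms):
    """Derive per-message Actor metrics from lifecycle timestamps:
    Time in Mailbox = waiting between enqueue and dequeue,
    Processing Time = work between dequeue and completion."""
    assert enqueued_ms <= dequeued_ms <= finished_ms
    return {
        "time_in_mailbox_ms": dequeued_ms - enqueued_ms,
        "processing_time_ms": finished_ms - dequeued_ms,
    }

# A message enqueued at t=1000, picked up at t=1500, done at t=1700:
m = message_timings(1000, 1500, 1700)
print(m)  # {'time_in_mailbox_ms': 500, 'processing_time_ms': 200}
```

In aggregate, a rising time-in-mailbox across many messages is the early warning sign the bullet above describes; processing time points at the slow Actor itself.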

Each of the above metrics is presented in aggregate for all Actors, but one can also use SPM’s filtering feature to view all Actors’ metrics separately, or select one or more specific Actors and visualize only their metrics.  Filtering by Host is also possible, as shown below.

Akka_actors

Akka Actors

Dispatcher Metrics

In Akka, a Dispatcher is what makes Actors ‘tick’. Each Actor is associated with a particular Dispatcher (the default one is used if no explicit Dispatcher is set), and each Dispatcher is associated with a particular Executor – a Thread Pool or a Fork Join Pool. The SPM Dispatcher report shows information about Executors:

  • Fork Join Pool
  • Thread Pool Executor

All metrics can be filtered by Host and Dispatcher.

Akka_dispatchers

Akka Dispatchers

Router Metrics

Routers can be used to efficiently route messages to destination Actors, called Routees.

  • Routing Time – Time to route a message to the selected destination.
  • Time in Mailbox – Time a routed message spends in the routee’s Mailbox.
  • Processing Time – Time the routee Actor spends processing routed messages.
  • Error Count – Number of errors raised while routees process routed messages.

For all these metrics, lower values are better, of course.

Akka_routers

Akka Routers

You can set Alerts and enable Anomaly Detection for any Akka or OS metrics you see in SPM and you can create custom Dashboards with any combination of charts, whether from your Akka apps or other apps monitored by SPM.

We hope you like this new addition to SPM.  Got ideas how we could make it more useful for you?  Let us know via comments, email, or @sematext.

Not using SPM yet? Check out the free 30-day SPM trial by registering here (ping us if you’re a startup, a non-profit, or an educational institution – we’ve got special pricing for you!).  There’s no commitment and no credit card required.  SPM monitors a ton of applications, like Elasticsearch, Solr, Cassandra, Hadoop, Spark, Node.js (open-source), Docker (get the open-source Docker image), CoreOS, RancherOS and more.