Solr Training in New York City

[Note: since this workshop has already taken place, stay up to date with future workshops at our Solr Training page]

——-

If you’re interested in comprehensive Solr training taught by a Sematext expert who knows Solr inside and out, we’re running a super hands-on training workshop in New York City on October 19-20.

This two-day workshop will be taught by Rafal Kuc, Sematext engineer and author of Solr books.

Target audience:

Developers and DevOps engineers who want to configure, tune and manage Solr at scale.

What you’ll get out of it:

In two days of training, Rafal will:

bring Solr novices to the level where they are comfortable taking Solr to production
give experienced Solr users proven, practical advice for their most advanced and pressing issues, based on years of experience designing, tuning, and operating numerous Solr clusters

* See the Course Outline at the bottom of this post for details

When & Where:

Dates: October 19 & 20 (Monday & Tuesday)
Time: 9:00 a.m. — 5:00 p.m.
Location: New Horizons Computer Learning Center in Midtown Manhattan
Cost: $1,200 early-bird rate (valid through September 1), $1,500 afterward. We’re also offering a 50% discount on the purchase of a second seat!
Food/Drinks: Light breakfast and lunch will be provided

Attendees will go through several sequences of short lectures followed by interactive, group, hands-on exercises. There will be a Q&A session after each such lecture-practicum block.

Got any questions or suggestions for the course? Just drop us a line or hit us up @sematext!

Lastly, if you can’t make it, watch this space or follow @sematext. We’ll be adding more Solr training workshops in the US, Europe, and possibly other locations in the coming months. We are also known worldwide for our Solr Consulting Services and Solr Production Support.

Hope to see you in the Big Apple in October!

——-

Solr Training Workshop – Course Outline

Introduction to Solr

What is Solr and use cases
Solr master-slave architecture
SolrCloud architecture
Why & When SolrCloud
Solr master-slave vs SolrCloud
Starting Solr with schema-less configuration
Indexing documents
Retrieving documents using URI request
Deleting documents

Indexing data

Index structure configuration
Defining custom field types
Tokenizers
Char filters
Filters
Dynamic fields
Copy fields
Running Solr with our own configuration
XML data format explained
JSON data format explained
CSV data format explained
Batch indexing
Doc values
Norms
Term vectors
Nested documents support

Searching

Simple URI search
Paging
Sorting
Filters
Choosing display fields
Pseudo fields
Debug query
Lucene query language
Standard query parser
Dismax query parser
Extended dismax query parser
Examples of other parsers
Timing out searches
Using cursor for deep paging
Nested documents support

Data analysis

Introduction to faceting
Basic use cases
Field faceting
Field prefix faceting
Sorting faceting results
Limiting faceting
Faceting execution control
Range faceting
Date faceting
Interval faceting
Hierarchical faceting
JSON facets
Facet functions
Nested JSON facets
Using stats component to generate statistics for field
Using stats component with function queries
Using stats component with faceting
Using stats component to calculate distinct field values

Beyond Search – highlighting and more like this

Introduction to highlighting
Highlighting query hits
Specifying fields to highlight
Choosing highlighting tags
Merging phrases
Using FastVectorHighlighter
Using PhraseHighlighter
Finding similar documents
Prerequisites for More Like This functionality
Configuring More Like This functionality

Beyond Search – Spellchecking

Spellchecker with its own index
File based spellchecker
Index based spellchecker
Building spellchecker
Including spell checking results with queries
Querying spellchecker independently
Maximum number of suggestions
Collation
Controlling collation
Accuracy
Extended results
Performance considerations

Beyond Search – Documents grouping

Grouping documents by field value
Grouping documents by function value
Grouping documents by query
Paging in grouped results
Controlling number of groups and documents count
Sorting inside groups
Documents grouping and faceting
Using collapse query parser
Using expand component

Function queries

Using function queries
Math function queries
Term function queries
Example use cases
Boosting by using functions
Sorting by function
External file field type
Using external file field type for boosting

Search under control

Routing
Index time routing
Query time routing
Basic syntax for local params
Parameter dereferencing
Using parameter dereferencing in handlers configuration
Using faceting tagging
Using faceting exclusions

Tuning Solr

General solrconfig.xml sections
Lucene directory configuration
Codec factory settings
Schema factory settings
Indexing threads
Indexing buffer size
Merge policy
Merge scheduler
Auto commit tuning
Transaction log configuration
Slow query threshold
Caches
Replication
Replication throttling

Scaling Solr

Proper Solr master configuration
Proper Solr slaves configuration
Replication for master-slave
Querying in master-slave deployment
Multiple masters architecture
Setting up Solr slaves for multiple masters
Indexing data in multi-master environment
Querying in multi-master environment

Scaling SolrCloud

ZooKeeper role explained
Uploading configuration to ZooKeeper
Sharding
Using collections API
Cluster state explained
Creating replicas
Removing replicas
Caches in SolrCloud
Shard splitting
Migrating data between collections
Aliases
Adding shards with implicit routing

Operations

Running Solr as a service on Linux systems
Running Solr as a service on MS Windows systems
Backing up Solr master-slave
Backing up SolrCloud
Current cluster state view
Monitoring using JMX
Monitoring using SPM
Configuration of Solr logging

Developer APIs

Connecting to Solr using Java
Connecting to SolrCloud using Java
Using SolrJ to index data
Using SolrJ to query Solr
Connecting to Solr using Python
Using pySolr to index data
Using pySolr to query Solr
Streaming aggregations explained

Ecosystem

Using Flume with Solr
Using Logstash with Solr
Solritas as the out-of-the-box tool for data discovery
Visualizing data using Banana

Yeah — you’ll learn a TON in just two days!

2 thoughts on “Solr Training in New York City — October 19-20”

Anil says:

August 12, 2015 at 12:09 PM

Great content for training. Can this be offered online?

sematext says:

August 12, 2015 at 6:01 PM

Anil – yes, we offer both on-site trainings (we fly in) and remote trainings (we sit at our place, your people at your place, and we deliver the training remotely). Please let us know if either is of interest.