
Index
SYMBOL
A
abandoned shopping carts
AbandonedCartEvent class, 2nd
AbandonedCartEvent.java file
AbandonedCartsStreamTask.java file
accounts, AWS, setting up
AFTER_SEQUENCE_NUMBER shard iterator type
alert schema, writing
alerts topic, 2nd
Amazon DynamoDB. See DynamoDB.
Amazon Elastic Compute Cloud (EC2)
Amazon Elastic MapReduce (EMR), 2nd, 3rd
Amazon Kinesis, 2nd, 3rd
attaching function to
reading from
Kinesis frameworks and SDKs
monitoring stream with boto
reading events with AWS CLI
setting up
terminology differences from Apache Kafka
writing events to
modeling events
setting up stream
systems monitoring and unified log
Amazon Redshift, 2nd
creating fat events table
designing event warehouse
fat table
shredded entities
table per event
setting up
Amazon Simple Email Service (SES)
Amazon Simple Storage Service (S3), 2nd, 3rd, 4th, 5th
buckets, 2nd
uploading jar file to
Amazon Web Services. See AWS.
AmazonKinesisFullAccess policy
AmazonS3FullAccess policy
AmountValidator object
analytics-on-read
analytics-on-write, 2nd
Lambda function
deploying
testing
Lambda function, building
analytics-on-write algorithm
AWS Lambda overview
conditional writes to DynamoDB
Lambda setup and event modeling
setting up DynamoDB
AowLambda function
Apache Avro, modeling events in
round-tripping event from JSON to Java and back
setting up development harness
testing
writing health check event schema
Apache Beam project
Apache Cassandra, 2nd
Apache Flink, 2nd, 3rd, 4th, 5th
Apache Flume
Apache Hadoop, 2nd, 3rd, 4th, 5th
Apache Hadoop YARN
Apache HBase
Apache Kafka, 2nd, 3rd, 4th, 5th, 6th
downloading and installing
reading from
stream-processing app
configuring application
locking down requirements
setting up development environment
single-event processor
stitching files together
testing
using Kafka as glue between systems
terminology differences from Amazon Kinesis
using as glue between systems
writing to
Apache Kafka Streams
Apache Mesos
Apache Samza, 2nd, 3rd
capabilities of
detecting abandoned shopping carts
configuring job
designing job
preparing project
writing Java task
running job
improving job
submitting job
testing job
Yet Another Resource Negotiator (YARN)
Apache Spark, 2nd, 3rd
Apache Spark Streaming, 2nd, 3rd
Apache Storm, 2nd, 3rd
Apache Thrift
Apache ZooKeeper, 2nd, 3rd, 4th
application-level logging
ApplicationMaster component, YARN
approximations
archiving events
archiving Apache Kafka with Secor
creating event archive
setting up Secor
warming up Apache Kafka
batch processing
designing job
overview
running job on Elastic MapReduce
writing job in Apache Spark
design for
how to archive
what to archive
where to archive
shortcomings of unified log
refinement
reprocessing
resilience
archivist’s manifesto
AT_SEQUENCE_NUMBER shard iterator type
Avro. See Apache Avro.
AWS (Amazon Web Services), 2nd, 3rd, 4th
account setup
users, creating
AWS CLI
reading events with
setting up
tools, 2nd
AWS Free Tier, 2nd
AWS Python SDK
Azure Blob Storage
B
bad-events topic, 2nd, 3rd
Base64 encoder
batch processing archives
designing job
overview
running job on Elastic MapReduce
writing job in Apache Spark
batch processing framework
Bifrost tool
binary key-value pairs
body, of events
boto, 2nd, 3rd, 4th, 5th
bounded windows
browser-generated events
bun operator
C
Camus tool
Cart class
cascade failure
Cassandra, 2nd
CEP (complex event processing)
checked exceptions
CLI (command-line interface), 2nd
CloudFormation template
clusters, Redshift
collectd
command hierarchy
command priority
command-execution job
command-line interface (CLI), 2nd
commands
consuming
parsing commands
reading commands
testing
tool for
events and
executing
completing executor
final testing
signing up for Mailgun
execution failures
hierarchies
implicit vs. explicit
in Plum example
in unified log
modeling
one stream of vs. many
writing alert schema
complex event processing (CEP)
composable failure handling
compute nodes
conditional writes
Connect S3 tool
console-consumer, 2nd
consumer group, Kafka
continuous event streams, 2nd
control flow
COPY command
COPY from JSON statement
COUNT DISTINCT, SQL
count() function, 2nd
CREATE TABLE statement
create-cluster command
CREATING stream status
D
dashboards
data points
data processing framework
data serialization format
data warehouses, 2nd, 3rd
data-definition language (DDL)
Databricks Cloud
DataFrame type
DDL (data-definition language)
decision-making job
decisioning
delivery guarantees
derived events
describe-stream command
dimension widening
Disco framework
distributed frameworks
domain-specific language (DSL)
don’t repeat yourself (DRY)
DRIVER_MISSES_CUSTOMER tag, 2nd
DSL (domain-specific language)
durability
DynamoDB, 2nd, 3rd, 4th, 5th, 6th, 7th
E
EC2 (Elastic Compute Cloud)
Elastic Compute Cloud (EC2)
Elastic MapReduce (EMR), 2nd, 3rd
Elasticsearch
Elasticsearch plus Kibana
email_sent event, 2nd
embeddable stream processing frameworks
EMR (Elastic MapReduce), 2nd, 3rd
enriched-events topic, 2nd, 3rd
enriching events, 2nd, 3rd, 4th
error collection services
Esper tool
ETL (extract, transform, load), 2nd
data volatility
dimension widening
loading events
Event class
event IDs, 2nd, 3rd
event metadata
event property
event source mapping
event stream processing
multiple-event processing
reasons for
single-event processing
with Amazon Kinesis
reading from Kinesis
writing events to Kinesis
with Apache Kafka
app design
single-event processor
writing Kafka worker
event streams, 2nd
defined
delivery driver events and entities
delivery truck events and entities
event model
events archive
familiar types of
application-level logging
publish/subscribe messaging
web analytics
event warehouse, designing
fat table
shredded entities
table per event
events
associating with schemas
modest proposals
schema registry
self-describing event
commands and
defined
e-commerce
identifying key
modeling, 2nd
in Apache Avro
modeling failures as events
stateful stream processing
writing to Amazon Kinesis
reading from Amazon Kinesis
Kinesis frameworks and SDKs
monitoring stream with boto
reading events with AWS CLI
reading from Apache Kafka
writing to Amazon Kinesis
setting up stream
systems monitoring and unified log
terminology differences from Kafka
writing agent
writing to Apache Kafka
events stream, 2nd, 3rd
exactly-once processing
execute method, 2nd
executing commands
completing executor
final testing
signing up for Mailgun
exhaustive pattern match
exit values
extract, transform, load. See ETL.
F
fact tables
failure
failure composition with Scalaz
better failure handling through Scalaz
composing failures
from Java to Scala
planning for failure
setting up Scala project
Java and
logging and
unified log and
composing happy path across jobs
design for failure
modeling failures as events
Unix programs and
fallocate command
fat events table, 2nd
fat jar
fault tolerance
FIFO (first in, first out)
file variable
Filebeat, 2nd
filter operation
filtering events
first in, first out (FIFO)
first-class entity
Flafka tool
flatMap operation
Flink. See Apache Flink.
Fluentd, 2nd
Flume, 2nd
for comprehensions
for loop
foreach operation
fragmented decisioning
framework limitations
fully baked commands, 2nd
G
H
Hadoop Distributed File System (HDFS), 2nd, 3rd
Hadoop. See Apache Hadoop.
Hadoop SequenceFile
happy track
HBase
HDFS (Hadoop Distributed File System), 2nd, 3rd
Heron
heterogeneous streams
holistic systems monitoring
Home Alone operator
homogeneous streams
hosted unified log service
hot-swapping data application versions
hybrid era
I
IaaS (infrastructure-as-a-service)
IAM (identity and access management), 2nd
idealized happy path
implicit commands
in-process memory
indirect object
infrastructure-as-a-service (IaaS)
IP addresses
J
jackson-databind dependency
Java
failure and
failure composition with Scalaz
Java SE 8 JDK
Java virtual machine (JVM)
JavaScript
JavaScript Object Notation (JSON), 2nd
JMX
JSON (JavaScript Object Notation), 2nd
JSON Paths file
JSON Schema, 2nd
JVM (Java virtual machine)
K
Kafka Connect S3, Confluent
Kafka. See Apache Kafka.
kafka-clients dependency
Kappa Architecture
KCL (Kinesis Client Library), 2nd
key-value store, 2nd
Kinesis Client Library (KCL), 2nd
Kinesis. See Amazon Kinesis.
Kinesis Storm Spout
kinesis-s3 tool
Kreps, Jay
ksh shell, Unix
L
Lambda function
building
analytics-on-write algorithm
AWS Lambda overview
conditional writes to DynamoDB
finalizing Lambda
Lambda setup and event modeling
setting up DynamoDB
deploying
attaching function to Kinesis
configuring permissions
uploading to S3
testing
late-arriving data
LATEST shard iterator type
LevelDB
librit
list command
Location timestamp column
log collection agents
log everything platform
log storage
log-file-analysis tool
log-industrial complex
Log4j framework
Logback
logging frameworks
Logstash, 2nd
low-latency analytics, 2nd
low-latency data pipelines
low-latency operational reporting
M
Mailgun, 2nd
make bucket (mb) command
managed policies
MapReduce, 2nd
Marz, Nathan
massively parallel processing (MPP)
MaxMind, 2nd, 3rd
maxmindFile argument
mb (make bucket) command
MD5 hash
MECE (mutually exclusive, collectively exhaustive)
Mesos
metrics
microbatch of events, 2nd
microbatch processing framework
minimum function
modeling
commands
events
Monitoring tab, Stream Details view
MPP (massively parallel processing)
multi-node cluster
multiple-event processing, 2nd, 3rd
MUST SET section
mutually exclusive, collectively exhaustive (MECE)
N
NEL (NonEmptyList), 2nd, 3rd
newline-delimited JSON
NextShardIterator
nile codebase
nile-carts job
Node.js cluster
NodeManager component, YARN
nodes
NonEmptyList (NEL), 2nd, 3rd
Notification child record
novel errors, 2nd
NSQ brokering events
nsqd daemon
nsqlookupd daemon
nsq_tail app
O
offset
ongoing states
OOPS
analytics-on-write and
analytics-on-write algorithm
Kinesis setup
requirements gathering
event stream
delivery driver events and entities
delivery truck events and entities
event model
events archive
Open Exchange Rates service
OpenStack Swift
out-of-band failure path
P
parsing commands
partitions, 2nd
pass-through producer
pattern matching
payload of events
pipe character (|)
pipefail option
Plain Old Java Object (POJO)
point-to-point connections, 2nd
POJO (Plain Old Java Object)
Postfix
PostgreSQL GUI
pragmatic happy path
pre-aggregate microbatch
prepositional objects, 2nd
processing window
producing events
product property
program termination
Prometheus
protocol buffers
psql connection
psql tool
publish-subscribe message queue
publish/subscribe messaging
pull-based monitoring
push-based monitoring
Q
R
railway-oriented processing
building railway
failure and Java
failure and logging
failure and unified log
composing happy path across jobs
design for failure
modeling failures as events
failure and Unix programs
failure composition with Scalaz
better failure handling through Scalaz
composing failures
from Java to Scala
planning for failure
setting up Scala project
overview
raw-events stream, 2nd, 3rd
raw-events topic, 2nd, 3rd, 4th
RDD (resilient distributed dataset)
read-eval-print loop (REPL)
Recipient child record
Redis
Redshift. See Amazon Redshift.
refinement processes
REGION parameter
remote data store
reordering events
REPL (read-eval-print loop)
reprocessing, 2nd
resilience
resilient distributed dataset (RDD)
ResourceManager component, YARN
retention period
Riak Cloud Storage (CS)
RocksDB, 2nd
Rollbar service
S
s3 command
S3. See Amazon Simple Storage Service (S3).
SaaS (software-as-a-service), 2nd, 3rd
Samza. See Apache Samza.
Sawmill
SBT (Scala Build Tool)
Scala Build Tool (SBT), 2nd
scala-forex project, Snowplow
scalability
Scalaz, failure composition with
better failure handling
composing failures
from Java to Scala
planning for failure
setting up Scala project
Schema Guru, 2nd
schemas
as contracts
associating events with
modest proposals
schema registry
self-describing event
modeling events in Apache Avro
round-tripping event from JSON to Java and back
setting up development harness
testing
writing health check event schema
Plum example company
technologies
Apache Avro
Apache Thrift
capabilities of
choosing
JSON Schema
protocol buffers
Scream operator, 2nd
Scribe project
SDKs, Kinesis frameworks and
second-order events
Secor SequenceFile
Secor, archiving Apache Kafka with
creating event archive
setting up Secor
warming up Apache Kafka
self-describing commands
self-describing events
send email command
SendGrid
Sentry service
separation of concerns
serdes (serializers-deserializers)
Serializers section
server-side events
SES (Amazon Simple Email Service)
shard iterator, 2nd
shards, 2nd
shopping carts, detecting abandoned
defining algorithm
derived events stream
goals regarding
modeling events
Shopper abandons cart
Shopper adds item to cart
Shopper places order
with Apache Samza
configuring job
designing job
preparing project
writing Java task
shrink-wrapped commands, 2nd
single version of truth, 2nd
single-event processing, 2nd, 3rd
single-event processors, 2nd, 3rd
testing
updating main function
writing
single-node clusters
slaves, 2nd
SLF4J
software-as-a-service (SaaS), 2nd, 3rd
Spark SQL
Spark Streaming. See Apache Spark Streaming.
spark-shell command
SparkConf, 2nd
speed layer
Splunk
SQLite
stable data points
state management, 2nd
stateful event processing
stateful stream processing
detecting abandoned shopping carts
defining algorithm
derived events stream
goals regarding
frameworks for
Apache Flink
Apache Kafka Streams
Apache Samza
Apache Spark Streaming
Apache Storm
choosing
modeling events
Shopper abandons cart
Shopper adds item to cart
Shopper places order
state management
stream windowing
with Apache Samza
configuring job
designing job
improving job
preparing project
submitting job
testing job
writing Java task
Yet Another Resource Negotiator (YARN)
StatsD
stderr stream
stdin stream
stdout stream, 2nd
storage target
Storm. See Apache Storm.
Storm Trident library
Stream Details view
stream processing frameworks
stream windowing
stream-processing app
configuring application
locking down requirements
reading from Kafka
setting up development environment
single-event processor
testing
updating main function
writing event processor
stitching files together
testing
using Kafka as glue between systems
writing to Kafka
stream-processing topology
streams
String type
subject-verb-object events, 2nd, 3rd, 4th, 5th, 6th
subscribing apps
Success output
systems monitoring, 2nd
Systems section
systems-monitoring tool
T
Tachyon filesystem
tag manager, JavaScript
task distribution
Task section
terminated set of records
testing commands
Thrift framework
time frames
tinylog
topics
transactional systems
transformation warnings
trial-and-error deserialization
trim horizon, Kinesis
TRIM_HORIZON type, 2nd
trimmed events, 2nd
U
ulp (unified log processing)
ulp database
ulp-assets bucket
unified era
unified log, 2nd, 3rd
evolution of business data processing
classic era
hybrid era
unified era
example implementation
creating stream
downloading and installing Apache Kafka
identifying key events
sending and receiving events
setting up
failure and
composing happy path across jobs
design for failure
modeling failures as events
properties of
append-only
distributed
ordered
unified
shortcomings of
systems monitoring and
use cases for
customer feedback loops
holistic systems monitoring
hot-swapping data application versions
unified log analytics
unified log processing (ulp)
UNION command
Universal Analytics
Unix pipeline
Unix programs, failure and
unrecoverable failure
unterminated event stream, 2nd
user security credentials
users, AWS, setting up
V
Vagrant
validating events, 2nd, 3rd, 4th
Validation inputs
Validation outputs
Validation type
Vertica
vertical bar character (|)
VirtualBox
volatile data points
W
web analytics
wide data coverage, analytics
window() function, 2nd, 3rd
Wlaschin, Scott
WMI
workers
Writer module
writing events, 2nd