From Practical Kafka Development to Operation
Description
Book Introduction
A book recommended by Jun Rao, co-founder of Apache Kafka!

Seung-Beom Ko, the author of "Kafka, the Strongest Data Platform," holds the title of Korea's first and only Confluent Certified Trainer for Apache Kafka and Confluent Certified Administrator for Apache Kafka. He has compiled all of his practical experience and know-how accumulated while operating Kafka on Korea's largest data platforms, including SKT and Kakao, into this book.

Table of Contents
Chapter 1: Kafka Overview

1.1 Kafka adoption cases at Zalando and Twitter
__1.1.1 The Case of Zalando, Europe's Largest Online Fashion Retailer
__1.1.2 How Twitter, the Social Media Giant, Uses Kafka
1.2 Current Status of Kafka Usage in Korea and Abroad
1.3 Key Features of Kafka
1.4 Kafka's Growth
1.5 Various Kafka Use Cases
1.6 Summary

Chapter 2: Configuring the Kafka Environment

2.1 Setting up the practice environment for this book
__2.1.1 Configuring a Practice Environment in an AWS Environment
__2.1.2 Configuring a Practice Environment in an On-Premises Environment
2.2 Configuring the Kafka Cluster
2.3 A taste of Kafka in 5 minutes
__2.3.1 Kafka's basic configuration
__2.3.2 Sending and receiving messages
2.4 Summary

Chapter 3: Kafka Basic Concepts and Structure

3.1 Building the Basics of Kafka
__3.1.1 Replication
__3.1.2 Partition
__3.1.3 Segment
3.2 Kafka's Core Concepts
__3.2.1 Distributed Systems
__3.2.2 Page Cache
__3.2.3 Batch transfer processing
__3.2.4 Compressed transmission
__3.2.5 Topics, Partitions, and Offsets
__3.2.6 High Availability Guarantee
__3.2.7 ZooKeeper Dependencies
3.3 A taste of the producer's basic operations and examples
__3.3.1 Producer Design
__3.3.2 Producer's Main Options
__3.3.3 Producer Example
3.4 A taste of consumer basic operations and examples
__3.4.1 Basic operation of the consumer
__3.4.2 Consumer's Main Options
__3.4.3 Consumer Example
__3.4.4 Understanding Consumer Groups
3.5 Summary

Chapter 4: Kafka's Internal Operations and Implementation

4.1 Kafka Replication
__4.1.1 Replication Operation Overview
__4.1.2 Leaders and Followers
__4.1.3 Maintaining replication and committing
__4.1.4 Step-by-step replication behavior of leaders and followers
__4.1.5 Leader Epoch and Recovery
4.2 Controller
4.3 Log (Log Segment)
__4.3.1 Deleting log segments
__4.3.2 Log Segment Compaction
4.4 Summary

Chapter 5: Producer's Internal Operations and Implementation

5.1 Partitioner
__5.1.1 Round Robin Strategy
__5.1.2 Sticky Partitioning Strategy
5.2 Producer Batching
5.3 Non-Duplicate Transmission
5.4 Exactly-Once Transmission
__5.4.1 Design
__5.4.2 Producer Example Code
__5.4.3 Step-by-step operation
__5.4.4 Example Practice
5.5 Summary

Chapter 6: Internal Operations and Implementation of Consumers

6.1 Consumer Offset Management
6.2 Group Coordinator
6.3 Static Membership
6.4 Consumer Partition Allocation Strategy
__6.4.1 Range Partition Allocation Strategy
__6.4.2 Round Robin Partition Allocation Strategy
__6.4.3 Sticky Partition Allocation Strategy
__6.4.4 Cooperative Sticky Partition Allocation Strategy
6.5 Exactly-Once Consumer Operation
6.6 Summary

Chapter 7: Kafka Operation and Monitoring

7.1 Configuring ZooKeeper and Kafka for Stable Operation
__7.1.1 ZooKeeper Configuration
__7.1.2 Kafka Configuration
7.2 Configuring the Monitoring System
__7.2.1 Log Management and Analysis with Kafka as an Application
__7.2.2 Monitoring Kafka Metrics Using JMX
__7.2.3 Kafka Exporter
7.3 Summary

Chapter 8: Kafka Version Upgrade and Scaling

8.1 Preparing for a Kafka Version Upgrade
8.2 Kafka Rolling Upgrade with ZooKeeper Dependencies
__8.2.1 Download and configure the latest version of Kafka
__8.2.2 Broker Version Upgrade
__8.2.3 Changing broker settings
__8.2.4 Precautions when upgrading
8.3 Scaling Kafka
__8.3.1 Broker Load Balancing
__8.3.2 Precautions When Redistributing Partitions
8.4 Summary

Chapter 9: Kafka Security

9.1 Three Elements of Kafka Security
__9.1.1 Encryption (SSL)
__9.1.2 Authentication (SASL)
__9.1.3 Authorization (ACL)
9.2 Kafka Encryption Using SSL
__9.2.1 Creating a Broker Keystore
__9.2.2 Creating a CA Certificate
__9.2.3 Creating a Truststore
__9.2.4 Certificate signing
__9.2.5 Configuring SSL for the remaining brokers
__9.2.6 Adding SSL to Broker Settings
__9.2.7 SSL-based message transmission
9.3 Kafka Authentication Using Kerberos (SASL)
__9.3.1 Kerberos Configuration
__9.3.2 Authentication using keytab
__9.3.3 Broker Kerberos Settings
__9.3.4 Client Kerberos Settings
9.4 Setting Kafka Permissions Using ACLs
__9.4.1 Setting Broker Permissions
__9.4.2 Setting user-specific permissions
9.5 Summary

Chapter 10: Schema Registry

10.1 Concept and Usefulness of Schema
10.2 Kafka and Schema Registry
__10.2.1 Schema Registry Overview
__10.2.2 Avro Support in Schema Registry
__10.2.3 Installing the Schema Registry
10.3 Schema Registry Practice
__10.3.1 Schema Registry and Client Behavior
__10.3.2 Using the Schema Registry with Python
10.4 Schema Registry Compatibility
__10.4.1 BACKWARD Compatibility
__10.4.2 FORWARD Compatibility
__10.4.3 FULL Compatibility
__10.4.4 Schema Registry Compatibility Practice
10.5 Summary

Chapter 11: Kafka Connect

11.1 Core Concepts of Kafka Connect
11.2 The Inner Workings of Kafka Connect
11.3 Standalone Kafka Connect
__11.3.1 Running the File Source Connector
__11.3.2 Running the File Sink Connector
11.4 Distributed Mode Kafka Connect
11.5 Connector-Based MirrorMaker 2.0
11.6 Summary

Chapter 12: Enterprise Kafka Architecture Configuration Case Studies

12.1 Overview of Kafka Architecture for Enterprises
12.2 Configuring Kafka for Enterprise
12.3 Operational Practice of Kafka for Enterprises
__12.3.1 Creating a Topic Using CMAK
__12.3.2 Kafka Connect Configuration
__12.3.3 Configuring the monitoring environment
__12.3.4 Sending and Confirming Messages
12.4 Summary

Chapter 13: Kafka's Development and Future

13.1 The Future of Kafka Without ZooKeeper
__13.1.1 Restrictions on using ZooKeeper
__13.1.2 Kafka upgrade without ZooKeeper dependency
13.2 New Consensus Protocol
13.3 Optimized Controller Node Configuration
13.4 KIP Containing Kafka's Future
13.5 Summary

Appendix A: MSK and Confluent Cloud
__A.1 MSK
__A.2 Confluent Cloud
__A.3 Comparison of MSK and Confluent Cloud

Appendix B: A Taste of Ansible
__B.1 Characteristics of Ansible
__B.2 Configuring the Practice Environment
__B.3 Ansible Features Overview

Appendix C: Installing Kafka with Docker
__C.1 Docker-based Kafka configuration
__C.2 Sending and receiving messages

Appendix D: Q&A at a Glance
__D.1 ZooKeeper Related
__D.2 Kafka Related
__D.3 Producer Related
__D.4 Consumer Related


Publisher's Review
Want to process massive amounts of data quickly and accurately, without loss? Kafka is the answer!

This is the most complete and detailed guide to Kafka. It covers Kafka's internal structure and operation, explained quickly and clearly with rich illustrations; basic example code for Kafka clients; the core know-how required for real-world operation; the security and monitoring techniques needed to run Kafka safely 365 days a year; and the Schema Registry and Kafka Connect, which maximize operational convenience and efficiency.

What this book covers

- Kafka's internal structure and operating principles, explained in an easy-to-understand manner with rich illustrations
- Kafka client example code using Java and Python
- Deploying and operating Kafka in AWS and on-premises environments
- Kafka upgrade and maintenance strategies that minimize pain
- How to configure security for Apache Kafka
- Various uses of the Schema Registry and Kafka Connect
- The internal workings of producers/consumers and rebalancing methods for properly using Kafka
- Kafka architecture configuration example in an enterprise environment
- Q&A summarizing the experiences and tips of industry experts

Target audience for this book

- Beginners who want to learn Kafka
- Operators who want to apply Kafka to their work
- Developers who want to know how Kafka works internally to get the most out of it
- Developers who want to understand and utilize Kafka and the Kafka ecosystem
- Architects concerned with data standardization and real-time processing
- Architects who want to collect, process, and analyze data efficiently

Structure of this book

Chapter 1, "Kafka Overview," looks at the challenges faced by leading companies and the unique strengths of Kafka through the adoption cases of Zalando and Twitter, together with the current state of Kafka usage in Korea and abroad. It then traces Kafka's ongoing growth and surveys its many use cases.

Chapter 2, "Configuring the Kafka Environment," covers the setup of this book's practice environment. It provides detailed instructions for configuring both AWS and on-premises environments, so even beginners can follow along. Once Kafka is installed in your AWS environment, you will get a first feel for it through a simple introductory example of sending and receiving messages.
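To make that first exercise concrete, here is a minimal sketch, assuming a single broker reachable at localhost:9092 and a topic named my-first-topic (both illustrative choices, not taken from the book), of sending one message and reading it back with the standard Apache Kafka Java client:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class FirstMessage {
    public static void main(String[] args) {
        // Producer: send one string message to the topic.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("my-first-topic", "hello kafka"));
        }

        // Consumer: read the message back from the beginning of the topic.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "first-message-group"); // assumed group id
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singletonList("my-first-topic"));
            // The first poll may return nothing while the consumer joins its group,
            // so poll briefly in a loop.
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d, value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```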

Chapter 3, "Kafka Basic Concepts and Structure," walks through Kafka's basic structure, the operation of producers and consumers, and example code, building the foundation you need for the rest of the book.

Chapter 4, "Kafka's Internal Operations and Implementation," examines Kafka's core functions, focusing on the detailed internal behavior of replication, the role of the controller, and log segments.

Chapter 5, "Producer's Internal Operations and Implementation," examines the core functions of the producer, covering what you need to know for reliable message transmission: partitioners, non-duplicate (idempotent) transmission, and exactly-once transmission.
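As a hedged illustration of the reliability features this chapter names, the sketch below enables idempotent (non-duplicate) delivery and wraps a send in a transaction for exactly-once delivery, using standard Kafka producer settings; the broker address, transactional id, topic, and record contents are illustrative assumptions, not taken from the book:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class ReliableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        // Non-duplicate transmission: the broker de-duplicates retried sends
        // using the producer id and per-partition sequence numbers.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        // Exactly-once transmission builds on idempotence by adding transactions.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-tx-id"); // assumed id

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "order-1", "created")); // assumed topic
            producer.commitTransaction();
        }
    }
}
```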

Chapter 6, "Internal Operations and Implementation of Consumers," examines the core functions of the consumer, covering the key elements of stable consumer development and operation: consumer offset management, group coordinator behavior, and partition assignment strategies.
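The sketch below, assuming an illustrative broker address, topic, and group id of my own choosing, shows the consumer-side knobs the chapter discusses: joining a consumer group, selecting the cooperative-sticky partition assignment strategy, and committing offsets manually after processing:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class GroupedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-consumer-group");   // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        // Cooperative-sticky assignment moves as few partitions as possible during a rebalance.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName());
        // Disable auto-commit and commit offsets only after records have been processed.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders")); // assumed topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync(); // offsets are stored in the __consumer_offsets topic
            }
        }
    }
}
```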

Chapter 7, "Kafka Operation and Monitoring," covers methods and know-how for operating Kafka reliably, and walks you through building your own monitoring system so you can monitor it with confidence.

Chapter 8, "Kafka Version Upgrade and Scaling," lets you learn and practice the version upgrade and scale-out procedures that are essential when operating Kafka.

Chapter 9, "Kafka Security," covers the security features Kafka provides and includes hands-on exercises for applying them.

Chapter 10, "Schema Registry," provides an overview of the Schema Registry and how to use it. For those new to it, the chapter includes Avro-based hands-on exercises and example code.
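Since the chapter's own hands-on exercises use Python, here is only a rough, hedged Java sketch of the same flow: serializing an Avro record through Confluent's KafkaAvroSerializer so that the schema is registered with the Schema Registry and referenced by id in each message. The registry URL, topic name, and schema are illustrative assumptions; the serializer class and the schema.registry.url setting are the standard Confluent ones.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class AvroProducerSketch {
    // A tiny illustrative Avro schema for a "User" record.
    private static final String USER_SCHEMA =
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"age\",\"type\":\"int\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");          // assumed broker
        props.put("schema.registry.url", "http://localhost:8081"); // assumed registry URL
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");

        Schema schema = new Schema.Parser().parse(USER_SCHEMA);
        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "alice");
        user.put("age", 30);

        // The serializer registers the schema (if it is new) and embeds its id in the message.
        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("users-avro", "alice", user)); // assumed topic
        }
    }
}
```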

Chapter 11, "Kafka Connect," examines the basic concepts and operation of Kafka Connect, and shows how to mirror clusters with a brief introduction to the recently released MirrorMaker 2.0.

Chapter 12, "Enterprise Kafka Architecture Configuration Case Studies," builds on everything covered so far in the book to practice integrating and configuring MirrorMaker, Elasticsearch, and monitoring in a setup that resembles an enterprise environment.

Chapter 13, "Kafka's Development and Future," examines the background, current progress, and future direction of the effort to remove the ZooKeeper dependency, a recent hot topic.

The four appendices compare the pros and cons of running Kafka on MSK and on Confluent Cloud, and add introductory material, such as Ansible's features, practice environment setup, and installing Kafka with Docker, so that even beginners can follow along. A compilation of frequently asked questions is also included for easy review.

Development environment for using this book

- Apache Kafka 2.6
- Apache ZooKeeper 3.5.9
- Confluent Kafka 6.1
- Java 1.8
- Python 3.7.9
- Ansible 2.9.12
- Operating system: On-premises (CentOS7), AWS (Amazon Linux 2)

As of late September 2021, when this book was being finalized, Apache Kafka 3.0 had been released.
However, KRaft mode, which was newly introduced in Apache Kafka 3.0 to remove the ZooKeeper dependency, is not yet recommended for use in production environments.
Therefore, all practical examples and explanations in this book are based on Apache Kafka 2.6, which can be used reliably in enterprise environments.

[A Word from the Beta Readers]

Kafka has become a core system in microservice architecture (MSA) and event-driven architecture (EDA) designs.
Thanks to the efforts of numerous contributors, its maturity has increased significantly, and companies of all kinds now rely on it, proving its stability even when processing large volumes of messages.
The author has contributed greatly to Kafka adoption and its many use cases in Korea.
He runs the Kafka Korea User Group and wrote "Kafka, the Strongest Data Platform," introducing many Kafka beginners to its fundamentals and operational know-how.


The recently released "From Practical Kafka Development to Operation" is a new, specialized Kafka book that contains up-to-date Kafka information and is filled with practical examples in every chapter.
It is written in an approachable way so that readers can fully understand the material.
In line with users' growing expectations, the book covers monitoring, security, the Schema Registry, and components such as Kafka Connect. In particular, the chapters on version upgrades, enterprise configuration, and Q&A contain the author's operational know-how and resolve readers' questions thoroughly.
- Seungha Lee / Musinsa Billing Development Team

This book will appeal to a wide range of readers: students who want to learn about Kafka, developers who want to build applications on it, and engineers who operate Kafka clusters.
Anyone in these groups will benefit.
Students can learn the essence of the problems Kafka is trying to solve, developers can find answers to questions they have had while writing producers and consumers, and administrators can gain the knowledge needed to operate Kafka stably.

What makes this book even more special is the wealth of information drawn from the author's extensive experience running Kafka.
He develops the content around situations you are likely to encounter in practice, providing just the right amount of information, neither excessive nor lacking.
For anyone who has been wrestling with similar issues, it will feel like a ray of hope.
I hope you find these tips as helpful as I did.
- Nam Ji-yeol / DPG Media Machine Learning Engineer

In theory, setting up Kafka is not that difficult.
You can build your own message pipeline by configuring ZooKeeper and the brokers according to the official documentation and launching consumer and producer applications based on examples floating around the Internet.
However, the developers who use your Kafka cluster have many demands.
For example, "This message must never be lost!", "We want to put messages into S3; do we really have to develop and operate our own client?", or "Strange messages keep coming in and my client is crashing!" And even a Kafka cluster that was working fine sometimes starts to misbehave.
With Kafka, the gap between theory and practice can be as great as the difference between heaven and earth, or perhaps even greater.

This book clearly reflects the author's hands-on operational experience, interwoven with solid theory.
It will give those just starting out with Kafka a solid understanding of the Kafka ecosystem and its operation, and it will give those already running Kafka a foundation for more stable and advanced operation.
- Kim Dae-ho / Woowa Brothers Cloud Platform Development Team

The book covers the entire process from Kafka installation to operation, so even those new to Kafka can build and run it by following the exercises one by one. The hands-on practice in a cloud environment is especially valuable, since it can be applied to real work quickly in line with the industry's shift to the cloud.
The author's goal is to share the operational experience and core technical know-how he has accumulated in the field, from the basic concepts needed to handle Kafka to the various operational environments used in practice.
It will be very helpful both for those just starting out with Kafka and for those who want to become experts.
- Lee Sang-jin / Kakao Hadoop Engineering Part

[Author's Note]

As of the fall of 2021, three years have passed since my first book, "Kafka, the Strongest Data Platform," was published in 2018. Compared to then, Kafka's recognition and status as core infrastructure that raises corporate value have grown significantly, and we keep hearing of more companies adopting it.
Many companies providing services not only on-premises but also in the cloud are adopting Kafka and showing visible results.
However, reference materials and books for companies that want to join the Kafka ecosystem are still scarce.
So, after my first book, I wrote another Kafka book that incorporates the diverse practical experience I have accumulated in the field and covers a wider range of technical topics. In doing so, I hope to fulfill my mission of shining a light, however small, on the many operators, developers, and architects who are putting so much effort into introducing and optimizing Kafka.

This book expands on the previous work, covering almost everything about Kafka, from its internal structure and operating principles to the issues encountered in real work, such as installation, operation, maintenance, expansion, stability, and upgrades.
As an operator who has introduced and used Kafka on the front lines, I offer it to readers as a comprehensive gift set.
If beginners learn the basic concepts from "Kafka, the Strongest Data Platform" and then study the more in-depth operations and new core elements in "From Practical Kafka Development to Operation," I believe they will fully acquire the theory and practical know-how needed to operate Kafka in the field.
Of course, I also tried to include plenty of tips and knowledge that intermediate users should know.
I hope this book helps many people understand Kafka more easily and use it more effectively.

Product Details
- Publication date: October 29, 2021
- Pages and dimensions: 512 pages | 180 × 235 × 25 mm
- ISBN13: 9791189909345
- ISBN10: 1189909340
