Kafka - Topic

Kafka Commit Log Messaging Process

About

The Kafka cluster stores streams of records in categories called topics.

A topic is also known as a category or feed name.

A topic can have zero, one, or many consumers that subscribe to the data written to it.

Structure

For each topic, the Kafka cluster maintains a partitioned log that looks like this:

Log Anatomy

Topic 
   * -> partition 1
       * -> segment 11
       * -> segment 12
   * -> partition 2
       * -> segment 21
       * -> segment 22
.......

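where:

  • a topic is split into one or more partitions
  • each partition is stored as a sequence of segments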
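
As a rough illustration of this layout (a sketch, not taken from the original page): on a broker, each partition is stored as a directory named <topic>-<partition> under the configured log directory, and each segment is a file named after the offset of its first record, zero-padded to 20 digits. The topic name, partition number, and base offsets below are made-up example values.

```shell
#!/bin/sh
# Sketch: derive the on-disk names used for a partition and its segments.
# topic, partition, and the base offsets are hypothetical example values.
topic="test"
partition=0

# A partition lives in a directory named <topic>-<partition>.
partition_dir="${topic}-${partition}"

# A segment file is named after the offset of its first record,
# zero-padded to 20 digits.
segment_name() {
  printf '%020d.log' "$1"
}

first_segment=$(segment_name 0)
second_segment=$(segment_name 1500)

echo "${partition_dir}/${first_segment}"
echo "${partition_dir}/${second_segment}"
```

Each segment also has companion index files next to its .log file; they are left out here to keep the sketch short.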

Management

Creation

  • Create a topic named “test” with a single partition and only one replica:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
  • Another example, with four partitions and a replication factor of two:
/usr/bin/kafka-topics --create --zookeeper hostname:2181/kafka --replication-factor 2 --partitions 4 --topic topicname

Docker Compose example, where kafka is the name of the service:

docker-compose exec kafka kafka-topics --create --topic foo --partitions 1 --replication-factor 1 --if-not-exists --zookeeper localhost:32181

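This creates a topic with one partition and one replica. A production environment would typically have many more broker nodes, partitions, and replicas for scalability and resiliency.

Note that the commands on this page use the --zookeeper option, which was deprecated in Kafka 2.2 and removed in Kafka 3.0. Newer versions of kafka-topics take --bootstrap-server with a broker address instead.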

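It is possible to create Kafka topics dynamically, that is, automatically when a client first uses a topic that does not exist yet. However, this relies on the Kafka brokers being configured to allow it through the auto.create.topics.enable setting.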

Info

List

bin/kafka-topics.sh --list --zookeeper localhost:2181

Describe

kafka-topics --describe --topic foo --zookeeper localhost:32181

With docker-compose and the kafka service:

docker-compose exec kafka kafka-topics --describe --topic foo --zookeeper localhost:32181

Output:

Topic:foo       PartitionCount:1        ReplicationFactor:1     Configs:
        Topic: foo      Partition: 0    Leader: 1       Replicas: 1     Isr: 1

The output shows one partition (partition 0) with a single replica: broker 1 is the leader, the only replica, and the only in-sync replica (Isr).
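
For scripting, the describe output can be picked apart with standard text tools. The sketch below runs awk over the sample output shown above rather than over a live cluster. The exact output format differs between Kafka versions, so the field labels it looks for are an assumption based on that sample.

```shell
#!/bin/sh
# Sample text copied from the describe output above; a real script would
# capture the output of the kafka-topics --describe command instead.
describe_output='Topic:foo       PartitionCount:1        ReplicationFactor:1     Configs:
        Topic: foo      Partition: 0    Leader: 1       Replicas: 1     Isr: 1'

# For every per-partition line, print "<partition> <leader> <isr>".
# The summary line is skipped because it has no "Partition: " label.
partition_info=$(printf '%s\n' "$describe_output" | awk '
  /Partition: / {
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") p = $(i + 1)
      if ($i == "Leader:")    l = $(i + 1)
      if ($i == "Isr:")       s = $(i + 1)
    }
    print p, l, s
  }')

echo "$partition_info"
```

With more than one partition, the same awk program prints one line per partition.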

Show Structure

see Kafka - Consumer

Sync

To keep two topics in sync, you can either dual-write to them from your client (using a transaction to make the two writes atomic) or, more cleanly, use Kafka Streams to copy one topic into the other.

Retention period

The Kafka cluster keeps all published records, whether or not they have been consumed, for a configurable retention period. For example, if the retention policy is set to two days, a record is available for consumption for two days after it is published, after which it is discarded to free up space. Kafka's performance is effectively constant with respect to data size, so storing data for a long time is not a problem.
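
As a sketch of how the two-day example could be configured: retention for a topic is controlled by the topic-level retention.ms setting, which is expressed in milliseconds. The snippet below only computes the value and prints a kafka-configs command instead of running it. The topic name "test" and the broker address localhost:9092 are assumptions, and the script name and connection option depend on your Kafka version and distribution.

```shell
#!/bin/sh
# Sketch: convert a retention period in days into the milliseconds value
# expected by the topic-level retention.ms setting.
retention_days=2
retention_ms=$((retention_days * 24 * 60 * 60 * 1000))

# Hypothetical topic name and broker address; adjust for your cluster.
topic="test"
bootstrap="localhost:9092"

# Printed, not executed: drop the leading "echo" to apply the change.
echo kafka-configs --bootstrap-server "$bootstrap" \
  --entity-type topics --entity-name "$topic" \
  --alter --add-config "retention.ms=${retention_ms}"
```

Printing the command first makes it easy to check the computed value before touching a real topic.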
