Running Kafka on your own machine with Docker

18-11-19 banq
         

To run Apache Kafka on your own machine you need Docker, docker-compose, some disk space and an internet connection. We will use the Kafka Docker images from https://www.confluent.io/.

We chose docker-compose (rather than the Confluent CLI tools) because most modern developers are already familiar with Docker, and many of them use docker-compose daily; this makes it easy to include the Kafka cluster configuration in an existing project:
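Before going further, it is worth checking that both tools are actually installed. A tiny stdlib-only Python sketch that looks them up on your PATH:

```python
import shutil

# Report whether the Docker tooling this walkthrough relies on is installed.
status = {tool: shutil.which(tool) for tool in ("docker", "docker-compose")}
for tool, path in status.items():
    print(f"{tool}: {path or 'NOT FOUND - please install it first'}")
```

(Recent Docker installations also accept `docker compose` as a subcommand; this sketch only checks for the standalone binaries.)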

Our cluster will consist of:

  • one Kafka broker,
  • one Zookeeper instance,
  • one Schema Registry (since we will be using Avro later)

One Kafka broker and one Zookeeper would not be enough for a real production system, but they are perfectly fine for development.

With the Confluent Kafka Docker images we do not have to write configuration files by hand. Instead, everything can be configured through environment variables, and we will keep the Kafka environment separate from the container configuration.

1. Zookeeper environment

Let's start with Zookeeper. The most important options to pass to Zookeeper are this instance's ID, the client port, and the list of all servers in the cluster.

Original zookeeper.env: https://gist.github.com/saabeilin/4708a2b1a16dbf3c52d53a721744a779#file-zookeeper-env

ZOOKEEPER_SERVERS=zookeeper-1:4182:5181
ZOOKEEPER_SERVER_ID=1
ZOOKEEPER_CLIENT_PORT=2181

KAFKA_HEAP_OPTS=-Xms32M -Xmx32M -verbose:gc

We also cap the heap memory; this is more than enough for a development environment.

Confluent's Zookeeper image exposes two volumes, data and log; we will have to mount both to keep the state persistent. A minimal Zookeeper service description therefore looks like this:

docker-compose.zookeeper.yaml

  zookeeper-1:
    image: confluentinc/cp-zookeeper:5.0.0
    hostname: zookeeper-1
    container_name: zookeeper-1
    ports:
      - "2181:2181"
    env_file:
      - zookeeper.env
    healthcheck:
      test: /bin/sh -c '[ \"imok\" = \"$$(echo ruok | nc -w 1 127.0.0.1 2181)\" ]' || exit 1
      interval: 1m
    volumes:
      - zookeeper-1-data:/var/lib/zookeeper/data/
      - zookeeper-1-log:/var/lib/zookeeper/log/

Note that we are very explicit about the names: the hostname, service name and container name are all zookeeper-1, and they should match the zookeeper-1 entry in the Zookeeper environment configuration.
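Once the container is up, you can run the same "ruok" check as the healthcheck above by hand. A small stdlib-only Python sketch, assuming Zookeeper's client port 2181 is published on localhost as configured above:

```python
import socket

def zk_ruok(host="127.0.0.1", port=2181, timeout=1.0):
    """Send Zookeeper's 'ruok' four-letter word and expect 'imok' back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"ruok")
            return s.recv(16).startswith(b"imok")
    except OSError:
        return False

print("zookeeper is ok" if zk_ruok() else "zookeeper is not reachable")
```

This is exactly what the compose healthcheck does with `echo ruok | nc -w 1 127.0.0.1 2181`, just from the host side.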

2. Kafka

First, we need to give the Kafka broker an ID, point it at the Zookeeper instance configured above, and configure the listeners and advertised listeners (where the broker listens, and what it advertises to clients when they connect).

Note that with this configuration the broker is reachable as kafka-1:9092 from other Docker containers, and as localhost:29092 from your host machine.

Kafka can create topics automatically the first time you produce to them; this is usually not what you want in production, but it is very convenient in development. Similarly, deleting topics is often undesirable in production, while in development it is harmless. We will therefore enable both topic auto-creation and topic deletion.

kafka.env

# This will be our first and only broker
KAFKA_BROKER_ID=1
# Define listeners for accessing the broker both inside Docker and from the host machine
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka-1:9092,PLAINTEXT_HOST://localhost:29092
# Zookeeper connection
KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181

# In dev, we do not need Confluent metrics
CONFLUENT_SUPPORT_METRICS_ENABLE=false

# Likewise, we can cap the broker's heap
KAFKA_HEAP_OPTS=-Xms256M -Xmx256M -verbose:gc

# In a development environment, auto-creating topics (and deleting them) is convenient
KAFKA_AUTO_CREATE_TOPICS_ENABLE=true
KAFKA_DELETE_TOPIC_ENABLE=true

# Eight partitions is more than enough for development
KAFKA_NUM_PARTITIONS=8

# Retain offsets for 31 days - in case you do not work on your project that often
KAFKA_OFFSETS_RETENTION_MINUTES=44640

# Since we have just one broker, set replication factors to just one
KAFKA_DEFAULT_REPLICATION_FACTOR=1
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1
KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1
KAFKA_MIN_INSYNC_REPLICAS=1

# In a development environment, we do not need too many threads
KAFKA_NUM_RECOVERY_THREADS_PER_DATA_DIR=1
KAFKA_NUM_NETWORK_THREADS=3
KAFKA_NUM_IO_THREADS=3

# Configure default log cleanup. You can override these on per-topic basis
KAFKA_LOG_CLEANUP_POLICY=compact

KAFKA_LOG_RETENTION_BYTES=-1
KAFKA_LOG_RETENTION_CHECK_INTERVAL_MS=300000
KAFKA_LOG_RETENTION_HOURS=-1

KAFKA_LOG_ROLL_HOURS=24
KAFKA_LOG_SEGMENT_BYTES=1048576
KAFKA_LOG_SEGMENT_DELETE_DELAY_MS=60000
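Two of the numbers above are easy to double-check: the offsets retention is 31 days expressed in minutes, and the segment size is 1 MiB expressed in bytes:

```python
# Sanity-check the arithmetic behind two of the kafka.env values above.
offsets_retention_minutes = 31 * 24 * 60   # KAFKA_OFFSETS_RETENTION_MINUTES
log_segment_bytes = 1 * 1024 * 1024        # KAFKA_LOG_SEGMENT_BYTES (1 MiB)
print(offsets_retention_minutes, log_segment_bytes)
```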

The service description adds a dependency on zookeeper-1 and a volume for the Kafka log (which is the data!):

  kafka-1:
    image: confluentinc/cp-kafka:5.0.0
    hostname: kafka-1
    container_name: kafka-1
    stop_grace_period: 3m
    depends_on:
      - zookeeper-1
    ports:
      - "29092:29092"
    env_file:
      - kafka.env
    volumes:
      - kafka-1-data:/var/lib/kafka/data/

3. Schema Registry

The Schema Registry is the easiest to configure. Here is its environment:

schema-registry.env

SCHEMA_REGISTRY_HOST_NAME=schema-registry
SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL=zookeeper-1:2181
KAFKA_HEAP_OPTS=-Xms32M -Xmx32M -verbose:gc

The service:

docker-compose.schema-registry.yaml

  schema-registry:
    image: confluentinc/cp-schema-registry:5.0.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - zookeeper-1
      - kafka-1
    ports:
      - "8081:8081"
    env_file:
      - schema-registry.env

4. Running it all together

Finally, put all three .env files and the following top-level compose file in one directory:

docker-compose.yaml
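The full file is not reproduced here; a minimal top-level docker-compose.yaml simply assembles the three service descriptions shown above and declares the named volumes they mount (a sketch, with the service bodies elided):

```yaml
version: "3.4"

services:
  zookeeper-1:
    # ... service description from docker-compose.zookeeper.yaml above ...
  kafka-1:
    # ... Kafka service description from section 2 above ...
  schema-registry:
    # ... service description from docker-compose.schema-registry.yaml above ...

volumes:
  zookeeper-1-data:
  zookeeper-1-log:
  kafka-1-data:
```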

Then start the whole stack:

docker-compose up

Starting the stack may take some time, and you will see a lot of output. When it finally settles, we can try out our new cluster!
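As a first smoke test, you could ask the Schema Registry for its registered subjects over its REST API. A stdlib-only Python probe (the URL assumes the 8081 port mapping above; on a fresh cluster the list will simply be empty):

```python
import json
import urllib.request

# Probe the Schema Registry REST API once the stack is up.
url = "http://localhost:8081/subjects"
try:
    with urllib.request.urlopen(url, timeout=2) as resp:
        subjects = json.load(resp)
        print("registered subjects:", subjects)
except OSError:
    subjects = None
    print("schema registry is not reachable at", url)
```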