The default DC/OS Apache Kafka installation provides reasonable defaults for trying out the service, but may not be sufficient for production use. You may require a different configuration depending on the context of the deployment.
Installing with Custom Configuration
The following are some examples of how to customize the installation of your Apache Kafka instance.
In each case, you would create a new Apache Kafka instance using the custom configuration as follows:
dcos package install kafka --options=sample-kafka.json
We recommend that you store your custom configuration in source control.
Installing multiple instances
By default, the Apache Kafka service is installed with a service name of kafka. You may specify a different name using a custom service configuration as follows:
{
"service": {
"name": "kafka-other"
}
}
When the above JSON configuration is passed to the package install kafka command via the --options argument, the new service will use the name specified in that JSON configuration:
dcos package install kafka --options=kafka-other.json
Multiple instances of Apache Kafka may be installed into your DC/OS cluster by customizing the name of each instance. For example, you might have one instance of Apache Kafka named kafka-staging and another named kafka-prod, each with its own custom configuration.
After you specify a custom name for your instance, you can reach it using dcos kafka CLI commands or directly over HTTP as described below.
Installing into folders
In DC/OS 1.10 and later, services may be installed into folders by specifying a slash-delimited service name. For example:
{
"service": {
"name": "/foldered/path/to/kafka"
}
}
The above example will install the service under a path of foldered => path => to => kafka. It can then be reached using dcos kafka CLI commands or directly over HTTP as described below.
Addressing named instances
After you’ve installed the service under a custom name or under a folder, it may be accessed from all dcos kafka CLI commands using the --name argument. The --name value defaults to the name of the package, kafka.
For example, if you had an instance named kafka-dev, the following command would invoke a pod list command against it:
dcos kafka --name=kafka-dev pod list
The same query can be made over HTTP as follows:
curl -H "Authorization:token=$auth_token" <dcos_url>/service/kafka-dev/v1/pod
Likewise, if you had an instance in a folder like /foldered/path/to/kafka, the following command would invoke a pod list command against it:
dcos kafka --name=/foldered/path/to/kafka pod list
Similarly, it could be queried directly over HTTP as follows:
curl -H "Authorization:token=$auth_token" <dcos_url>/service/foldered/path/to/kafka/v1/pod
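The mapping from a service name to its HTTP path can be sketched in Python as follows. This is an illustrative helper, not part of the DC/OS CLI: the function names pod_list_url and auth_header and the cluster address https://dcos.example.com are assumptions, and the path layout is simply the one used in the curl commands above.

```python
def pod_list_url(dcos_url: str, service_name: str) -> str:
    """Build the /v1/pod URL for a named (optionally foldered) service."""
    # The service name is appended to /service/ without its leading slash.
    path = service_name.strip("/")
    return f"{dcos_url.rstrip('/')}/service/{path}/v1/pod"


def auth_header(auth_token: str) -> dict:
    """Header in the same form as the curl examples: Authorization:token=<token>."""
    return {"Authorization": f"token={auth_token}"}


# Hypothetical cluster address, used only for illustration.
print(pod_list_url("https://dcos.example.com", "kafka-dev"))
print(pod_list_url("https://dcos.example.com", "/foldered/path/to/kafka"))
```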
You may add a -v (verbose) argument to any dcos kafka command to see the underlying HTTP queries that are being made. This can be a useful tool to see where the CLI is getting its information. In practice, dcos kafka commands are a thin wrapper around an HTTP interface provided by the DC/OS Apache Kafka Service itself.
Integration with DC/OS access controls
In Enterprise DC/OS, DC/OS access controls can be used to restrict access to your service. To give a non-superuser complete access to a service, grant them the following list of permissions:
dcos:adminrouter:service:marathon full
dcos:service:marathon:marathon:<service-name> full
dcos:service:adminrouter:<service-name> full
dcos:adminrouter:ops:mesos full
dcos:adminrouter:ops:slave full
Where <service-name> is your full service name, including the folder if it is installed in one.
Service Settings
Placement Constraints
Placement constraints allow you to customize where a service is deployed in the DC/OS cluster. Placement constraints use the Marathon operators syntax. For example, [["hostname", "UNIQUE"]] ensures that at most one pod instance is deployed per agent.
A common task is to specify a list of whitelisted systems to deploy to. To achieve this, use the following syntax for the placement constraint:
[["hostname", "LIKE", "10.0.0.159|10.0.1.202|10.0.3.3"]]
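Such a constraint can also be generated from a list of hosts. The following Python sketch is illustrative only: the helper name hostname_allowlist_constraint is an assumption, and it simply joins the addresses with | in the same form as the constraint above. Because the LIKE value is a regular expression, the dots in each address match any character; the sketch keeps the unescaped form used above.

```python
import json


def hostname_allowlist_constraint(hosts):
    """Return a Marathon-style LIKE constraint restricting placement to `hosts`."""
    if not hosts:
        raise ValueError("at least one host is required")
    # The alternatives are joined with "|", as in the example constraint.
    return json.dumps([["hostname", "LIKE", "|".join(hosts)]])


print(hostname_allowlist_constraint(["10.0.0.159", "10.0.1.202", "10.0.3.3"]))
```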
Updating Placement Constraints
Clusters change, and as such so will your placement constraints. However, already running service pods will not be affected by changes in placement constraints. This is because altering a placement constraint might invalidate the current placement of a running pod, and the pod will not be relocated automatically as doing so is a destructive action. We recommend using the following procedure to update the placement constraints of a pod:
- Update the placement constraint definition in the service.
- For each affected pod, one at a time, perform a pod replace. This will (destructively) move the pod to be in accordance with the new placement constraints.
Enterprise
Zones
Requires: DC/OS 1.11 Enterprise or later.
Placement constraints can be applied to DC/OS zones by referring to the @zone key. For example, one could spread pods across a minimum of three different zones by including this constraint:
[["@zone", "GROUP_BY", "3"]]
For the @zone constraint to be applied correctly, DC/OS must have Fault Domain Awareness enabled and configured.
Virtual networks
DC/OS Apache Kafka supports deployment on virtual networks on DC/OS (including the dcos overlay network), allowing each container (task) to have its own IP address and not use port resources on the agent machines. This can be specified by passing the following configuration during installation:
{
"service": {
"virtual_network_enabled": true
}
}
User
By default, all pod containers are started as the system user “nobody”. If your system is configured to use a different system user (for instance, you may have externally mounted persistent volumes with root’s permissions), you can specify that user by setting a custom value for the service’s “user” property, for example:
{
"service": {
"properties": {
"user": "root"
}
}
}
Regions
The service parameter region can be used to deploy the service in an alternate region. By default, the service is deployed in the “local” region, which is the region the DC/OS masters are running in. To install a service in a specific region, include the following in its options:
{
"service": {
"region": "<region>"
}
}
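Each fragment above shows a single setting, while the install command takes one options file. A minimal Python sketch of combining several fragments into one document follows. The helper merge_options, the instance name kafka-prod, and the region name us-west-2 are illustrative assumptions rather than anything defined by the service.

```python
import copy
import json


def merge_options(*fragments):
    """Recursively merge option fragments; later fragments win on conflicts."""
    merged = {}
    for fragment in fragments:
        for key, value in fragment.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are objects: merge their keys instead of replacing.
                merged[key] = merge_options(merged[key], value)
            else:
                merged[key] = copy.deepcopy(value)
    return merged


options = merge_options(
    {"service": {"name": "kafka-prod"}},
    {"service": {"virtual_network_enabled": True}},
    {"service": {"region": "us-west-2"}},  # hypothetical region name
)
print(json.dumps(options, indent=2))
```

The printed JSON can be saved to a file and passed to dcos package install kafka with the --options argument.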
Configuring the ZooKeeper Connection
Apache Kafka requires a running ZooKeeper ensemble to perform its own internal accounting. By default, the DC/OS Apache Kafka Service uses the ZooKeeper ensemble made available on the Mesos masters of a DC/OS cluster at master.mesos:2181/dcos-service-<servicename>. At install time, you can configure an alternate ZooKeeper for Kafka to use. This enables you to increase Kafka’s capacity and removes the DC/OS System ZooKeeper ensemble’s involvement in running it.
To configure an alternate ZooKeeper instance:
- Create a file named options.json with the following contents. If you are using the DC/OS Apache ZooKeeper service, use the DNS addresses provided by the dcos kafka-zookeeper endpoints clientport command as the value of kafka_zookeeper_uri.
Here is an example options.json which points to a kafka-zookeeper instance named kafka-zookeeper:
{
"kafka": {
"kafka_zookeeper_uri": "zookeeper-0-server.kafka-zookeeper.autoip.dcos.thisdcos.directory:1140,zookeeper-1-server.kafka-zookeeper.autoip.dcos.thisdcos.directory:1140,zookeeper-2-server.kafka-zookeeper.autoip.dcos.thisdcos.directory:1140"
}
}
- Install Kafka with the options file you created.
dcos package install kafka --options="options.json"
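Because kafka_zookeeper_uri is a single comma-separated string, it can be easier to assemble it from a list of endpoints than to edit it by hand. The Python sketch below assumes the endpoint addresses from the example above; the helper name zookeeper_options is hypothetical, and the addresses for a real cluster should be taken from the endpoints command.

```python
import json


def zookeeper_options(endpoints):
    """Build an options document whose kafka_zookeeper_uri lists `endpoints`."""
    if not endpoints:
        raise ValueError("at least one ZooKeeper endpoint is required")
    # Endpoints are joined with commas and no spaces, as in the example above.
    return {"kafka": {"kafka_zookeeper_uri": ",".join(endpoints)}}


# Addresses copied from the example; substitute the ones reported for your cluster.
endpoints = [
    f"zookeeper-{i}-server.kafka-zookeeper.autoip.dcos.thisdcos.directory:1140"
    for i in range(3)
]
print(json.dumps(zookeeper_options(endpoints), indent=2))
```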
You can also update an already-running Kafka instance from the DC/OS CLI, in case you need to migrate your ZooKeeper data elsewhere.
dcos kafka --name=kafka update start --options=options.json
Zone/Rack-Aware Placement and Replication
Kafka’s “rack”-based fault domain support is automatically enabled when specifying a placement constraint that uses the @zone key. For example, you could spread Kafka nodes across a minimum of three different zones/racks by specifying the constraint [["@zone", "GROUP_BY", "3"]]. When a placement constraint specifying @zone is used, Kafka nodes will be automatically configured with racks that match the names of the zones. If no placement constraint referencing @zone is configured, all nodes will be configured with a default rack of rack1.
In addition to placing the tasks on different zones/racks, the zone/rack information will be added to each Kafka broker’s broker.rack setting. This enables Kafka to ensure data is replicated between zones/racks and not to two nodes in the same zone/rack.
Extend the Kill Grace Period
When performing a requested restart or replace of a running broker, the Kafka service waits 30 seconds by default for the broker to exit before killing the process. This grace period may be customized via the brokers.kill_grace_period setting. For example, the DC/OS CLI can be used to increase the grace period to 60 seconds through a configuration update of a Kafka service instance named kafka.
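A minimal sketch of the options document for such an update is shown below in Python. It assumes that the dotted setting name brokers.kill_grace_period maps to a kill_grace_period key nested under brokers, in the same way other broker settings on this page are nested; the helper name kill_grace_period_options is hypothetical.

```python
import json


def kill_grace_period_options(seconds):
    """Build an options document that sets the broker kill grace period."""
    if seconds <= 0:
        raise ValueError("the grace period must be a positive number of seconds")
    # Assumed nesting: brokers.kill_grace_period -> {"brokers": {"kill_grace_period": ...}}
    return {"brokers": {"kill_grace_period": seconds}}


print(json.dumps(kill_grace_period_options(60), indent=2))
```

Saved as options.json, the document could then be applied with the update command shown earlier: dcos kafka --name=kafka update start --options=options.json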
During the configuration update, each of the Kafka broker tasks is restarted. During the shutdown portion of the task restart, the previous configuration value for brokers.kill_grace_period is in effect. Following the shutdown, each broker task is launched with the new effective configuration value. Make sure to monitor the amount of time Kafka brokers take to cleanly shut down by observing their logs.
Replacing a Broker with Grace
The grace period must also be respected when a broker is shut down before replacement. It is not ideal that a broker must respect the grace period even when it is about to lose its persistent state; this behavior will be improved in future versions of the SDK. Broker replacement generally requires complex and time-consuming reconciliation activities at startup if there was not a graceful shutdown, so respecting the kill grace period still provides value in most situations. We recommend setting the kill grace period only long enough to allow a graceful shutdown. Monitor the clean shutdown times in the broker logs to keep this value tuned to the scale of data flowing through the Kafka service.
Enterprise
Configuring Secure JMX
Apache Kafka supports Secure JMX, allowing you to remotely manage and monitor the Kafka JRE. This can be specified by passing the following configuration during installation:
{
"service": {
"jmx": {
"enabled": true,
"port": 31299,
"rmi_port": 31298,
"access_file": "<path_to_secret>",
"password_file": "<path_to_secret>",
"key_store": "<path_to_secret>",
"key_store_password_file": "<path_to_secret>",
"add_trust_store": true,
"trust_store": "<path_to_secret>",
"trust_store_password_file": "<path_to_secret>"
}
}
}
Refer to Secure JMX for a more detailed configuration process.
Configuring Volume Profiles
DC/OS Storage Service (DSS) is a service that manages volumes, volume profiles, volume providers, and storage devices in a DC/OS cluster.
Volume profiles are used to classify volumes. For example, users can group SSDs into a “fast” profile and group HDDs into a “slow” profile.
Once the DC/OS cluster is running and volume profiles are created, you can deploy Kafka with the following configs:
cat > kafka-options.json <<EOF
{
"brokers": {
"volume_profile": "kafka",
"disk_type": "MOUNT"
}
}
EOF
dcos package install kafka --options=kafka-options.json
Once the Kafka service finishes deploying, its tasks will be running with the specified volume profiles.
dcos kafka update status
deploy (serial strategy) (COMPLETE)
└─ broker (serial strategy) (COMPLETE)
├─ kafka-0:[broker] (COMPLETE)
├─ kafka-1:[broker] (COMPLETE)
└─ kafka-2:[broker] (COMPLETE)
Configuring Service Health Checks
DC/OS Apache Kafka supports service-oriented health checks, allowing you to monitor your service health in detail. This can be specified by passing the following configuration during installation:
{
"service": {
"name": "kafka",
"health_check": {
"enabled": true,
"method": "<PORT or FUNCTIONAL>",
"interval": 60,
"delay": 20,
"timeout": 60,
"grace-period": 30,
"max-consecutive-failures": 3,
"health-check-topic-prefix": "KafkaHealthCheckTopic"
}
}
}
Refer to Service Health Checks for a more detailed configuration process.
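The health check options can also be assembled programmatically, which makes it easy to reject an unsupported method before installing. The Python sketch below is illustrative: the JSON keys and default values are copied from the example above, while the function name health_check_options and its parameter names are assumptions.

```python
import json

ALLOWED_METHODS = ("PORT", "FUNCTIONAL")


def health_check_options(method, name="kafka", interval=60, delay=20, timeout=60,
                         grace_period=30, max_consecutive_failures=3,
                         topic_prefix="KafkaHealthCheckTopic"):
    """Build a health check options document using the keys from the example."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {ALLOWED_METHODS}, got {method!r}")
    return {
        "service": {
            "name": name,
            "health_check": {
                "enabled": True,
                "method": method,
                "interval": interval,
                "delay": delay,
                "timeout": timeout,
                "grace-period": grace_period,
                "max-consecutive-failures": max_consecutive_failures,
                "health-check-topic-prefix": topic_prefix,
            },
        }
    }


print(json.dumps(health_check_options("PORT"), indent=2))
```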