The default DC/OS DataStax Enterprise installation provides reasonable defaults for trying out the service, but may not be sufficient for production use. You may require a different configuration depending on the context of the deployment.
Installing with Custom Configuration
The following are some examples of how to customize the installation of your DataStax Enterprise instance.
In each case, you would create a new DataStax Enterprise instance using the custom configuration as follows:
dcos package install datastax-dse --options=sample-datastax-dse.json
We recommend that you store your custom configuration in source control.
Installing multiple instances
By default, the DataStax Enterprise service is installed with a service name of datastax-dse. You may specify a different name using a custom service configuration as follows:
{
  "service": {
    "name": "datastax-dse-other"
  }
}
When the above JSON configuration is passed to the package install datastax-dse command via the --options argument, the new service will use the name specified in that JSON configuration:
dcos package install datastax-dse --options=datastax-dse-other.json
Multiple instances of DataStax Enterprise may be installed into your DC/OS cluster by customizing the name of each instance. For example, you might have one instance of DataStax Enterprise named datastax-dse-staging and another named datastax-dse-prod, each with its own custom configuration.
After you specify a custom name for your instance, it can be reached using dcos datastax-dse CLI commands or directly over HTTP as described below.
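As a minimal sketch of the staging/prod example above, the following Python snippet generates one options document per instance name. The helper name options_json is hypothetical; only the service.name key comes from this guide, and any further settings would need to be added by hand.

```python
import json


def options_json(service_name):
    """Return install options (as JSON text) that override only the service name."""
    return json.dumps({"service": {"name": service_name}}, indent=2)


if __name__ == "__main__":
    # Print one options document per instance, plus the matching install command.
    for name in ("datastax-dse-staging", "datastax-dse-prod"):
        print(f"# save as {name}.json")
        print(options_json(name))
        print(f"dcos package install datastax-dse --options={name}.json")
```

Each printed document can be saved under the indicated file name and kept in source control alongside the rest of your configuration.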
Installing into folders
In DC/OS 1.10 and later, services may be installed into folders by specifying a slash-delimited service name. For example:
{
  "service": {
    "name": "/foldered/path/to/datastax-dse"
  }
}
The above example will install the service under a path of foldered => path => to => datastax-dse. It can then be reached using dcos datastax-dse CLI commands or directly over HTTP as described below.
Addressing named instances
After you’ve installed the service under a custom name or under a folder, it may be accessed from all dcos datastax-dse CLI commands using the --name argument. The --name value defaults to the name of the package, datastax-dse.
For example, if you had an instance named datastax-dse-dev, the following command would invoke a pod list command against it:
dcos datastax-dse --name=datastax-dse-dev pod list
The same query could be made over HTTP as follows:
curl -H "Authorization:token=$auth_token" <dcos_url>/service/datastax-dse-dev/v1/pod
Likewise, if you had an instance in a folder like /foldered/path/to/datastax-dse, the following command would invoke a pod list command against it:
dcos datastax-dse --name=/foldered/path/to/datastax-dse pod list
Similarly, it could be queried directly over HTTP as follows:
curl -H "Authorization:token=$auth_token" <dcos_url>/service/foldered/path/to/datastax-dse/v1/pod
You may add a -v (verbose) argument to any dcos datastax-dse command to see the underlying HTTP queries that are being made. This can be a useful tool to see where the CLI is getting its information. In practice, dcos datastax-dse commands are a thin wrapper around an HTTP interface provided by the DC/OS DataStax Enterprise Service itself.
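The mapping from a service name to its HTTP endpoint can be sketched as follows. This only reproduces the URL pattern shown in the curl examples of this section; the function name pod_list_url and the example cluster URL are hypothetical.

```python
def pod_list_url(dcos_url, service_name):
    """Build the pod list URL for a named or foldered service instance.

    Leading and trailing slashes in the service name are dropped so that
    "/foldered/path/to/datastax-dse" and "datastax-dse-dev" both work.
    """
    base = dcos_url.rstrip("/")
    path = service_name.strip("/")
    return f"{base}/service/{path}/v1/pod"


if __name__ == "__main__":
    # "https://dcos.example.com" is a placeholder for your cluster URL.
    print(pod_list_url("https://dcos.example.com", "datastax-dse-dev"))
    print(pod_list_url("https://dcos.example.com", "/foldered/path/to/datastax-dse"))
```

The request itself still needs the Authorization header shown in the curl examples.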
Integration with DC/OS access controls
In Enterprise DC/OS, DC/OS access controls can be used to restrict access to your service. To give a non-superuser complete access to a service, grant them the following list of permissions:
dcos:adminrouter:service:marathon full
dcos:service:marathon:marathon:<service-name> full
dcos:service:adminrouter:<service-name> full
dcos:adminrouter:ops:mesos full
dcos:adminrouter:ops:slave full
Where <service-name> is your full service name, including the folder if it is installed in one.
DataStax OpsCenter
The DC/OS DataStax OpsCenter can be installed from the datastax-ops package. It is managed identically to datastax-dse. This guide primarily covers datastax-dse for conciseness. See the later sections of the guide for any configuration specifics of DC/OS DataStax OpsCenter.
Service Settings
Placement Constraints
Placement constraints allow you to customize where a service is deployed in the DC/OS cluster. Placement constraints use the Marathon operators syntax. For example, [["hostname", "UNIQUE"]] ensures that at most one pod instance is deployed per agent.
A common task is to specify a list of whitelisted systems to deploy to. To achieve this, use the following syntax for the placement constraint:
[["hostname", "LIKE", "10.0.0.159|10.0.1.202|10.0.3.3"]]
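When the list of allowed agents is maintained elsewhere (an inventory file, for example), the constraint string can be generated rather than typed by hand. The sketch below assumes the hosts are given as plain strings; the function name hostname_whitelist is hypothetical.

```python
import json


def hostname_whitelist(hosts):
    """Return a Marathon-style LIKE constraint matching any of the given hosts."""
    if not hosts:
        raise ValueError("at least one host is required")
    # The hosts are joined with "|" exactly as in the example above.
    return json.dumps([["hostname", "LIKE", "|".join(hosts)]])


if __name__ == "__main__":
    print(hostname_whitelist(["10.0.0.159", "10.0.1.202", "10.0.3.3"]))
```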
Updating Placement Constraints
Clusters change, and as such so will your placement constraints. However, already running service pods will not be affected by changes in placement constraints. This is because altering a placement constraint might invalidate the current placement of a running pod, and the pod will not be relocated automatically as doing so is a destructive action. We recommend using the following procedure to update the placement constraints of a pod:
- Update the placement constraint definition in the service.
- For each affected pod, one at a time, perform a pod replace. This will (destructively) move the pod to be in accordance with the new placement constraints.
Zones
Requires: DC/OS 1.11 Enterprise or later.
Placement constraints can be applied to DC/OS zones by referring to the @zone key. For example, one could spread pods across a minimum of three different zones by including this constraint:
[["@zone", "GROUP_BY", "3"]]
For the @zone constraint to be applied correctly, DC/OS must have Fault Domain Awareness enabled and configured.
Virtual networks
DC/OS DataStax Enterprise supports deployment on virtual networks on DC/OS (including the dcos overlay network), allowing each container (task) to have its own IP address and not use port resources on the agent machines. This can be specified by passing the following configuration during installation:
{
  "service": {
    "virtual_network_enabled": true
  }
}
User
By default, all pods’ containers will be started as the system user “nobody”. If your system is configured to use a different system user (for instance, you may have externally mounted persistent volumes with root’s permissions), you can specify that user by setting a custom value for the service’s “user” property, for example:
{
  "service": {
    "properties": {
      "user": "root"
    }
  }
}
Best Practices
- Use Mesosphere Enterprise DC/OS’s placement rules to map your DSE cluster nodes or DC to different availability zones to achieve high resiliency.
- Set up a routine backup service using OpsCenter to back up your business critical data on a regular basis. The data can be stored on the DSE nodes themselves, or on AWS S3 buckets, depending on your IT policy or business needs.
- Set up a routine repair service using OpsCenter to ensure that all data on a replica is consistent within your DSE clusters.
Node Settings
The following settings may be adjusted to customize the amount of resources allocated to each DSE Node. DataStax’s minimum system requirements must be taken into consideration when adjusting these values. Reducing these values may result in adverse performance and possibly even task failures.
Each of the following settings may be customized under the dsenode configuration section.
Node Count
Customize the Node Count setting (default 3). Consult the DSE documentation for minimum node requirements.
CPU
The amount of CPU allocated to each DSE Node may be customized. A value of 1.0 equates to one full CPU core on a machine. This value may be customized by editing the cpus value under the dsenode configuration section.
Memory
The amount of RAM allocated to each DSE Node may be customized. This value may be customized by editing the mem value (in MB) under the dsenode configuration section.
If the allocated memory is customized, you must also update the heap value under that section. As a rule of thumb we recommend that heap be set to half of mem. For example, for a mem value of 32000, heap should be 16000. If you do not do this, you may see restarted dse-#-node tasks due to memory errors.
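The rule of thumb above can be captured in a small helper so that mem and heap are always changed together. This is a sketch only: the function name is hypothetical, and the nesting of mem and heap directly under a dsenode key is an assumption based on the section names used in this guide, so check it against the options schema of your package version.

```python
def dsenode_memory_options(mem_mb):
    """Return a dsenode options fragment with heap set to half of mem (both in MB)."""
    if mem_mb <= 0:
        raise ValueError("mem_mb must be positive")
    # Assumed layout: "mem" and "heap" sit directly under "dsenode".
    return {"dsenode": {"mem": mem_mb, "heap": mem_mb // 2}}


if __name__ == "__main__":
    print(dsenode_memory_options(32000))
```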
Ports
Each port exposed by DSE components may be customized via the service configuration. If you wish to install multiple instances of DSE and have them colocate on the same machines, you must ensure that no ports are common between those instances. You only need to customize ports if you require multiple instances sharing a single machine. This customization is optional otherwise.
Each component’s ports may be customized in the following configuration sections:
- DSE Nodes (as a group): port settings under dsenode.
- OpsCenter (if built-in instance is enabled): port settings under OpsCenter.
Storage Volumes
The DSE DC/OS service supports two volume types:
- ROOT volumes are effectively an isolated directory on the root volume, sharing IO/spindles with the rest of the host system.
- MOUNT volumes are a dedicated device or partition on a separate volume, with dedicated IO/spindles.
MOUNT volumes require additional configuration on each DC/OS agent system, so the service currently uses ROOT volumes by default. To ensure reliable and consistent performance in a production environment, you must configure two MOUNT volumes on the machines which will run DSE in your cluster, and then configure the following as MOUNT volumes under dsenode:
- Persistent data volume type = MOUNT
- Persistent Solr volume type (if DSE Search is enabled) = MOUNT
Using ROOT volumes for these is not supported in production.
Separate volume for commit log data
If you are using non-magnetic disks, then a good approach is to keep your commit log data files on the same volume as your DSE data. This is the default configuration. If you need to keep commit log data on a separate volume, you can do so. The service install provides options for enabling that feature and provisioning a separate mount point for commit log data. Be aware that if you choose to use a separate volume, you will not be able to change it back later.
Placement Constraints
Placement constraints allow you to customize where a DSE instance is deployed in the DC/OS cluster. Placement constraints may be configured separately for each of the node types in the following locations:
- DSE Nodes (as a group): Node placement constraint under dsenode.
- OpsCenter (if built-in instance is enabled): OpsCenter placement constraint under OpsCenter.
Placement constraints support all Marathon operators (reference) with this syntax: field:OPERATOR[:parameter]. For example, if the reference lists [["hostname", "UNIQUE"]], you should use hostname:UNIQUE.
A common task is to specify a list of whitelisted systems to deploy to. To achieve this, use the following syntax for the placement constraint:
hostname:LIKE:10.0.0.159|10.0.1.202|10.0.3.3
You must include spare capacity in this list so that if one of the whitelisted systems goes down, there is still enough room to repair your service without that system.
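Converting a constraint from the Marathon list form into the field:OPERATOR[:parameter] form described above is a mechanical join. The sketch below handles a single constraint only; the function name to_colon_syntax is hypothetical.

```python
def to_colon_syntax(constraint):
    """Convert one Marathon constraint, e.g. ["hostname", "UNIQUE"], to colon syntax."""
    if not 2 <= len(constraint) <= 3:
        raise ValueError("expected [field, operator] or [field, operator, parameter]")
    return ":".join(str(part) for part in constraint)


if __name__ == "__main__":
    print(to_colon_syntax(["hostname", "UNIQUE"]))
    print(to_colon_syntax(["hostname", "LIKE", "10.0.0.159|10.0.1.202|10.0.3.3"]))
```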
Rack-Aware Placement
DSE’s “rack”-based fault domain support may be enabled by specifying a placement constraint that uses the @zone key. For example, you could spread DSE nodes across a minimum of three different zones/racks by specifying the constraint @zone:GROUP_BY:3. When a placement constraint specifying @zone is used, DSE nodes will be automatically configured with racks that match the names of the zones. If no placement constraint referencing @zone is configured, all nodes will be configured with a default rack of rack1.
dse.yaml and cassandra.yaml settings
Nearly all settings for dse.yaml and cassandra.yaml are exposed as configuration options, allowing them to be deployed and updated automatically by the service.
- dse.yaml options are listed under the DSE section
- cassandra.yaml options are listed under the Cassandra section
For more information on each setting, view DataStax’s documentation for dse.yaml and cassandra.yaml.
Use Built-In or External OpsCenter
DSE DC/OS provides the datastax-ops package, which you can install to get a default OpsCenter dashboard. If you prefer to use an external OpsCenter instance, you can configure the DSE service to point to an externally managed OpsCenter.
Follow these steps to configure DSE to use an external OpsCenter (in the OpsCenter section of the DSE installation screen).
- Check the ENABLE DATASTAX OPSCENTER checkbox.
- Set the OPSCENTER HOST NAME field to the hostname of your external OpsCenter instance.
If you choose to run an instance of the datastax-ops package, this field can be populated as opscenter-0-node.<service-name>.autoip.dcos.thisdcos.directory. For example, opscenter-0-node.datastax-ops-1.autoip.dcos.thisdcos.directory if the datastax-ops service is named datastax-ops-1.
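The hostname pattern above can be filled in programmatically, for example when several environments each run their own datastax-ops instance. This sketch covers only plain (non-foldered) service names, since this guide does not show the pattern for foldered names; the function name is hypothetical.

```python
def opscenter_hostname(service_name):
    """Return the OpsCenter hostname for a datastax-ops service with the given name."""
    if "/" in service_name:
        # Foldered names are not covered by the pattern shown in this guide.
        raise ValueError("foldered service names are not handled by this sketch")
    return f"opscenter-0-node.{service_name}.autoip.dcos.thisdcos.directory"


if __name__ == "__main__":
    print(opscenter_hostname("datastax-ops-1"))
```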
Installation with DSE Multi-Datacenter
Each DSE Datacenter must be configured with the seed nodes of the other DSE Datacenters. For example, let’s deploy three Datacenters (as separate DC/OS services in DC/OS terms), named datastax-dse-1, datastax-dse-2, and datastax-dse-3, and then link them all together. Here is an example timeline from start to finish:
DC/OS 1.10 and later
Follow these instructions for DC/OS 1.10 and later. If you are using DC/OS 1.9 or earlier, follow the instructions in the “DC/OS 1.9 and earlier” section instead.
- Add a DSE service from the DC/OS Catalog. Deploy datastax-dse-1 with the following customizations:
  - In service, set Service Name = datastax-dse-1
  - In cluster, set DSE Datacenter = dc_datastax_1
- Wait for datastax-dse-1 to finish deploying before continuing with the other DCs.
- Add a second DSE service from the DC/OS Catalog. Deploy datastax-dse-2 with the following customizations:
  - In service, set:
    - Service Name = datastax-dse-2
  - In cluster, set:
    - DSE Datacenter = dc_datastax_2
    - External Seed Nodes = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-2 to datastax-dse-1's seeds)
  - In OpsCenter, set:
    - OPSCENTER HOSTNAME = opscenter-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-2 to datastax-dse-1's OpsCenter)
- Add a third DSE service from the DC/OS Catalog. Deploy datastax-dse-3 with the following customizations:
  - In service, set:
    - Service Name = datastax-dse-3
  - In cluster, set:
    - DSE Datacenter = dc_datastax_3
    - External Seed Nodes = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-3 to datastax-dse-1's seeds)
  - In OpsCenter, set:
    - OPSCENTER HOSTNAME = opscenter-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-3 to datastax-dse-1's OpsCenter)
- Wait for datastax-dse-2 and datastax-dse-3 to finish deploying. Then, update the seed nodes across all the instances:
  - Create a local file called dse-1-options.json. Paste the following into the file.
    {
      "cluster": {
        "external_seeds": "dse-0-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-3.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-3.autoip.dcos.thisdcos.directory"
      }
    }
    This points datastax-dse-1 to datastax-dse-2 and datastax-dse-3.
  - From the DC/OS CLI, update the service to the new configuration. Because the service is not installed under the default name, pass its name with --name:
    dcos datastax-dse --name=datastax-dse-1 update start --options=dse-1-options.json
  - Wait for the seed update to roll out across datastax-dse-1 nodes.
  - Perform the same operation for datastax-dse-2 and datastax-dse-3.
    - For datastax-dse-2, set cluster.external_seeds to dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-3.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-3.autoip.dcos.thisdcos.directory.
    - For datastax-dse-3, set cluster.external_seeds to dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-2.autoip.dcos.thisdcos.directory.
- Now, each of the three DCs has seed nodes configured for the other DCs. Because we used .autoip.dcos.thisdcos.directory hostnames, which automatically update to follow the tasks, we won’t need to reconfigure seeds if they’re moved between systems in the DC/OS cluster.
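Because each datacenter's external seed list is simply the first two nodes of every other datacenter, the lists can be generated instead of assembled by hand, which helps avoid copy-and-paste mistakes in long hostname strings. The sketch below assumes two seeds per datacenter (dse-0-node and dse-1-node, as in the walkthrough) and plain, non-foldered service names; the function names are hypothetical.

```python
DOMAIN = "autoip.dcos.thisdcos.directory"


def seed_hosts(service_name, seeds_per_dc=2):
    """Return the hostnames of the first seeds_per_dc nodes of one DSE service."""
    return [f"dse-{i}-node.{service_name}.{DOMAIN}" for i in range(seeds_per_dc)]


def external_seeds(own_service, all_services, seeds_per_dc=2):
    """Return the comma-separated seeds of every service except own_service."""
    hosts = []
    for service in all_services:
        if service != own_service:
            hosts.extend(seed_hosts(service, seeds_per_dc))
    return ",".join(hosts)


if __name__ == "__main__":
    services = ["datastax-dse-1", "datastax-dse-2", "datastax-dse-3"]
    for service in services:
        print(f"{service}: {external_seeds(service, services)}")
```

Each printed value is what would go into cluster.external_seeds for the corresponding service.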
DC/OS 1.9 and earlier
- Add a DSE service from the DC/OS Catalog. Deploy datastax-dse-1 with the following customizations:
  - In service, set Service Name = datastax-dse-1
  - In cluster, set DSE Datacenter = dc_datastax_1
- Wait for datastax-dse-1 to finish deploying before continuing with the other DCs.
- Add a second DSE service from the DC/OS Catalog. Deploy datastax-dse-2 with the following customizations:
  - In service, set:
    - Service Name = datastax-dse-2
  - In cluster, set:
    - DSE Datacenter = dc_datastax_2
    - External Seed Nodes = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-2 to datastax-dse-1's seeds)
  - In OpsCenter, set:
    - OPSCENTER HOSTNAME = opscenter-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-2 to datastax-dse-1's OpsCenter)
- Add a third DSE service from the DC/OS Catalog. Deploy datastax-dse-3 with the following customizations:
  - In service, set:
    - Service Name = datastax-dse-3
  - In cluster, set:
    - DSE Datacenter = dc_datastax_3
    - External Seed Nodes = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-3 to datastax-dse-1's seeds)
  - In OpsCenter, set:
    - OPSCENTER HOSTNAME = opscenter-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory (point datastax-dse-3 to datastax-dse-1's OpsCenter)
- Wait for datastax-dse-2 and datastax-dse-3 to finish deploying. Then, update the seed nodes across all the instances:
  - Go to the service view of datastax-dse-1 in the DC/OS UI. Click the menu in the upper right and then choose Edit. Go to the Environment tab and set DSE_EXTERNAL_SEEDS = dse-0-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-3.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-3.autoip.dcos.thisdcos.directory (point datastax-dse-1 to datastax-dse-2 and datastax-dse-3).
  - Wait for the seed update to roll out across datastax-dse-1 nodes.
  - Go to the service view of datastax-dse-2 in the DC/OS UI. Click the menu in the upper right and then choose Edit. Go to the Environment tab and update DSE_EXTERNAL_SEEDS = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-3.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-3.autoip.dcos.thisdcos.directory (point datastax-dse-2 to datastax-dse-1 and datastax-dse-3).
  - Wait for the seed update to roll out across datastax-dse-2 nodes.
  - Go to the service view of datastax-dse-3 in the DC/OS UI. Click the menu in the upper right and then choose Edit. Go to the Environment tab and update DSE_EXTERNAL_SEEDS = dse-0-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-1.autoip.dcos.thisdcos.directory,dse-0-node.datastax-dse-2.autoip.dcos.thisdcos.directory,dse-1-node.datastax-dse-2.autoip.dcos.thisdcos.directory (point datastax-dse-3 to datastax-dse-1 and datastax-dse-2).
  - Wait for the seed update to roll out across datastax-dse-3 nodes.
- Now, each of the three DCs has seed nodes configured for the other DCs. Because we used .autoip.dcos.thisdcos.directory hostnames, which automatically update to follow the tasks, we won’t need to reconfigure seeds if they’re moved between systems in the DC/OS cluster.
Using Volume Profiles
Volume profiles are used to classify volumes. For example, users can group SSDs into a “fast” profile and group HDDs into a “slow” profile.
DC/OS Storage Service (DSS) is a service that manages volumes, volume profiles, volume providers, and storage devices in a DC/OS cluster.
If you want to deploy DSE with DSS, please follow our tutorial for Cassandra and use the same procedure to deploy DSE.
After the DC/OS cluster is running and volume profiles are created, you can deploy DSE with the volume profile.
After the DSE service finishes deploying, its tasks will be running with the specified volume profiles.