This section describes how to configure secure DC/OS service accounts for Spark.
When running in DC/OS strict security mode, both the dispatcher and jobs must authenticate to Mesos using a DC/OS service account.
Provisioning a service account
This section describes how to configure DC/OS access for Apache Spark. Depending on your security mode, Apache Spark may require service authentication for access to DC/OS.
A service like Apache Spark typically performs privileged actions on the cluster, which require authenticating with the cluster. A service account associated with the service is used to authenticate with the DC/OS cluster. It is recommended to provision a separate service account for each service that performs privileged operations. Service accounts authenticate using a public-private key pair: the public key is used to create the service account in the cluster, while the corresponding private key is stored in the secret store. The service account and the service account secret are passed to the service as install-time options.
Security mode | Service account |
---|---|
Disabled | Not available |
Permissive | Optional |
Strict | Required |
If you install a service in permissive mode and do not specify a service account, Metronome and Marathon will act as if requests made by this service are made by an account with the superuser permission.
Prerequisites:
- DC/OS CLI installed and logged in as a superuser.
- Enterprise DC/OS CLI 0.4.14 or later installed.
Create a Key Pair
In this step, a 2048-bit RSA public-private key pair is created using the Enterprise DC/OS CLI.
Create a public-private key pair and save each value into a separate file within the current directory.
dcos security org service-accounts keypair <private-key>.pem <public-key>.pem
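For example, to write the key pair to files named `spark-private.pem` and `spark-public.pem` (the file names here are arbitrary):

```bash
# Generates a 2048-bit RSA key pair; the private key stays local for now.
dcos security org service-accounts keypair spark-private.pem spark-public.pem
```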
Create a Service Account
From a terminal prompt, create a new service account (for example, `spark`) containing the public key (`<your-public-key>.pem`):
dcos security org service-accounts create -p <your-public-key>.pem -d <description> spark
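For instance, using the key pair generated above (the description text is free-form):

```bash
dcos security org service-accounts create -p spark-public.pem -d "Spark service account" spark
```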
You can verify your new service account using the following command.
dcos security org service-accounts show spark
Create a Secret
Create a secret (`spark/<secret-name>`) containing your service account and private key (`<private-key>.pem`):
dcos security secrets create-sa-secret <private-key>.pem <service-account-id> spark/<secret-name>
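Continuing the example above, with the service account `spark` and an assumed secret name of `sa-secret`:

```bash
# Stores the private key in the secret store under spark/sa-secret.
dcos security secrets create-sa-secret spark-private.pem spark spark/sa-secret
```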
You can list the secrets with this command:
dcos security secrets list /
Create and assign permissions
You can use `curl` commands to rapidly provision the Spark service account with the required permissions; alternatively, you can provision the service account through the UI.
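The exact permission set depends on your DC/OS version and service configuration. As a sketch, granting the `spark` account permission to register as a Mesos framework might look like the following; the permission string and the `dcos-ca.crt` path are assumptions here, so consult the DC/OS permissions reference for the authoritative list:

```bash
# Illustrative only: allow the spark service account to register as a
# Mesos framework with the default role (*).
curl -X PUT --cacert dcos-ca.crt \
  -H "Authorization: token=$(dcos config show core.dcos_acs_token)" \
  "$(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:framework:role:*/users/spark/create"
```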
When running in strict security mode, both the dispatcher and jobs must authenticate to Mesos using this service account. Follow these instructions to authenticate in strict mode.
Using the secret store
DC/OS Enterprise allows users to add privileged information in the form of a file to the DC/OS secret store. These files can be referenced in Spark jobs and used for authentication and authorization with external services (for example, HDFS). You can use this functionality to pass Kerberos `keytab` files. For details about how to use secrets, see Understanding Secrets.
Where to place secrets
For a secret to be available to Spark, it must be placed in a path that can be accessed by the Spark service. If only Spark requires access to a secret, you can store the secret in a path that matches the name of the Spark service (for example, `spark/secret`). See the Secrets documentation about spaces for details about how secret paths restrict service access to secrets.
Limitations
Anyone with access to the Spark (Dispatcher) service instance has access to all secrets available to it. Do not grant users access to the Spark Dispatcher instance unless they are also permitted to access all secrets available to it.
Binary secrets
You can store binary files, like a Kerberos keytab, in the DC/OS secret store. In DC/OS 1.11 and later, you can create secrets from binary files directly. In DC/OS 1.10 or earlier, files must be base64-encoded, as specified in RFC 4648, before being stored as secrets.
DC/OS 1.11 and later
To create a secret called `mysecret` with the binary contents of `krb5.keytab`, run the following command:
dcos security secrets create --file krb5.keytab mysecret
DC/OS 1.10 or earlier
To create a secret called `mysecret` with the binary contents of `krb5.keytab`, first encode the file using the `base64` command-line utility. The following example uses BSD `base64` (the default on macOS):
base64 -i krb5.keytab -o krb5.keytab.base64-encoded
GNU `base64` (the default on Linux) inserts line feeds into the encoded data by default. Disable line-wrapping with the `-w 0` argument:
base64 -w 0 krb5.keytab > krb5.keytab.base64-encoded
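As an optional sanity check (assuming GNU tools), you can verify that decoding the encoded file reproduces the original bytes:

```bash
# Prints a confirmation only if the decoded output matches the original file.
base64 -d krb5.keytab.base64-encoded | cmp - krb5.keytab && echo "round-trip OK"
```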
Now that the file is encoded, it can be stored as a secret.
dcos security secrets create -f krb5.keytab.base64-encoded some/path/__dcos_base64__mysecret
When the `some/path/__dcos_base64__mysecret` secret is referenced in your `dcos spark run` command, its base64-decoded contents are made available as a temporary file in your Spark application.
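For example, a job could reference the secret using the secret configuration parameters described in the next section; the target file name here is illustrative:

```bash
dcos spark run --submit-args="\
--conf spark.mesos.containerizer=mesos \
--conf spark.mesos.driver.secret.names=some/path/__dcos_base64__mysecret \
--conf spark.mesos.driver.secret.filenames=krb5.keytab \
--class <Spark Main class> <Spark Application JAR>"
```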
Using Mesos secrets
Once a secret has been added to the secret store, you can pass it to Spark with the `spark.mesos.<task-name>.secret.names` and `spark.mesos.<task-name>.secret.<filenames|envkeys>` configuration parameters, where `<task-name>` is either `driver` or `executor`. Specifying `filenames` or `envkeys` identifies the secret as either a file-based secret or an environment variable. These configuration parameters take comma-separated lists that are "zipped" together to form the final secret files or environment variables. Use file-based secrets whenever possible, because they are more secure than environment-variable secrets.
To use the Mesos containerizer, add this configuration:
--conf spark.mesos.containerizer=mesos
For example, to use a secret named `spark/my-secret-file` as a file in the driver and the executors, add these configuration parameters:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=spark/my-secret-file
--conf spark.mesos.driver.secret.filenames=target-secret-file
--conf spark.mesos.executor.secret.names=spark/my-secret-file
--conf spark.mesos.executor.secret.filenames=target-secret-file
These settings put the contents of the secret `spark/my-secret-file` in a secure RAM-FS-mounted secret file named `target-secret-file` in the driver's and executors' sandboxes. If you want to use a secret as an environment variable instead (for example, for AWS credentials), change the configuration to be similar to the following:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=/spark/my-aws-secret,/spark/my-aws-key
--conf spark.mesos.driver.secret.envkeys=AWS_SECRET_ACCESS_KEY,AWS_ACCESS_KEY_ID
These example settings assume a secret access key stored in a secret named `spark/my-aws-secret` and a secret key ID stored in `spark/my-aws-key`.
Limitations
When using a combination of environment-variable and file-based secrets, there must be an equal number of sinks and secret sources: the comma-separated lists of names, file names, and environment keys must all be the same length. For example:
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driver.secret.names=/spark/my-secret-file,/spark/my-secret-envvar
--conf spark.mesos.driver.secret.filenames=target-secret-file,placeholder-file
--conf spark.mesos.driver.secret.envkeys=PLACEHOLDER,SECRET_ENVVAR
This configuration places the contents of `spark/my-secret-file` into both the `PLACEHOLDER` environment variable and the `target-secret-file` file, and the contents of `spark/my-secret-envvar` into both the `SECRET_ENVVAR` environment variable and the `placeholder-file` file. In the case of binary secrets, the environment variable is left empty because environment variables cannot hold binary values.
Spark SSL
SSL support in DC/OS Apache Spark encrypts the following channels:
- From the DC/OS admin router to the dispatcher.
- Files served from the drivers to their executors.
To enable SSL, a Java keystore (and, optionally, a truststore) must be provided, along with their passwords. The first three settings below are required during job submission. If you are using a truststore, the last two are also required:
Variable | Description |
---|---|
`--keystore-secret-path` | Path to the keystore in the secret store |
`--keystore-password` | The password used to access the keystore |
`--private-key-password` | The password for the private key |
`--truststore-secret-path` | Path to the truststore in the secret store |
`--truststore-password` | The password used to access the truststore |
In addition, there are a number of Spark configuration variables relevant to SSL setup. These configuration settings are optional:
Variable | Description | Default value |
---|---|---|
`spark.ssl.enabledAlgorithms` | Allowed ciphers | JVM defaults |
`spark.ssl.protocol` | Protocol | TLS |
The keystore and truststore are created using the Java `keytool` utility. The keystore must contain one private key and its signed public key. The truststore is optional and might contain a self-signed root CA certificate that is explicitly trusted by Java.
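For reference, a minimal keystore and matching truststore can be generated with `keytool` as sketched below; the alias, distinguished name, and passwords are placeholders:

```bash
# Create a keystore with one private key and a self-signed certificate.
keytool -genkeypair -alias spark -keyalg RSA -keysize 2048 \
  -keystore server.jks -storepass changeit -keypass changeit \
  -dname "CN=spark.example.com"

# Export the certificate and import it into a truststore as a trusted entry.
keytool -exportcert -alias spark -keystore server.jks -storepass changeit -file spark.crt
keytool -importcert -alias spark -file spark.crt -keystore trust.jks \
  -storepass changeit -noprompt
```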
Add the stores to your secrets in the DC/OS secret store. For example, if your keystore and truststore files are `server.jks` and `trust.jks`, respectively, use the following commands to add them to the secret store:
dcos security secrets create /spark/keystore --text-file server.jks
dcos security secrets create /spark/truststore --text-file trust.jks
You must add the following configurations to your `dcos spark run` command. The options shown in parentheses are optional:
dcos spark run --verbose --submit-args="\
--keystore-secret-path=<path/to/keystore, for example, spark/keystore> \
--keystore-password=<password to keystore> \
--private-key-password=<password to private key in keystore> \
(--truststore-secret-path=<path/to/truststore, for example, spark/truststore> \)
(--truststore-password=<password to truststore> \)
(--conf spark.ssl.enabledAlgorithms=<cipher, for example, TLS_RSA_WITH_AES_128_CBC_SHA256> \)
--class <Spark Main class> <Spark Application JAR> [application args]"
DC/OS 1.10 or earlier: Because both stores are binary files, they must be base64-encoded before being placed in the DC/OS secret store. Follow the instructions above on binary secrets to encode the keystore and truststore.
Spark SASL
This section discusses executor authentication and BlockTransferService encryption.
Spark uses the Simple Authentication and Security Layer (SASL) to authenticate executors with the driver and to encrypt messages sent between components. This functionality relies on a shared secret between all components that are expected to communicate with each other. A secret can be generated with the DC/OS Spark CLI:
dcos spark secret <secret_path>
For example:
dcos spark secret /spark/sparkAuthSecret
This example generates a random secret and uploads it to the DC/OS secret store at the designated path. To use this secret for RPC authentication, add the following configuration to your CLI command:
dcos spark run --submit-args="\
...
--executor-auth-secret-path=/spark/sparkAuthSecret
...
"