To install DC/OS Data Science Engine with TensorFlow 2.1.0, which is supported by default, run the following command:
To install DC/OS Data Science Engine with TensorFlow 1.15, run the following command:
With options.json having the following content:
TensorFlow local machine learning
Open a Python 3 notebook and put each of the following sections in a separate code cell.
- Prepare the test data:
- Define a model:
- Train and evaluate the model:
- Use the model to predict a handwritten number:
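The four cells above could look roughly like the following sketch. It assumes the TensorFlow 2.x Keras API and the MNIST digits dataset that ships with `tf.keras.datasets`; the dataset choice, the layer sizes, and the function names (`normalize_images`, `load_data`, `build_model`, `train_and_evaluate`, `predict_digit`) are illustrative assumptions, not the code from the original page.

```python
import numpy as np


def normalize_images(images):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0


def load_data():
    """Prepare the data: download MNIST (assumed dataset) and normalize it."""
    import tensorflow as tf  # imported lazily; only needed when this cell runs
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    return (normalize_images(x_train), y_train), (normalize_images(x_test), y_test)


def build_model():
    """Define a model: a small dense classifier for 28x28 grayscale digits."""
    import tensorflow as tf
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def train_and_evaluate(model, train, test, epochs=5):
    """Fit on the training split, then return [loss, accuracy] on the test split."""
    model.fit(train[0], train[1], epochs=epochs)
    return model.evaluate(test[0], test[1], verbose=2)


def predict_digit(model, image):
    """Return the most probable digit for one normalized 28x28 image."""
    probabilities = model.predict(image[np.newaxis, ...])
    return int(np.argmax(probabilities, axis=1)[0])
```

In a notebook, the cells would then call `train, test = load_data()`, `model = build_model()`, `train_and_evaluate(model, train, test)`, and finally `predict_digit(model, test[0][0])`.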
TensorFlow 2.1.0 Distributed Learning with Horovod on Spark
DC/OS Data Science Engine includes Horovod on Spark integration, which allows you to run TensorFlow in a distributed mode, using Apache Spark as an engine.
Open a Python 3 notebook and put each of the following sections in a separate code cell.
- Define utility functions to prepare the dataset and the model:
- Implement a distributed training function using Horovod:
- Create a Spark session:
- Run distributed training:
- Evaluate the model:
- Shut down the Spark workers:
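As a rough illustration of the steps above, the sketch below trains a small Keras model with `horovod.spark.run`. It is not the original notebook code: the MNIST dataset, the strided sharding helper `shard`, the function names, the Spark application name, and `num_proc=2` are all assumptions, and any Spark settings a particular cluster needs are left out.

```python
def shard(items, rank, size):
    """Return the strided slice of items that worker `rank` of `size` trains on."""
    return items[rank::size]


def build_model():
    """Utility: the same small classifier is built on the workers and on the driver."""
    import tensorflow as tf
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])


def train(epochs=1, batch_size=128, base_lr=0.001):
    """Runs once per Spark task; imports stay inside so each task resolves them locally."""
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    hvd.init()
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = shard(x_train, hvd.rank(), hvd.size()) / 255.0
    y_train = shard(y_train, hvd.rank(), hvd.size())

    model = build_model()
    # Scale the learning rate by the worker count and let Horovod average gradients.
    optimizer = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(base_lr * hvd.size()))
    # Horovod's TensorFlow 2 Keras examples pass experimental_run_tf_function=False
    # so that the wrapped optimizer computes the gradients.
    model.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"], experimental_run_tf_function=False)
    # Every worker starts from the initial weights of rank 0.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
              callbacks=callbacks, verbose=2 if hvd.rank() == 0 else 0)
    return model.get_weights() if hvd.rank() == 0 else None


def evaluate(weights):
    """Rebuild the model on the driver and return [loss, accuracy] on the test split."""
    import tensorflow as tf
    _, (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    model = build_model()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.set_weights(weights)
    return model.evaluate(x_test / 255.0, y_test, verbose=0)


def main(num_proc=2):
    from pyspark.sql import SparkSession
    import horovod.spark

    spark = SparkSession.builder.appName("horovod-mnist").getOrCreate()
    try:
        weights = horovod.spark.run(train, num_proc=num_proc)[0]  # result of rank 0
        return evaluate(weights)
    finally:
        spark.stop()  # shut down the Spark workers
```

Sharding by rank keeps each worker on a disjoint part of the training set; letting every worker shuffle the full dataset is an equally common choice.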
TensorFlow 1.15 Distributed Learning with Horovod on Spark
Open a Python 3 notebook and put each of the following sections in a separate code cell.
- Describe the layers of the model:
- Implement the train input generator:
- Implement a distributed training function using Horovod:
- Create a Spark session:
- Run distributed training:
- Shut down the Spark workers:
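The steps above might be sketched as follows for the TensorFlow 1.15 graph-mode API. This is an assumption-laden outline rather than the original notebook: the MNIST dataset, the two-layer `dense_model`, the step count, the per-rank cache file name, and `num_proc=2` are illustrative choices.

```python
import numpy as np


def train_input_generator(x_train, y_train, batch_size=64):
    """Yield shuffled (images, labels) batches forever, reshuffling every epoch."""
    assert len(x_train) == len(y_train)
    assert batch_size <= len(x_train)
    while True:
        order = np.random.permutation(len(x_train))
        x_epoch, y_epoch = x_train[order], y_train[order]
        for start in range(0, len(x_epoch) - batch_size + 1, batch_size):
            yield x_epoch[start:start + batch_size], y_epoch[start:start + batch_size]


def dense_model(image, label):
    """Describe the layers: one hidden dense layer and a 10-way output; returns the loss."""
    import tensorflow as tf  # TensorFlow 1.15
    hidden = tf.layers.dense(image, 128, activation=tf.nn.relu)
    logits = tf.layers.dense(hidden, 10)
    return tf.losses.sparse_softmax_cross_entropy(labels=label, logits=logits)


def train(max_steps=200, batch_size=100):
    """Runs once per Spark task; returns the last training loss seen by this worker."""
    import tensorflow as tf
    import horovod.tensorflow as hvd

    hvd.init()
    # A per-rank cache file avoids workers on one host overwriting each other's download.
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data("mnist-%d.npz" % hvd.rank())
    x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0
    y_train = y_train.astype(np.int32)

    image = tf.placeholder(tf.float32, [None, 784], name="image")
    label = tf.placeholder(tf.int32, [None], name="label")
    loss = dense_model(image, label)

    # Wrap the optimizer so gradients are averaged across workers.
    optimizer = hvd.DistributedOptimizer(tf.train.AdamOptimizer(0.001 * hvd.size()))
    global_step = tf.train.get_or_create_global_step()
    train_op = optimizer.minimize(loss, global_step=global_step)

    hooks = [
        hvd.BroadcastGlobalVariablesHook(0),  # start from rank 0's variables
        tf.train.StopAtStepHook(last_step=max_steps // hvd.size()),
    ]
    batches = train_input_generator(x_train, y_train, batch_size)
    last_loss = None
    with tf.train.MonitoredTrainingSession(hooks=hooks) as session:
        while not session.should_stop():
            images, labels = next(batches)
            _, last_loss = session.run([train_op, loss],
                                       feed_dict={image: images, label: labels})
    return None if last_loss is None else float(last_loss)


def run_on_spark(num_proc=2):
    from pyspark.sql import SparkSession
    import horovod.spark

    spark = SparkSession.builder.appName("horovod-tf1-mnist").getOrCreate()
    try:
        return horovod.spark.run(train, num_proc=num_proc)  # one loss per worker
    finally:
        spark.stop()  # shut down the Spark workers
```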
TensorBoard
DC/OS Data Science Engine comes with TensorBoard installed. It can be found at http://<dcos-url>/service/data-science-engine/tensorboard/.
Log directory
TensorBoard reads log data from a specific directory, which defaults to /mnt/mesos/sandbox. It can be changed with the advanced.tensorboard_logdir option. HDFS paths are supported as well.
The following example stores TensorBoard logs in HDFS:
- Install HDFS:
- Install data-science-engine with the log directory option overridden:
With options.json having the following content:
- Open TensorBoard at https://<dcos-url>/service/data-science-engine/tensorboard/ and confirm the change.
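The options file for the override can also be generated from a notebook or script. The sketch below assumes that the dotted option name `advanced.tensorboard_logdir` maps to nested JSON objects, as is usual for DC/OS package options, and uses a hypothetical HDFS path. Any other options that an HDFS-backed installation needs are not shown.

```python
import json


def nested_option(dotted_name, value):
    """Expand a dotted option name such as 'a.b' into {'a': {'b': value}}."""
    result = value
    for key in reversed(dotted_name.split(".")):
        result = {key: result}
    return result


def render_options(logdir):
    """Return options.json content that overrides the TensorBoard log directory."""
    return json.dumps(nested_option("advanced.tensorboard_logdir", logdir), indent=2)


# "hdfs:///tensorboard-logs" is a hypothetical path; substitute your own.
print(render_options("hdfs:///tensorboard-logs"))
```

Saving the printed text as options.json gives a file that can be passed to the installation step above.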
Disabling TensorBoard
DC/OS Data Science Engine can be installed with TensorBoard disabled by using the following configuration: