This page explains how to get started with metrics in Mesosphere® DC/OS™. A metrics pipeline is natively integrated with DC/OS and no additional setup is required.
Prerequisites:
- You must have the DC/OS CLI installed and be logged in as a superuser via the dcos auth login command.
- Optional: Deploy a sample Marathon™ app for use in this quick start guide. If you already have tasks running on DC/OS, you can skip this setup step.
- Create the following Marathon app definition and save it as test-metrics.json:
{
  "id": "/test-metrics",
  "cmd": "./statsd-emitter",
  "fetch": [
    {
      "uri": "https://downloads.mesosphere.com/dcos-metrics/1.11.0/statsd-emitter",
      "executable": true
    }
  ],
  "cpus": 0.01,
  "instances": 1,
  "mem": 128
}
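If you prefer to generate the file from a script rather than by hand, the following is a minimal Python sketch. The field values are copied from the app definition above; the names `APP_DEFINITION`, `render_app_definition`, and `write_app_definition` are illustrative only and are not part of DC/OS.

```python
import json
from pathlib import Path

# App definition copied from this guide; the constant name is hypothetical.
APP_DEFINITION = {
    "id": "/test-metrics",
    "cmd": "./statsd-emitter",
    "fetch": [
        {
            "uri": "https://downloads.mesosphere.com/dcos-metrics/1.11.0/statsd-emitter",
            "executable": True,
        }
    ],
    "cpus": 0.01,
    "instances": 1,
    "mem": 128,
}


def render_app_definition():
    """Return the app definition as a JSON string."""
    return json.dumps(APP_DEFINITION, indent=2) + "\n"


def write_app_definition(path="test-metrics.json"):
    """Write the JSON to disk and return the path that was written."""
    target = Path(path)
    target.write_text(render_app_definition())
    return target
```

Calling `write_app_definition()` produces a test-metrics.json file that can then be passed to the deploy command in the next step.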
- Deploy the app with the following CLI command:
dcos marathon app add test-metrics.json
- To get the Apache® Mesos® ID of the node that is running your app, run dcos task followed by dcos node. For example:
- Running dcos task shows that host 10.0.0.193 is running the Marathon task test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b.
dcos task
NAME          HOST        USER  STATE  ID
test-metrics  10.0.0.193  root  R      test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b
- Running dcos node shows that host 10.0.0.193 has the Mesos ID 7749eada-4974-44f3-aad9-42e2fc6aedaf-S1.
dcos node
HOSTNAME    IP          ID
10.0.0.193  10.0.0.193  7749eada-4974-44f3-aad9-42e2fc6aedaf-S1
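The lookup above (task name to host, then host to Mesos ID) can be sketched in Python as follows. This is only an illustration: it assumes the table output is whitespace-separated with the column names shown in the examples above and that no value contains spaces, which may not hold for every DC/OS version. The function names are hypothetical.

```python
# Sample output copied from the examples in this guide.
TASK_OUTPUT = """\
NAME HOST USER STATE ID
test-metrics 10.0.0.193 root R test-metrics.93fffc0c-fddf-11e6-9080-f60c51db292b
"""

NODE_OUTPUT = """\
HOSTNAME IP ID
10.0.0.193 10.0.0.193 7749eada-4974-44f3-aad9-42e2fc6aedaf-S1
"""


def parse_table(text):
    """Turn whitespace-separated table text into a list of row dicts.

    Assumes the first non-empty line is the header and that no cell
    contains spaces.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def mesos_id_for_task(task_output, node_output, task_name):
    """Return the Mesos ID of the node running task_name, or None."""
    host = next(
        (row["HOST"] for row in parse_table(task_output)
         if row.get("NAME") == task_name),
        None,
    )
    if host is None:
        return None
    return next(
        (row["ID"] for row in parse_table(node_output)
         if row.get("IP") == host),
        None,
    )
```

With the sample text above, `mesos_id_for_task(TASK_OUTPUT, NODE_OUTPUT, "test-metrics")` yields the Mesos ID of host 10.0.0.193.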
- View metrics.
- Container metrics for a specific task
For an overview of the resource consumption for a specific container, execute the following command:
dcos task metrics summary <task-id>
The output should resemble:
CPU           MEM              DISK
0.17 (1.35%)  0.01GiB (6.46%)  0.00GiB (0.00%)
The metrics summary command displays a summary of raw and percentage utilization of CPU, Memory and Disk resources using the metrics documented in the metrics reference summary.
In particular, the following metrics and formulas are used to compute the displayed values:
- CPU usage:
cpus.system_time_secs + cpus.user_time_secs (raw)
(cpus.system_time_secs + cpus.user_time_secs) / cpus.throttled_time_secs (percentage)
- Memory usage:
mem.total_bytes (raw)
mem.total_bytes / mem.limit_bytes (percentage)
- Disk usage:
disk.used_bytes (raw)
disk.used_bytes / disk.total_bytes (percentage)
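To make the formulas concrete, here is a small Python sketch that applies them exactly as written above to a dictionary of metric values. It is not the CLI's implementation. The function name `summarize`, the choice to return ratios as fractions (multiply by 100 for display), and the handling of missing or zero denominators are all assumptions made for illustration. The sample values are taken from the detailed task output shown later in this guide, which does not include disk metrics.

```python
def summarize(metrics):
    """Apply the summary formulas from this guide to a metrics dict.

    Returns {"cpu": (raw, ratio), "mem": (raw, ratio), "disk": (raw, ratio)}.
    Missing metrics are treated as 0, and a zero denominator gives a
    ratio of 0.0 (an assumption, not documented CLI behavior).
    """
    def value(name):
        return float(metrics.get(name, 0.0))

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    cpu_raw = value("cpus.system_time_secs") + value("cpus.user_time_secs")
    mem_raw = value("mem.total_bytes")
    disk_raw = value("disk.used_bytes")
    return {
        "cpu": (cpu_raw, ratio(cpu_raw, value("cpus.throttled_time_secs"))),
        "mem": (mem_raw, ratio(mem_raw, value("mem.limit_bytes"))),
        "disk": (disk_raw, ratio(disk_raw, value("disk.total_bytes"))),
    }


# Values copied from the sample `dcos task metrics details` output.
SAMPLE_METRICS = {
    "cpus.system_time_secs": 0.23,
    "cpus.throttled_time_secs": 0.45,
    "cpus.user_time_secs": 0.15,
    "mem.limit_bytes": 44040192,
    "mem.total_bytes": 9465856,
}

summary = summarize(SAMPLE_METRICS)
```

Because the sample contains no disk metrics, the disk entry in `summary` falls back to zero for both the raw value and the ratio.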
- All metrics for a specific task
To get a detailed list of all metrics related to a task, execute the following command:
dcos task metrics details <task-id>
The output is a combination of container resource utilization and metrics transmitted by the workload. For example:
NAME                      VALUE
cpus.limit                0.20
cpus.nr_periods           1272
cpus.nr_throttled         8
cpus.system_time_secs     0.23
cpus.throttled_time_secs  0.45
cpus.user_time_secs       0.15
mem.anon_bytes            9359360
mem.cache_bytes           106496
mem.file_bytes            106496
mem.limit_bytes           44040192
mem.rss_bytes             9359360
mem.total_bytes           9465856
perf.timestamp            1556720487.68
The CPU, disk, and memory statistics come from container data supplied by Mesos. The statsd_tester.time.uptime statistic comes from the application itself.
- As with task data, host-level metrics are available in the form of a summary or a detailed table. To view host-level metrics, execute the following command:
dcos node metrics details <mesos-id>
The output displays the statistics about available resources on the node and their utilization. For example:
NAME                       VALUE      TAGS
cpu.idle                   99.56%
cpu.system                 0.09%
cpu.total                  0.34%
cpu.user                   0.25%
cpu.wait                   0.01%
filesystem.capacity.free   134.75GiB  path: /
filesystem.capacity.total  143.02GiB  path: /
filesystem.capacity.used   2.33GiB    path: /
filesystem.inode.free      38425263   path: /
filesystem.inode.total     38504832   path: /
filesystem.inode.used      79569      path: /
load.15min                 0
load.1min                  0
load.5min                  0
memory.buffers             0.08GiB
memory.cached              2.41GiB
memory.free                12.63GiB
memory.total               15.67GiB
process.count              175
swap.free                  0.00GiB
swap.total                 0.00GiB
swap.used                  0.00GiB
system.uptime              28627
- All dcos-cli metrics commands can be executed with the --json option for use in scripts. For example:
dcos node metrics summary <mesos-id> --json
The output displays the same data, but in JSON format, for convenient parsing:
[
  {
    "name": "cpu.total",
    "timestamp": "2018-04-09T23:46:16.834008315Z",
    "value": 0.32,
    "unit": "percent"
  },
  {
    "name": "memory.total",
    "timestamp": "2018-04-09T23:46:16.834650407Z",
    "value": 16830304256,
    "unit": "bytes"
  },
  {
    "name": "memory.free",
    "timestamp": "2018-04-09T23:46:16.834650407Z",
    "value": 13553008640,
    "unit": "bytes"
  },
  {
    "name": "filesystem.capacity.total",
    "timestamp": "2018-04-09T23:46:16.834373702Z",
    "value": 153567944704,
    "tags": { "path": "/" },
    "unit": "bytes"
  },
  {
    "name": "filesystem.capacity.used",
    "timestamp": "2018-04-09T23:46:16.834373702Z",
    "value": 2498990080,
    "tags": { "path": "/" },
    "unit": "bytes"
  }
]
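As an illustration of using the JSON output in a script, the following Python sketch looks up datapoints by name (and optionally by tags) and converts byte values to GiB. It assumes the output is a list of objects with the name, value, and optional tags fields shown above. The helper names `find_value` and `to_gib` are hypothetical, and the sample is a subset of the output above.

```python
import json

GIB = 1024 ** 3

# Subset of the sample JSON output shown in this guide.
SAMPLE_JSON = """
[
  {"name": "memory.total", "value": 16830304256, "unit": "bytes"},
  {"name": "memory.free", "value": 13553008640, "unit": "bytes"},
  {"name": "filesystem.capacity.total", "value": 153567944704,
   "tags": {"path": "/"}, "unit": "bytes"}
]
"""


def find_value(datapoints, name, tags=None):
    """Return the value of the first datapoint matching name (and tags).

    When tags is None, any tags are accepted. Returns None if nothing matches.
    """
    for datapoint in datapoints:
        if datapoint.get("name") != name:
            continue
        if tags is None or datapoint.get("tags") == tags:
            return datapoint.get("value")
    return None


def to_gib(num_bytes):
    """Convert a byte count to GiB, rounded to two decimal places."""
    return round(num_bytes / GIB, 2)


datapoints = json.loads(SAMPLE_JSON)
memory_total_gib = to_gib(find_value(datapoints, "memory.total"))
root_fs_total_gib = to_gib(
    find_value(datapoints, "filesystem.capacity.total", tags={"path": "/"})
)
```

In a real script, the JSON text would come from the standard output of the dcos node metrics summary <mesos-id> --json command rather than from an embedded string.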