DC/OS provides a distributed, fault-tolerant, DNS-based service discovery mechanism.
DNS is provided by two different components within DC/OS, mesos-dns and dcos-dns. These components support two top-level domain (TLD) names, .mesos and .directory. Please read the Recommendation section and the Mesos-DNS documentation to better understand the usage of these two TLDs. dcos-net and mesos-dns do not perform any caching.
Each of the TLDs consists of multiple zones. Every DC/OS service gets multiple fully qualified domain name (FQDN) entries from these different zones. Each service that is launched on DC/OS through Marathon gets an FQDN in the form <service-name>.mesos. Moreover, every running service launched on DC/OS gets an FQDN based on the service (framework) that launched it, in the form <service-name>.<group-name>.<framework-name>.mesos.
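As a rough illustration of the <service-name>.<group-name>.<framework-name>.mesos form, the sketch below derives the name from a Marathon-style app ID. The helper name mesos_fqdn is hypothetical, and the handling of nested groups (innermost group first) is an assumption rather than something stated here.

```python
def mesos_fqdn(app_id: str, framework: str = "marathon") -> str:
    """Build a <service-name>.<group-name>.<framework-name>.mesos name.

    Hypothetical helper for illustration; not part of DC/OS.
    """
    # "/mygroup/myapp" -> ["mygroup", "myapp"]
    parts = [p for p in app_id.split("/") if p]
    if not parts:
        raise ValueError("app ID must contain at least a service name")
    # Service name first, then enclosing groups from innermost to outermost
    # (the ordering for deeper nesting is an assumption).
    labels = list(reversed(parts)) + [framework, "mesos"]
    return ".".join(labels)


print(mesos_fqdn("/mygroup/myapp"))  # myapp.mygroup.marathon.mesos
```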
Assume that the following UCR container is launched on DC/OS:
{
  "id": "/mygroup/myapp",
  "instances": 1,
  "container": {
    "type": "MESOS",
    "docker": {
      "image": "nginx"
    },
    "portMappings": [
      {
        "containerPort": 80,
        "labels": {
          "VIP_0": "/mygroup/myapp:80"
        },
        "name": "http"
      }
    ]
  },
  "cpus": 0.1,
  "mem": 32,
  "networks": [
    {
      "name": "dcos",
      "mode": "container"
    }
  ]
}
Example 1. UCR container launched on DC/OS
The different zones and the FQDN associated with those zones are described below, along with the type of network connectivity that a service receives when it is accessed using an FQDN from a particular zone.
myapp.mygroup.marathon.mesos
This FQDN is exposed by mesos-dns as an A record. Please read the Recommendation section to understand the difference between using this FQDN and using an FQDN from the *.directory TLD. The actual IP address to which this FQDN maps depends on the networking mode that the container is using. For host and bridge mode networking, this FQDN resolves to the IP address of the agent on which the container is launched, whereas for container mode networking it resolves to the container’s IP address. In Example 1, the FQDN resolves to the container’s IP address, because the container uses container mode networking.
myapp-mygroup.marathon.containerip.dcos.thisdcos.directory
This FQDN is exposed by dcos-dns as an A record. For container and bridge mode networking, this FQDN resolves to the container’s IP address, whereas for host mode networking it resolves to the IP address of the agent on which the container is launched. In the current example, the framework-name is marathon. The FQDN resolves to the container’s IP address, because the container uses container mode networking.
myapp-mygroup.marathon.agentip.dcos.thisdcos.directory
This FQDN is exposed by dcos-dns as an A record. This FQDN always resolves to the IP address of the agent on which the container is running, irrespective of networking mode.
myapp-mygroup.marathon.autoip.dcos.thisdcos.directory
This FQDN is exposed by dcos-dns as an A record. As the name suggests, the autoip FQDN resolves to the IP address that is most likely to reach the container. For instance, if a container is reachable via the agent IP, as in host and bridge mode networking, then the autoip FQDN resolves to the agent IP. Similarly, in container mode networking it resolves to the container’s IP address.
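The resolution rules described for the .mesos, containerip, agentip, and autoip names can be summarized as a small lookup table. The sketch below only restates those rules; the function name and the example IP addresses are hypothetical.

```python
# Which address each zone resolves to, per networking mode, as described above.
RESOLVES_TO = {
    "mesos":       {"host": "agent", "bridge": "agent",     "container": "container"},
    "containerip": {"host": "agent", "bridge": "container", "container": "container"},
    "agentip":     {"host": "agent", "bridge": "agent",     "container": "agent"},
    "autoip":      {"host": "agent", "bridge": "agent",     "container": "container"},
}


def resolved_address(zone: str, mode: str, agent_ip: str, container_ip: str) -> str:
    """Return the address an FQDN in `zone` would map to for a given mode."""
    kind = RESOLVES_TO[zone][mode]  # raises KeyError for unknown zone or mode
    return agent_ip if kind == "agent" else container_ip


# Hypothetical addresses for the container in Example 1 (container mode).
agent_ip, container_ip = "10.0.0.5", "9.0.1.3"
for zone in RESOLVES_TO:
    print(zone, resolved_address(zone, "container", agent_ip, container_ip))
```

Note that bridge mode is the only case where containerip and autoip disagree: containerip returns the container’s address, while autoip returns the agent’s.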
mygroupmyapp.marathon.l4lb.thisdcos.directory
If a service explicitly defines a VIP label as part of its app definition, then it gets an FQDN of the form <group-name><service-name>.<framework-name>.l4lb.thisdcos.directory.
In Example 1 above, the label looks like:
"labels": {
"VIP_0": "/mygroup/myapp:80"
}
This FQDN is exposed by dcos-dns as an A record. It is primarily used for layer-4 load balancing. It resolves to a virtual IP address allocated by dcos-l4lb in the 11.x.x.x range, which then maps to all the instances that correspond to this service.
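To make the mapping from a VIP label to its l4lb name concrete, here is a minimal sketch that assumes the name is formed by dropping the slashes from the label's path, which matches the single example given. The helper name l4lb_fqdn is hypothetical.

```python
from typing import Tuple


def l4lb_fqdn(vip_label: str, framework: str = "marathon") -> Tuple[str, int]:
    """Turn a VIP label value such as "/mygroup/myapp:80" into (fqdn, port).

    Hypothetical helper; assumes the l4lb name is the label path with
    its slashes removed.
    """
    path, sep, port = vip_label.rpartition(":")
    if not sep or not path or not port.isdigit():
        raise ValueError(f"expected '<path>:<port>', got {vip_label!r}")
    name = path.replace("/", "")
    return f"{name}.{framework}.l4lb.thisdcos.directory", int(port)


print(l4lb_fqdn("/mygroup/myapp:80"))
# ('mygroupmyapp.marathon.l4lb.thisdcos.directory', 80)
```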
SRV records
See SRV Records for a full description of Mesos DNS SRV records.
- For a task named mytask launched by a service named myservice, Mesos-DNS generates an SRV record _mytask._protocol.myservice.mesos, where protocol is either udp or tcp.
- For more information on naming tasks and services in Mesos-DNS, see Task and Service Naming Conventions.
_myapp.mygroup._tcp.marathon.mesos
If a service explicitly assigns a name to its port in the app definition, then it gets an FQDN of the form _<service-name>.<group-name>._<protocol>.<framework-name>.mesos.
This FQDN is exposed by mesos-dns as an SRV record.
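The SRV name form _<service-name>.<group-name>._<protocol>.<framework-name>.mesos can be assembled mechanically. The following is a minimal sketch with a hypothetical helper, srv_name, which assumes the same group ordering as in the A record names.

```python
def srv_name(app_id: str, protocol: str = "tcp", framework: str = "marathon") -> str:
    """Build _<service-name>.<group-name>._<protocol>.<framework-name>.mesos.

    Hypothetical helper for illustration; not part of Mesos-DNS.
    """
    if protocol not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    parts = [p for p in app_id.split("/") if p]
    if not parts:
        raise ValueError("app ID must contain at least a service name")
    # Service name first, then its group(s), joined with dots.
    task = ".".join(reversed(parts))
    return f"_{task}._{protocol}.{framework}.mesos"


print(srv_name("/mygroup/myapp"))               # _myapp.mygroup._tcp.marathon.mesos
print(srv_name("mytask", "udp", "myservice"))   # _mytask._udp.myservice.mesos
```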
FQDNs for frameworks other than Marathon
Example 1 uses Marathon, which is the default framework in DC/OS. However, other frameworks also run on top of DC/OS, such as Kafka, Cassandra, Spark, and so on. The DNS infrastructure in DC/OS generates all the FQDNs mentioned above for services launched by these frameworks as well. The only difference is that the name marathon is replaced by that framework’s name when building the FQDNs. For instance, a service launched by the kafka framework would have FQDNs such as:
<taskname>.kafka.l4lb.thisdcos.directory
<taskname>.kafka.containerip.dcos.thisdcos.directory
<taskname>.kafka.agentip.dcos.thisdcos.directory
<taskname>.kafka.autoip.dcos.thisdcos.directory
<taskname>.kafka.mesos
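The list above follows a fixed pattern, so it can be generated for any framework. This sketch uses a hypothetical helper and a made-up task name (broker-0) purely for illustration.

```python
from typing import List


def framework_fqdns(task_name: str, framework: str) -> List[str]:
    """List the FQDN patterns shown above for a task of a given framework."""
    directory = "thisdcos.directory"
    return [
        f"{task_name}.{framework}.l4lb.{directory}",
        f"{task_name}.{framework}.containerip.dcos.{directory}",
        f"{task_name}.{framework}.agentip.dcos.{directory}",
        f"{task_name}.{framework}.autoip.dcos.{directory}",
        f"{task_name}.{framework}.mesos",
    ]


# "broker-0" is a hypothetical task name.
for fqdn in framework_fqdns("broker-0", "kafka"):
    print(fqdn)
```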
Recommendation
The .mesos TLD pre-dates the .directory TLD, and has been maintained primarily for backwards compatibility. While any service launched on DC/OS gets an FQDN in both the .mesos TLD and the .directory TLD, it is recommended to use the .directory TLD to access services, since by design dcos-dns is more reactive and fault-tolerant than mesos-dns. That said, mesos-dns does provide a RESTful interface for accessing its records, which makes the .mesos TLD available over an HTTP interface and not just over DNS.
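As a hedged sketch of using that HTTP interface from Python: the code below assumes that mesos-dns serves a /v1/hosts/<name> endpoint on port 8123 and is reachable at master.mesos. These details come from the upstream Mesos-DNS documentation as best recalled, not from this page, so verify them against your cluster before relying on them. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def hosts_url(fqdn: str, server: str = "master.mesos", port: int = 8123) -> str:
    """Build the lookup URL (the server, port, and path are assumptions)."""
    return f"http://{server}:{port}/v1/hosts/{urllib.parse.quote(fqdn)}"


def lookup_hosts(fqdn: str, server: str = "master.mesos", port: int = 8123):
    """Fetch the records for `fqdn` over HTTP and return the decoded JSON."""
    with urllib.request.urlopen(hosts_url(fqdn, server, port), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Only the URL is built here; lookup_hosts() needs a reachable cluster.
print(hosts_url("myapp.mygroup.marathon.mesos"))
```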