DC/OS Version 1.12.3 was released on March 14, 2019.
DC/OS 1.12.3 includes the following components:
DC/OS is a distributed operating system that enables you to manage resources, application deployment, data services, networking, and security in an on-premise, cloud, or hybrid cluster environment.
Issues Fixed in DC/OS 1.12.3
The issues that have been fixed in DC/OS 1.12.3 are grouped by feature, functional area, or component. Most change descriptions include one or more issue tracking identifiers for reference.
Command-Line Interface (CLI)
- DCOS-42928 - This release includes the docker ps output in the bundle generated by the dcos node diagnostics command.
- DCOS-45863 - Previously, the Settings/LDAP Directory/Add Directory/Authentication dialog had a mandatory lookup-dn and an optional lookup-password. This release lets you select between Anonymous Bind and LDAP Credentials, which translates to the following:
  - If Anonymous Bind is selected, the JSON sent to the Bouncer /ldap/config API has no lookup-dn or lookup-password field.
  - If LDAP Credentials is selected, the current behavior applies. In the JSON sent to Bouncer, the lookup-password field may be an empty string.
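The two authentication choices described for DCOS-45863 can be sketched as follows. This is a minimal illustration, not the actual UI or Bouncer code: the function name `build_auth_fields` and the mode strings are hypothetical, and only the lookup-dn and lookup-password field names come from the note above.

```python
import json


def build_auth_fields(mode, lookup_dn=None, lookup_password=""):
    """Return the authentication-related fields for an LDAP directory config.

    Hypothetical helper: only the "lookup-dn" and "lookup-password" keys
    are taken from the release note; everything else is illustrative.
    """
    if mode == "anonymous":
        # Anonymous Bind: both lookup fields are left out of the JSON.
        return {}
    if mode == "credentials":
        if not lookup_dn:
            raise ValueError("lookup-dn is mandatory when LDAP credentials are used")
        # lookup-password is optional and may be an empty string.
        return {"lookup-dn": lookup_dn, "lookup-password": lookup_password}
    raise ValueError(f"unknown mode: {mode!r}")


print(json.dumps(build_auth_fields("anonymous")))
print(json.dumps(build_auth_fields("credentials", "cn=lookup,dc=example,dc=com")))
```

The distinguished name in the usage lines is a placeholder, not a value from any real directory.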
COPS-4282, DCOS_OSS-4613 - If you run the dcos_generate_config command with the --validate option, the command validates the configuration settings in your config.yaml file. In some cases, this option issued warning messages that validation failed for parameters that are no longer used. For example, some SSH parameters, such as ssh_user, have been deprecated. Previously, if you ran the command with the --validate option to check your configuration settings and these parameters were not specified, the command reported that the validation of configuration parameters had failed. With this release, the --validate option does not return validation failure messages for parameters that are no longer required for installation.
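The behavior change can be pictured with a small sketch. It is not the installer's validation code: the function `missing_required` and the sets passed to it are hypothetical, and ssh_user is the only deprecated parameter the note names.

```python
def missing_required(config, required, deprecated):
    """Return required parameter names absent from config.

    Parameters listed in `deprecated` are never reported, which mirrors
    the fixed behavior described above. Illustrative only.
    """
    still_required = set(required) - set(deprecated)
    return sorted(name for name in still_required if name not in config)


# Illustrative inputs; the required set here is an assumption.
config = {"cluster_name": "demo", "master_list": ["10.0.0.1"]}
required = {"cluster_name", "master_list", "ssh_user"}
deprecated = {"ssh_user"}

print(missing_required(config, required, deprecated))
print(missing_required(config, required, deprecated=set()))
```

With ssh_user treated as deprecated, the first call reports nothing missing; passing an empty deprecated set reproduces the older behavior, where the absent parameter is reported as a failure.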
DCOS-15890 - The pre-flight check in the advanced installer displayed misleading information. This release improves the error message shown when Docker is not running at the start of installation.
COPS-3554 - Fixed a rare problem where a follower Marathon might try to proxy to a non-leader instance. The fix adds a watcher loop process that monitors and re-registers (if necessary) the Marathon leader after re-election.
COPS-3593, DCOS_OSS-4193 - In previous releases, services managed by Marathon might fail to restart after a container crash or under certain DNS failure conditions. For example, restarting services might fail if the first ZooKeeper node or first DC/OS master is unreachable. Because this problem affects high availability for Marathon, a workaround (ping zk-1) was introduced for DC/OS 1.11.5 and 1.11.6 to address the issue. In this release, the underlying issue is resolved and you can safely remove the workaround if you have it deployed. For background information about the issue and the steps to remove the workaround, see the product advisory documentation.
ASF-2719 - Agent could not recover due to empty docker volume checkpointed files.
COPS-4104 - This release fixes an issue that caused container and agent recovery to fail under the following circumstances:
- The checkpointed Docker volumes file for a container does not exist.
- The checkpointed Docker volumes file for a container exists but is empty.
Prior to this fix, the missing or empty file could prevent the agent from restarting and returning to normal operation. With this release, recovery from an empty or missing docker/volume file is handled by the containerizer or by the
ASF-2731, COPS-4504 - Nvidia changed the container runtime settings in CUDA 10 images, causing the GPU isolator in UCR to disable CUDA 10 images. Specifically, the new CUDA images rely on the libnvidia-container library to set up the container runtime. This release updates the GPU isolator in UCR to work around the changes required to support the image.
DCOS-46554 - This change forces the Mesos master to include port resources in every offer unless the offer contains disks. This helps to reduce the number of offers with no ports, which are not useful for most frameworks.
DCOS-47991, DCOS_OSS-4760 - This release corrects an issue with the way the Mesos input plugin for Telegraf mapped fields to tags, which led to gaps in Mesos Grafana dashboards.
DCOS_OSS-4624 - Previously, no container metrics reported the disk usage of Mesos persistent volumes; only the disk usage of the Mesos sandbox was provided. This release adds the missing container metrics such as
NetSNMPStatistics, and also adds all available
- COPS-3279, COPS-3576, DCOS-37703, DCOS-39703 - Service endpoint values and service address-based statistics return the correct number of successful and failed connections when you enable the statsd metrics input plugin and view backend activity.
DCOS-46381, DCOS-47348 - The Marathon app definition format was changed from 1.4 to 1.5. Previously, the Admin Router code only supported the v1.4 app definitions, and thus Admin Router was not able to expose apps using the ip-per-container feature at a /service/ endpoint. This release adds the necessary routing logic for Marathon v1.5 app definitions.
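To illustrate why routing logic written for one format can miss apps defined in the other, here is a rough sketch that detects ip-per-container networking in either style. It is not Admin Router's code. The field names are assumptions based on Marathon's documented app definition formats (a top-level ipAddress object in the older style, a networks list with a container-mode entry in the newer one), and `uses_ip_per_container` is a hypothetical helper.

```python
def uses_ip_per_container(app):
    """Guess whether a Marathon app definition requests its own IP address.

    Assumed conventions (not confirmed by the release note):
      - v1.5 style: "networks" contains an entry with mode "container"
      - v1.4 style: a top-level "ipAddress" object is present
    """
    for network in app.get("networks", []):
        if network.get("mode") == "container":
            return True
    return app.get("ipAddress") is not None


# Hypothetical app definitions for illustration.
v14_app = {"id": "/demo", "ipAddress": {"networkName": "dcos"}}
v15_app = {"id": "/demo", "networks": [{"mode": "container", "name": "dcos"}]}
host_app = {"id": "/demo", "networks": [{"mode": "host"}]}

for app in (v14_app, v15_app, host_app):
    print(app["id"], uses_ip_per_container(app))
```

A check that only looks for the older ipAddress field would report the second definition as not using ip-per-container, which is the kind of gap the note describes.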
DCOS-47687 - ZooKeeper snapshot and log files contain sensitive data and were previously readable by any user on a master node, so it is important to control permissions on the ZooKeeper data directories. This fix ensures that /var/lib/dcos/exhibitor/zookeeper is owned by dcos_exhibitor and grants permissions only to the owner.
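If you want to confirm the result on a master node, a check along the following lines may help. This is a minimal sketch using only the Python standard library on a Unix-like system; the helper names are hypothetical, and the path and user name are the ones given in the note above.

```python
import os
import pwd
import stat


def is_owner_only(path):
    """Return True if neither group nor other has any permission bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def owner_name(path):
    """Return the user name that owns path (Unix only)."""
    return pwd.getpwuid(os.stat(path).st_uid).pw_name


zk_dir = "/var/lib/dcos/exhibitor/zookeeper"  # path from the release note
if os.path.isdir(zk_dir):
    print("owner:", owner_name(zk_dir))
    print("owner-only permissions:", is_owner_only(zk_dir))
else:
    print(zk_dir, "not found on this machine")
```

On a node with the fix applied, you would expect the owner to be dcos_exhibitor and the owner-only check to report True.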
Known Issues and Limitations
This section covers any known issues or limitations that don’t necessarily affect all customers, but might require changes to your environment to address specific scenarios. The issues are grouped by feature, functional area, or component. Where applicable, issue descriptions include one or more issue tracking identifiers.
- COPS-3585 - In previous releases, a deadlock or race condition might prevent one or more nodes in a cluster from generating a routing table that forwards network traffic through Marathon load balancing properly. Problems with routing tables and network connectivity can lead to the following issues:
- Incomplete network overlay configuration on certain nodes.
- Incomplete VIP/IPVS/L4LB configuration on certain nodes.
- DNS records that are missing on certain nodes.
You can restart the systemd process on the affected nodes to restore proper network connectivity. This fix is related to the mitigation of a networking issue caused by a secure socket layer (SSL) deadlock in the Erlang library (DC/OS 1.12).
About DC/OS 1.12
DC/OS 1.12 includes many new features and capabilities. The key features and enhancements focus on:
- Mesosphere Kubernetes engine
- Mesosphere Jupyter service
- Observability and metrics
- Private package registry
- Installation and upgrade improvements
- LDAP and networking enhancements
Mesosphere Kubernetes Engine
- Introduced High Density Multi-Kubernetes (HDMK) that allows operators to take advantage of intelligent resource pooling when running multiple Kubernetes clusters on DC/OS. Compared with other Kubernetes distributions that run a single Kubernetes node per virtual machine, Mesosphere HDMK uses its intelligent resource pooling to pack multiple Kubernetes nodes onto the same server for bare metal, virtual machine, and public cloud instances, driving significant cost savings and resource efficiencies. Learn more about Kubernetes on DC/OS.
Mesosphere Jupyter Service (MJS)
- Delivered secure, cloud-native Jupyter Notebooks-as-a-Service to empower data scientists to perform analytics and distributed machine learning on elastic GPU-pools with access to big and fast data services.
- Secured connectivity to data lakes and data sets on S3 and (Kerberized) HDFS.
- Included GPU-enabled Spark and distributed TensorFlow.
- Provided OpenID Connect authentication and authorization with support for Windows Integrated Authentication (WIA) and Active Directory Federation Services (ADFS).
Observability and Metrics
- Introduced a flexible and configurable metrics pipeline with multiple output formats.
- Enhanced support for application metric types including histograms, counters, timers, and gauges.
- Provided support for sample rates and multi-metrics packets.
- Introduced Mesos framework metrics.
- Collecting metrics via the Prometheus endpoint no longer requires the modifications that were needed in 1.11.
Private Package Registry (Enterprise)
- Enabled on-premise package distribution and management.
- Enabled air-gapped Virtual Private Cloud package management.
- Simplified package artifact management.
- Introduced package-specific controls for adding/removing/updating packages within a cluster.
- Introduced package management CLI.
Installation and Upgrade
- Provided full support for installing and operating a cluster on an SELinux-hardened OS with SELinux in targeted-enforcing mode for all hardened non-DC/OS components.
- Introduced a unified Terraform-based open source tool for provisioning, deploying, installing, upgrading, and decommissioning DC/OS on AWS, GCP, and Azure.
- Introduced an intuitive, streamlined installation with a quick start process: spin up a DC/OS cluster with a few easy steps in 10-15 minutes.
- Officially recommended as a Mesosphere-supported installation method with best practices built in (for example, upgrading masters sequentially and agents in parallel).
- Restructured Mesosphere installation documentation to organize Mesosphere-supported installation methods and community-supported installation methods.
- Expanded DC/OS upgrade paths so that you can skip intermediate patch versions within a supported release of DC/OS (for example, upgrading from 1.11.1 to 1.11.5 in one move) and skip intermediate versions when moving between supported major versions of DC/OS (for example, upgrading from 1.11.7 to 1.12.1 in one move).
- If you have installed the optional DC/OS Storage Service package, then upgrading from 1.12.0 to 1.12.1 requires you to first follow the storage upgrade instructions provided in Manually upgrade the DSS package to 0.5.x from 0.4.x.
LDAP and Networking Enhancements (Enterprise)
- Introduced anonymous LDAP bind, which complies with the standardized enterprise LDAP integration pattern and does not require a dedicated DC/OS integration LDAP user.
- Provided dynamic LDAP synchronization to synchronize LDAP user account groups automatically without manual synchronization of LDAP directory with accounts imported into DC/OS.
- Enhanced networking component with 150+ bug fixes with limited logging for visibility.
- Improved DNS convergence time (sub-second).
- Configured MTU for Overlay networks.
- Provided reusable IP addresses for new agents in the cluster.
- Mitigated a networking stuck state caused by an SSL deadlock in the Erlang library.
- Provided TLS 1.2 support.
- Provided support for per-container network metrics.
- Leveraged persistent connections in Edge-LB for L7 load balancing. (Enterprise)
- Improved logging in Edge-LB. (Enterprise)