A security issue assigned CVE-2020-8558 was recently discovered in the kube-proxy, a networking component
running on Kubernetes nodes. The issue exposed the internal services of
Kubernetes nodes, which often run without authentication. On certain
Kubernetes deployments, this could have exposed the api-server, allowing
an unauthenticated attacker to gain complete control over the cluster.
An attacker with this sort of access could steal information, deploy
crypto miners or remove existing services altogether.
The vulnerability exposed nodes’ localhost services – services meant
to be accessible only from the node itself – to hosts on the local
network and to pods running on the node. Localhost-bound services expect
that only trusted, local processes can interact with them, and thus
often serve requests without authentication. If your nodes run localhost
services without enforcing authentication, you are affected.
The issue details were made public on April 18, 2020, and a patch was released on June 1, 2020. We worked to assess the additional impact to
Kubernetes clusters and found that some Kubernetes installations don't
disable the api-server insecure-port, which is normally only accessible from within the master node. By exploiting CVE-2020-8558, attackers can reach the insecure-port and gain full control over the cluster.
We alerted the Kubernetes security team to the potential impact of
this vulnerability. In turn, the team rated the vulnerability's impact
as High in clusters where the api-server insecure-port is enabled, and Medium otherwise. Luckily, CVE-2020-8558's impact is
somewhat reduced on most hosted Kubernetes services like Azure
Kubernetes Service (AKS), Amazon’s Elastic Kubernetes Service (EKS) and
Google Kubernetes Engine (GKE). CVE-2020-8558 was patched in Kubernetes
versions v1.18.4, v1.17.7, and v1.16.11 (released June 17, 2020). All
users are encouraged to update.
Prisma Cloud customers are protected from this vulnerability through the capabilities described in the Conclusion section.
kube-proxy is a network proxy running on each node in a Kubernetes
cluster. Its job is to manage connectivity among pods and services. Kubernetes services expose a single clusterIP, but may consist of multiple backing pods to
enable load balancing. A service may consist of three pods – each with
its own IP address – but will expose only one clusterIP, for example,
10.0.0.1. Pods accessing that service will send packets to its
clusterIP, 10.0.0.1, but must somehow be redirected to one of the pods
behind the service abstraction.
That’s where the kube-proxy comes in. It sets up routing tables on
each node, so that requests targeting a service will be correctly routed
to one of the pods backing that service. It’s commonly deployed as a
static pod or as part of a DaemonSet.
GKE’s documentation has further details, if you’re interested.
There are networking solutions, such as Cilium, that could be configured to fully replace the kube-proxy.
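If you'd like to see the routing rules kube-proxy sets up on your own nodes, the short sketch below lists the NAT chain kube-proxy conventionally creates (KUBE-SERVICES). It's a minimal sketch, assuming kube-proxy runs in its default iptables mode and that you have root access on the node:

```python
import subprocess

# List the NAT chain kube-proxy uses to steer service clusterIPs to backing pods.
# Assumes kube-proxy runs in iptables mode and this is executed as root on a node.
rules = subprocess.run(
    ["iptables", "-t", "nat", "-L", "KUBE-SERVICES", "-n"],
    capture_output=True, text=True, check=True,
).stdout
print(rules)
```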
As part of its job, the kube-proxy configures several network parameters through sysctl files. One of those is net.ipv4.conf.all.route_localnet – the culprit behind this vulnerability. Sysctl documentation states, “route_localnet: Do not consider loopback addresses as martian
source or destination while routing. This enables the use of 127/8 for
local routing purposes. default FALSE.”
Let’s unpack that explanation. For IPv4, the loopback addresses
consist of the 127.0.0.0/8 address block (127.0.0.1-127.255.255.255),
though in practice only 127.0.0.1 is commonly used, with the hostname "localhost"
mapped to it. Those are the addresses your machine uses to refer to
itself. Packets targeting a local service will be sent to IP 127.0.0.1
through the loopback network interface, with their source IP set to
127.0.0.1 as well.
Setting route_localnet instructs the kernel not to treat 127.0.0.0/8 addresses as martian.
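You can verify the setting on a node with a quick read of the sysctl. A minimal sketch, assuming it runs directly on a Kubernetes node:

```python
from pathlib import Path

# route_localnet can be set per interface; the "all" entry covers every interface.
value = Path("/proc/sys/net/ipv4/conf/all/route_localnet").read_text().strip()
print(f"net.ipv4.conf.all.route_localnet = {value}")  # expect 1 on nodes running kube-proxy
```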
What does “martian” mean in this context? Well, some packets arrive at a
network interface and make claims about their source or destination IP
that just don’t make sense. For example, a packet could arrive with a
source IP of 255.255.255.255. That packet shouldn’t exist:
255.255.255.255 can't identify a host; it's a reserved address used to
indicate broadcast. So what's going on? Your kernel can't know for sure
and has no choice but to conclude the packet came from Mars and should
be dropped.
Martian packets often hint that someone malicious on the network is
trying to attack you. In the example above, the attacker may want your
service to respond to IP 255.255.255.255, causing routers to broadcast
the response. A fishy destination IP can also cause a packet to be
deemed martian, such as a packet arriving at an external network
interface with a destination IP of 127.0.0.1. Again, that packet doesn’t
make sense – 127.0.0.1 is used for internal communication through the
loopback interface and shouldn’t arrive from a network-facing interface.
For more details on martian packets, refer to RFC 1812.
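If you're curious whether your machine actually receives martian packets, the kernel can log them as it drops them. A minimal sketch that flips the relevant sysctl (run as root; the drops then show up in the kernel log):

```python
from pathlib import Path

# Ask the kernel to log martian packets as it drops them (requires root).
# Dropped martians then appear in the kernel log (e.g. via dmesg).
Path("/proc/sys/net/ipv4/conf/all/log_martians").write_text("1\n")
print("martian packet logging enabled")
```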
In some complicated routing scenarios, you might want the kernel to let certain martian packets pass through. That’s what route_localnet is used for. It instructs the kernel not to consider 127.0.0.0/8 as
martian addresses (as it normally would, like in the case discussed in
the previous paragraph). The kube-proxy enables route_localnet to support some routing magic that I won't get into here, but route_localnet is disabled by default for a reason. Unless proper mitigation is set up
alongside it, attackers on the local network can exploit route_localnet to perform several attacks. The most impactful is reaching localhost-bound services.
Linux allows processes to bind to a specific IP address, so that they
listen only on the network interface that address belongs to.
Internal services often use this feature to listen only on 127.0.0.1.
Normally, this ensures that only local processes can access the service,
as only they can reach 127.0.0.1. This assumption is broken with route_localnet,
since it allows external packets destined for 127.0.0.0/8. That’s
highly concerning, given that internal services tend not to enforce
authentication, expecting that external packets will never reach them.
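To make that assumption concrete, here's a toy example of the pattern such internal services follow: bind to 127.0.0.1, serve every request without authentication, and trust that only local processes can ever connect (the port number is an arbitrary choice):

```python
import http.server

# A toy "internal" service: bound to 127.0.0.1 and serving every request with
# no authentication, trusting that only local processes can ever reach it.
# route_localnet breaks exactly that trust assumption.
server = http.server.HTTPServer(
    ("127.0.0.1", 1234), http.server.SimpleHTTPRequestHandler
)
server.serve_forever()
```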
An attacker attempting to reach the victim’s internal services would
need to construct a malicious packet where the destination IP address is
set to 127.0.0.1 and the destination MAC address is set to the victim’s
MAC address. Without a meaningful destination IP, the attacker’s packet
only relies on layer 2 (MAC-based) routing to reach the victim and is
thus limited to the local network. Therefore, even if a victim enabled route_localnet, only attackers on the local network could access the victim’s localhost services.
When the victim machine receives the malicious packet, it will let it pass because of route_localnet.
Since the packet has a destination IP of 127.0.0.1, it would be
eligible to access localhost services. Table 1 shows what a malicious
packet may look like. The attacker's IP is 10.0.0.1 with MAC address
XXX, and the target's IP is 10.0.0.2 with MAC address YYY. The target is
running a localhost-only service on port 1234.
| src MAC  | XXX      | dst MAC  | YYY       |
| src IP   | 10.0.0.1 | dst IP   | 127.0.0.1 |
| src port | random   | dst port | 1234      |
Table 1. A packet exploiting route_localnet
The attacker sends the packet with their own source IP address to ensure they
receive the target's responses. To summarize, route_localnet allows
attackers on the local network to access a host’s internal services with
packets like the one shown above.
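For illustration, here's a rough sketch of how such a packet could be crafted with scapy (a third-party Python packet library). The MAC addresses and the interface name are placeholders standing in for XXX, YYY and the attacker's network interface, and sending it requires root or CAP_NET_RAW:

```python
from scapy.all import Ether, IP, TCP, sendp  # third-party library; sending requires root / CAP_NET_RAW

# The packet from Table 1: delivered to the victim by layer-2 (MAC) routing,
# while the destination IP points at the victim's loopback-bound service.
packet = (
    Ether(src="aa:aa:aa:aa:aa:aa", dst="bb:bb:bb:bb:bb:bb")  # placeholder attacker / victim MACs (XXX / YYY)
    / IP(src="10.0.0.1", dst="127.0.0.1")                    # attacker's IP as source, loopback as destination
    / TCP(sport=40000, dport=1234, flags="S")                # SYN toward the localhost-only service
)
sendp(packet, iface="eth0")  # interface name is a placeholder
```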
Because of kube-proxy, every node in the cluster has route_localnet enabled. As a result, every host on a node’s local network could gain
access to the node’s internal services. If your nodes run internal
services without authentication, you are affected.
Aside from neighboring hosts on a node’s local network, pods running
on the node could also access its internal services. To be clear, a pod
can only reach the internal services of the node hosting it. To carry
out the attack, the pod must possess the CAP_NET_RAW capability. Unfortunately, Kubernetes grants this capability by default.
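You can check from inside a pod whether it actually holds this capability. A minimal sketch that decodes the effective capability set from /proc (CAP_NET_RAW is capability number 13):

```python
# Run inside a pod: check whether CAP_NET_RAW (capability number 13) is in the
# effective capability set, i.e. whether this pod could craft raw packets.
CAP_NET_RAW = 13

with open("/proc/self/status") as f:
    cap_eff = next(
        int(line.split()[1], 16) for line in f if line.startswith("CapEff:")
    )

print("CAP_NET_RAW is", "present" if cap_eff & (1 << CAP_NET_RAW) else "absent")
```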
When we examined this issue, we tried to identify localhost services
that are natively deployed by Kubernetes. We found that by default, the
Kubernetes API server serves unauthenticated requests on localhost
through a port dubbed the insecure-port. The insecure-port exists to allow other control-plane components running on the master
(such as the scheduler and controller-manager) to easily talk to the api-server. Role-based access
control (RBAC) or any other authorization mechanisms are not enforced on
that port. Kubernetes installations frequently run the api-server as a
pod on the master node, meaning it is running alongside a kube-proxy
that has enabled route_localnet.
Alarmingly, this means that if your Kubernetes deployment didn’t disable the insecure-port,
hosts on the master node’s local network could exploit CVE-2020-8558 to
command the api-server and gain complete control over the cluster.
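A quick way to check whether your deployment is in that state is to issue an unauthenticated request from the master node itself. The sketch below assumes the insecure-port is on its default of 8080; if the call succeeds without any credentials, the port is enabled:

```python
import json
import urllib.request

# Run on the master node: the insecure-port defaults to 8080 on localhost and
# serves requests without any authentication or authorization.
try:
    with urllib.request.urlopen("http://127.0.0.1:8080/version", timeout=3) as resp:
        print("insecure-port is ENABLED:", json.load(resp))
except OSError:
    print("insecure-port appears to be disabled")
```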
Managed Kubernetes platforms such as GKE, EKS and AKS are better protected against CVE-2020-8558.
To begin with, the virtual networks of some cloud service providers (CSPs), such as Microsoft Azure,
don’t support layer-2 semantics and MAC-based routing. You can easily
see how this manifests – every AKS machine has the same MAC address:
12:34:56:78:9A:BC. This mitigates exploitation of CVE-2020-8558 from
other hosts on a node’s local network, but malicious pods possessing CAP_NET_RAW should still be able to carry out the attack.
Cloud-hosted Kubernetes offerings also tend to manage the Kubernetes
control plane and api-server for you and run them on a separate network
from the rest of the cluster. This protects the api-server, as it isn’t
exposed to the rest of the cluster. Even if a CSP ran the
api-server without disabling the insecure-port, attackers in the cluster wouldn't be able to access it, as they don't run on the same local network.
Still, if your CSP virtual network does support layer-2 routing,
malicious hosts in your cluster’s network could access localhost
services on the worker nodes.
The initial fix actually resides in the kubelet and adds mitigations around route_localnet: firewall (iptables) rules that cause nodes to drop external packets destined for 127.0.0.1. At the time of writing this post, route_localnet is still enabled by the kube-proxy, and there are ongoing discussions about disabling it.
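If you want to verify that a node carries the mitigation, a rough heuristic is to look for a rule that drops traffic destined for 127.0.0.0/8. A sketch (the exact rule text added by the fix can vary between Kubernetes versions, so treat this only as a heuristic; run as root):

```python
import subprocess

# Dump the node's iptables rules and look for one that drops traffic destined
# for 127.0.0.0/8 - the shape of the mitigation added by the fix.
rules = subprocess.run(
    ["iptables-save"], capture_output=True, text=True, check=True
).stdout

matches = [r for r in rules.splitlines() if "127.0.0.0/8" in r and "-j DROP" in r]
if matches:
    print("found localnet drop rule(s):")
    print("\n".join(matches))
else:
    print("no localnet drop rule found - is this node patched?")
```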
Even with those mitigations applied, there’s still a special case
where a local network attacker could send packets to nodes’ internal UDP
services. However, the attack only works if the victim node disables reverse path filtering (which normally isn’t case), and in this circumstance the attacker won’t be able to get the response. A patch is being developed, but it may be dropped as the attack depends on an insecure setting (rp_filter=0).
If all of the below are true, your cluster is vulnerable to CVE-2020-8558:
- Your cluster runs a kube-proxy version that hasn't been patched (i.e. older than v1.18.4, v1.17.7 or v1.16.11).
- Your nodes run localhost services that don't enforce authentication.
Additionally, if the following is also true, your cluster may be vulnerable to a complete takeover through the api-server insecure-port:
- Your api-server runs with the insecure-port enabled (it wasn't started with --insecure-port=0).
A node could be attacked either by a malicious host on the local network or by a malicious pod with CAP_NET_RAW running on the node.
CVE-2020-8558 can have some serious consequences. You should patch
your clusters as soon as possible. CVE-2020-8558 also serves as a
reminder that security best practices do work and significantly reduce attack surface. You should disable the api-server insecure-port, and if your pods don’t require CAP_NET_RAW,
there's no reason they should have that capability. While not related
to this specific issue, we also recommend following other security best
practices, such as enforcing RBAC and running containers as a non-root user.
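As an example of the CAP_NET_RAW recommendation, dropping the capability only takes a securityContext entry on the container. Here's a minimal sketch using the official Kubernetes Python client; the pod name, container name and image are placeholders:

```python
from kubernetes import client, config

config.load_kube_config()

# A minimal pod whose container explicitly drops CAP_NET_RAW, so it can't
# craft the raw packets this attack requires. Names and image are placeholders.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="no-net-raw"),
    spec=client.V1PodSpec(containers=[
        client.V1Container(
            name="app",
            image="nginx",
            security_context=client.V1SecurityContext(
                capabilities=client.V1Capabilities(drop=["NET_RAW"])
            ),
        )
    ]),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```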
Palo Alto Networks Prisma Cloud customers are protected from this vulnerability. Compliance rules
ensure your clusters are configured securely, blocking or alerting on: