2018 started with a major announcement about two vulnerabilities affecting modern CPUs from several vendors. This post summarizes the situation as of Jan 5, 2018.
Mysterious kernel updates
The first rumors emerged on January 1st 2018 in a blog post by python sweetness.
This blog post explains that the memory management subsystem of the Linux kernel was being rewritten “in a rush” and that the reasons had to be critical. This subsystem is among the most stable parts of the kernel, and modifications to it usually take years to be fully accepted by the Linux maintainers, so having major parts of it rewritten in such a short timeframe was very suspicious. Moreover, although this work was done in the open (because of the open source nature of the Linux kernel), comments were redacted in order to give away as little information as possible.
The changes implemented a patch named KAISER, which was meant to correct a weakness discovered in July 2017. The full paper is available here.
In parallel, late in 2017, members of the “Windows Insider” early-availability program also received patches making equivalent changes to the Windows memory subsystem.
Research leads to urgent action
It turns out that research into such low-level CPU vulnerabilities has continued and reached a point where urgent action is needed. A coordinated disclosure was initially planned for Friday, Jan 5, 2018, but the amount of leaked information triggered an early disclosure by Google Project Zero on the evening of January 3, 2018.
What the problem is
Without going into details, the problem lies in the combination of two optimization techniques:
- Every Operating System (Windows, Linux, macOS, ...) must handle both a Kernel context (low level) and several User/Application contexts (higher levels). Each context needs memory to store data. As an optimization technique, the Kernel memory normally remains mapped in the address space of each User context. It is the task of the Operating System to make sure that a User process is not allowed to read inside the Kernel memory, and this has worked well for decades.
- CPUs also have optimization techniques of their own, and one of them, called “speculative execution”, allows a CPU to execute instructions “in advance” and drop the results if it later turns out that those instructions were not required. A side effect of this “speculative execution” is that data is fetched from main memory into the CPU caches (high speed memory located inside the CPU), and it stays there even when the results are dropped.
The result of combining those two optimization techniques is that a standard program running in a User context can trigger the CPU “speculative execution” to fetch data into the caches and later use the cache as an oracle to leak kernel memory, bypassing the standard controls implemented by the Operating System. The content of the kernel memory can then be used to simply steal data (like private/encryption keys) or to mount more sophisticated attacks.
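To make the "cache as an oracle" idea more concrete, here is a toy simulation in Python. It is only a sketch and not an exploit: no real memory is read and no real timing is measured. The `ToyCPU` class, the `FAST`/`SLOW` costs and the 256 "probe lines" are all hypothetical names invented for this illustration. The model assumes that a refused read of kernel memory still leaves one cache line loaded, selected by the value of the secret byte.

```python
"""Toy simulation of a cache-timing oracle. Not an exploit: nothing real is read."""

FAST, SLOW = 1, 100  # made-up access costs for cached vs. uncached lines


class ToyCPU:
    """Hypothetical CPU model with protected kernel memory and a cache."""

    def __init__(self, kernel_memory: bytes):
        self._kernel_memory = kernel_memory  # off limits to "user" code
        self._cached_lines = set()           # probe lines currently in cache

    def flush_cache(self) -> None:
        self._cached_lines.clear()

    def user_read_kernel(self, address: int) -> int:
        """A user-mode read of kernel memory: always refused.

        Before the refusal takes effect, the model "speculatively" touches
        the probe line selected by the secret byte, and that cache fill is
        not rolled back.
        """
        secret = self._kernel_memory[address]
        self._cached_lines.add(secret)
        raise PermissionError("user code may not read kernel memory")

    def probe_time(self, line: int) -> int:
        """Cost of touching one of the 256 user-owned probe lines."""
        return FAST if line in self._cached_lines else SLOW


def leak_byte(cpu: ToyCPU, address: int) -> int:
    cpu.flush_cache()
    try:
        cpu.user_read_kernel(address)   # architecturally, this fails
    except PermissionError:
        pass
    timings = [cpu.probe_time(line) for line in range(256)]
    return timings.index(min(timings))  # the one fast line reveals the byte


def leak(cpu: ToyCPU, length: int) -> bytes:
    return bytes(leak_byte(cpu, address) for address in range(length))


if __name__ == "__main__":
    cpu = ToyCPU(kernel_memory=b"private key")
    print(leak(cpu, 11))
```

The "user" code never receives a value from `user_read_kernel`: every call ends in an error, just as the Operating System intends. The secret is recovered purely from which probe line is fast afterwards, which is why permission checks alone do not stop the leak.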
The hardware “speculative execution” cannot be deactivated, which is why these bugs will never be 100% patched. However, the immediate response is to modify the Operating Systems to render kernel memory invisible from the User contexts. This implies more work for the kernel on every switch between User and Kernel contexts and will have a performance impact. It also appears that vendors are preparing BIOS updates in order to mitigate the risks.
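On Linux, one way to check whether a given machine has received the Operating System changes is to read the status files that patched kernels expose under `/sys/devices/system/cpu/vulnerabilities`. The sketch below assumes a kernel recent enough to provide that directory (it was added alongside the fixes and is not mentioned in the sources above); on older or non-Linux systems the function simply returns an empty result. The function name `read_mitigation_status` is made up for this example.

```python
from pathlib import Path

# Assumption: the kernel exposes this sysfs directory (recent Linux kernels only).
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(directory=SYSFS_DIR) -> dict:
    """Return {vulnerability name: status line} as reported by the kernel.

    An empty dict means the directory does not exist, i.e. the kernel does
    not report anything (too old, or not Linux).
    """
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        if not entry.is_file():
            continue
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    report = read_mitigation_status()
    if not report:
        print("This kernel does not report vulnerability status.")
    for name, state in report.items():
        print(f"{name}: {state}")
```

An empty report is not proof of safety: it more likely means the kernel predates the patches and should be updated.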
Issue and concrete consequences
The short-term plan is therefore to patch both the firmware and the Operating Systems on affected systems (servers, workstations, phones…).
However, the problem becomes much bigger once we add cloud computing to the mix. In this case, a “local” attack from one virtual machine against the hypervisor kernel potentially becomes a “remote” attack from one customer against another. Cloud providers therefore have to react quickly, and both Amazon AWS and Microsoft Azure are in the process of “rebooting” their clouds. Smaller providers should be challenged on their patching strategy.
As a final word, we can add that even people with master’s degrees in computer hardware engineering struggle to fully understand the extent of those vulnerabilities, so the future will definitely bring us more clarifications, details and attack vectors.
Vendors’ responses published to date (as of Jan 10, 2018 - will be updated regularly)