x86 CPU Rings: Why Only Two Are Used?

Understanding System Behavior: Resource Utilization and Operating Systems
As you learn how operating systems interact with the hardware beneath them, some design choices can look like wasted capability. x86 processors, for example, provide four privilege rings, yet mainstream operating systems use only two of them.
Today's discussion, drawn from the SuperUser Q&A forum, answers a reader's question about why that is.
The Source of the Answers: SuperUser
This particular question and its insightful answer originate from SuperUser, a valuable resource within the Stack Exchange network. Stack Exchange is a collection of community-driven question and answer websites.
SuperUser provides a platform for users to seek and share knowledge related to advanced computing topics. It’s a great place to find explanations for complex system behaviors.
Image Attribution
The accompanying image used in the original post is credited to Lemsipmatt, and was sourced from Flickr.
Understanding x86 CPU Rings
A SuperUser user, AdHominem, recently inquired about the utilization of CPU rings in x86 processors. Specifically, they questioned why systems running Linux or Windows only employ Ring 0 for kernel mode and Ring 3 for user mode.
The core of the question revolves around the purpose of having four distinct rings when practical implementation consistently limits usage to just two.
The History of CPU Rings
The concept of protection rings originated with the Multics operating system and the hardware built to run it in the 1960s. That design introduced multiple privilege levels to enhance system security and stability.
On x86, the rings are numbered 0 to 3 and represent decreasing levels of privilege. Ring 0 has the highest level of access, while Ring 3 has the most restricted access.
Original Intent and Design
Initially, the four rings were envisioned to support a more granular security model. The idea was to allow for intermediate layers of software, such as device drivers or virtual machines, to operate at different privilege levels.
This hierarchical structure would theoretically limit the damage caused by software errors or malicious code. Each ring would act as a protective barrier.
Why Only Two Rings Are Commonly Used
Despite the initial design, practical considerations led to the widespread adoption of only two rings – Ring 0 and Ring 3. The complexity of managing and securing software across four distinct privilege levels proved substantial.
Furthermore, the benefits of intermediate rings were often outweighed by the performance overhead they introduced. Every transition between rings carries a cost, so software that crosses ring boundaries frequently pays that cost again and again.
Ring Usage in Modern Operating Systems
In modern x86 systems, Ring 0 is reserved for the operating system kernel and the kernel-mode drivers that run alongside it. This gives the kernel unrestricted access to system resources.
User applications, conversely, operate in Ring 3, with limited access to hardware and system functions. This separation prevents user-level programs from directly interfering with the kernel or other applications.
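One place this split is visible is the code segment selector: on x86, the low two bits of the value in the CS register hold the current privilege level. The short Python sketch below decodes a 16-bit selector into its fields. The name decode_selector and the sample values 0x10 and 0x33 (the selectors usually reported for kernel and user code on 64-bit Linux) are assumptions chosen for illustration; they do not come from the original discussion.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # descriptor table index (bits 3-15)
    table: str  # "GDT" or "LDT", chosen by the table indicator (bit 2)
    rpl: int    # requested privilege level (bits 0-1); for CS this is the ring


def decode_selector(value: int) -> Selector:
    """Split a 16-bit x86 segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return Selector(
        index=value >> 3,
        table="LDT" if value & 0x4 else "GDT",
        rpl=value & 0x3,
    )


# Illustrative values only (assumed, see the paragraph above).
KERNEL_CS = 0x10
USER_CS = 0x33

for name, sel in (("kernel", KERNEL_CS), ("user", USER_CS)):
    print(f"{name:6} CS={sel:#06x} -> {decode_selector(sel)}")
```

With these sample values the privilege field decodes to 0 for the kernel selector and 3 for the user selector. Rings 1 and 2 would appear as 1 and 2 in the same field, but no mainstream operating system loads such a selector.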
AMD64 and Ring Support
The transition to the AMD64 architecture did not alter the fundamental ring structure of x86 processors. It continues to support four rings, but the practical usage remains largely confined to Ring 0 and Ring 3.
While AMD64 introduced other significant architectural improvements, it did not necessitate or implement a change in how CPU rings are utilized by operating systems.
The Persistence of Four Rings
Even though only two rings are commonly used, the presence of all four rings remains a part of the x86 instruction set. Removing them would introduce backward compatibility issues.
Maintaining the four-ring structure ensures that older software and operating systems can continue to function without modification. It represents a legacy consideration.
Understanding Memory Protection Rings
A SuperUser community member, Jamie Hanrahan, provides insight into this topic.
There are two primary reasons for this, plus a third that is specific to Windows NT.
First, although x86 processors offer four rings of memory protection, the granularity of that protection is the segment. Each segment descriptor assigns its segment a ring (privilege level) from 0 to 3, along with other protections such as write restrictions. However, only a limited number of segment descriptors are available. Most operating systems want much finer-grained memory protection, ideally down to the individual page.
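To make the segment-level protection concrete, here is a minimal Python sketch that pulls the ring and the write permission out of a 64-bit segment descriptor. It assumes the standard layout for code and data descriptors, in which the access byte occupies bits 40 to 47 and the two-bit privilege field sits in bits 45 and 46. The names SegmentProtection and decode_protection are hypothetical, and the sample descriptors are illustrative values rather than anything taken from a real system.

```python
from typing import NamedTuple


class SegmentProtection(NamedTuple):
    present: bool   # P bit: the segment is usable
    dpl: int        # descriptor privilege level, i.e. the ring (0-3)
    is_code: bool   # executable bit: code segment rather than data segment
    writable: bool  # data segments only: writes are allowed


def decode_protection(descriptor: int) -> SegmentProtection:
    """Extract protection fields from a code/data segment descriptor."""
    access = (descriptor >> 40) & 0xFF  # access byte, bits 40-47
    is_code = bool(access & 0x08)
    return SegmentProtection(
        present=bool(access & 0x80),
        dpl=(access >> 5) & 0x3,        # two bits, hence four possible rings
        is_code=is_code,
        # For data segments bit 1 means "writable"; code is never writable.
        writable=(not is_code) and bool(access & 0x02),
    )


# Illustrative descriptors: a writable ring 0 data segment and a
# read-only ring 3 data segment.
for desc in (0x00CF92000000FFFF, 0x00CFF0000000FFFF):
    print(f"{desc:#018x} -> {decode_protection(desc)}")
```

The two-bit field is what gives segments their four rings. The drawback described above is that each descriptor covers a whole segment, not a page.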
This is where page table-based protection comes in. Most modern x86 operating systems largely ignore the segmentation mechanism and rely instead on the protection bits in page table entries. One of these is the so-called "privileged" bit (the User/Supervisor bit in Intel's terminology), which controls whether the processor must be running at a privileged level to access the page. The privileged levels are PL 0, 1, and 2. Because this is a single bit, page-level protection distinguishes only two modes: a page is either accessible from non-privileged code (PL 3) or it is not. As far as memory protection is concerned, therefore, there are effectively only two rings. Providing four rings per page would require two protection bits in every page table entry, as segment descriptors have, but page table entries do not have them.
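The effect of that single bit can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of real hardware behavior in full: it looks only at the Present bit (bit 0) and the User/Supervisor bit (bit 2) of a page table entry and ignores write protection, no-execute, and newer features such as SMEP and SMAP. The function and constant names are hypothetical.

```python
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # not checked in this simplified model
PTE_USER = 1 << 2      # set: reachable from ring 3; clear: supervisor only


def page_accessible(pte: int, cpl: int) -> bool:
    """Simplified check: may code running at ring `cpl` touch this page?"""
    if cpl not in (0, 1, 2, 3):
        raise ValueError("x86 privilege levels are 0 to 3")
    if not pte & PTE_PRESENT:
        return False
    if cpl == 3:
        return bool(pte & PTE_USER)
    # Rings 0, 1, and 2 all count as "supervisor" to the paging unit.
    return True


KERNEL_PAGE = PTE_PRESENT | PTE_WRITABLE
USER_PAGE = PTE_PRESENT | PTE_WRITABLE | PTE_USER

for cpl in range(4):
    print(f"ring {cpl}: kernel page={page_accessible(KERNEL_PAGE, cpl)}, "
          f"user page={page_accessible(USER_PAGE, cpl)}")
```

In this model rings 0, 1, and 2 get identical answers for every page, which is the point made above: once protection is enforced per page, code in Ring 1 or Ring 2 is no more restricted than code in Ring 0.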
Second, operating system portability plays a significant role. Unix demonstrated that an operating system could be made relatively portable across diverse processor architectures, and that this was a desirable trait. Some processors support only two rings, so by not depending on multiple rings, operating system developers made their systems easier to port.
A third reason is specific to the development of Windows NT. The NT designers, David Cutler and the team he recruited from DEC Western Region Labs, had extensive experience with VMS; in fact, Cutler and several of the others were among the original VMS architects. The VAX processor, for which VMS was created, does feature four rings, and VMS uses all of them.
However, the components that ran in VMS Rings 1 and 2 (Record Management Services and the command language interpreter, or CLI, respectively) were omitted from the NT design. Ring 2 in VMS existed mainly to preserve the user's CLI environment from one program to the next, and Windows has no such concept: its CLI runs as an ordinary process. As for Ring 1, RMS code running there had to call into Ring 0 frequently, and ring transitions are expensive. It proved more efficient to go straight to Ring 0, especially since NT has no equivalent of RMS.
The reason x86 implemented four rings while operating systems didn't fully utilize them lies in the timing of their development. Many x86 system programming features were designed before the advent of NT or modern Unix-like kernels. The ultimate use cases for these features were not yet defined. The introduction of paging on x86 enabled the implementation of kernels akin to Unix or VMS.
Modern x86 operating systems largely ignore segmentation; they typically set up the CS, DS, and SS segments with a base address of 0 and a size of 4 GB. The FS and GS segments are sometimes used to point to key operating system data structures. Features such as task state segments (TSS) are likewise mostly ignored. The TSS mechanism was intended for thread context switching, but its side effects proved problematic, so modern systems switch context in software instead. x86 NT uses hardware task switching only in exceptional circumstances, such as a double fault exception.
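A "flat" segment of this kind is easy to express in code. The Python sketch below packs a base, a limit, an access byte, and a flags nibble into a 64-bit descriptor, then recovers the segment size from it. It assumes the standard 32-bit descriptor layout; the helper names make_descriptor and segment_size are hypothetical, and the access-byte values (0x9A and 0x92 for ring 0 code and data, 0xFA and 0xF2 for ring 3) are the ones commonly seen in flat-model setups, used here as an illustration.

```python
def make_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack a code/data segment descriptor into its 64-bit encoding."""
    desc = limit & 0xFFFF                  # limit bits 0-15
    desc |= (base & 0xFFFFFF) << 16        # base bits 0-23
    desc |= (access & 0xFF) << 40          # access byte (P, DPL, S, type)
    desc |= ((limit >> 16) & 0xF) << 48    # limit bits 16-19
    desc |= (flags & 0xF) << 52            # flags nibble: G, D/B, L, AVL
    desc |= ((base >> 24) & 0xFF) << 56    # base bits 24-31
    return desc


def segment_size(descriptor: int) -> int:
    """Size in bytes implied by the limit and the granularity (G) bit."""
    limit = (descriptor & 0xFFFF) | (((descriptor >> 48) & 0xF) << 16)
    page_granular = bool((descriptor >> 55) & 1)
    return (limit + 1) * 4096 if page_granular else limit + 1


FLAT_LIMIT = 0xFFFFF  # 2**20 units of 4 KiB when G is set
FLAGS_32BIT = 0xC     # G=1 (4 KiB granularity), D/B=1 (32-bit segment)

flat_segments = {
    "ring 0 code": make_descriptor(0, FLAT_LIMIT, 0x9A, FLAGS_32BIT),
    "ring 0 data": make_descriptor(0, FLAT_LIMIT, 0x92, FLAGS_32BIT),
    "ring 3 code": make_descriptor(0, FLAT_LIMIT, 0xFA, FLAGS_32BIT),
    "ring 3 data": make_descriptor(0, FLAT_LIMIT, 0xF2, FLAGS_32BIT),
}

for name, desc in flat_segments.items():
    print(f"{name}: {desc:#018x}, size {segment_size(desc) // 2**30} GiB")
```

Every segment in this sketch starts at address 0 and spans the whole 4 GB address space, so segmentation adds no isolation of its own and all real protection falls to the page tables.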
With the advent of the x64 architecture, many of these underutilized features were dropped. AMD consulted operating system kernel teams about which x86 features they needed, which they did not want, and what they would like added. Segments survive on x64 only in vestigial form, hardware task switching does not exist in 64-bit mode, and operating systems continue to use just two rings.