Want to build your own kernel in Rust? See the Bare Metal Rust page for more resources and more posts in this series.

Rust is a really fun language: It allows me to work on low-level kernel code, but it also allows me to wrap my code up in clean, high-level APIs. If this sounds interesting, you should really check out Philipp Oppermann’s blog posts about writing a basic x86_64 operating system kernel in Rust. He walks you through booting the kernel, entering long mode, getting Rust running, and printing text to the screen.

Once you get a basic kernel running, you’ll probably want to start working on basic I/O, which requires interrupts. At this point, you’ll find that pretty much every tutorial dives right into the in and out instructions. For example, if you look at the OSDev.org introduction to interrupts, the very first code you’ll see is (comments added):

mov al,20h  ; Move interrupt acknowledgment code into al.
out 20h,al  ; Write al to PIC on port 0x20.

Here, we’re talking to the PIC (“Programmable Interrupt Controller”), and we’re telling it that we’ve finished handling a processor interrupt. To do this, we need to write an 8-bit status code to the I/O port at address 0x20.

Traditionally, we would wrap this up in an outb (“out byte”) function, which might look something like this in Rust:

// The asm! macro requires a nightly build of Rust, and
// we need to opt-in explicitly.
#![feature(asm)]

unsafe fn outb(value: u8, port: u16) {
    asm!("outb %al, %dx" ::
         "{dx}"(port), "{al}"(value) ::
         "volatile");
}

This writes an 8-bit value to the specified port. It uses the unstable Rust extension asm!, which allows us to use GCC/LLVM-style inline assembly. We’d invoke it like this:

// outb is an unsafe fn, so the call needs an unsafe block.
unsafe { outb(0x20, 0x20); }
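
The `#![feature(asm)]` syntax above reflects the nightly compiler this post was written against. Inline assembly has since been stabilized, with a different syntax, as `core::arch::asm!`. As a rough sketch, assuming an x86 or x86_64 target and a recent stable toolchain (this is not necessarily how the library itself is written), `outb` and its counterpart `inb` might look like this today:

```rust
use core::arch::asm;

/// Write the byte `value` to the I/O port `port`.
///
/// # Safety
/// The caller must make sure that writing to this port cannot break
/// memory safety. Outside of kernel mode the instruction will fault
/// unless the process has been granted I/O privileges.
pub unsafe fn outb(value: u8, port: u16) {
    // Intel syntax: the port number goes in DX, the data in AL.
    unsafe {
        asm!(
            "out dx, al",
            in("dx") port,
            in("al") value,
            options(nomem, nostack, preserves_flags),
        );
    }
}

/// Read one byte from the I/O port `port`.
///
/// # Safety
/// Same requirements as `outb`: reading a port can have side effects
/// on the device behind it.
pub unsafe fn inb(port: u16) -> u8 {
    let value: u8;
    unsafe {
        asm!(
            "in al, dx",
            out("al") value,
            in("dx") port,
            options(nomem, nostack, preserves_flags),
        );
    }
    value
}
```

The argument order `(value, port)` is kept the same as in the original `outb`, so existing calls would not need to change.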

But let’s see if we can wrap a higher-level API around an I/O port.

What we want to wrap

First, let’s look at the complete family of functions we want to wrap:

unsafe fn inb(port: u16) -> u8 { ... }
unsafe fn outb(value: u8, port: u16) { ... }
unsafe fn inw(port: u16) -> u16 { ... }
unsafe fn outw(value: u16, port: u16) { ... }
unsafe fn inl(port: u16) -> u32 { ... }
unsafe fn outl(value: u32, port: u16) { ... }

The implementations of these functions are all similar to outb above. The functions whose names begin with in all read from port, and the functions beginning with out write to it. We use the b functions for 8-bit-wide ports, the w functions for 16-bit-wide ports, and the l functions for 32-bit ones.

API design & safety

We could imagine an API which looks something like:

// Create a port at location 0x3F8, the data line
// associated with the serial port COM1.
let mut port = Port::new(0x3F8);

// Write `out_val` to our port.
port.write(out_val);

// Receive `in_val` from our port.
let in_val = port.read();

But of course, this is Rust, so we need to ask some questions about safety. We call a Rust API “unsafe” if we can use it to cause undefined behavior. Could we do that with this API? Well, it depends on the port address: Port::new(0x3F8) is used to read and write data on COM1, so it’s harmless, but Port::new(0x20) allows us to reprogram the interrupt controller, which can corrupt the stack. So we should mark Port::new as an unsafe API, and call it as:

let mut port = unsafe { Port::new(0x3F8) };

But we probably don’t need to make read or write unsafe, because we can leave it up to the unsafe code which calls Port::new to decide who can safely have access to the port. Once some unsafe code somewhere has taken responsibility for the port, it’s their problem.

So here’s the first draft of our API:

pub struct Port {
    port: u16,
}

impl Port {
    pub unsafe fn new(port: u16) -> Port {
        Port { port: port }
    }

    pub fn read(&mut self) -> u8 {
        unsafe { inb(self.port) }
    }

    pub fn write(&mut self, value: u8) {
        unsafe { outb(value, self.port) }
    }
}

This works, but only for u8. We can do better!
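
To try the first draft without booting a kernel, one option is to swap `inb` and `outb` for stand-ins that read and write an in-memory table. The sketch below does that. `FAKE_PORTS`, the stand-in function bodies, and the `write_then_read` helper are assumptions made purely for illustration; on real hardware the two primitives would execute `in` and `out` instructions instead.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the x86 I/O address space: one byte per port.
static FAKE_PORTS: Mutex<[u8; 65536]> = Mutex::new([0; 65536]);

// Simulated primitives. The real ones would use inline assembly.
unsafe fn inb(port: u16) -> u8 {
    FAKE_PORTS.lock().unwrap()[port as usize]
}

unsafe fn outb(value: u8, port: u16) {
    FAKE_PORTS.lock().unwrap()[port as usize] = value;
}

pub struct Port {
    port: u16,
}

impl Port {
    /// # Safety
    /// The caller takes responsibility for this address being safe to use.
    pub unsafe fn new(port: u16) -> Port {
        Port { port }
    }

    pub fn read(&mut self) -> u8 {
        unsafe { inb(self.port) }
    }

    pub fn write(&mut self, value: u8) {
        unsafe { outb(value, self.port) }
    }
}

/// Only construction needs `unsafe`; reading and writing are safe calls.
/// The value comes back unchanged only because the table is simulated.
pub fn write_then_read(byte: u8) -> u8 {
    let mut port = unsafe { Port::new(0x3F8) };
    port.write(byte);
    port.read()
}
```

Note how the single `unsafe` block sits at the point where the port address is chosen, which is exactly where the safety decision is made.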

Trying to add a type parameter

We could try to add a type parameter:

pub struct Port<T> {
    port: u16,
}

But when we compile this, we get a very helpful error:

src/lib.rs:1:17: 1:18 error: parameter `T` is never used [E0392]
src/lib.rs:1 pub struct Port<T> {
                             ^
src/lib.rs:1:17: 1:18 help: run `rustc --explain E0392` to see
  a detailed explanation
src/lib.rs:1:17: 1:18 help: consider removing `T` or using a
  marker such as `core::marker::PhantomData`

The Rust documentation explains PhantomData in detail. It’s basically a zero-sized placeholder type which we can use as follows:

use core::marker::PhantomData;

pub struct Port<T> {
    port: u16,
    phantom: PhantomData<T>,
}

We can now update new, adding the type parameter <T> everywhere we need it:

impl<T> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port: port, phantom: PhantomData }
    }

    // ...
}

We initialize the field phantom to the value PhantomData, which is the only possible value of type PhantomData<T>. This probably seems really weird to everybody except Haskell and ML programmers, who’ve seen types with only one value before.
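
One way to convince yourself that the placeholder costs nothing at run time is to compare sizes. A small check along these lines (the `port_size` and `address` helpers are made up for this illustration, and `std` is used here where a kernel would use `core`) suggests that `Port<T>` takes no more space than the bare `u16`, whatever `T` is:

```rust
use std::marker::PhantomData;
use std::mem::size_of;

pub struct Port<T> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }

    /// Hypothetical accessor, added only so the example can inspect the port.
    pub fn address(&self) -> u16 {
        self.port
    }
}

/// Size in bytes of a `Port<T>` for any choice of `T`.
pub fn port_size<T>() -> usize {
    size_of::<Port<T>>()
}
```

Calling `port_size::<u8>()` and `port_size::<u32>()` should give the same answer as `size_of::<u16>()`, since `PhantomData<T>` itself has size zero.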

Problem 1: We don’t want anyone creating Port<f64>

As written above, Port<T> could be used with any type T, including something weird like f64. But this won’t work, because our CPU’s in and out instructions only work with the types u8, u16 and u32. So we need to restrict the possible values of T, which we can do by defining a new trait:

pub trait InOut {}

impl InOut for u8 {}
impl InOut for u16 {}
impl InOut for u32 {}

We can then add a type constraint to T, giving us T: InOut in several places:

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    // ...
}
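
To see the constraint at work, here is a minimal sketch. The `make_ports` and `address` helpers and the port addresses other than 0x60 are hypothetical examples, not part of the library. The three supported widths compile, while the commented-out `Port<f64>` line should be rejected by the compiler:

```rust
use std::marker::PhantomData;

pub trait InOut {}

impl InOut for u8 {}
impl InOut for u16 {}
impl InOut for u32 {}

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }

    /// Hypothetical accessor, used here only to inspect the result.
    pub fn address(&self) -> u16 {
        self.port
    }
}

/// Builds one port of each supported width (example addresses only).
pub fn make_ports() -> (Port<u8>, Port<u16>, Port<u32>) {
    // Uncommenting the next line should fail to compile with an error
    // along the lines of "the trait bound `f64: InOut` is not satisfied":
    // let bad: Port<f64> = unsafe { Port::new(0x60) };
    unsafe { (Port::new(0x60), Port::new(0x1F0), Port::new(0xCF8)) }
}
```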

Problem 2: What do we put in read and write?

But now, if we try to define the read function, we have no way to decide whether to call inb, inw or inl:

impl<T: InOut> Port<T> {
    // ...
    
    pub fn read(&mut self) -> T {
        unsafe { /* inb, inw or inl??? */ }
    }

    // ...
}

If T were u8, we could use inb like before. But u16 would require inw, and u32 would require inl. We can sort this out by adding some functions to our InOut trait:

pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
    unsafe fn port_out(port: u16, value: Self);
}

We then need to define these functions for each type that implements the trait:

impl InOut for u8 {
    unsafe fn port_in(port: u16) -> u8 { inb(port) }
    unsafe fn port_out(port: u16, value: u8) { outb(value, port); }
}

impl InOut for u16 {
    unsafe fn port_in(port: u16) -> u16 { inw(port) }
    unsafe fn port_out(port: u16, value: u16) { outw(value, port); }
}

impl InOut for u32 {
    unsafe fn port_in(port: u16) -> u32 { inl(port) }
    unsafe fn port_out(port: u16, value: u32) { outl(value, port); }
}

This finally gives us:

impl<T: InOut> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port: port, phantom: PhantomData }
    }

    pub fn read(&mut self) -> T {
        unsafe { T::port_in(self.port) }
    }

    pub fn write(&mut self, value: T) {
        unsafe { T::port_out(self.port, value); }
    }
}
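
Putting the pieces together, the whole design can be exercised on an ordinary desktop if the hardware access is replaced by a simulation. In the sketch below, `FAKE_IO`, `load`, `store`, and `round_trip` are assumptions introduced for the example: each port address is modeled as a 32-bit latch, and the three `InOut` implementations truncate or widen as needed. A real kernel would call `inb`, `inw`, `inl` and their `out` counterparts instead.

```rust
use std::marker::PhantomData;
use std::sync::Mutex;

// Hypothetical model of the I/O space: one 32-bit latch per port address.
static FAKE_IO: Mutex<[u32; 65536]> = Mutex::new([0; 65536]);

fn load(port: u16) -> u32 {
    FAKE_IO.lock().unwrap()[port as usize]
}

fn store(port: u16, value: u32) {
    FAKE_IO.lock().unwrap()[port as usize] = value;
}

pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
    unsafe fn port_out(port: u16, value: Self);
}

impl InOut for u8 {
    unsafe fn port_in(port: u16) -> u8 { load(port) as u8 }
    unsafe fn port_out(port: u16, value: u8) { store(port, value as u32) }
}

impl InOut for u16 {
    unsafe fn port_in(port: u16) -> u16 { load(port) as u16 }
    unsafe fn port_out(port: u16, value: u16) { store(port, value as u32) }
}

impl InOut for u32 {
    unsafe fn port_in(port: u16) -> u32 { load(port) }
    unsafe fn port_out(port: u16, value: u32) { store(port, value) }
}

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    pub unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }

    pub fn read(&mut self) -> T {
        unsafe { T::port_in(self.port) }
    }

    pub fn write(&mut self, value: T) {
        unsafe { T::port_out(self.port, value); }
    }
}

/// Writes `value` to the simulated port and reads it straight back.
/// The width of the access is chosen entirely by the type `T`.
pub fn round_trip<T: InOut>(address: u16, value: T) -> T {
    let mut port: Port<T> = unsafe { Port::new(address) };
    port.write(value);
    port.read()
}
```

The point of the exercise is that `read` and `write` never mention a width: `T::port_in` and `T::port_out` are resolved at compile time from the port's type.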

An experimental enhancement: Creating Ports at compile time

Using a Mutex from the Rust spin crate, we might try to create a global port object protected by a lock:

extern crate spin;
use spin::Mutex;

/// Port used to access a PS/2 keyboard.
static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe {
    Port::new(0x60)
});

If we do this, Rust tells us:

src/lib.rs:24:5: 24:20 error: function calls in statics are limited to
  constant functions, struct and enum constructors [E0015]
src/lib.rs:24     Port::new(0x60)
                  ^~~~~~~~~~~~~~~

It would be nice if we could run Port::new at compile time. Happily, if we use the nightly build of Rust, we can try out the new const_fn feature:

#![feature(const_fn)]

impl<T: InOut> Port<T> {
    pub const unsafe fn new(port: u16) -> Port<T> {
        // ...

See that const in the function declaration? That means we can now safely write:

static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe {
    Port::new(0x60)
});

…and get our Mutex-protected port! We can access this as:

let scancode = KEYBOARD.lock().read();
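
For readers who want to experiment with this pattern outside a kernel, here is a hedged sketch that compiles on a hosted target. It makes several assumptions: `std::sync::Mutex` stands in for `spin::Mutex`, the `u8` implementation returns a fixed made-up value instead of touching hardware, and `next_scancode` is a hypothetical helper. As far as I know, a `const unsafe fn` like this one no longer needs a feature gate on recent stable compilers.

```rust
use std::marker::PhantomData;
use std::sync::Mutex; // hosted stand-in for spin::Mutex

pub trait InOut {
    unsafe fn port_in(port: u16) -> Self;
}

impl InOut for u8 {
    // Simulated read: pretend the device always reports the byte 0x1E.
    unsafe fn port_in(_port: u16) -> u8 {
        0x1E
    }
}

pub struct Port<T: InOut> {
    port: u16,
    phantom: PhantomData<T>,
}

impl<T: InOut> Port<T> {
    // `const` lets this constructor run while the static is initialized.
    pub const unsafe fn new(port: u16) -> Port<T> {
        Port { port, phantom: PhantomData }
    }

    pub fn read(&mut self) -> T {
        unsafe { T::port_in(self.port) }
    }
}

/// Port used to access a PS/2 keyboard.
static KEYBOARD: Mutex<Port<u8>> = Mutex::new(unsafe {
    Port::new(0x60)
});

pub fn next_scancode() -> u8 {
    // std's lock() returns a Result because of lock poisoning, hence the
    // unwrap(); spin's lock() hands back the guard directly.
    KEYBOARD.lock().unwrap().read()
}
```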

Available as a crate! (and where to next?)

This library is available as a crate, which you can also download from crates.io. If you follow Philipp Oppermann’s instructions for using cargo to build a kernel, you can include this by adding cpuio to your Cargo.toml file:

[dependencies]
cpuio = "*"

And using this library as a building block, we can tackle:

  • Setting up interrupts (needed for I/O).
  • Reading keyboard input.
  • Writing to the serial port (useful for debugging kernels with QEMU).

If people are interested, I’m happy to dive into these subjects. And if you have any questions, I’ll be following the discussion at Reddit.
