Writing a Unix-like OS in Rust

octox is a Unix-like Operating System completely implemented in Rust from scratch. I began implementing it as a learning OS, inspired by xv6-riscv. In this post, I'll discuss how the features of Rust are utilized in the OS implementation.

Overview

Features of octox include:

  • The kernel, userland and build system are all implemented in Rust.
  • Uses the standard features of the Rust language wherever possible.
  • The kernel can also be used as a library crate.
  • Userland includes a library whose usability is similar to Rust's std; Unix commands, including the shell, are implemented with it.
  • Filesystem with logging capabilities.
  • Supports multicore, featuring a preemptive kernel that allows multiple processes to run on multiple cores simultaneously.

However, I think it's quicker to check out what kind of OS it is for yourself, so please try it from https://github.com/o8vm/octox. All you need is Rust and qemu. In fact, this is one of the great benefits of having the entire build system written in Rust.
As far as I've checked, you can build and test the OS on Windows (mingw64 or WSL), Linux, and macOS using the exact same steps. No features beyond Rust's standard capabilities are used, and there's no need for environment-specific adjustments.

demo

I hope you’ve experienced the excellence of the build system, but there are many other advantages to using Rust for OS development beyond just the build process.

Why Rust?

The merits of the Rust language have already been discussed in various places, so it may not be necessary to reiterate them here. Primarily, Rust is optimal for OS implementation for the following reasons:

  • The type-safe type system ensures that undefined behavior does not occur.
  • Memory safety prevents invalid memory operations such as use-after-free and access to uninitialized memory.
  • Due to the constraints imposed by type safety and memory safety, writing concurrent programs is also straightforward.
  • However, for implementing low-level features, the use of the unsafe keyword allows for programming that is not type-safe.
  • Modern language features
  • Easy cross-compilation and custom builds

Furthermore, to fully enjoy the benefits of the Rust language with minimal effort, octox is implemented with attention to the following points:

  • Maximizing the utilization of cargo, Rust's standard build tool.
  • Maximizing the use of Rust's standard types; for types unavailable under no_std, implementing alternatives with equivalent functionality.
  • Minimizing the use of unsafe blocks.

Implementation

Here are some specific examples.

Configuration and Build

The kernel of octox is located in src/kernel, while userland resides in src/user. The program for creating the file system, mkfs, can be found in src/mkfs. Among these, the kernel can be used directly as a library, and it is used both for building userland and for building mkfs. The build process is driven by cargo's build.rs build script. build.rs is a regular Rust program that allows the build to be customized: it can switch the compilation target during the build and even prepare for OS boot by creating the file system as part of the build.

build.rs:


let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
// build user programs
let (uprogs_src_path, uprogs) = build_uprogs(&out_dir);  // target = riscv64gc-unknown-none-elf
// build mkfs
let mkfs_path = build_mkfs(&out_dir); // target = host system
// build fs.img
let fs_img = PathBuf::from(std::env::var("CARGO_MANIFEST_DIR").unwrap()).join("target").join("fs.img");
let mut mkfs_cmd = Command::new(&mkfs_path);
mkfs_cmd.status().expect("mkfs fs.img failed");

There is also a src/user/build.rs in the userland crate, which is executed within the above build_uprogs() function. Using the libkernel crate, it automatically generates the system call wrapper library for the userland side, located at src/user/usys.rs. Each userland program utilizes the automatically generated src/user/usys.rs to issue system calls.

src/user/build.rs:


fn main() {
    let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
    let mut usys_rs = File::create(out_dir.join("usys.rs")).expect("couldn't create src/user/usys.rs");
    usys_rs.write_all("// Created by build.rs\n\n".as_bytes()).expect("src/user/usys.rs: write error");
    for syscall_id in SysCalls::into_enum_iter().skip(1) {
        usys_rs.write_all(syscall_id.gen_usys().as_bytes()).expect("usys write error");
    }
}
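
To give a concrete picture, the wrapper emitted into usys.rs for each system call can be imagined roughly as below. This is a hypothetical reconstruction for the riscv64 target, not the actual generated code; the real signatures come from SysCalls::signature() and may differ.


// Hypothetical sketch of a generated wrapper (the real usys.rs is produced
// by gen_usys() and may differ in names and signatures).
pub fn fork() -> isize {
    let ret: isize;
    unsafe {
        core::arch::asm!(
            "ecall",               // trap into the kernel
            in("a7") 1usize,       // SysCalls::Fork = 1: syscall number in a7
            lateout("a0") ret,     // the kernel places the return value in a0
        );
    }
    ret
}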

The core of the kernel library is the implementation of system calls. The enum SysCalls has a system call table TABLE and methods that implement the actual system calls. cfg attributes switch the compiled code depending on the target, which also enables the automatic code generation described above.

src/kernel/syscall.rs:


#[derive(Copy, Clone, Debug)]
#[repr(usize)]
pub enum SysCalls {
    Fork = 1,
    Exit = 2,
    Wait = 3,
    ...
}
impl SysCalls {
    pub const TABLE: [(Fn, &'static str); variant_count::<Self>()] = [
        (Fn::N(Self::invalid), ""),
        (Fn::I(Self::fork), "()"), // Create a process, return child's PID.
        (Fn::N(Self::exit), "(xstatus: i32)"), // Terminate the current process; status reported to wait(). No Return.
        (Fn::I(Self::wait), "(xstatus: &mut i32)"), // Wait for a child to exit; exit status in &status; returns child PID.
        ...
    ];
}
// Process related system calls
impl SysCalls {
    pub fn exit() -> ! {
        #[cfg(not(all(target_os = "none", feature = "kernel")))]
        unimplemented!();
        #[cfg(all(target_os = "none", feature = "kernel"))]
        {
            exit(argraw(0) as i32)
            // not reached
        }
    }
    ....
}
// Generate system call interface for userland in build process.
#[cfg(not(target_os = "none"))]
impl SysCalls {
    pub fn into_enum_iter() -> std::vec::IntoIter<SysCalls> {
        (0..core::mem::variant_count::<SysCalls>())
            .map(SysCalls::from_usize)
            .collect::<Vec<SysCalls>>()
            .into_iter()
    }
    pub fn signature(self) -> String {
        let syscall = Self::TABLE[self as usize];
        format!(
            "fn {}{} -> {}",
            self.fn_name(),
            syscall.1,
            self.return_type()
        )
    }
    ...
}

By adopting this structure, transforming octox into a library OS or a unikernel becomes remarkably straightforward.

Execution

If you configure qemu as a runner in .cargo/config.toml, you can boot the OS on qemu when cargo run is executed.

.cargo/config.toml:


[target.riscv64gc-unknown-none-elf]
runner = """ qemu-system-riscv64 -machine virt -bios none -m 524M -smp 4 -nographic -serial mon:stdio -global virtio-mmio.force-legacy=false -drive file=target/fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -kernel """

Inline Assembly

Rust has inline assembly, and functions can carry link_section and repr(align) attributes, so there is no need to prepare separate assembly files, complicate the build process, or switch between languages. Functions can also be defined inside other functions, so trampoline code can easily be written in a *.rs file as follows:

src/kernel/trampoline.rs


#[link_section = "trampsec"]
#[no_mangle]
#[repr(align(16))]
pub unsafe extern "C" fn trampoline() -> ! {
    unreachable!();
    #[link_section = "trampsec"]
    #[naked]
    #[no_mangle]
    #[repr(align(16))]
    pub unsafe extern "C" fn uservec() -> ! {
        // trap.rs sets stvec to point here, so
        // traps from user space start here,
        // in supervisor mode, but with a
        // user page table.

        asm!(
            // save user a0 in sscratch so
            // a0 can be used to get at TRAPFRAME
            "csrw sscratch, a0",
            // each process has a separate p.trapframe memory area,
            // but it's mapped to the same virtual address
            // (TRAPFRAME) in every process's user page table.
            "li a0, {tf}",
            // save the user registers in TRAPFRAME
            "sd ra, 40(a0)",
            ...
            // install the kernel page table
            "csrw satp, t1",
            // flush now-stale user entries from the TLB
            "sfence.vma zero, zero",
            // jump to usertrap(), which does not return
            "jr t0",
            tf = const TRAPFRAME,
            options(noreturn)
        );
    }

    #[link_section = "trampsec"]
    #[naked]
    #[no_mangle]
    #[repr(align(16))]
    pub unsafe extern "C" fn userret(pagetable: usize) -> ! {
        ...
        asm!(
            ....
            "sret",
            tf = const TRAPFRAME,
            options(noreturn),
        );
    }
}

Separate Address Space by Type and Trait

In octox, physical addresses, kernel address space, and user address space are clearly separated by type, so you cannot confuse them and operate on the wrong address space.


#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct PAddr(usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct KVAddr(pub usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
#[repr(transparent)]
pub struct UVAddr(usize);

Common functionality, on the other hand, is implemented through traits. For example, any type that implements the VAddr trait gets the walk() method for finding page table entries, shared regardless of whether it is a kernel or user address space. Modern language features like this keep the code easy to follow.


pub trait VAddr: Addr + Debug {
    // extract the three 9-bit page table indices from a virtual address.
    const PXMASK: usize = 0x1FF; // 9 bits
    fn px(&self, level: usize) -> usize;
    // one beyond the highest possible virtual address.
    // MAXVA is actually one bit less than the max allowed by
    // Sv39, to avoid having to sign-extend virtual addresses
    // that have the high bit set.
    const MAXVA: usize = 1 << (9 + 9 + 9 + 12 - 1);
}
impl_vaddr!(KVAddr);
impl_vaddr!(UVAddr);
#[derive(Debug, Clone)]
pub struct PageTable<V: VAddr> {
    ptr: *mut RawPageTable,
    _marker: PhantomData<V>,
}

impl<V: VAddr> PageTable<V> {
    pub fn walk(&mut self, va: V, alloc: bool) -> Option<&mut PageTableEntry> {
        let mut pagetable = self.ptr;
        if va.into_usize() >= V::MAXVA {
            panic!("walk");
        }
        ...
    }
}
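
To see why the separation helps, here is a minimal, self-contained sketch (with its own toy definitions, not octox's actual code): because each address space has its own newtype, passing the wrong kind of address is rejected at compile time.


// Minimal sketch, independent of octox's real definitions.
#[derive(Debug, Clone, Copy)]
struct KVAddr(usize); // kernel virtual address
#[derive(Debug, Clone, Copy)]
struct UVAddr(usize); // user virtual address

fn map_user_page(va: UVAddr) {
    // map a page in the user address space (details omitted)
    let _ = va;
}

fn demo() {
    let uva = UVAddr(0x1000);
    let kva = KVAddr(0xffff_ffc0_0000_0000);
    map_user_page(uva);     // ok
    // map_user_page(kva);  // compile error: expected `UVAddr`, found `KVAddr`
    let _ = kva;
}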

Atomic Types

Rust provides atomic versions of some primitive types, such as AtomicUsize for usize and AtomicBool for bool, which makes it easy to make the OS multicore-compatible. In addition, atomic types make it easy to build the concurrency primitives that are missing under no_std; lock types are one example.
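
For instance, here is a small sketch (not taken from octox) of the kind of shared state that atomics make easy to handle across cores without a lock: a boot flag and a monotonically increasing counter.


use core::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Sketch only: shared state that multiple cores may touch concurrently.
static STARTED: AtomicBool = AtomicBool::new(false);
static NEXT_PID: AtomicUsize = AtomicUsize::new(1);

fn alloc_pid() -> usize {
    // a single atomic read-modify-write; safe without a lock
    NEXT_PID.fetch_add(1, Ordering::Relaxed)
}

fn wait_until_started() {
    // spin until another core sets the flag
    while !STARTED.load(Ordering::Acquire) {
        core::hint::spin_loop();
    }
}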

Locks

When implementing an OS with multicore support, one of the first crucial components is the spin lock. However, in no_std environments, there is no built-in spin lock type. In octox, a custom spin lock that disables interrupts upon acquisition is introduced as the Mutex type. It is easy to implement using atomic types, and its usage is the same as std's Mutex. Notably, disabling interrupts while a spin lock is held in the kernel aims to avoid performance degradation and to prevent deadlocks.

Below is a usage example. Since Rust releases the lock automatically when the guard goes out of scope, you cannot forget to unlock it.


let m: Mutex<usize> = Mutex::new(5);
{
    let mut val = m.lock(); // acquire lock
    // critical section
    *val = 6;
}  // release lock

The interrupt-disabling feature is implemented safely by introducing an IntrLock type that manages interrupt enable/disable and delegating it to Rust's resource management. Specifically, a spin lock holds an IntrLock value while the lock is held, and interrupts stay disabled on that processor for as long as a value of this type exists. Conversely, when all IntrLock values on a processor have been dropped, interrupts are re-enabled for that processor; in Rust, when a value's lifetime ends, drop() (the Drop trait) is called to destroy it. Just as there is no danger of forgetting to release a lock, Rust's ownership rules eliminate the danger of leaving interrupts disabled.

src/kernel/spinlock.rs:


pub struct Mutex<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}
pub struct MutexGuard<'a, T> {
    mutex: &'a Mutex<T>,
    _intr_lock: IntrLock,
}
impl<T> Mutex<T> {
    pub fn lock(&self) -> MutexGuard<'_, T> {
        let _intr_lock = Cpus::lock_mycpu("spinlock");
        while self.locked.swap(true, Ordering::Acquire) {
            core::hint::spin_loop();
        }
        MutexGuard {
            mutex: self,
            _intr_lock,
        }
    }
}
impl Cpus {
    // disable interrupts on mycpu().
    // if all IntrLock are dropped, interrupts may recover
    // to previous state.
    pub fn lock_mycpu(name: &str) -> IntrLock {
        ...
    }
}
impl Drop for IntrLock {
    fn drop(&mut self) {
        // enable interrupt if all locks are dropped
    }
}

octox also implements the OnceLock type, which allows a variable to be written only once. There is also a LazyLock type that performs initialisation the first time a variable is accessed. Usability is the same as https://doc.rust-lang.org/std/sync/struct.OnceLock.html and https://doc.rust-lang.org/std/sync/struct.LazyLock.html in the Rust standard library respectively. This makes static variable initialisation safer and more straightforward.

src/kernel/vm.rs:


pub static mut KVM: OnceLock<Kvm> = OnceLock::new();
// Initialize the one kernel_pagetable
// Safety
// Runs on a first single CPU
pub unsafe fn kinit() {
    // safe because this function body is only executed on a first single CPU.
    unsafe {
        KVM.set(Kvm::new().unwrap()).unwrap();
        KVM.get_mut().unwrap().make();
    }
}

src/kernel/proc.rs:


pub static PROCS: LazyLock<Procs> = LazyLock::new(|| Procs::new());
pub fn init() {
    for (i, proc) in PROCS.pool.iter().enumerate() {
        proc.data_mut().kstack = kstack(i);
    }
}

Other, more advanced lock types are implemented using Arc, Mutex, and process features such as sleep and wakeup. An example is the SleepLock type; a conceptual sketch follows, and see src/kernel/sleeplock.rs for the real implementation.
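
The sketch below uses std's Mutex and Condvar purely to illustrate the behavior: a sleep lock puts the caller to sleep instead of spinning. octox builds its SleepLock on its own primitives plus the process sleep/wakeup mechanism, so this is not octox's code.


use std::sync::{Condvar, Mutex};

// Conceptual sketch of a sleep lock using std types (not octox's code).
pub struct SleepLockSketch {
    held: Mutex<bool>, // is the lock currently held?
    cond: Condvar,     // waiters block here instead of spinning
}

impl SleepLockSketch {
    pub fn new() -> Self {
        Self { held: Mutex::new(false), cond: Condvar::new() }
    }
    pub fn acquire(&self) {
        let mut held = self.held.lock().unwrap();
        while *held {
            // the calling thread sleeps until release() wakes it
            held = self.cond.wait(held).unwrap();
        }
        *held = true;
    }
    pub fn release(&self) {
        *self.held.lock().unwrap() = false;
        self.cond.notify_all();
    }
}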

Using Rust's Reference Counting

Reference counting tracks the number of owners that share ownership of a value and discards the value when the count reaches zero. Several resources in an OS implementation require reference counting, but doing the counting by hand is a source of errors. Rust provides Rc and Arc, types that use reference counting internally to allow multiple owners, so they can be used instead, and we will never make a counting mistake. A short illustration follows.
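
The following is a plain std example, not kernel code: cloning an Arc bumps the reference count, dropping a clone decrements it, and the value is freed exactly when the last owner is dropped.


use std::sync::Arc;

fn main() {
    let a = Arc::new([0u8; 512]);       // imagine this is a cached disk block
    let b = Arc::clone(&a);             // a second owner; count is now 2
    assert_eq!(Arc::strong_count(&a), 2);
    drop(b);                            // count goes back to 1
    assert_eq!(Arc::strong_count(&a), 1);
}   // `a` is dropped here and the buffer is freed automatically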

For example, consider a buffer cache.

A buffer cache is a copy of a disk block that sits in front of disk accesses to speed up IO and synchronise access. Unused buffer caches are reused, so we have to keep track of how many places a given buffer is used from. This is where reference counting comes in. Note that several pieces of concurrently executing code may access the buffer cache at the same time, so we use Arc, which supports being shared by concurrently executing code.

In octox, the buffer cache is implemented as follows; reference counting is achieved by wrapping the actual data in an Arc:

src/kernel/bio.rs:


pub struct Buf {
    data: Arc<&'static SleepLock<Data>>,
}

There are other resources where reference counting is required. An example is the in-memory Inode. Inodes exist either on disk (DInode) or in memory (MInode); the in-memory form is a copy of the DInode with extra information added for use in the kernel. An in-memory Inode must stay allocated as long as any code refers to it, and once nothing refers to it anymore, it must be freed. This is exactly where Arc comes in. The octox Inode definition is as follows.

src/kernel/fs.rs:


#[derive(Default, Clone, Debug)]
pub struct Inode {
    ip: Option<Arc<MInode>>,
}

The reference count inside the Arc goes up whenever a file is opened or the Inode is otherwise referenced, and it goes down when the file is no longer used. When the count drops and the Inode is known to be unneeded, it can be discarded from memory.

We can also use Arc to manage open files. Files can be opened multiple times by multiple or single processes, and you need to remember how many of them there are. This means that Arc can be used here too, to deter reference count errors and to properly discard resources that are no longer needed.


#[derive(Default, Clone, Debug)]
pub struct File {
    f: Option<Arc<VFile>>,
    readable: bool,
    writable: bool,
    cloexec: bool,
}

Incidentally, File is an abstraction over FNod (file data: an Inode plus an offset), DNod (device files), and Pipe (pipes), wrapped in the enum VFile, with attributes such as readable, writable, and cloexec added. octox does not use a global open-file table; instead, it manages each open file individually. This, too, was easy to implement thanks to Rust's rich type system.


#[derive(Debug)]
pub enum VFile {
    Device(DNod),
    Inode(FNod),
    Pipe(Pipe),
    None,
}
// File & directory Node
#[derive(Debug)]
pub struct FNod {
    off: UnsafeCell<u32>,
    ip: Inode,
}
// Device Node
#[derive(Debug)]
pub struct DNod {
    driver: &'static dyn Device,
    ip: Inode,
}
#[derive(Debug)]
pub struct Pipe {
    rx: Option<Receiver<u8>>,
    tx: Option<SyncSender<u8>>,
}

MultiProducer-MultiConsumer (MPMC) channel

octox has a Condvar in its kernel, implemented using process sleep and wakeup functions. This Condvar allows the use of methods like notify_all() and wait(), similar to those found in the Rust standard library https://doc.rust-lang.org/std/sync/struct.Condvar.html. With this mechanism, channels can be easily implemented, and consequently, pipes can also be straightforwardly developed.

octox has a MultiProducer-MultiConsumer (MPMC) channel capable of supporting multiple senders and receivers. This channel is designed with the mpsc module from Rust's standard library in mind, offering comparable usability.


pub fn sync_channel<T: Debug>(max: isize, name: &'static str) -> (SyncSender<T>, Receiver<T>) { ... }
pub struct SyncSender<T: Debug> {
    sem: Arc<Semaphore>, // count receiver
    buf: Arc<Mutex<LinkedList<T>>>,
    cond: Arc<Condvar>, // count sender
    scnt: Arc<AtomicUsize>,
    rcnt: Arc<AtomicUsize>,
}
pub struct Receiver<T: Debug> {
    sem: Arc<Semaphore>,
    buf: Arc<Mutex<LinkedList<T>>>,
    cond: Arc<Condvar>,
    scnt: Arc<AtomicUsize>,
    rcnt: Arc<AtomicUsize>,
}
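
The channel's usability mirrors std::sync::mpsc's sync_channel. A hypothetical usage fragment (error handling simplified; the real API's error types may differ):


// Hypothetical usage sketch of the kernel channel; mirrors std::sync::mpsc.
let (tx, rx) = sync_channel::<u8>(512, "demo");
tx.send(b'a').unwrap();                  // blocks if the buffer is full
assert_eq!(rx.recv().unwrap(), b'a');    // blocks until data is available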

The following is an implementation of a pipe using the mpmc channel.

src/kernel/pipe.rs:


struct Pipe {
    rx: Option<Receiver<u8>>,
    tx: Option<SyncSender<u8>>,
}

impl Pipe {
    const PIPESIZE: isize = 512;
    pub fn new(rx: Option<Receiver<u8>>, tx: Option<SyncSender<u8>>) -> Self {
        Self { rx, tx }
    }
    pub fn get_mode(&self) -> OMode { ... }
    pub fn alloc() -> Result<(File, File)> {
        let (tx, rx) = sync_channel::<u8>(Self::PIPESIZE, "pipe");
        let p0 = Self::new(Some(rx), None);
        let p1 = Self::new(None, Some(tx));
        let f0 = FTABLE.alloc(p0.get_mode(), FType::Pipe(p0))?;
        let f1 = FTABLE.alloc(p1.get_mode(), FType::Pipe(p1))?;
        Ok((f0, f1))
    }
    pub fn write(&self, mut src: VirtAddr, n: usize) -> Result<usize> {
        let tx = self.tx.as_ref().ok_or(BrokenPipe)?;
        let mut i = 0;
        while i < n {
            let mut ch: u8 = 0;
            either_copyin(&mut ch, src)?;
            let Ok(_) = tx.send(ch) else {
                break;
            };
            src += 1;
            i += 1;
        }
        Ok(i)
    }
    pub fn read(&self, mut dst: VirtAddr, n: usize) -> Result<usize> {
        let rx = self.rx.as_ref().ok_or(BrokenPipe)?;
        let mut i = 0;
        while i < n {
            let Ok(ch) = rx.recv() else {
                break;
            };
            either_copyout(dst, &ch)?;
            dst += 1;
            i += 1;
        }
        Ok(i)
    }
}

As demonstrated with the implementation of the pipe, leveraging Rust's language features alongside OS capabilities can significantly streamline the development of other OS functionalities.

No libc, Rust-like userland library available

In octox, libc does not exist. The user library is also written entirely in Rust and, like Rust's std, acts as a wrapper over system calls. In simple terms, userland just issues system calls to the kernel; for a command, this could be as straightforward as mkdir dir in the shell or mkdir("dir") in Rust code. While this is simple, it's not particularly convenient. To enhance usability, the user library is modeled after Rust's standard library. For example, having access to handy features like DirEntry makes developing in userland more efficient (see https://doc.rust-lang.org/std/fs/struct.DirEntry.html). Aligning with Rust's standard library has other benefits, too. One is the absorption of compatibility issues: Rust supports various platforms, and handling compatibility at the level of the Rust standard library, rather than through syscalls or libc, can be more efficient. The goal is that Rust code written for one OS can be recompiled to work on my OS. From a unikernel perspective, recompilation is natural, and if we're only using Rust for development, it enhances safety.

octox ls example: src/user/ls.rs


fn ls(path: &str) -> sys::Result<()> {
    let path = Path::new(path);
    match fs::read_dir(path) {
        Err(_) => {
            let attr = File::open(path)?.metadata()?;
            ...
        }
        Ok(entries) => {
            for entry in entries {
                let entry = entry.unwrap();
                let attr = entry.metadata()?;
                println!(
                    "{:14} {:6} {:3} {}",
                    entry.file_name(),
                    format!("{:?}", attr.file_type()),
                    attr.inum(),
                    attr.len()
                );
            }
        }
    }
    Ok(())
}

Use Rust types instead of C types as arguments to system calls.

octox anticipates that the crabi ABI (https://github.com/rust-lang/rust/pull/105586) will eventually be introduced in Rust and uses Rust types directly as system call arguments. Traditionally, Rust's ABI has not been stable, so FFI, system calls, and the like have had to go through the C ABI (e.g. extern "C"). However, going through the C ABI can cause problems: for example, if you use the safe String type, you cannot pass it between the kernel and userland without converting it to an unsafe C string, and the extra copying this forces is a hassle. Since everything is implemented in Rust, it is tempting to use Rust types for system call arguments.

For example, the following is the definition used to automatically generate the exec system call. If you use a slice of string slices as the argument, you don't need to specify the number of arguments, and no string conversion between userland and the kernel is necessary.


impl SysCalls {
    pub const TABLE: [(Fn, &'static str); variant_count::<Self>()] = [
        (Fn::I(Self::exec), "(filename: &str, argv: &[&str])"), // Load a file and execute it with arguments; only returns if error.
        ...,
    ];
}
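
On the userland side, the generated wrapper can then be called like ordinary Rust. The fragment below is hypothetical: it assumes the wrapper is exposed through the sys module used elsewhere in the user library, and the exact return type may differ.


// Hypothetical call through the generated exec wrapper (names assumed).
let _ = sys::exec("echo", &["echo", "hello", "world"]);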

Arguments passed from exec() on the kernel side to userland can also have the same types as in Rust. The arguments passed to the user program can be thought of as a slice of Rust string slices on the stack. Assuming that the correct data is always passed from the kernel, argument handling on the user-program side (like https://doc.rust-lang.org/std/env/fn.args.html) can be easily implemented. There is no need for string conversion between userland and the kernel.

You can define lang_start(), Rust's runtime startup routine that calls main(), as follows:


// wrapper so that it's ok if main() does not call exit().
#[lang = "start"]
fn lang_start<T: Termination + 'static>(
    main: fn() -> T,
    _: isize,
    args: *const *const u8,
    _: u8,
) -> isize {
    unsafe {
        ARGS = (args as *const &[&str]).as_ref().copied();
    }
    let xstatus = main().report() as i32;
    sys::exit(xstatus)
}

The env module is defined as:


pub static mut ARGS: Option<&[&str]> = None;

pub fn args() -> Args {
    Args {
        iter: if let Some(args) = unsafe { ARGS } {
            args.iter()
        } else {
            [].iter()
        },
    }
}
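
As a small, hypothetical example of what this buys you, a user program can then iterate over its arguments just as it would with std:


// Hypothetical user program; argument handling looks just like std::env.
fn main() {
    // print each command-line argument on its own line
    for arg in env::args() {
        println!("{}", arg);
    }
}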

The lang_start arguments are defined by Rust, so some of them are unnecessary for my OS at this time, but if more flexibility were allowed here, the runtime startup signature could be simplified. For example, lang_start(main: fn() -> T, args: Option<&[&str]>) would be simpler.

Difficulties

I was going to mention implementation difficulties and bugs, but to my surprise, there were no Rust-specific difficulties at all. I found it much easier to write an OS in Rust than in C. In fact, one of the few issues I encountered was a miscalculated array index, which could happen in any language. If anything, Rust is much less prone to bugs caused by such miscalculations than other languages.

I dare say the primary challenge was mishandling unsafe and Arc. In hindsight it was a fundamental mistake, yet it took some time to debug. Below are the structure and array that manage the CPUs; the issue arose in the code responsible for extracting process information from them.

src/kernel/proc.rs:


pub static CPUS: Cpus = Cpus::new(); // Cpus([UnsafeCell::new(Cpu::new()), ...])
pub struct Cpus([UnsafeCell<Cpu>; NCPU]);
pub struct Cpu {
    pub proc: Option<Arc<Proc>>,  // The process running on this cpu, or None.
    pub context: Context,
    ...
}
impl Cpus {
    pub unsafe fn cpu_id() -> usize {
        let id;
        asm!("mv {0}, tp", out(reg) id);
        id
    }
    pub unsafe fn mycpu(&self) -> *mut Cpu {
        let id = Self::cpu_id();
        self.0[id].get()
    }
}

When the OS needs to obtain process information, it retrieves the corresponding Cpu struct from the CPUS array using the ID of the currently running CPU core and then extracts the Proc from it. The issue was in this step. To better understand, let's compare the problematic code with the corrected version.

Problematic code:


pub fn myproc(&self) -> Option<&Arc<Proc>> {
    let _intr_lock = CPUS.lock_mycpu("withoutspin");
    unsafe {
        let c = self.mycpu();
        (*c).proc.as_ref()
    }
}

Corrected code:


pub fn myproc() -> Option<Arc<Proc>> {
    let _intr_lock = Self::lock_mycpu("withoutspin");
    let c;
    unsafe {
        c = &*CPUS.mycpu();
    }
    c.proc.clone()
}

The problematic code returns a reference to an Arc via a raw pointer to Cpu within an unsafe block. Specifically, it returns the address where the Arc<Proc> is stored; in other words, the function returns the location in the CPUS array where the target process information (Arc<Proc>) lives. This is a significant issue because processes frequently move between cores: by the time we use that address, the process we intended to manipulate may no longer be there. This volatility is the reason why mycpu() operations are in unsafe blocks. Unless interrupts are disabled, the raw pointer to Cpu might not even point to the CPU core that is currently executing the code.

Given that Arc<Proc> itself is a pointer to the process information, the function should return Arc<Proc> directly, as demonstrated in the corrected code. This method ensures that the correct process information is accessible, even if the process has migrated from the original core.

It was a relatively simple bug, but its identification and resolution were not immediate. This delay was because the bug's manifestations could vary with each execution in a multiprocess environment where different activities are intertwined across multiple cores. For instance, one execution might trigger an exception on core 1, while another could do so on core 2. However, the issue could be pinpointed more rapidly than it might have been in C. Since we can associate these errors with the unsafe block, our initial scrutiny can be directed there, helping us narrow down the potential sources of the error.

In Conclusion

From what has been said so far, it's apparent that Rust could be an excellent choice for OS development. The language's robust type system not only accelerates development but also minimizes the occurrence of critical issues and facilitates easier problem identification when they do arise. Given that octox is an educational OS, it presents a valuable resource for those keen on learning about OS development. If you're interested, feel free to dive in! If the kernel seems daunting, you might want to start with userland programming or by adding a new system call to the kernel. Check out a straightforward example in octox's README on GitHub at https://github.com/o8vm/octox - it's a great starting point.

Looking ahead, my next objective is to finalize the OnceLock, which is presently incomplete. I plan to introduce a robust abstraction that effectively supports multiple architectures. Once that is achieved, the subsequent step will be to run octox on actual hardware.

octox can also function as a library OS. I plan to make octox compatible with Rust's standard library (std) in the future. By achieving compatibility with Rust's std, I believe it will be possible to create a unikernel that resolves application compatibility issues.