Introduction
Welcome to libr0 - a hands-on journey through Rust's standard library types.
What is libr0?
libr0 is an educational project that reimplements Rust's core standard library types from scratch. By building types like Option, Box, Vec, and Rc yourself, you'll develop a deep understanding of:
- Memory management and allocation
- Smart pointers and ownership patterns
- Interior mutability with `Cell` and `RefCell`
- Reference counting and weak references
Who is this for?
This book is designed for developers who:
- Understand basic Rust syntax and ownership rules
- Want to understand how Rust's types work under the hood
- Learn best by building and experimenting with code
- Are curious about memory layout and low-level implementation details
How to use this book
Each chapter covers one type in depth:
- What problem does it solve? - Understanding the motivation
- How does it work? - Implementation details with memory diagrams
- Build it yourself - Runnable examples and exercises
- Common patterns - Real-world usage scenarios
All implementations use the suffix 0 (e.g., Option0<T>, Vec0<T>) to distinguish them from stdlib types, allowing side-by-side comparison.
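For instance, a minimal sketch of what a side-by-side comparison looks like (using a stripped-down `Option0` of the kind Chapter 1 builds):

```rust
// A minimal Option0, mirroring Chapter 1's definition, so the
// suffix-0 type can sit next to the real std type.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Option0<T> {
    Some(T),
    None,
}

fn main() {
    // Our reimplementation...
    let ours: Option0<i32> = Option0::Some(42);
    // ...next to the standard library's Option.
    let stds: Option<i32> = Some(42);

    // Both model presence/absence the same way; only the name differs.
    assert_eq!(ours, Option0::Some(42));
    assert_eq!(stds, Some(42));
    println!("side by side: {:?} vs {:?}", ours, stds);
}
```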
Running the code
The complete source code is available at github.com/arinal/libr0. You can:
```sh
# Run all tests
cargo test

# Run a specific chapter's examples
cargo run --example option
cargo run --example vec
cargo run --example rc
```
Ready to dive in?
Start with Chapter 1: Option to begin your journey.
Chapter 1: Option - The Simplest Enum
The Problem: Null References
In many languages, any reference can be null:
```java
String name = null;
int length = name.length(); // NullPointerException!
```
Tony Hoare, who invented null references, called it his "billion-dollar mistake." Rust solves this with Option.
Null Across Languages
Languages with null:
```java
// Java
String name = null; // Can assign null to any reference
```

```javascript
// JavaScript
let name = null; // null is a primitive value
```

```scala
// Even Scala, which is a functional language, still has null lurking around
var name: String = null // ✅ Compiles - null is allowed
name.length // Runtime error if null!

// But idiomatic Scala uses Option
val name: Option[String] = None
```
Languages without null:
```haskell
-- Haskell - no null!
name :: Maybe String
name = Nothing -- Uses Maybe instead

-- You CANNOT do this in Haskell:
name = null -- ERROR: null doesn't exist!
```
```rust
// Rust - no null!
let name: Option<String> = None;

// You CANNOT do this in Rust:
let name = null; // ERROR: null doesn't exist!
```
Key insight: Rust and Haskell don't have null at all. Instead, they use type-safe alternatives (Option in Rust, Maybe in Haskell) that force you to handle the absence of a value explicitly.
In Rust, to represent "no value," we use an enum called Option, which we'll implement ourselves as Option0.
Our Option Type
```rust
enum Option0<T> {
    Some(T),
    None,
}
```
That's it. Two variants:
- `Some(T)` - contains a value of type `T`
- `None` - represents the absence of a value
The compiler forces you to handle both cases. You can't accidentally use a None as if it were Some.
Basic Usage
```rust
use Option0::{Some, None};

fn find_user(id: u32) -> Option0<String> {
    if id == 1 {
        Some(String::from("Alice"))
    } else {
        None
    }
}

fn main() {
    let user = find_user(1);

    // Must handle both cases
    match user {
        Some(name) => println!("Found: {}", name),
        None => println!("User not found"),
    }
}
```
Why is this better than null?
Notice that find_user returns Option0<String>, not String. This is the key difference:
| With null (Java, etc.) | With Option (Rust) |
|---|---|
| `String find_user(...)` | `Option0<String> find_user(...)` |
| Return type lies - might be null | Return type is honest - might be `None` |
| Compiler lets you ignore null | Compiler forces you to handle `None` |
| Crash at runtime: NullPointerException | Error at compile time |
```java
// Java: Compiler is happy, but this crashes at runtime
String user = findUser(99);
int len = user.length(); // NullPointerException!
```
```rust
// Rust: Compiler is not happy, `user` is Option0<String>, not String
let user = find_user(99);
let len = user.len(); // Error: Option0<String> has no method `len`
```
```rust
// You MUST unwrap it first, which forces you to think about None
let len = match user {
    Some(s) => s.len(),
    None => 0, // You're forced to decide what happens here
};
```
The compiler is your safety net. It won't let you forget.
Implementing Option Methods
Let's build the most useful methods step by step.
is_some and is_none
The simplest methods - just check which variant we have:
```rust
impl<T> Option0<T> {
    fn is_some(&self) -> bool {
        match self {
            Some(_) => true,
            None => false,
        }
    }

    fn is_none(&self) -> bool {
        !self.is_some()
    }
}
```
Examples:
```rust
let x: Option0<u32> = Some(42);
x.is_some() // true
x.is_none() // false

let y: Option0<u32> = None;
y.is_none() // true
y.is_some() // false

// Useful for conditional checks
if x.is_some() {
    println!("x has a value");
}

// Or for early returns
fn process(opt: Option0<i32>) -> Result<(), String> {
    if opt.is_none() {
        return Err("No value provided".to_string());
    }
    // Continue processing...
    Ok(())
}
```
unwrap - The Dangerous One
Extract the value, panic if None:
```rust
impl<T> Option0<T> {
    fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!("called unwrap on a None value"),
        }
    }
}
```
Warning: Only use `unwrap()` when you're 100% sure it's `Some`, or in examples/tests.
Examples:
```rust
let x = Some("value");
x.unwrap() // "value"

// This will panic! Avoid in production code
let y: Option0<&str> = None;
// y.unwrap(); // ❌ Panics: "called unwrap on a None value"

// Safe uses of unwrap:

// 1. In tests
#[test]
fn test_parse() {
    let result = parse_config("valid_config.json");
    assert_eq!(result.unwrap().port, 8080); // OK in tests
}

// 2. When you've already checked
let opt = Some(42);
if opt.is_some() {
    let value = opt.unwrap(); // Safe, but pattern matching is cleaner
}

// 3. When failure is a programming error
let config = load_config().unwrap(); // OK if missing config means broken setup
```
unwrap_or - Safe Default
Provide a fallback value:
```rust
impl<T> Option0<T> {
    fn unwrap_or(self, default: T) -> T {
        match self {
            Some(val) => val,
            None => default,
        }
    }
}
```
Examples:
```rust
// Basic usage
let x = Some(42);
x.unwrap_or(0) // 42

let y: Option0<i32> = None;
y.unwrap_or(0) // 0

// User input with fallback
fn get_count(user_input: Option0<i32>) -> i32 {
    user_input.unwrap_or(10) // Default to 10 if no input
}

get_count(Some(5)) // 5
get_count(None) // 10
```
unwrap_or_else - Lazy Default
Sometimes computing the default is expensive. Only compute it if needed:
```rust
impl<T> Option0<T> {
    fn unwrap_or_else<F: FnOnce() -> T>(self, f: F) -> T {
        match self {
            Some(val) => val,
            None => f(),
        }
    }
}
```
Examples:
```rust
// Basic usage
let x = Some(42);
x.unwrap_or_else(|| 0) // 42

let y: Option0<i32> = None;
y.unwrap_or_else(|| 0) // 0

// Avoid expensive computation when Some
fn expensive_computation() -> String {
    println!("Computing..."); // This won't print if Some
    String::from("default")
}

let some_value = Some(String::from("existing"));
let result = some_value.unwrap_or_else(|| expensive_computation());
// "Computing..." is NOT printed because we have Some
result // "existing"

let none_value: Option0<String> = None;
let result = none_value.unwrap_or_else(|| expensive_computation());
// "Computing..." IS printed because we have None
result // "default"

// Database lookup as fallback
fn find_in_cache(key: &str) -> Option0<String> { None }
fn fetch_from_db(key: &str) -> String { String::from("db_value") }

let value = find_in_cache("user:123")
    .unwrap_or_else(|| fetch_from_db("user:123")); // DB query only if cache miss
```
map - Transform the Inner Value
This is where it gets interesting. Transform Some(x) to Some(f(x)), leave None alone:
```rust
impl<T> Option0<T> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Option0<U> {
        match self {
            Some(x) => Some(f(x)),
            None => None,
        }
    }
}
```
Examples:
```rust
// Basic transformation
let maybe_name: Option0<String> = Some(String::from("alice"));
let maybe_len: Option0<usize> = maybe_name.map(|s| s.len());
maybe_len // Some(5)

let nothing: Option0<String> = None;
let still_nothing: Option0<usize> = nothing.map(|s| s.len());
still_nothing // None

// Convert between types
let age: Option0<u32> = Some(25);
let age_str = age.map(|n| n.to_string());
age_str // Option0<String>: Some("25")

// Chain transformations
let number = Some(5);
let result = number
    .map(|n| n * 2)          // Some(10)
    .map(|n| n + 3)          // Some(13)
    .map(|n| n.to_string()); // Some("13")
result // Some("13")

// None propagates through
let number: Option0<i32> = None;
let result = number
    .map(|n| n * 2)
    .map(|n| n + 3);
result // None

// Working with structs
struct User {
    name: String,
    age: u32,
}

let user = Some(User {
    name: String::from("Alice"),
    age: 30,
});

let user_name = user.map(|u| u.name);
user_name // Some("Alice")

// Real-world: parsing configuration
fn get_port_config() -> Option0<String> {
    Some(String::from("8080"))
}

let port: Option0<u16> = get_port_config()
    .map(|s| s.parse::<u16>().unwrap_or(3000));
port // Some(8080)
```
and_then - Chainable Operations (flatMap)
What if your transformation also returns an Option? map would give you Option<Option<T>>. Use and_then instead:
```rust
impl<T> Option0<T> {
    fn and_then<U, F: FnOnce(T) -> Option0<U>>(self, f: F) -> Option0<U> {
        match self {
            Some(x) => f(x),
            None => None,
        }
    }
}
```
How it works conceptually:
The key insight: unwrap self first, then apply f
- If `self` is `Some(x)`, unwrap it to get `x`, then apply `f(x)`, which returns `Option0<U>`
- If `self` is `None`, just return `None` (no unwrapping needed)
This is different from map:
- `map(f)`: unwrap → apply `f` → wrap result in `Some`
- `and_then(f)`: unwrap → apply `f` → return result as-is (`f` already returns `Option`)
```rust
// Example: Why and_then avoids nesting
let x: Option0<i32> = Some(5);

// With map: f returns Option0, so we get nested Option
let nested = x.map(|n| Some(n * 2)); // Option0<Option0<i32>>

// With and_then: f returns Option0, result stays flat
let flat = x.and_then(|n| Some(n * 2)); // Option0<i32>
```
Examples:
```rust
// Why we need and_then: Compare map vs and_then
fn safe_divide(a: i32, b: i32) -> Option0<i32> {
    if b == 0 { None } else { Some(a / b) }
}

// Using map: ❌ Gives nested Option
let x = Some(10);
let result = x.map(|n| safe_divide(n, 2));
// result = Some(Some(5)) - Wrong! We have nested Options

// Using and_then: ✅ Flattens automatically
let x = Some(10);
let result = x.and_then(|n| safe_divide(n, 2));
// result = Some(5) - Correct!
result // Some(5)

// None propagates
let x = Some(10);
let result = x.and_then(|n| safe_divide(n, 0));
result // None

// Processing multiple Options together
let a = Some(3);
let b = Some(2);

// Combine two Options: a + b
let sum = a.and_then(|x| b.map(|y| x + y));
sum // Some(5)

// If either is None, result is None
let a = Some(3);
let b: Option0<i32> = None;
let sum = a.and_then(|x| b.map(|y| x + y));
sum // None

// Three Options: a + b + c
let a = Some(3);
let b = Some(2);
let c = Some(1);
let sum = a.and_then(|x|
    b.and_then(|y|
        c.map(|z| x + y + z)
    )
);
sum // Some(6)

// Alternative: using match for multiple Options (often cleaner)
let a = Some(3);
let b = Some(2);
let sum = match (a, b) {
    (Some(x), Some(y)) => Some(x + y),
    _ => None,
};
sum // Some(5)
```
filter - Conditional Keep
Keep Some only if a predicate is true:
```rust
impl<T> Option0<T> {
    fn filter<P: FnOnce(&T) -> bool>(self, predicate: P) -> Option0<T> {
        match self {
            Some(x) if predicate(&x) => Some(x),
            _ => None,
        }
    }
}
```
Examples:
```rust
// Basic filtering
let even_number = Some(4).filter(|n| n % 2 == 0);
even_number // Some(4)

let odd_number = Some(3).filter(|n| n % 2 == 0);
odd_number // None

// None stays None
let nothing: Option0<i32> = None;
let still_nothing = nothing.filter(|n| n % 2 == 0);
still_nothing // None
```
as_ref - Borrow the Inner Value
Convert &Option0<T> to Option0<&T>:
```rust
impl<T> Option0<T> {
    fn as_ref(&self) -> Option0<&T> {
        match self {
            Some(x) => Some(x),
            None => None,
        }
    }
}
```
Why do we need this? Because map takes self by value - it consumes the Option.
Examples:
```rust
// Problem: map consumes the Option
let maybe_name: Option0<String> = Some(String::from("Alice"));
let len = maybe_name.map(|s| s.len());
// println!("{:?}", maybe_name); // ERROR: maybe_name was moved!

// Solution: Use as_ref() to borrow
let maybe_name: Option0<String> = Some(String::from("Alice"));
let len = maybe_name.as_ref().map(|s| s.len()); // s is &String
len // Some(5)
println!("{:?}", maybe_name); // Works! maybe_name still valid

// Multiple operations on the same Option
let data = Some(String::from("hello world"));

let len = data.as_ref().map(|s| s.len());
let uppercase = data.as_ref().map(|s| s.to_uppercase());
let contains = data.as_ref().map(|s| s.contains("world"));

len // Some(11)
uppercase // Some("HELLO WORLD")
contains // Some(true)

// data is still usable!
data // Some("hello world")

// as_ref with None
let nothing: Option0<String> = None;
let result = nothing.as_ref().map(|s| s.len());
result // None

// Real-world: Validating without consuming
struct Config {
    api_key: Option0<String>,
}

impl Config {
    fn validate(&self) -> bool {
        // Use as_ref to check without consuming api_key
        self.api_key
            .as_ref()
            .map(|key| key.len() > 10)
            .unwrap_or(false)
    }

    fn get_key(&self) -> Option0<&str> {
        // Convert Option0<String> to Option0<&str>
        self.api_key.as_ref().map(|s| s.as_str())
    }
}

let config = Config {
    api_key: Some(String::from("secret_key_12345")),
};

config.validate() // true (borrows api_key)
config.validate() // true (can validate again!)
config.get_key() // Some("secret_key_12345")

// Chaining with as_ref
let text = Some(String::from("  hello  "));
let trimmed_len = text
    .as_ref()
    .map(|s| s.trim())
    .map(|s| s.len());
trimmed_len // Some(5)
text // Some("  hello  ") - original unchanged
```
The key insight: as_ref() converts &Option0<T> to Option0<&T>. Now when map consumes the Option, it's consuming an Option of references, not the original data.
take - Extract and Replace with None
Useful for moving values out of mutable references:
```rust
impl<T> Option0<T> {
    fn take(&mut self) -> Option0<T> {
        std::mem::replace(self, None)
    }
}
```
Examples:
```rust
// Basic usage: Move value out, leave None
let mut slot: Option0<String> = Some(String::from("hello"));
let taken = slot.take();
taken // Some("hello")
slot // None (slot is now None)

// Taking from None returns None
let mut empty: Option0<i32> = None;
let result = empty.take();
result // None
empty // None

// Use case: Moving from struct fields
struct Cache {
    data: Option0<String>,
}

impl Cache {
    fn flush(&mut self) -> Option0<String> {
        // Take the data, leaving cache empty
        self.data.take()
    }

    fn get(&self) -> Option0<&str> {
        // Use as_ref for non-destructive access
        self.data.as_ref().map(|s| s.as_str())
    }
}

let mut cache = Cache {
    data: Some(String::from("cached_value")),
};

cache.get() // Some("cached_value")

let flushed = cache.flush();
flushed // Some("cached_value")
cache.get() // None (cache is now empty)

// Taking in a loop
let mut items = vec![Some(1), Some(2), Some(3)];

let extracted: Vec<i32> = items
    .iter_mut()
    .filter_map(|opt| opt.take())
    .collect();
extracted // [1, 2, 3]

// All items are now None
items.iter().all(|opt| opt.is_none()) // true

// Conditional take
struct Player {
    weapon: Option0<String>,
}

impl Player {
    fn drop_weapon_if(&mut self, condition: bool) -> Option0<String> {
        if condition {
            self.weapon.take()
        } else {
            None
        }
    }
}

let mut player = Player {
    weapon: Some(String::from("sword")),
};

// Don't drop
let result = player.drop_weapon_if(false);
result // None
player.weapon // Some("sword")

// Do drop
let result = player.drop_weapon_if(true);
result // Some("sword")
player.weapon // None
```
The Complete Implementation
See the full code in src/option.rs for the complete implementation of Option0 with all methods.
Also, see the exercises in 01_option.rs
Key Takeaways
- Option is just an enum - No magic, just two variants
- The compiler enforces handling - Can't ignore the `None` case
- map transforms, and_then chains - Functional programming patterns
- unwrap is a code smell - Prefer `unwrap_or`, `unwrap_or_else`, or pattern matching
Next Chapter
Result - Like Option, but with error information.
Chapter 2: Result - Error Handling Done Right
The Problem: Exceptions Are Invisible
In many languages, any function can throw an exception:
```java
String content = readFile("non-existent-file.txt"); // throws exception
println("File content: " + content);
```
In Java, the code above compiles fine, even though the programmer "forgot" to handle the exception.
Rust's approach: readFile returns a wrapper to indicate it can fail:
```rust
let result = readFile("non-existent-file.txt"); // returns Result<String, Error>
// result is not the content, but a wrapper that can be Ok(content) or Err(error)

// To extract the content, you're forced to handle both cases:
match result {
    Ok(content) => println!("File content: {}", content),
    Err(e) => println!("Failed to read file: {:?}", e),
}

// This way, the programmer can't "forget" to handle errors,
// as was the case in the Java example.
```
Our Result Type
```rust
enum Result0<T, E> {
    Ok(T),
    Err(E),
}
```
Two variants:
- `Ok(T)` - operation succeeded with value `T`
- `Err(E)` - operation failed with error `E`
The caller must handle both cases. The compiler won't let you ignore errors.
What Can Be an Error?
The E in Result<T, E> can be any type. It doesn't need to implement std::error::Error or any special trait, as long as you wrap it in Err().
```rust
// String as error
let error: Result0<i32, String> = Err(String::from("something broke"));

// &str as error
let error: Result0<i32, &str> = Err("file not found");

// Number as error code
let error: Result0<i32, i32> = Err(404);

// Custom enum - most common in real code
#[derive(Debug)]
enum ParseError {
    Empty,
    TooLong,
    InvalidFormat,
}

let error: Result0<i32, ParseError> = Err(ParseError::Empty);
```
Key rule: Always wrap your error in Err(). Don't return the error type directly:
```rust
// ❌ Wrong
fn parse(s: &str) -> Result0<i32, &str> {
    if s.is_empty() {
        "empty string" // ERROR: expected Result0, found &str
    } else {
        Ok(42)
    }
}

// ✅ Correct
fn parse(s: &str) -> Result0<i32, &str> {
    if s.is_empty() {
        Err("empty string") // Wrapped in Err!
    } else {
        Ok(42)
    }
}
```
Basic Usage
Let's validate a person with a custom error type:
```rust
use Result0::{Ok, Err};

#[derive(Debug)]
struct Person {
    name: String,
    age: i32,
}

#[derive(Debug)]
enum InvalidPersonError {
    EmptyName,
    InvalidAge(i32),
}

fn validate_person(person: Person) -> Result0<Person, InvalidPersonError> {
    if person.name.is_empty() {
        Err(InvalidPersonError::EmptyName) // Wrap in Err!
    } else if person.age < 0 {
        Err(InvalidPersonError::InvalidAge(person.age)) // Capture the bad value
    } else {
        Ok(person) // Wrap valid person in Ok!
    }
}

fn main() {
    let person = Person { name: String::from("Alice"), age: 30 };
    match validate_person(person) {
        Ok(valid_person) => println!("Valid person: {:?}", valid_person),
        Err(e) => println!("Invalid person: {:?}", e),
    }
    // Output: Valid person: Person { name: "Alice", age: 30 }

    let bad_person = Person { name: String::from(""), age: -5 };
    match validate_person(bad_person) {
        Ok(valid_person) => println!("Valid person: {:?}", valid_person),
        Err(e) => println!("Invalid person: {:?}", e),
    }
    // Output: Invalid person: EmptyName
}
```
Implementing Result Methods
is_ok and is_err
```rust
impl<T, E> Result0<T, E> {
    fn is_ok(&self) -> bool {
        matches!(self, Ok(_))
    }

    fn is_err(&self) -> bool {
        !self.is_ok()
    }
}
```
Examples:
```rust
let success: Result0<i32, &str> = Ok(42);
success.is_ok() // true
success.is_err() // false

let failure: Result0<i32, &str> = Err("bad input");
failure.is_ok() // false
failure.is_err() // true

// ❌ Common mistake: verbose pattern
if result.is_ok() {
    let value = result.unwrap(); // Don't do this!
    // use value...
}

// ✅ Better: use match or if let
match result {
    Ok(value) => { /* use value */ },
    Err(e) => { /* handle error */ }
}
```
unwrap and expect
Extract value, panic on error:
```rust
impl<T, E: std::fmt::Debug> Result0<T, E> {
    fn unwrap(self) -> T {
        match self {
            Ok(val) => val,
            Err(e) => panic!("called unwrap on Err: {:?}", e),
        }
    }

    fn expect(self, msg: &str) -> T {
        match self {
            Ok(val) => val,
            Err(e) => panic!("{}: {:?}", msg, e),
        }
    }
}
```
expect is slightly better than unwrap - at least you leave a message explaining what went wrong.
Examples:
```rust
let success: Result0<i32, &str> = Ok(42);
success.unwrap() // 42

let failure: Result0<i32, &str> = Err("oops");
failure.unwrap() // ❌ Panics: "called unwrap on Err: \"oops\""

// expect provides context
let result: Result0<Config, &str> = Err("missing file");
result.expect("Config must be loaded");
// ❌ Panics: "Config must be loaded: \"missing file\""

// Anti-pattern: checking then unwrapping
let result: Result0<i32, &str> = Ok(42);
if result.is_ok() {
    let val = result.unwrap(); // Won't panic, but verbose and clunky
    // use val...
}
// What about the Err case? You still need another if/else!

// Pattern matching is cleaner - extracts value and handles both cases
let result: Result0<i32, &str> = Ok(42);
match result {
    Ok(val) => { /* use val */ },
    Err(e) => { /* handle error */ }
}

// Or use if let for the Ok case only
if let Ok(val) = result {
    // use val...
}
```
unwrap_or and unwrap_or_else
```rust
impl<T, E> Result0<T, E> {
    fn unwrap_or(self, default: T) -> T {
        match self {
            Ok(val) => val,
            Err(_) => default,
        }
    }

    fn unwrap_or_else<F: FnOnce(E) -> T>(self, f: F) -> T {
        match self {
            Ok(val) => val,
            Err(e) => f(e),
        }
    }
}
```
Examples:
```rust
let success: Result0<i32, &str> = Ok(10);
success.unwrap_or(0) // 10

let failure: Result0<i32, &str> = Err("bad");
failure.unwrap_or(0) // 0

let result: Result0<i32, &str> = Err("parse error");
let val = result.unwrap_or_else(|e| {
    eprintln!("Error: {}", e); // ✅ Has access to error!
    0
});

// Key difference: unwrap_or vs unwrap_or_else
fn expensive_default() -> i32 {
    println!("Computing default...");
    42
}

// expensive_default() is always called to build the argument,
// even though its result is not used!
let result: Result0<i32, &str> = Ok(10);
let out = result.unwrap_or(expensive_default());

// expensive_default() is only called if result is Err,
// which in this case it is not, so we avoid the unnecessary computation!
let result: Result0<i32, &str> = Ok(10);
let out = result.unwrap_or_else(|_| expensive_default());
```
map - Transform Success
Transform the Ok value, leave Err unchanged:
```rust
impl<T, E> Result0<T, E> {
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Result0<U, E> {
        match self {
            Ok(x) => Ok(f(x)),
            Err(e) => Err(e),
        }
    }
}
```
Examples:
```rust
let success: Result0<i32, &str> = Ok(5);
success.map(|x| x * 2) // Ok(10)

let failure: Result0<i32, &str> = Err("bad");
failure.map(|x| x * 2) // Err("bad") - unchanged!

// Misconception: map transforms both Ok and Err
// ❌ Wrong! map ONLY transforms Ok values
let result: Result0<i32, &str> = Err("error");
result.map(|x| x.to_string()) // Still Err("error"), not transformed

// Chain transformations
Ok(5).map(|x| x * 2).map(|x| x + 1) // Ok(11)
```
map_err - Transform Error
Transform the Err value, leave Ok unchanged:
```rust
impl<T, E> Result0<T, E> {
    fn map_err<F2, O: FnOnce(E) -> F2>(self, op: O) -> Result0<T, F2> {
        match self {
            Ok(x) => Ok(x),
            Err(e) => Err(op(e)),
        }
    }
}
```
Examples:
```rust
let success: Result0<i32, &str> = Ok(5);
success.map_err(|e| e.to_uppercase()) // Ok(5) - unchanged!

let failure: Result0<i32, &str> = Err("bad");
failure.map_err(|e| e.to_uppercase()) // Err("BAD")

// map_err ONLY transforms Err values
Ok(42).map_err(String::from) // Still Ok(42), not transformed

// Convert error types
#[derive(Debug)]
enum AppError {
    IoError(String),
    ParseError(String),
}

let result: Result0<i32, &str> = Err("file not found");
result.map_err(|e| AppError::IoError(e.to_string())) // Err(AppError::IoError(...))
```
and_then - Chain Fallible Operations
The most important combinator. Chain operations that might fail:
```rust
impl<T, E> Result0<T, E> {
    fn and_then<U, F: FnOnce(T) -> Result0<U, E>>(self, f: F) -> Result0<U, E> {
        match self {
            Ok(x) => f(x),
            Err(e) => Err(e),
        }
    }
}
```
Examples:
```rust
fn safe_divide(a: i32, b: i32) -> Result0<i32, &'static str> {
    if b == 0 {
        Err("division by zero")
    } else {
        Ok(a / b)
    }
}

// Misconception: use map for Result-returning functions
let x: Result0<i32, &str> = Ok(10);
// x.map(|n| safe_divide(n, 2)) // ❌ Result0<Result0<i32, &str>, &str> - nested!

// ✅ Use and_then to avoid nesting
x.and_then(|n| safe_divide(n, 2)) // Result0<i32, &str> - flattened

// Chain multiple fallible operations
Ok(20)
    .and_then(|n| safe_divide(n, 2)) // Ok(10)
    .and_then(|n| safe_divide(n, 5)) // Ok(2)

// Errors propagate
Ok(10)
    .and_then(|n| safe_divide(n, 0)) // Err("division by zero")
    .and_then(|n| safe_divide(n, 2)) // Still Err, second operation skipped
```
ok - Convert to Option
Discard the error, convert to Option:
```rust
impl<T, E> Result0<T, E> {
    fn ok(self) -> Option0<T> {
        match self {
            Ok(x) => Option0::Some(x),
            Err(_) => Option0::None,
        }
    }

    fn err(self) -> Option0<E> {
        match self {
            Ok(_) => Option0::None,
            Err(e) => Option0::Some(e),
        }
    }
}
```
Examples:
```rust
// ok() - Extract success value, discard error type
let success: Result0<i32, &str> = Ok(42);
success.ok() // Some(42)

let failure: Result0<i32, &str> = Err("something went wrong");
failure.ok() // None - error information lost!

// ✅ Use ok() when you don't care about the error
let port = parse_port("8080")
    .ok()
    .unwrap_or(3000); // Use default if parse fails, don't care why

// err() - Extract error value, discard success value
let success: Result0<i32, &str> = Ok(42);
success.err() // None

let failure: Result0<i32, &str> = Err("bad input");
failure.err() // Some("bad input")

// Use case: Collecting errors
let results = vec![Ok(1), Err("error1"), Ok(2), Err("error2")];

let errors: Vec<&str> = results
    .into_iter()
    .filter_map(|r| r.err())
    .collect();
errors // ["error1", "error2"]
```
as_ref - Borrow the Inner Values
Convert &Result0<T, E> to Result0<&T, &E>:
```rust
impl<T, E> Result0<T, E> {
    fn as_ref(&self) -> Result0<&T, &E> {
        match self {
            Ok(x) => Result0::Ok(x),
            Err(e) => Result0::Err(e),
        }
    }
}
```
Examples:
```rust
// Problem: map consumes the Result
let result: Result0<String, String> = Ok(String::from("hello"));
let len = result.map(|s| s.len());
// println!("{:?}", result); // ❌ result was moved!

// ✅ Solution: Use as_ref() to borrow
let result: Result0<String, String> = Ok(String::from("hello"));
let len = result.as_ref().map(|s| s.len()); // s is &String
len // Ok(5)
println!("{:?}", result); // ✅ Works! result still valid

// Multiple operations on the same Result
let data: Result0<String, &str> = Ok(String::from("test"));

let len = data.as_ref().map(|s| s.len());
let uppercase = data.as_ref().map(|s| s.to_uppercase());
let is_empty = data.as_ref().map(|s| s.is_empty());

len // Ok(4)
uppercase // Ok("TEST")
is_empty // Ok(false)
// data is still usable!

// Works with errors too
let failure: Result0<i32, String> = Err(String::from("error"));
let borrowed = failure.as_ref(); // Result0<&i32, &String>
borrowed // Err(&String::from("error"))
```
The ? Operator
Chaining with and_then works, but gets verbose:
```rust
fn process_config() -> Result0<Config, Error> {
    read_file("config.txt")
        .and_then(|content| parse_config(&content))
        .and_then(|raw| validate_config(raw))
        .and_then(|valid| apply_defaults(valid))
}
```
Rust's ? operator makes this cleaner:
```rust
fn process_config() -> Result<Config, Error> {
    let content = read_file("config.txt")?;
    let raw = parse_config(&content)?;
    let valid = validate_config(raw)?;
    apply_defaults(valid)
}
```
The ? operator is syntax sugar. This:
```rust
let content = read_file("config.txt")?;
```
...expands to roughly this:
```rust
let content = match read_file("config.txt") {
    Ok(val) => val,          // Unwrap and continue
    Err(e) => return Err(e), // Early return with error
};
```
So the whole function:
```rust
fn process_config() -> Result<Config, Error> {
    let content = read_file("config.txt")?;
    let raw = parse_config(&content)?;
    apply_defaults(raw)
}
```
...is equivalent to:
```rust
fn process_config() -> Result<Config, Error> {
    let content = match read_file("config.txt") {
        Ok(val) => val,
        Err(e) => return Err(e),
    };
    let raw = match parse_config(&content) {
        Ok(val) => val,
        Err(e) => return Err(e),
    };
    apply_defaults(raw)
}
```
We can't implement ? for our custom type (it requires the Try trait which is unstable), but understanding what it does is essential.
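As a sketch of that expansion, a small macro can mimic `?` for a hand-rolled Result type. This is an illustration, not the book's code: the `try0!` macro name and the helper functions are our own, and a minimal `Result0` is defined inline so the example is self-contained.

```rust
// Minimal Result0 so the example stands alone.
#[derive(Debug, PartialEq)]
enum Result0<T, E> {
    Ok(T),
    Err(E),
}

// try0! mimics what `?` desugars to:
// unwrap the Ok value, or early-return the Err from the enclosing function.
macro_rules! try0 {
    ($expr:expr) => {
        match $expr {
            Result0::Ok(val) => val,
            Result0::Err(e) => return Result0::Err(e),
        }
    };
}

// A hypothetical fallible step for demonstration.
fn parse_even(n: i32) -> Result0<i32, &'static str> {
    if n % 2 == 0 { Result0::Ok(n) } else { Result0::Err("odd") }
}

fn double_even(n: i32) -> Result0<i32, &'static str> {
    let v = try0!(parse_even(n)); // early-returns Err("odd") for odd input
    Result0::Ok(v * 2)
}

fn main() {
    assert_eq!(double_even(4), Result0::Ok(8));
    assert_eq!(double_even(3), Result0::Err("odd"));
    println!("try0! short-circuits like ?");
}
```

The macro only emulates the happy-path mechanics; the real `?` additionally converts error types via `From`, which a one-line macro doesn't capture.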
The ? Operator is Also Monadic
Both and_then and ? are monadic operations - they both short-circuit on errors, just in different styles.
and_then - Functional style (expression-based):
```rust
// Linear chain:
fn calculate(input: &str) -> Result0<i32, &str> {
    parse_int(input)
        .and_then(|n| safe_divide(n, 2))
        .and_then(|n| check_positive(n))
        .map(|n| n * 10)
}
// If any step returns Err, the chain stops and returns that Err

// Nested pattern - same calculation, nested style (like Scala's for-comprehension):
fn calculate_nested(input: &str) -> Result0<i32, &str> {
    parse_int(input).and_then(|n|
        safe_divide(n, 2).and_then(|n2|
            check_positive(n2).map(|n3| n3 * 10)
        )
    )
}
// Same calculation as the linear chain, but nested. Demonstrates short-circuiting beautifully:
// If parse_int returns Err, the nested closures are NEVER invoked at all!
```
? - Imperative style (statement-based):
```rust
fn calculate(input: &str) -> Result0<i32, &str> {
    let n = parse_int(input)?;  // Returns Err if parse fails
    let n = safe_divide(n, 2)?; // Returns Err if divide fails
    let n = check_positive(n)?; // Returns Err if check fails
    Ok(n * 10)
}
// If any step returns Err, the function returns early with that Err
```
Both do the same thing: stop on first error and propagate it up.
Visualizing ? short-circuit:
```rust
fn multi_step() -> Result0<i32, &'static str> {
    let a = step1()?;  // Ok(5) - continues
    let b = step2(a)?; // Err("failed") - returns immediately
    let c = step3(b)?; // Never runs
    let d = step4(c)?; // Never runs
    Ok(d)              // Never runs
}
// Returns: Err("failed")

// Expanded to show what happens:
fn multi_step_expanded() -> Result0<i32, &'static str> {
    let a = match step1() {
        Ok(val) => val,
        Err(e) => return Err(e), // Early return
    };
    let b = match step2(a) {
        Ok(val) => val,
        Err(e) => return Err(e), // Early return - stops here!
    };
    // Everything below never executes
    let c = match step3(b) {
        Ok(val) => val,
        Err(e) => return Err(e),
    };
    let d = match step4(c) {
        Ok(val) => val,
        Err(e) => return Err(e),
    };
    Ok(d)
}
```
Key insight: Both and_then and ? implement the same monadic pattern:
- Execute an operation that might fail
- If it succeeds, continue with the result
- If it fails, stop immediately and propagate the error
This is why Result-based error handling in Rust is so ergonomic - errors automatically bubble up without explicit checking at every step.
Result vs Option
| Situation | Use |
|---|---|
| Value might not exist | Option<T> |
| Operation might fail | Result<T, E> |
| Need to know why it failed | Result<T, E> |
| Don't care about error details | Option<T> |
Converting between them:
```rust
// Option -> Result
impl<T> Option0<T> {
    fn ok_or<E>(self, err: E) -> Result0<T, E> {
        match self {
            Option0::Some(x) => Result0::Ok(x),
            Option0::None => Result0::Err(err),
        }
    }

    fn ok_or_else<E, F: FnOnce() -> E>(self, f: F) -> Result0<T, E> {
        match self {
            Option0::Some(x) => Result0::Ok(x),
            Option0::None => Result0::Err(f()),
        }
    }
}

// Result -> Option (already shown above with .ok())
```
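A quick round-trip using the standard library's equivalents (`Option::ok_or` and `Result::ok`, which mirror the `Option0`/`Result0` methods above) shows the two conversions are inverses when a value is present:

```rust
fn main() {
    // Option -> Result: attach an error for the None case
    let some: Option<i32> = Some(5);
    let none: Option<i32> = None;
    assert_eq!(some.ok_or("missing"), Ok(5));
    assert_eq!(none.ok_or("missing"), Err("missing"));

    // Result -> Option: discard the error details
    let ok: Result<i32, &str> = Ok(5);
    let err: Result<i32, &str> = Err("boom");
    assert_eq!(ok.ok(), Some(5));
    assert_eq!(err.ok(), None);

    // Round trip: Some -> Ok -> Some
    assert_eq!(some.ok_or("missing").ok(), Some(5));
}
```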
Implementation
See the full code in src/result.rs for the complete implementation of Result0 with all methods.
Also, see the exercises in 02_result.rs
Key Takeaways
- Errors are values - Not hidden control flow like exceptions. The compiler forces you to handle them.
- The type signature tells the truth - `Result<T, E>` means "this can fail". No surprises, no invisible exceptions.
- E can be any type - String, &str, enums, integers, custom types. No special traits required. Just wrap it in `Err()`.
- map for success, map_err for errors - Transform either side independently. Only one variant changes at a time.
- and_then chains fallible operations - The workhorse of error handling. Flattens nested Results and short-circuits on the first error.
- Two styles, same pattern - Linear chains (`and_then`) and nested closures both demonstrate monadic short-circuiting. If any step fails, everything stops.
- ? is syntax sugar for and_then + early return - Imperative style that does the same thing. Use it in real code.
- Short-circuit behavior is free - Errors automatically propagate up without explicit checking at every step. That's why Result-based error handling is so ergonomic.
Next Chapter
Box - Heap allocation and the Deref trait.
Chapter 3: Box - Heap Allocation
Common Misconceptions
If you come from Java or C#, you might think this allocates on the heap:
```rust
struct Point { x: i32, y: i32 }

impl Point {
    fn new(x: i32, y: i32) -> Point {
        Point { x, y }
    }
}

let p = Point::new(3, 2); // Is this on the heap?
```
No. In Rust, new is just a method name - there's nothing special about it. The p: Point lives on the stack.
Another surprise: arrays are also on the stack:
```rust
let arr = [0u8; 1000]; // 1000 bytes on the STACK, not heap!
```
Heap Allocation Across Languages
In many languages, heap allocation happens through keywords:
- Java/C#: `new` is a keyword that allocates on the heap
- JavaScript: Creating objects/arrays automatically uses the heap
- Python: All objects live on the heap

In C, heap allocation is a function call:
- `malloc()`, `calloc()`, `free()` - explicit function calls

In Rust, there's no keyword for heap allocation. Instead, it's wrapped in types:
- `Box::new()` - allocate a single value
- `Vec::new()` - allocate a growable array
- `String::new()` - allocate a growable string

The raw allocation functions (`alloc`, `dealloc`) are unsafe and require manual memory management. You're not supposed to call them directly.
Key insight: All methods that internally call alloc are doing heap allocation. Box::new(), Vec::push(), String::from() - they all ultimately call alloc underneath, but wrap it in safe APIs that handle deallocation automatically.
Box is the simplest and most direct safe wrapper around heap allocation.
Stack vs Heap
When you create a variable in Rust, it lives on the stack by default:
```rust
let x = 42;               // 4 bytes on stack
let y = [0u8; 1000];      // 1000 bytes on stack (!)
let p = Point::new(3, 2); // 8 bytes on stack
```
Stack allocation is fast but limited:
- Size must be known at compile time
- Data is dropped when the function returns
- Stack space is limited (typically 1-8 MB)
The heap is for dynamic allocation:
- Size can be determined at runtime
- Data lives until explicitly freed
- Much larger (limited by RAM)
- Requires explicit action in Rust (via `Box`, `Vec`, `String`, etc.)
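A small check with `std::mem::size_of_val` makes the difference concrete. The sketch below uses the real `Box` (the byte counts assume a typical 64-bit target):

```rust
use std::mem::size_of_val;

fn main() {
    let on_stack = [0u8; 1000];          // whole array lives on the stack
    let on_heap = Box::new([0u8; 1000]); // array lives on the heap

    // The local variable for the array is 1000 bytes...
    assert_eq!(size_of_val(&on_stack), 1000);
    // ...while the local variable for the Box is just a pointer (8 bytes on 64-bit)
    assert_eq!(size_of_val(&on_heap), 8);
}
```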
What is Box?
Box<T> is the simplest smart pointer. It:
- Allocates memory on the heap
- Stores a value there
- Keeps a pointer to that memory on the stack
- Automatically frees the memory when dropped
Why Use Box?
1. Recursive Types
This won't compile:
```rust
enum List {
    Cons(i32, List), // Error: recursive type `List` has infinite size
    Nil,
}
```
Memory layout without Box (infinite, this confuses the compiler!):
The compiler tries to calculate: size(List) = 4 + size(List) = 4 + 4 + size(List) = ... - it never ends.
Fixed with Box:
```rust
enum List {
    Cons(i32, Box<List>), // Box has a known size (a pointer)
    Nil,
}
```
Why doesn't Box<List> have the same problem?
Look at what Box actually is:
```rust
struct Box<T> {
    ptr: *mut T, // Just a pointer! T is not stored here.
}
```
The T in Box<T> is only a generic parameter - it tells the compiler what type the pointer points to, but T is never a field inside Box. The struct is always just a pointer (8 bytes).
So Box<List> doesn't contain a List. It contains a pointer to a List somewhere. The size of Box<List> is always 8 bytes, regardless of what List is.
Memory layout with Box, fixed!:
Each `Box` is just 8 bytes storing a heap address (e.g. `0x1000`, `0x2000`); the `List` nodes it points to live at those addresses.
Now the compiler can finish the calculation: size(List) = max(size(Cons), size(Nil)) = max(4 + 8, 0) = 12 bytes (ignoring the discriminant and alignment padding). Done!
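You can verify this with `std::mem::size_of`. The exact numbers are target-dependent; the assertions below were checked on a typical 64-bit target, where padding and niche optimization make `List` 16 bytes rather than the simplified 12:

```rust
#[allow(dead_code)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn main() {
    use std::mem::size_of;

    // The Box itself is always pointer-sized, no matter how long the list is
    assert_eq!(size_of::<Box<List>>(), 8);

    // List now has a finite size. The compiler reports 16, not 12: the i32 is
    // padded so the pointer stays 8-byte aligned, and the Nil discriminant
    // hides in the Box's never-null "niche", so no extra tag byte is needed.
    assert_eq!(size_of::<List>(), 16);
}
```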
2. Large Data
In Rust, move = memcpy. When you pass a value to a function or assign it to another variable, Rust copies the bytes:
```rust
let huge = [0u8; 1_000_000]; // 1MB array on stack

fn process(data: [u8; 1_000_000]) { /* ... */ }

process(huge); // Copies 1MB of bytes to the function's stack frame!
```
With Box, the large data lives on the heap. Only the pointer (8 bytes) is on the stack:
```rust
let boxed = Box::new([0u8; 1_000_000]); // 1MB on heap, 8-byte ptr on stack

fn process(data: Box<[u8; 1_000_000]>) { /* ... */ }

process(boxed); // Copies only 8 bytes (the pointer), not 1MB!
```
The heap data stays in place. Only the pointer moves.
3. Trait Objects (Dynamic Dispatch)
Sometimes you want to return different types that implement the same trait. Without Box, this is impossible:
```rust
trait Animal {
    fn sound(&self) -> &str;
}

struct Dog;
struct Cat;

impl Animal for Dog {
    fn sound(&self) -> &str { "Woof!" }
}
impl Animal for Cat {
    fn sound(&self) -> &str { "Meow!" }
}

// ❌ This doesn't work - different return types!
fn make_animal(dog: bool) -> ??? {
    if dog {
        Dog // Type: Dog (size: 0 bytes)
    } else {
        Cat // Type: Cat (size: 0 bytes)
    }
}
```
The problem: Functions must have a single, known return type. Dog and Cat are different types, even if they both implement Animal.
Solution 1: Generic (Static Dispatch) - Doesn't work here:
```rust
// ❌ Won't compile - can't return different types
fn make_animal<T: Animal>(dog: bool) -> T {
    if dog {
        Dog // T must be Dog
    } else {
        Cat // T must be Cat - conflict!
    }
}
```
The problem: T is a single concrete type chosen by the caller, but we're trying to return two different types based on runtime logic.
Solution 2: Trait Objects (Dynamic Dispatch) - Use Box<dyn Animal>:
```rust
// ✅ Works! Returns a trait object
fn make_animal(dog: bool) -> Box<dyn Animal> {
    if dog {
        Box::new(Dog) // Box<Dog> → Box<dyn Animal>
    } else {
        Box::new(Cat) // Box<Cat> → Box<dyn Animal>
    }
}

// Usage
let animal = make_animal(true);
animal.sound(); // "Woof!" - decided at runtime
```
Key takeaway: Use Box<dyn Trait> when you need to:
- Return different types from the same function
- Store different types in the same collection
- Decide which type to use at runtime (plugins, configuration, user input)
Use generics (<T: Trait>) when you:
- Know the type at compile time
- Want maximum performance
- Don't need to mix different types
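The "store different types in the same collection" case can be sketched with the same `Animal` trait, using the real `Box` and `Vec`:

```rust
trait Animal { fn sound(&self) -> &str; }
struct Dog;
struct Cat;
impl Animal for Dog { fn sound(&self) -> &str { "Woof!" } }
impl Animal for Cat { fn sound(&self) -> &str { "Meow!" } }

fn make_animal(dog: bool) -> Box<dyn Animal> {
    if dog { Box::new(Dog) } else { Box::new(Cat) }
}

fn main() {
    // Different concrete types, one collection - only possible behind Box<dyn ...>
    let zoo: Vec<Box<dyn Animal>> = vec![make_animal(true), make_animal(false)];
    let sounds: Vec<&str> = zoo.iter().map(|a| a.sound()).collect();
    assert_eq!(sounds, ["Woof!", "Meow!"]);
}
```

Each call to `.sound()` goes through the trait object's vtable - the cost of deciding types at runtime.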
Building Our Own Box
We can't truly replicate Box without compiler magic, but we can understand its core:
```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

struct Box0<T> {
    ptr: *mut T,
}
```
new - Allocate and Store
```rust
impl<T> Box0<T> {
    fn new(value: T) -> Box0<T> {
        unsafe {
            // 1. Calculate memory layout for T
            let layout = Layout::new::<T>();

            // 2. Allocate memory
            let ptr = alloc(layout) as *mut T;
            if ptr.is_null() {
                std::alloc::handle_alloc_error(layout);
            }

            // 3. Write the value to the allocated memory
            ptr::write(ptr, value);

            Box0 { ptr }
        }
    }
}
```
The Deref Trait
Deref lets us use *box_value to get the inner value:
```rust
use std::ops::Deref;

impl<T> Deref for Box0<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        unsafe { &*self.ptr }
    }
}
```
With Deref, this works:
```rust
let b = Box0::new(5);
*b; // 5

// Can we move out with *b?
let c = *b; // ✅ Works! i32 is Copy, so this copies the value
c; // 5

// What about non-Copy types?
let s = Box0::new(String::from("hello"));

// Borrowing through the deref works fine:
let borrowed: &String = &*s; // ✅ a reference to the boxed String

// But asking for ownership fails:
// let owned: String = *s; // ❌ ERROR: cannot move out of `*s`
```
What's happening?
`*s` desugars to `*Deref::deref(&s)`: `deref()` returns a `&String`, and the outer `*` turns that reference into a *place* of type `String`. Whether the code compiles depends on what you do with that place:
```rust
let borrowed = &*s;          // ✅ Reborrows the place -> &String, works
let borrowed: &String = &*s; // ✅ Same thing, spelled out
let owned: String = *s;      // ❌ Tries to move the String out, fails!
```
When you write `let owned: String = *s;`:
- `deref()` only hands out a `&String` (a reference)
- You're asking for a `String` (ownership, not a reference)
- The compiler asks: "Can I move the `String` out from behind that `&String`?"
- Answer: No! Moving out of a reference is not allowed → error!

The key insight:
- Copy types (`i32`): `let c = *b;` copies the value through the `&i32` → works!
- Non-Copy types, borrowing: `let x = &*s;` gives a `&String` → works!
- Non-Copy types, asking for ownership: `let x: String = *s;` → fails!
To get ownership of non-Copy types, use into_inner() to consume the box.
DerefMut for Mutable Access
```rust
use std::ops::DerefMut;

impl<T> DerefMut for Box0<T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { &mut *self.ptr }
    }
}
```
Now we can mutate:
```rust
let mut b = Box0::new(5);
*b = 10;
```
Drop - Clean Up Memory
The whole point of smart pointers: automatic cleanup.
```rust
impl<T> Drop for Box0<T> {
    fn drop(&mut self) {
        unsafe {
            // 1. Call the destructor on the value
            ptr::drop_in_place(self.ptr);

            // 2. Deallocate the memory
            let layout = Layout::new::<T>();
            dealloc(self.ptr as *mut u8, layout);
        }
    }
}
```
When is Drop called?
```rust
// 1. End of scope
{
    let b = Box0::new(String::from("hello"));
    println!("Using box: {}", *b);
} // Drop called here - memory freed automatically

// 2. Explicit drop
let b = Box0::new(42);
drop(b); // Drop called immediately
// b is no longer valid

// 3. Reassignment
let mut b = Box0::new(10);
b = Box0::new(20); // Drop called on the old Box0(10), then the new one is assigned

// 4. NOT called when using into_inner, leak, or into_raw
let b = Box0::new(5);
let value = b.into_inner(); // Drop NOT called - cleanup was handled manually
```
into_inner - Extract Value Without Drop
Sometimes you want to move the value out of the box and take ownership, deallocating the box itself but keeping the value:
```rust
impl<T> Box0<T> {
    fn into_inner(self) -> T {
        unsafe {
            // 1. Read the value from the heap into a stack variable.
            //    ptr::read copies/moves T from *self.ptr (heap) to a local.
            let value = ptr::read(self.ptr);

            // 2. Deallocate the heap memory (the value is now on the stack!)
            let layout = Layout::new::<T>();
            dealloc(self.ptr as *mut u8, layout);

            // 3. Prevent Drop from running
            std::mem::forget(self);

            // 4. Return the extracted value (moves to the caller)
            value
        }
    }
}
```
Why mem::forget(self)?
Without it, Drop::drop would run when self goes out of scope at the end of this function. But we already:
- Read the value with `ptr::read`
- Deallocated the memory with `dealloc`
If Drop ran, it would:
- Try to drop the value that's no longer there (use-after-free!)
- Try to deallocate already-freed memory (double-free!)
Both are undefined behavior. mem::forget tells the compiler "don't run Drop on this value."
Example:
```rust
let boxed = Box0::new(String::from("hello"));
let s = boxed.into_inner(); // Extract the String, deallocate the Box
s; // "hello" - we now own the String directly
// boxed is consumed, but Drop wasn't called
```
Wait, is the String on the stack now?
Not quite! Remember that String itself is just a struct:
```rust
struct String {
    ptr: *mut u8, // Pointer to heap data
    len: usize,
    cap: usize,
}
```
After into_inner():
- The `String` struct (24 bytes: ptr + len + cap) is now on the stack
- The actual string data `"hello"` is still on the heap
- We freed the memory where `Box0` stored the String struct, but not the string's data
The String still owns its heap-allocated data - we just moved the String struct itself from heap to stack!
leak - Intentionally Leak Memory
Sometimes you want to keep data alive forever without deallocation:
```rust
impl<T> Box0<T> {
    fn leak(self) -> &'static mut T {
        let ptr = self.ptr;
        std::mem::forget(self); // Don't run Drop
        unsafe { &mut *ptr }
    }
}
```
Example:
```rust
let boxed = Box0::new(42);
let leaked: &'static mut i32 = boxed.leak();
*leaked = 100; // Can mutate forever
// The memory is never freed!
```
Is this safe?
Yes! leak() is safe (not marked unsafe) because:
- It doesn't cause undefined behavior
- The returned reference has a `'static` lifetime, valid for the entire program
- The heap memory stays allocated and accessible through the reference
What if we don't save the returned reference?
```rust
fn leak_and_lose() {
    let boxed = Box0::new(42);
    let leaked: &'static mut i32 = boxed.leak(); // Get the reference
    // leaked is a local variable that is destroyed when the function ends.
    // But the heap memory it points to? Still there!
}

leak_and_lose();
// The function has returned and the local variable 'leaked' is gone.
// The i32 is still on the heap at some address, taking up 4 bytes,
// but we have no way to reach it anymore - the reference is lost!
```
This is still safe (no UB), but it's a useless memory leak:
- The heap memory is leaked (never freed)
- The local variable `leaked` (just a pointer on the stack) is destroyed
- But we can't access the heap data because we lost the reference
- The data sits in memory for the rest of the program, wasting space
When leaked goes out of scope, only the reference (a stack pointer) is removed. The heap data remains forever - that's the whole point of leak().
Use cases: global state, thread-local storage, or when interfacing with C code that expects static lifetime.
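The real `Box::leak` works the same way. A minimal sketch of the "keep a value alive for the whole program" use case (the `config` name is just for illustration):

```rust
fn needs_static(s: &'static str) -> usize {
    s.len()
}

fn main() {
    // Build a "configuration" string once at startup, then leak it so the
    // reference lives for the rest of the program.
    let config: &'static str = Box::leak(String::from("mode=fast").into_boxed_str());

    // The reference can now be passed anywhere that demands 'static
    assert_eq!(needs_static(config), 9);
}
```

This trades a one-time, bounded leak for never having to thread a lifetime through the rest of the program.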
into_raw and from_raw - Raw Pointer Conversion
Convert to/from raw pointers for FFI or manual memory management:
```rust
impl<T> Box0<T> {
    fn into_raw(self) -> *mut T {
        let ptr = self.ptr;
        std::mem::forget(self); // Don't run Drop
        ptr
    }

    unsafe fn from_raw(ptr: *mut T) -> Box0<T> {
        Box0 { ptr }
    }
}
```
Example:
```rust
extern "C" {
    fn c_process_data(ptr: *mut String);
}

let boxed = Box0::new(String::from("hello"));
let ptr = Box0::into_raw(boxed);

unsafe { c_process_data(ptr); } // Pass to C

// from_raw is UNSAFE - you must guarantee the pointer came from into_raw
let restored = unsafe { Box0::from_raw(ptr) }; // Get it back
```
Warning: from_raw is unsafe because:
- The pointer must have come from `into_raw`
- You must not use the raw pointer again after calling `from_raw` (double-free!)
- The pointer must not be null
Compare to dereferencing and moving:
```rust
let boxed = Box0::new(String::from("hello"));
let s = *boxed; // ❌ ERROR: cannot move out of `*boxed`
```
With `Box0` you need `into_inner()` to take ownership (or just let the box drop if you want both gone). Interestingly, this *does* work with the real `Box`: the compiler special-cases `std::boxed::Box` so that `let s = *boxed;` moves the value out and frees the allocation - a "deref move" that no user-defined type can replicate.
Deref Coercion
One of Rust's nicest features. When you have &Box0<T>, it can automatically become &T:
```rust
fn print_len(s: &str) {
    println!("Length: {}", s.len());
}

let boxed = Box0::new(String::from("hello"));
print_len(&boxed); // &Box0<String> -> &String -> &str
```
How does this work?
Deref coercion is a special compiler feature that only works with the Deref trait. The compiler automatically inserts deref calls to make types match:
- You pass `&boxed`, which is `&Box0<String>`
- The function expects `&str`
- The compiler tries: "Can I turn `&Box0<String>` into `&str`?"
- First deref: `&Box0<String>` → calls `deref()` → `&String`
- Second deref: `&String` → calls `deref()` → `&str` ✅ Match!
The compiler chains Deref implementations automatically. This only works with:
- The `Deref` trait (for immutable references)
- The `DerefMut` trait (for mutable references)
You can't create your own trait with this behavior - it's built into the compiler specifically for Deref/DerefMut.
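The same two-step chain can be exercised with the real `Box`, which implements `Deref` exactly like our `Box0`:

```rust
fn print_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let boxed: Box<String> = Box::new(String::from("hello"));

    // &Box<String> -> &String (Box's Deref) -> &str (String's Deref)
    assert_eq!(print_len(&boxed), 5);

    // The same chain powers method lookup: starts_with is a str method,
    // reached through Box<String> -> String -> str
    assert!(boxed.starts_with("he"));
}
```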
Vec and String: Box with Extra Metadata
Box isn't the only type that uses heap allocation. Vec and String do too - they're essentially "fat" pointers with extra fields:
```rust
// Simplified Vec definition
struct Vec<T> {
    ptr: *mut T, // Pointer to heap data (like Box)
    len: usize,  // Number of elements currently stored
    cap: usize,  // Total allocated capacity
}

// Simplified String definition
struct String {
    vec: Vec<u8>, // String is just a Vec<u8> with a UTF-8 guarantee
}

// Which means String is really:
struct String {
    ptr: *mut u8, // Pointer to heap-allocated bytes
    len: usize,   // Length in bytes
    cap: usize,   // Capacity in bytes
}
```
Compare to Box:
```rust
struct Box<T> {
    ptr: *mut T, // Just the pointer, nothing else
}
```
| Type | Stack size | Heap data |
|---|---|---|
| `Box<T>` | 8 bytes (ptr) | `T` |
| `Vec<T>` | 24 bytes (ptr + len + cap) | `[T; cap]` |
| `String` | 24 bytes (ptr + len + cap) | `[u8; cap]` |
All three:
- Allocate on the heap
- Implement `Deref` for ergonomic access
- Implement `Drop` to free memory automatically
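The stack sizes in the table can be checked against the real types with `std::mem::size_of` (the numbers assume a 64-bit target):

```rust
use std::mem::size_of;

fn main() {
    // Stack footprint of the three heap-owning types on a 64-bit target
    assert_eq!(size_of::<Box<u8>>(), 8);  // just a pointer
    assert_eq!(size_of::<Vec<u8>>(), 24); // ptr + len + cap
    assert_eq!(size_of::<String>(), 24);  // same layout as Vec<u8>
}
```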
Exercises
See the full code in src/box.rs for the complete implementation of Box0 with all methods.
Also, see the exercises in 03_box.rs
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises
Key Takeaways
- Box is just a pointer - Single pointer to heap-allocated data
- Deref enables ergonomics - Use `*b` to access the inner value
- Drop ensures cleanup - Memory freed when the Box goes out of scope
- Deref coercion is magic - `&Box<T>` automatically becomes `&T`
- Use for recursive types - Break infinite size with indirection
Next Chapter
Vec - Growable arrays built directly on the allocator APIs.
Chapter 4: Vec - Growable Arrays
The Problem: Fixed-Size Arrays
Arrays in Rust have a fixed size known at compile time:
```rust
let arr: [i32; 3] = [1, 2, 3]; // Can't grow or shrink
```
What if we need a collection that can grow dynamically at runtime?
Vec: A Growable Array
Vec<T> is Rust's dynamically-sized array type. Unlike Box<T> which allocates a single value on the heap, Vec<T> allocates a contiguous block of memory that can grow or shrink.
The Three Fields of Vec
```rust
pub struct Vec<T> {
    ptr: *mut T,     // Pointer to the heap-allocated array
    len: usize,      // Number of elements currently in use
    capacity: usize, // Total allocated space (in elements)
}
```
Key insight: len <= capacity always.
Heap memory:
```
[1, 2, 3, ?, ?, ?]
 └──┬──┘  └──┬──┘
 len = 3   unused slots
 capacity = 6 (total slots)
```
Why Not Use Box?
Box<T> allocates space for exactly one T. To grow, we'd need to:
- Allocate a new `Box`
- Copy all elements
- Deallocate the old `Box`
Instead, Vec uses the allocator APIs directly (alloc, realloc, dealloc) to:
- Allocate more space than immediately needed (capacity > len)
- Grow in-place when possible
- Only reallocate when we run out of capacity
Implementing Vec
Basic Structure
```rust
use std::alloc::{alloc, dealloc, realloc, Layout};
use std::ptr;

pub struct Vec0<T> {
    ptr: *mut T,
    len: usize,
    capacity: usize,
}

impl<T> Vec0<T> {
    pub fn new() -> Vec0<T> {
        Vec0 {
            ptr: std::ptr::NonNull::dangling().as_ptr(),
            len: 0,
            capacity: 0,
        }
    }

    pub fn len(&self) -> usize { self.len }
    pub fn capacity(&self) -> usize { self.capacity }
    pub fn is_empty(&self) -> bool { self.len == 0 }
}
```
Push - Adding Elements
When len == capacity, we need to grow:
```rust
impl<T> Vec0<T> {
    pub fn push(&mut self, value: T) {
        if self.len == self.capacity {
            self.grow();
        }
        unsafe {
            // Write to the next available slot
            ptr::write(self.ptr.add(self.len), value);
        }
        self.len += 1;
    }

    fn grow(&mut self) {
        let new_capacity = if self.capacity == 0 {
            1
        } else {
            self.capacity * 2 // Double the capacity
        };

        let new_layout = Layout::array::<T>(new_capacity).unwrap();

        let new_ptr = if self.capacity == 0 {
            // First allocation
            unsafe { alloc(new_layout) as *mut T }
        } else {
            // Reallocate
            let old_layout = Layout::array::<T>(self.capacity).unwrap();
            unsafe {
                realloc(self.ptr as *mut u8, old_layout, new_layout.size()) as *mut T
            }
        };

        if new_ptr.is_null() {
            std::alloc::handle_alloc_error(new_layout);
        }

        self.ptr = new_ptr;
        self.capacity = new_capacity;
    }
}
```
Growth strategy: Start at 1, then double each time.
Capacity progression: 0 → 1 → 2 → 4 → 8 → 16 → 32 → ...
Why double? Amortized O(1) push operations: growing to capacity n copies at most 1 + 2 + 4 + ... + n/2 < n elements in total, so n pushes cost O(n) work overall - O(1) each on average.
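You can observe the effect with the real `Vec` (its exact growth factors are an implementation detail, but the shape is the same - reallocations grow logarithmically, not linearly, with the number of pushes):

```rust
fn main() {
    let mut v: Vec<i32> = Vec::new();
    let mut reallocations = 0;
    let mut last_cap = v.capacity();

    for i in 0..1000 {
        v.push(i);
        if v.capacity() != last_cap {
            reallocations += 1;
            last_cap = v.capacity();
        }
    }

    // Capacity always covers the length...
    assert!(v.capacity() >= v.len());
    // ...and 1000 pushes triggered only a handful of reallocations, not 1000
    assert!(reallocations <= 15);
    println!("{} pushes, {} reallocations", v.len(), reallocations);
}
```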
Note: The vec! macro is syntactic sugar for repeatedly calling push:
```rust
let v = vec![1, 2, 3];
// Expands to roughly:
// let mut v = Vec::new();
// v.push(1);
// v.push(2);
// v.push(3);
```
Here's a simplified implementation of the macro:
```rust
#[macro_export]
macro_rules! vec {
    () => {
        Vec::new()
    };
    ($elem:expr; $n:expr) => {
        // vec![0; 5] creates [0, 0, 0, 0, 0]
        {
            let mut v = Vec::with_capacity($n);
            v.resize($n, $elem);
            v
        }
    };
    ($($x:expr),+ $(,)?) => {
        // vec![1, 2, 3] creates a vector with the given elements
        {
            let mut v = Vec::new();
            $(v.push($x);)*
            v
        }
    };
}
```
The macro has three patterns:
- `vec![]` - creates an empty vector
- `vec![elem; n]` - creates a vector with `n` copies of `elem`
- `vec![x, y, z]` - creates a vector with the given elements
Pop - Removing Elements
```rust
impl<T> Vec0<T> {
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        unsafe { Some(ptr::read(self.ptr.add(self.len))) }
    }
}
```
Note: We don't shrink capacity on pop. The memory stays allocated.
Index Access
```rust
use std::ops::{Index, IndexMut};

impl<T> Index<usize> for Vec0<T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        if index >= self.len {
            panic!("index out of bounds: {} >= {}", index, self.len);
        }
        unsafe { &*self.ptr.add(index) }
    }
}

impl<T> IndexMut<usize> for Vec0<T> {
    fn index_mut(&mut self, index: usize) -> &mut T {
        if index >= self.len {
            panic!("index out of bounds: {} >= {}", index, self.len);
        }
        unsafe { &mut *self.ptr.add(index) }
    }
}
```
Now we can use vec[i]:
```rust
let mut vec = Vec0::new();
vec.push(10);
vec.push(20);

vec[0];     // 10
vec[1] = 99;
vec[1];     // 99
```
Drop Implementation
Critical! We must:
- Drop all elements (call their destructors)
- Deallocate the memory
```rust
impl<T> Drop for Vec0<T> {
    fn drop(&mut self) {
        if self.capacity > 0 {
            // Drop all elements
            unsafe {
                ptr::drop_in_place(
                    std::slice::from_raw_parts_mut(self.ptr, self.len)
                );
            }
            // Deallocate the memory
            let layout = Layout::array::<T>(self.capacity).unwrap();
            unsafe { dealloc(self.ptr as *mut u8, layout); }
        }
    }
}
```
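You can watch the element destructors fire with the real `Vec` (whose `Drop` behaves like the one above). The sketch below counts drops with a `std::cell::Cell` counter (`Cell` is covered in Chapter 5; the `Noisy` type is just for illustration):

```rust
use std::cell::Cell;

// A type whose destructor increments a shared counter
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn main() {
    let drops = Cell::new(0);
    {
        let mut v = Vec::new();
        for _ in 0..3 {
            v.push(Noisy { drops: &drops });
        }
        assert_eq!(drops.get(), 0); // nothing dropped yet
    } // Vec's Drop runs here: each element's destructor, then the buffer is freed
    assert_eq!(drops.get(), 3);
}
```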
Slices: Views into Vec
Important: Unlike Vec, Option, Result, or Box, slices ([T] and &[T]) are a language primitive built into the Rust compiler. You cannot implement your own slice type with identical behavior.
Why slices are special:
- `[T]` is a dynamically sized type (DST) - no known size at compile time
- The compiler has special knowledge of slices for:
  - Array to slice coercion: `&[1, 2, 3]` automatically becomes `&[i32]`
  - Slice syntax: `&vec[1..3]` uses built-in range operators
  - Pattern matching: `match slice { [first, rest @ ..] => ... }`
  - Indexing bounds checks are optimized by the compiler
Can we implement something slice-like? Yes! We can create a struct with (ptr, len) that behaves like a slice, but it won't have the same compiler integration. We'll show this in the exercises.
A slice `&[T]` is a view into contiguous memory. It's a fat pointer: `(ptr, len)` - an address plus an element count.
Convert Vec<T> to &[T]:
```rust
impl<T> Vec0<T> {
    pub fn as_slice(&self) -> &[T] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }

    pub fn as_mut_slice(&mut self) -> &mut [T] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}
```
Now we can use slice methods:
```rust
let mut vec = Vec0::new();
vec.push(1);
vec.push(2);
vec.push(3);

let slice = vec.as_slice();
slice.len();  // 3
slice[0];     // 1
slice.iter(); // Iterator over &T
```
Deref Coercion
Make Vec0<T> deref to [T]:
```rust
use std::ops::{Deref, DerefMut};

impl<T> Deref for Vec0<T> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        self.as_slice()
    }
}

impl<T> DerefMut for Vec0<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        self.as_mut_slice()
    }
}
```
Now we can call slice methods directly:
```rust
let mut vec = Vec0::new();
vec.push(3);
vec.push(1);
vec.push(2);

vec.sort(); // Calls [T]::sort()
vec.len();  // Works! (both Vec and slice have len())
vec.iter(); // Calls [T]::iter()
```
String is Just Vec
String is literally:
```rust
pub struct String {
    vec: Vec<u8>,
}
```
All String methods delegate to Vec:
```rust
impl String {
    pub fn new() -> String {
        String { vec: Vec::new() }
    }

    pub fn push_str(&mut self, s: &str) {
        self.vec.extend_from_slice(s.as_bytes());
    }

    pub fn as_str(&self) -> &str {
        unsafe { std::str::from_utf8_unchecked(&self.vec) }
    }
}
```
str is a Slice
&str is to String what &[T] is to Vec<T>:
| Owned | Borrowed view |
|---|---|
| `String` | `&str` |
| `Vec<u8>` | `&[u8]` (but `&str` is guaranteed valid UTF-8) |
Both are fat pointers:
```
&str = (ptr: *const u8, len: usize)
```
```rust
let s = String::from("hello");
let slice: &str = &s[0..3]; // "hel"
```
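The fat-pointer claim is easy to verify: on a 64-bit target a thin reference is one machine word, while slice references carry a length alongside the pointer.

```rust
use std::mem::size_of;

fn main() {
    // Thin pointer: one machine word (64-bit target)
    assert_eq!(size_of::<&u8>(), 8);

    // Fat pointers: pointer + length
    assert_eq!(size_of::<&str>(), 16);
    assert_eq!(size_of::<&[u8]>(), 16);

    let s = String::from("hello");
    let slice: &str = &s[0..3];
    assert_eq!(slice, "hel");
}
```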
Memory Layout Comparison
Array: Stack
[1, 2, 3]
Box: Heap (single value)
Box::new([1, 2, 3])
Vec: Heap (growable)
After vec.push(1); vec.push(2); vec.push(3); ... ; vec.push(7)
Vec on stack: 24 bytes (on 64-bit: 8 + 8 + 8)
Actual data: on heap
Slice: View (no ownership)
let slice = &vec[1..5]; // [2, 3, 4, 5, 6]
Common Operations
Creating a Vec
```rust
let vec = Vec0::new(); // ptr = dangling, len = 0, capacity = 0

let mut vec = Vec0::new();
vec.push(1); // First allocation: capacity = 1
vec.push(2); // Grows: capacity = 2
vec.push(3); // Grows: capacity = 4
```
Preallocating Capacity
```rust
impl<T> Vec0<T> {
    pub fn with_capacity(capacity: usize) -> Vec0<T> {
        if capacity == 0 {
            return Vec0::new();
        }

        let layout = Layout::array::<T>(capacity).unwrap();
        let ptr = unsafe { alloc(layout) as *mut T };
        if ptr.is_null() {
            std::alloc::handle_alloc_error(layout);
        }

        Vec0 { ptr, len: 0, capacity }
    }
}
```
Use when you know the size upfront:
```rust
let mut vec = Vec0::with_capacity(100);
// 100 pushes without reallocation
for i in 0..100 {
    vec.push(i);
}
```
Clear
```rust
impl<T> Vec0<T> {
    pub fn clear(&mut self) {
        // Drop all elements
        unsafe {
            ptr::drop_in_place(
                std::slice::from_raw_parts_mut(self.ptr, self.len)
            );
        }
        self.len = 0; // Capacity unchanged
    }
}
```
Key Differences: Box vs Vec
| Feature | Box<T> | Vec<T> |
|---|---|---|
| Size | Fixed (one T) | Dynamic |
| Capacity | Always equals size | Can exceed size |
| Growth | N/A | Doubles when full |
| Use case | Single heap value | Collection of values |
| Deref | To T | To [T] |
Performance Characteristics
| Operation | Time Complexity | Notes |
|---|---|---|
| `push` | O(1) amortized | O(n) on reallocation |
| `pop` | O(1) | No reallocation |
| `index` | O(1) | Direct memory access |
| `insert(0, x)` | O(n) | Must shift all elements |
| `remove(i)` | O(n) | Must shift elements |
The Complete Implementation
See examples/04_vec.rs for the full implementation with:
- `push`, `pop`, `insert`, `remove`
- `Index` and `IndexMut`
- `Deref` to `[T]`
- `IntoIterator` implementation
- `Clone` for `T: Clone`
- `Debug` for `T: Debug`
Key Takeaways
- Vec uses raw allocator APIs - Not implemented with `Box`
- Three fields - `ptr`, `len`, `capacity`
- Growth strategy - Double capacity when full
- String = Vec<u8> - Same structure, UTF-8 constraint
- Slices are views - `&[T]` and `&str` don't own data
- Fat pointers - Slices contain `(ptr, len)`
Exercises
See ./examples/04_vec.rs for exercises.
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises
Implement a slice-like type
Here's a starting point for a slice-like type:
```rust
use std::marker::PhantomData;

pub struct MySlice<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a T>, // Zero-sized, but tells the compiler about 'a and T
}

impl<'a, T> MySlice<'a, T> {
    pub fn from_vec(vec: &'a Vec0<T>) -> MySlice<'a, T> {
        MySlice {
            ptr: vec.ptr,
            len: vec.len,
            _marker: PhantomData,
        }
    }

    pub fn len(&self) -> usize { self.len }

    pub fn get(&self, index: usize) -> Option<&'a T> {
        if index < self.len {
            unsafe { Some(&*self.ptr.add(index)) }
        } else {
            None
        }
    }
}
```
Why PhantomData?
Raw pointers (*const T and *mut T) don't carry lifetime information. Without PhantomData, the compiler wouldn't know that MySlice<'a, T> should:
- Not outlive the data it points to - the `'a` lifetime connects the slice to the vec
- Act like it owns a `&'a T` - for variance and drop-check purposes
Example of what could go wrong without it:
```rust
// WITHOUT PhantomData, this dangerous code might compile:
let slice = {
    let mut vec = Vec0::new();
    vec.push(42);
    MySlice::from_vec(&vec)
    // vec dies here!
};
// slice now points to freed memory! ❌ Use-after-free!

// WITH PhantomData, the compiler catches this:
// error: `vec` does not live long enough
```
PhantomData<&'a T> is zero-sized (no runtime cost) but tells the compiler: "pretend I own a &'a T reference" so it enforces proper lifetimes. With it, the above code becomes a compile-time error instead of undefined behavior.
Alternative without PhantomData:
You could use real references instead of raw pointers:
```rust
pub struct MySlice<'a, T> {
    data: &'a [T], // Real reference, carries the lifetime automatically
}
```
But this defeats the purpose of the exercise - we want to see what we can build with just (ptr, len)!
What you can implement:
- Index access (`impl Index<usize>`)
- `len()`, `is_empty()`
- `first()`, `last()`
- `iter()` returning an iterator
What you CANNOT implement:
- Slice syntax: `&my_slice[1..3]` (requires compiler support)
- Pattern matching: `match my_slice { [first, rest @ ..] => ... }` (DST feature)
- Automatic coercion from arrays: `&[1, 2, 3]` → `MySlice` (compiler magic)
This demonstrates why slices are special - they need compiler integration for the syntax we take for granted!
Chapter 5: Cell - Interior Mutability
You want to track how many times a value gets accessed. Simple, right? Add a counter, increment it on every read. But Rust says no:
```rust
struct Stats {
    value: i32,
    access_counter: usize,
}

impl Stats {
    fn get_value(&self) -> i32 {
        self.access_counter += 1; // ❌ can't mutate through &self
        self.value
    }
}
```
The problem? get_value takes &self (shared reference), but you need &mut self to change access_counter.
"Just use &mut self then!"
Can't. That would be terrible:
```rust
fn get_value(&mut self) -> i32 { // Now requires &mut self
    self.access_counter += 1;
    self.value
}

// This breaks:
let stats = Stats { value: 42, access_counter: 0 };
let r1 = &stats;
let r2 = &stats;
let v1 = r1.get_value(); // ❌ can't call get_value on &Stats
let v2 = r2.get_value(); // Need &mut, can't have multiple readers!
```
You just killed multiple readers. The whole point of &self is that many readers can coexist. But now you can't read without exclusive access, all because of a silly counter.
The counter is just bookkeeping. It's not the meaningful data. The meaningful data (value) never changes. You want the struct to be logically immutable (value doesn't change) but physically mutable (counter does change).
The Unsafe Escape Hatch
Fine. Let's use raw pointers:
#![allow(unused)] fn main() { fn get_value(&self) -> i32 { unsafe { let counter_ptr = &self.access_counter as *const usize as *mut usize; *counter_ptr += 1; } self.value } }
This seems to work: you mutate through a shared reference by casting away the constness. (Strictly speaking it's undefined behavior, as we'll see later: getting a *mut T from a &T without UnsafeCell is unsound. Set that aside for a moment.) And this is a common pattern. You'll need it for counters, caches, lazy initialization, anywhere you want interior mutability. Every time you write this unsafe code again, you're making a promise to Rust: "Trust me, this is safe." You verify it once in Stats. Then again in Cache. Then in LazyCell. Then in RefCount. Exhausting.
Enter Cell: The Safe Wrapper
What if someone already wrote that unsafe code, verified it's sound, and wrapped it in a safe API? Then you'd never think about it again:
#![allow(unused)] fn main() { use std::cell::Cell; struct Stats { value: i32, access_counter: Cell<usize>, // ✅ Wrapped in Cell! } impl Stats { fn get_value(&self) -> i32 { // Mutate through &self - totally safe! self.access_counter.set(self.access_counter.get() + 1); self.value } } }
Done. No unsafe. No mental overhead. Just works.
That's Cell: a safe wrapper for interior mutability. It lets you mutate data through a shared reference without violating Rust's safety guarantees.
There are other interior mutability types for different use cases, such as RefCell and Mutex, which we'll cover later.
How Cell Stays Safe
Let's try building a naive ClumsyCell that gives you references:
#![allow(unused)] fn main() { use std::cell::UnsafeCell; struct ClumsyCell<T> { value: UnsafeCell<T>, } impl<T> ClumsyCell<T> { fn new(value: T) -> Self { ClumsyCell { value: UnsafeCell::new(value) } } fn get_ref(&self) -> &T { unsafe { &*self.value.get() } } fn set(&self, value: T) { unsafe { *self.value.get() = value; } } } // This compiles! But it's unsound. let cell = ClumsyCell::new(5); let r1: &i32 = cell.get_ref(); // Get reference to inner value println!("{}", r1); // Reads 5 cell.set(10); // Mutate through &cell println!("{}", r1); // DANGER: r1 still exists but value changed! // r1 is supposed to be immutable, but the data it points to changed }
The problem: r1 is an immutable reference, but the value it points to changed. That's exactly what Rust's borrow checker prevents. ClumsyCell bypassed it with unsafe, breaking Rust's safety guarantees.
Real Cell's solution: never give you a reference to the inner value.
#![allow(unused)] fn main() { let cell = Cell::new(5); cell.set(10); // ✅ Replace the value let value = cell.get(); // ✅ Get a COPY of the value (requires T: Copy) }
No references = no aliasing violations. You can't have a reference to something that might change, because you never get a reference at all. Only copies.
This is why Cell only works with Copy types (for get()). Can't copy out a String or Vec. For those, you need RefCell (next chapter).
What Cell gives you:
- get(): Copy the value out (requires T: Copy)
- set(): Replace the value entirely
- replace(): Swap in a new value, return the old one
- take(): Take the value, leave Default::default() behind
All through &Cell<T>, not &mut Cell<T>. That's the magic.
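All four operations can be exercised in a few lines with the standard library's Cell, each going through a plain shared reference:

```rust
use std::cell::Cell;

fn main() {
    let cell = Cell::new(1);
    assert_eq!(cell.get(), 1);   // get: copy the value out

    cell.set(2);                 // set: replace wholesale
    assert_eq!(cell.get(), 2);

    let old = cell.replace(3);   // replace: swap in 3, get the old value back
    assert_eq!(old, 2);
    assert_eq!(cell.get(), 3);

    let taken = cell.take();     // take: remove 3, leave i32::default() == 0
    assert_eq!(taken, 3);
    assert_eq!(cell.get(), 0);
}
```

Notice that `cell` is never declared `mut`: every mutation happens through `&Cell<i32>`.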
Motivation: When You Need to Mutate Through Shared References
When the struct is mostly read-only (value never changes), but a minor field (access_count) needs to change. Making the entire struct mutable just for this tracking would be too awkward and restrictive.
This is logical vs physical constness:
- Logically const: The meaningful data (value) doesn't change
- Physically mutable: Internal bookkeeping (access_count) does change
Solution: Use interior mutability for the counter while keeping &self. This lets you mutate the minor parts (bookkeeping, caches, counters) without requiring &mut self for the whole struct.
All interior mutability types are wrappers around UnsafeCell (which we'll explore next). Rust provides safe wrappers that handle the unsafe operations for you. We'll implement these wrappers (Cell, RefCell) later in this chapter to understand how they work.
| Type | Check | Best for |
|---|---|---|
| Cell<T> | None (Copy semantics) | Simple Copy types |
| RefCell<T> | Runtime | Any type, single-threaded |
| Mutex<T> | Runtime + blocking | Any type, multi-threaded |
| RwLock<T> | Runtime + blocking | Read-heavy, multi-threaded |
| Atomic* | Hardware | Primitive integers, multi-threaded |
This chapter covers Cell. We'll cover RefCell in the next chapter.
UnsafeCell: The Foundation
Every interior mutability type is built on UnsafeCell<T>. It's the only way to get mutability through a shared reference in Rust.
#![allow(unused)] fn main() { use std::cell::UnsafeCell; let cell = UnsafeCell::new(5); let ptr: *mut i32 = cell.get(); // Returns raw mutable pointer // UNSAFE: We must ensure no aliasing unsafe { *ptr = 10; } }
UnsafeCell is unsafe because:
- It gives you a *mut T from an &UnsafeCell<T>
- You are responsible for ensuring no data races or aliasing violations
- The compiler can't help you
That's why we have safe wrappers like Cell and RefCell.
Can We Implement Our Own UnsafeCell?
No. UnsafeCell is a compiler intrinsic - it's deeply integrated into Rust's type system.
Here's what it conceptually looks like:
#![allow(unused)] fn main() { #[repr(transparent)] pub struct UnsafeCell<T> { value: T, } impl<T> UnsafeCell<T> { pub fn new(value: T) -> UnsafeCell<T> { UnsafeCell { value } } pub fn get(&self) -> *mut T { &self.value as *const T as *mut T } pub fn into_inner(self) -> T { self.value } } }
But this naive implementation is wrong! The compiler treats UnsafeCell specially:
- Disables certain optimizations - The compiler won't assume immutability through &UnsafeCell<T>
- Allows interior mutability - Without UnsafeCell, getting *mut T from &T is undefined behavior
- Memory model implications - Tells LLVM that mutations can happen through shared references
If you tried this with a regular struct, the compiler might:
- Optimize away your writes (assumes &T means the data is immutable)
- Reorder operations incorrectly
- Generate incorrect code
Example of why it matters:
#![allow(unused)] fn main() { // Regular struct - WRONG! struct NotUnsafeCell<T> { value: T, } impl<T> NotUnsafeCell<T> { fn get(&self) -> *mut T { &self.value as *const T as *mut T } } let x = NotUnsafeCell { value: 5 }; let ptr = x.get(); // Compiler sees &x (shared ref) and might assume x.value never changes let a = x.value; // Reads 5 unsafe { *ptr = 10; } // Mutate through pointer - UB! let b = x.value; // Compiler might optimize this to still be 5! }
With the real UnsafeCell, the compiler knows mutation can happen and won't make those assumptions.
Does Rust have a volatile keyword?
No. In C/C++, volatile tells the compiler "don't optimize reads/writes to this variable." Rust doesn't have a volatile keyword, but instead provides:
- UnsafeCell: For interior mutability (single-threaded mutation through shared references)
- std::ptr::read_volatile / write_volatile: Unsafe functions for cases where you need volatile semantics (e.g., memory-mapped I/O, hardware registers)
- Atomics: For thread-safe mutation with proper synchronization
UnsafeCell is not the same as volatile: it tells the compiler "this can be mutated through shared references," while volatile prevents the compiler from eliding or reordering the marked reads and writes themselves. Most Rust code uses UnsafeCell (via Cell, RefCell, etc.), not volatile operations.
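For completeness, here is what the volatile functions look like in use. This is a sketch on ordinary stack memory, where volatile behaves like a normal read/write; the functions only matter for things like memory-mapped hardware registers:

```rust
use std::ptr;

fn main() {
    let mut x: u32 = 1;
    let p = &mut x as *mut u32;

    // SAFETY: p points to a live, properly aligned u32 and nothing else
    // accesses it concurrently.
    unsafe {
        ptr::write_volatile(p, 42);       // the compiler must not elide this store
        let v = ptr::read_volatile(p);    // ...nor this load
        assert_eq!(v, 42);
    }
}
```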
Bottom line: You must use std::cell::UnsafeCell. It's the only sound way to implement interior mutability.
What is Cell?
Cell<T> is a safe wrapper around UnsafeCell<T> for Copy types:
#![allow(unused)] fn main() { use std::cell::Cell; let cell = Cell::new(5); cell.set(10); // Mutate through shared reference! let value = cell.get(); // Get a COPY of the value }
Key insight: Cell never gives you a reference to the inner value. It only lets you:
- get: Copy the value out
- set: Replace the value entirely
This is safe because you can't have a reference to something that might change - you only ever have copies.
What happens when references escape?
If Cell gave you a reference, you'd have:
- Multiple &Cell (shared references to the Cell itself) ✓ Allowed
- A &T (reference to the inner value) ✓ Should be valid
- But Cell can mutate through &self! ✗ Breaks Rust's aliasing rules!
#![allow(unused)] fn main() { // Hypothetical broken Cell with get_ref: let cell = Cell::new(5); let r1: &i32 = cell.get_ref(); // Get reference to inner value println!("{}", r1); // Reads 5 cell.set(10); // Mutate through &self println!("{}", r1); // DANGER: r1 still pointing to the value inside the cell! // In Rust with UB, optimizer might assume still 5! // While it's actually 10 }
Visualizing the problem:
- Step 1: cell.get_ref() returns a reference into the Cell's memory
- Step 2: cell.set(10) overwrites the value that reference points to
BROKEN: r1 points into Cell's memory, and that memory was just mutated. This violates Rust's aliasing rules: you hold an immutable reference (&i32) to data that changed out from under you.
How Cell avoids this:
#![allow(unused)] fn main() { // The real Cell let cell = Cell::new(5); let n: i32 = cell.get(); // n is a COPY of the value, not a reference println!("{}", n); // Reads 5 cell.set(10); // Mutate through &self println!("{}", n); // Still reads 5, because n is a copy, not a reference! }
In other words: cell.get() hands out a copy of the value, and cell.set(10) replaces the value inside the Cell in place. The copy in n never changes.
Key insight: You never get &i32 or &mut i32 pointing to the value inside Cell's box. Only copies come out. The inner value never escapes as a reference.
Building Our Own Cell
#![allow(unused)] fn main() { use std::cell::UnsafeCell; pub struct Cell0<T> { value: UnsafeCell<T>, } }
new - Create a Cell
#![allow(unused)] fn main() { impl<T> Cell0<T> { pub fn new(value: T) -> Cell0<T> { Cell0 { value: UnsafeCell::new(value), } } } }
get - Copy the Value Out
#![allow(unused)] fn main() { impl<T: Copy> Cell0<T> { pub fn get(&self) -> T { // SAFETY: We only copy the value out, never give a reference unsafe { *self.value.get() } } } }
Note the T: Copy bound. This is crucial - we can only implement get for Copy types because we need to return a copy, not a reference.
set - Replace the Value
#![allow(unused)] fn main() { impl<T> Cell0<T> { pub fn set(&self, value: T) { // SAFETY: We replace the entire value atomically (single-threaded) unsafe { *self.value.get() = value; } } } }
Notice set doesn't require Copy - we're replacing the value, not reading it.
replace - Set and Return Old Value
#![allow(unused)] fn main() { impl<T> Cell0<T> { pub fn replace(&self, value: T) -> T { // SAFETY: We swap values without creating references unsafe { std::mem::replace(&mut *self.value.get(), value) } } } }
take - Take the Value, Leave Default
#![allow(unused)] fn main() { impl<T: Default> Cell0<T> { pub fn take(&self) -> T { self.replace(T::default()) } } }
into_inner - Consume Cell, Get Value
#![allow(unused)] fn main() { impl<T> Cell0<T> { pub fn into_inner(self) -> T { self.value.into_inner() } } }
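Assembling the pieces above into one runnable program lets us check the whole Cell0 API at once:

```rust
use std::cell::UnsafeCell;

// The Cell0 pieces from this chapter, gathered in one place.
pub struct Cell0<T> {
    value: UnsafeCell<T>,
}

impl<T> Cell0<T> {
    pub fn new(value: T) -> Cell0<T> {
        Cell0 { value: UnsafeCell::new(value) }
    }
    pub fn set(&self, value: T) {
        // SAFETY: no references to the inner value ever escape this type.
        unsafe { *self.value.get() = value; }
    }
    pub fn replace(&self, value: T) -> T {
        // SAFETY: we swap values without creating overlapping references.
        unsafe { std::mem::replace(&mut *self.value.get(), value) }
    }
    pub fn into_inner(self) -> T {
        self.value.into_inner()
    }
}

impl<T: Copy> Cell0<T> {
    pub fn get(&self) -> T {
        // SAFETY: we only copy the value out, never hand out a reference.
        unsafe { *self.value.get() }
    }
}

impl<T: Default> Cell0<T> {
    pub fn take(&self) -> T {
        self.replace(T::default())
    }
}

fn main() {
    let cell = Cell0::new(5);
    assert_eq!(cell.get(), 5);
    cell.set(7);                      // mutate through &self
    assert_eq!(cell.replace(9), 7);   // old value comes back
    assert_eq!(cell.take(), 9);       // leaves i32::default()
    assert_eq!(cell.get(), 0);
    assert_eq!(cell.into_inner(), 0); // consume the Cell
}
```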
get_mut - Common Confusion About Getting References from Cell
Common confusion: "Can Cell give me a reference to its inner value?"
Short answer: No, not through &Cell. That's the whole point of Cell - it can't give out references.
But Cell does have a get_mut method that returns &mut T. Here's the catch:
#![allow(unused)] fn main() { impl<T> Cell0<T> { pub fn get_mut(&mut self) -> &mut T { // Takes &mut self! self.value.get_mut() } } }
The key insight: get_mut requires &mut self, not &self.
This is just regular Rust borrowing - nothing special. The compiler catches it at compile time because get_mut takes &mut self.
This means you need exclusive mutable access to the Cell itself. At that point, you don't need interior mutability at all - Rust already knows at compile-time that you have exclusive access!
#![allow(unused)] fn main() { let mut cell = Cell0::new(5); // Note: mut cell *cell.get_mut() += 1; // Direct mutable access assert_eq!(cell.get(), 6); }
The compiler enforces normal borrow rules:
#![allow(unused)] fn main() { let mut cell = Cell0::new(5); let r1 = cell.get_mut(); // First mutable borrow let r2 = cell.get_mut(); // ❌ Error: cannot borrow `cell` as mutable more than once // Error message: // cannot borrow `cell` as mutable more than once at a time // first mutable borrow occurs here }
Why this defeats Cell's purpose:
If you have &mut Cell, you could've just used T directly:
#![allow(unused)] fn main() { // Why use Cell if you have &mut anyway? struct Config { value: Cell<i32>, // Uses Cell... } impl Config { fn update(&mut self) { // ...but needs &mut self? *self.value.get_mut() += 1; // Could've just used i32! } } // More natural - no Cell needed: struct Config { value: i32, // Just use i32 directly } impl Config { fn update(&mut self) { self.value += 1; // Simpler! } } }
When would you actually use get_mut?
Rarely. The only real use case is when you have a Cell<T> where T is not Copy, and you happen to have exclusive access to the Cell. At that point, get_mut lets you modify T in place without moving it out.
But this is uncommon - Cell exists precisely so you DON'T need &mut.
Cell in Practice: Simple Examples
Example 1: Counter
#![allow(unused)] fn main() { use std::cell::Cell; struct HitCounter { count: Cell<usize>, } impl HitCounter { fn new() -> Self { HitCounter { count: Cell::new(0) } } fn record_hit(&self) { self.count.set(self.count.get() + 1); } fn get_count(&self) -> usize { self.count.get() } } // Usage let counter = HitCounter::new(); counter.record_hit(); // count: 0 -> 1 counter.record_hit(); // count: 1 -> 2 counter.record_hit(); // count: 2 -> 3 counter.get_count() // 3 }
Example 2: Toggle Flag
#![allow(unused)] fn main() { struct Toggle { state: Cell<bool>, } impl Toggle { fn new() -> Self { Toggle { state: Cell::new(false) } } fn toggle(&self) { self.state.set(!self.state.get()); } fn is_on(&self) -> bool { self.state.get() } } // Usage let toggle = Toggle::new(); toggle.is_on() // false toggle.toggle(); toggle.is_on() // true toggle.toggle(); toggle.is_on() // false }
Example 3: Lazy Initialization
#![allow(unused)] fn main() { struct LazyValue { initialized: Cell<bool>, value: Cell<i32>, } impl LazyValue { fn new() -> Self { LazyValue { initialized: Cell::new(false), value: Cell::new(0), } } fn get_or_init(&self, compute: impl FnOnce() -> i32) -> i32 { if !self.initialized.get() { let val = compute(); self.value.set(val); self.initialized.set(true); } self.value.get() } } // Usage let lazy = LazyValue::new(); let result1 = lazy.get_or_init(|| 42); // Computes: 42 let result2 = lazy.get_or_init(|| 99); // Returns cached: 42 }
All these examples mutate state through &self (shared reference) - impossible without Cell!
Cell and Thread Safety: Send and Sync
Cell<T> is not thread-safe. It can be used in a single thread, but cannot be safely shared between threads.
Quick overview:
Rust has two special marker traits for thread safety:
- Send: A type can be transferred between threads (moved to another thread)
- Sync: A type can be shared between threads (multiple threads can have &T)
#![allow(unused)] fn main() { // Cell<T> is Send (can be moved between threads) let cell = Cell::new(42); std::thread::spawn(move || { cell.set(100); // ✅ OK - moved to this thread }); // Cell<T> is NOT Sync (cannot be shared between threads) let cell = Cell::new(42); std::thread::spawn(|| { cell.set(100); // ❌ Cell is not Sync }); }
Why is Cell not Sync?
If two threads could share &Cell<T>, they could both call set() simultaneously:
- Thread 1: cell.set(10)
- Thread 2: cell.set(20)
- Data race! Both write to the same memory without synchronization
Hypothetical example if Cell was Sync (this won't compile!):
#![allow(unused)] fn main() { use std::cell::Cell; use std::thread; // Imagine Cell<T> was Sync (it's not!) let counter = Cell::new(0); // Try to share it between threads (won't compile) let handle1 = thread::spawn(|| { for _ in 0..1000 { counter.set(counter.get() + 1); // Thread 1 increments } }); let handle2 = thread::spawn(|| { for _ in 0..1000 { counter.set(counter.get() + 1); // Thread 2 increments } }); handle1.join().unwrap(); handle2.join().unwrap(); // Expected: 2000 // Actual: Could be anything! (if data races were allowed) // Both threads read, modify, write with no synchronization println!("{}", counter.get()); // Undefined behavior! }
Cell provides no internal synchronization, so it's unsafe for concurrent access. For thread-safe interior mutability, use:
- Mutex<T> or RwLock<T> - Provide locking
- Atomics (AtomicUsize, AtomicBool, etc.) - Hardware-level synchronization
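The broken hypothetical above becomes correct once the counter is an atomic. This sketch re-runs the same two-threads-times-1000-increments experiment with AtomicUsize, which is Sync, so sharing it (here via Arc) compiles and the result is deterministic:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Unlike Cell, AtomicUsize is Sync: threads may share references to it.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // fetch_add is one indivisible read-modify-write:
                    // no increments can be lost.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    // Always exactly 2000, unlike the data-racing Cell version.
    assert_eq!(counter.load(Ordering::Relaxed), 2000);
}
```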
Note: Send and Sync are covered in depth in a later chapter on concurrency. For now, just remember: Cell = single-threaded only.
The Complete Implementation
See the full implementation in cell.rs.
Cell vs Other Interior Mutability Types
| | Cell | RefCell |
|---|---|---|
| Works with | Copy types (for get) | Any type |
| Returns | Copy of value | Reference (Ref<T>) |
| Overhead | None | Runtime borrow tracking |
| Panic? | Never | Yes, on borrow violation |
| Thread-safe? | No | No |
Use Cell when:
- Your type is Copy (integers, bools, small structs)
- You just need to get/set the value
- You want zero runtime overhead
Use RefCell when:
- Your type isn't Copy
- You need references to the inner value
- You're willing to pay for runtime borrow checking
Key Takeaways
- Interior mutability allows mutation through shared references
- UnsafeCell is the primitive - unsafe but flexible
- Cell is safe by never exposing references - only copies
- Use Cell for counters, flags, and simple state - like Rc's reference count
- Cell is not thread-safe - use atomics or mutexes for that
Exercises
See exercises.
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises
Next Chapter
RefCell - Runtime borrow checking for non-Copy types.
Chapter 6: RefCell - Runtime Borrow Checking
The Real Problem RefCell Solves
Remember Cell<T>? It works perfectly... for Copy types:
#![allow(unused)] fn main() { use std::cell::Cell; let count = Cell::new(0); count.set(count.get() + 1); // ✅ Works! i32 is Copy }
But try this:
#![allow(unused)] fn main() { let name = Cell::new(String::from("hello")); // name.get() // ❌ ERROR: String doesn't implement Copy! }
The issue: Cell::get() returns a copy of the value. This only works if T: Copy.
For non-Copy types like String, Vec, or your custom structs, we need a different approach. We need references to the data, not copies.
What RefCell Actually Does
RefCell<T> gives you interior mutability for any type by doing one clever thing:
It enforces Rust's borrowing rules at runtime instead of compile time.
Instead of compilation errors, you get panics!
#![allow(unused)] fn main() { use std::cell::RefCell; let cell = RefCell::new(String::from("hello")); // Get a reference (not a copy!) let borrowed: Ref<String> = cell.borrow(); println!("{}", *borrowed); // ✅ Works! // Mutate through a mutable reference let mut borrowed_mut: RefMut<String> = cell.borrow_mut(); borrowed_mut.push_str(" world"); // ✅ Works! }
The borrowing rules are exactly the same:
- Many borrow() calls can coexist (multiple readers)
- Only one borrow_mut() at a time (single writer)
- Can't have borrow() and borrow_mut() simultaneously (no reading while writing)
The only difference: violations cause a runtime panic instead of a compile error:
#![allow(unused)] fn main() { let cell = RefCell::new(5); let r1 = cell.borrow(); // ✅ OK let r2 = cell.borrow(); // ✅ OK - multiple immutable borrows let r3 = cell.borrow_mut(); // 💥 PANIC! Can't borrow mutably while borrowed }
Don't Confuse borrow/borrow_mut with get_mut
Important distinction: RefCell has a get_mut() method, but it's not the main point of RefCell and is rarely used.
#![allow(unused)] fn main() { let mut cell = RefCell::new(5); // Note: mut cell *cell.get_mut() += 1; // Takes &mut self - compile-time check! }
Key difference:
- get_mut(): Requires &mut self → compile-time checked → just regular Rust borrowing
- borrow() / borrow_mut(): Only need &self → runtime checked → this is interior mutability!
If you have &mut RefCell<T>, you don't need interior mutability at all - you already have exclusive access! You could've just used T directly. The whole point of RefCell is borrow() and borrow_mut() - they let you mutate through &self.
See the Cell chapter's get_mut section for more details on why get_mut defeats the purpose of interior mutability.
Why Cell is Copy-only and RefCell is Not
This is crucial to understand:
Cell works by copying values in and out (set, get, replace):
- set(value) - Replaces the current value with a new one
- get() - Returns a copy of the current value
- Requires Copy - since we're always duplicating values, not borrowing them
RefCell gives you references (borrow, borrow_mut):
- borrow() returns &T wrapped in a guard
- borrow_mut() returns &mut T wrapped in a guard
- Works with any type because we're just borrowing, not copying
#![allow(unused)] fn main() { // Cell: requires Copy let cell = Cell::new(42); let value = cell.get(); // Copies the value out // RefCell: works with any type let cell = RefCell::new(String::from("hello")); let borrowed = cell.borrow(); // Returns &String (no copy!) }
TL;DR: Use Cell<T> for small Copy types. Use RefCell<T> for everything else.
How RefCell Tracks Borrows
RefCell uses a simple counter (borrow_count) to track the borrow state:
How borrow_count changes:
- borrow() checks that borrow_count >= 0, then increments it
- borrow_mut() checks that borrow_count == 0, then sets it to -1
- Dropping the guard restores borrow_count
Building Our Own RefCell
The Structure
#![allow(unused)] fn main() { use std::cell::{Cell, UnsafeCell}; pub struct RefCell0<T> { borrow_count: Cell<isize>, // 0, positive, or -1 value: UnsafeCell<T>, } // Borrow states: // 0 = not borrowed // > 0 = immutably borrowed N times // -1 = mutably borrowed }
The Guard Types
RefCell uses two types of guards to track borrows:
Ref<T> - Returned by borrow() for immutable access:
- Dereferences to &T
- Decrements the count when dropped
RefMut<T> - Returned by borrow_mut() for mutable access:
- Dereferences to &mut T
- Resets the count to 0 when dropped
#![allow(unused)] fn main() { pub struct Ref<'a, T> { refcell: &'a RefCell0<T>, } pub struct RefMut<'a, T> { refcell: &'a RefCell0<T>, } }
new - Create RefCell
#![allow(unused)] fn main() { impl<T> RefCell0<T> { pub fn new(value: T) -> RefCell0<T> { RefCell0 { borrow_count: Cell::new(0), value: UnsafeCell::new(value), } } } }
borrow - Immutable Access
#![allow(unused)] fn main() { impl<T> RefCell0<T> { pub fn borrow(&self) -> Ref<'_, T> { let count = self.borrow_count.get(); if count < 0 { panic!("Already mutably borrowed!"); } self.borrow_count.set(count + 1); Ref { refcell: self } } } }
borrow_mut - Mutable Access
#![allow(unused)] fn main() { impl<T> RefCell0<T> { pub fn borrow_mut(&self) -> RefMut<'_, T> { let count = self.borrow_count.get(); if count != 0 { panic!("Already borrowed!"); } self.borrow_count.set(-1); RefMut { refcell: self } } } }
Implementing the Guards
The guards need to do two things:
- Deref - Allow transparent access to the inner value
- Drop - Clean up the borrow count when done
How deref is used in practice:
#![allow(unused)] fn main() { let cell = RefCell0::new(String::from("hello")); // borrow() returns Ref<String>, not &String let borrowed: Ref<String> = cell.borrow(); // But we can use it like a &String thanks to Deref! println!("{}", borrowed.len()); // Calls String::len println!("{}", borrowed.to_uppercase()); // Calls String::to_uppercase println!("{}", *borrowed); // Explicit deref to &String }
The Ref<String> guard derefs to &String automatically, so you can call all String methods on it!
How Drop ensures cleanup:
#![allow(unused)] fn main() { let cell = RefCell0::new(5); { let borrowed = cell.borrow(); // borrow_count: 0 -> 1 println!("{}", *borrowed); } // borrowed dropped here - borrow_count: 1 -> 0 (Drop called!) // Now we can borrow_mut because count is back to 0 let mut borrowed_mut = cell.borrow_mut(); // ✅ Works! }
Without Drop, the count would stay at 1 forever, and you'd never be able to borrow mutably!
Implementation:
#![allow(unused)] fn main() { use std::ops::{Deref, DerefMut}; // Ref<T> derefs to &T impl<T> Deref for Ref<'_, T> { type Target = T; fn deref(&self) -> &T { // SAFETY: borrow() already checked borrowing rules unsafe { &*self.refcell.value.get() } } } // When Ref<T> is dropped, decrement the count impl<T> Drop for Ref<'_, T> { fn drop(&mut self) { let count = self.refcell.borrow_count.get(); self.refcell.borrow_count.set(count - 1); } } // RefMut<T> derefs to &T impl<T> Deref for RefMut<'_, T> { type Target = T; fn deref(&self) -> &T { unsafe { &*self.refcell.value.get() } } } // RefMut<T> also derefs to &mut T impl<T> DerefMut for RefMut<'_, T> { fn deref_mut(&mut self) -> &mut T { // SAFETY: borrow_mut() already checked borrowing rules unsafe { &mut *self.refcell.value.get() } } } // When RefMut<T> is dropped, reset count to 0 impl<T> Drop for RefMut<'_, T> { fn drop(&mut self) { self.refcell.borrow_count.set(0); } } }
Why the guards store &RefCell0<T>:
The guards need a reference back to the RefCell0 so they can update borrow_count when dropped. This is how RAII (Resource Acquisition Is Initialization) works - the cleanup happens automatically!
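Putting the struct, the borrow methods, and both guards together gives a runnable sketch of RefCell0. This is the chapter's own code assembled into one program, plus a short main that exercises the RAII behavior:

```rust
use std::cell::{Cell, UnsafeCell};
use std::ops::{Deref, DerefMut};

pub struct RefCell0<T> {
    borrow_count: Cell<isize>, // 0 = free, >0 = shared borrows, -1 = exclusive
    value: UnsafeCell<T>,
}

pub struct Ref<'a, T> { refcell: &'a RefCell0<T> }
pub struct RefMut<'a, T> { refcell: &'a RefCell0<T> }

impl<T> RefCell0<T> {
    pub fn new(value: T) -> Self {
        RefCell0 { borrow_count: Cell::new(0), value: UnsafeCell::new(value) }
    }
    pub fn borrow(&self) -> Ref<'_, T> {
        let count = self.borrow_count.get();
        if count < 0 { panic!("Already mutably borrowed!"); }
        self.borrow_count.set(count + 1);
        Ref { refcell: self }
    }
    pub fn borrow_mut(&self) -> RefMut<'_, T> {
        if self.borrow_count.get() != 0 { panic!("Already borrowed!"); }
        self.borrow_count.set(-1);
        RefMut { refcell: self }
    }
}

impl<T> Deref for Ref<'_, T> {
    type Target = T;
    // SAFETY: borrow() already enforced the borrowing rules.
    fn deref(&self) -> &T { unsafe { &*self.refcell.value.get() } }
}
impl<T> Drop for Ref<'_, T> {
    fn drop(&mut self) {
        self.refcell.borrow_count.set(self.refcell.borrow_count.get() - 1);
    }
}
impl<T> Deref for RefMut<'_, T> {
    type Target = T;
    fn deref(&self) -> &T { unsafe { &*self.refcell.value.get() } }
}
impl<T> DerefMut for RefMut<'_, T> {
    // SAFETY: borrow_mut() already enforced exclusive access.
    fn deref_mut(&mut self) -> &mut T { unsafe { &mut *self.refcell.value.get() } }
}
impl<T> Drop for RefMut<'_, T> {
    fn drop(&mut self) { self.refcell.borrow_count.set(0); }
}

fn main() {
    let cell = RefCell0::new(String::from("hello"));
    {
        let a = cell.borrow();
        let b = cell.borrow();            // two shared borrows coexist
        assert_eq!(a.len() + b.len(), 10);
    }                                      // guards dropped, count back to 0
    cell.borrow_mut().push_str(" world");  // exclusive borrow now allowed
    assert_eq!(*cell.borrow(), "hello world");
}
```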
try_borrow - Non-Panicking Version
Sometimes you want to try borrowing without panicking. borrow() and borrow_mut() panic on violations, but what if you want to handle the error gracefully?
The problem with panicking:
#![allow(unused)] fn main() { let cell = RefCell0::new(vec![1, 2, 3]); let borrowed = cell.borrow(); // Borrow still active... let mut borrowed_mut = cell.borrow_mut(); // 💥 PANIC! Program crashes. }
Using try_borrow instead:
#![allow(unused)] fn main() { let cell = RefCell0::new(vec![1, 2, 3]); let borrowed = cell.borrow(); // Try to borrow mutably - returns Result instead of panicking match cell.try_borrow_mut() { Ok(mut b) => { b.push(4); println!("Successfully modified!"); } Err(_) => { println!("Can't borrow mutably right now, already borrowed!"); // Handle the error gracefully - maybe retry later } } }
Practical use case - Checking before borrowing:
#![allow(unused)] fn main() { fn safely_update(cell: &RefCell0<Vec<i32>>, value: i32) { // Check if we can borrow mutably first if let Ok(mut data) = cell.try_borrow_mut() { data.push(value); } else { // Already borrowed, skip or queue the update println!("Skipping update, cell is busy"); } } }
Implementation:
#![allow(unused)] fn main() { pub enum BorrowError { AlreadyMutablyBorrowed, } pub enum BorrowMutError { AlreadyBorrowed, } impl<T> RefCell0<T> { pub fn try_borrow(&self) -> Result<Ref<'_, T>, BorrowError> { let count = self.borrow_count.get(); if count < 0 { Err(BorrowError::AlreadyMutablyBorrowed) } else { self.borrow_count.set(count + 1); Ok(Ref { refcell: self }) } } pub fn try_borrow_mut(&self) -> Result<RefMut<'_, T>, BorrowMutError> { let count = self.borrow_count.get(); if count != 0 { Err(BorrowMutError::AlreadyBorrowed) } else { self.borrow_count.set(-1); Ok(RefMut { refcell: self }) } } } }
The logic is identical to borrow() and borrow_mut(), but returns Result instead of panicking!
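The standard library's RefCell has the same try_borrow / try_borrow_mut pair, so the behavior can be checked directly against it:

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);

    let shared = cell.borrow();
    // A shared borrow is active, so a mutable borrow must fail...
    assert!(cell.try_borrow_mut().is_err());
    // ...but another shared borrow is fine.
    assert!(cell.try_borrow().is_ok());
    drop(shared);

    // Nothing is borrowed anymore, so the mutable borrow succeeds.
    cell.try_borrow_mut().unwrap().push(4);
    assert_eq!(*cell.borrow(), vec![1, 2, 3, 4]);
}
```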
The Complete Implementation
See the full implementation in refcell.rs.
Common Patterns
Pattern 1: Deferred Mutation
#![allow(unused)] fn main() { fn process_data(cell: &RefCell<Vec<i32>>) { // Read first let sum: i32 = cell.borrow().iter().sum(); // Then mutate (borrow dropped, so this is safe) cell.borrow_mut().push(sum); } }
Important: Drop borrows as soon as possible to avoid panics!
Pattern 2: Scoped Borrows
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); { let borrowed = cell.borrow(); println!("Length: {}", borrowed.len()); } // borrowed dropped here cell.borrow_mut().push(4); // Now safe to mutate }
Pattern 3: Early Return to Drop Borrow
#![allow(unused)] fn main() { fn check_and_modify(cell: &RefCell<Option<String>>) { // Check if we need to modify if cell.borrow().is_none() { return; // Borrow dropped on return } // Safe to mutate now *cell.borrow_mut() = Some(String::from("modified")); } }
RefCell vs Cell vs Mutex
| | Cell | RefCell | Mutex |
|---|---|---|---|
| Type restriction | Copy for get | Any | Any |
| Returns | Copy | Reference | Guard |
| Runtime cost | None | Counter check | Lock |
| Panics? | Never | On violation | On poison |
| Thread-safe? | No | No | Yes |
Note: Mutex will be covered in detail in a later chapter on thread-safe interior mutability.
Pitfalls
Pitfall 1: Holding Borrows Too Long
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); let borrowed = cell.borrow(); // Held across... let len = borrowed.len(); cell.borrow_mut().push(4); // PANIC! Still borrowed! }
Fix 1: Explicitly drop the borrow
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); let borrowed = cell.borrow(); let len = borrowed.len(); drop(borrowed); // ✅ Explicitly drop the borrow cell.borrow_mut().push(4); // Now safe! }
Fix 2: Use a shorter scope
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); let len = { let borrowed = cell.borrow(); borrowed.len() }; // ✅ borrowed dropped here cell.borrow_mut().push(4); // Now safe! }
Fix 3: Don't store the borrow (best approach)
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); let len = cell.borrow().len(); // ✅ Borrow dropped immediately cell.borrow_mut().push(4); // Now safe! }
Pitfall 2: Borrowing in a Loop
The bad pattern:
#![allow(unused)] fn main() { let cell = RefCell::new(0); let mut borrows = Vec::new(); // BAD: Accumulating borrows without dropping them for _ in 0..10 { let r = cell.borrow(); borrows.push(r); // Moves r into Vec - now the Vec owns it! // r won't drop at the end of iteration // It only drops when the Vec drops } // Later, trying to mutate: cell.borrow_mut(); // 💥 PANIC! Still have 10 active borrows! // All guards are still alive inside the Vec }
Even this seemingly innocent code has a subtle issue:
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); let first_borrow = cell.borrow(); // This borrow lives outside the loop! for i in 0..5 { let current = cell.borrow(); println!("{:?}", *current); // current drops here - this is fine! } // But first_borrow is STILL active! // If you try to borrow_mut here, it will panic: cell.borrow_mut().push(4); // 💥 PANIC! first_borrow is still active }
The fix:
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); // ✅ Each borrow drops at the end of the iteration for i in 0..5 { let borrowed = cell.borrow(); println!("{:?}", *borrowed); // borrowed drops here automatically } // Now safe to borrow mutably cell.borrow_mut().push(4); }
Or even better - don't store the guard at all:
#![allow(unused)] fn main() { let cell = RefCell::new(vec![1, 2, 3]); // ✅ Best: Borrow dropped immediately after use for i in 0..5 { println!("{:?}", *cell.borrow()); } }
Pitfall 3: Recursive Calls with Borrows
#![allow(unused)] fn main() { fn recursive(cell: &RefCell<i32>, n: i32) { if n == 0 { return; } let _borrowed = cell.borrow(); // Borrow held... recursive(cell, n - 1); // ...across recursive call // If recursive() tries to borrow_mut, PANIC! } }
Key Takeaways
- RefCell doesn't bypass the borrow checker - It moves it from compile time to runtime. Same rules, same restrictions.
- Cell = Copy types, RefCell = Any type - Cell copies values, RefCell borrows references.
- Guards implement RAII - The borrow is released when the guard drops. This is crucial!
- Keep borrows short-lived - Drop guards as soon as possible to avoid panics.
- Use try_borrow for fallible access - Better than panicking in complex scenarios.
- RefCell is NOT thread-safe - Single-threaded only. Use Mutex or RwLock for concurrency.
Exercises
Practice using RefCell with hands-on exercises in 06_refcell.rs.
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises
Next Chapter
Rc - Reference counting for shared ownership.
Chapter 7: Rc - Reference Counting
The Real Problem: Multiple Owners Need the Same Data
Rust's ownership model says every value has exactly one owner. This prevents many bugs, but sometimes multiple parts of your code legitimately need to own the same data:
```rust
struct Config {
    db_url: String,
    api_key: String,
}

// This won't work - who owns the config?
let config = Config::load();
let server = Server::new(config); // server takes ownership
let logger = Logger::new(config); // ❌ ERROR: config already moved!
```
You can't:
- Clone `Config` each time (expensive, inconsistent if mutated)
- Use references (lifetimes get complicated with multiple owners)
- Use `Box` (still single ownership)
You need: Multiple owners to share the same data.
What Rc Actually Does
Rc<T> (Reference Counted) enables shared ownership by tracking how many owners exist:
```rust
use std::rc::Rc;

let config = Rc::new(Config::load());
let server = Server::new(Rc::clone(&config)); // ✅ +1 owner
let logger = Logger::new(Rc::clone(&config)); // ✅ +1 owner
// config, server, and logger all share ownership of the same Config
```
How it works:
- Allocates the value on the heap (like `Box`)
- Keeps a reference count alongside the value
- Each `Rc::clone()` increments the count (and creates a new pointer)
- Each `drop` decrements the count
- When the count reaches 0, the value is freed
Why Rc? Comparing the Three Approaches
The Key Insight: Rc = Multiple Pointers to THE SAME Heap Data
With Rc<T> (What we want):
✅ Multiple owners, one allocation, shared data
- All three `Rc`s point to the exact same `Config` in memory
- Memory is freed only when the last owner drops
- For mutation, see Chapter 13 (Rc + RefCell)
Why Not Box<T>?
With Box<T> (Doesn't work for sharing):
❌ Problem: Each Box owns a DIFFERENT copy
- You'd have to clone the `Config` for each component (expensive!)
- Each copy is independent - changes to one don't affect others
- Three separate heap allocations wasting memory
- `Box::new(config)` moves `config`, making it unavailable for the next component
```rust
let config = Config::load();
let server = Box::new(config); // config moved here
let logger = Box::new(config); // ❌ ERROR: config already moved!
```
Why Not References &T?
With references (Lifetime hell):
❌ Problem: References don't own - lifetimes get complex
First, let's define structs that use references:
```rust
struct Server<'a> { config: &'a Config }
struct Logger<'a> { config: &'a Config }
```
Problem 1: Can't return from functions
When you try to return a struct containing a reference to local data, it fails:
```rust
fn create_server() -> Server {
    let config = Config::load();
    let server = Server { config: &config };
    server // ❌ ERROR: cannot return value referencing local variable `config`
}
```
Why it fails: config is dropped at the end of the function, but server.config still references it!
Problem 2: Lifetimes infect everything like a virus
You can store references locally - that works fine:
```rust
let config = Config::load();
let server = Server { config: &config };
let logger = Logger { config: &config };

// Storing in a Vec works locally
let mut components: Vec<Server<'_>> = Vec::new();
components.push(server); // ✅ This works here!
```
But now the lifetime constraint INFECTS everything that touches it:
Want to return the Vec? Now the function needs lifetimes:
```rust
fn make_components<'a>(cfg: &'a Config) -> Vec<Server<'a>> {
    vec![Server { config: cfg }]
}
```
Want to store in a struct? Now the struct needs lifetimes:
```rust
struct App<'a> {
    components: Vec<Server<'a>>,
}
```
Want to store App in another struct? That needs lifetimes too:
```rust
struct System<'a> {
    app: App<'a>,
}
```
Every function that uses System needs lifetimes:
```rust
fn process_system<'a>(sys: &System<'a>) {
    // Process the system
}
```
This cascades through your ENTIRE codebase - all because Server contains a reference!
How Rc Solves These Problems
Now let's see how Rc fixes both issues:
Solution to Problem 1: Returning from functions works
With Rc, you can return structs containing shared data - no lifetime issues:
```rust
use std::rc::Rc;

struct Server { config: Rc<Config> } // No lifetime parameter!

fn create_server() -> Server {
    let config = Rc::new(Config::load());
    Server { config } // ✅ Works! Rc owns the data
}
```
Why it works: Rc owns the data (keeps it alive), unlike references which just borrow. When you return Server, you're moving ownership of the Rc out of the function.
Solution to Problem 2: No lifetime infection
With Rc, no lifetime parameters needed anywhere:
```rust
use std::rc::Rc;

// No lifetime parameters!
struct Server { config: Rc<Config> }
struct Logger { config: Rc<Config> }

// Storing in a Vec - no lifetime needed
let config = Rc::new(Config::load());
let server = Server { config: Rc::clone(&config) };
let logger = Logger { config: Rc::clone(&config) };

let mut components = Vec::new();
components.push(server); // ✅ Works!

// Functions don't need lifetime parameters
fn make_components(cfg: Rc<Config>) -> Vec<Server> {
    vec![Server { config: cfg }]
}

// Structs don't need lifetime parameters
struct App {
    components: Vec<Server>,
}

struct System {
    app: App,
}

// Functions using System don't need lifetime parameters
fn process_system(sys: &System) {
    // Process the system
}
```
Why it works: Rc provides ownership, not just borrowing. Lifetime tracking is moved from compile-time to runtime - instead of the compiler tracking lifetimes with 'a annotations, Rc tracks them at runtime with reference counts.
The key trade-off:
- References (`&T`): compile-time lifetime tracking → zero runtime overhead, but inflexible
- `Rc`: runtime reference counting → flexible ownership, but pays the cost of counter increments/decrements
The lifetime problem:
- If `Server` and `Logger` store `&Config`, they must live shorter than `config`
- You can't return them from functions (references to stack-local data)
- You can't store them in collections easily (lifetime annotations everywhere)
- Complex ownership patterns become impossible to express
When references work well:
- Temporary borrows (function parameters)
- Short-lived access patterns
- When there's a clear owner and the borrow is brief
When you need Rc instead:
- No clear single owner
- Multiple components need to outlive each other independently
- Dynamic lifetimes (can't determine at compile time)
- Building complex data structures (graphs, trees with shared nodes)
Summary: Why Rc Wins
| Approach | Shares Data? | Multiple Owners? | Lifetime Issues? |
|---|---|---|---|
| `Rc<T>` | ✅ Yes | ✅ Yes | ✅ No - runtime counted |
| `Box<T>` | ❌ No (copies) | ❌ No (single owner) | ✅ No |
| `&T` | ✅ Yes | ❌ No (borrowed) | ❌ Yes - compile-time tracked |
Rc<T> gives you the best of both worlds:
- Shared data like references (all point to same allocation)
- Multiple ownership like separate boxes (but without the copies)
- Simple lifetimes (no `'a` annotations needed)
- Runtime reference counting handles cleanup automatically
Using Rc: Basic Examples
Example 1: Shared Configuration
```rust
use std::rc::Rc;

let config = Rc::new(Config {
    db_url: String::from("localhost:5432"),
    api_key: String::from("secret"),
});

println!("Strong count: {}", Rc::strong_count(&config)); // 1

let server_config = Rc::clone(&config); // Now 2 owners
let logger_config = Rc::clone(&config); // Now 3 owners

println!("Strong count: {}", Rc::strong_count(&config)); // 3

// All three can access the same data
println!("Server sees: {}", server_config.db_url);
println!("Logger sees: {}", logger_config.db_url);
```
Example 2: Deref Coercion Works
Because Rc implements Deref, you can use it like a reference:
```rust
let name = Rc::new(String::from("Alice"));

// Deref coercion: Rc<String> -> &String -> &str
fn print_name(s: &str) {
    println!("Name: {}", s);
}

print_name(&name); // ✅ Works! &Rc<String> coerces to &str
println!("Length: {}", name.len()); // ✅ Call String methods directly
```
Example 3: Automatic Cleanup via RAII
RAII (Resource Acquisition Is Initialization): When a variable goes out of scope, its Drop is called automatically.
```rust
{
    let data = Rc::new(vec![1, 2, 3]);
    println!("Count: {}", Rc::strong_count(&data)); // 1

    {
        let shared = Rc::clone(&data);
        println!("Count: {}", Rc::strong_count(&data)); // 2
    } // `shared` dropped here, count decrements to 1

    println!("Count: {}", Rc::strong_count(&data)); // 1
} // `data` dropped here, count goes to 0, memory freed
```
Why this matters: You don't need to manually manage the reference count. Rust's ownership system handles it automatically through Drop.
Why Clone is Cheap
The Confusion: Two Different "Clones"
```rust
let data = vec![1, 2, 3];
let clone1 = data.clone(); // ❌ EXPENSIVE: Copies all elements

let rc_data = Rc::new(vec![1, 2, 3]);
let clone2 = rc_data.clone(); // ✅ CHEAP: Just increments a counter
```
Visual comparison:
Cloning a Vec directly - EXPENSIVE (O(n)):
Cloning an Rc<Vec> - CHEAP (O(1)):
Rc::clone() only clones the pointer, not the data!
- Cloning the `Rc` = increment counter + copy pointer (O(1))
- Cloning the inner data = depends on the data (could be O(n))
Convention: Use Rc::clone(&rc) for Clarity
```rust
let rc = Rc::new(String::from("hello"));

// Both work, but one is clearer:
let clone1 = rc.clone();     // Looks like it might be expensive
let clone2 = Rc::clone(&rc); // ✅ Clearly cloning the Rc, not the String
```
The explicit Rc::clone(&rc) syntax makes it obvious you're doing a cheap pointer clone, not an expensive data clone.
Rc Only Gives You Shared References
Important limitation: Rc<T> only provides &T, never &mut T.
```rust
let data = Rc::new(vec![1, 2, 3]);

let borrowed: &Vec<i32> = &*data; // ✅ Can get &T
// let mut_borrowed: &mut Vec<i32> = &mut *data; // ❌ ERROR: cannot borrow as mutable
```
Why? If multiple owners could get &mut T, you'd have multiple mutable references to the same data - a data race!
Solution for mutation: Use Rc<RefCell<T>> (covered in Chapter 13) for interior mutability.
Don't Confuse get_mut with Rc's Purpose
Like Cell and RefCell (see Chapters 5-6), Rc has a get_mut() method - but it's not the main point:
```rust
let mut rc = Rc::new(5);
if let Some(data) = Rc::get_mut(&mut rc) {
    *data += 1; // Requires &mut Rc<T> AND strong_count == 1
}
```
The key distinction (same as Cell/RefCell):
- `get_mut()`: requires `&mut self` → compile-time checked → defeats the purpose
- `Rc::clone()`: only needs `&self` → shared ownership → this is the point!
If you're the sole owner (strong_count == 1), you don't need Rc at all! The whole point of Rc is enabling multiple owners.
Common Patterns
Pattern 1: Shared Data Across Components
The most common use case - multiple components need read access to the same data:
```rust
struct Server { config: Rc<Config> }
struct Logger { config: Rc<Config> }
struct Database { config: Rc<Config> }

let config = Rc::new(Config::load());
let server = Server { config: Rc::clone(&config) };
let logger = Logger { config: Rc::clone(&config) };
let db = Database { config: Rc::clone(&config) };

// All components can read the same config
println!("Server using: {}", server.config.db_url);
println!("Logger using: {}", logger.config.db_url);
```
Pattern 2: Tree Structures (Parent → Children)
Rc works well for tree structures where parents own children:
```rust
use std::rc::Rc;

struct Node {
    value: i32,
    children: Vec<Rc<Node>>,
}

let child1 = Rc::new(Node { value: 1, children: vec![] });
let child2 = Rc::new(Node { value: 2, children: vec![] });

let parent = Rc::new(Node {
    value: 0,
    children: vec![Rc::clone(&child1), Rc::clone(&child2)],
});

// child1 and child2 can be shared elsewhere too
let another_parent = Rc::new(Node {
    value: 10,
    children: vec![Rc::clone(&child1)], // Shared child!
});
```
Pattern 3: Caching/Flyweight Pattern
Share immutable data to save memory:
```rust
use std::collections::HashMap;
use std::rc::Rc;

struct FontCache {
    fonts: HashMap<String, Rc<FontData>>,
}

impl FontCache {
    fn get(&mut self, name: &str) -> Rc<FontData> {
        self.fonts.entry(name.to_string())
            .or_insert_with(|| Rc::new(FontData::load(name)))
            .clone() // Cheap Rc clone, not data clone
    }
}
```
Pattern 4: Functional Data Structures
Share structure between versions:
```rust
use std::rc::Rc;

#[derive(Debug)]
enum List<T> {
    Cons(T, Rc<List<T>>),
    Nil,
}

let tail = Rc::new(List::Cons(2, Rc::new(List::Cons(3, Rc::new(List::Nil)))));

// Two lists sharing the same tail
let list1 = List::Cons(1, Rc::clone(&tail));
let list2 = List::Cons(0, Rc::clone(&tail));
// Both lists share the [2, 3] portion in memory
```
The Problem with Cycles: When Rc Leaks Memory
Rc can create memory leaks if you create reference cycles. Here's the conceptual problem:
Imagine two nodes referencing each other:
When the stack variables drop:
- Each node's count decrements by 1 (from stack variables)
- But each node still has count = 1 (from the other node)
- Neither reaches 0, so neither is freed
- Memory leak! 💀
Why this happens: Each Rc in the cycle keeps the others alive. There's no "starting point" to begin deallocation.
The solution: Use Weak<T> references to break cycles. Weak is a non-owning reference that doesn't keep the value alive. This pattern is covered in Chapter 13 (Rc + RefCell), where you'll learn how to combine Rc, Weak, and interior mutability for practical data structures like graphs and trees with bidirectional references.
Building Our Own Rc
The Inner Structure
Before diving into implementation details, we need to answer a fundamental question: Why do we need RcInner? Why not just put the count in Rc itself?
Why We Need RcInner: The Shared Count Problem
Consider what happens if we try to store the count directly in each Rc:
```rust
// ❌ WRONG: Each Rc has its own count
struct Rc0<T> {
    ptr: *mut T,
    count: usize, // Each Rc instance has its own count!
}
```
The problem: When you clone an Rc, you create a new struct with a separate count:
```rust
let rc1 = Rc::new(String::from("data")); // rc1.count = 1
let rc2 = rc1.clone();                   // rc2.count = 1 (copied!)

// rc1 and rc2 have DIFFERENT counts!
// When rc1 drops, it sees count = 1, so it frees the memory
// When rc2 drops, it also sees count = 1, so it tries to free again
// 💀 Double free! Undefined behavior!
```
What we actually need: All Rc instances pointing to the same data must share the SAME count:
```rust
let rc1 = Rc::new(String::from("data"));
let rc2 = rc1.clone();
let rc3 = rc1.clone();

// All three need to see: count = 3
// When rc1 drops: count becomes 2
// When rc2 drops: count becomes 1
// When rc3 drops: count becomes 0 → NOW free the memory
```
The solution: Store the count with the data, not with the pointer
```rust
// ✅ CORRECT: Separate inner struct
struct RcInner<T> {
    strong_count: usize, // Shared count, stored with the value
    value: T,
}

struct Rc0<T> {
    ptr: *mut RcInner<T>, // Just a pointer to the shared data + count
}
```
Visual comparison:
Wrong approach (count in each Rc):
Correct approach (count with data):
In code:
```rust
// Each Rc is just a pointer
let rc1 = Rc::new(String::from("data"));
// Heap: RcInner { count: 1, value: "data" }

let rc2 = rc1.clone();
// rc2 gets a COPY of the pointer (not the count!)
// Both rc1.ptr and rc2.ptr point to the SAME RcInner
// RcInner.count is incremented: count = 2

drop(rc1);
// Follows rc1.ptr to the shared RcInner
// Decrements the shared count: count = 1
// Doesn't free (count != 0)

drop(rc2);
// Follows rc2.ptr to the SAME RcInner
// Decrements the shared count: count = 0
// NOW frees the memory ✅
```
Summary: RcInner exists to ensure the count lives with the data on the heap, not with the pointer on the stack. This way, all Rc instances pointing to the same data share the exact same count.
Now let's tackle the main challenge in implementing RcInner.
The Challenge: Mutating Counts Through Shared References
The problem: When you clone an Rc, you only have &self (shared reference), but you need to increment the count.
If we tried using a plain usize:
```rust
struct RcInner<T> {
    strong_count: usize, // ❌ Can't mutate this from &self
    value: T,
}

impl<T> Clone for Rc0<T> {
    fn clone(&self) -> Rc0<T> {
        let inner = unsafe { &*self.ptr };
        inner.strong_count += 1; // ❌ ERROR: cannot mutate through shared reference!
        Rc0 { ptr: self.ptr }
    }
}
```
This fails because:
- `clone()` receives `&self` (the `Clone` trait requires this)
- From `&self`, we get `&RcInner<T>` (a shared reference to the inner data)
- Rust forbids mutation through shared references (prevents data races)
- But we MUST increment the count!
The solution: Cell<usize>
Cell provides interior mutability for Copy types (covered in Chapter 5):
```rust
struct RcInner<T> {
    strong_count: Cell<usize>, // ✅ Can mutate through &self!
    value: T,
}

impl<T> Clone for Rc0<T> {
    fn clone(&self) -> Rc0<T> {
        let inner = unsafe { &*self.ptr };
        // ✅ Works! Cell allows mutation through a shared reference
        inner.strong_count.set(inner.strong_count.get() + 1);
        Rc0 { ptr: self.ptr }
    }
}
```
The Simplified Structure
For now, we'll keep our implementation simple:
```rust
use std::cell::Cell;

struct RcInner<T> {
    strong_count: Cell<usize>, // Cell: mutate through &self
    value: T,                  // The actual data
}

struct Rc0<T> {
    ptr: *mut RcInner<T>,
}
```
Note: The actual implementation in src/rc.rs is more complete - it includes weak_count and uses ManuallyDrop<T> to support Weak references (non-owning pointers that don't keep the value alive). We'll cover Weak in Chapter 13 (Rc + RefCell).
new - Create with Count 1
```rust
impl<T> Rc0<T> {
    fn new(value: T) -> Rc0<T> {
        let inner = Box::new(RcInner {
            strong_count: Cell::new(1),
            value,
        });
        Rc0 {
            ptr: Box::into_raw(inner),
        }
    }
}
```
clone - Increment Count
```rust
impl<T> Clone for Rc0<T> {
    fn clone(&self) -> Rc0<T> {
        let inner = unsafe { &*self.ptr };
        inner.strong_count.set(inner.strong_count.get() + 1);
        Rc0 { ptr: self.ptr }
    }
}
```
Important: Rc::clone() is cheap! It just increments a counter. This is different from .clone() on the inner type which might be expensive.
Convention: Use Rc::clone(&rc) instead of rc.clone() to make it clear you're cloning the pointer, not the data.
Deref - Access the Value
```rust
impl<T> Deref for Rc0<T> {
    type Target = T;

    fn deref(&self) -> &T {
        unsafe { &(*self.ptr).value }
    }
}
```
Note: Rc only gives you &T (shared reference), never &mut T. This is intentional - if multiple owners could mutate, we'd have data races.
drop - Decrement and Maybe Free
```rust
impl<T> Drop for Rc0<T> {
    fn drop(&mut self) {
        let inner = unsafe { &*self.ptr };
        let count = inner.strong_count.get();
        inner.strong_count.set(count - 1);

        if count == 1 {
            // Last reference - deallocate everything
            unsafe {
                drop(Box::from_raw(self.ptr));
            }
        }
    }
}
```
When the last Rc drops (count becomes 0), we deallocate the entire RcInner, which automatically drops the value.
strong_count - Check Reference Count
```rust
impl<T> Rc0<T> {
    fn strong_count(this: &Rc0<T>) -> usize {
        unsafe { (*this.ptr).strong_count.get() }
    }
}
```
get_mut - Unique Access for Sole Owner
If you're the only owner (strong_count == 1), you can get &mut T:
```rust
impl<T> Rc0<T> {
    fn get_mut(this: &mut Rc0<T>) -> Option<&mut T> {
        if Rc0::strong_count(this) == 1 {
            // SAFETY: We're the sole owner, so no aliases exist
            unsafe { Some(&mut (*this.ptr).value) }
        } else {
            None
        }
    }
}
```
The Complete Implementation
Note: The actual implementation in src/rc.rs is more complete - it includes full Weak0<T> support and uses ManuallyDrop<T> to properly separate dropping the value from deallocating the memory. We'll explore Weak references in depth in Chapter 13 (Rc + RefCell). For this chapter, here's the complete Rc implementation:
```rust
use std::cell::Cell;
use std::ops::Deref;

struct RcInner<T> {
    strong_count: Cell<usize>,
    value: T,
}

pub struct Rc0<T> {
    ptr: *mut RcInner<T>,
}

impl<T> Rc0<T> {
    pub fn new(value: T) -> Rc0<T> {
        let inner = Box::new(RcInner {
            strong_count: Cell::new(1),
            value,
        });
        Rc0 { ptr: Box::into_raw(inner) }
    }

    pub fn strong_count(this: &Rc0<T>) -> usize {
        unsafe { (*this.ptr).strong_count.get() }
    }

    pub fn get_mut(this: &mut Rc0<T>) -> Option<&mut T> {
        if Rc0::strong_count(this) == 1 {
            unsafe { Some(&mut (*this.ptr).value) }
        } else {
            None
        }
    }
}

impl<T> Clone for Rc0<T> {
    fn clone(&self) -> Rc0<T> {
        let inner = unsafe { &*self.ptr };
        inner.strong_count.set(inner.strong_count.get() + 1);
        Rc0 { ptr: self.ptr }
    }
}

impl<T> Deref for Rc0<T> {
    type Target = T;

    fn deref(&self) -> &T {
        unsafe { &(*self.ptr).value }
    }
}

impl<T> Drop for Rc0<T> {
    fn drop(&mut self) {
        let inner = unsafe { &*self.ptr };
        let count = inner.strong_count.get();
        inner.strong_count.set(count - 1);

        if count == 1 {
            // Last reference - deallocate everything
            unsafe {
                drop(Box::from_raw(self.ptr));
            }
        }
    }
}
```
Rc vs Box vs References
| | `Box<T>` | `Rc<T>` | `&T` |
|---|---|---|---|
| Ownership | Single owner | Multiple owners | Borrowed |
| Heap allocation | Yes | Yes | No |
| Clone behavior | Deep copy | Increment counter | Just copy pointer |
| Runtime overhead | None | Counter checks | None |
| Mutability | `&mut` via Box | Need `RefCell` | Follow borrow rules |
| Thread-safe? | No | No | Depends on T |
| Use when | Single owner | Shared immutable | Short-lived |
Key Takeaways
- Rc enables shared ownership - Multiple owners can access the same data
- Clone is cheap - Only increments a counter, doesn't copy data (O(1))
- Only shared references - `Rc` gives `&T`, never `&mut T` (prevents data races)
- Beware of cycles - `Rc` cycles cause memory leaks (see Chapter 13 for solutions)
- Not thread-safe - Use `Arc` for multi-threaded code
- Automatic cleanup - RAII ensures memory is freed when the last `Rc` drops
- get_mut confusion - `get_mut()` requires sole ownership, defeating Rc's purpose
Pitfalls
Pitfall 1: Cloning Inner Data Instead of Rc
BAD: Dereferencing and cloning the inner data:
```rust
let rc1 = Rc::new(vec![1, 2, 3, 4, 5]);

// ❌ This clones the Vec, not the Rc!
let vec_clone = (*rc1).clone(); // Expensive! Copies all elements
```
FIX: Clone the Rc, not the inner data:
```rust
let rc1 = Rc::new(vec![1, 2, 3, 4, 5]);

// ✅ This clones the Rc - cheap!
let rc2 = Rc::clone(&rc1); // Just increments the counter

// Both point to the same Vec
assert_eq!(Rc::strong_count(&rc1), 2);
```
Pitfall 2: Trying to Mutate Through Rc
BAD: Attempting to get &mut from Rc:
```rust
let rc = Rc::new(vec![1, 2, 3]);
// rc.push(4); // ❌ ERROR: cannot borrow as mutable
```
Why this fails: Rc only provides shared references (&T), never mutable references (&mut T). This prevents data races when multiple owners exist.
FIX 1: If you're the sole owner, use get_mut():
```rust
let mut rc = Rc::new(vec![1, 2, 3]);
if let Some(vec) = Rc::get_mut(&mut rc) {
    vec.push(4); // ✅ Works if strong_count == 1
}
```
FIX 2: Use Box or plain values if you don't need sharing:
```rust
// If you don't need shared ownership, don't use Rc!
let mut vec = vec![1, 2, 3];
vec.push(4); // ✅ Simplest solution
```
Exercises
See examples/07_rc.rs for hands-on exercises.
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises
For cycle prevention with Weak and mutable shared data, see Chapter 13 (Rc + RefCell).
Next Chapter
Chapter 8: Rc + RefCell - Shared mutable state with reference counting and interior mutability.
Chapter 8: Rc + RefCell - Shared Mutable State (Single-Threaded)
The Problem: Rc Can't Mutate
You learned about Rc<T> in Chapter 7 - it gives you shared ownership. Multiple parts of your code can own the same data:
```rust
use std::rc::Rc;

let config = Rc::new(Config { port: 8080 });
let server = Rc::clone(&config); // ✅ Multiple owners!
let logger = Rc::clone(&config); // ✅ Everyone can read!
```
But what if you need to mutate the shared data?
```rust
let counter = Rc::new(0);
*counter += 1; // ❌ ERROR: cannot borrow as mutable
```
Why it fails: Rc<T> only gives you &T (shared reference), never &mut T. If multiple owners could get &mut T, you'd have multiple mutable references to the same data - a data race!
Real-world scenarios where you need shared mutable data:
- Graph structures - Nodes need to add/remove edges to other nodes
- Trees with parent pointers - Children need to update their parent references
- Observer pattern - Observable needs to notify and update observers
- Shared cache - Multiple readers with occasional updates
You need a way to have multiple owners AND mutation. Rc<T> alone isn't enough.
The Solution: Rc<RefCell>
Remember RefCell<T> from Chapter 6? It gives you interior mutability - the ability to mutate through a shared reference:
```rust
use std::cell::RefCell;

let cell = RefCell::new(5);
*cell.borrow_mut() += 1; // ✅ Mutate through &RefCell!
```
Combining them gives you both:
```rust
use std::rc::Rc;
use std::cell::RefCell;

// Shared ownership + interior mutability
let counter = Rc::new(RefCell::new(0));

// Clone the Rc to create multiple owners
let counter1 = Rc::clone(&counter);
let counter2 = Rc::clone(&counter);

// All owners can mutate the same data!
*counter1.borrow_mut() += 1; // ✅ Works!
*counter2.borrow_mut() += 1; // ✅ Works!

println!("Counter: {}", counter.borrow()); // 2
```
Memory Layout: Two Levels of Tracking
Rc<RefCell<T>> has two levels of tracking in a single allocation:
Two kinds of tracking:
- Rc level: Reference counting (how many owners exist)
- RefCell level: Borrow checking (is anyone currently borrowing?)
Two ways things can go wrong:
- Memory leak - If you create a reference cycle at the Rc level (counts never reach 0)
- Runtime panic - If you violate borrow rules at the RefCell level (multiple mutable borrows)
Two Patterns: Which Order?
You'll see two different patterns in this chapter - the order matters!
Pattern 1: Rc<RefCell<T>> - Multiple owners of mutable data
Use when: You need multiple owners who all mutate the same value.
```rust
let counter = Rc::new(RefCell::new(0));
let c1 = Rc::clone(&counter);
let c2 = Rc::clone(&counter);

*c1.borrow_mut() += 1; // Both owners mutate the same counter
*c2.borrow_mut() += 1; // Counter is now 2
```
Pattern 2: RefCell<Rc<T>> - Changing which shared value you point to
Use when: You have one owner who needs to change which Rc it points to.
```rust
struct Node {
    next: RefCell<Option<Rc<Node>>>, // Can change which node this points to
}

let node_a = Rc::new(Node { next: RefCell::new(None) });
let node_b = Rc::new(Node { next: RefCell::new(None) });

// node_a changes which node it points to
*node_a.next.borrow_mut() = Some(Rc::clone(&node_b));
```
Key difference:
- `Rc<RefCell<T>>`: Multiple owners → mutate the value inside
- `RefCell<Rc<T>>`: One owner → change which `Rc` you're pointing to
You'll see both patterns in real code!
Note: When you add `Option` into the mix, the order becomes even more important: `RefCell<Option<Rc<T>>>` vs `Option<RefCell<Rc<T>>>` vs `Rc<RefCell<Option<T>>>` all have different capabilities. See Appendix: Nested Types for a detailed exploration of these patterns.
Creating Cycles: The Memory Leak Problem
Now that we can mutate through Rc, we can create actual reference cycles that leak memory. Chapter 7 showed this conceptually with std::mem::forget - now let's see it for real.
Step-by-Step Cycle Creation
```rust
use std::rc::Rc;
use std::cell::RefCell;

// Node that can point to another node
struct Node {
    value: i32,
    next: RefCell<Option<Rc<Node>>>,
}

// Create two nodes
let node_a = Rc::new(Node {
    value: 1,
    next: RefCell::new(None),
});
let node_b = Rc::new(Node {
    value: 2,
    next: RefCell::new(None),
});
// a count: 1, b count: 1

// Point a -> b
*node_a.next.borrow_mut() = Some(Rc::clone(&node_b));
// a count: 1, b count: 2 (a's next points to b)

// Point b -> a (creating a cycle!)
*node_b.next.borrow_mut() = Some(Rc::clone(&node_a));
// a count: 2, b count: 2 (cycle created!)
```
Visual representation of the cycle:
When the stack variables drop:
- `node_a` drops: decrements node_a's count from 2 to 1
- `node_b` drops: decrements node_b's count from 2 to 1
- Both nodes still have count = 1 (from each other's `next` field)
- Neither can be freed because their counts never reach 0
- Memory leaked forever! 💀
This is EXACTLY what Chapter 7's std::mem::forget example demonstrated - the memory stays allocated because the reference count never reaches zero, preventing Drop from being called.
Why Cycles Leak Memory
The cycle prevents deallocation because:
- Node A can't be freed while Node B holds a reference to it
- Node B can't be freed while Node A holds a reference to it
- There's no "starting point" to begin cleanup
- The reference counts stay positive forever
This is a memory leak - the memory is allocated but never freed. Unlike other memory safety violations (use-after-free, double-free), memory leaks are considered "safe" in Rust - they won't crash your program, but they waste memory.
Note: "Forever" means for the program's lifetime - the OS reclaims memory when the program exits. The real problem: if cycles are created repeatedly in long-running programs (servers that run 24/7, desktop apps), memory usage grows unbounded until the program crashes.
Breaking Cycles with Weak
Remember our cycle problem? Both nodes point to each other:
When we drop both variables, the counts go from 2 to 1, but never reach 0. The nodes keep each other alive forever.
The Real-World Problem: Parent-Child Trees
Imagine a tree structure where:
- Parent nodes need to access their children
- Child nodes need to access their parent
```rust
struct Node {
    value: i32,
    parent: Rc<RefCell<Node>>,        // Child points to parent
    children: Vec<Rc<RefCell<Node>>>, // Parent points to children
}
```
This creates a cycle! The parent keeps children alive, and children keep the parent alive. When you drop the root, nothing is freed.
The Solution: Weak References
The insight: Children shouldn't "own" their parent. They just need to reference it.
What does "own" mean? In Rust, owning a value means keeping it alive. When you have an Rc<T>, you own the value - it won't be freed as long as your Rc exists. The value is only freed when all owners (all Rc clones) are dropped.
In a tree, if children own their parent via Rc, they keep the parent alive. But the parent also owns the children via Rc, keeping them alive. This is a cycle - nothing can be freed.
The fix: Use Weak<T> for child → parent references. Weak doesn't own - it doesn't keep the value alive:
```rust
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    value: i32,
    parent: Weak<RefCell<Node>>,      // Weak: doesn't own the parent
    children: Vec<Rc<RefCell<Node>>>, // Rc: owns the children
}

// Create the root node (the parent)
let root = Rc::new(RefCell::new(Node {
    value: 1,
    parent: Weak::new(), // Root has no parent
    children: vec![],
}));

// Create child nodes
let child1 = Rc::new(RefCell::new(Node {
    value: 2,
    parent: Rc::downgrade(&root), // Weak reference to the parent
    children: vec![],
}));
let child2 = Rc::new(RefCell::new(Node {
    value: 3,
    parent: Rc::downgrade(&root), // Weak reference to the parent
    children: vec![],
}));

// Root stores strong references to its children
root.borrow_mut().children.push(Rc::clone(&child1));
root.borrow_mut().children.push(Rc::clone(&child2));

// Reference counts:
// - root: strong=1 (only the `root` variable owns it;
//   children hold Weak, which doesn't increment strong_count)
// - child1: strong=2 (owned by the `child1` variable + root.children)
// - child2: strong=2 (owned by the `child2` variable + root.children)
```
Now when you drop the variables:
```rust
drop(child1); // child1 strong_count: 2 → 1 (still owned by root.children)
drop(child2); // child2 strong_count: 2 → 1 (still owned by root.children)
drop(root);   // ⬅ This triggers the cleanup cascade:
// 1. root strong_count: 1 → 0 (no more owners)
// 2. root is freed
// 3. root.children Vec is freed
// 4. child1 strong_count: 1 → 0 (Vec was the last owner)
// 5. child2 strong_count: 1 → 0 (Vec was the last owner)
// 6. child1 and child2 are freed
// 7. No leak! Everything is cleaned up.
```
Why this works: Children use Weak for parent references, so they don't keep the root alive. When the root variable is dropped, the root's strong count reaches 0 and it's freed. This frees the children Vec, which drops the children.
How Weak Works
```rust
let strong = Rc::new(42);
let weak: Weak<i32> = Rc::downgrade(&strong);
// Strong count: 1, Weak count: 1

// Must upgrade to use (might return None if value was dropped)
if let Some(upgraded) = weak.upgrade() {
    // upgrade() succeeded! strong_count: 1 → 2
    // `upgraded` is now an Rc<i32> - a new owner
    assert_eq!(*upgraded, 42); // Value: 42
    // When `upgraded` goes out of scope: strong_count: 2 → 1
}
// After scope ends, strong_count back to 1

drop(strong);
// strong_count: 1 → 0, value is freed
// (even though weak_count = 1, Weak doesn't keep the value alive)

// After value is dropped, upgrade returns None
assert!(weak.upgrade().is_none());
```
Key differences:
| Strong (`Rc<T>`) | Weak (`Weak<T>`) |
|---|---|
| Keeps value alive | Doesn't keep value alive |
| Increments `strong_count` | Increments `weak_count` |
| Direct access with `*` | Must `.upgrade()` first (returns `Option<Rc<T>>`) |
| Value freed when count = 0 | Can access only while strong refs still exist |
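Each row of this comparison can be checked against the real `std::rc` types; a quick sketch:

```rust
use std::rc::{Rc, Weak};

fn main() {
    let strong = Rc::new(7);
    let weak: Weak<i32> = Rc::downgrade(&strong);

    // downgrade() incremented weak_count, not strong_count
    assert_eq!(Rc::strong_count(&strong), 1);
    assert_eq!(Rc::weak_count(&strong), 1);

    // Strong: direct access via deref. Weak: must upgrade first.
    assert_eq!(*strong, 7);
    assert_eq!(weak.upgrade().map(|rc| *rc), Some(7));

    drop(strong); // strong_count hits 0: value freed despite the live Weak
    assert!(weak.upgrade().is_none());
}
```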
Parent-Child Tree Pattern
The classic use case: Trees where children can navigate to their parent.
The rule: Use Rc for ownership direction, Weak for back-references:
#![allow(unused)] fn main() { use std::rc::{Rc, Weak}; use std::cell::RefCell; struct Node { value: i32, parent: RefCell<Weak<Node>>, // ← Weak to parent (non-owning) children: RefCell<Vec<Rc<Node>>>, // ← Strong to children (owning) } impl Node { fn new(value: i32) -> Rc<Node> { Rc::new(Node { value, parent: RefCell::new(Weak::new()), children: RefCell::new(vec![]), }) } fn add_child(parent: &Rc<Node>, child: &Rc<Node>) { // Child holds Weak reference to parent (doesn't keep parent alive) *child.parent.borrow_mut() = Rc::downgrade(parent); // Parent holds strong reference to child (keeps child alive) parent.children.borrow_mut().push(Rc::clone(child)); } } // Usage let root = Node::new(1); let child1 = Node::new(2); let child2 = Node::new(3); Node::add_child(&root, &child1); Node::add_child(&root, &child2); // Child can access parent if let Some(parent) = child1.parent.borrow().upgrade() { println!("Parent value: {}", parent.value); // 1 } // When root is dropped: // - Children's strong_count goes to 0 (only parent held them) // - Children are freed // - Parent's weak_count goes to 0 (children held weak refs) // - No memory leak! ✅ }
Why this works:
- Parent owns children with `Rc` (strong references)
- Children reference parent with `Weak` (non-owning references)
- When parent drops, children's `strong_count` goes to 0, so they're freed
- No cycle, no leak! ✅
Common Patterns
Pattern 1: Graph Structure with Adjacency List
Build a graph where nodes can have edges to other nodes:
#![allow(unused)] fn main() { use std::rc::Rc; use std::cell::RefCell; struct Node { id: usize, edges: RefCell<Vec<Rc<Node>>>, } impl Node { fn new(id: usize) -> Rc<Node> { Rc::new(Node { id, edges: RefCell::new(vec![]), }) } fn add_edge(&self, target: Rc<Node>) { self.edges.borrow_mut().push(target); } fn neighbors(&self) -> Vec<Rc<Node>> { // Clone the Vec (clones each Rc inside, incrementing their reference counts) // The returned Vec owns these Rc clones until the caller drops it self.edges.borrow().clone() } } // Usage: Build a simple graph let node1 = Node::new(1); let node2 = Node::new(2); let node3 = Node::new(3); node1.add_edge(Rc::clone(&node2)); node1.add_edge(Rc::clone(&node3)); node2.add_edge(Rc::clone(&node3)); // Traverse the graph for neighbor in node1.neighbors() { println!("Node {} connects to Node {}", node1.id, neighbor.id); } }
Note: This creates strong references only, which can leak if there are cycles. For graphs with cycles, use Weak for some edges (e.g., back edges).
Pattern 2: Doubly-Linked List
A list you can traverse in both directions:
#![allow(unused)] fn main() { use std::rc::{Rc, Weak}; use std::cell::RefCell; struct Node { value: i32, next: RefCell<Option<Rc<Node>>>, // Strong forward link prev: RefCell<Weak<Node>>, // Weak backward link } impl Node { fn new(value: i32) -> Rc<Node> { Rc::new(Node { value, next: RefCell::new(None), prev: RefCell::new(Weak::new()), }) } fn append(current: &Rc<Node>, next: &Rc<Node>) { *next.prev.borrow_mut() = Rc::downgrade(current); *current.next.borrow_mut() = Some(Rc::clone(next)); } } // Usage let head = Node::new(1); let second = Node::new(2); let third = Node::new(3); Node::append(&head, &second); Node::append(&second, &third); // Navigate forward if let Some(next) = head.next.borrow().as_ref() { println!("After head: {}", next.value); // 2 } // Navigate backward if let Some(prev) = third.prev.borrow().upgrade() { println!("Before third: {}", prev.value); // 2 } }
Key insight: Forward links use Rc (ownership), backward links use Weak (non-owning).
Pattern 3: Observer Pattern
An observable that notifies multiple observers:
#![allow(unused)] fn main() { use std::rc::Rc; use std::cell::RefCell; trait Observer { fn notify(&self, event: &str); } struct Observable { observers: RefCell<Vec<Rc<dyn Observer>>>, } impl Observable { fn new() -> Self { Observable { observers: RefCell::new(vec![]), } } fn subscribe(&self, observer: Rc<dyn Observer>) { self.observers.borrow_mut().push(observer); } fn notify_all(&self, event: &str) { for observer in self.observers.borrow().iter() { observer.notify(event); } } } // Concrete observer struct Logger { name: String, } impl Observer for Logger { fn notify(&self, event: &str) { println!("[{}] Received event: {}", self.name, event); } } // Usage let observable = Observable::new(); let logger1 = Rc::new(Logger { name: String::from("Logger1"), }); let logger2 = Rc::new(Logger { name: String::from("Logger2"), }); observable.subscribe(Rc::clone(&logger1)); observable.subscribe(Rc::clone(&logger2)); observable.notify_all("UserLoggedIn"); // Output: // [Logger1] Received event: UserLoggedIn // [Logger2] Received event: UserLoggedIn }
Why this works: Observers are shared via Rc (multiple parts can hold references to the same observer). The observable mutates its list through RefCell.
Pattern 4: Shared Cache with Updates
Multiple readers with occasional updates:
#![allow(unused)] fn main() { use std::rc::Rc; use std::cell::RefCell; use std::collections::HashMap; struct Cache { data: RefCell<HashMap<String, String>>, } impl Cache { fn new() -> Rc<Cache> { Rc::new(Cache { data: RefCell::new(HashMap::new()), }) } fn get(&self, key: &str) -> Option<String> { self.data.borrow().get(key).cloned() } fn set(&self, key: String, value: String) { self.data.borrow_mut().insert(key, value); } } // Multiple components share the cache let cache = Cache::new(); let reader1 = Rc::clone(&cache); let reader2 = Rc::clone(&cache); let writer = Rc::clone(&cache); // Writer updates cache writer.set(String::from("user:123"), String::from("Alice")); // Readers can access if let Some(name) = reader1.get("user:123") { println!("Reader1 sees: {}", name); } if let Some(name) = reader2.get("user:123") { println!("Reader2 sees: {}", name); } }
Important: This is single-threaded only! For multi-threaded caching, use Arc<Mutex<HashMap<K, V>>> (covered in Chapter 14).
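As a preview of that chapter, the same cache can be sketched for multi-threaded use by swapping `Rc` → `Arc` and `RefCell` → `Mutex`. This is a minimal sketch, not the chapter's final design:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Arc replaces Rc (thread-safe refcount), Mutex replaces RefCell (thread-safe borrows)
    let cache: Arc<Mutex<HashMap<String, String>>> = Arc::new(Mutex::new(HashMap::new()));

    let writer = Arc::clone(&cache);
    let handle = thread::spawn(move || {
        // lock() is the Mutex analogue of borrow_mut()
        writer.lock().unwrap().insert("user:123".into(), "Alice".into());
    });
    handle.join().unwrap();

    let guard = cache.lock().unwrap();
    assert_eq!(guard.get("user:123").map(String::as_str), Some("Alice"));
}
```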
Pitfalls
Pitfall 1: Creating Accidental Cycles
BAD: Both directions using strong references:
```rust
use std::rc::Rc;
use std::cell::RefCell;

struct Node {
    value: i32,
    next: RefCell<Option<Rc<Node>>>,
    prev: RefCell<Option<Rc<Node>>>, // ❌ BAD: Strong reference both ways!
}

let node_a = Rc::new(Node {
    value: 1,
    next: RefCell::new(None),
    prev: RefCell::new(None),
});
let node_b = Rc::new(Node {
    value: 2,
    next: RefCell::new(None),
    prev: RefCell::new(None),
});

// Create bidirectional strong references - CYCLE!
*node_a.next.borrow_mut() = Some(Rc::clone(&node_b));
*node_b.prev.borrow_mut() = Some(Rc::clone(&node_a));

// Memory leak! 💀
// Counts: node_a = 2, node_b = 2
// When they drop: counts go to 1, never reach 0
// Memory leaked!
```
FIX: Use Weak for one direction:
```rust
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    value: i32,
    next: RefCell<Option<Rc<Node>>>, // Strong forward
    prev: RefCell<Weak<Node>>,       // ✅ Weak backward
}

let node_a = Rc::new(Node {
    value: 1,
    next: RefCell::new(None),
    prev: RefCell::new(Weak::new()),
});
let node_b = Rc::new(Node {
    value: 2,
    next: RefCell::new(None),
    prev: RefCell::new(Weak::new()),
});

// One strong, one weak - NO CYCLE!
*node_a.next.borrow_mut() = Some(Rc::clone(&node_b));
*node_b.prev.borrow_mut() = Rc::downgrade(&node_a); // ✅ Weak reference

// When they drop, no cycle, memory is freed properly
```
The rule: Pick an ownership direction. Use Rc for ownership, Weak for back-references.
Pitfall 2: Borrow Panics at Runtime
BAD: Holding a borrow while trying to borrow mutably:
```rust
use std::rc::Rc;
use std::cell::RefCell;

let data = Rc::new(RefCell::new(vec![1, 2, 3]));
let data2 = Rc::clone(&data);

// Borrow immutably
let borrowed = data.borrow();
println!("Length: {}", borrowed.len());

// Try to borrow mutably through another Rc
// data2.borrow_mut().push(4); // 💥 PANIC! Already borrowed immutably

// The borrow from data is still active!
```
Why it panics: Because the RefCell is shared between data and data2 (via Rc), both Rc instances point to the same RefCell. When data.borrow() is active, trying data2.borrow_mut() violates the borrowing rules - you can't have both immutable and mutable borrows of the same RefCell at the same time.
FIX 1: Drop the borrow explicitly:
```rust
let data = Rc::new(RefCell::new(vec![1, 2, 3]));
let data2 = Rc::clone(&data);

let borrowed = data.borrow();
println!("Length: {}", borrowed.len());
drop(borrowed); // ✅ Drop the borrow

data2.borrow_mut().push(4); // Now safe!
```
FIX 2: Use a shorter scope:
```rust
let data = Rc::new(RefCell::new(vec![1, 2, 3]));
let data2 = Rc::clone(&data);

{
    let borrowed = data.borrow();
    println!("Length: {}", borrowed.len());
} // ✅ Borrow dropped here

data2.borrow_mut().push(4); // Now safe!
```
FIX 3: Don't store the borrow (best approach):
```rust
let data = Rc::new(RefCell::new(vec![1, 2, 3]));
let data2 = Rc::clone(&data);

println!("Length: {}", data.borrow().len()); // ✅ Borrow dropped immediately

data2.borrow_mut().push(4); // Now safe!
```
FIX 4: Use try_borrow_mut to handle gracefully:
```rust
let data = Rc::new(RefCell::new(vec![1, 2, 3]));
let data2 = Rc::clone(&data);

let borrowed = data.borrow();
println!("Length: {}", borrowed.len());

// Try to borrow mutably - returns Err instead of panicking
match data2.try_borrow_mut() {
    Ok(mut b) => {
        b.push(4);
        println!("Successfully modified");
    }
    Err(_) => {
        println!("Can't modify right now, already borrowed");
    }
}
```
Pitfall 3: Weak Upgrade Failures
BAD: Assuming upgrade always succeeds:
```rust
use std::rc::{Rc, Weak};

let strong = Rc::new(42);
let weak = Rc::downgrade(&strong);

drop(strong); // Value is freed!

let value = weak.upgrade().unwrap(); // 💥 PANIC! upgrade() returns None
```
FIX: Always check if upgrade succeeds:
```rust
use std::rc::{Rc, Weak};

let strong = Rc::new(42);
let weak = Rc::downgrade(&strong);

drop(strong);

// ✅ Check before using
if let Some(value) = weak.upgrade() {
    println!("Value: {}", *value);
} else {
    println!("Value has been dropped");
}
```
Practical example: Tree traversal with parent pointers:
```rust
use std::rc::Weak;
use std::cell::RefCell;

struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
}

fn print_parent_value(node: &Node) {
    match node.parent.borrow().upgrade() {
        Some(parent) => {
            println!("Parent value: {}", parent.value);
        }
        None => {
            println!("No parent (root node or parent dropped)");
        }
    }
}
```
Implementation: Building Weak
Note: The complete implementation is in src/rc.rs in this repository. That implementation already includes `Weak<T>` and uses `ManuallyDrop<T>` internally. These sections show how to build them step-by-step to help you understand how they work under the hood.
Weak Structure
Weak<T> has the same structure as Rc<T> - it's just a pointer to the same RcInner:
```rust
use std::cell::Cell;

struct RcInner<T> {
    strong_count: Cell<usize>,
    weak_count: Cell<usize>, // Track weak references
    value: T,
}

pub struct Rc<T> {
    ptr: *mut RcInner<T>,
}

pub struct Weak<T> {
    ptr: *mut RcInner<T>, // Same pointer type as Rc!
}
```
Key difference: Weak points to the same allocation as Rc, but doesn't keep the value alive.
Implementation Note: Implicit Weak Reference

In the real `std::rc::Rc` implementation, `weak_count` starts at 1, not 0. This represents an "implicit weak reference" held by the `Rc` allocation itself. The count only reaches 0 when both:

- All strong references are dropped (`strong_count == 0`)
- All explicit weak references are dropped

This pattern ensures the `RcInner` allocation stays valid as long as ANY reference (strong or weak) exists. When `strong_count` reaches 0, the value `T` is dropped but `RcInner` remains allocated with `weak_count = 1` (the implicit reference). Only when the last explicit `Weak` is dropped does `weak_count` go from 1 to 0, triggering deallocation of `RcInner`.

For simplicity, our implementation here shows `weak_count` starting at 0, which is conceptually easier to understand but differs from the actual std library implementation.
Creating Weak: Rc::downgrade()
Convert a strong reference to a weak one:
```rust
impl<T> Rc<T> {
    pub fn downgrade(this: &Rc<T>) -> Weak<T> {
        unsafe {
            let inner = &*this.ptr;

            // Increment weak_count
            let count = inner.weak_count.get();
            inner.weak_count.set(count + 1);

            // Create Weak pointing to the same allocation
            Weak { ptr: this.ptr }
        }
    }
}
```
What happens:
- Increment `weak_count` (doesn't affect `strong_count`)
- Create a new `Weak` with the same pointer
- Value stays alive (`strong_count` unchanged)
Upgrading Weak: Weak::upgrade()
Try to convert a weak reference back to a strong one:
```rust
impl<T> Weak<T> {
    pub fn upgrade(&self) -> Option<Rc<T>> {
        // Check for null pointer (from Weak::new())
        if self.ptr.is_null() {
            return None;
        }

        unsafe {
            let inner = &*self.ptr;
            let strong = inner.strong_count.get();

            // Check if value is still alive
            if strong == 0 {
                // Value has been dropped, can't upgrade
                None
            } else {
                // Value still alive, increment strong_count
                inner.strong_count.set(strong + 1);

                // Create new Rc pointing to same allocation
                Some(Rc { ptr: self.ptr })
            }
        }
    }
}
```
Why it returns Option:
- If `strong_count == 0`, all `Rc` owners have dropped and the value is gone → return `None`
- If `strong_count > 0`, the value is still alive → increment the count and return `Some(Rc)`
Example:
```rust
let strong = Rc::new(42);
let weak = Rc::downgrade(&strong);
// strong_count = 1, weak_count = 1

// Upgrade succeeds (strong_count > 0)
if let Some(rc) = weak.upgrade() {
    assert_eq!(*rc, 42);
    // strong_count now 2
}
// strong_count back to 1

drop(strong); // strong_count = 0, value dropped

// Upgrade fails (strong_count == 0)
assert!(weak.upgrade().is_none());
```
Weak::new() - Creating an Empty Weak
```rust
impl<T> Weak<T> {
    pub fn new() -> Weak<T> {
        Weak {
            ptr: std::ptr::null_mut(), // Null pointer, doesn't point to anything
        }
    }
}
```
Usage: For fields that might never have a value:
```rust
struct Node {
    parent: Weak<Node>, // Start with no parent
}

let root = Node {
    parent: Weak::new(), // Root has no parent
};
```
Weak's Drop Implementation
When a Weak is dropped, we decrement weak_count and potentially deallocate:
```rust
impl<T> Drop for Weak<T> {
    fn drop(&mut self) {
        // Check for null pointer (from Weak::new())
        if self.ptr.is_null() {
            return;
        }

        unsafe {
            let inner = &*self.ptr;
            let weak = inner.weak_count.get();
            inner.weak_count.set(weak - 1);

            if weak - 1 == 0 {
                // Last Weak reference!
                let strong = inner.strong_count.get();
                if strong == 0 {
                    // Value already dropped by Rc, safe to deallocate
                    drop(Box::from_raw(self.ptr));
                }
                // If strong > 0, some Rc still exists, they'll handle deallocation
            }
        }
    }
}
```
The lifecycle:
```rust
let strong1 = Rc::new(String::from("data"));
let strong2 = Rc::clone(&strong1);
let weak = Rc::downgrade(&strong1);
// strong_count = 2, weak_count = 1

drop(strong1);
// strong_count: 2 → 1
// Nothing deallocated

drop(strong2);
// strong_count: 1 → 0
// Rc::drop() drops the String
// But weak_count = 1, so RcInner stays allocated

drop(weak);
// weak_count: 1 → 0
// strong_count already 0, so Weak::drop() deallocates RcInner
```
Key insight: Weak::drop() only deallocates RcInner when BOTH counts are 0. It never drops the value - Rc::drop() handles that.
Why Weak Doesn't Keep Value Alive
The key is in upgrade():
```rust
pub fn upgrade(&self) -> Option<Rc<T>> {
    if strong_count == 0 {
        return None; // Value gone, can't access it
    }
    // ...
}
```
Contrast with Rc::clone():
```rust
impl<T> Clone for Rc<T> {
    fn clone(&self) -> Rc<T> {
        // Always increments strong_count
        let count = self.inner().strong_count.get();
        self.inner().strong_count.set(count + 1);
        Rc { ptr: self.ptr }
    }
}
```
The difference:
- `Rc::clone()` → increments `strong_count` → keeps value alive
- `Rc::downgrade()` → increments `weak_count` → doesn't keep value alive
- `Weak::upgrade()` → checks `strong_count` first → returns `None` if value dropped
Complete Weak Example
```rust
use std::rc::{Rc, Weak};

// Create strong reference
let strong = Rc::new(42);
println!("strong_count: {}", Rc::strong_count(&strong)); // 1

// Create weak reference
let weak: Weak<i32> = Rc::downgrade(&strong);
println!("weak_count: {}", Rc::weak_count(&strong)); // 1

// Upgrade weak to strong (value still alive)
if let Some(upgraded) = weak.upgrade() {
    println!("Upgraded: {}", *upgraded); // 42
    println!("strong_count: {}", Rc::strong_count(&upgraded)); // 2
}
// upgraded dropped, strong_count back to 1

// Drop the strong reference
drop(strong);

// Try to upgrade (value is gone)
match weak.upgrade() {
    Some(_) => println!("Upgraded successfully"),
    None => println!("Can't upgrade, value dropped"), // This runs
}
```
Implementation: Why Rc Needs ManuallyDrop
Quick Reminder: Rc Structure
From Chapter 7, recall that Rc<T> is defined as:
```rust
use std::cell::Cell;

struct RcInner<T> {
    strong_count: Cell<usize>, // Reference count
    value: T,                  // The actual data
}

pub struct Rc<T> {
    ptr: *mut RcInner<T>, // Pointer to heap allocation
}
```
When implementing Drop for Rc<T>, we face a double-drop problem. This affects Rc<T> for any type T, not just RefCell.
The Problem: Double Drop
Concrete example: Let's say we have 2 strong references and 1 weak reference:
```rust
use std::rc::{Rc, Weak};

let strong1 = Rc::new(String::from("data"));
let strong2 = Rc::clone(&strong1);
let weak: Weak<String> = Rc::downgrade(&strong1);

// State: strong_count = 2, weak_count = 1
```
What needs to happen when we drop both strong refs:
1. Drop `strong1` → `strong_count` = 2 → 1 (value still alive)
2. Drop `strong2` → `strong_count` = 1 → 0 (value must be dropped now!)
3. But `weak` still exists! `RcInner` must stay allocated so `weak.upgrade()` can check `strong_count`
4. Later, drop `weak` → `weak_count` = 1 → 0 (now deallocate `RcInner`)
The challenge: In step 2, we must drop the String but keep the RcInner allocation alive. These are two separate operations.
Naive Approach (Broken)
Now let's see the implementation problem:
```rust
impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        unsafe {
            // self.ptr is *mut RcInner<T>
            let inner = &*self.ptr;
            let count = inner.strong_count.get();
            inner.strong_count.set(count - 1);

            if count - 1 == 0 {
                // Last strong owner! (strong_count = 0, but weak_count = 1)

                // Step 1: Drop the value T
                ptr::drop_in_place(&mut (*self.ptr).value);
                // ✅ String is now dropped, its buffer freed

                // Step 2: Check if we can deallocate RcInner
                let weak = inner.weak_count.get();
                if weak == 0 {
                    // No Weak references, safe to deallocate
                    drop(Box::from_raw(self.ptr));
                    // ⚠️ PROBLEM: Box::drop will try to drop all fields of RcInner<T>
                    // This includes `value: T` - but we already dropped it in Step 1!
                    // 💥 DOUBLE DROP!
                } else {
                    // weak_count = 1, a Weak reference still exists!
                    // We must keep the RcInner allocation alive so
                    // Weak::upgrade() can check strong_count.
                    // The String is dropped, but RcInner stays allocated.
                    // Later, when the last Weak is dropped, it deallocates RcInner.
                }
            }
        }
    }
}
```
Why we need TWO separate steps:
When `strong_count` reaches 0:

1. Drop the value - No more strong owners, the value must be freed
2. Keep the `RcInner` allocation - If `weak_count > 0`, Weak references still need to check `strong_count`

When `weak_count` ALSO reaches 0:

3. Deallocate the `RcInner` - No more references at all
The double-drop problem:
We can't skip Step 1 and rely on Step 3 (deallocating RcInner) to drop the value, because when weak_count > 0 we must drop the value immediately while keeping the allocation alive for Weak references. So the value must be dropped when strong_count → 0, while the struct is deallocated only when weak_count → 0 - two separate operations.
Without ManuallyDrop, step 3 would automatically drop value again - but we already dropped it in step 1!
The Solution: ManuallyDrop
ManuallyDrop<T> is a wrapper that tells Rust: "Don't automatically drop this value - I'll handle it manually."
How does it do this? Simple: by not implementing Drop. When Rust drops a struct, it automatically drops all its fields - but only if the field's type implements Drop. Since ManuallyDrop<T> doesn't implement Drop, Rust won't automatically drop the inner T.
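This is easy to verify with a small experiment. The `Tracked` type and `DROPS` counter below are illustrative helpers (not part of any library) that count how many times a value is dropped:

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Tracked;

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    {
        let _plain = Tracked;
    }
    assert_eq!(DROPS.load(Ordering::SeqCst), 1); // dropped automatically at end of scope

    {
        let _wrapped = ManuallyDrop::new(Tracked);
    }
    assert_eq!(DROPS.load(Ordering::SeqCst), 1); // NOT dropped: ManuallyDrop suppressed it (leaked!)

    {
        let mut wrapped = ManuallyDrop::new(Tracked);
        unsafe { ManuallyDrop::drop(&mut wrapped) }; // we drop it explicitly
    }
    assert_eq!(DROPS.load(Ordering::SeqCst), 2); // dropped exactly once, by us
}
```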
```rust
use std::mem::ManuallyDrop;
use std::cell::Cell;

struct RcInner<T> {
    strong_count: Cell<usize>,
    weak_count: Cell<usize>,
    value: ManuallyDrop<T>, // ← Wrapped in ManuallyDrop
}
```
Now with ManuallyDrop, the Drop implementation is safe:
```rust
impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        unsafe {
            let inner = &*self.ptr;
            let count = inner.strong_count.get();
            inner.strong_count.set(count - 1);

            if count - 1 == 0 {
                // Manually drop the value (need mutable access)
                let inner_mut = &mut *self.ptr;
                ManuallyDrop::drop(&mut inner_mut.value); // ✅ Drops T inside ManuallyDrop

                let weak = inner.weak_count.get();
                if weak == 0 {
                    // Deallocate the RcInner
                    // Since value is ManuallyDrop, Rust won't try to drop it again
                    drop(Box::from_raw(self.ptr)); // ✅ Safe! No double-drop
                }
            }
        }
    }
}
```
Why it works:
- `ManuallyDrop<T>` prevents the automatic drop of `value` when `RcInner` is deallocated
- We explicitly call `ManuallyDrop::drop()` on the value when `strong_count` reaches 0
- No double drop - the value is dropped exactly once
What is ManuallyDrop?
ManuallyDrop<T> is a zero-cost wrapper defined in std::mem:
```rust
#[repr(transparent)]
pub struct ManuallyDrop<T> {
    value: T,
}
```
Key properties:
- Zero cost: Same size and layout as `T` (due to `#[repr(transparent)]`)
- Does not implement Drop: Rust never automatically drops the inner `T`
- Requires unsafe to drop: You must explicitly call `ManuallyDrop::drop(&mut x)`
Core operations:
```rust
// Create (must be `mut` so we can drop it explicitly later)
let mut x = ManuallyDrop::new(String::from("hello"));

// Access (no drop)
println!("{}", *x); // Deref to access inner value

// Drop explicitly (unsafe - you must ensure no future access)
unsafe {
    ManuallyDrop::drop(&mut x); // Drops the String
}
// After this, accessing x is undefined behavior!

// Move out without dropping (unsafe) - shown on a fresh wrapper,
// since taking from an already-dropped wrapper would be UB
let mut y = ManuallyDrop::new(String::from("world"));
let s = unsafe { ManuallyDrop::take(&mut y) }; // Takes ownership of the String
// Now s owns the String; y must never be accessed again
```
Can We Manually Implement It?
Yes! Here's how ManuallyDrop is essentially implemented:
```rust
use std::ops::Deref;
use std::ptr;

#[repr(transparent)]
pub struct ManualDrop<T> {
    value: T,
}

impl<T> ManualDrop<T> {
    pub const fn new(value: T) -> Self {
        ManualDrop { value }
    }

    // Unsafe: Caller must ensure no further access
    pub unsafe fn drop(slot: &mut Self) {
        // Drop the value in place
        ptr::drop_in_place(&mut slot.value);
    }

    // Unsafe: Caller must ensure no further access
    pub unsafe fn take(slot: &mut Self) -> T {
        // Move the value out, leaving the memory logically uninitialized
        ptr::read(&slot.value)
    }
}

// No Drop implementation! This is the key.
// Without Drop, Rust won't automatically drop T.

impl<T> Deref for ManualDrop<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}
```
Key insight: By not implementing Drop, the wrapper doesn't drop its contents. You must manually call drop() or take() to handle the inner value.
ManuallyDrop in Practice
You'll see ManuallyDrop used whenever:
- Reference counting - `Rc` and `Arc` need to control when `T` is dropped
- Custom allocators - Managing memory lifetime manually
- FFI boundaries - Passing ownership to C code
- Union types - Only one variant should be dropped
- Implementing Drop flags - Control drop order precisely
Warning: Using ManuallyDrop is subtle - you must ensure:
- Drop is called exactly once
- No access after drop
- Memory is eventually freed
Get it wrong, and you'll have memory leaks or use-after-free bugs!
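As one concrete way to get it right, here is a sketch of the "move out without dropping" pattern: `into_inner_sketch` is a hypothetical helper (the real equivalent is `ManuallyDrop::into_inner`), shown to illustrate how the safety obligations are discharged:

```rust
use std::mem::ManuallyDrop;

// take() transfers ownership out of the wrapper; afterwards the wrapper
// must never be dropped or accessed again.
fn into_inner_sketch(mut slot: ManuallyDrop<String>) -> String {
    // SAFETY: `slot` is consumed by this function and never touched again,
    // so the value is dropped exactly once - by whoever receives the return value.
    unsafe { ManuallyDrop::take(&mut slot) }
}

fn main() {
    let wrapped = ManuallyDrop::new(String::from("hello"));
    let s = into_inner_sketch(wrapped);
    assert_eq!(s, "hello"); // s now owns the String; it is freed normally here
}
```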
Anti-patterns
```rust
// ❌ BAD: Single owner, no sharing needed
let data = Rc::new(RefCell::new(vec![1, 2, 3]));
data.borrow_mut().push(4);
// Should be: let mut data = vec![1, 2, 3];

// ❌ BAD: Never mutated, just shared
let config = Rc::new(RefCell::new(Config::load()));
// Should be: let config = Rc::new(Config::load());

// ❌ BAD: Using in multi-threaded context
let counter = Rc::new(RefCell::new(0));
thread::spawn(move || { /* ... */ }); // Won't compile! (Rc is not Send)
// Should be: Arc<Mutex<i32>> or Arc<AtomicI32>
```
Key Takeaways
- `Rc<RefCell<T>>` combines shared ownership + interior mutability - Multiple owners can all mutate the same data
- Two levels of tracking - `Rc` tracks reference counts (ownership), `RefCell` tracks borrows (usage)
- Two ways to fail - Memory leaks from `Rc` cycles (safe but wasteful), panics from `RefCell` borrow violations (runtime error)
- Use `Weak` to break cycles - Weak references don't keep values alive, preventing memory leaks in cyclic structures
- Parent-child pattern - Use `Rc` for the ownership direction (parent → child), `Weak` for back-references (child → parent)
- Drop borrows quickly - Keep `RefCell` borrows short-lived to avoid runtime panics
- Don't overuse - If you have a clear owner or don't need sharing, use simpler patterns (`&mut T`, `Box<T>`, `Rc<T>`)
- Always check `Weak::upgrade()` - Weak references might point to dropped values; always handle the `None` case
- Runtime cost - Both `Rc` (reference counting) and `RefCell` (borrow checking) have runtime overhead; use only when needed
Exercises
See examples/08_rc_refcell.rs for hands-on exercises demonstrating:
- Creating and using `Rc<RefCell<T>>`
- Multiple owners mutating shared data
- Creating and breaking reference cycles with `Weak`
- Building graphs and trees with cycles
- Observer pattern implementation
- Common pitfalls and how to avoid them
Complete solutions: Switch to the answers branch with git checkout answers to see completed exercises.
Next Chapter
Chapter 9: Arc - Atomic Reference Counting - Thread-safe shared ownership for multi-threaded code.
Appendix B: Closures - Fn, FnMut, FnOnce Traits
This document covers Rust's closure traits and how they represent different levels of capture and mutation.
What Are Closures?
Closures are anonymous functions that can capture local variables from outside their body. They look like lambda functions from other languages:
```rust
let y = 5;
let add_y = |x: i32| x + y; // ✅ Closure: captures y (declared outside)
add_y(10) // 15
```
What Makes Something a Closure?
Closures (anonymous functions with captures):
```rust
let multiplier = 10;
let multiply = |x| x * multiplier; // ✅ Captures multiplier

let mut count = 0;
let increment = || { count += 1; }; // ✅ Captures and mutates count

let data = vec![1, 2, 3];
let processor = move || println!("{:?}", data); // ✅ Takes ownership of data
```
Not closures (regular functions):
```rust
fn add(x: i32, y: i32) -> i32 {
    x + y // ✅ Regular function: no captures, globally defined
}

fn double(x: i32) -> i32 {
    x * 2 // ✅ Regular function: all data comes from parameters
}
```
Edge case (closures without captures):
```rust
let add_one = |x: i32| x + 1; // ⚠️ Technically a closure, but captures nothing
// Behaves like a regular function (zero-size!)
```
Key Differences from Regular Functions
Compare a closure to a regular function:
```rust
let y = 5;

// ✅ Closure: can access y from environment
let add_y = |x: i32| x + y;

// ❌ Function: cannot access y
fn add_y_fn(x: i32) -> i32 {
    x + y // ERROR: Can't access y, not in function scope
}
```
Regular functions are:
- Globally defined with the `fn` keyword
- Stateless (no captured environment)
- Can only access their parameters and global items
Closures are:
- Locally defined with `||` syntax
- Can capture variables from their surrounding scope
- Each closure has a unique, compiler-generated type
Closures Can Have State
Because closures capture variables, they can maintain state across multiple calls:
```rust
let mut count = 0;

let mut counter = || {
    count += 1;
    count
};

println!("{}", counter()); // 1
println!("{}", counter()); // 2
println!("{}", counter()); // 3

// count is now 3 (modified by the closure)
println!("{}", count); // 3
```
This is powerful! The closure counter "remembers" the value of count between calls. Each time you call it, it increments and returns the new value.
Regular functions cannot do this:
```rust
fn counter_fn() -> i32 {
    let mut count = 0; // Reset to 0 every time!
    count += 1;
    count
}

println!("{}", counter_fn()); // 1
println!("{}", counter_fn()); // 1 (doesn't remember previous calls)
println!("{}", counter_fn()); // 1
```
This stateful behavior makes closures useful for:
- Event handlers that track state
- Iterators that maintain position
- Callbacks that accumulate results
- Any situation where you need a function with memory
The Three Closure Traits
Rust represents closures as traits. Every closure implements one (or more) of these three traits.
The compiler automatically decides which trait(s) a closure implements based on how it uses captured variables:
```rust
let s = String::from("hello");

// Closure that only borrows:
let closure1 = || {
    println!("{}", s); // Only borrows s (immutable reference)
};
// ✅ closure1 implements Fn (and FnMut, and FnOnce)

// Closure that mutates:
let mut count = 0;
let mut closure2 = || {
    count += 1; // Mutates count (mutable reference)
};
// ✅ closure2 implements FnMut (and FnOnce, but NOT Fn)

// Closure that consumes:
let closure3 = || {
    let s1 = s; // Moves s (takes ownership)
};
// ✅ closure3 implements FnOnce only (NOT Fn or FnMut)
```
You don't specify which trait - the compiler infers it from the closure body.
FnOnce - Call Once, Consumes Captures
```rust
pub trait FnOnce<Args> {
    type Output;
    fn call_once(self, args: Args) -> Self::Output;
}
```
Key: self (takes ownership of the closure)
The closure consumes itself when called - can only be called once. Used when the closure takes ownership of captured values:
```rust
let s = String::from("hello");
let once = || {
    let s1 = s; // Takes ownership - s is gone after this
};

once();
once(); // ❌ ERROR: once already consumed
```
FnMut - Call Multiple Times, Mutable Capture
#![allow(unused)] fn main() { pub trait FnMut<Args>: FnOnce<Args> { fn call_mut(&mut self, args: Args) -> Self::Output; } }
Key: &mut self (can modify the closure's captures)
The closure can be called multiple times and can mutate captured values:
#![allow(unused)] fn main() { let mut counter = 0; let mut increment = || { counter += 1; // Mutates counter (captured by mut ref) counter }; increment(); // 1 increment(); // 2 increment(); // 3 }
Fn - Call Multiple Times, Immutable Capture
#![allow(unused)] fn main() { pub trait Fn<Args>: FnMut<Args> { fn call(&self, args: Args) -> Self::Output; } }
Key: &self (only reads captures, never modifies)
The closure can be called any number of times without modifying anything:
#![allow(unused)] fn main() { let multiplier = 5; let multiply = |x: i32| x * multiplier; // Only reads multiplier multiply(2); // 10 multiply(3); // 15 multiply(4); // 20 }
Trait Hierarchy
Fn is the most restrictive, FnOnce is the most general:
classDiagram
class FnOnce {
call_once(self, args)
}
class FnMut {
call_mut(&mut self, args)
}
class Fn {
call(&self, args)
}
FnOnce <|-- FnMut
FnMut <|-- Fn
This means:
- A function that accepts
FnOncecan take any closure (Fn, FnMut, or FnOnce) - A function that accepts
FnMutcan take Fn or FnMut, but NOT FnOnce-only closures - A function that accepts
Fncan only take Fn closures
When to Use Each Trait
Decision tree:
- Does the closure take ownership of a captured value? →
FnOnce - Does the closure mutate a captured value? →
FnMut - Does the closure only read captured values? →
Fn
| Trait | Used When | Standard Library Examples |
|---|---|---|
FnOnce | Closure is called at most once | Option::unwrap_or_else, Result::unwrap_or_else |
FnMut | Closure mutates captured state | Iterator::for_each, [T]::sort_by |
Fn | Closure is pure (no side effects) | Iterator::map, Iterator::filter |
How Closures Work: Generated Structs
Under the hood, every closure is an anonymous struct that the compiler generates for you. This struct contains fields for each captured variable, and implements one of the three closure traits (Fn, FnMut, or FnOnce). Understanding this desugaring helps explain why closures behave the way they do.
Key insights:
- Each closure gets a unique, compiler-generated anonymous struct type
- The struct contains fields for each captured variable
- Which trait is implemented depends on how the captured variables are used
call_oncetakesself(consumes the closure),call_muttakes&mut self(mutates captures),calltakes&self(immutable access)- This is why you can't write the closure's type explicitly - it's anonymous and compiler-generated
- Closures are a zero-cost abstraction: the generated struct is exactly the size of the captured data, with no runtime indirection
- The
extern "rust-call"ABI efficiently unpacks argument tuples with zero runtime cost
Example 1: FnOnce Closure (Captures by Value)
When a closure consumes a captured variable (moves it), it implements FnOnce:
#![allow(unused)] fn main() { // What you write: let s = String::from("hello"); let consume = |prefix: &str| { let s1 = s; // Moves s into s1 (takes ownership) println!("{}: {}", prefix, s1); }; consume("Processing"); // consume("Again"); // ❌ ERROR: consume already called }
Here's roughly what Rust generates:
#![allow(unused)] fn main() { // Compiler-generated anonymous struct: struct ClosureEnv { s: String, // Captured by value (takes ownership) } // Compiler-generated trait impl: impl FnOnce<(&str,)> for ClosureEnv { type Output = (); extern "rust-call" fn call_once(self, args: (&str,)) -> Self::Output { // The closure body: let s1 = self.s; // Moves self.s into s1 println!("{}: {}", args.0, s1); } } }
Notice that call_once takes self (not &self or &mut self). This means the closure consumes itself - it can only be called once. When we call consume("Processing"), the entire struct is moved into the call_once method along with the argument tuple ("Processing",), the closure body executes, and the struct is dropped.
Important: The closure implements FnOnce because it consumes s by moving it into the local variable s1. If the closure only borrowed s (like println!("{}", s) without the move), it would implement Fn instead.
Type vs Trait: If you hover over
consumein your IDE, it will show the type asimpl FnOnce(&str). This doesn't mean the type isFnOnce- rather, it's an anonymous struct (likeClosureEnvabove) that implements theFnOnce(&str)trait. The IDE showsimpl FnOnce(&str)because the actual struct name is compiler-generated and unknowable.
Example 2: FnMut Closure (Captures by Mutable Reference)
When a closure mutates a captured variable, it implements FnMut:
#![allow(unused)] fn main() { let mut count = 0; let mut increment = |delta: i32| { count += delta; // Mutates count }; increment(1); increment(2); println!("{}", count); // 3 }
Here's roughly what Rust generates:
#![allow(unused)] fn main() { // Compiler-generated anonymous struct: struct ClosureEnv<'a> { count: &'a mut i32, // Captured by mutable reference } // Compiler-generated trait impl: impl<'a> FnMut<(i32,)> for ClosureEnv<'a> { extern "rust-call" fn call_mut(&mut self, args: (i32,)) -> Self::Output { // The closure body: *self.count += args.0; } } // FnMut also requires FnOnce: impl<'a> FnOnce<(i32,)> for ClosureEnv<'a> { type Output = (); extern "rust-call" fn call_once(mut self, args: (i32,)) -> Self::Output { self.call_mut(args); } } }
Notice that call_mut takes &mut self. This allows the closure to mutate its captures (via the mutable reference), but the closure itself isn't consumed - it can be called multiple times. The struct stores a mutable reference to count, not the value itself.
Example 3: Fn Closure (Captures by Immutable Reference)
When a closure only reads captured variables, it implements Fn:
#![allow(unused)] fn main() { // What you write: let y = 5; let add_y = |x: i32| x + y; let result1 = add_y(10); // 15 let result2 = add_y(20); // 25 }
Here's roughly what Rust generates:
#![allow(unused)] fn main() { // Compiler-generated anonymous struct: struct ClosureEnv<'a> { y: &'a i32, // Captured by immutable reference } // Compiler-generated trait impl: impl<'a> Fn<(i32,)> for ClosureEnv<'a> { extern "rust-call" fn call(&self, args: (i32,)) -> Self::Output { // The closure body: args.0 + *self.y } } // Fn also requires FnMut and FnOnce: impl<'a> FnMut<(i32,)> for ClosureEnv<'a> { extern "rust-call" fn call_mut(&mut self, args: (i32,)) -> Self::Output { self.call(args) } } impl<'a> FnOnce<(i32,)> for ClosureEnv<'a> { type Output = i32; extern "rust-call" fn call_once(self, args: (i32,)) -> Self::Output { self.call(args) } } }
Notice that call takes &self (immutable reference). The closure can be called many times because it doesn't modify anything. The struct stores an immutable reference to y, allowing the closure to read it repeatedly.
Closures with No Captures
If a closure doesn't capture any variables, the generated struct is empty:
#![allow(unused)] fn main() { // What you write: let always_five = || 5; println!("{}", std::mem::size_of_val(&always_five)); // 0 bytes! // What Rust generates: struct ClosureEnv; // Empty struct impl Fn<()> for ClosureEnv { extern "rust-call" fn call(&self, args: ()) -> i32 { 5 } } }
An empty struct has zero size! This is why closures that capture nothing compile down to pure function calls - they're literally free abstractions.
The move Keyword: Forcing Ownership
By default, closures capture variables by reference (either &T or &mut T). The move keyword forces the closure to take ownership of all captured variables.
Syntax
#![allow(unused)] fn main() { let s = String::from("hello"); let closure = move || { println!("{}", s); // Takes ownership of s }; }
What move Does
Without move (default), closures capture by reference:
#![allow(unused)] fn main() { let s = String::from("hello"); let closure = || println!("{}", s); // Borrows s as &String closure(); println!("{}", s); // ✅ OK: s is still available }
Without move, you can also force the closure to consume a captured variable by moving it inside the closure body:
#![allow(unused)] fn main() { let s = String::from("hello"); let closure = || { let s1 = s; // Moves s out of the environment (forces ownership) println!("{}", s1); }; closure(); // closure(); // ❌ ERROR: closure is FnOnce, can't call twice // println!("{}", s); // ❌ ERROR: s was moved into closure }
This makes the closure FnOnce because it consumes s on the first call - the closure cannot be called multiple times.
Here's what Rust generates:
#![allow(unused)] fn main() { // Compiler-generated struct: struct ClosureEnv<'a> { s: &'a String, // Captured by reference (not owned!) } // Implements FnOnce only: impl<'a> FnOnce<()> for ClosureEnv<'a> { type Output = (); extern "rust-call" fn call_once(self, args: ()) -> Self::Output { let s1 = *self.s; // Moves the String out through the reference println!("{}", s1); } } }
Notice: The struct captures s by reference (&'a String), but the closure body moves the value out, making it FnOnce.
How the closure is instantiated
When you write the closure, Rust automatically creates an instance of the generated struct and captures the environment:
#![allow(unused)] fn main() { // What you write: let s = String::from("hello"); let closure = || { let s1 = s; println!("{}", s1); }; // What Rust generates (conceptual): let s = String::from("hello"); let closure = ClosureEnv { s: &s, // Captures s by reference }; // When you call the closure: closure(); // Rust generates: FnOnce::call_once(closure, ()); // Moves closure, calls call_once }
The key steps:
- Closure creation:
ClosureEnv { s: &s }- the struct is instantiated with a reference tos - Closure call:
FnOnce::call_once(closure, ())- the entire struct is moved intocall_once - Inside call_once:
let s1 = *self.s- the String is moved out through the reference - After call: The closure is consumed and cannot be called again
With move, closures take ownership at closure creation time, but can still be called multiple times:
#![allow(unused)] fn main() { let s = String::from("hello"); let closure = move || println!("{}", s); // Takes ownership of s immediately closure(); closure(); // ✅ OK: closure is Fn, can call multiple times! // println!("{}", s); // ❌ ERROR: s was moved at closure creation (line 2) }
The closure is still Fn (not FnOnce) because it only reads s, it doesn't consume it. The move keyword affects when ownership is transferred (at creation vs at call time), not whether the closure can be called multiple times.
Here's what Rust generates:
#![allow(unused)] fn main() { // Compiler-generated struct: struct ClosureEnv { s: String, // Owned by the closure (moved at creation) } // Implements Fn (and FnMut, and FnOnce): impl Fn<()> for ClosureEnv { extern "rust-call" fn call(&self, args: ()) -> Self::Output { println!("{}", self.s); // Only reads self.s, doesn't consume it } } impl FnMut<()> for ClosureEnv {...} impl FnOnce<()> for ClosureEnv {...} }
Notice: The struct owns s (String, not &String), and the closure implements Fn because call only reads self.s without consuming it. The closure can be called multiple times.
Key difference:
- Without
move: Variable is captured by reference; ownership transfer happens when the closure runs (if moved in body) - With
move: Variable ownership is transferred when the closure is created
When to Use move
Use Case 1: Returning Closures
Closures that capture references can't outlive those references. Use move to transfer ownership:
#![allow(unused)] fn main() { fn make_greeter(name: String) -> impl Fn() { move || println!("Hello, {}!", name) // Takes ownership of name } let greet = make_greeter(String::from("Alice")); greet(); // "Hello, Alice!" }
Without move, this would fail:
#![allow(unused)] fn main() { fn make_greeter(name: String) -> impl Fn() { || println!("Hello, {}!", name) // ❌ ERROR: borrowed value doesn't live long enough } }
Use Case 2: Spawning Threads
Threads require 'static lifetime. Use move to transfer ownership to the thread:
#![allow(unused)] fn main() { use std::thread; let data = vec![1, 2, 3]; thread::spawn(move || { println!("{:?}", data); // Takes ownership of data }).join().unwrap(); println!("{:?}", data); // ❌ ERROR: data was moved }
Without move:
#![allow(unused)] fn main() { let data = vec![1, 2, 3]; thread::spawn(|| { println!("{:?}", data); // ❌ ERROR: closure may outlive current function }).join().unwrap(); }
Use Case 3: Async Functions
Similar to threads, async closures often need move to avoid lifetime issues:
#![allow(unused)] fn main() { async fn process_data(data: Vec<i32>) { tokio::spawn(move || async move { // Process data asynchronously println!("{:?}", data); }); } }
move with Copy Types
For types that implement Copy (like i32, bool, char), move copies the value instead of moving it:
#![allow(unused)] fn main() { let x = 42; let closure = move || println!("{}", x); // Copies x (i32 is Copy) closure(); println!("{}", x); // ✅ OK: x is still available (was copied, not moved) }
move with Non-Copy Types
For non-Copy types (like String, Vec), move transfers ownership:
#![allow(unused)] fn main() { let s = String::from("hello"); let closure = move || println!("{}", s); // Moves s (String is not Copy) closure(); println!("{}", s); // ❌ ERROR: s was moved }
Selective Moving
You can't selectively choose which variables to move - move applies to all captured variables:
#![allow(unused)] fn main() { let x = String::from("x"); let y = String::from("y"); let closure = move || { println!("{}", x); // Both x and y are moved // Even if we don't use y, it's still moved }; // println!("{}", x); // ❌ ERROR: moved // println!("{}", y); // ❌ ERROR: moved (even though not used in closure) }
If you need selective ownership, clone the values you want to keep:
#![allow(unused)] fn main() { let x = String::from("x"); let y = String::from("y"); let y_clone = y.clone(); let closure = move || { println!("{}", x); // x is moved println!("{}", y_clone); // y_clone is moved }; println!("{}", y); // ✅ OK: original y is still available }
Common Pattern: Clone Before move
A common pattern is to clone before using move:
#![allow(unused)] fn main() { use std::sync::Arc; let data = Arc::new(vec![1, 2, 3]); let data_clone = Arc::clone(&data); std::thread::spawn(move || { println!("{:?}", data_clone); // Uses clone, not original }).join().unwrap(); println!("{:?}", data); // ✅ OK: original Arc still available }
Key Takeaways
moveforces ownership transfer of all captured variables- Use
movewhen:- Returning closures from functions
- Spawning threads or async tasks
- The closure needs to outlive the current scope
- Copy types (like
i32) are copied, not moved - Non-Copy types (like
String) are moved and become unavailable moveapplies to all captures - you can't selectively move some variables- Clone before
moveif you need to keep the original value
Closure Size and Zero-Cost Abstractions
Each closure has a compiler-generated anonymous type with fields for each captured variable. The size of a closure equals the size of its captured data:
#![allow(unused)] fn main() { let x = 5i32; // 4 bytes let y = 10i32; // 4 bytes let z = String::from("hello"); // 24 bytes (on modern Rust) let closure = || { let _ = x; // Captures x: 4 bytes let _ = y; // Captures y: 4 bytes let _ = z; // Captures z: 24 bytes }; println!("{}", std::mem::size_of_val(&closure)); // 32 bytes }
Important: The String is 24 bytes of stack metadata (pointer + length + capacity), not the heap data itself. The actual string bytes "hello" live on the heap and aren't counted in the closure's size. size_of_val only measures the stack footprint of the closure struct.
Zero Size Closures
Closures that capture nothing have zero size:
#![allow(unused)] fn main() { let zero_size = || 42; println!("{}", std::mem::size_of_val(&zero_size)); // 0 bytes! }
Why Zero-Cost?
This is why Rust closures are zero-cost abstractions - they compile down to the same code as manual structs, with no runtime overhead:
- No indirection (direct field access)
- No heap allocation (stack-allocated struct)
- No wrapper overhead (just the captured data)
- Argument passing via
extern "rust-call"ABI compiles to efficient assembly
The compiler generates optimal code that's equivalent to hand-written structs with method calls.
Common Misconceptions: Closure Types
Misconception: "Fn(i32) -> i32 is the closure's type"
When you hover over a closure in your IDE, you might see something like impl Fn(i32) -> i32:
#![allow(unused)] fn main() { let f = |x: i32| x + 2; // IDE shows: impl Fn(i32) -> i32 }
This is NOT the closure's actual type. Here's the truth:
Fnis a trait, not a type. Just likeIteratororDebug, it's something types can implement.- The closure has a unique, anonymous compiler-generated type (as we discussed in the "How Closures Work" section).
- The IDE notation
impl Fn(i32) -> i32means: "some unknown type that implements theFn(i32) -> i32trait"
What is the actual type?
Every closure has a unique, unnamed type that only the compiler knows:
#![allow(unused)] fn main() { let f = |x: i32| x + 2; let g = |x: i32| x + 2; // Same code, but different type! // ❌ ERROR: mismatched types // let h: typeof(f) = g; // Can't do this - f and g have different types }
Even though f and g have identical code, they're different types with different anonymous names generated by the compiler. This is why you can't write the type explicitly - it doesn't have a name you can refer to.
How to annotate closure types
Since you can't name the exact type, you have three options:
Option 1: Let the compiler infer (recommended)
#![allow(unused)] fn main() { let f = |x: i32| x + 2; // Type is inferred }
This is the most common approach. The compiler figures out the type automatically.
Option 2: Use a trait bound with generics
When passing closures as parameters, use generic type parameters with trait bounds:
#![allow(unused)] fn main() { fn apply<F>(f: F, x: i32) -> i32 where F: Fn(i32) -> i32, // F is any type that implements Fn(i32) -> i32 { f(x) } let f = |x: i32| x + 2; apply(f, 5); // 7 }
This is static dispatch - the compiler generates specialized code for each closure type, with zero runtime cost.
Option 3: Use impl Trait (modern Rust)
For function parameters and return types, you can use impl Trait:
#![allow(unused)] fn main() { fn make_adder(n: i32) -> impl Fn(i32) -> i32 { move |x| x + n } fn apply(f: impl Fn(i32) -> i32, x: i32) -> i32 { f(x) } }
This is syntactic sugar for generic type parameters with trait bounds. Still static dispatch, zero cost.
Option 4: Use trait objects (Box<dyn Fn>)
If you need to store closures with different types in the same collection, use trait objects:
#![allow(unused)] fn main() { let closures: Vec<Box<dyn Fn(i32) -> i32>> = vec![ Box::new(|x| x + 1), Box::new(|x| x * 2), Box::new(|x| x - 3), ]; for f in &closures { println!("{}", f(10)); // 11, 20, 7 } }
This is dynamic dispatch - has a small runtime cost (heap allocation + vtable lookup) but allows heterogeneous collections.
Type annotation examples
#![allow(unused)] fn main() { // ❌ ERROR: Can't use Fn as a concrete type let f: Fn(i32) -> i32 = |x| x + 2; // ✅ OK: Let compiler infer let f = |x: i32| x + 2; // ✅ OK: Use impl Trait in function parameter fn apply(f: impl Fn(i32) -> i32, x: i32) -> i32 { f(x) } // ✅ OK: Use generic with trait bound fn apply_generic<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 { f(x) } // ✅ OK: Use trait object (dynamic dispatch) let f: Box<dyn Fn(i32) -> i32> = Box::new(|x| x + 2); // ✅ OK: Reference to trait object let f = |x: i32| x + 2; let f_ref: &dyn Fn(i32) -> i32 = &f; }
Key Takeaways
Fn(T) -> Uis a trait, not a type- Every closure has a unique anonymous type generated by the compiler
- You cannot write the closure's actual type - it's unnamed
- Use trait bounds or trait objects when you need to annotate types
- Prefer static dispatch (
impl Traitor generics) for zero-cost abstractions - Use dynamic dispatch (
Box<dyn Fn>) only when you need type flexibility at runtime
See also: Appendix Index | A: Send and Sync | C: Dynamic Dispatch
Appendix: Memory Layout - Where Your Data Lives
This document demystifies where your Rust data actually lives in memory. We'll visualize the process memory layout and understand the stack, heap, and static data segments.
Recommended resource: cheats.rs/#memory-layout provides excellent visual memory layouts for Rust types.
The Simple Program
Let's start with a concrete Rust program and trace where everything lives:
// Global/static data - lives in data segment static GREETING: &str = "Hello"; static mut BUFFER: [u8; 10_000] = [0; 10_000]; // 10 KB zero-initialized fn main() { // Stack: local variables let x = 42; let y = 100; // Stack: String struct (24 bytes: ptr + len + cap) // Heap: actual string data "world" let s = String::from("world"); // Stack: vector struct (24 bytes: ptr + len + cap) // Heap: array data [1, 2, 3, 4, 5] let v = vec![1, 2, 3, 4, 5]; // Stack: native array - all data lives on stack (20 bytes) let arr = [10, 20, 30, 40, 50]; // Stack: function call frame // Passing by value (x) and by reference (&s) let doubled = process_data(x, &s); println!("{} -> {}", x, doubled); } fn process_data(param_num: i32, param_text: &String) -> i32 { // Stack: new function frame // Arguments passed: // - param_num: COPY of x's value (42) - passed by value // - param_text: pointer to s (on stack) - passed by reference let result = param_num * 2; println!("{}: {}", param_text, result); // Return: result is COPIED to caller's stack frame result // Returns 84 }
Now let's see where each piece of data lives in memory.
Process Memory Layout
When your Rust program runs, the operating system gives it a contiguous chunk of virtual memory organized into distinct regions:
Key Insight: The stack and heap grow toward each other!
Let's Trace Our Program
Now let's see exactly where each piece of data from our example lives.
Step 1: Program Starts - Static Data is Loaded
Before main() even runs, the OS loads static data into the DATA segment:
Note on addresses: The stack addresses shown (0x7FFF_FFFF_FFFF) are realistic—they represent the upper canonical address range on x86-64 Linux. However, the low addresses (0x1000-0x7000) are simplified examples for clarity. Real addresses on modern systems would be:
- TEXT/DATA/BSS segments: Around
0x5555_5555_0000to0x5555_5556_0000(randomized by ASLR)- Heap: Typically starts around
0x5555_5556_0000and grows upward- First 64KB (0x0000-0xFFFF): Unmapped as null pointer protection—dereferencing causes segfault We use simplified addresses in diagrams to keep them readable and focus on concepts rather than implementation details.
Why BSS exists: It's a file size optimization! BSS stores only zeros, so the executable doesn't need to include them.
Program in file:
static BUFFER: [u8; 1_000_000] = [0; 1_000_000];
- BSS: Executable just says "allocate 1 MB of zeros" (~16 bytes metadata)
- Data: Would need to store all 1 million bytes of zeros (~1 MB in file)
Process in memory: Both are 1 MB of zeros in memory. The OS allocates and zeroes the BSS memory at load time.
Result: Executable with BSS is ~320 KB, with Data would be ~1.3 MB. Same memory usage at runtime, but different file sizes!
Step 2: main() Executes - Local Variables on Stack
When main() is called, the function's prologue (compiler-generated instructions at the beginning of the function) creates a stack frame by adjusting the stack pointer (typically sub rsp, N where N is the size needed for local variables). After all local variables are initialized (right before calling process_data(x, &s)), the stack looks like this:
Important observations:
xandyare just 4 bytes each, living directly on the stacks(String) is 24 bytes on the stack (metadata: pointer, length, capacity)- The actual string data "world" lives on the heap
v(Vec) is 24 bytes on the stack (metadata: pointer, length, capacity)- The actual array data [1,2,3,4,5] lives on the heap
arr(native array) is 20 bytes entirely on the stack (no heap allocation!)- All 5 integers live directly in the array, no pointer indirection
Step 3: Calling process_data() - New Stack Frame and Passing Arguments
When we call process_data(x, &s), here's what the CPU actually does (x86-64 calling convention):
- Arguments loaded into registers (not pushed to stack!):
x(i32, 4 bytes): Loaded intoEDIregister → becomesparam_num&s(&String, 8 bytes): Pointer loaded intoRSIregister → becomesparam_text- There is no param_num or param_text in memory - they ARE the registers themselves
- CALL instruction executes: Pushes return address onto stack, then jumps to process_data
- Callee (process_data) sets up its stack frame:
- Saves registers if needed
- Allocates space for local variables
- May spill register arguments to stack (compiler's choice)
Key observations about arguments and returns:
-
param_numandparam_textare in CPU REGISTERS, not in memory!param_num(value 42) lives in the EDI registerparam_text(pointer to s) lives in the RSI register- They are NOT stored on the stack (unless the compiler decides to spill them later)
-
Pass by value vs by reference both use registers:
- Pass by value (
x): The value 42 is copied into EDI register - Pass by reference (
&s): The pointer to s (address on stack) is copied into RSI register
- Pass by value (
-
doubledvariable in main's frame has space allocated but isn't set yet - it will receive the return value -
Stack frames stack up - process_data's frame sits on top of main's frame
-
The heap data doesn't move - only stack frames are created/destroyed
-
arrstays on the stack in main's frame - native arrays don't involve heap
How arguments are actually passed (x86-64 System V ABI):
; Conceptual assembly for: let doubled = process_data(x, &s);
mov edi, DWORD PTR [rbp-4] ; Load x (42), located at rbp-4 into EDI register
lea rsi, [rbp-32] ; Load address of s into RSI register (pointer to s on stack)
call process_data ; CALL pushes return address, jumps to function
Inside process_data
; Arguments are in registers: EDI = 42, RSI = pointer to s
; Function prologue - setting up stack frame:
push rbp ; Save caller's base pointer
mov rbp, rsp ; Set up our base pointer
sub rsp, 16 ; Allocate space for local variables (result, etc.)
; Now we can execute the function body:
; let result = param_num * 2;
mov eax, edi ; Load param_num (42) into EAX
shl eax, 1 ; Multiply by 2 (shift left) -> EAX = 84
mov DWORD PTR [rbp-4], eax ; Store result on stack
; ... println! call happens here ...
; Return preparation:
mov eax, DWORD PTR [rbp-4] ; Load result (84) into EAX (return register)
; Function epilogue - cleanup:
add rsp, 16 ; Deallocate local variables
pop rbp ; Restore caller's base pointer
ret ; Return to caller (pops return address, jumps back)
Key points:
-
Arguments go into registers first (not pushed to stack):
- First 6 integer/pointer arguments use: RDI, RSI, RDX, RCX, R8, R9
- Our i32 uses EDI (lower 32 bits of RDI)
- Our &String uses RSI (just a single pointer, 8 bytes)
-
CALL instruction pushes return address onto stack automatically
-
Compiler may spill to stack if:
- Register needed for other operations
- Function has too many arguments (7+ integers)
- Debugging is enabled (makes variables inspectable)
-
Pass by value vs by reference both use registers - the difference is what's copied:
- By value (
x): The actual value (42) is copied into EDI - By reference (
&s): Only the pointer to s (8 bytes) is copied into RSI, pointing to s on the stack
- By value (
Step 4: process_data() Returns - Value Copied Back and Stack Frame Destroyed
When process_data() returns, two things happen:
- Return value is copied: The value in
result(84) is copied todoubledin main's frame (using CPU register or direct memory copy) - Stack frame is popped: process_data's entire frame is destroyed
Key observations about returns:
- Return value is copied: The 4 bytes of
resultare copied (typically via CPU register likeRAXon x86-64, then to stack) - process_data's stack frame is gone: All local variables (param_num, param_text, result) are destroyed
- The heap data remains untouched: Only stack frames change, heap is unaffected
- doubled now has the value 84: Ready to be used by main
How return values work:
- Small values (like i32, 4 bytes): Returned via CPU register (RAX on x86-64), then copied to destination
- Larger values (like structs): Caller pre-allocates space, callee writes directly to it
- Owned heap types (like Vec, String): Only the metadata is copied (24 bytes), heap data stays put
Step 5: main() Ends - Cleanup
When main() returns, s and v go out of scope. Their Drop implementations run:
sis dropped: Callsdealloc()to free the heap memory at 0x8000vis dropped: Callsdealloc()to free the heap memory at 0x8100- main's stack frame is popped: All local variables disappear
STACK: (empty)
HEAP: (freed)
0x8000: (deallocated)
0x8100: (deallocated)
DATA Segment: (still there)
0x5000: GREETING = "Hello"
BSS Segment: (still there)
0x6000: COUNTER = 0
Memory Regions in Detail
The Stack
What lives here:
Before we categorize by type, let's ask: Which of these types live on the stack?
- Primitives:
i32,f64,bool,char? - Enums?
- Structs?
- Arrays:
[T; N]? - Pointers:
&T,&mut T,*const T,*mut T? - Smart pointers:
String,Vec,Box(which are actually just structs)?
You might think: "Primitives live on stack, arrays live on heap..."
But actually, the type doesn't matter. Here's the simple rule:
The Rule: All local variables (declared with let) live on the stack.
Let's test this. Which of these live on the stack?
#![allow(unused)] fn main() { let x: i32 = 0; struct Number { n: i32 } let arr: [i32; 3] = [1, 2, 3]; }
Answer: The values declared with let live on stack.
xlives on stack (it's a local variable)struct Number { n: i32 }is just a type definition - doesn't live anywhere!arrlives on stack (it's a local variable)
When we create an instance of Number with let, that's when it gets memory:
#![allow(unused)] fn main() { let num = Number { n: 42 }; // num lives on stack, so field n (as part of num) lives on stack }
No matter the type, if it's a local variable, it lives on the stack:
#![allow(unused)] fn main() { let result = Ok(42); // enum on stack (including its data) let n: i32 = 5; // primitive on stack let ref_n: &i32 = &n; // reference (pointer) on stack, points to n (also on stack) let p_n: *const i32 = &n; // raw pointer on stack, points to n (also on stack) }
Note: Pointers are bridges between stack and heap. They can point to stack (like
ref_nabove) or to heap (like Vec's internalptr). We'll explore heap allocation in detail later.
What about String, Vec, Box?
These are also just structs! Let's see what Vec actually is:
#![allow(unused)] fn main() { struct Vec<T> { ptr: *mut T, // pointer to heap data len: usize, // length cap: usize, // capacity } let number = Number { n: 42 }; // number on stack, field n on stack let v = Vec::new(); // v (the struct) on stack // ptr is null/dangling, len=0, cap=0 v.push(1); // FIRST push calls alloc()! // Now ptr points to heap, len=1, cap=4 (typically) v.push(2); // adds to heap, len=2, cap=4 }
Key insight: The Vec struct itself always lives on stack. Its fields (ptr, len, cap) always live on stack. But ptr only points to heap after the first allocation (which happens during the first push() that needs capacity).
- After
Vec::new(): ptr is null (or dangling), no heap allocation yet - After first
push(1): ptr points to heap (alloc() was called), heap data exists
Stack memory layout:
We'll explore heap and allocation in more detail in the next section.
Summary:
Your local variables (you create these):
- The Rule: Everything declared with `let` in a function lives on stack - type doesn't matter
- Primitives: `i32`, `f64`, `bool`, `char`
- Structs: the entire struct, including all fields
- Enums: including their variant data
- Arrays: `[T; N]` - all elements inline
- Pointers: `&T`, `&mut T`, `*const T`, `*mut T` - the pointer itself (8 bytes)
- Smart pointer structs: `String`, `Vec`, `Box` - the struct metadata on stack, the data they point to on heap
Compiler-managed (you don't interact with these):
- Function parameters (passed via registers, may be spilled to stack)
- Return addresses (managed via `CALL`/`RET` instructions)
- Saved registers (managed via `PUSH`/`POP` instructions)
Characteristics:
- Automatic management: Variables automatically disappear when they go out of scope. The CPU has built-in stack instructions (`PUSH`, `POP`, `CALL`, `RET`) and a dedicated stack pointer register (`RSP` on x86-64) that make stack operations trivial.
- Fast allocation: Just move the stack pointer (one CPU instruction: `sub rsp, 16` to allocate 16 bytes)
- Fixed size: Typically 2-8 MB (OS-dependent)
- LIFO (Last In, First Out): Like a stack of plates
- Grows downward: From high addresses to low addresses
Example:
#![allow(unused)] fn main() { fn example() { let x = 42; // Allocate 4 bytes on stack let y = vec![1]; // Allocate 24 bytes on stack (Vec metadata; the element itself goes on the heap) } // Stack pointer moves back, x and y are gone }
Stack overflow happens when you use too much stack space:
#![allow(unused)] fn main() { fn infinite_recursion() { let huge = [0u8; 1_000_000]; // 1 MB per call! infinite_recursion(); // Each call adds another frame } // Eventually: stack overflow! }
Raw Pointers
References: De-abstracting the abstraction
First, let's understand what references actually are. Despite all the Rust jargon about "borrowing" and "lifetimes", references are just pointers - plain old memory addresses.
#![allow(unused)] fn main() { let mut x: i32 = 42; // x lives on stack (4 bytes) let x_ref: &i32 = &x; // x_ref is a pointer to x (8 bytes on 64-bit) println!("x is at address: {:p}", x_ref); // Prints something like: 0x00007ffc1234abcd // Can't mix immutable and mutable references: let x_mut_ref: &mut i32 = &mut x; // ❌ Error if x_ref is used again after this line }
You can think of a reference as a safe pointer whose validity is guaranteed by the compiler.
What's in memory:
Key points about references:
- References are pointers: `&i32` is just an 8-byte address (on 64-bit systems)
- They point to existing data: `x_ref` contains the address `0x0000_7FFF_FFFF_FF00`, which is where `x` lives
- Borrow checker enforces rules at compile time:
  - You can have many `&T` (immutable refs) OR one `&mut T` (mutable ref)
  - But NOT both at the same time
- References are always valid: the compiler guarantees the pointed-to data exists
Mutable references work the same way:
#![allow(unused)] fn main() { let mut y: i32 = 100; let y_mut_ref: &mut i32 = &mut y; *y_mut_ref = 200; // Dereference and modify // y is now 200 }
References vs Raw Pointers:
- References (`&T`, `&mut T`): Safe, borrow-checked, always valid
- Raw pointers (`*const T`, `*mut T`): Unsafe, no checking, may be invalid
Let's see raw pointers next:
Raw pointers come in two types, mirroring safe references:
- `*const T` - Read-only raw pointer, like `&T` but without safety guarantees
- `*mut T` - Mutable raw pointer, like `&mut T` but without safety guarantees
What safety guarantees are removed?
With safe references (&T, &mut T), the compiler guarantees:
- ✅ Always points to valid, initialized data
- ✅ Properly aligned for the type
- ✅ Won't outlive the data it points to (lifetime checking)
- ✅ Exclusive access for `&mut T` (no aliasing mutable references)
With raw pointers (*const T, *mut T), you must ensure:
- ❌ May point to invalid/uninitialized data
- ❌ May be misaligned
- ❌ May outlive the data (dangling pointer)
- ❌ Multiple `*mut T` can exist to the same location (you must prevent data races)
Can raw pointers point to arbitrary addresses like in C?
Yes! Unlike references, raw pointers can be created from arbitrary addresses:
#![allow(unused)] fn main() { // Point to GPU's VRAM framebuffer at a specific address // Example: NVIDIA GeForce GTX 1650's prefetchable memory region (from lspci) // Memory at c0000000 (64-bit, prefetchable) [size=256M] let framebuffer: *mut u32 = 0xC000_0000 as *mut u32; // Each pixel is a 32-bit color value (RGBA format) // Pixel at position (x=100, y=50) in a 1920x1080 screen let pixel_offset = (50 * 1920) + 100; // y * width + x let pixel_ptr = unsafe { framebuffer.add(pixel_offset) }; unsafe { // Write a red pixel: 0xFF0000FF (RGBA: Red=255, Green=0, Blue=0, Alpha=255) *pixel_ptr = 0xFF0000FF; // Boom! A red dot appears on screen } }
This example won't actually work because the virtual address 0xC000_0000 in your process's address space is not mapped to anything. While the GPU framebuffer exists at physical address 0xC000_0000, your process doesn't have a page table entry mapping the virtual address 0xC000_0000 to that physical location. Dereferencing it causes a page fault → segmentation fault.
Note: Attempting to mmap this physical address region (e.g., via `/dev/mem`) will be rejected by the kernel. Modern Linux kernels have `CONFIG_STRICT_DEVMEM` enabled, which prevents mapping memory regions already claimed by device drivers. Since the GPU driver (nvidia, nouveau, amdgpu, etc.) has registered this PCI BAR region, direct userspace access is blocked. Additionally, the display server (Wayland/X11) has exclusive control via the DRM subsystem.

This pattern works in: kernel drivers (which own the hardware), embedded systems without a display server, or bare-metal environments. The example demonstrates raw pointers' ability to reference arbitrary addresses - essential for hardware interaction and systems programming.
This is extremely dangerous but necessary for:
- Embedded systems (memory-mapped hardware)
- Operating system development
- Interfacing with C libraries
- Performance-critical code with manual memory management
Key difference from safe references:
#![allow(unused)] fn main() { let mut y: i32 = 42; // ❌ Safe references: can't have multiple mutable refs in use at once // let y_ref1 = &mut y; // let y_ref2 = &mut y; // ERROR (when both are used): cannot borrow as mutable more than once // ✅ Raw pointers: Can have multiple mutable pointers let y_ptr1: *mut i32 = &mut y; let y_ptr2: *mut i32 = &mut y; let y_ptr3: *mut i32 = &mut y; // All OK! (but unsafe to use) unsafe { *y_ptr1 = 100; // Write 100 to y *y_ptr2 = 200; // Overwrite with 200 *y_ptr3 = 300; // Overwrite with 300 (last write wins) // All three pointers point to the same location, so they all read 300 println!("y_ptr1, y_ptr2, y_ptr3: {}, {}, {}", *y_ptr1, *y_ptr2, *y_ptr3); // Output: y_ptr1, y_ptr2, y_ptr3: 300, 300, 300 } println!("y is now: {}", y); // Prints: 300 }
Why this is dangerous: With multiple *mut pointers, you can create data races and undefined behavior - the compiler won't stop you!
Pointers to heap data:
So far, all our pointer examples pointed to stack data (like &x where x is on the stack) or arbitrary addresses. But how do pointers point to heap-allocated data? The answer: allocation.
To allocate memory on the heap, we use std::alloc::alloc() which returns a raw pointer to the allocated memory:
#![allow(unused)] fn main() { use std::alloc::{alloc, dealloc, Layout}; unsafe { // 1. Define the memory layout: we want space for 3 i32s (12 bytes) let layout = Layout::array::<i32>(3).unwrap(); // 2. Allocate memory on the heap (alloc is unsafe!) let ptr: *mut i32 = alloc(layout) as *mut i32; // 3. Check if allocation succeeded (alloc returns null on failure) if ptr.is_null() { panic!("Allocation failed!"); } // 4. Now ptr points to heap! We can write to it *ptr = 42; println!("Value at heap: {}", *ptr); // Prints: 42 // 5. Remember we allocated space for 3 i32s, so we can treat ptr like an array of 3 *ptr.add(0) = 1; // Write 1 at index 0 (first i32) *ptr.add(1) = 2; // Write 2 at index 1 (second i32) *ptr.add(2) = 3; // Write 3 at index 2 (third i32) // 6. What happens if we write beyond our allocation? // *ptr.add(3) = 4; // ⚠️ UNDEFINED BEHAVIOR! We only allocated 3 i32s (indices 0-2) // 7. Read the values back println!("Heap data: {}, {}, {}", *ptr.add(0), *ptr.add(1), *ptr.add(2)); // Output: Heap data: 1, 2, 3 // 8. We MUST manually deallocate when done! dealloc(ptr as *mut u8, layout); // After dealloc, ptr is now a dangling pointer - using it is undefined behavior! } }
Wait, primitives on the heap?
Many people think primitives like i32 always live on the stack. But that's not true! We just allocated three i32s on the heap using alloc(). The location of data (stack vs heap) isn't determined by the type - it's determined by how you allocate it:
- `let x: i32 = 42;` → `x` lives on stack (local variable)
- `alloc(Layout::new::<i32>())` → returns a pointer to the heap (manual allocation)
In our example, the three i32 values (1, 2, 3) are sitting on the heap at addresses 0x5555_8000_0000, 0x5555_8000_0004, and 0x5555_8000_0008. They're heap-allocated primitives!
What happens if we write beyond our allocation?
Writing to *ptr.add(3) is undefined behavior - we only allocated 3 i32s (indices 0-2). Writing to index 3 is out-of-bounds and could:
- Corrupt other heap data - overwrite someone else's allocation
- Trigger a segfault - if `ptr + 12` isn't in valid memory
- Appear to work - but corrupt memory silently
- Cause mysterious bugs later - when the corrupted data is used
Important: This won't cause a compilation error! Inside unsafe blocks, the compiler trusts you completely. It won't check bounds, validate pointers, or prevent undefined behavior. That's your responsibility now.
Unlike Vec, raw pointers don't do bounds checking! Vec would panic on vec[3] if len=3, but raw pointers trust you completely. This is why manual memory management is dangerous.
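The bounds-checking contrast can be demonstrated with safe APIs alone - `Vec` checks every access, which is exactly what raw pointers skip:

```rust
fn main() {
    let v = vec![1, 2, 3]; // len = 3, valid indices 0..=2

    // get() performs a bounds check and returns an Option instead of reading blindly:
    assert_eq!(v.get(2), Some(&3));
    assert_eq!(v.get(3), None); // out of bounds -> None, not garbage

    // Indexing also bounds-checks, but panics on failure:
    let caught = std::panic::catch_unwind(|| v[3]);
    assert!(caught.is_err()); // v[3] panicked instead of reading past the buffer
}
```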
Memory layout:
After *ptr.add(2) = 3, the heap looks like this:
Key points:
- `alloc()` returns a pointer to heap memory - the allocated bytes live on the heap
- Manual deallocation is required - forgetting `dealloc()` causes a memory leak
- After `dealloc()`, the pointer is dangling - using it causes undefined behavior
- This is extremely unsafe - you must ensure:
  - The layout matches what you allocated
  - You don't use the pointer after dealloc
  - You don't call dealloc twice on the same pointer
Smart pointers do this for you:
Types like Vec, String, and Box internally use alloc() and dealloc(), but they:
- Call `alloc()` automatically when you create them
- Store the pointer in a struct on the stack
- Call `dealloc()` automatically in their `Drop` implementation
- Prevent you from using dangling pointers (via the borrow checker)
The Heap
What lives here:
Since we now know about allocation, what lives here is anything that was allocated on the heap. Heap allocation needs two components:
- A raw pointer (to track where the allocation is)
- Allocation management (calling `alloc()` and `dealloc()`)
Rather than asking "which types live on the heap?", we should ask: which Rust standard library types manage heap allocations internally?
These types have a raw pointer field and call alloc()/dealloc():
- `Box<T>` - the `T` value lives on heap
- `Vec<T>` - the array of `T` elements lives on heap
- `String` - the character data lives on heap
- `HashMap<K, V>` - the buckets and entries live on heap
- `Rc<T>` / `Arc<T>` - the `T` value lives on heap
Important: Types like Option<T>, Result<T, E>, Cell<T>, and RefCell<T> don't allocate on the heap by themselves. They're just wrappers around T:
- `Option<i32>` - entirely on stack (just an enum)
- `Option<Box<i32>>` - Box's pointer on stack, the `i32` on heap (because of `Box`, not `Option`)
- `RefCell<Vec<i32>>` - RefCell and Vec metadata on stack, Vec's array data on heap (because of `Vec`, not `RefCell`)
How to know if a type allocates on the heap:
Use "Go to Definition" in your IDE (or check the Rust standard library docs) to inspect the type's internal structure. If you see pointer fields like *mut T, that type manages heap allocations:
#![allow(unused)] fn main() { // Go to Definition on Option<T> shows: pub enum Option<T> { None, Some(T), // ← Just contains T directly, no pointer! } // Go to Definition on Box<T> shows: pub struct Box<T> { ptr: NonNull<T>, // ← Let's Go to Definition on NonNull<T> to see what it is! } pub struct NonNull<T> { pointer: *const T, // ← Pointer, so Box<T> manages heap allocation } // Go to Definition on RefCell<T> shows: pub struct RefCell<T> { borrow: Cell<BorrowFlag>, value: UnsafeCell<T>, // ← Let's Go to Definition on UnsafeCell<T> to see what it is! } pub struct UnsafeCell<T> { value: T, // ← Just contains T directly, no pointer! } }
The rule: If the type has a pointer field (*mut T, *const T, NonNull<T>), it manages heap allocation. Otherwise, it's just a wrapper that lives on the stack.
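You can confirm the "just a wrapper" claim with `size_of`. A sketch (the exact `Option<i32>` size assumes a typical 64-bit target; the `Option<Box<T>>` equality is guaranteed by the null-pointer "niche" optimization - `Box` stores `NonNull<T>`, so the null address is free to encode `None`):

```rust
use std::mem::size_of;

fn main() {
    // Option<i32> is just an enum on the stack: tag + payload, no heap.
    assert_eq!(size_of::<i32>(), 4);
    assert_eq!(size_of::<Option<i32>>(), 8); // 4-byte payload + tag (with padding)

    // Option<Box<i32>> costs nothing extra: None is encoded as the null pointer,
    // which a valid Box can never be.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
}
```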
Characteristics:
- Manual management: You allocate/deallocate (Rust does this for you via `Drop`)
- Slower allocation: Requires finding a free block (complex algorithms)
- Large size: Typically gigabytes (depends on available RAM)
Example:
#![allow(unused)] fn main() { fn example() { // Stack: 24 bytes (Vec metadata) // Heap: 400 bytes (100 * 4-byte integers) let v = vec![0; 100]; // Stack: 24 bytes (String metadata) // Heap: Variable (depends on string length) let s = String::from("hello"); // Stack: 8 bytes (Box pointer) // Heap: 4 bytes (i32) let b = Box::new(42); } // Drop is called, heap memory is freed }
Heap allocation is expensive:
#![allow(unused)] fn main() { // Allocates once, then grows as needed (reallocating) let mut v = Vec::new(); for i in 0..1000 { v.push(i); // Might allocate/reallocate } // Pre-allocate: only allocates once let mut v = Vec::with_capacity(1000); for i in 0..1000 { v.push(i); // No allocation needed } }
Static Data (DATA Segment)
What lives here:
- `static` variables
- `const` values (inlined, but literals live here)
- String literals (`"hello"`)
- Binary data embedded at compile time
Characteristics:
- Loaded at program start: Burned into the executable
- Lives forever: Never deallocated (program lifetime)
- Fixed size: Known at compile time
- Read-only or read-write: Depends on whether it's `static` or `static mut`
Example:
static GREETING: &str = "Hello, world!"; // DATA segment const MAX: i32 = 100; // Inlined (no memory allocated) fn main() { println!("{}", GREETING); // Uses data from DATA segment let x = MAX; // Constant inlined: let x = 100; }
String literals are special:
#![allow(unused)] fn main() { let s1 = "hello"; // Points to DATA segment let s2 = "hello"; // Points to SAME location in DATA segment! assert_eq!(s1.as_ptr(), s2.as_ptr()); // Same address! let s3 = String::from("hello"); // Allocates on heap (different address) }
Visualizing Types
Let's look at how different types are laid out in memory:
Simple Types (Copy)
```rust
let x: i32 = 42;
let y: bool = true;
let z: f64 = 3.14;
```

```text
Stack:
┌──────────────┐
│ x: i32 = 42  │  4 bytes
│ y: bool = 1  │  1 byte (+ padding)
│ z: f64 = ... │  8 bytes
└──────────────┘
Heap: (nothing)
```
Arrays (Fixed Size)
```rust
let arr: [i32; 5] = [1, 2, 3, 4, 5];
```

```text
Stack:
┌─────────────────┐
│ arr: [i32; 5]   │
│ [1][2][3][4][5] │  20 bytes
└─────────────────┘
Heap: (nothing)
```
String
```rust
let s = String::from("hello");
```

```text
Stack:                        Heap:
┌──────────────────┐          ┌──────────────────┐
│ s: String        │          │                  │
│   ptr ──────────────────────> [h][e][l][l][o]  │
│   len: 5         │          │ 5 bytes          │
│   cap: 5         │          │                  │
└──────────────────┘          └──────────────────┘
24 bytes                      5 bytes (+ capacity)
```
Vec
```rust
let v = vec![1, 2, 3];
```

```text
Stack:                        Heap:
┌──────────────────┐          ┌──────────────────┐
│ v: Vec<i32>      │          │                  │
│   ptr ──────────────────────> [1][2][3]        │
│   len: 3         │          │ 12 bytes         │
│   cap: 3         │          │                  │
└──────────────────┘          └──────────────────┘
24 bytes                      12 bytes
```
Box
```rust
let b = Box::new(42);
```

```text
Stack:                        Heap:
┌──────────────────┐          ┌──────┐
│ b: Box<i32>      │          │      │
│   ptr ──────────────────────>  42  │
└──────────────────┘          │      │
8 bytes                       └──────┘
                              4 bytes
```
Nested Types
```rust
let v: Vec<String> = vec![
    String::from("hello"),
    String::from("world"),
];
```

```text
Stack:                     Heap:
┌────────────────────┐     ┌────────────────────────────────────────────────────────┐
│ v: Vec<String>     │     │ String 0: { ptr, len: 5, cap: 5 } ── "hello" [h][e][l][l][o]
│   ptr ────────────────────> String 1: { ptr, len: 5, cap: 5 } ── "world" [w][o][r][l][d]
│   len: 2           │     └────────────────────────────────────────────────────────┘
│   cap: 2           │
└────────────────────┘
```

- Stack: 24 bytes (Vec metadata)
- Heap: 48 bytes (2 × 24-byte String metadata) + 10 bytes (string data)
- Total heap: 58 bytes
Three levels of indirection!
- `v` points to an array of `String`s
- Each `String` points to its character data
- All of that pointed-to data lives on the heap
Common Misconceptions
Misconception #1: "Vec allocates on the stack"
#![allow(unused)] fn main() { let v = Vec::new(); }
Wrong mental model:
Correct mental model:
Misconception #2: "String is just text"
#![allow(unused)] fn main() { let s = String::from("hello"); }
Wrong: "s is the text 'hello'"
Correct: "s is a struct containing a pointer to the text 'hello' on the heap"
Stack: s = { ptr: 0x1000, len: 5, cap: 5 } (24 bytes)
Heap: 0x1000 = "hello" (5 bytes)
Misconception #3: "Box makes things bigger"
#![allow(unused)] fn main() { let x = 42; // 4 bytes let b = Box::new(42); // How many bytes? }
Answer: b is 8 bytes (just a pointer), but total memory usage is 12 bytes (8 + 4).
However: Boxing can actually save stack space for large types:
#![allow(unused)] fn main() { let huge = [0u8; 1_000_000]; // 1 MB on stack! Dangerous! let boxed = Box::new([0u8; 1_000_000]); // 8 bytes on stack, 1 MB on heap }
Misconception #4: "All heap allocations are slow"
Not all heap operations allocate:
#![allow(unused)] fn main() { let mut v = Vec::with_capacity(100); // ✅ One allocation for i in 0..100 { v.push(i); // ✅ No allocation - within capacity } v.push(100); // ❌ NOW we reallocate (the 101st element exceeds capacity) }
Pre-allocating capacity is a common optimization!
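You can watch the reallocation behavior directly by tracking `capacity()` while pushing - each time the capacity changes, a fresh allocation (and a copy of the existing elements) happened. The exact growth sequence is an implementation detail, so this sketch only counts the changes:

```rust
fn main() {
    let mut v = Vec::new();
    let mut reallocations = 0;
    let mut last_cap = v.capacity(); // 0: no buffer yet

    for i in 0..1000 {
        v.push(i);
        if v.capacity() != last_cap {
            reallocations += 1; // capacity changed -> an alloc/realloc happened
            last_cap = v.capacity();
        }
    }
    // Growth is roughly doubling, so ~log2(1000) ≈ 10-ish allocations, not 1000.
    assert!(reallocations < 20);
    println!("reallocations without pre-allocation: {}", reallocations);

    // Pre-allocating does all the work up front: the capacity never changes.
    let mut w = Vec::with_capacity(1000);
    let cap_before = w.capacity();
    for i in 0..1000 {
        w.push(i);
    }
    assert_eq!(w.capacity(), cap_before);
}
```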
Performance Implications
Stack Operations (Fast)
#![allow(unused)] fn main() { fn stack_test() { let x = 42; // ~1 CPU cycle (just move stack pointer) let y = x; // ~1 CPU cycle (copy 4 bytes) } // ~1 CPU cycle (move stack pointer back) }
Cost: ~3 CPU cycles
Heap Operations (Slow)
#![allow(unused)] fn main() { fn heap_test() { let x = Box::new(42); // ~100 CPU cycles (call allocator) let y = x; // ~1 CPU cycle (copy 8-byte pointer) } // ~100 CPU cycles (call deallocator) }
Cost: ~200 CPU cycles
100x slower! But remember:
- This is microseconds, not seconds
- Sometimes you need the heap (dynamic size, large data, shared ownership)
- The real cost is in many allocations, not just one
Optimization Tips
- Pre-allocate collections:
#![allow(unused)] fn main() { // Bad: multiple allocations let mut v = Vec::new(); for i in 0..1000 { v.push(i); } // Good: one allocation let mut v = Vec::with_capacity(1000); for i in 0..1000 { v.push(i); } }
- Use `&str` instead of `String` when possible:
#![allow(unused)] fn main() { // Bad: allocates on heap fn greet(name: String) { println!("Hello, {}", name); } // Good: no allocation fn greet(name: &str) { println!("Hello, {}", name); } }
- Use `[T; N]` instead of `Vec<T>` for fixed-size data:
#![allow(unused)] fn main() { // Bad: heap allocation let v = vec![0; 10]; // Good: stack allocation let arr = [0; 10]; }
- Avoid cloning when borrowing works:
#![allow(unused)] fn main() { // Bad: clones the string (heap allocation) fn process(s: String) { println!("{}", s); } let s = String::from("hello"); process(s.clone()); // Good: borrows (no allocation) fn process(s: &str) { println!("{}", s); } process(&s); }
Memory Leaks
Rust prevents many memory leaks, but they're still possible:
Safe Memory Leak (Reference Cycles)
#![allow(unused)] fn main() { use std::rc::Rc; use std::cell::RefCell; struct Node { next: Option<Rc<RefCell<Node>>>, } let a = Rc::new(RefCell::new(Node { next: None })); let b = Rc::new(RefCell::new(Node { next: Some(Rc::clone(&a)) })); a.borrow_mut().next = Some(Rc::clone(&b)); // Memory leak! a and b reference each other // Neither will ever be dropped (reference count never reaches 0) }
Solution: Use Weak references to break cycles.
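A sketch of that fix: make one direction of the link a `Weak`, which doesn't contribute to the strong count, so the cycle can be torn down. This reuses the `Node` shape from above, with a weak back-link added:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    next: Option<Rc<RefCell<Node>>>,
    prev: Option<Weak<RefCell<Node>>>, // weak back-link: no strong count bump
}

fn main() {
    let a = Rc::new(RefCell::new(Node { next: None, prev: None }));
    let b = Rc::new(RefCell::new(Node { next: None, prev: None }));

    a.borrow_mut().next = Some(Rc::clone(&b));     // a -> b (strong)
    b.borrow_mut().prev = Some(Rc::downgrade(&a)); // b -> a (weak)

    assert_eq!(Rc::strong_count(&a), 1); // only the local binding holds a strongly
    assert_eq!(Rc::strong_count(&b), 2); // local binding + a.next

    // The weak link is usable, but must be upgraded first:
    assert!(b.borrow().prev.as_ref().unwrap().upgrade().is_some()); // a is alive

    drop(a); // a's strong count hits 0 -> a is freed (no leak)
    assert!(b.borrow().prev.as_ref().unwrap().upgrade().is_none()); // target is gone
}
```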
Intentional Memory Leak
#![allow(unused)] fn main() { let s = String::from("hello"); let leaked: &'static str = Box::leak(Box::new(s)); // s is now leaked - memory will never be freed // But we got a 'static reference! }
Use case: Creating 'static data at runtime (rare).
Key Takeaways
- Stack is automatic - variables disappear when out of scope
- Heap is manual - you allocate/deallocate (Rust automates via `Drop`)
- Stack is fast - just move a pointer
- Heap is flexible - dynamic size, outlives scope
- String/Vec/Box are smart pointers - metadata on stack, data on heap
- Static data lives forever - loaded at program start
- Use stack by default - only heap allocate when necessary
- Pre-allocate when possible - avoid repeated reallocations
Further Reading
- cheats.rs/#memory-layout - Visual memory layouts for Rust types
- The Rustonomicon: Memory layout and representation
- Rust Performance Book: Memory allocation strategies
- Operating Systems textbooks: Virtual memory, process address space
Appendix: Sized - Understanding Compile-Time Size
This document covers Rust's Sized trait and dynamically-sized types (DSTs) - one of Rust's most invisible yet fundamental features.
The Invisible Trait You Use Everywhere
Pop quiz: How many times do you think about whether a type has a known size at compile time?
If you're like most Rust programmers: never. And that's by design.
But here's the thing - Sized is probably the most commonly used trait in Rust. It's on almost every generic type parameter you write:
#![allow(unused)] fn main() { // What you write: fn process<T>(value: T) { } // What the compiler sees: fn process<T: Sized>(value: T) { } // ^^^^^^ Invisible implicit bound! }
That Sized bound is automatically added to every generic type parameter unless you explicitly opt out. It's so common that Rust makes it implicit to reduce noise.
What is "Size"?
When we say a type has a "size," we mean: how many bytes of memory does a value of this type occupy?
#![allow(unused)] fn main() { // These types have known sizes: let x: i32 = 42; // 4 bytes let s: String = String::new(); // 24 bytes (ptr + len + cap) let arr: [u8; 10] = [0; 10]; // 10 bytes println!("{}", std::mem::size_of::<i32>()); // 4 println!("{}", std::mem::size_of::<String>()); // 24 println!("{}", std::mem::size_of::<[u8; 10]>()); // 10 }
Important: For types like String and Vec, the "size" is the size of the stack-allocated metadata (pointer, length, capacity), not the heap data they point to.
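A quick check that the "size" of a `String` never depends on its contents - only its heap data grows (the 24-byte figure assumes a 64-bit target):

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    let short = String::from("hi");
    let long = String::from("a much, much longer string than the other one");

    // Both values occupy the same stack metadata (ptr + len + cap):
    assert_eq!(size_of_val(&short), size_of::<String>());
    assert_eq!(size_of_val(&short), size_of_val(&long));

    // The difference lives on the heap, reported by len(), not size_of:
    assert_eq!(short.len(), 2);
    assert!(long.len() > short.len());
}
```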
Why Does the Compiler Need to Know Sizes?
The compiler needs to know sizes for several reasons:
1. Stack Allocation
When you declare a local variable, the compiler needs to reserve space on the stack:
#![allow(unused)] fn main() { fn example() { let x: i32; // Compiler reserves 4 bytes on stack let y: String; // Compiler reserves 24 bytes on stack let z: [u8; 100]; // Compiler reserves 100 bytes on stack } }
The compiler generates assembly code like:
sub rsp, 128 ; Reserve 128 bytes on stack (4 + 24 + 100)
If the compiler doesn't know the size, it can't reserve the right amount of space!
2. Passing by Value
When you pass a value to a function, the compiler needs to copy it:
#![allow(unused)] fn main() { fn take_value(value: String) { // Copies 24 bytes // ... } let s = String::from("hello"); take_value(s); // memcpy 24 bytes from s into the function's stack frame }
Without knowing the size, the compiler wouldn't know how many bytes to copy.
3. Struct Layout
When you define a struct, the compiler calculates its size based on its fields:
#![allow(unused)] fn main() { struct Point { x: f64, // 8 bytes y: f64, // 8 bytes } // Total: 16 bytes (plus potential padding) println!("{}", std::mem::size_of::<Point>()); // 16 }
If one of the fields had an unknown size, the compiler couldn't calculate the total size.
The Sized Trait
#![allow(unused)] fn main() { pub trait Sized { // This trait has no methods - it's a marker trait } }
Sized is a marker trait - it has no methods, it just marks types that have a known size at compile time.
Automatic Implementation
Unlike most traits, you never implement Sized manually. The compiler automatically implements it for types it can determine the size of:
#![allow(unused)] fn main() { // Compiler automatically implements Sized for these: impl Sized for i32 { } impl Sized for String { } impl Sized for [u8; 10] { } impl<T> Sized for Vec<T> { } impl<T> Sized for Box<T> { } // Box itself is sized (it's just a pointer) }
The Implicit Bound
Here's where things get interesting. Every generic type parameter has an implicit Sized bound:
#![allow(unused)] fn main() { // What you write: fn process<T>(value: T) { } // What the compiler actually sees: fn process<T: Sized>(value: T) { } }
This happens because:
- The function takes `value: T` by value (copies it onto the stack)
- To copy it, the compiler needs to know its size
- So `T` must be `Sized`
Same with struct fields:
#![allow(unused)] fn main() { // What you write: struct Container<T> { value: T, } // What the compiler sees: struct Container<T: Sized> { value: T, } }
The struct needs to know how big T is to calculate its own size!
Dynamically Sized Types (DSTs)
Now we get to the interesting part: types that DON'T have a known size at compile time.
These are called Dynamically Sized Types (DSTs), and there are exactly three kinds in Rust:
1. Slices ([T])
A slice is just the items, without any metadata:
#![allow(unused)] fn main() { // This doesn't compile: // let slice: [i32] = ???; // ❌ How big is this? 1 element? 10? 1000? // Slices must always be behind a pointer: let slice: &[i32] = &[1, 2, 3]; // ✅ Reference to slice let boxed: Box<[i32]> = Box::new([1, 2, 3]); // ✅ Box containing slice }
Why is [i32] unsized but [i32; 3] is sized?
- `[i32; 3]` = exactly 3 integers = 12 bytes (always!)
- `[i32]` = some number of integers = ??? bytes (depends on runtime data)
The number of elements is part of the type for arrays but not for slices!
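Because the length is part of an array's type but not a slice's, `size_of` works for arrays, while slices need `size_of_val` on an actual value (where the fat pointer supplies the length at runtime):

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    // Array: the length is in the type, so the size is known at compile time.
    assert_eq!(size_of::<[i32; 3]>(), 12);
    assert_eq!(size_of::<[i32; 100]>(), 400);

    // Slice: the length is only known at runtime, via the fat pointer.
    let data = [1, 2, 3, 4, 5];
    let slice: &[i32] = &data[1..4]; // three elements
    assert_eq!(size_of_val(slice), 12); // 3 * 4 bytes of pointed-to data
    assert_eq!(slice.len(), 3);         // the length half of the fat pointer
}
```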
2. String Slices (str)
Same as slices, but for strings:
#![allow(unused)] fn main() { // This doesn't compile: // let s: str = "hello"; // ❌ How many bytes? Depends on the string! // Must be behind a pointer: let s: &str = "hello"; // ✅ Reference to str let boxed: Box<str> = "hello".into(); // ✅ Box containing str }
String is sized (24 bytes of metadata), but str is unsized (the actual text data).
3. Trait Objects (dyn Trait)
When you use dyn Trait, the actual type is unknown at compile time:
#![allow(unused)] fn main() { trait Animal { fn speak(&self); } struct Dog; impl Animal for Dog { fn speak(&self) { println!("Woof!"); } } struct Cat; impl Animal for Cat { fn speak(&self) { println!("Meow!"); } } // This doesn't compile: // let animal: dyn Animal = Dog; // ❌ Is it a Dog? Cat? How big? // Must be behind a pointer: let animal: &dyn Animal = &Dog; // ✅ Reference to trait object let boxed: Box<dyn Animal> = Box::new(Cat); // ✅ Box containing trait object }
The compiler doesn't know if animal is a Dog (size X) or Cat (size Y), so it can't determine the size of dyn Animal.
Fat Pointers: How References to DSTs Work
Here's a crucial insight: references to unsized types are twice as large as normal pointers!
#![allow(unused)] fn main() { println!("{}", std::mem::size_of::<&i32>()); // 8 bytes (on 64-bit) println!("{}", std::mem::size_of::<&[i32]>()); // 16 bytes! (pointer + length) println!("{}", std::mem::size_of::<&str>()); // 16 bytes! (pointer + length) println!("{}", std::mem::size_of::<&dyn Animal>()); // 16 bytes! (pointer + vtable) }
A reference to a DST is called a fat pointer because it contains extra metadata:
Fat Pointer Layout
For slices (&[T]) and string slices (&str):
Example:
#![allow(unused)] fn main() { let data = [1, 2, 3, 4, 5]; let slice: &[i32] = &data[1..4]; // [2, 3, 4] // Fat pointer contains: // - Pointer to data[1] (the start of the slice) // - Length: 3 (number of elements) }
For trait objects (&dyn Trait):
Example:
#![allow(unused)] fn main() { let dog = Dog; let animal: &dyn Animal = &dog; // Fat pointer contains: // - Pointer to dog // - Pointer to vtable for Dog's Animal impl }
The vtable is a table of function pointers for the trait's methods. This is how Rust does dynamic dispatch - looking up which method to call at runtime.
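Dynamic dispatch is what makes heterogeneous collections possible - the vtable pointer in each fat pointer selects the right method at runtime. A sketch (the trait returns a `String` here so the result is easy to check, unlike the printing version above):

```rust
trait Animal {
    fn speak(&self) -> String;
}

struct Dog;
impl Animal for Dog {
    fn speak(&self) -> String { "Woof!".to_string() }
}

struct Cat;
impl Animal for Cat {
    fn speak(&self) -> String { "Meow!".to_string() }
}

fn main() {
    // One Vec, two different concrete types - only possible behind pointers:
    let animals: Vec<Box<dyn Animal>> = vec![Box::new(Dog), Box::new(Cat)];

    // Each call goes through that element's vtable:
    let sounds: Vec<String> = animals.iter().map(|a| a.speak()).collect();
    assert_eq!(sounds, vec!["Woof!", "Meow!"]);
}
```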
The ?Sized Bound: Opting Out
Sometimes you want to write code that works with both sized and unsized types. That's where ?Sized comes in:
#![allow(unused)] fn main() { // T must be Sized (implicit): fn only_sized<T>(value: &T) { } // T can be unsized: fn sized_or_unsized<T: ?Sized>(value: &T) { } // ^^^^^^^ Opt out of the Sized requirement }
The ?Sized syntax means: "T may or may not be Sized" - it's a question mark about whether the Sized bound applies.
When You Need ?Sized
You need ?Sized when you're working with references or pointers to potentially unsized types:
#![allow(unused)] fn main() { // This works with both &i32 and &[i32]: fn print_len<T: ?Sized>(value: &T) { println!("Size of &T: {}", std::mem::size_of_val(&value)); } let x = 42; print_len(&x); // &i32 (thin pointer, 8 bytes) let slice = [1, 2, 3]; print_len(&slice[..]); // &[i32] (fat pointer, 16 bytes) }
When You DON'T Need ?Sized
If your function takes T by value, moves it, or stores it in a struct, you almost certainly need Sized:
#![allow(unused)] fn main() { // Can't work with unsized types: fn take_by_value<T>(value: T) { } // Needs to know size to copy // ^^^^ Implicit Sized bound // let slice: [i32] = [1, 2, 3]; // take_by_value(slice); // ❌ ERROR: size cannot be known at compile time }
Common Confusion Points
Confusion #1: "&[T] is Sized, but [T] is Not"
This trips up everyone at first:
#![allow(unused)] fn main() { println!("{}", std::mem::size_of::<[i32]>()); // ❌ ERROR: size cannot be known println!("{}", std::mem::size_of::<&[i32]>()); // ✅ OK: 16 bytes (fat pointer) }
Why?
- `[i32]` is the slice itself (unsized - could be any length)
- `&[i32]` is a reference to a slice (sized - always 16 bytes on 64-bit: pointer + length)
The reference has a known size even though the thing it points to doesn't!
Confusion #2: "String is Sized, but str is Not"
#![allow(unused)] fn main() { println!("{}", std::mem::size_of::<String>()); // ✅ 24 bytes println!("{}", std::mem::size_of::<str>()); // ❌ ERROR println!("{}", std::mem::size_of::<&str>()); // ✅ 16 bytes }
Why?
- `String` is a struct with three fields (ptr, len, cap) - always 24 bytes
- `str` is the actual text data - variable length
- `&str` is a fat pointer to text data - always 16 bytes
Confusion #3: "Box is Sized, but Box<[T]> Also Exists"
#![allow(unused)] fn main() { println!("{}", std::mem::size_of::<Box<i32>>()); // 8 bytes (thin pointer) println!("{}", std::mem::size_of::<Box<[i32]>>()); // 16 bytes (fat pointer) println!("{}", std::mem::size_of::<Box<dyn Trait>>()); // 16 bytes (fat pointer) }
Why?
- `Box<T>` is just a pointer (8 bytes) when `T` is sized
- `Box<[T]>` is a fat pointer (16 bytes: ptr + length)
- `Box<dyn Trait>` is a fat pointer (16 bytes: ptr + vtable)
The Box itself is always sized - it's just a pointer! But the pointer can be thin or fat depending on what it points to.
Why Box, Cell, and RefCell Use ?Sized
This is why the stdlib splits implementations:
#![allow(unused)] fn main() { // Methods that need to move T - require Sized: impl<T> Box<T> { fn new(value: T) -> Box<T> { } // Takes ownership of T (needs size) fn into_inner(self) -> T { } // Returns owned T (needs size) } // Methods that only need references - work with ?Sized: impl<T: ?Sized> Box<T> { fn as_ref(&self) -> &T { } // Just returns a reference fn as_mut(&mut self) -> &mut T { } // Just returns a reference fn from_raw(ptr: *mut T) -> Box<T> { } // Just stores a pointer } }
Same pattern for Cell and RefCell:
#![allow(unused)] fn main() { // Methods that move T: impl<T> Cell<T> { fn new(value: T) -> Cell<T> { } // Takes ownership fn set(&self, value: T) { } // Takes ownership } // Methods that only use references: impl<T: ?Sized> Cell<T> { fn as_ptr(&self) -> *mut T { } // Just returns a pointer fn get_mut(&mut self) -> &mut T { } // Just returns a reference } }
This lets you use Cell<[i32]> even though [i32] is unsized!
```rust
use std::cell::Cell;

fn main() {
    let slice: &mut [i32] = &mut [1, 2, 3];
    let cell: &Cell<[i32]> = Cell::from_mut(slice);
    let _ptr = cell.as_ptr(); // ✅ Works! Returns *mut [i32]
}
```
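In practice, the more useful companion to `Cell::from_mut` is `Cell::as_slice_of_cells`, which turns a `&Cell<[T]>` into `&[Cell<T>]` so each element can be mutated through a shared reference:

```rust
use std::cell::Cell;

fn main() {
    let data: &mut [i32] = &mut [1, 2, 3];
    // &Cell<[i32]> -> &[Cell<i32>]: each element is independently settable
    let cells: &[Cell<i32>] = Cell::from_mut(data).as_slice_of_cells();
    cells[0].set(10);
    cells[2].set(cells[0].get() + cells[1].get());
    println!("{:?}", data); // [10, 2, 12]
}
```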
Practical Examples
Example 1: Generic Functions with Slices
Without ?Sized, this function only works with fixed-size arrays:
```rust
// Only works with &[T; N]:
fn process<T>(slice: &T) {
    println!("Size: {}", std::mem::size_of_val(slice));
}

fn main() {
    process(&[1, 2, 3]);        // ✅ &[i32; 3]
    // process(&[1, 2, 3][..]); // ❌ &[i32] is unsized!
}
```
With ?Sized, it works with both:
```rust
// Works with both &[T; N] and &[T]:
fn process<T: ?Sized>(slice: &T) {
    println!("Size: {}", std::mem::size_of_val(slice));
}

fn main() {
    process(&[1, 2, 3]);     // ✅ &[i32; 3]
    process(&[1, 2, 3][..]); // ✅ &[i32]
}
```
Example 2: Trait Objects
```rust
trait Animal {
    fn speak(&self);
}

struct Dog;
impl Animal for Dog {
    fn speak(&self) { println!("Woof!"); }
}

// Without ?Sized - only works with concrete types:
fn make_speak<T>(animal: &T) where T: Animal {
    animal.speak();
}

// With ?Sized - works with trait objects too:
fn make_speak_dyn<T: ?Sized>(animal: &T) where T: Animal {
    animal.speak();
}

fn main() {
    let dog = Dog;
    make_speak(&dog); // ✅ Works with &Dog

    let animal: &dyn Animal = &dog;
    // make_speak(animal); // ❌ ERROR: dyn Animal is unsized!

    make_speak_dyn(&dog);   // ✅ Works with &Dog
    make_speak_dyn(animal); // ✅ Works with &dyn Animal
}
```
Example 3: Smart Pointers
Why can you do Box<dyn Trait>? Because Box uses ?Sized:
```rust
trait Animal { fn speak(&self); }

struct Dog;
impl Animal for Dog {
    fn speak(&self) { println!("Woof!"); }
}

fn main() {
    // This works:
    let boxed: Box<dyn Animal> = Box::new(Dog);
    boxed.speak();
}

// Because Box has:
// impl<T: ?Sized> Box<T> {
//     // Methods that work with unsized types
// }
```
Without ?Sized, you couldn't have Box<[i32]>, Box<str>, or Box<dyn Trait> - huge limitations!
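All three of those owned-unsized types are easy to construct through the standard `From`/`Into` conversions:

```rust
fn main() {
    let s: Box<str> = "hello".into();          // From<&str> for Box<str>
    let xs: Box<[i32]> = vec![1, 2, 3].into(); // From<Vec<T>> for Box<[T]>
    let f: Box<dyn Fn(i32) -> i32> = Box::new(|x| x + 1);

    assert_eq!(&*s, "hello");
    assert_eq!(xs.len(), 3);
    assert_eq!(f(41), 42);
}
```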
Advanced: Zero-Sized Types (ZSTs)
While we're talking about sizes, let's mention the opposite end: types with zero size!
```rust
use std::marker::PhantomData;

#[derive(Clone, Copy)] // Copy lets us use Empty in an array-repeat expression
struct Empty; // No fields

struct PhantomWrapper<T>(PhantomData<T>);

fn main() {
    println!("{}", std::mem::size_of::<Empty>());                  // 0 bytes!
    println!("{}", std::mem::size_of::<PhantomWrapper<String>>()); // 0 bytes!

    let _empty = Empty;
    let _array = [Empty; 1_000_000]; // Still 0 bytes!
}
```
ZSTs are completely optimized away by the compiler:
- No stack space allocation
- No memory copies
- No heap allocations
They're used for:
- Marker types (like `PhantomData`)
- The unit type `()`
- Empty enums for state machines
- Closures that capture nothing
ZSTs are still Sized! Their size is known (it's zero), so they don't need ?Sized.
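Here is a sketch of one common use, the typestate pattern, where a ZST marker encodes state in the type system at zero runtime cost (the `Door` type is invented for illustration):

```rust
use std::marker::PhantomData;

// ZST state markers: Door<Open> and Door<Closed> are distinct types,
// but both are zero bytes, so the state tracking is free at runtime.
struct Open;
struct Closed;

struct Door<State> {
    _state: PhantomData<State>,
}

impl Door<Closed> {
    // Consuming self means a closed door can't be "opened twice".
    fn open(self) -> Door<Open> {
        Door { _state: PhantomData }
    }
}

fn main() {
    let door: Door<Closed> = Door { _state: PhantomData };
    let _open: Door<Open> = door.open();
    assert_eq!(std::mem::size_of::<Door<Open>>(), 0); // still a ZST
}
```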
Advanced: Custom DSTs
You can create your own DSTs using the #[repr(C)] attribute and a slice as the last field:
```rust
#[repr(C)]
struct CustomSlice<T> {
    len: usize,
    data: [T], // Unsized field must be last!
}

// Can only be used behind a pointer:
// let ptr: *const CustomSlice<i32> = ...;
// let reference: &CustomSlice<i32> = ...;
```
This is how types like std::path::Path work - Path is a thin newtype over OsStr, which is itself essentially a wrapper around [u8] with extra guarantees.
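A minimal sketch of that newtype-over-`[u8]` pattern (the `Bytes` type here is illustrative; Path performs the equivalent cast internally):

```rust
// repr(transparent) guarantees Bytes has exactly the layout of [u8],
// which is what makes the reference cast below sound.
#[repr(transparent)]
struct Bytes([u8]);

impl Bytes {
    fn new(raw: &[u8]) -> &Bytes {
        // SAFETY: Bytes is repr(transparent) over [u8], so the fat
        // pointer (ptr + length) is valid for both types.
        unsafe { &*(raw as *const [u8] as *const Bytes) }
    }

    fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let b: &Bytes = Bytes::new(b"abc");
    assert_eq!(b.len(), 3);
}
```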
Warning: Creating custom DSTs is advanced and requires careful use of unsafe code. Most Rust programmers never need this!
When to Use ?Sized
Use ?Sized when:
- You're implementing a smart pointer (like Box, Rc, Arc)
- You're working with trait objects and want flexibility
- Your function only needs references, never moves values
- You're wrapping unsized types in a newtype
Don't use ?Sized when:
- You need to store T by value in a struct field
- You need to move T or return it by value
- You're allocating T (you need to know the size!)
- You don't understand why you'd need it (let the compiler add `Sized` implicitly)
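The "store T by value" rule has one escape hatch worth knowing: the *last* field of a struct may be unsized, at the cost of the struct becoming a DST itself. The types below are illustrative:

```rust
// Implicit T: Sized - the field is stored by value, so its size
// must be known at compile time.
struct ByValue<T> {
    value: T,
}

// Legal: an unsized last field, but LastField<[i32]> is then a DST
// and can only be used behind a pointer.
struct LastField<T: ?Sized> {
    info: u32,
    value: T,
}

fn main() {
    let v = ByValue { value: 42 };
    // Unsizing coercion: &LastField<[i32; 3]> -> &LastField<[i32]>
    let l: &LastField<[i32]> = &LastField { info: 1, value: [1, 2, 3] };

    assert_eq!(v.value, 42);
    assert_eq!(l.value.len(), 3);
}
```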
Quick Reference
| Type | Sized? | Behind Pointer | Size |
|---|---|---|---|
| `i32` | ✅ Yes | `&i32` | 4 bytes |
| `String` | ✅ Yes | `&String` | 24 bytes |
| `[i32; 3]` | ✅ Yes | `&[i32; 3]` | 12 bytes |
| `[i32]` | ❌ No | `&[i32]` | N/A (unsized) |
| `str` | ❌ No | `&str` | N/A (unsized) |
| `dyn Trait` | ❌ No | `&dyn Trait` | N/A (unsized) |
| `&T` | ✅ Yes | `&&T` | 8 bytes (thin) or 16 bytes (fat) |
| `Box<T>` | ✅ Yes | `&Box<T>` | 8 bytes (thin) or 16 bytes (fat) |
| `()` | ✅ Yes (ZST) | `&()` | 0 bytes |
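The sized rows can be verified directly (the sizes assume a 64-bit target with 8-byte pointers):

```rust
use std::mem::size_of;

fn main() {
    assert_eq!(size_of::<i32>(), 4);
    assert_eq!(size_of::<String>(), 24);
    assert_eq!(size_of::<[i32; 3]>(), 12);
    assert_eq!(size_of::<&i32>(), 8);    // thin reference
    assert_eq!(size_of::<&[i32]>(), 16); // fat reference
    assert_eq!(size_of::<()>(), 0);      // ZST
    println!("all sizes match the table");
}
```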
Key Takeaways
- `Sized` is implicit - almost all generic types have this bound automatically
- DSTs can't live on the stack - they must be behind a pointer or reference
- Fat pointers are 2x size - they contain metadata (length or vtable)
- `?Sized` opts out - allows working with both sized and unsized types
- Use `?Sized` for references - when you only need `&T`, not `T`
- Box/Cell/RefCell split impls - some methods need `Sized`, others work with `?Sized`
- ZSTs are completely free - zero runtime cost, still `Sized`
Further Reading
- RFC 1861: Clarifications to Sized bounds
- The Rustonomicon: Dynamically Sized Types chapter
- Rust Reference: Type layout and sizes
See also: Appendix Index | Closures | Dynamic Dispatch