Rustbelt, a formalization of Rust type system

Ralf Jung et al. @ POPL'18

Presented by Yanning Chen @ ProSE Seminar

Features of Rust type system

  • Ownership
  • Mutable/Shared References
  • Lifetimes
  • Interior Mutability

Goal: Well-typed Rust programs should be memory-safe.

How? Aliasing and mutation cannot ocur at the same time on any given location.

Ownership

In Rust, a type represents:

  • information on what values it can hold
  • ownership of the resources (e.g. memory)

Fact: Rust uses an affine type system, i.e. a value can be used at most once.
Consequence: no two values can own the same resource, so mutation allowed but no aliasing.

let s1 = String::from("hello");
let s2 = s1; // s1 is moved to s2, i.e. s1 is used and no longer available
println!("{}", s1); // error: use of moved value: s1

Q: What happens when a value is weakened?
A: Underlying resources deallocated!

Mutable Reference

What we don't want to gain permanent access to a value?

fn Vec::push<T>(Vec<T>, T) -> Vec<T>

let v_ = Vec::push(v, 1);   // v is no longer available

Instead, Rust uses mutable references:

fn Vec::push<T>(&mut Vec<T>, T)

Vec::push(&mut v, 1);   // v is still available

A mutable reference grants temporary exclusive access (i.e. borrowing) to the value for the duration of the function call.
Result: still, mutation allowed but no aliasing.

Shared Reference

What if we want to access a value at multiple places?
Admit aliasing!

let v = vec![1];
// Two shared references to v are created
join(|| println!("{:?}", &v), || println!("{:?}", &v));
// Still have access to v at the main thread after references are dropped
Vec::push(&mut v, 2);

Result: for memory safety, allow aliasing but no mutation.

let v = vec![1];
let r = &v;             // temporary shared reference
Vec::push(&mut v, 2);   // error: active shared reference exists
println!("{:?}", r);    // shared reference ends here

Shared Reference - Copy types

What if we want to access a value at multiple places?
Admit aliasing!

Therefore, shared references can be freely duplicated, i.e. unrestricted variables in linear logic.

In Rust, unrestricted types are called Copy types.

Semantically, every type that can be duplicated via bit-wise copy is a Copy type.

  • &T yes, because it's a shared pointer. Int yes, because it's a number.

  • &mut T no, because it also holds exclusive access. Vec<Int> no, because it's a pointer to an heap array, and a bit-wise copy doesn't duplicate the underlying data.

Lifetimes

  • Ownership: exclusive access, mutation
  • Mutable Reference: temporary exclusive access, mutation
  • Shared Reference: temporary shared access, aliasing

How to track if a reference is active? How long is temporary?
Answer: equip each reference with a lifetime.

&'a mut T   // mutable reference with lifetime 'a
&'b T       // shared reference with lifetime 'b

Lifetimes


  • the output of index_mut has the same lifetime as the input.
  • passing v to index_mut, we create a lifetime 'b for v and head.
  • to call push we need to create a mutable reference, whose lifetime overlaps with 'b.

Interior Mutability

Q: What if we need shared mutable state? i.e. multi-thread queue?
A: Add primitives that allow mutation through shared references, i.e. interior mutability.

Cell::set(&Cell<T>, T)
Cell::get(&Cell<T>) -> T
let c1 : &Cell<i32> = &Cell::new(1);
let c2 : &Cell<i32> = &c1;
c1.set(2);
println!("{}", c2.get()); // 2

Oops! Aliasing and mutation at the same time!

Cell is implemented using unsafe code, i.e. opting out of the type system.

Interior Mutability

If you think about it, Cell is still safe to use.

Cell::set(&Cell<T>, T)
Cell::get(&Cell<T>) -> T

Cell can only hold Copy types, and returns a copy of the value when get is called.

No way to alias the inner data semantically!

Formalization of Rust: Challenges

  • Complex language: imperative, traits, ...
  • Unsafe types: opting out of syntactic typing rules

Challenge: complex language

Solution: work on a subset of Rust intermediate representation called .

w: auto

Type system of

Observation: local variables of Rust are also addressable.

Simplification: treat local variables as heap-allocated, i.e. pointer types.

  • Primitives: bool, int
  • Pointers:
    1. : pointer with full ownership of an allocation containing a value of type
    2. : mutable/shared reference with lifetime to a value of type
  • Other types: , , , , ...

Note: Types of local variables of Rust programs are all pointer types.

Not describing in detail due to time limit.

Challenge: unsafe types

Unsafe types opts out of typing rules, so no way to prove safety from the rules!

Solution: take the semantic approach.

  • syntactic typing: terms the typing rules allow to be of type
  • semantic typing: terms that are safe to be treated as type

Semantic typing

What is a type? a certain set of values, or, a predicate on values.

Example: in lambda calculus with booleans,

  • .
  • .

Challenges to model Rust type system

  • How to describe ownership?
  • How to describe temporary access?
  • How to deal with interior mutability?

Challenge: How to describe ownership?

What is a type? a certain set of values, or, a predicate on values.

What predicate? using which logic?

Separation Logic!

Separation Logic 101

A logic that describes a heap.

  • : empty heap
  • : heap with a single cell at address containing value
  • : heap that can be split into two parts, one satisfying and the other satisfying (like conjunction, but disjoint)
  • : heap that, disjointly combined with another heap satisfying , satisfies (like implication, but disjoint)

Separation Logic 101

Separation logic is a substructural logic.

Example: Consider the following heap: .
holds, but does not. Thus, no contraction.

Also, after an implication is applied to a value, the value is consumed.

Separation Logic 101

A logic that describes a heap.

Separation logic is a substructural logic.

  • Rust types: a type represents ownership of a resource, and the type system is affine.
  • Separation logic: a predicate represents a resource, and the logic is affine.

Perfect logic to describe Rust types!

P.S. 🤓👆 Not every separation logic is affine, but the one used in Rustbelt, i.e. Iris, is.

Interpreting Rust types: primitives

Associate every type to an Iris (separation logic) predicate on values.

(Ignore why we name it "own" for now, will be explained later.)

Notice: is separating conjunction, meaning its two oprands are disjoint in memory.

Interpreting Rust types: Copy types

Recall: types that can be freely duplicated via bit-wise copy are Copy types.
Consequence: given , we can freely duplicate the proposition, recovering contraction rule on the type.

Proposition that can be freely copied (i.e. ) is called a persistent proposition.

Therefore, the interpretation of Copy types can always be written as:

, where is a persistent proposition.

E.g. for , , which is trivially persistent because it's not describing any resource.

Interpreting Rust types: owned pointers

Associate every type to an Iris (separation logic) predicate on values.

𝓁𝓁𝓁

  • 𝓁𝓁: contains a single address 𝓁.
  • 𝓁: heap at address 𝓁 contains value .
  • : can be seen as a value of type .

Notice: is separating conjunction, meaning location 𝓁 and memory region representing are disjoint.

† Interpreting Rust types: owned pointers (for Copy types)

𝓁𝓁𝓁

Recall: types that can be duplicated via bit-wise copy are Copy types.

Try to duplicate :

  • 𝓁𝓁: can always find another address 𝓁. (assume no allocation failure)
  • 𝓁: let up to bit-wise copy.
  • : holds because can be duplicated by bit-wise copy.

Property: for any Copy type , predicate can be freely duplicated.

Interpreting Rust types: mutable references

What's the difference between mutable references and owned pointers?

  • owned pointers: ownership for an unlimited time
  • mutable references: ownership for a limited period of time

Challenge: how to describe temporary ownership?

Recall how we tracked references in Rust type system: lifetimes.

Solution: lifetime logic.

Full borrow predicate

: separation assertion representing ownership of some resource
: assertion representing ownership of during lifetime

Intuition: holds only when is active.

We'll head back to the precise definition of lifetime logic later.

Interpreting Rust types: mutable references

: mutable reference with lifetime to a value of type

Meaning: ownership of a value of type for the duration of lifetime .

  • 𝓁𝓁𝓁

  • 𝓁𝓁𝓁

Interpreting Rust types: shared references

Question: what can we say about shared references universally?

  1. they are pointers
  2. they are Copy types, i.e. can be freely duplicated
  3. they can be created by downgrading a mutable reference
  4. for Copy , we can bit-wise copy the value it points to and get a new

Not so interesting! Is that true?

Interior mutability!

How to deal with interior mutability?

Many types have their own sharing reference behavior deviating from the universal rules!

Solution: let every type define their own sharing reference behavior, i.e. sharing predicate.

  • owned predicate : describe values that can be considered as type
  • sharing predicate 𝓁: describe a location 𝓁 and lifetime to be considered as type

Leveraging the sharing predicate to describe the behavior of shared references.

𝓁𝓁𝓁

Interpreting Rust types: shared references

Leveraging the sharing predicate to describe the behavior of shared references.

𝓁𝓁𝓁

Laws for sharing predicates:

  1. they are pointers: already satisfied by the definition of sharing predicate
  2. they are Copy types can be freely duplicated: 𝓁 must be persistent.
  3. they can be created by downgrading a mutable reference: 𝓁𝓁

is a token that asserts the lifetime is active, and we'll talk about it later.

Interpreting Rust types: shared references

  1. for Copy , we can bit-wise copy the value it points to and get a new .

Recall: for Copy types ,

, where is a persistent proposition.

Define:
𝓁𝓁

Interpreting Rust types: shared references

Define: for Copy types ,
𝓁𝓁

Recall: for mutable references,
𝓁𝓁𝓁

Intuition: † fractured borrow also represents ownership during lifetime , but:

  • is persistent, because it represents a shared borrow, while full borrow is not
  • only grants a fraction of its content ()

†: no need to understand the details. Just treat them as full borrows.

Lifetime logic

Things we used but not defined yet:

  • Full borrow : assertion representing ownership of during lifetime
  • Fractured borrow : assertion representing ownership of during lifetime , but only grants a fraction of its content
  • Lifetime token : token that asserts the lifetime is active

Lifetime logic

let mut v = Vec::new();
v.push(0);
{ // <- Vec<i32>
  let mut head = v.index_mut(0);
  *head = 23;
}
println!("{:?}", v);

given that

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32

Lifetime logic

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32
{ 
  let mut head = v.index_mut(0); // <- Vec<i32>
  • need to provide 'a
  • need to pass a mutable reference of lifetime 'a

Lifetime logic

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32
{ 
  let mut head = v.index_mut(0); // <- Vec<i32> * [κ] * ([κ] -* [†κ])
  • need to provide 'a

    :

    Can always

    • create a lifetime token , accompanied by
    • a way to end it . ( is a token that asserts the lifetime has ended.)

Lifetime logic

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32
{ 
  let mut head = v.index_mut(0); // <- &κ mut Vec<i32> * [κ] * ([†κ] -* Vec<i32>) * ([κ] -* [†κ])
  • need to provide 'a (done)

  • need to pass a mutable reference of lifetime 'a

    :

    Given an owned resource , can split it into

    • a full borrow , and
    • an inheritance that can retrieve back after dies.

Lifetime logic: separating conjunction

:
:

  • Sep logic : heap that can be split into two disjoint parts, one satisfying and the other satisfying
  • Lifetime logic : time that can be split into two disjoint parts, one satisfying (when is alive) and the other satisfying (when is dead)

Lifetime logic: frame rule

It's important for and to be disjoint.

Consider and .

But for ,

What if and describes some shared resource, and while , modifies something that invalidates ?

Lifetime logic: separating conjunction

:
:

Whatever we do about , we can always get back the inheritance.

Lifetime logic

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32
{ 
  let mut head = v.index_mut(0); // <- inside `index_mut`
  1. split input &κ Vec<i32> into the accessed &κ i32 and the rest &κ Vec<i32>
  2. return &k i32 to the caller, and drop the rest

Lifetime logic

index_mut: for<'a> fn(&'a mut Vec<i32>, usize) -> &'a mut i32
{ 
  let mut head = v.index_mut(0); // <- inside `index_mut`
  1. split input &κ Vec<i32> into the accessed &κ i32 and the rest &κ Vec<i32>
    :
  2. return &κ i32 to the caller, and drop the rest
    , because Iris is an affine logic
{ 
  let mut head = v.index_mut(0); // <- &κ mut i32 * [κ] * ([†κ] -* Vec<i32>) * ([κ] -* [†κ])

Lifetime logic

  let mut head = v.index_mut(0);
  *head = 23;   // <- i32 * (i32 -* &κ mut i32 * [κ]) * ([†κ] -* Vec<i32>) * ([κ] -* [†κ])

Need to access the resource of mutable reference head.

:

Given a full borrow and a witness that shows is active,

  • can access the resource , accompanied by
  • an inheritance that can retrieve mutable reference and lifetime token back after the access

It's important to return things you borrowed!: lifetime token is such a certificate.

Lifetime logic

  *head = 23;   // <- &κ mut i32 * [κ] * ([†κ] -* Vec<i32>) * ([κ] -* [†κ])
}
  *head = 23;
}               // <- [κ] * ([†κ] -* Vec<i32>) * ([κ] -* [†κ])
  *head = 23;
}               // <- ([†κ] -* Vec<i32>) * [†κ]
  *head = 23;
}               // <- Vec<i32>

† Lifetime logic

Fractured borrow vs Full borrow

  • Fractured borrows are persistent: can be accessed simultaneously by multiple parties (freely duplicatable), but do not have full access, i.e. only a fraction of the resource.
  • It's always possible to take a little bit of a resource from a Fractured borrow, no matter how many times it's been borrowed.

Intuition:

  • from a full borrow with full lifetime , by downgrading it to a fractured borrow, we can get a fraction of it, thus getting fractional lifetime , e.g. , which is shorter than .
  • The semantics guarantees that we can always get a tiny bit of resource of lifetime from a fractured borrow.

Proof of soundness

Typing judgments are defined as

  • lifetime context
  • type context
  • instruction

After the instruction, the type context is updated to with new variable added.

Proof of soundness

Interpretation of typing judgments:

  • Interpreted as a separation logic triple
  • uses interpretation of types described earlier

Proof of soundness

  1. FTLR (Foundamental Theorem of Logical Relations):

    Syntactic typing rules are sound w.r.t. semantic typing rules.

  2. Adequacy: a semantically well-typed program never gets stuck (no invalid memory access or data race).

Collary: every rust program that consists of syntactically well-typed safe code and semantically well-typed unsafe code, is safe to execute.

Conclusion

  • Rust type system: ownership, mutable/shared references, lifetime, interior mutability
  • Formalization: , , . Unsafe types? Semantic typing!
  • Semantic typing:
    • Separation logic
    • , 𝓁 (for interior mutability)
    • , , ? Lifetime logic!
  • Lifetime logic by example
    • Fractured borrow: persistent + fractional (inclusion) lifetime
  • Soundness proof:
    • Judgment interpreted as separation logic triple
    • FTLR (syntactic -> semantic) + Adequacy (semantic -> runtime)

Appendix: Lifetime logic meets Interior Mutability

Example: Mutex is a product of flag (true: locked, false: unlocked) and the resource.

𝓁
𝓁
𝓁𝓁