[Rust] Getting started with the 'Book' (#01-#06: concepts and syntax).

Jan 1, 2023 21:10 · 3523 words · 17 minute read

https://doc.rust-lang.org/book/

environment

with VSCode devcontainer:

❯ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.5 LTS
Release:        20.04
Codename:       focal

installing rustup

info: downloading installer

Welcome to Rust!

This will download and install the official compiler for the Rust
programming language, and its package manager, Cargo.

Rustup metadata and toolchains will be installed into the Rustup
home directory, located at:

  /home/vscode/.rustup

This can be modified with the RUSTUP_HOME environment variable.

The Cargo home directory is located at:

  /home/vscode/.cargo

This can be modified with the CARGO_HOME environment variable.

The cargo, rustc, rustup and other commands will be added to
Cargo's bin directory, located at:

  /home/vscode/.cargo/bin

This path will then be added to your PATH environment variable by
modifying the profile files located at:

  /home/vscode/.profile
  /home/vscode/.bashrc
  /home/vscode/.zshenv

You can uninstall at any time with rustup self uninstall and
these changes will be reverted.

Current installation options:


   default host triple: aarch64-unknown-linux-gnu
     default toolchain: stable (default)
               profile: default
  modify PATH variable: yes

1) Proceed with installation (default)
2) Customize installation
3) Cancel installation
>1

info: profile set to 'default'
info: default host triple is aarch64-unknown-linux-gnu
info: syncing channel updates for 'stable-aarch64-unknown-linux-gnu'
info: latest update on 2022-12-15, rust version 1.66.0 (69f9c33d7 2022-12-12)
info: downloading component 'cargo'
  6.3 MiB /   6.3 MiB (100 %)   5.0 MiB/s in  1s ETA:  0s
info: downloading component 'clippy'
info: downloading component 'rust-docs'
 19.1 MiB /  19.1 MiB (100 %)   6.2 MiB/s in  3s ETA:  0s
info: downloading component 'rust-std'
 39.8 MiB /  39.8 MiB (100 %)   4.2 MiB/s in  8s ETA:  0s
info: downloading component 'rustc'
 81.1 MiB /  81.1 MiB (100 %)   5.5 MiB/s in 13s ETA:  0s
info: downloading component 'rustfmt'
  3.9 MiB /   3.9 MiB (100 %)   3.5 MiB/s in  1s ETA:  0s
info: installing component 'cargo'
info: installing component 'clippy'
info: installing component 'rust-docs'
 19.1 MiB /  19.1 MiB (100 %)   8.6 MiB/s in  1s ETA:  0s
info: installing component 'rust-std'
 39.8 MiB /  39.8 MiB (100 %)  15.0 MiB/s in  2s ETA:  0s
info: installing component 'rustc'
 81.1 MiB /  81.1 MiB (100 %)  19.7 MiB/s in  4s ETA:  0s
info: installing component 'rustfmt'
info: default toolchain set to 'stable-aarch64-unknown-linux-gnu'

  stable-aarch64-unknown-linux-gnu installed - rustc 1.66.0 (69f9c33d7 2022-12-12)


Rust is installed now. Great!

To get started you may need to restart your current shell.
This would reload your PATH environment variable to include
Cargo's bin directory ($HOME/.cargo/bin).

To configure your current shell, run:
source "$HOME/.cargo/env"

Also install the build-essential package (rustc needs a linker, which a C compiler toolchain provides).

❯ sudo apt install build-essential

Check that Rust was installed correctly.

❯ rustc --version
rustc 1.66.0 (69f9c33d7 2022-12-12)

Hello, World!

Write the following in main.rs, then compile and run it.

fn main() {
    println!("Hello, world!");
}

❯ rustc main.rs
❯ ./main
Hello, world!

Like Go, the main function is the entry point.
Indent with 4 spaces, not tabs (rustfmt is available as a formatter).
println! calls a Rust macro, not a function. Without the !, it would be an ordinary function call.

Hello, Cargo!

Cargo is Rust’s build system and package manager.
Create a new project like this:

cargo new <project name>

Cargo.toml is generated in the project root.

[package]
name = "hello_world"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

Appendix E: editions
The edition key indicates the Rust edition; the Rust team produces a new edition every two or three years.
Rust itself ships on a six-week release cycle, and each release brings only small changes.
For those who don’t update that often, however, the changes between two distant versions add up to a lot.

An edition serves as a rallying point that bundles those changes, like Rust 2015, Rust 2018 and Rust 2021.

[dependencies] lists the packages the project depends on. In Rust, packages of code are called crates.

Cargo expects your source files to live inside the src directory, with Cargo.toml in the top-level directory.

The top-level directory is just for README, license, configuration and anything else not related to your code.

build project

$ cargo build

The executable is then created at target/debug/hello_world.
Cargo creates a debug build by default. Pass the --release flag for a release build, which compiles with optimizations and places the executable in target/release.

These are convenient commands:

  • cargo run: builds the project and runs the executable.
  • cargo check: checks that the code compiles, without producing an executable.

guessing game

  • I/O library is std::io.
  • std::prelude is loaded by default.
  • the let statement creates a variable.
    • variables are immutable by default.
    • to make a variable mutable, add mut before the variable name.
  • :: calls an associated function, as in String::new().
    • an associated function is a function that is implemented on a type.
  • read_line returns a Result.
    • Result’s variants are Ok and Err.
      • Ok indicates the operation was successful; Err indicates it failed.
    • Result has an expect method.
      • if the Result is Err, expect crashes the process and prints the message passed to it.
      • if the Result is Ok, expect just returns the value Ok is holding.
    • if expect isn’t called, the program still compiles, but the compiler warns about the unused Result.
  • for string interpolation, use {}.
  • to generate a random number, use the rand crate.
    • there are 2 types of crate:
      • binary crate: an executable.
      • library crate: source code intended to be used in other programs.
  • to add the rand crate, run cargo add rand.
    • this adds the line rand = "0.8.5" to Cargo.toml.
      • 0.8.5 is shorthand for ^0.8.5.
        • meaning, >= 0.8.5 and < 0.9.0.
  • cargo then fetches the crate from the registry, Crates.io.
  • the Cargo.lock file records which version of each crate is used.
    • to update a crate, run cargo update explicitly.
  • start..=end is a range expression that is inclusive on both bounds, so 1..=100 requests a number between 1 and 100.
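  • another way to parse into i32 and compare in one expression:
match guess
    .trim()
    .parse::<i32>()
    .expect("invalid input") // expect takes a plain &str, so {guess} would not be interpolated
    .cmp(&secret_number)
{
    Ordering::Less => println!("Too small!"),
    // --snip--
}
  • read_line keeps the trailing newline (\n) in the string, which is why trim() is called before parse().
  • handle invalid input:
let guess: i32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
  • match is an expression, so its result can be bound directly with let.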
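The input handling described in the list above can be sketched without the rand crate. This is only a minimal sketch: the helpers parse_guess and hint are hypothetical names (they are not in the Book), and the secret number is fixed instead of random.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: trims the trailing newline that `read_line` keeps
/// and tries to parse the rest as an `i32`.
fn parse_guess(input: &str) -> Option<i32> {
    match input.trim().parse::<i32>() {
        Ok(num) => Some(num),
        Err(_) => None, // invalid input becomes None instead of a panic
    }
}

/// Hypothetical helper: maps the comparison onto the game's messages.
fn hint(guess: i32, secret_number: i32) -> &'static str {
    match guess.cmp(&secret_number) {
        Ordering::Less => "Too small!",
        Ordering::Greater => "Too big!",
        Ordering::Equal => "You win!",
    }
}

fn main() {
    let secret_number = 42; // assumption: fixed here; the Book draws it with rand
    for line in ["7\n", "abc\n", "42\n"] {
        match parse_guess(line) {
            Some(guess) => println!("{guess}: {}", hint(guess, secret_number)),
            None => println!("{:?} is not a number, skipping", line.trim()),
        }
    }
}
```

Returning Option from the parsing step keeps the decision about what to do with bad input (continue, retry, abort) in the caller.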

variables

  • immutable variables vs. constants
    • const cannot be used with mut.
    • the type of a const must be annotated.
    • const can be declared in any scope, including the global scope.
  • convention: constant names are uppercase with underscores.
  • shadowing
    • it’s allowed to declare a new variable with the same name as a previous one.
      • then, “the first variable is shadowed by the second”.
    • once a variable shadows another, the name refers to the new value until the scope ends or it is shadowed again.
  • why is shadowing needed, even though Rust has mut?
    • shadowing uses let again, so the new variable can have a different type while reusing the name.
    • the variable stays immutable after the transformation.
  • overflow
    • in debug mode, integer overflow panics.
    • in release mode, it wraps around (two’s complement wrapping).
      • if a u8 reaches 256, it becomes 0.
  • char and string
    • a char literal is single-quoted, while a string literal is double-quoted.
    • char is 4 bytes.
      • it holds a Unicode scalar value: U+0000 to U+D7FF and U+E000 to U+10FFFF, inclusive.
  • tuple: let tup: (i32, f64, u8) = (500, 6.4, 1);
    • destructure a tuple: let (x, y, z) = tup;
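A few of the points above, put into a small runnable sketch. The constant MAX_POINTS and the helper count_spaces are illustrative names only; wrapping_add and checked_add are standard integer methods that make the overflow behavior explicit regardless of build mode.

```rust
const MAX_POINTS: u32 = 100_000; // a const needs a type annotation

/// Hypothetical helper: shadowing lets the same name change type,
/// which `mut` alone does not allow.
fn count_spaces() -> usize {
    let spaces = "   "; // &str
    let spaces = spaces.len(); // usize, shadows the &str binding
    spaces
}

fn main() {
    println!("MAX_POINTS = {MAX_POINTS}");
    println!("spaces = {}", count_spaces());

    // Explicit overflow handling behaves the same in debug and release builds.
    let x: u8 = 255;
    println!("wrapping: {}", x.wrapping_add(1));
    println!("checked: {:?}", x.checked_add(1));

    println!("size of char: {} bytes", std::mem::size_of::<char>());

    let tup: (i32, f64, u8) = (500, 6.4, 1);
    let (a, b, c) = tup; // destructuring
    println!("{a} {b} {c} {}", tup.0); // fields are also reachable by index
}
```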

statement and expression

  • statements are instructions that perform some action and do not return a value.
    • let x = 6;
  • expressions evaluate to a resultant value.
    • 1 + 2
    • the 6 in let x = 6;
    • calling a function or a macro.
    • a scope block created with curly brackets.
    • expressions do not end with a semicolon; adding one turns the expression into a statement.
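A short sketch of the distinction listed above; plus_one is a hypothetical helper used only for illustration.

```rust
/// Hypothetical helper: the last line has no semicolon, so it is an
/// expression and its value becomes the return value.
fn plus_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    // A block is an expression; its value is its final expression.
    let y = {
        let x = 3; // statement
        x + 1 // expression, no trailing semicolon
    };
    println!("{y} {}", plus_one(y));
}
```

Adding a semicolon after x + 1 in plus_one would turn it into a statement, and the function would no longer return an i32.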

loop

  • loop can be labeled: 'label: loop {}
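As a sketch of what a label is for: break and continue apply to the innermost loop unless a label names an outer one. The function find_pair is a hypothetical example, not from the Book.

```rust
/// Hypothetical helper: searches 1..=9 x 1..=9 for a pair whose product
/// is `target`, leaving both loops at once through the label.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut i = 1;
    'outer: loop {
        if i > 9 {
            return None;
        }
        let mut j = 1;
        loop {
            if j > 9 {
                break; // leaves only the inner loop
            }
            if i * j == target {
                break 'outer Some((i, j)); // leaves the labeled loop with a value
            }
            j += 1;
        }
        i += 1;
    }
}

fn main() {
    println!("{:?}", find_pair(12));
    println!("{:?}", find_pair(100));
}
```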

ownership

  • usually, memory is managed in one of 2 ways:
    • the programmer explicitly allocates and frees memory.
    • a garbage collector.
  • ownership is a third approach to managing memory, and it is the one Rust uses.
    • memory is managed through a system of ownership with a set of rules.
    • if any of the rules are violated, the program won’t compile.
  • unlike in many languages, in Rust whether a value lives on the stack or the heap affects how the language behaves.
  • all data stored on the stack must have a known, fixed size.
  • data whose size is unknown at compile time, or whose size might change, must be stored on the heap.
  • because the pointer to the heap has a known, fixed size, the pointer can be stored on the stack.
  • pushing to the stack is faster than allocating on the heap.
    • the allocator never has to search for a place to store new data.
    • location is always at the top of the stack.
  • once you understand ownership, you don’t need to think about the stack and the heap very often.
  • ownership rules:
    • Each value in Rust has an owner.
    • There can only be one owner at a time.
    • When the owner goes out of scope, the value will be dropped.
  • example: String
  • when a string is created from a literal, we know the contents at compile time.
    • the text is hardcoded directly into the final executable.
      • this is why string literals are fast and efficient.
      • but these properties come only from the string literal’s immutability.
  • when a string is created as a String, its contents are allocated on the heap.
    • memory must be requested from the memory allocator at runtime.
    • the memory must be returned to the allocator when we are done with the String.
  • the first part is done by calling String::from.
  • the second part is where Rust is unique:
    • the memory is automatically returned once the variable that owns it goes out of scope.
    • the function Rust calls to release memory is drop.
      • Rust calls drop automatically at the closing curly bracket.
    • NOTE: in C++, this pattern is called Resource Acquisition Is Initialization (RAII).
  • simple case:
let x = 5;
let y = x;
  • this code binds 5 to x, then makes a copy of the value in x and binds it to y.
  • both values are pushed onto the stack.
  • how about a String?
let x = String::from("hello");
let y = x;
  • a String is made up of three parts:
    • ptr: a pointer to the heap memory that holds the contents.
    • len: the length of the string.
    • cap: the capacity of the string.
  • these three values are stored on the stack.
  • but the actual data (“hello”) is stored on the heap.
  • so in the case above, y is just a copy of x’s ptr, len and cap.
    • the data stored on the heap is not copied.
  • so both x and y would point to the same heap address.
    • if both were still valid when x and y go out of scope, drop would be called twice.
      • this is known as a double free error.
        • freeing memory twice can lead to memory corruption.
  • to ensure memory safety, after let y = x;, Rust considers x no longer valid.
    • we cannot use x after that line.
      • doing so fails with a borrow of moved value error (“value borrowed here after move”).
    • Rust does nothing when x goes out of scope.
    • Rust invalidates the original variable instead of leaving a shallow copy.
    • in other words, Rust moves the value instead of shallow copying it.
    • also, Rust will never automatically create a deep copy of data.
  • if we need to deeply copy the heap data, we can use the clone method.
  • the behavior above applies only to data on the heap.
  • for data stored entirely on the stack (types that implement Copy), there’s no reason to invalidate the variable the value was copied from.
  • not only assignment, but also passing a variable to a function will move or copy it.
    • this code also fails with the value borrowed here after move error.
fn main() {
    let x = String::from("hello");
    a(x);
    println!("{x}");
}

fn a(s: String) {
    println!("{}", s);
}
  • returning values can also transfer ownership.
    • if the returned value is not bound to a variable, it is dropped.
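The move, clone and return cases above can be seen side by side in a minimal sketch. The function names consume and take_and_give_back are hypothetical.

```rust
/// Hypothetical helper: takes ownership of `s`; `s` is dropped when this returns.
fn consume(s: String) -> usize {
    s.len()
}

/// Hypothetical helper: takes ownership and hands it back to the caller.
fn take_and_give_back(s: String) -> String {
    s
}

fn main() {
    let x = String::from("hello");
    let y = x.clone(); // deep copy: x and y own separate heap buffers
    let n = consume(x); // x is moved into the function and is no longer usable
    // println!("{x}"); // would not compile: borrow of moved value `x`

    let z = take_and_give_back(y); // y moves in, ownership comes back as z
    println!("{z}: {n}");

    let a = 5;
    let b = a; // i32 is Copy, so a stays valid
    println!("{a} {b}");
}
```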

reference

As discussed above, once a value has been moved out of a variable, that variable cannot be used afterwards. The same applies when passing a value to a function. But using a variable after passing it to a function is very common. We can work around this by returning the values that were passed in, but that is tedious:

fn main() {
    let s1 = String::from("hello");
    let (s1, len) = calc_len(s1);
    println!("{}: {}", s1, len);
}

fn calc_len(s: String) -> (String, usize) {
    let ln = s.len();
    (s, ln)
}

Here’s another way to do that: a reference.
A reference is like a pointer in that it’s an address we can follow to access data owned by some other variable.

By using reference, we can modify the above code like this:

fn main() {
    let s1 = String::from("hello");
    let ln = calc_len(&s1);
    println!("{s1}: {ln}");
}

fn calc_len(s: &String) -> usize {
    s.len()
}

A reference only refers to the value, so the parameter does not take ownership of it.

We call the action of creating a reference borrowing.

mutable reference

With a mutable reference, a borrowed value can be changed without taking ownership.

fn main() {
    let mut s = String::from("hello");
    let ln = push_sth_and_calc_len(&mut s);
    println!("{s}: {ln}");
}

fn push_sth_and_calc_len(s: &mut String) -> usize {
    s.push_str(" world");
    s.len()
}

Mutable references have one big restriction: while a mutable reference to a value is in use, no other reference to that value can exist.

This code does NOT work:

fn main() {
    let mut s = String::from("hello");
    let ln = push_sth_and_calc_len(&mut s);
    println!("{s}: {ln}");
    let s1 = &mut s;
    let s2 = &mut s;
    println!("{s1}{s2}");
}

fn push_sth_and_calc_len(s: &mut String) -> usize {
    s.push_str(" world");
    s.len()
}

By separating scopes, it works:

fn main() {
    let mut s = String::from("hello");
    let ln = push_sth_and_calc_len(&mut s);
    println!("{s}: {ln}");
    {
        let s1 = &mut s;
        println!("{s1}");
    }
    let s2 = &mut s;
    println!("{s2}");
}

fn push_sth_and_calc_len(s: &mut String) -> usize {
    s.push_str(" world");
    s.len()
}

Unlike many other languages, Rust prevents data races at compile time by restricting where data can be mutated through references.

A data race happens when these three behaviors occur:

  • Two or more pointers access the same data at the same time.
  • At least one of the pointers is being used to write to the data.
  • There’s no mechanism being used to synchronize access to the data.

With this restriction, Rust can rule out data races at compile time.

Multiple immutable references are allowed.

Note that a reference’s scope starts from where it is introduced and continues through the last time that reference is used.
So the code below works:

fn main() {
    let mut s = String::from("hello");
    let ln = push_sth_and_calc_len(&mut s);
    println!("{s}: {ln}");
    let s1 = &mut s;
    println!("{s1}");
    let s2 = &mut s;
    println!("{s2}");
}

fn push_sth_and_calc_len(s: &mut String) -> usize {
    s.push_str(" world");
    s.len()
}
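The same rules with immutable references mixed in, as a minimal sketch (demo is a hypothetical function name): any number of immutable references may coexist, and a mutable one is accepted once they are no longer used.

```rust
/// Hypothetical helper: borrows immutably twice, then mutably after
/// the immutable references have seen their last use.
fn demo() -> String {
    let mut s = String::from("hello");
    let r1 = &s;
    let r2 = &s; // several immutable references at once are fine
    let summary = format!("{r1}/{r2}");
    // r1 and r2 are not used past this point, so a mutable borrow is allowed.
    let r3 = &mut s;
    r3.push_str(" world");
    format!("{summary} -> {s}")
}

fn main() {
    println!("{}", demo());
}
```

Moving the format! call that reads r1 and r2 below the let r3 line would make the compiler reject the code, since the immutable borrows would then overlap the mutable one.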

dangling reference

A dangling pointer is a pointer that refers to memory that has already been freed; it arises when memory is freed while a pointer to it is preserved.
In Rust, by contrast, the compiler guarantees that references will never be dangling references.

fn dangle() -> &String {
    let s = String::from("hello");
    &s
}

This does not compile:

❯ cargo check
    Checking variables v0.1.0 (/workspaces/sandbox/variables)
error[E0106]: missing lifetime specifier
 --> src/main.rs:5:16
  |
5 | fn dangle() -> &String {
  |                ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
  |
5 | fn dangle() -> &'static String {
  |                 +++++++

For more information about this error, try `rustc --explain E0106`.
error: could not compile `variables` due to previous error
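One way around the error, sketched under the assumption that the caller simply needs the string: return the String itself so that ownership moves out of the function and nothing is deallocated. The name no_dangle is illustrative.

```rust
/// Returns the String by value, moving ownership to the caller,
/// so no reference to freed memory can exist.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // moved out, not dropped
}

fn main() {
    let s = no_dangle();
    println!("{s}");
}
```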

slice type

Consider splitting a string on spaces to get the first word.
A naive implementation returns the index of the end of the word:

fn main() {
    let s = String::from("hello world.");
    let w = first_word(&s);
    println!("{w}");
}

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

But once s.clear() is called, the value of w is no longer valid, because it is not tied to the state of s.
A better approach is to track a starting and an ending index together with the data.
This is possible with a slice.

fn main() {
    let s = String::from("hello world.");
    let w = first_word(&s);
    let s1 = &s[0..w];
    let s2 = &s[w + 1..];
    println!("{}:{}", s1, s2);
    println!("{}:{}", s1, s2);
}

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

The type of a string slice is written &str.

In the above case, s2 is a pointer to the byte at index 6 of s, together with a length of 6 (“world.”).
Note that string slice range indices must occur at valid UTF-8 character boundaries.

Since s1 and s2 borrow s, s cannot be changed while they are in use, even if it is declared mutable. This is how the Rust compiler catches such invalid access at compile time.

Note: after the immutable borrows have ended, a mutable borrow is allowed:

fn main() {
    let mut s = String::from("hello world.");
    let w = first_word(&s);
    let s1 = &s[0..w];
    let s2 = &s[w + 1..];
    println!("{}:{}", s1, s2);
    println!("{}:{}", s1, s2);
    s.clear();
}

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}
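Going one step further, first_word can return the slice itself instead of an index, so the result stays tied to the string it came from. This is a sketch of that variant; taking &str as the parameter type is a choice made here so that both &String and string literals can be passed.

```rust
/// Returns the first word as a slice borrowed from `s`.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[..i];
        }
    }
    s // no space found: the whole string is one word
}

fn main() {
    let mut s = String::from("hello world.");
    let w = first_word(&s);
    // Calling s.clear() here would be rejected, because `w` is used below.
    println!("{w}");
    s.clear(); // allowed: `w` is not used after this point
}
```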

defining and instantiating structs

To define a struct, use the struct keyword:

struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

To use a defined struct, create an instance by specifying concrete values:

fn main() {
    let u = User {
        email: String::from("john@example.com"),
        username: String::from("john doe"),
        active: true,
        sign_in_count: 10,
    };
    println!(
        "{}, {}, active={}, sign_in={}",
        u.username, u.email, u.active, u.sign_in_count
    );
}

Note that to change a field, the entire instance must be mutable. Rust doesn’t allow us to mark only certain fields as mutable.

When a variable name and a field name are exactly the same, the field init shorthand is available:

fn main() {
    let email = String::from("john@example.com");
    let username = String::from("john doe");
    let u = User {
        email,
        username,
        active: true,
        sign_in_count: 10,
    };
    println!(
        "{}, {}, active={}, sign_in={}",
        u.username, u.email, u.active, u.sign_in_count
    );
}

The struct update syntax (..) fills the remaining fields from another instance, similar to the spread operator in JS:

fn main() {
    let email = String::from("john@example.com");
    let username = String::from("john doe");
    let u = User {
        email,
        username,
        active: true,
        sign_in_count: 10,
    };
    let u2 = User {
        email: String::from("doe@example.com"),
        ..u
    };
    println!(
        "{}, {}, active={}, sign_in={}",
        u2.username, u2.email, u2.active, u2.sign_in_count
    );
}

struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

Note that the struct update syntax moves ownership of non-Copy fields, so this does not compile:

fn main() {
    let email = String::from("john@example.com");
    let username = String::from("john doe");
    let u = User {
        email,
        username,
        active: true,
        sign_in_count: 10,
    };
    let u2 = User { ..u };
    println!(
        "{}, {}, active={}, sign_in={}",
        u.username, u.email, u.active, u.sign_in_count
    );
}

struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

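Because bool and u64 have a known, fixed size (i.e., they are stored on the stack and implement the Copy trait), u.active and u.sign_in_count are still valid. u.username and u.email are Strings, so they have been moved into u2.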

Rust also supports tuple struct:

struct Color(i32, i32, i32);
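A minimal sketch of how tuple structs are used. Point, brightness and sum are hypothetical names added for illustration; the point is that Color and Point are distinct types even though their fields have the same types.

```rust
struct Color(i32, i32, i32);
struct Point(i32, i32, i32); // same field types, but a different type

/// Hypothetical helper: tuple struct fields are accessed by index.
fn brightness(c: &Color) -> i32 {
    (c.0 + c.1 + c.2) / 3
}

/// Hypothetical helper: a tuple struct can also be destructured by pattern.
fn sum(p: &Point) -> i32 {
    let Point(x, y, z) = p;
    x + y + z
}

fn main() {
    let black = Color(0, 0, 0);
    let origin = Point(0, 0, 0);
    // brightness(&origin) would not compile: expected &Color, found &Point
    println!("{} {}", brightness(&black), sum(&origin));
}
```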

example: rect

struct Rectangle {
    width: usize,
    height: usize,
}

fn main() {
    let rect = Rectangle {
        width: 30,
        height: 50,
    };
    println!("{}", area(&rect));
}

fn area(rect: &Rectangle) -> usize {
    rect.width * rect.height
}

Note that accessing fields of a borrowed struct instance does not move the field values, which is why you often see borrows of structs.

derived traits

This doesn’t work:

println!("{}", rect);
^^^^ `Rectangle` cannot be formatted with the default formatter

The curly brackets tell println! to use formatting known as Display: output intended for direct end user consumption.

There’s a hint in the message:

note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead

After changing it, the compiler still reports an error:

note: add `#[derive(Debug)]` to `Rectangle` or manually `impl Debug for Rectangle`

Rust does include functionality to print out debugging information, but we have to explicitly opt in to make that functionality available for our struct.

To opt in, annotate the struct with #[derive(Debug)]:

#[derive(Debug)]
struct Rectangle {
    width: usize,
    height: usize,
}

Printing rect with {:?} then gives:

Rectangle { width: 30, height: 50 }

Another way to do this is the dbg! macro, which prints the file, the line number, and the value of an expression to stderr.
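A small sketch of dbg! next to the two Debug format specifiers. The function scaled is a hypothetical helper; dbg! takes ownership of its argument and returns it, which is why it can wrap an expression in place and why a reference is passed when the value is needed afterwards.

```rust
#[derive(Debug)]
struct Rectangle {
    width: usize,
    height: usize,
}

/// Hypothetical helper: builds a Rectangle while inspecting values with dbg!.
fn scaled(scale: usize) -> Rectangle {
    let rect = Rectangle {
        width: dbg!(30 * scale), // prints the expression and its value, then returns the value
        height: 50,
    };
    dbg!(&rect); // pass a reference so rect is not moved
    rect
}

fn main() {
    let rect = scaled(2);
    println!("{:?}", rect); // single-line Debug output
    println!("{:#?}", rect); // pretty-printed Debug output
}
```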

method syntax

To define a function within the context of a struct, use the impl keyword.

#[derive(Debug)]
struct Rectangle {
    width: usize,
    height: usize,
}

impl Rectangle {
    fn area(&self) -> usize {
        self.width * self.height
    }
}

fn main() {
    let rect = Rectangle {
        width: 30,
        height: 50,
    };
    println!("{:?}", rect.area());
}

Within an impl block, the type Self is an alias for the type that the impl block is for. Methods must have a parameter named self of type Self for their first parameter, so Rust lets you abbreviate this with only the name self in the first parameter spot.

&self is shorthand for self: &Self.

All functions defined within an impl block are called associated functions because they’re associated with the type named after the impl. We can define associated functions that don’t have self as their first parameter (and thus are not methods) because they don’t need an instance of the type to work with.

Associated functions that aren’t methods are often used for constructors that will return a new instance of the struct.
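The difference between a method and a constructor-style associated function, as a minimal sketch. The names square and can_hold follow the Book's Rectangle example, but treat the exact bodies here as an illustration.

```rust
#[derive(Debug)]
struct Rectangle {
    width: usize,
    height: usize,
}

impl Rectangle {
    /// Associated function without `self`: called as `Rectangle::square(3)`.
    fn square(size: usize) -> Self {
        Self {
            width: size,
            height: size,
        }
    }

    /// Method: `&self` is shorthand for `self: &Self`.
    fn area(&self) -> usize {
        self.width * self.height
    }

    /// Method that borrows another instance as well.
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

fn main() {
    let rect = Rectangle {
        width: 30,
        height: 50,
    };
    let sq = Rectangle::square(10); // :: for associated functions
    println!("{:?} {} {}", sq, rect.area(), rect.can_hold(&sq)); // . for methods
}
```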

Enums and pattern matching

enum IpAddrKind {
    V4,
    V6,
}

fn main() {
    let four = IpAddrKind::V4;
}

We can put data directly into each enum variant:

enum IpAddr {
    V4(String),
    V6(String),
}

Each variant can also hold different types and amounts of data:

enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

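Rust has no null. Instead, the standard library (std::option) provides the Option<T> enum, whose variants are Some(T) and None.
Because Option<T> and T are different types, the compiler forces us to handle the None case before using the value, which is what makes None preferable to null.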
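A minimal sketch of Option<T> in use. The function divide is a hypothetical example; unwrap_or is a standard Option method that supplies a default for the None case.

```rust
/// Hypothetical helper: returns None instead of a null-like sentinel
/// when the division is not defined.
fn divide(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a / b)
    }
}

fn main() {
    let x: i8 = 5;
    let y: Option<i8> = Some(5);
    // let sum = x + y; // would not compile: i8 and Option<i8> are different types
    let sum = x + y.unwrap_or(0); // the None case has to be handled explicitly
    println!("{sum}");

    match divide(10, 2) {
        Some(q) => println!("quotient: {q}"),
        None => println!("cannot divide by zero"),
    }
}
```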

match

match allows us to compare a value against a series of patterns and run code based on which pattern matches.

#[derive(Debug)]
enum UsStates {
    Any,
    Alabama,
    Alaska,
}

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsStates),
}

fn values_in_cents(coin: Coin) {
    let (state, v) = match coin {
        Coin::Penny => (UsStates::Any, 1),
        Coin::Quarter(state) => (state, 25),
        _ => (UsStates::Any, 0),
    };
    println!("¢{v} in {:?} state.", state)
}

fn main() {
    let c = Coin::Dime;
    values_in_cents(c);
}

if let

An if let block is a shorter way to write a match that handles one pattern and ignores the rest:

fn main() {
    let m = Some(3u8);
    match m {
        Some(max) => println!("The maximum is configured to be {max}"),
        _ => (),
    }
    if let Some(max) = m {
        println!("The maximum is configured to be {max}");
    }
}
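if let can also carry an else branch, which plays the role of the _ arm in the equivalent match. A minimal sketch, with describe as a hypothetical helper:

```rust
/// Hypothetical helper: `else` covers every case the pattern does not match.
fn describe(m: Option<u8>) -> String {
    if let Some(max) = m {
        format!("The maximum is configured to be {max}")
    } else {
        String::from("No maximum is configured")
    }
}

fn main() {
    println!("{}", describe(Some(3)));
    println!("{}", describe(None));
}
```

The trade-off is that if let gives up the exhaustiveness checking that match enforces.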