[Rust] Get started with the 'Book' (#07-#11: ecosystem and toolchain).

Jan 7, 2023 14:10 · 2509 words · 12 minute read

packages, crates and modules 🔗

  • Packages: A Cargo feature that lets you build, test and share crates.
  • Crates: A tree of modules that produces a library or executable.
  • Modules and use: Let you control the organization, scope and privacy of paths.
  • Paths: A way of naming an item, such as a struct, function or module.

packages and crates 🔗

A crate is the smallest amount of code that the Rust compiler considers at a time.
Crates can contain modules, and the modules may be defined in other files that get compiled with the crate.

Binary crates compile to an executable, such as a CLI tool or a server.
Each must have a function called main.

Library crates don’t have a main function, and they don’t compile to an executable.
Most of the time when Rustaceans say “crate”, they mean a library crate. It is almost the same as a “library” in other programming languages.

The crate root is a source file that the Rust compiler starts from and makes up the root module of your crate.

A package can contain multiple binary crates and optionally one library crate.
As a package grows, you can extract parts of it into separate crates.

A package contains a Cargo.toml file that describes how to build those crates.
Cargo itself is also a package. The cargo command is its binary crate, and the package also contains a library crate that the command depends on, so we can use the same logic from our own code.

Cargo.toml 🔗

Right after running cargo new, Cargo.toml looks like this:

[package]
name = "myproj"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

There’s no mention of src/main.rs.
Cargo follows a convention that src/main.rs is the crate root of a binary crate with the same name as the package.

Likewise, Cargo knows that if the package directory contains src/lib.rs, the package contains a library crate with the same name as the package.

If a package contains both src/main.rs and src/lib.rs, it has two crates.

Defining modules 🔗

Here’s how modules work:

  • Start from the crate root
    • when compiling a crate, the compiler first looks in the crate root file.
      • usually src/lib.rs for a library crate, src/main.rs for a binary crate.
  • Declaring modules
    • in the crate root file, you can declare new modules with a mod keyword.
    • when mod garden; is declared, the compiler will look for the module’s code in these places:
      • Inline, within curly brackets that replace the semicolon following mod garden.
      • In the file src/garden.rs.
      • In the file src/garden/mod.rs.
  • Declaring submodules
    • In any file other than the crate root, you can declare submodules.
    • you might declare mod vegetables; in src/garden.rs.
    • the compiler will look for the submodule’s code within the directory named for the parent module in these places:
      • Inline
      • In the file src/garden/vegetables.rs.
      • In the file src/garden/vegetables/mod.rs.
        • the mod.rs form is the older style.
  • Paths to code in modules
    • Once a module is part of your crate, you can refer to code in that module from anywhere else in the same crate.
    • an Asparagus type in the garden vegetables module is found at crate::garden::vegetables::Asparagus.
  • Private vs public
    • code within a module is private from its parent modules by default.
    • to make a module public, use pub mod instead of mod.
    • items within a public module are also private by default.
    • to make an item public, add the pub keyword before its declaration.
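
The rules above can be tried out in a single file. This is a minimal sketch that uses inline modules in place of src/garden.rs and src/garden/vegetables.rs; the stalks field and the plant and fertilize functions are hypothetical names made up for the illustration.

```rust
// Inline modules stand in for src/garden.rs and src/garden/vegetables.rs.
mod garden {
    // `pub mod` makes the submodule reachable from outside `garden`.
    pub mod vegetables {
        // The struct and its field are both private unless marked `pub`.
        #[derive(Debug, PartialEq)]
        pub struct Asparagus {
            pub stalks: u32, // hypothetical field
        }

        // Public function (hypothetical name), callable from the crate root.
        pub fn plant(stalks: u32) -> Asparagus {
            Asparagus {
                stalks: fertilize(stalks),
            }
        }

        // No `pub`: callable only inside `vegetables` and its child modules.
        fn fertilize(stalks: u32) -> u32 {
            stalks + 1
        }
    }
}

fn main() {
    // Full path starting from the crate root.
    let plant: crate::garden::vegetables::Asparagus = crate::garden::vegetables::plant(3);
    println!("{plant:?}");
    // garden::vegetables::fertilize(3); // would not compile: `fertilize` is private
}
```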

Paths for referring to an item in module tree 🔗

A path can take 2 forms:

  1. An absolute path: the full path starting from a crate root.
  2. A relative path: a path starting from the current module.

Both forms are followed by one or more identifiers separated by ::.

Items in a parent module cannot access the private items inside child modules.
But items in child modules can use the items in their ancestor modules.

Even when a binary crate refers to the library crate in the same package, only the public items of the library are available to it.
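
As a rough illustration of the two path forms and of the child-to-ancestor rule, here is a small sketch. The module and function names follow the restaurant theme of these notes but are otherwise hypothetical.

```rust
// A private item in the crate root (hypothetical helper).
fn greet() -> &'static str {
    "welcome"
}

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() -> String {
            // A child module may use private items of its ancestors.
            format!("{}, you are on the waitlist", crate::greet())
        }
    }
}

pub fn eat_at_restaurant() -> (String, String) {
    // Absolute path: starts from the crate root with `crate`.
    let absolute = crate::front_of_house::hosting::add_to_waitlist();
    // Relative path: starts from the current module (here, the crate root).
    let relative = front_of_house::hosting::add_to_waitlist();
    (absolute, relative)
}

fn main() {
    let (absolute, relative) = eat_at_restaurant();
    // Both paths name the same function.
    assert_eq!(absolute, relative);
    println!("{absolute}");
}
```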

super path 🔗

A path starting with super begins in the parent module, like .. in a filesystem path.

fn deliver_order() {}

mod back_of_house {
  fn fix_incorrect_order() {
    cook_order();
    super::deliver_order();
  }
  fn cook_order() {}
}

idiomatic use keyword 🔗

The use keyword lets us create a shortcut to a path.

When bringing a function into scope, it is preferable to bring in its parent module with use and call the function through that module, rather than bringing in the function itself.

This is because specifying the parent module when calling the function makes it clear that the function isn’t locally defined.

On the other hand, when bringing in structs, enums and other items with use, it’s idiomatic to specify the full path.
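
A short sketch of both conventions side by side. The hosting module, the party_sizes function and the data are hypothetical; only std::collections::HashMap is a real library item.

```rust
use std::collections::HashMap; // a struct: bring in the full path

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist(waitlist: &mut Vec<String>, name: &str) {
            waitlist.push(name.to_string());
        }
    }
}

use crate::front_of_house::hosting; // a function: bring in its parent module

fn party_sizes() -> HashMap<String, u32> {
    let mut sizes = HashMap::new(); // the type is used by its bare name
    sizes.insert(String::from("Ferris"), 2);
    sizes
}

fn main() {
    let mut waitlist = Vec::new();
    // `hosting::` shows at the call site that the function is not local.
    hosting::add_to_waitlist(&mut waitlist, "Ferris");
    println!("{waitlist:?} {:?}", party_sizes());
}
```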

re-exporting names with pub use 🔗

You can expose an item from a child module as if it were defined in the current module by using pub use.

It is useful when the internal structure of your code is different from how programmers calling your code would think about the domain.
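
A minimal sketch of a re-export, assuming a hypothetical restaurant module whose internal nesting should stay hidden from callers:

```rust
mod restaurant {
    // Internal layout: `front_of_house` is private to `restaurant`.
    mod front_of_house {
        pub mod hosting {
            pub fn add_to_waitlist() -> &'static str {
                "added to waitlist"
            }
        }
    }

    // Re-export: callers see `restaurant::hosting`, not the internal nesting.
    pub use self::front_of_house::hosting;
}

fn main() {
    println!("{}", restaurant::hosting::add_to_waitlist());
    // restaurant::front_of_house::hosting::add_to_waitlist();
    // ^ would not compile: `front_of_house` is a private module
}
```

The internal structure can later be reorganized without breaking callers, as long as the re-exported path stays the same.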

nested paths 🔗

use std::cmp::Ordering;
use std::io;

is the same as:

use std::{cmp::Ordering, io};

collections 🔗

Rust’s std lib includes a number of useful data structures called collections.

  • vector: to store a variable number of values.
  • string: a collection of characters.
  • hash map: associates keys with values.

vector 🔗

To create a new vector:

let v = Vec::<i32>::new();
let v: Vec<i32> = Vec::<i32>::new();
let v = vec![1, 2, 3];

There are 2 ways to get a value from a vector:

fn main() {
    let v = vec![1, 2, 3];
    let third = &v[2]; // --1
    println!("{third}");
    if let Some(third) = v.get(2) {
        // --2
        println!("{:?}", third);
    }
}

The first way panics if the index is out of range, while the latter returns None.

Because a reference to an element borrows the whole vector, a mutable borrow (e.g., v.push) is not allowed for as long as that reference is still in use.
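
The following sketch shows one way around that restriction: copy the element out instead of holding a reference to it. The first_then_push helper is a hypothetical name, and it assumes a non-empty vector (indexing an empty one would panic).

```rust
// Hypothetical helper: reads the first element, then pushes a new one.
fn first_then_push(v: &mut Vec<i32>, value: i32) -> i32 {
    // i32 is Copy, so this copies the value and no reference into `v` stays alive.
    let first = v[0];
    v.push(value); // the mutable use is fine here
    first
}

fn main() {
    let mut v = vec![1, 2, 3];

    // This ordering would not compile: `first` borrows `v` immutably
    // and is still used after the mutable borrow made by `push`.
    // let first = &v[0];
    // v.push(4);
    // println!("{first}");

    let first = first_then_push(&mut v, 4);
    println!("{first} {v:?}");
}
```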

store multiple type in a vector 🔗

To store multiple types in the same vector, use an enum as the element type.

fn main() {
    let mut v = Vec::<Cell>::new();
    v.push(Cell::Int(10));
    v.push(Cell::Float(3.5));
    v.push(Cell::Text(String::from("hello")));
    println!("{:?}", v);
}

#[derive(Debug)]
enum Cell {
    Int(i32),
    Float(f64),
    Text(String),
}

String: storing UTF-8 encoded text with strings 🔗

New Rustaceans get stuck on strings for a combination of three reasons:

  • Rust’s propensity for exposing possible errors.
  • Strings are a more complicated data structure than many programmers think.
  • UTF-8

Rust has only one string type in the core language, which is the string slice str, usually seen in its borrowed form &str.
A string slice is a reference to some UTF-8 encoded string data stored elsewhere (string literals, for example, are stored in the program’s binary).

Rust’s std library provides the String type.
Because String is actually a wrapper around a vector of bytes, many of the operations available on Vec<T> are available on String as well.

To create a String, String::from("...") and "...".to_string() are equivalent.
It’s a matter of style and readability.

+ operator and format! macro 🔗

The + operator on a String calls the add method of the left operand, whose signature is roughly fn add(self, s: &str) -> String.
So the ownership of the left operand is moved, while the right operand stays valid (it is only borrowed as a reference).
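
A small sketch of what that means in practice. The greet function is a hypothetical helper that mirrors the shape of add.

```rust
// Hypothetical helper with the same shape as `add`: takes the left side by value.
fn greet(greeting: String, name: &str) -> String {
    greeting + name
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");

    // `s1` is moved into the result; `&s2` (a &String) is coerced to &str.
    let s3 = s1 + &s2;
    // println!("{s1}"); // would not compile: `s1` was moved
    println!("{s3}");
    println!("{s2}"); // `s2` is still valid, it was only borrowed

    println!("{}", greet(String::from("Hi, "), &s2));
}
```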

To make the intent clearer when combining several strings, the format! macro is useful.

fn main() {
    let mut s1 = String::from("tic");
    let s2 = String::from("tac");
    let s3 = String::from("toe");
    let s = format!("{s1}-{s2}-{s3}");
    s1.push('l');
    println!("{s}");
    println!("{s1}");
}

format! uses references to its arguments, so this call does not take ownership of any of them.

indexing into string 🔗

While many other languages allow you to reference a character by indexing into a string, Rust does not support string indexing.

To see why, run this code:

fn main() {
    let mut helloes = Vec::<String>::new();
    helloes.push(String::from("Dobrý den"));
    helloes.push(String::from("Hello"));
    helloes.push(String::from("こんにちは"));
    helloes.push(String::from("Здравствуйте"));
    for i in helloes {
        println!("{}: {}", i, i.len());
    }
}

❯ ./target/debug/collection
Dobrý den: 10
Hello: 5
こんにちは: 15
Здравствуйте: 24

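A String is a wrapper over a Vec<u8>, but a UTF-8 character is not fixed at 1 byte (it takes 1 to 4 bytes).
So it wouldn’t make sense for s[2] to return the third u8: that byte may be only part of a character.

Though indexing a string is not supported, iterating (with chars() or bytes()) and slicing by byte range are supported. A slice that does not fall on character boundaries panics at runtime.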
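
A brief sketch of iterating and slicing, reusing one of the greetings from the code above. The char_count helper is a hypothetical name.

```rust
// Hypothetical helper: counts Unicode scalar values rather than bytes.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let hello = "Здравствуйте";

    // chars() yields Unicode scalar values, bytes() yields raw u8 values.
    println!("{} chars, {} bytes", char_count(hello), hello.bytes().count());

    // Slicing takes a byte range; each of these letters is 2 bytes long.
    let first_two = &hello[0..4];
    println!("{first_two}");

    // &hello[0..1] would panic: byte 1 is in the middle of a character.
    // `get` is the non-panicking variant and returns None instead.
    assert!(hello.get(0..1).is_none());
}
```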

hash map 🔗

Unlike Vec and String, HashMap is not in the prelude, so it has to be brought into scope with use std::collections::HashMap.
Like Vec, hash maps are homogeneous: all of the keys must have the same type, and all of the values must have the same type.

For types that implement the Copy trait, the values are copied into the hash map.
For owned values like String, the values will be moved and the hash map will be the owner of those values.
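
The difference between copied and moved values can be sketched like this. The build_scores function is a hypothetical helper written for the illustration.

```rust
use std::collections::HashMap;

// Hypothetical helper: inserts one entry and hands back the map and the score.
fn build_scores(team: String, score: i32) -> (HashMap<String, i32>, i32) {
    let mut scores = HashMap::new();
    // `team` (a String) is moved into the map; `score` (an i32, Copy) is copied.
    scores.insert(team, score);
    // println!("{team}"); // would not compile: `team` was moved into the map
    (scores, score) // `score` is still usable here
}

fn main() {
    let (scores, score) = build_scores(String::from("Blue"), 20);
    println!("{score} {:?}", scores.get("Blue"));
}
```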

To update a value in place, get a mutable reference to it with get_mut:

use std::collections::HashMap;

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 20);
    scores.insert(String::from("Yellow"), 50);
    println!("{:?}", scores);

    if let Some(blue_score) = scores.get_mut("Blue") {
        *blue_score += 10;
    }

    println!("{:?}", scores);
}

error handling 🔗

Rust groups errors into 2 major categories:

  1. recoverable: we just report the problem to the user and retry the operation.
  2. unrecoverable: symptoms of bugs; we immediately stop the program.

Rust doesn’t have exceptions.
Instead, it has the type Result<T, E> for recoverable errors, and the panic! macro for unrecoverable errors.

unrecoverable errors 🔗

There are 2 ways to cause a panic in practice:

  1. explicitly calling the panic! macro.
  2. taking an action that causes our code to panic (e.g., accessing an array past the end).

By default, a panic prints a failure message, unwinds, cleans up the stack and quits.
Unwinding means Rust walks back up the stack and cleans up the data from each function it encounters.
This takes a lot of work, so you can choose the alternative of immediately aborting, which ends the program without cleaning up and leaves that to the operating system.
To do this in release mode, add the setting below to Cargo.toml:

[profile.release]
panic = 'abort'

using a panic! backtrace 🔗

A backtrace is a list of all the functions that have been called to get to this point.
The key to reading the backtrace is to start from the top and read until you see files you wrote.

To control the amount of backtrace output, set the RUST_BACKTRACE environment variable (for example, RUST_BACKTRACE=1).
To get backtraces with this information, debug symbols must be enabled (they are enabled by default when using cargo build or cargo run without the --release flag).

recoverable errors 🔗

Most errors are not serious enough to require the program to stop entirely.

The Result enum is defined as having 2 variants, Ok and Err:

enum Result<T, E> {
  Ok(T),
  Err(E),
}

T represents the type of the value that will be returned inside Ok in the success case.
E represents the type of the error that will be returned inside Err in the failure case.

use std::{
    fs::File,
    io::{ErrorKind, Read},
    process::exit,
};

fn main() {
    let mut greeting_file = match File::open("tmp/hello.txt") {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => {
                println!("File not found");
                exit(1); // a non-zero exit code signals failure
            }
            other => panic!("unexpected error occurred: {:?}", other),
        },
    };
    let mut s: String = String::new();
    greeting_file
        .read_to_string(&mut s)
        .expect("failed to read the file"); // do not silently drop the Result
    println!("{}", s);
}

Using match works well enough, but it can be a bit verbose and doesn’t always communicate intent well. Result<T, E> has many helper methods.

The unwrap method is a shortcut implemented just like the match expression.
If the Result is the Ok variant, unwrap will return the value inside the Ok.
If the Result is the Err variant, unwrap will call the panic! macro.

expect works like unwrap, but it also lets us choose the panic message.
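
A quick sketch of these helpers on the Result returned by str::parse. The parse_port and parse_port_or functions are hypothetical names; unwrap_or is another of the helper methods on Result, shown here as a non-panicking alternative.

```rust
// Hypothetical helper: panics with a chosen message if the input is not a u16.
fn parse_port(input: &str) -> u16 {
    input
        .trim()
        .parse::<u16>()
        .expect("port should be a number between 0 and 65535")
}

// Hypothetical helper: falls back to a default instead of panicking.
fn parse_port_or(input: &str, default: u16) -> u16 {
    input.trim().parse::<u16>().unwrap_or(default)
}

fn main() {
    println!("{}", parse_port("8080"));
    println!("{}", parse_port_or("not a port", 3000));
    // parse_port("not a port"); // would panic with the message given to `expect`
}
```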

propagating error 🔗

You can return the error to the calling code so that it can decide what to do.

use std::{
    fs::File,
    io::{self, Read},
};

fn read_username_from_file() -> Result<String, io::Error> {
    let f = File::open("tmp/hello.txt");
    let mut s = match f {
        Ok(file) => file,
        Err(e) => return Err(e),
    };
    let mut u = String::new();
    match s.read_to_string(&mut u) {
        Ok(_) => Ok(u),
        Err(e) => Err(e),
    }
}

fn main() {
    println!("{}", read_username_from_file().expect("error"));
}

the ? operator 🔗

The ? operator placed after a Result value is defined to work in almost the same way as the match expressions above.
If the value of the Result is Ok, the value inside the Ok will get returned from this expression.
If the value of the Result is Err, the Err will be returned from the whole function, as if we had used the return keyword.

The difference is that error values passed through ? go through the from function of the From trait in the std lib, which converts them into the error type declared in the return type of the current function.
By using ?, the code can be like this:

use std::{
    fs::File,
    io::{self, Read},
};

fn read_username_from_file() -> Result<String, io::Error> {
    let mut uname = String::new();
    File::open("tmp/hello.txt")?.read_to_string(&mut uname)?;
    Ok(uname)
}

fn main() {
    println!("{}", read_username_from_file().expect("error"));
}
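
The From conversion performed by ? is not visible in the example above, because both calls already return io::Error. The sketch below makes it visible with a hypothetical ConfigError type and parse_retries function; only ParseIntError and the From trait come from the std lib.

```rust
use std::num::ParseIntError;

// Hypothetical error type for the illustration.
#[derive(Debug, PartialEq)]
enum ConfigError {
    BadNumber(ParseIntError),
}

// This impl is what lets `?` turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

// Hypothetical function: its error type differs from the one `parse` returns.
fn parse_retries(input: &str) -> Result<u32, ConfigError> {
    // On Err, `?` calls ConfigError::from on the ParseIntError and returns early.
    let n = input.trim().parse::<u32>()?;
    Ok(n)
}

fn main() {
    println!("{:?}", parse_retries(" 3 "));
    println!("{:?}", parse_retries("three"));
}
```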

Note that the ? operator can only be used in functions whose return type is compatible with the value the ? is used on.

The ? can be used with Option<T> as well.
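
With Option<T>, a None makes the function return None early, and a Some yields its inner value. A small sketch, modeled on the kind of example the Book uses:

```rust
// Returns the last character of the first line, if there is one.
fn last_char_of_first_line(text: &str) -> Option<char> {
    // `next()` gives Option<&str>; `?` returns None early if there is no first line.
    text.lines().next()?.chars().last()
}

fn main() {
    println!("{:?}", last_char_of_first_line("Hello, world\nHow are you today?"));
    println!("{:?}", last_char_of_first_line(""));
}
```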

Luckily, main can also return a Result<(), E>.

use std::{fs, io};

fn read_username_from_file() -> Result<String, io::Error> {
    fs::read_to_string("tmp/hello.txt")
}

fn main() -> Result<(), io::Error> {
    read_username_from_file()?;
    Ok(())
}

To panic! or Not to panic! 🔗

Result is a good default choice when you’re defining a function that might fail.
In situations such as examples, prototype code, and tests, it’s more appropriate to write code that panics instead of returning a Result.

It would also be appropriate to call unwrap or expect when you have more information than the compiler, i.e. when some other logic guarantees that the Result is Ok but the compiler cannot see that.
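
A typical case is parsing a hardcoded value. In this sketch the home_address function is a hypothetical name; the string is a constant that we know is a valid IP address, even though parse still returns a Result.

```rust
use std::net::IpAddr;

// Hypothetical helper: the input is a constant, so parsing cannot fail in practice.
fn home_address() -> IpAddr {
    "127.0.0.1"
        .parse()
        .expect("hardcoded IP address should be valid")
}

fn main() {
    println!("{}", home_address());
}
```

The message passed to expect documents why the failure is assumed to be impossible.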

Generics 🔗

Here’s an example that does not compile yet:

fn main() {
    let list = [1, 2, 3, 4, 5];
    let max = largests(&list);
    println!("{max}");
}

fn largests<T>(list: &[T]) -> &T {
    let mut largest = &list[0];
    for item in list {
        if item > largest {
            largest = item
        }
    }

    largest
}

   Compiling collection v0.1.0 (/workspaces/sandbox/restaurant/collection)
error[E0369]: binary operation `>` cannot be applied to type `&T`
  --> src/main.rs:10:17
   |
10 |         if item > largest {
   |            ---- ^ ------- &T
   |            |
   |            &T
   |
help: consider restricting type parameter `T`
   |
7  | fn largests<T: std::cmp::PartialOrd>(list: &[T]) -> &T {
   |              ++++++++++++++++++++++

error: aborting due to previous error

For more information about this error, try `rustc --explain E0369`.
error: could not compile `collection` due to 2 previous errors

This error occurs because item and largest, both of type &T, are compared with > while T is not guaranteed to be comparable.

To ensure T can be compared, restrict T to only those types that implement std::cmp::PartialOrd.

fn largests<T: std::cmp::PartialOrd>(list: &[T]) -> &T
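
With that bound in place, the whole example compiles. The sketch below keeps the function from above and calls it with a few different element types; like the original, it assumes a non-empty slice and panics on an empty one.

```rust
fn largests<T: std::cmp::PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0]; // panics if `list` is empty
    for item in list {
        // Allowed now: T is guaranteed to implement PartialOrd.
        if item > largest {
            largest = item;
        }
    }
    largest
}

fn main() {
    println!("{}", largests(&[1, 2, 3, 4, 5]));
    println!("{}", largests(&[1.5, 0.2, 9.75]));
    println!("{}", largests(&['a', 'z', 'm']));
}
```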

Using generic types won’t make your program run any slower than it would with concrete types.
Rust accomplishes this by performing monomorphization of the code.
Monomorphization is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.
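
Conceptually, the result looks as if we had written one copy of the function per concrete type ourselves. The sketch below is only an illustration of that idea: the double function and the hand-written double_i32 and double_f64 are hypothetical, and the names the compiler actually generates are different.

```rust
use std::ops::Add;

// Generic version, as we would write it.
fn double<T: Add<Output = T> + Copy>(x: T) -> T {
    x + x
}

// Roughly what monomorphization produces for the two calls in `main`
// (hand-written here for illustration only).
fn double_i32(x: i32) -> i32 {
    x + x
}

fn double_f64(x: f64) -> f64 {
    x + x
}

fn main() {
    // Each call to `double` is compiled against its own specialized copy.
    println!("{} {}", double(21), double_i32(21));
    println!("{} {}", double(1.5), double_f64(1.5));
}
```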

Traits 🔗

A trait defines the functionality a particular type has and can share with other types.
We can use traits to define shared behavior in an abstract way.

Traits are similar to a feature often called interfaces in other languages, but with some differences.

Different types share the same behavior if we can call the same methods on all of those types.
Trait definitions are a way to group method signatures together to define a set of behaviors.

fn main() {
    let article = Article {
        headline: String::from("Hello world"),
        location: String::from("Japan"),
        author: String::from("lp-peg"),
        content: String::from("hoge, fuga, piyo..."),
    };
    notify(&article);
}

pub trait Summary {
    fn summarize(&self) -> String;
}

fn notify<T: Summary>(item: &T) {
    println!("News: {}", item.summarize())
}

pub struct Article {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}

impl Summary for Article {
    fn summarize(&self) -> String {
        format!("{}, by {}({})", self.headline, self.author, self.location)
    }
}

pub struct Tweet {
    pub username: String,
    pub content: String,
    pub reply: bool,
    pub retweet: bool,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

Trait bounds can be combined with + like this:

pub fn notify(item: &(impl Summary + Display))
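
A runnable sketch of that signature. It assumes a trimmed-down Tweet with a hand-written Display impl, and unlike the notify shown earlier it returns the message as a String so the result is easy to inspect; both choices are made for this illustration only.

```rust
use std::fmt::{self, Display};

pub trait Summary {
    fn summarize(&self) -> String;
}

// Trimmed-down version of the Tweet struct, for illustration.
pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// Hypothetical Display impl: shows the author as a handle.
impl Display for Tweet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "@{}", self.username)
    }
}

// `item` must implement both Summary and Display.
pub fn notify(item: &(impl Summary + Display)) -> String {
    format!("Breaking news from {}! {}", item, item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("lp-peg"),
        content: String::from("hello"),
    };
    println!("{}", notify(&tweet));
}
```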

Lifetimes 🔗