ragona

Learning Rust in 2019

January 31st, 2019

I spent the month of January playing with Rust in my free time. Rust is often introduced as a systems programming language, and it's a good choice for that. Rust is also fast, and when you need fast you'll put up with anything to get it. The history of programming is full of people performing unholy acts in the name of fast.

But as a programming language nerd, I'd find Rust interesting even if it were significantly slower. By introducing new concepts like borrow checking and lifetimes, Rust eliminates whole classes of memory bugs. Rust also has a very expressive type system and a number of modern functional-influenced features. (It isn't overly functional, though; it doesn't mind if you just write a good old-fashioned side-effect-inducing for loop.)

The last time I used Rust was a few years ago, and I walked away from the experience vaguely intrigued but feeling annoyed by the rough edges I'd been bumping into. However, Rust only hit 1.0 in 2015, and the experience I had in 2019 was dramatically better than the first time I touched the language. After a month I'm feeling very comfortable, and looking forward to writing more.

In this article I'm going to share some quick notes on some of the mistakes I made while learning, the resources I used to get unblocked, and what I took away from the experience. My hope is that I can save other Rust newcomers a bit of time by documenting some of the things that I banged my head into. Now, it's worth noting that I'm still a Rust beginner, so if I got something wrong please let me know so that I can update it.

Rust is moving fast

Rust has an impressive release cadence; even within the single month of January 2019 I saw noticeable progress in the language thanks to the Rust 1.32 release. (Check out the new dbg!() macro; it's much quicker to write than the println!("{:?}") syntax!) This means that the language is getting better all the time. The team has done a good job focusing on the usability of the whole ecosystem, so Rust is becoming easier and easier to write. The language is much smoother in 2019 than it was in 2015, when Rust 1.0 was first released.
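If you haven't tried dbg! yet, here is a minimal sketch of what it buys you. The scale function is a made-up helper for illustration, not something from my library.

```rust
// `scale` is a hypothetical helper used only to show dbg! in action.
fn scale(value: u32, factor: u32) -> u32 {
    // dbg! prints the file, line, expression, and value to stderr,
    // then hands the value back, so it can wrap a sub-expression in place.
    let scaled = dbg!(value * factor);

    // The older approach needs its own statement and a hand-typed label.
    println!("scaled = {:?}", scaled);

    scaled + 1
}
```

Because dbg! returns the value it was given, you can drop it into the middle of an expression and remove it later without restructuring anything.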

One side effect of this pace is that when you go looking for an answer to a question, it's easy to stumble over an outdated solution. That relevant Stack Overflow post from 2015 is stale in a way it might not be if it were about Python. It's actually likely to still work, but you might walk away with the impression that the language is less ergonomic than it is. Rust has great documentation, but old versions of it are still floating around out there, so it isn't always immediately obvious what the idiomatic answer is.

Seriously, read the docs

Rust has excellent documentation, and it's worth reading. Along with the Rust book ("The Book") I recommend reading the Edition Guide, which walks through the major differences between the 2015 and 2018 editions of Rust. This will help clear up some confusing inconsistencies you'll see between code samples. (I've found that the majority of code samples still use 2015-style examples with regard to use vs. extern crate statements, for example.) Rust also makes it easy to document your own code; this is an area of real strength for the language.

The community is great

Rust has an unusually helpful community, so if you get stuck I recommend reaching out for help. I appreciate that Rust has a really reasonable code of conduct, and moderators who enforce it across their community channels. The community has just embraced that no one starts out understanding the new concepts that Rust introduces, and it has produced a really refreshing attitude for a programming community. The code of conduct helps ensure that this positive culture can scale as the language grows.

I found the Discord channels to be particularly helpful, since you can have a real-time(ish) conversation with someone about the language. It's a huge advantage for the language to have a well-moderated, positive, and active community chat channel where you can go to get help. I like thinking of Discord like the old rules for torrenting; you go ask for help (downloading), then once you get your answer you hang out and try to answer questions (seeding).

I also highly recommend the New Rustacean podcast, which has been running for a couple of years now and is a pretty great resource for anyone who likes learning by hearing someone talk. One of my favorite ways to learn is by talking to my coworkers about coding, so being able to listen to Chris describe Rust concepts has really worked well for me.

If it's getting weird, step back

At some point you will find yourself in a situation where you're passing a reference, the borrow checker tells you that the lifetimes are unclear, you start sprinkling lifetime annotations all over your entire code base, next thing you know you're wrapping things in Box and RefCell and stuff is getting weird.

The compiler is so helpful that it's easy to get into a flow where when you hit an error you just go do whatever the compiler suggests. This is usually great, but sometimes you'll end up way down a rat hole having done a bunch of weird stuff that you only mostly understand.

I've started recognizing this situation as a sort of code-smell, where if I'm fighting with the compiler it usually means that I need to take a step back and see if there is an easier way to express something. The good news is that there is almost always a solution that is also relatively simple and clean, so go for a walk and see if there is another way to do what you're doing.

Error handling

An area that made me scratch my head is error handling; there are a lot of seemingly correct options. This is an area that is moving quickly, and active work is being done to improve the state of affairs. (Check out this RFC for a more detailed explanation.) As you research error handling you'll find that there are several options here, and it turns out they're all fine depending on the situation.

Let me briefly go over the approaches I tried while settling on the right one for a library. Rather than recreate a bunch of existing content, I'm going to summarize each approach and recommend that you read Andrew Gallant's excellent article Error Handling in Rust.

Error handling options:

  1. Just use unwrap or expect. This is the first option you'll see as you read through the beginner documentation, which has unwrap all over the place. This is fine for quickly prototyping, but it creates somewhat messy code. I suspect you'll prefer the ? syntax most of the time, but in order to do that you must return some variety of Result. One of the biggest improvements to the cleanliness of my code was creating a custom Result<T, MyError> type for the library.

  2. Use an external crate for error handling like failure. There are a number of crates that are designed to make error handling less painful, like failure, quick-error, and error-chain. The most common appears to be failure. This is probably the right option for binary projects; it's a simple way to handle multiple error types without too much boilerplate. I'd use this if I were making a quick tool. It can sometimes be the wrong option for library authors though, since it forces the dependency on people who use your library. It also has some performance cost, especially if used incorrectly.

  3. Create a custom error type. This is often the right option for library authors. It requires a fair bit of boilerplate, but it allows you to create a custom Result<T, MyError> type that can represent the kinds of errors your library returns. You'll see this pattern in the std library in std::io::Error, for example. I ended up using error.rs in the CSV library as a model, and once I had it working I was really happy with this solution. I actually learned a lot about traits by implementing my own error type, so I'm glad that I did.
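To make option 3 concrete, here is a rough sketch of the boilerplate involved. It is not the code from my library or from the CSV crate; MyError's variants and the parse_point function are hypothetical names I made up for illustration. The key piece is the From impl, which is what lets the ? operator convert a standard library error into the custom type.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for a small drawing library.
#[derive(Debug)]
pub enum MyError {
    Parse(ParseIntError),
    OutOfBounds { x: i32, y: i32 },
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Parse(err) => write!(f, "could not parse coordinate: {}", err),
            MyError::OutOfBounds { x, y } => write!(f, "point ({}, {}) is out of bounds", x, y),
        }
    }
}

impl Error for MyError {
    // Expose the underlying error, when there is one.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::Parse(err) => Some(err),
            MyError::OutOfBounds { .. } => None,
        }
    }
}

// This conversion is what allows `?` to turn a ParseIntError into a MyError.
impl From<ParseIntError> for MyError {
    fn from(err: ParseIntError) -> MyError {
        MyError::Parse(err)
    }
}

// Hypothetical function: parses two coordinates and checks them against a square canvas.
pub fn parse_point(x: &str, y: &str, size: i32) -> Result<(i32, i32), MyError> {
    let x: i32 = x.trim().parse()?; // no unwrap needed
    let y: i32 = y.trim().parse()?;
    if x < 0 || y < 0 || x >= size || y >= size {
        return Err(MyError::OutOfBounds { x, y });
    }
    Ok((x, y))
}
```

Callers can match on the variants they care about, and everything inside the library gets to use ? instead of unwrap. Many libraries also add a type alias so that signatures don't have to repeat the error type.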

I also found myself asking the question, "Should this method ever fail?" Often "no" is a reasonable answer; you don't need to return Option or Result from everything. Returning a default value can make your library more ergonomic.
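As a small illustration of the "should this ever fail?" question, here is a sketch of a bitmap whose get_pixel returns a default pixel when asked for a point outside its bounds. The Bitmap and Pixel types below are simplified stand-ins I wrote for this example, not the real types from my library.

```rust
// Simplified, hypothetical pixel type; Default gives a fully transparent black pixel.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Pixel {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

// Simplified, hypothetical bitmap stored in row-major order.
pub struct Bitmap {
    width: usize,
    height: usize,
    pixels: Vec<Pixel>,
}

impl Bitmap {
    pub fn new(width: usize, height: usize) -> Bitmap {
        Bitmap { width, height, pixels: vec![Pixel::default(); width * height] }
    }

    // Writes outside the bitmap are silently ignored.
    pub fn set_pixel(&mut self, x: usize, y: usize, pixel: Pixel) {
        if x < self.width && y < self.height {
            self.pixels[y * self.width + x] = pixel;
        }
    }

    // Reads outside the bitmap return the default pixel instead of an error.
    pub fn get_pixel(&self, x: usize, y: usize) -> Pixel {
        if x < self.width && y < self.height {
            self.pixels[y * self.width + x]
        } else {
            Pixel::default()
        }
    }
}
```

Whether this is the right call depends on the library. Here an out-of-bounds read is harmless, so callers don't have to unwrap anything, but silently ignoring a failure would be a bad trade in code where the caller needs to know about it.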

Don't fear the heap

Values in Rust are stack-allocated by default. This has some significant implications, and one is that the compiler often needs to know the size of a value at compile time so that it can reserve the right amount of space on the stack. This means that you'll sometimes need to use container types (smart pointers) like Box or Rc to pass around pointers to an object that has been heap-allocated. For some reason my first instinct was to avoid these types. They seem complicated, there's a cost to heap allocation, and I tried to fight using them at first. However, there are situations in which you must use the container types like Box in order to express certain ideas, and in fact some of the coolest uses of the trait system in Rust are enabled by them.

One simple example is a Vector that contains Trait objects. In my drawing library I have a Sprite type, which is a container for other drawable objects. You can add children to the display list of a particular sprite, and it will composite all of its children into a single image via the get_pixel method.

pub struct Sprite {
    // `dyn` marks this as a trait object (the preferred spelling in the 2018 edition)
    children: Vec<Box<dyn Drawable>>,
    // ...
}

I want you to be able to add any type of object that implements the Drawable trait to the display list of a Sprite. That presents a problem, since Vec needs to know how big the objects it's supposed to hold will be, and different Drawable types have different sizes. We solve that by storing a Box in the vector, because a Box is a pointer and always has the same size no matter what it points to. This creates a second problem, though: I just forced my users to think about memory management when using the library, and I don't like that.
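Here is a self-contained sketch of the idea. The Drawable trait, the shapes, and the get_pixel signature are simplified guesses for illustration (my real types have more going on), but the shape of the solution is the same: two differently sized types living in one Vec behind Box pointers.

```rust
// Simplified, hypothetical version of the trait: a color if the point is covered, else None.
pub trait Drawable {
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32>;
}

pub struct Rectangle {
    pub width: i32,
    pub height: i32,
    pub color: u32,
}

pub struct Circle {
    pub radius: i32,
    pub color: u32,
}

impl Drawable for Rectangle {
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32> {
        if x >= 0 && y >= 0 && x < self.width && y < self.height {
            Some(self.color)
        } else {
            None
        }
    }
}

impl Drawable for Circle {
    // A circle centered on the origin.
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32> {
        if x * x + y * y <= self.radius * self.radius {
            Some(self.color)
        } else {
            None
        }
    }
}

pub struct Sprite {
    // Rectangle and Circle have different sizes, but every Box is pointer-sized.
    children: Vec<Box<dyn Drawable>>,
}

impl Drawable for Sprite {
    // Composite the children: the most recently added child that covers the point wins.
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32> {
        self.children.iter().rev().find_map(|child| child.get_pixel(x, y))
    }
}

// Builds a sprite holding one of each shape.
pub fn example_sprite() -> Sprite {
    let mut children: Vec<Box<dyn Drawable>> = Vec::new();
    children.push(Box::new(Rectangle { width: 4, height: 4, color: 1 }));
    children.push(Box::new(Circle { radius: 1, color: 2 }));
    Sprite { children }
}
```

Since Sprite implements Drawable too, a Sprite can be boxed and added as a child of another Sprite, which is what makes the display list nest.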

Hiding container types from library users

Container types can add a bit of extra cruft to your API if you're not careful. For example, with the above Sprite example, I wanted to have an add_child method that added a new Box<dyn Drawable> into children. My initial signature looked like this:

pub fn add_child(&mut self, child: Box<dyn Drawable>)

A user of the library would call the method like this:

parent.add_child(Box::new(Rectangle::new()));

This is fine, but as a user of the library I don't want to care about the fact that Sprite uses heap allocations. Vec and String are also heap-allocated, but I don't need to pass Box objects around. To clean up the API I added a method to the Drawable trait that moves the object into a Box, and it really cleans up the interface, although the signature for add_child is now a bit more complicated:

pub fn add_child<T: 'static>(&mut self, child: T)
where
    T: Drawable + Sized,
{
    self.children.push(child.into_box());
}

// as a user of the library
parent.add_child(Rectangle::new());

That's much better.
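I didn't show the into_box method above, so here is one way it could look. This is a sketch under my own assumptions rather than a copy of the library code: a default method on the trait, restricted to sized 'static types so that Drawable can still be used as a trait object. The get_pixel signature and the Rectangle fields are simplified stand-ins.

```rust
pub trait Drawable {
    // Simplified, hypothetical signature.
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32>;

    // Moves the value onto the heap and erases its concrete type.
    // The `Self: Sized` bound keeps the trait usable as `dyn Drawable`.
    fn into_box(self) -> Box<dyn Drawable>
    where
        Self: Sized + 'static,
    {
        Box::new(self)
    }
}

pub struct Rectangle {
    width: i32,
    height: i32,
    color: u32,
}

impl Rectangle {
    pub fn new(width: i32, height: i32, color: u32) -> Rectangle {
        Rectangle { width, height, color }
    }
}

impl Drawable for Rectangle {
    fn get_pixel(&self, x: i32, y: i32) -> Option<u32> {
        if x >= 0 && y >= 0 && x < self.width && y < self.height {
            Some(self.color)
        } else {
            None
        }
    }
}

pub struct Sprite {
    children: Vec<Box<dyn Drawable>>,
}

impl Sprite {
    pub fn new() -> Sprite {
        Sprite { children: Vec::new() }
    }

    // Callers pass a plain value; the boxing happens in here.
    pub fn add_child<T>(&mut self, child: T)
    where
        T: Drawable + 'static,
    {
        self.children.push(child.into_box());
    }

    pub fn child_count(&self) -> usize {
        self.children.len()
    }
}
```

With this in place the heap allocation is an implementation detail of Sprite, the same way it is for Vec and String.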

Embrace Rust's flexibility

Rust has some really neat functional-inspired features, and I sometimes found myself getting caught up trying to use a functional style. However, Rust allows you to use whatever style you're most comfortable with, and in some situations a normal for loop can be the easier way to express a concept. I also found several situations where it was easier to implement the solution with a for loop first, and then go back and change it to an iter() chain when I'd gotten it working. One example was in looping over each pixel in a bitmap and returning a flattened Vec<Pixel> in order to convert it to bytes for a PNG.

Let's say we have some code like this:

fn pixels(&self) -> Vec<Pixel> {
    let mut pixels: Vec<Pixel> = Vec::with_capacity(self.area());
    for x in 0..self.width() {
        for y in 0..self.height() {
            pixels.push(self.get_pixel(x, y))
        }
    }
    pixels
}

That's pretty easy to understand. The version using pure iterators is a little weird:

fn pixels(&self) -> Vec<Pixel> {
    // x is the outer range so the output order matches the for loop version
    (0..self.width())
        .flat_map(|x| {
            (0..self.height())
                .map(move |y| {
                    self.get_pixel(x, y)
                })
        })
        .collect()
}

I don't love that. As with Python though, the itertools library is awesome and can really clean this up. I'm taking advantage of the iproduct! macro which is a nice clean way to iterate over the Cartesian product of two iterators.

fn points(&self) -> Vec<Point> {
    iproduct!(0..self.width(), 0..self.height()) // iproduct is from itertools
        .map(|(x, y)| Point { x, y })
        .collect()
}

fn pixels(&self) -> Vec<Pixel> {
    self.points()
        .into_iter()
        .map(|point| self.get_pixel(point.x, point.y))
        .collect()
}

There are lots of working answers here. Rust is flexible: you can get the idea down in whatever style is easiest, and then come back and clean it up later.

Custom type conversions

This is just a minor tip that I keep finding uses for. Rust has a From trait (docs) that you can implement for your custom types, and it can really clean up your code in some places. Rust's standard traits feel a bit to me like the "double under" methods in Python (also called "magic methods") like __hash__ that can be added to custom classes to change how they behave. In Rust, From allows easy conversions between two types, and implementing it also gives you the matching .into() call for free.

For example, I had a situation where I was sending a Point type in my library to an algorithm (Bresenham in the code below) for line drawing. However, the line library wants a tuple of (isize, isize). I could do that conversion inline like I do for the set_pixel arguments, but instead I implemented a From<Point> for (isize, isize) block. The code is cleaner for it.

impl From<Point> for (isize, isize) {
    fn from(point: Point) -> (isize, isize) {
        (point.x as isize, point.y as isize)
    }
}

pub fn line(&mut self, start: Point, end: Point, color: Pixel) {
    // .into() works here because implementing From<Point> also provides Into<(isize, isize)>
    for (x, y) in Bresenham::new(start.into(), end.into()) {
        self.set_pixel(x as i32, y as i32, color);
    }
}

There are quite a few situations where this is really helpful, so it's useful to keep in mind.
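If you want to play with this outside of my library, here is a self-contained sketch. The Point struct is a simplified stand-in, and manhattan_length is a made-up function that plays the role of the third-party line library by accepting plain tuples.

```rust
// Simplified, hypothetical Point type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Point -> tuple, the same conversion as in the line-drawing code.
impl From<Point> for (isize, isize) {
    fn from(point: Point) -> (isize, isize) {
        (point.x as isize, point.y as isize)
    }
}

// tuple -> Point, so callers can build a Point from a pair of coordinates.
impl From<(i32, i32)> for Point {
    fn from((x, y): (i32, i32)) -> Point {
        Point { x, y }
    }
}

// Stand-in for an external API that only understands tuples.
fn manhattan_length(start: (isize, isize), end: (isize, isize)) -> isize {
    (end.0 - start.0).abs() + (end.1 - start.1).abs()
}

pub fn path_length(start: Point, end: Point) -> isize {
    // The compiler picks the conversion from the parameter types.
    manhattan_length(start.into(), end.into())
}
```

Note that I only ever implement From. The standard library has a blanket implementation that provides Into for any pair of types that has a From, which is why the usual advice is to implement From and call .into().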

What's next?

I've only scratched the surface of Rust, and I'm still learning a lot about how to get the most out of the language. There are also several areas in the code for this library where my solution feels incorrect, so I'll need to do more research there. Overall I'm enjoying Rust a lot, and I'm excited about the blend of speed and usability. Once you get past the initial learning curve Rust feels like a very productive language, and its speed unlocks the ability to work on some problems that are out of reach in higher-level languages.