What's special about Rust's Rocket framework

OvermindDL1 · 9 June 2021 20:21

Woooooooo! This is such a huge release for it, and 2 years in the making!

In short, the library is now using an updated hyper backend (not just updated, but it requires the absolute latest as of the time of this post), it’s fully async now, and lastly it runs on the Stable Releases of Rust!!

It’s an amazing library, I use it in a few personal projects. Let me give a short intro to it and why people use it:

First, you initialize a new web server like:

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![index])
}

Now the bit of magic here is the launch procmacro, but all it does is start up a tokio runtime and wrap the rest of the code in it. If you start a tokio runtime some other way then you can just start rocket directly in it.

Next, the build builds the rocket server via a builder, and you mount routes via mount, which takes a root path to mount everything at (you can call mount multiple times, or just put all your routes in a single mount, whatever you want).

Now the rocket builder supports a variety of things, a quick run-through:

Configurations: By default it loads a default set of options like hosting on localhost at port 8000 and so forth, but by calling configure (or by using custom with a provider instead of build, to create a builder straight from a configuration) you can pass in a configuration structure, or load from a file, etc… You can pass in options that are not only for rocket but user options as well, which can be used by anything running in rocket (or outside, technically, too).
Fairings: This is like middleware, you can attach a fairing type. Fairings, like any normal middleware, are called when a rocket server is built on_ignite, when the rocket server fully is serving on_liftoff, when a request comes in on_request, and when a response comes back after being served from a request on_response. However, don’t use fairings for things you would most often use middleware for in other web server libraries, rocket has far more powerful things otherwise, and more efficient. ^.^
Mounts: You can mount routes on the rocket server, it takes a base path for the routes, as well as an iterator/vec/array/whatever of routes, talked about lower.
State: You can register any data that you want by calling manage with it, and anything can access that data safely within the system (so guards, routes, whatever). Talked about more with routes. Very convenient for things like registering a database pool that you can grab DB connections from, for example.
Fallback Catchers: Like routes, you can register fallback handlers to handle things like 404 errors or whatever else. They ‘catch’ a response being returned with the HTTP status code that the catcher is registered for, so if you are, say, registering a 404 catcher then you can handle the 404 errors that pass through. The register function, like mount, also takes a prefix path to work on (so you can have different handlers registered in different paths of the system, most libraries can’t support that!) as well as an iterator/vec/array/whatever for the catchers.
Shutdown: You can even gracefully handle shutdowns by calling shutdown on it to get a handle back. When you call notify() on that handle, rocket will gracefully shut down: existing connections get to run in full (or at least up to the grace period you configured) while new connections are refused, super useful! By default rocket registers signal handlers to perform graceful shutdown when the program is requested to be killed, like via an init system or so, but you can override those if you have alternative methods.
In addition the rocket server instance has calls on it like state, routes, fairings, and other things to get all registered instances of the relevant data, useful for access outside of rocket.

Now routes, the big big thing about rocket and how it works, the unique aspect that puts it above so many other web libraries for ease of use while remaining extremely fast!

First, the basic route, just a function with a procmacro stating some options:

#[get("/")]
fn index() -> &'static str {
    "Hello, world!"
}

This is just a function that returns a string, as simple as it really gets (short of doing and returning nothing, and returning nothing is an empty 200 response). Even this little example shows so much functionality!

get in the procmacro says what type of request this is for, in this case it handles a get request when the path is /. If you register this route via .mount("/", routes![index]) then it will be called when the path is precisely just /, no trailing paths or anything, this is a precise match. You can call the procmacro multiple times if you want to, for example, handle both get and post, or any other of course.
It takes no arguments, that’s because this function doesn’t need any, we’ll get into those soon.
It returns just a normal static string slice.

Now let’s talk about this return type. Routes must return a Responder type, however the Responder trait is implemented by default on a number of standard things, and you can implement it on your own types (like to render a template super efficiently, though rocket has built-in support for a few of them with the included rocket-contrib library). A responder just has a function defined on it that takes a request and returns a response. Since Responder is implemented on &str, that function just puts the string into the response body, fills out the appropriate headers for length and so forth, and sets the return code to 200, a great and simple way to get started.

Responders are implemented on a lot of types. You can see the built-in rust types it’s implemented on in the Responder trait documentation, but it’s really simple to make your own types as well. It’s even implemented on convenient things like Result and such, so you can easily and trivially return a 200 success or some other error type for failure, making it really simple to use!
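To make the "implemented on a lot of types" idea concrete, here is a small std-only model of it. This is not Rocket's Responder trait: the Response struct, the Respond trait, the Forbidden type and the lookup function are all made up for this sketch, and the real trait also receives the request. It only shows how one trait with a handful of impls lets a handler return a plain string, nothing, an Option or a Result.

```rust
/// A stripped-down stand-in for an HTTP response (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// Hypothetical, simplified counterpart of a responder trait:
/// anything that can turn itself into a `Response`.
pub trait Respond {
    fn respond(self) -> Response;
}

// A string slice becomes a 200 with the text as the body.
impl<'a> Respond for &'a str {
    fn respond(self) -> Response {
        Response { status: 200, body: self.to_string() }
    }
}

// Returning nothing becomes an empty 200.
impl Respond for () {
    fn respond(self) -> Response {
        Response { status: 200, body: String::new() }
    }
}

// `Some(x)` responds as `x`; `None` becomes a 404.
impl<T: Respond> Respond for Option<T> {
    fn respond(self) -> Response {
        match self {
            Some(inner) => inner.respond(),
            None => Response { status: 404, body: String::new() },
        }
    }
}

// Either arm of a `Result` responds with whatever it holds.
impl<T: Respond, E: Respond> Respond for Result<T, E> {
    fn respond(self) -> Response {
        match self {
            Ok(value) => value.respond(),
            Err(error) => error.respond(),
        }
    }
}

/// An error type that picks its own status code (made up for the sketch).
pub struct Forbidden;

impl Respond for Forbidden {
    fn respond(self) -> Response {
        Response { status: 403, body: "forbidden".to_string() }
    }
}

/// A handler-like function: success is a string, failure is `Forbidden`.
pub fn lookup(id: u32) -> Result<&'static str, Forbidden> {
    if id == 1 { Ok("first item") } else { Err(Forbidden) }
}
```

Calling lookup(1).respond() gives the 200 with the body, and lookup(2).respond() gives the 403, without the handler ever building a response by hand.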

Now let’s look at more complicated routes in increasing complexity, focusing on the procmacro and the function signature and ignoring the body for now:

#[get("/world")]
fn whatever() { /* .. */ }

So this just accepts a path that only is the <basepath> of the mount and /world after it, so if registered via .mount("/", routes![whatever]) then it would only accept paths of /world, or .mount("/blee", routes![whatever]) would only accept exact paths of /blee/world.

#[post("/")]
fn whatever() { /* .. */ }

Takes a post request only at a path / after its base.

#[get("/hello/<name>")]
fn whatever(name: &str) { /* .. */ }

Now this takes a single path segment, names it name, and maps it to an argument of the same name in the argument list, so calling /hello/blah would have the string blah in the name argument of the whatever function. It does not cascade to deeper paths, so something like /hello/blah/blorp won’t match this route at all!

#[get("/hello/<name>/<age>/<cool>")]
fn whatever(name: &str, age: u8, cool: bool) { /* .. */ }

This takes 3 path entries, and names them as you’d expect on the arguments, however notice the argument types! name will of course be a string, its type is &str after all so that’s expected. Now see age is a u8, this has the FromParam trait already implemented on it by default (as it is for a LOT of built-in types, and of course you can implement it on your own types too!), so it calls FromParam::from_param(..) on the source string to try to convert it to a u8, and if it fails the route doesn’t match! So if you call /hello/blah/42/true then that will match, but /hello/blah/bleep/true will pretend like this route doesn’t even exist, because bleep couldn’t be converted, which means you can do a fallback, like:

#[get("/hello/<name>/<age>/<cool>")]
fn whatever(name: &str, age: u8, cool: bool) { /* .. */ }

#[get("/hello/<name>/<age>/<cool>")]
fn whatever2(name: &str, age: &str, cool: bool) { /* .. */ }

And now both of the path examples will match: the one with 42 will match the first, otherwise it falls back to the second route that takes the age as a string. Except it won’t, because how does rocket know which should be tried first? Both routes have the same shape, so they get the same default rank, and rocket treats that as a collision and refuses to launch rather than guess. You state the order yourself, right at the route function where it is clearest, with the rank argument to the procmacro, like this:

#[get("/hello/<name>/<age>/<cool>")]
fn whatever(name: &str, age: u8, cool: bool) { /* .. */ }

#[get("/hello/<name>/<age>/<cool>", rank = 2)]
fn whatever2(name: &str, age: &str, cool: bool) { /* .. */ }

Routes are tried in increasing rank order, so lower ranks are tried first! You can even pass on to another route from ‘inside’ your route call by returning a Forward response to force a delegation to another route even after yours was matched but you end up finding out it shouldn’t be.

Now, I used 2 for the above rank. Rocket assigns default ranks somewhere between -12 and -1 depending on how ‘specific’ the route is, so anything from 0 up, or from -13 down, is good to choose for custom ranks. If you are curious how the default rank is chosen you can consult the chart in the rocket docs, but in short the more specific matches (precise strings, etc…) get a lower rank, and the more general or argument matches get a higher one, which is generally the default everyone wants anyway, so it’s unlikely you’ll ever need to override it for those reasons, but you still can if you want. Rocket can of course print out all your routes in matching order showing their ranks and all.
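The try-in-rank-order behaviour can be modelled in a few lines of plain Rust. This is a sketch of the idea only, not how Rocket's router is written: the Outcome enum, the Route struct, the typed and fallback handlers and the ranks 1 and 2 are all chosen just for this illustration.

```rust
/// Outcome of offering a request path to one route (simplified model).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Handled(String),
    Forward,
}

/// A route in this model: a rank plus a function that handles or forwards.
pub struct Route {
    pub rank: i32,
    pub handler: fn(&[&str]) -> Outcome,
}

/// Like `age: u8`: forwards unless the age segment parses as a `u8`.
fn typed(segments: &[&str]) -> Outcome {
    match segments {
        ["hello", name, age, cool] => match (age.parse::<u8>(), cool.parse::<bool>()) {
            (Ok(age), Ok(cool)) => Outcome::Handled(format!("typed: {} {} {}", name, age, cool)),
            _ => Outcome::Forward,
        },
        _ => Outcome::Forward,
    }
}

/// Like `age: &str`: accepts any text for the age, but still needs a boolean.
fn fallback(segments: &[&str]) -> Outcome {
    match segments {
        ["hello", name, age, cool] => match cool.parse::<bool>() {
            Ok(cool) => Outcome::Handled(format!("fallback: {} {} {}", name, age, cool)),
            Err(_) => Outcome::Forward,
        },
        _ => Outcome::Forward,
    }
}

/// The two routes, deliberately listed out of order: rank decides, not position.
pub fn demo_routes() -> Vec<Route> {
    vec![
        Route { rank: 2, handler: fallback },
        Route { rank: 1, handler: typed },
    ]
}

/// Try routes in increasing rank order; the first one that does not forward wins.
pub fn dispatch(routes: &[Route], path: &str) -> Option<String> {
    let mut ordered: Vec<&Route> = routes.iter().collect();
    ordered.sort_by_key(|route| route.rank);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    for route in ordered {
        if let Outcome::Handled(body) = (route.handler)(&segments) {
            return Some(body);
        }
    }
    None // nothing matched: this is where a 404 catcher would step in
}
```

With these routes, /hello/blah/42/true is handled by typed, /hello/blah/bleep/true forwards past it and lands in fallback, and /hello/blah/42/maybe is handled by neither.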

As for path matchers, if you want to match a path segment but don’t actually care what it is and aren’t going to use it, then you can just name it _ as is common in rust for ‘ignore this’:

#[get("/hello/<_>")]
fn whatever() { /* .. */ }

If you want to match multiple paths then you can have the last path argument’s name end with .. like:

#[get("/hello/<names..>")]
fn whatever(names: PathBuf) { /* .. */ } // PathBuf is std::path::PathBuf

All the paths at the end after any prior matches get put into a normal rust PathBuf type, with all the normal functionality as you’d expect on it, very safe, no path traversal attacks, etc…

And of course if you want to accept any paths after something but don’t care about them then you can ignore those too with <_..>.
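For a feel of what "no path traversal attacks" means for the multi-segment case, here is a std-only sketch that builds a PathBuf from trailing segments and refuses suspicious ones. The safe_path function and its rule set are assumptions made for this example; Rocket's own checks are more thorough (percent-decoding is skipped here entirely, for instance).

```rust
use std::path::PathBuf;

/// Build a relative `PathBuf` from the trailing URL segments, refusing anything
/// that could escape the directory it will later be joined onto.
/// The rules below are simplified for this sketch, not Rocket's exact list.
pub fn safe_path(rest: &str) -> Option<PathBuf> {
    let mut path = PathBuf::new();
    for segment in rest.split('/').filter(|s| !s.is_empty()) {
        let suspicious = segment.starts_with('.') // ".", ".." and hidden files
            || segment.contains('\\')             // Windows-style separators
            || segment.contains(':');             // drive letters and the like
        if suspicious {
            return None; // in a router this would mean "route does not match"
        }
        path.push(segment);
    }
    Some(path)
}
```

A request for a/b/c.txt yields that relative path, while ../etc/passwd or a/../../secret yields None, so the handler never even sees the dangerous input.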

So far we have seen how routes match paths, but query arguments are just as important:

#[get("/hello?name=blah")]
fn whatever() { /* .. */ }

This will only match when there is a query argument that exists named name with the value blah.

#[get("/hello?name=<name>")]
fn whatever(name: &str) { /* .. */ }

This will only match when there is a query argument named name, and it will give you its value in the function argument name, as that’s what I called it to match on. However this is wordy, so if the query name is the same as the argument name, as it is in this case, you can just do:

#[get("/hello?<name>")]
fn whatever(name: &str) { /* .. */ }

And yes you can have more!

#[get("/hello?<name>&<age>&<cool>")]
fn whatever(name: &str, age: u8, cool: bool) { /* .. */ }

And yes, as you can see the types can be non-strings as well, as long as they implement FromFormField, which is like FromParam but operates on queries, which have more powerful ‘from’ characteristics. And yes, you can accept multiple query arguments of the same name too, just accept them in a container type, like Vec:

#[get("/hello?<names>")]
fn whatever(names: Vec<&str>) { /* .. */ }

And if you want it to optionally exist without specifying multiple routes, then wrap it in an Option:

#[get("/hello?<name>")]
fn whatever(name: Option<&str>) { /* .. */ }

However, the type could also potentially consume more query arguments than just one, like what if we want to accept, say, a login form that fills in a whole structure for us? Well we can do that!

#[derive(FromForm)]
struct LoginData<'r> {
  login: &'r str,
  password: &'r str,
  timeout: usize,
}

Here I define a structure that derives FromForm, you can then use it like any other argument:

// Don't ever actually pass passwords as `get` arguments, this is purely an example
#[get("/login?<login>")]
fn whatever(login: LoginData<'_>) { /* .. */ }

With the default derivation this will accept a path like /login?login.login=name&login.password=hunter2&login.timeout=42, and yes, if the timeout in this case is not a number that fits in a usize then the route won’t match and it will fall back through the ranks as normal. You can pass options to FromForm to rename fields or other things, or you can implement it manually to access the data however you wish! It makes it super easy to parse form data!

However, you’ll notice it puts everything inside a login query, so you write things like login.login in the query. What if you want the fields at the top level? Well, you can collect any unmatched query arguments in a hashmap or so, but you can also stuff them into your own type like:

// Don't ever actually pass passwords as `get` arguments, this is purely an example
#[get("/login?<login..>")]
fn whatever(login: LoginData<'_>) { /* .. */ }

Because of the .. after login it will take any remaining unmatched query arguments (all in this case since we don’t have any other query names) and put them into the given type, or LoginData in this case (or the route fails to match). You can mix and match named arguments and dynamic arguments and ‘rest’ arguments and all at once as well! You can even nest forms inside other forms. ^.^

Forms/queries in rocket are extremely powerful, far more so than what I’m showing here with validation and other features as well, it’s extremely powerful and yet still extremely simple to use, the types tend to ‘just work’.
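To see the difference between the nested form (login.login=...) and the top-level ‘rest’ form in isolation, here is a std-only sketch of the parsing idea. It is hand-written for illustration and is not what the FromForm derive generates: all_values and parse_login are hypothetical helpers, the struct owns its strings for simplicity, and percent-decoding is left out.

```rust
#[derive(Debug, PartialEq)]
pub struct LoginData {
    pub login: String,
    pub password: String,
    pub timeout: usize,
}

/// Collect every value for `key` in a raw query string such as `a=1&a=2&b=3`.
/// This is the shape a `Vec` argument would receive.
pub fn all_values<'q>(query: &'q str, key: &str) -> Vec<&'q str> {
    query
        .split('&')
        .filter_map(|pair| {
            let mut parts = pair.splitn(2, '=');
            match (parts.next(), parts.next()) {
                (Some(k), Some(v)) if k == key => Some(v),
                _ => None,
            }
        })
        .collect()
}

/// Parse a `LoginData` from fields named `<prefix>login`, `<prefix>password`
/// and `<prefix>timeout`. A prefix of "login." models the nested case and an
/// empty prefix models the top-level case.
/// Returns `None` when a field is missing or `timeout` is not a number,
/// which is the point where a route would fail to match.
pub fn parse_login(query: &str, prefix: &str) -> Option<LoginData> {
    let get = |name: &str| -> Option<String> {
        let key = format!("{}{}", prefix, name);
        all_values(query, &key).first().map(|value| value.to_string())
    };
    Some(LoginData {
        login: get("login")?,
        password: get("password")?,
        timeout: get("timeout")?.parse::<usize>().ok()?,
    })
}
```

parse_login("login.login=name&login.password=hunter2&login.timeout=42", "login.") and parse_login("login=name&password=hunter2&timeout=42", "") produce the same struct, and a timeout of "soon" makes either one return None.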

Now what if you want to parse out the body of a request, like a post or so, well that’s just:

#[post("/", data = "<input>")]
fn whatever(input: Form<LoginData<'_>>) { /* .. */ }

The data type, input in this case, can be anything that implements FromData. The Form wrapper (rocket::form::Form) decodes a form-encoded body into any FromForm type, however you can wrap it in another decoder instead. For example rocket also comes with json, so to accept json bodies you’d wrap it like:

#[post("/", data = "<input>")]
fn whatever(input: Json<LoginData>) { /* .. */ }

And of course you can also make a type (or just use Either) to be able to accept multiple different types of data if you want to as well in the order you wish!

However, you may notice that this could potentially let a client, a malicious one say, pass in a lot of data. Well, by default rocket has pretty small limits, like I think it’s 32kb for bodies. You can change that in the configuration, or you can even override it on individual routes, like if you want to stream the body in then you can control the size it allows, though if the configured amount is fine you could also just do:

#[post("/", data = "<input>")]
async fn whatever(input: TempFile<'_>) { /* .. */ }

And here the input will be streamed to a temporary file on the filesystem, a common pattern for webservers to take in, say, file uploads or otherwise larger data. This will still be restricted to the configured body size limit however. If you want to overcome the limit, or you want to process the data in full in code without it going to the filesystem, then you can stream it like:

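#[post("/", data = "<data>")]
async fn whatever(data: Data<'_>) -> std::io::Result<()> {
    // `kibibytes()` comes from the rocket::data::ToByteUnit trait
    data.open(512.kibibytes())
        .stream_to(tokio::io::stdout())
        .await?; // `?` needs a Result return type, hence io::Result<()>
    Ok(())
}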

Like here it takes a Data stream, and we only allow reading up to 512 KiB, overriding the configured default (you always put in a limit, however large you may want to allow, good practice), and in this case it just gets streamed to stdout. Notice these last two examples require async (other things you do may as well) as the code will run asynchronously because of delays in IO processing. Normally this is behind the scenes like with the other examples, but sometimes you have async code yourself, which will likely be super common with database work for example, which gets us to Guards now!
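The "always put in a limit" habit is easy to show without any web framework. The following is a synchronous, std-only sketch of a capped read using Read::take; read_capped is a hypothetical helper written for this post, not a Rocket API, and Rocket's own version is async.

```rust
use std::io::{self, Read};

/// Read at most `limit` bytes from `source`.
/// Returns the bytes that were kept and whether the whole input fit under the limit.
pub fn read_capped<R: Read>(source: R, limit: u64) -> io::Result<(Vec<u8>, bool)> {
    let mut buffer = Vec::new();
    // Ask for one byte more than allowed so an oversized body can be detected
    // without ever buffering the rest of it.
    let mut limited = source.take(limit.saturating_add(1));
    limited.read_to_end(&mut buffer)?;
    let complete = buffer.len() as u64 <= limit;
    if !complete {
        buffer.truncate(limit as usize); // drop the single probe byte
    }
    Ok((buffer, complete))
}
```

Reading b"hello world" with a limit of 5 returns b"hello" and false, so the caller can decide whether a truncated body is an error or acceptable.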

Guards are other arguments to a route function that aren’t handled by something like a query parameter or whatever, such as:

#[get("/world")]
async fn whatever(db: &State<DatabasePool>) { /* .. */ }

The State type is how you access a managed state put on to the main rocket instance. You give it the type to look up in the rocket instance, so in this case I grabbed the database pool type, and so I can make a transaction on it or whatever. You can access as many arguments as you wish, and there are a lot of things beyond just State. There is also request-local state, which is specific to just the request being handled. This is generally so that things like fairings can pass data to their response handler, or so guards can register some shared cache or something.

You can grab a cookie guard like:

#[get("/world")]
fn whatever(cookies: &CookieJar<'_>) { /* .. */ }

And that lets you access cookies (both public and encrypted/private), get, set, and delete them, see what cookies are pending to be set or removed, etc…

You can get an IpAddr to get the IP address of the request, can get SocketAddr to get the actual remote IP address of the request (usually a proxy or whatever you may have set up). You can wrap any other guard up in a Result or Option type to conditionally get it if they fail. You can get headers, you can get all kinds of things. And most importantly, you can easily define your own Guards! To implement a guard you just implement the FromRequest trait on your type, where you get direct access to the request and server instance to get state, get or set request-local state, access data, options, configuration, other guards (yes you can cascade guards!), and all kinds of other things.

Like you could make, say, an ApiKey guard that verifies that an API key both exists on the request and that it is correct. You can make a User guard that verifies someone is logged in, maybe even an AdminUser that just wraps the User while also making sure they are an admin. This is the main replacement of middleware, and makes it so guards are only created when necessary instead of all the time, and gives you very useful and easy to access information back.
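Here is a std-only sketch of how such guards stack on each other. It is a model, not Rocket's FromRequest trait (which is async and returns an Outcome rather than a plain Result): the Request struct, the Guard trait, the header names and the hard-coded "secret" key and "alice" admin are all invented for the example.

```rust
use std::collections::HashMap;

/// Minimal request model for the sketch: just headers.
pub struct Request {
    pub headers: HashMap<String, String>,
}

/// Helper to build a request from header pairs.
pub fn request_with(pairs: &[(&str, &str)]) -> Request {
    let headers = pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect();
    Request { headers }
}

/// Simplified guard trait: build `Self` from the request or refuse with a status code.
pub trait Guard: Sized {
    fn from_request(req: &Request) -> Result<Self, u16>;
}

/// Succeeds only when the key header exists and is correct.
pub struct ApiKey(pub String);

impl Guard for ApiKey {
    fn from_request(req: &Request) -> Result<Self, u16> {
        match req.headers.get("x-api-key") {
            None => Err(400),
            Some(key) if key.as_str() == "secret" => Ok(ApiKey(key.clone())),
            Some(_) => Err(401),
        }
    }
}

/// Succeeds when someone is "logged in" (here: an x-user header is present).
pub struct User {
    pub name: String,
    pub is_admin: bool,
}

impl Guard for User {
    fn from_request(req: &Request) -> Result<Self, u16> {
        let name = req.headers.get("x-user").ok_or(401u16)?;
        Ok(User { name: name.clone(), is_admin: name.as_str() == "alice" })
    }
}

/// Wraps `User`: first the inner guard must pass, then the admin check.
pub struct AdminUser(pub User);

impl Guard for AdminUser {
    fn from_request(req: &Request) -> Result<Self, u16> {
        let user = User::from_request(req)?;
        if user.is_admin { Ok(AdminUser(user)) } else { Err(403) }
    }
}
```

A handler that takes an AdminUser never has to check the login or the role itself; by the time it runs, both have already passed.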

Which also gets to: when routes are being tested and you have multiple candidate routes that share a guard, it would be inefficient to rebuild that guard for every attempt. Rocket’s request-local state covers this: a guard can store its result on the request, whether it succeeded or not, and any later attempt on the same request reuses it instead of recomputing it, keeping efficiency as high as possible!
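The compute-once-per-request idea can be modelled with a cell stored on the request. In Rocket the tool for this is request-local state (Request::local_cache); the sketch below is only a std-only imitation of the idea, with a made-up CachedRequest type, a fake token check and a counter so the effect is observable.

```rust
use std::cell::{Cell, RefCell};

/// A request that remembers the outcome of an expensive guard (hypothetical type).
pub struct CachedRequest {
    token: Option<String>,
    cached_user: RefCell<Option<Result<String, u16>>>,
    /// Counts how often the expensive lookup really ran.
    pub lookups: Cell<u32>,
}

impl CachedRequest {
    pub fn new(token: Option<&str>) -> Self {
        CachedRequest {
            token: token.map(|t| t.to_string()),
            cached_user: RefCell::new(None),
            lookups: Cell::new(0),
        }
    }

    /// Stand-in for a database or session lookup.
    fn expensive_lookup(&self) -> Result<String, u16> {
        self.lookups.set(self.lookups.get() + 1);
        match self.token.as_deref() {
            Some("t0ken") => Ok("alice".to_string()),
            Some(_) => Err(403),
            None => Err(401),
        }
    }

    /// The guard: the first call does the work, later calls reuse the stored
    /// result, no matter whether that result was a success or a failure.
    pub fn user(&self) -> Result<String, u16> {
        let cached = self.cached_user.borrow().clone();
        if let Some(result) = cached {
            return result;
        }
        let result = self.expensive_lookup();
        *self.cached_user.borrow_mut() = Some(result.clone());
        result
    }
}
```

Calling user() twice on the same request leaves lookups at 1, and that holds for a failed lookup too, which is exactly what you want when several routes are tried in a row.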

A guard can also require the lifetime of the Request if you so wish (that’s what the '_ is in things like TempFile<'_>), meaning that the type is bound to that request, and if you don’t make it copyable/cloneable then you know for sure that it can only be associated with that one request, great for security guards!

There’s lots of other capabilities and the rocket-contrib has a lot of prebuilt guards and responses and all as well, like a couple different template engines (including one that compiles into code at compile-time for the fastest possible templates) and a couple database types (though I prefer using the external sqlx so far). But there’s so much power while remaining so clear and so efficient and safe!

AstonJ · 9 June 2021 22:54

Great post ODL! I’ve bookmarked it and will come back to it when I’m learning Rust

dimitarvp · 10 June 2021 01:17

Wow! Amazing! Instant bookmark. I’ve read it top to bottom and I’m very excited about this. Thought of making a small test project with actix_web but you’ve won me over for rocket.

AstonJ · 10 June 2021 01:34

I made it a pinned thread for the #rocket portal too, based on ODL’s post alone

Hmmm, wonder if we should split the thread from ODL’s post and call it ‘What’s great about Rocket v0.5.0!’ (S’up to you ODL - just say if you’d like that)

OvermindDL1 · 14 June 2021 15:47

Most of my post was just about rocket in general, the 0.5.0 bit just adds the new and very useful async functionality. ^.^

Whatever you think is appropriate? ^.^

AstonJ · 14 June 2021 22:14

Ok done - I think it warrants a thread of its own

(Feel free to edit the title as you see fit)

SmithyTT · 22 June 2021 23:27

Seconded!