So. Hello! I’m Jack. And as it may or may not be obvious, this is my first blog post. I don’t want to detract from the contents of this post too much, but I figured it’s at least worth a brief explanation as to why I’m making it.
This is a response to the Rust call for blog posts for 2021. Now, I do co-lead the traits working group and have contributed to Chalk a fair bit over the past year. But, I do want to put a disclaimer that the points I may make in this post are my own and I’m not speaking in any “official” capacity here. I wanted to make this post in case others find my thoughts useful. And, frankly, I figured it would be fun.
Well, onto the contents of the post itself.
Okay, let’s start with some background. My day job is actually being a graduate student studying bioinformatics. In this field, it’s really a mixed bag of programming languages that you see. Python and bash/shell scripts are extremely common for data munging, processing, etc. Python also is used often for machine learning. R is used often for plotting, but also some data analysis. Anytime you need more raw performance, you see a lot of C or sometimes C++. In the work I do, I also heavily use Javscript/Typescript for web development, with a mixing of Kotlin. I use Rust for one (unreleased) project, but that’s all so far.
A lot of the libraries and software for bioinformatics are used because other people use them; they’ve been peer-reviewed and published, and as more people use them, the more of the de facto standard they become. This isn’t bad necessarily: there are plenty of libraries and software that are well-maintained, with easily-accesible source code, that does exactly what you need it to do. But, there are also plenty of libraries and software that isn’t well-maintained, the source is hard to find, or you need it to do something different.
Now, you might be asking: “how does this relate to Rust?” Well, in a way, it does and it doesn’t. In some sense, the conclusion from the past couple paragraphs is “I want more libraries and software for bioinformatics written in Rust.” But why? If you think the answer is “because I like the language and like writing in it”, you would be partially correct. But, really, it’s because that I think Rust has a lot to offer here:
Python and bash are generally okay for data munging and processing, but sometimes you really just want some type safety. R is generally excellent for plotting, but…well I’m not even going to try to list all the issues here. As far as data analysis tools written in R: have you ever had to hunt down documentation, unsure of what exactly that function does? Something like rustdoc would be great here. Performance-sensitive libraries and software are where I feel like Rust will shine the most; I would probably be preaching to the choir here if I started to list off reasons why. And finally, for web development related work, particularly server-side, it would be great to have better predictability for the memory and CPU requirements needed.
Eek, that was a longer background than I expected. I guess the tldr is: I like Rust and I think it’s a language that offers a lot of promise for many different types of applications. But, at least for bioinformatics, there’s more work to be done to build up libraries.
Now it’s time for the meat and potatoes: what I want to see from the Rust programming language in 2021. I already mentioned in my background that I feel like more libraries are needed in Rust, particularly around bioinformatics. But, in my opinion, that will come in time, and there’s not much that can be done other than making the language something that the people writing those libraries want to write in.
What I do want to talk about can maybe be summed up into a single word: stability. This is very much a loaded word, but in this post, I’ll talk about two separate parts of this.
Maturity as stability
I recently saw a reddit post of someone asking about Rust’s stability. But, in their post, they also said a different word: mature. What exactly is a mature language? A mature library? Something that doesn’t add many new features? Well, then surely C++ wouldn’t be called mature then (since there are numerous new features in C++20, for example). One that doesn’t have a lot of bugs? Well, good luck.
One way you could imagine maturity is like this: If the maintainers of the language or library stopped working on it today, how bad would that be. Under this definition, you can imagine something like the C language being very mature; the core language hasn’t change much in a long time. A library that has been written and does everything it’s supposed to without (more than perhaps a couple) bugs is probably considered mature. C++11? Probably mature. C++20? Probably not. Maturity is relative and it’s subjective.
Maturity in a language is important. If you ask the question: “is this language going to be around in 20 years”, the list of languages that you would feel a confident “yes” for is probably small. In my opinion, Rust isn’t quite there, yet. But what are some things that can be done to bring us there?
First, let’s think about the standard library. I want to make sure before I go forward that the number of different APIs in std doesn’t exactly indicate it’s maturity or not: a standard library with “all the batteries” doesn’t mean it’s stable and a standard library without them doesn’t mean it’s not. Instead, std maturity is more about having better knowledge of what programming patterns are used often enough that they should be in std; and importantly, what shouldn’t be. There are plenty of unstable library features where it’s up in the air whether they should be stabilized or removed. And there are plenty of “wanted” features that just aren’t in std yet. This is not to say there won’t always be some API that someone wants, but I do feel like we’re still not quite “mature”. Quick point though, I would say that 98% of the current std is excellent and probably falls under the “mature” category.
Next, we can think about language features. There are number of features people
want to see: GATs, const generics, specialization all come to mind. In some form
or another, these each have partial implementations. There are other areas of
the language where the design hasn’t even been decided on (see the RFC
repository). Maturity here means pushing these language features through. It
means stabilizing pieces that work (e.g.
min_specialization). It means working through the design of interacting with
bits and pieces of the language that today are somewhat “magic” (e.g.
traits/vtables or dynamically sized types). Of course, this can’t all happen in
2021, but that’s okay. It’s much better to get these designs right than to get
them fast. In 20 years, it isn’t going to matter if we had these features in
2021 or 2023.
Finally, libraries. A lot of libraries right now are
0.*; i.e. they aren’t
committing to semver compatibility. Part of this is due to the previous two
points, but part of it isn’t. I would really like to see more libraries hit
1.0, but again this will come in time.
Librarification as stability
Before I continue with this, I do want to point out that Niko Matsakis previously wrote a great blog post covering some of this earlier this year. As he put it, “the basic idea is to refactor the compiler into a set of independent libraries, all knit together by the query system.” In my view, this is super important. In the blog post, he covers two points:
First, librarification can mean that tools can use the components of the rust compiler itself to do analysis. For example, rust-analyzer could share the type checker with rustc. This is great because it means that the tools to do static analysis, code refactoring, etc. all become better or easier to implement. You wouldn’t have to worry “am I covering all the types here” because the types you use are the types the compiler uses.
More importantly, in my opinion, librarification also means that each individial library is more accesible than everything as a whole. The rustc codebase as it is now is…intimdating. There’s been a huge surge of work fairly recently to help make it more accesible, but at least for now, the compiler is still all together. Making the compiler more accessible to contributors is especially important when you consider Rust in 20 years. It’s a much better idea to have people be able to pick up parts of the compiler and contribute than to get stuck thinking of the compiler as a monolith. This in particular is near and dear to my heart, because we have had so much success with this with Chalk; this past year, we’ve had so much support, especially from people that don’t work on the rustc codebase at all. I feel like it’s a model that really works well.
Let’s talk about Chalk though. For a quick summary for those who may not now: Chalk is an implementation and definition of the Rust trait system using a PROLOG-like logic solver. The plan is to eventually integrate it into the compiler as the trait solver. Currently, there is a very experimental integration into rustc. My goal for Rust in 2021 is to get this integration to a place that people can start playing with it. I’m definitely not committing to anything here, but I’m super hopeful for this. And, I would love to start seeing other bits of the compiler start to be split into libraries.
Phew. That ended up being a lot. There’s more I could say. There’s so much praise I could give to the Rust language and the people that contribute their time and energy to it, whether it be to the compiler, the standard library, the docs, the libraries, the infrastructure, the community. At the end of the day, I want to see this language grow. I want to see it mature. And ultimately, 2021 is going to be another year of that. It’s been 5 years of Rust, so here’s to another 20+ more. 🥂