Pietro Albini explores how the Rust type system promotes correctness and simplifies refactorings, and how procedural macros minimize code repetition while introducing parallelism and enhancing tooling.
Pietro Albini, actively engaged with the Rust project, contributes to the Infrastructure Team, Release Team, and Security Response WG. His current role is at Ferrous Systems as the technical lead of Ferrocene, aimed at integrating Rust into safety-critical applications.
Software continues to reshape the world. QCon London is dedicated to enriching the software development community by disseminating knowledge and spurring innovation. Tailored for technical team leads, architects, engineering directors, and project managers, QCon champions progressive development leadership.
Albini remarks on Rust’s rapid ascension in the programming community. Initially a modest research initiative by Mozilla to elevate Firefox’s development and performance, Rust has blossomed since its stable release nine years ago into a robust endeavor supported by 300 individuals, blending volunteer efforts with corporate backing, all dedicated to the evolution of Rust.
Rust is increasingly being embraced globally, with developers showing a strong preference for the language. According to the Stack Overflow Developer Survey, Rust has reigned as the most beloved programming language for eight consecutive years. The inclination towards Rust mainly derives from its foundational goal of ensuring memory safety, a critical issue in low-level languages such as C or C++. Unlike these languages, Rust prevents common vulnerabilities like buffer overflows, use-after-frees, and null pointer dereferences, drastically reducing the potential for security breaches. Microsoft identified that approximately 70% of the vulnerabilities in their applications stem from memory safety issues, problems that Rust’s safeguards inherently address.
Many may wonder about the relevance of Rust when high-level languages like Java, Python, JavaScript, Go, and Ruby, which are inherently memory safe, are already in use. The answer lies not only in Rust’s memory safety but also in its ability to optimize performance while maintaining excellent tooling and developer experience. For instance, uv, a Python package manager implemented in Rust, performed up to 115 times faster than pip in benchmarks. Such performance benefits are pivotal, as seen in projects like Home Assistant, where switching to the uv package manager saved 215 hours of CI time monthly thanks to Rust’s efficiency.
In my upcoming talk, I’ll explore how Rust can be leveraged within your projects or organizations and discuss why its adoption might be beneficial. My name is Pietro, and I’ve been deeply involved with the Rust project for several years, focusing on infrastructure, release, and security response to ensure reliable delivery of Rust. Previously, I led the Rust infrastructure team and was a member of the Rust Core team for two years. Professionally, I serve as a technical lead at Ferrous, where we specialize in developing Rust-based solutions for safety-critical applications in sectors like automotive and aerospace.
I want to start with the elephant in the room: Rust is not an easy language to learn. If you look online, you’ll see people everywhere saying it’s hard to learn. That is true, because Rust forces you to use a different programming model. One of the core pillars of Rust, the reason it can ensure memory safety, is the concept of single ownership of data: a value can only be owned by a single part of your code at any given point in time. You cannot have ownership spread between multiple parts of your code.
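A minimal sketch of what single ownership looks like in practice, including lending out a reference (the names here are made up for illustration):

```rust
// Borrow `s` rather than taking ownership; the caller keeps the value.
fn describe(s: &str) -> String {
    format!("{} has {} bytes", s, s.len())
}

fn main() {
    let name = String::from("Rust"); // `name` owns the String

    println!("{}", describe(&name)); // lend out a shared reference

    let owner = name; // ownership moves to `owner`
    // println!("{}", name); // would not compile: `name` was moved

    println!("{}", owner); // the single current owner can still use the value
}
```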
You can then lend out references if you want other parts of your code to access the data. The problem is, most programming languages don’t enforce that. If you come to Rust, you’re going to hit a big wall of having to internalize this new model, having to rethink how you architect your software. Once you do that, Rust will click, and you will be able to use it productively and take advantage of all of its benefits without slowing down your team. Google, a large user of Rust that has been rewriting more of its services in it, ran an internal survey recently.
This was announced by Google’s director of engineering for Android at a conference: Rust teams at Google are as productive as the ones using high-level programming languages like Go, and more than twice as productive as teams using C++. You can get all of this after you learn Rust, without slowing down your team.
There is also another reason why I think Rust is hard to learn, something that every person trying Rust, me included, is guilty of: we all make Rust harder to learn for ourselves, because Rust allows you to squeeze out every bit of performance. Rust offers all of the tools to create reliable and efficient software, but using them requires some of the parts of Rust that are harder to learn. If you want to learn Rust, my recommendation is: don’t start by writing the most efficient algorithm possible, even though Rust tempts you to do that. Start by writing normal code.
As you become more acquainted with Rust and its efficiency-enabling concepts, you can delve deeper into the language. It’s important not to optimize prematurely to avoid complicating the learning process. Remember, you are not learning Rust in isolation; numerous supportive communities are available online to assist you. There’s abundant access to both free and paid resources, including books and commercial training to enhance team skills. Rust further aids developers with its compiler, which effectively acts like a collaborative partner in coding.
Indeed, the Rust compiler is akin to having an experienced senior developer at your side. Rust emphasizes delivering clear and actionable error messages, setting new standards for industry compilers. Other compilers are now taking cues from Rust, aiming to improve their error messaging—an area that the Rust team is deeply invested in. Should you encounter a less clear compiler error, it’s viewed as a bug worthy of reporting to and being rectified by the Rust community.
Consider a typical compiler error related to the ownership model, which emphasizes single ownership. If an error occurs because data is used in several places, Rust’s compiler will clarify the issue—showing where the data was moved previously and suggesting where to rethink your code, the data’s type, and potential refactoring techniques. It will even tell you whether the data type supports cloning to create a copy, a practical solution if the performance trade-off is acceptable.
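A sketch of the situation the compiler describes, with the suggested clone applied (the Config type here is hypothetical):

```rust
// Config implements Clone, so the compiler can suggest cloning on a move error.
#[derive(Clone)]
struct Config {
    url: String,
}

// Takes ownership of `config`; a second use of the original would not compile.
fn consume(config: Config) -> usize {
    config.url.len()
}

fn main() {
    let config = Config { url: String::from("localhost") };
    // Calling consume(config) twice would fail: the compiler points at the
    // first call as the place the value moved, and suggests cloning instead.
    let first = consume(config.clone()); // explicit copy, trade-off accepted
    let second = consume(config);        // the last use may take ownership
    assert_eq!(first, second);
}
```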
Rust’s confidence in its compiler errors is so robust that some of them can be corrected automatically: the cargo fix command applies the compiler’s own suggestions directly to your source code. Moreover, Rust includes Cargo, its official build system, which simplifies project builds and manages dependencies. This is a welcome change for developers coming from C or C++, where no universally adopted build tool exists, while feeling familiar to those used to high-level language package managers. Cargo supports dependency management from both private and public sources, notably the crates.io registry, and adding a new dependency is as simple as adding a line to your project’s Cargo.toml file.
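For instance, a minimal Cargo.toml pulling in one dependency from crates.io might look like this (the package name is hypothetical; the version is the usual shorthand for the latest 1.x release):

```toml
[package]
name = "my-app"     # hypothetical project name
version = "0.1.0"
edition = "2021"

[dependencies]
serde = "1"         # fetched from the crates.io registry
```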
Rust comes equipped with a variety of essential tools including a static analyzer named Clippy, the rustfmt code formatter to maintain consistent coding style across projects, comprehensive IDE integration via rust-analyzer suitable for several text editors, and rustup, a tool for managing Rust versions which facilitates installing and updating Rust, along with handling multiple Rust versions for different projects.
Now, let’s explore the practical application of Rust, and how it enables writing code that is both efficient and maintainable. Our focus will be on two aspects: utilizing macros and the type system to enhance robustness without compromising efficiency, and harnessing concurrency to boost performance.

Starting with the former, Rust incorporates a powerful feature known as macros to automate repetitive coding tasks. This allows, for instance, the creation of a macro to define multiple functions from a single template, or to write reusable patterns that minimize code redundancy. Rust offers two varieties of macros: declarative macros, which are written directly within the code and share similarities with preprocessor macros found in other languages, and procedural macros, which function as code generators that integrate directly with the Rust language.

Our emphasis will be on procedural macros, specifically on derive macros, which generate default implementations for Rust traits, akin to interfaces or type classes in other programming languages. While some traits require bespoke implementations for each type owing to varying behaviors among objects, others benefit from a sensible default implementation that can be customized if necessary. For straightforward cases, a simple default method suffices, but more intricate scenarios require generating a tailored default based on the structure of the data.
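As a minimal sketch of the declarative kind, a macro_rules! macro can stamp out several near-identical functions from one template (the function names and values here are made up for illustration):

```rust
// A declarative macro: each invocation expands into a full function definition.
macro_rules! make_getter {
    ($name:ident, $value:expr) => {
        fn $name() -> u32 {
            $value
        }
    };
}

// Two functions generated from the same template.
make_getter!(answer, 42);
make_getter!(zero, 0);

fn main() {
    assert_eq!(answer(), 42);
    assert_eq!(zero(), 0);
}
```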
Consider the example of the Clone derive. The Clone trait in Rust enables object duplication. Implementing Clone is relatively straightforward: simply clone each field within your struct recursively to achieve a full copy of the data. By applying the derive(Clone) attribute to your type, Rust’s compiler engages the code generator to produce an optimal Clone implementation automatically.
Then you can simply use the clone method. This is achieved without runtime reflection; indeed, since Rust does not support reflection, you could not use it anyway. The macro evaluates your type and composes optimal code specifically for your requirements. Looking behind the scenes of the macro, it’s apparent that it constructs the Clone trait implementation for your type, defining a clone method in which each individual field is replicated. This reflects precisely the approach you would adopt if you were implementing Clone manually.
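A sketch of the derive in use, alongside roughly what it expands to (written by hand here for a second struct so both forms are visible):

```rust
// The derive asks the compiler to generate the Clone impl at compile time.
#[derive(Clone, PartialEq)]
struct Point {
    x: i32,
    label: String,
}

// Roughly what the derive expands to, hand-written for an identical struct:
struct Point2 {
    x: i32,
    label: String,
}

impl Clone for Point2 {
    fn clone(&self) -> Point2 {
        // Each field is cloned recursively, exactly as you would by hand.
        Point2 {
            x: self.x.clone(),
            label: self.label.clone(),
        }
    }
}

fn main() {
    let p = Point { x: 1, label: String::from("origin") };
    let copy = p.clone();
    assert!(p == copy);
}
```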
In Rust, this principle is referred to as zero-cost abstractions. Despite the term, it does not imply that these abstractions are cost-free; instead, they impose no performance penalty. Essentially, these abstractions are as rapid, efficient, and effective as code that you could manually optimize for a specific type. The strength of this feature lies in its capacity to simplify your code using abstractions without sacrificing efficiency.
Clone is a built-in derive available across all Rust toolchains. It is inherently accessible in your code and requires no additional effort to utilize. Moreover, you can create custom derives tailored to your project needs or leverage derives from third-party sources, among which Serde is noteworthy. Serde is a comprehensive serialization and deserialization framework prevalent within Rust, renowned for its user-friendliness and often considered the standard. Most Rust libraries are compatible with Serde, which is the recommended choice, except in specific scenarios where it might not be ideal. With Serde, initiating serialize and deserialize derives at the beginning of your type generates highly optimized serialization and deserialization routines.
This process avoids any intermediary stages, such as storing the data in a dynamic map accessible through strings, which could incur runtime and accuracy costs due to the need to verify correct string usage. The logic constructed by Serde translates data directly into the format required (e.g., JSON) and integrates seamlessly into your structs and type representations. Being a macro, it refrains from using reflection and instead generates code at compile time, which is then thoroughly optimized by the compiler for your specific type. This results in a zero-cost abstraction, offering fully functional deserializers and serializers with performance equivalent to manually crafted ones, but without the headache of maintaining manual serialization for every type.
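Serde’s actual generated code is more involved, but a hand-written sketch of the kind of serializer a derive produces for one struct conveys the idea: the fields go straight into the output format, with no intermediate map (this toy version skips JSON string escaping, which real Serde handles):

```rust
struct Message {
    id: u32,
    body: String,
}

// Roughly the shape of code a Serialize derive would generate for this type:
// the fields are written directly into the target format.
impl Message {
    fn to_json(&self) -> String {
        format!("{{\"id\":{},\"body\":\"{}\"}}", self.id, self.body)
    }
}

fn main() {
    let m = Message { id: 7, body: String::from("hello") };
    assert_eq!(m.to_json(), "{\"id\":7,\"body\":\"hello\"}");
}
```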
How do we manage various forms of data? Consider an example where different data types are required based on the message type in our application. In Rust, this is effectively handled using enums. Unlike in many programming languages where enums are simply a set of named constants, Rust’s enums can include data. This concept, known as algebraic data types or sum types, is not unique to Rust but is prevalent in numerous functional languages, such as Scala and Haskell. Enums in Rust enable each variant to carry its own distinct data.
For instance, consider a basic Enum outlining your application’s database configuration options: SQLite or Postgres. The configurations for these are distinctly different—SQLite uses a file path while Postgres requires a URL along with a username and password. Enums allow this specific information to be encapsulated within each respective Enum variant.
This arrangement ensures that you cannot accidentally create a SQLite configuration with a username and password meant for Postgres, thereby preventing any possibility of an invalid state. This property not only aids in code maintainability but is also a feature sorely missed when working in languages devoid of this capability. Enums’ necessity becomes apparent once you begin using them, making you wish for their presence in every programming scenario.
Moreover, Enums enforce a check on the data type before you can access the underlying information—preventing errors such as attempting to extract a password from a SQLite configuration that does not possess one. The safest method to verify the data’s integrity and relevance is through pattern matching, which ensures the data’s structure matches your expectations before any operations are conducted on it.
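A sketch of such an enum, with pattern matching guarding field access (the variant and field names are hypothetical):

```rust
// Each variant carries only the data that makes sense for it, so a SQLite
// configuration with a username and password is simply unrepresentable.
enum DatabaseConfig {
    Sqlite { path: String },
    Postgres { url: String, username: String, password: String },
}

// Pattern matching forces us to check which variant we have before touching
// its fields; there is no way to read a password out of a Sqlite config.
fn password(config: &DatabaseConfig) -> Option<&str> {
    match config {
        DatabaseConfig::Sqlite { .. } => None,
        DatabaseConfig::Postgres { password, .. } => Some(password),
    }
}

fn main() {
    let sqlite = DatabaseConfig::Sqlite { path: String::from("app.db") };
    assert_eq!(password(&sqlite), None);
}
```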
Pattern matching extends across many programming languages, offering a robust method for handling complex data structures by mapping them to variables based on their pattern. This feature isn’t unique to Rust; it has recently been introduced in Java and Python and is being considered for inclusion in C++. A practical example of pattern matching in Rust can be illustrated with a match statement that determines how to connect to various types of databases based on their configurations.
In Rust’s approach, patterns are processed sequentially. To illustrate, suppose we are dealing with an SQLite database stored entirely in memory. The pattern matching will initially check if the database variant is SQLite and then verify if it is configured to use memory as storage. If both conditions are satisfied, Rust executes the function designated for ephemeral storage. If the conditions are not met, the matching process moves to the next pattern where it checks again for an SQLite database without specific path constraints, allowing for a broader match that still ensures type safety and correctness of the operations involved.
If the SQLite conditions fail, the matching process continues, this time checking for a Postgres database. If it finds a match, it fetches and binds specific internal fields to variables used for database connection and authentication. This hierarchical checking ensures that the operations on data align precisely with its defined structure, showcasing pattern matching’s capability to encapsulate complex checks succinctly. Rust enforces that all potential variants are handled in match statements, thus preventing any oversight in handling data variants. For scenarios where not all cases need explicit handling, a default pattern can be specified.
If a variant is omitted, such as Postgres in an earlier scenario, Rust’s match checks will flag this as an error, indicating the missing pattern and prompting a review of the data structure definition or suggesting possible extensions to the existing patterns. This emphasizes Rust’s focus on maintaining exhaustive and correct data handling, reducing runtime errors and increasing code reliability.
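The sequential matching described above can be sketched as follows, reusing a hypothetical config enum; note that deleting the Postgres arm would make the match non-exhaustive and fail to compile:

```rust
// Hypothetical config enum from the discussion above.
enum DatabaseConfig {
    Sqlite { path: String },
    Postgres { url: String, username: String, password: String },
}

// Patterns are tried top to bottom, and the match must cover every variant.
fn connect(config: &DatabaseConfig) -> String {
    match config {
        // First: an SQLite database using in-memory storage.
        DatabaseConfig::Sqlite { path } if path == ":memory:" => {
            String::from("opening ephemeral in-memory database")
        }
        // Then: any other SQLite database, matched by its path.
        DatabaseConfig::Sqlite { path } => {
            format!("opening sqlite file {}", path)
        }
        // Finally: Postgres, binding the fields needed to connect.
        // Removing this arm would be a compile-time error reporting the
        // missing `Postgres` pattern.
        DatabaseConfig::Postgres { url, username, .. } => {
            format!("connecting to {} as {}", url, username)
        }
    }
}

fn main() {
    let memory = DatabaseConfig::Sqlite { path: String::from(":memory:") };
    println!("{}", connect(&memory));
}
```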
Procedural macros in programming can do just about anything, from verifying database queries at compile time to auto-generating table names. Enums are incredibly versatile and can represent state machines that eliminate invalid states, ensure data layout integrity, and assist in maintaining efficiency and performance while enhancing code maintainability.
Maintaining efficient and organized code is one thing, but what about boosting performance? Leveraging concurrency is key, especially in modern environments where multicore systems are the norm, ranging from servers with 128 cores to everyday mobile phones with multiple cores. Rust programming facilitates this through its capacity for parallel processing. Consider the example of Rust iterators, commonly used to manage data in functional programming. A basic iterator might handle a dataset using a potentially slow function, impacting overall program performance if the data set is large and the function is slow. Enter Rayon, a third-party Rust library that supports parallel iterators.
By simply importing Rayon and switching from iter to par_iter, Rust can execute data transformations across multiple threads effortlessly. While parallel computing exists in other languages like Java with parallel streams, implementing it can be daunting due to potential concurrency-related issues in older, non-parallel codebases. However, Rust’s approach to concurrency is distinctive. It offers what is known as fearless concurrency, providing confidence that the parallel code is free from common errors like data races and that adding parallelism to legacy codebases will not introduce complex issues.
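A sketch of the sequential pipeline, with the Rayon variant shown only in a comment since Rayon is a third-party crate (the expensive function here is a stand-in):

```rust
// Stand-in for a slow, per-item function.
fn expensive(n: u64) -> u64 {
    n * n
}

// Sequential version using standard iterators.
fn process(data: &[u64]) -> Vec<u64> {
    data.iter().map(|&n| expensive(n)).collect()
    // With the third-party Rayon crate, the parallel version is a one-line
    // change: data.par_iter().map(|&n| expensive(n)).collect()
}

fn main() {
    assert_eq!(process(&[1, 2, 3]), vec![1, 4, 9]);
}
```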
Let’s examine how this functions in a real-world scenario. For simplicity, we start with some data encapsulated in an Rc, a reference-counting type that tracks how many copies of the data exist so it can be deallocated when no longer needed. It contains a RefCell, which allows us to bypass certain Rust restrictions and modify the data over time. Within this setup, we create a closure that processes the data and send it to a separate thread.
Attempting such an operation, however, will result in a compilation failure, as the program harbors a significant concurrency flaw: Rc is not designed for thread safety. The way it manages reference counts is not atomic, leading to potential data races and unexpected behaviors. The compiler, recognizing this issue because Rc does not implement the Send trait, blocks compilation. The Send trait, typically implemented automatically by the compiler, marks a type as safe to transfer to another thread, and is determined by checking whether all components of the type are themselves Send. If any component, like Rc in this case, fails to meet this criterion due to internal thread-unsafe mechanisms, the type itself is deemed non-sendable.
To resolve this, replacing Rc with Arc – a type identical in function but utilizing atomic operations for thread safety – is necessary. The distinction between Rc and Arc is critical as atomic operations involve higher processing costs. By segregating these types, Rust allows use of Rc generally until thread-safety concerns are flagged by the compiler, prompting a switch to Arc for those specific instances to manage performance impacts effectively. If this amended code is compiled, a new error emerges because RefCell, providing mutable data access across different code parts, also lacks thread safety characteristics.
If multiple threads attempt to alter the same data simultaneously, a data race occurs, which is a complex issue to debug. The RefCell type in Rust is not thread safe, so the compiler will flag any attempt to share it across threads. To ensure thread safety, one must instead use a synchronization primitive such as a Mutex or RwLock, which allows safe concurrent access by multiple threads. Replacing RefCell with such a locking mechanism resolves these issues, making the code thread safe.
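Putting the fixes together, a minimal sketch of the thread-safe version using Arc and Mutex from the standard library:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Rc<RefCell<_>> would not compile here: Rc is not Send and RefCell is not
// Sync. Arc (atomic reference counting) and Mutex are the thread-safe
// replacements the compiler steers you toward.
fn parallel_count(threads: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The data is only reachable through the lock guard.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *counter.lock().unwrap();
    result
}

fn main() {
    assert_eq!(parallel_count(4), 4);
}
```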
In a simplified example, the issue might seem obvious to an experienced Rust developer. Consider a large, outdated codebase where adding parallelism to critical sections could enhance performance. Rust’s compiler aids in this by automatically verifying the thread safety of types involved in concurrent operations. This capability arises from the requirement that data types involved in such operations must implement the Send and Sync traits. Consequently, Rust supports robust parallelism, enabling fearless concurrency due to its protective compiler.
Rust introduces a unique approach to concurrency by integrating locks with data protection. Unlike other programming languages where locks typically secure code blocks, Rust’s locks are data-centric. Data is encapsulated within locks, so access is controlled by the lock state. The Rust type system enforces this by allowing access to the data only when it is securely locked, preventing any access attempts when unlocked. This mechanism ensures tight control over data integrity in concurrent environments.
This is not actually valid Rust code; I created the unlock method to make the idea clear to understand. Notice that you still have to prevent deadlocks yourself: the Rust compiler will not stop your program from compiling if you create a deadlock. Still, a deadlock, while not the easiest thing to debug, is at least easy to detect: you can tell something is wrong. Data races that corrupt data, maybe 1 in 1000 times when two threads access it at the same time, are far harder to detect. Rust prevents those through Send and Sync.
The way locks are designed enables fearless concurrency. It enables you to add parallelism to your code, to squeeze every bit of performance from the hot path of your application that does most of the data processing, without having to worry about it. Rust has even more to offer: there is native support for async/await, which becomes even more interesting the more you look at it, and a powerful type system beyond enums that allows you to design extremely maintainable software.
Rust makes it really easy to interoperate with other languages. If you want to take advantage of Rust, you can, but you don’t have to rewrite all of your code or your legacy application that was developed over 20 years. Rust was designed to slowly replace the parts of your code base that could benefit the most, maybe the parts that would benefit from concurrency, or the parts that need performance without the risk of memory safety issues. That is the approach Mozilla took to introduce Rust in Firefox; after all, Rust was originally created by Mozilla to improve Firefox.
When Mozilla launched Firefox Quantum in 2017, they rewrote the Firefox CSS engine in Rust. This led to a 30% speedup in loading amazon.com. This was not because the old code was inefficient; it was because the old code was single-threaded. It was a legacy C++ application, and Mozilla deemed it basically impossible to add parallelism to it without all of the protections and confidence that Rust gives you. With Rust, Mozilla was able to replace just a small part of Firefox, add parallelism to it, and get such a big speedup. There are multiple ways to interoperate with Rust.
The most universally adaptable method across various programming languages involves using the C Foreign Function Interface (FFI), as most modern languages can interact with a C library. Rust supports this by allowing the creation of a C interface for libraries or program segments; this enhances interoperability between Rust and other components of an application. The ecosystem around this is robust, featuring tools that simplify the creation of bindings and decrease necessary boilerplate coding.
An outstanding example of such integration tools is PyO3, which facilitates the development of Python native modules using Rust. This capability is demonstrated in a ready-to-use PyO3 module that defines a new function to sum numbers and return the result as a string. After compilation, this module functions seamlessly within Python environments, offering all the benefits of Rust, including safety from the memory-related issues typically encountered in C.
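Going back to the raw C FFI route, a minimal sketch of exposing a Rust function over the C ABI (in a real project this would be built as a cdylib; here a main function exercises it directly):

```rust
// Exposing a Rust function through the C ABI. Built as a cdylib, a C caller
// (or Python, via ctypes) would see a plain `int add(int, int)` symbol.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Exercised directly here; real consumers would link against the symbol.
    assert_eq!(add(2, 3), 5);
}
```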
I recommend evaluating which parts of your existing code could gain significantly from Rust. It is crucial not to resort to its application indiscriminately but to strategize its integration where it provides maximum advantage. Rust’s utility extends beyond just enhancing memory safety; it also boosts programming efficacy and performance in projects involving higher-level languages.
A distinctive factor behind Rust’s success is not solely its proficiency in parallel computing or memory safety, but also its capability to bridge previously isolated facets of the programming landscape. Rust introduces substantial innovations to low-level programming, enabling developers to sidestep security flaws, enhance operational safety, and enjoy modern development environments and tooling that are advantageous to programmers universally.
Higher-level programmers, perhaps previously intimidated by the complexities of C and C++, now find Rust an accessible tool to maximize performance. It does not compromise on tooling or developer experience, making it ideal for crafting reliable and efficient software. This is why Rust continues to be the most loved programming language year after year.
In your experience, who learns Rust more easily: an expert C++ developer or a junior developer?
I believe there isn’t much difference in learning effectiveness between the two. Rust necessitates a different style of programming than C++. An expert in C++ might have to unlearn some old habits, yet they might also grasp more readily the reasons behind Rust’s safeguards. The real determinant of ease in learning Rust is more about an individual’s approach to learning than their prior expertise.
Where do you see Rust not being used?
Albini explains that for simple, quick scripting tasks like CI setup, Rust may not be the best choice due to its slow compilation time, although efforts are being made to improve this. Rust, being a relatively young language, has areas of varying maturity within its ecosystem. This affects where and how effectively it can be applied. For instance, in game development, Rust shows potential through several successful projects but lacks the comprehensive ecosystem that languages like C++ offer.
In terms of UI development, though there are libraries to create aesthetically pleasing interfaces, they do not yet match the maturity of established frameworks such as Qt. The decision to use Rust depends on individual assessment of the current ecosystem’s capabilities and whether it meets specific needs or merits contribution to its development. The appropriateness of adopting Rust thus varies significantly with the context of its application.
Participant 3 raises a query about procedural macros in Rust, noting their rarity in other programming languages and questioning potential applications where they might be necessary, as opposed to regular functions.
Albini: A place where functions cannot be used but procedural macros can is, for example, the Clone implementation we saw before, because that implementation depends on knowing which fields the struct has. It depends on knowing the shape of the data you want to clone, information you don’t have access to at runtime. Even if Rust had reflection support, it would probably be unacceptable from a performance standpoint in a lot of places. Procedural macros really shine where the code you need to generate changes depending on the shape of your data.
A good way to think about it is that procedural macros are what you would use where you would reach for reflection in other languages. If in Java you wanted to use reflection, in Rust you have to use procedural macros. On one hand this is worse, because procedural macros are harder to write: you need to actually parse the struct and then generate code from it, rather than just invoking a reflection method. On the other hand, they bring the efficiency and maintainability benefits I mentioned before.
Sep 10, 2024