Writing a Compiler and a Virtual Machine in Rust

First of all, I want to give credit to Crafting Interpreters: whenever I googled something, it had the best explanation of the topic.

Second, this is my personal journey and my experience of it, so please remember that your experience will surely be different.

Motivation

First, I wanted to brush up my Rust skills, as my last big endeavor with Rust was two years ago, so I decided to write something low level to match the domain where Rust excels. The second reason is that I’ve always been fascinated by virtual machines and thought writing my own would be a fun thing to do.

The Journey

I was really humbled by how complicated it is to write a virtual machine and a high-level programming language that works with it.

This made me feel how lucky we are to have all these high-level programming languages to use. They abstract away a lot of complicated concepts and make it really easy to develop something while focusing on the problem we want to solve, not fighting the programming language to do our bidding.

I implemented just enough programming language concepts to get my benchmark to run, since I already had an AST version with full OOP features, and I used Python as the baseline to test my implementation against.

The Experience

The Good

  • With Rust you get a nice collection of zero-cost abstractions out of the box in the standard library, which C does not offer, so I was grateful for a couple of things, namely Vec, as in Vec::<Local>::new(). It is a nice-to-use growable array like C++’s std::vector, but its methods make more sense.
  • The HashMap implementation in Rust is also solid enough for most use cases and gets the job done. Knowing it is a zero-cost abstraction makes me feel safe that I am not taking a big performance hit if I need to use it.
  • The match expression and its requirement to handle every variant of the matched value helped me avoid a ton of errors when a value has some edge case.
  • The macro system is amazing. #[derive(Debug, Clone, Copy, PartialEq, PartialOrd)] is a big time saver: I was able to use the new enums I created right away without implementing these traits for them by hand.
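To make the points above concrete, here is a minimal sketch that combines Vec, HashMap, an exhaustive match and the derive line. The OpCode enum, its variants and the helper functions are hypothetical names made up for this example, not the ones used in the actual project.

```rust
use std::collections::HashMap;

// Hypothetical opcode set for illustration; not the project's real enum.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
enum OpCode {
    Push,
    Add,
    Mul,
}

/// Net change in stack depth caused by a single instruction.
fn stack_effect(op: OpCode) -> i32 {
    // No wildcard arm: if a new variant is added to OpCode,
    // this match stops compiling until the new case is handled.
    match op {
        OpCode::Push => 1,
        OpCode::Add => -1,
        OpCode::Mul => -1,
    }
}

/// Stack depth after running a whole sequence, starting from an empty stack.
fn final_depth(code: &[OpCode]) -> i32 {
    code.iter().map(|op| stack_effect(*op)).sum()
}

fn main() {
    // Vec grows on demand, much like a C++ vector.
    let mut code: Vec<OpCode> = Vec::new();
    code.push(OpCode::Push);
    code.push(OpCode::Push);
    code.push(OpCode::Add);
    code.push(OpCode::Push);
    code.push(OpCode::Mul);

    // HashMap used as a simple table of global variables.
    let mut globals: HashMap<String, i64> = HashMap::new();
    globals.insert("answer".to_string(), 42);

    // Debug comes from the derive, so the Vec can be printed directly.
    println!("{:?} leaves {} value(s) on the stack", code, final_depth(&code));
    println!("answer = {:?}", globals.get("answer"));

    // PartialOrd from the derive orders variants by declaration order.
    println!("Push < Add: {}", OpCode::Push < OpCode::Add);
}
```

Leaving out the wildcard arm is a deliberate choice here: it is what turns a forgotten variant into a compile error instead of a runtime surprise.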

The Bad

  • Rust is opinionated about the use of pointers (references) in general. While this helps, it requires a lot of restructuring of an existing code base when migrating to Rust from C/C++.
  • Rust pushes for everything to be explicit, for example copying values, making variables mutable, handling errors and heap allocation. This makes the code quite a bit longer. While it improves readability, it forces very big functions for what is rather simple functionality in other languages.
  • A lot of the low-level optimizations that you can do in C/C++ require the unsafe keyword. One example is turning a u16 back into an enum that is represented as u16, which is a reasonable thing to want and will hopefully be handled better in the future, for example: let opcode = unsafe { std::mem::transmute(5 as u16) };
  • Reusing heap memory is forbidden in safe Rust: I cannot allocate memory for something on the heap and mess with the underlying byte representation. It is possible to cast and re-cast some types, like an enum, but a struct, for example, needs to be serialized. This makes the code longer and less readable, and including serde brings along a bunch of other dependencies, making compilation much slower and increasing the binary size.
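For the last two points, one possible alternative that stays in safe Rust and uses only the standard library is sketched below: a hand-written TryFrom<u16> in place of transmute, and to_le_bytes / from_le_bytes in place of serde. The OpCode values and the 4-byte Instruction layout are assumptions made up for this example, and it is more verbose than the transmute one-liner, which is exactly the trade-off described above.

```rust
use std::convert::TryFrom;

// Hypothetical opcodes; the numeric values are invented for this sketch.
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpCode {
    Constant = 0,
    Add = 1,
    Return = 2,
}

impl TryFrom<u16> for OpCode {
    type Error = u16;

    // Safe replacement for transmute: unknown values are returned as Err
    // instead of producing an invalid enum.
    fn try_from(raw: u16) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(OpCode::Constant),
            1 => Ok(OpCode::Add),
            2 => Ok(OpCode::Return),
            other => Err(other),
        }
    }
}

// Hypothetical layout: 2 bytes of opcode followed by 2 bytes of operand.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Instruction {
    op: OpCode,
    operand: u16,
}

impl Instruction {
    /// Encode as little-endian bytes without any external crate.
    fn to_bytes(self) -> [u8; 4] {
        // A fieldless enum converts to its integer value with a plain `as` cast.
        let op = (self.op as u16).to_le_bytes();
        let operand = self.operand.to_le_bytes();
        [op[0], op[1], operand[0], operand[1]]
    }

    /// Decode from bytes; fails with the raw value if the opcode is unknown.
    fn from_bytes(bytes: [u8; 4]) -> Result<Self, u16> {
        let op = OpCode::try_from(u16::from_le_bytes([bytes[0], bytes[1]]))?;
        let operand = u16::from_le_bytes([bytes[2], bytes[3]]);
        Ok(Instruction { op, operand })
    }
}

fn main() {
    let inst = Instruction { op: OpCode::Add, operand: 7 };
    let bytes = inst.to_bytes();
    println!("{:?} encodes to {:?}", inst, bytes);
    println!("decoded: {:?}", Instruction::from_bytes(bytes));
    println!("unknown opcode: {:?}", OpCode::try_from(99u16));
}
```

The match in try_from has to be kept in sync with the enum by hand, so this is safer than transmute but not free to maintain.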

The Ugly

  • I won’t talk much about this one, but here is how I got the short (u16) representation of the enum with the help of the official documentation: unsafe { *<*const _>::from(self).cast::<u16>() }. I am not even sure that this follows Rust syntax, but it is definitely not readable.
  • I wanted to do a lot of mutable operations on a value that is only used in one place, but since I had to pass it as a function parameter, the borrow checker refused the mutable reference, forcing me to copy it back and forth even after I had dropped the reference entirely.
  • Starting to meddle with lifetimes makes a huge mess if you are working with other members on a project, so opting out of using them is the straightforward solution.
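The text does not say which exact pattern tripped the borrow checker, so the following is only a sketch of one shape that usually works without copying and without explicit lifetimes: each helper takes a short-lived &mut borrow that ends when the call returns. The Scope struct and both helper functions are hypothetical.

```rust
// Hypothetical compiler state; the field names are illustrative only.
#[derive(Debug, Default)]
struct Scope {
    locals: Vec<String>,
    depth: usize,
}

// The mutable borrow lasts only for the duration of this call.
// Returning an index instead of a reference keeps lifetimes out of the signature.
fn declare_local(scope: &mut Scope, name: &str) -> usize {
    scope.locals.push(name.to_string());
    scope.locals.len() - 1
}

fn begin_block(scope: &mut Scope) {
    scope.depth += 1;
}

fn main() {
    let mut scope = Scope::default();

    // Each call borrows `scope` mutably and releases it on return,
    // so the value can be reused right away without being copied.
    let a = declare_local(&mut scope, "a");
    begin_block(&mut scope);
    let b = declare_local(&mut scope, "b");

    println!("slots {} and {}, state: {:?}", a, b, scope);
}
```

The trouble typically starts when a helper returns a reference into the borrowed value and that reference is kept alive across the next mutable call; returning an index or an owned value, as above, sidesteps that case.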

Coming from C/C++ and Go, Rust definitely feels like an upgrade. Although you have to give up some of the low-level optimizations of C/C++, Rust improved the stability of the code and eliminated a whole class of bugs that are really hard to debug.

Conclusion

You can find the compiler and virtual machine in the C-Aurora repo, and if you are curious about the tree-walking interpreter (the AST implementation), it is available in the Aurora repo.

Rust has its place and its use cases, but I believe you will have to trade some optimizations and shorter code for memory safety and better error handling when coming from C/C++.

For mission-critical services in web applications and tightly memory-constrained embedded systems, I can see Rust bringing a lot of benefits, but I don’t believe Rust will be able to replace C/C++ in low-level, on-the-metal situations in the near future.

I enjoyed writing a virtual machine, a parser and a compiler for a programming language. There are a bunch of things to learn from studying these areas, and it makes you appreciate the tools you use on a daily basis more.

The speed of the virtual machine and compiler is nothing to brag about. Even though my programming language only reached a speed five times slower than Python in a Fibonacci calculation, I consider it a success to have learned all these Mystical Arts.


Comments

3 responses to “Writing a Compiler and a Virtual Machine in Rust”

  1. Needless

    Hi. Could you point interested readers to some of the reference materials you consulted while writing the virtual machine?

  2. evomassiny

    Congrats on trying Rust with such a nice project!

    You mention having a hard time casting an enum to its u16 representation. Did you need it for comparing “precedence” in a Pratt parser?
    If so, you don’t need to get the actual u16 value of your type: you can #[derive(PartialEq, Eq, PartialOrd, Ord)] for your enum type and use it directly for the comparison.
