Type-checking Erlang and Elixir

The BEAM community couldn’t be more varied when it comes to opinions about static type systems. For some, they’re the feature of other functional languages that we miss the most. Others shun them and choose our ecosystem precisely because, not despite the fact that, it doesn’t impose the perceived overhead of types. Still others worry whether static types could be successfully applied on the Erlang virtual machine at all.

Over the years, there’s been some academic research into type checking Erlang. Even WhatsApp worked for a few years on inventing a new statically typed Erlang dialect. There were also numerous community attempts at building ML-inspired statically typed languages. However, for quite a while the only project that reached wider adoption was Dialyzer.

Until now! The lineup of Code BEAM 2023 sports five different talks about static type systems. Is the community fed up with its long-time “let it crash” adage? Or is this many talks too much of a good thing? Let’s have a closer look at the current state of type checking on the BEAM.

Erlang, Elixir, Gleam, PureScript

Before we begin, though, let’s get one thing out of the way. There are a few relatively new programming languages on the BEAM platform that offer static type checking from the get-go. For the sake of completeness, I’ll list them here, but as of writing they still seem to be niche – though they certainly don’t lack potential! Here comes the list:

Gleam – probably the best known and most actively developed one, Gleam targets both JS (browsers) and Erlang backends. The compiler is written in Rust, the language has a lively community, and, in general, it seems to have the biggest potential!

PureScript – a hidden gem, known either as a Haskell dialect targeting JS and the browser, or as a more complex sibling of Elm, PureScript has a complete Erlang backend! It allows writing code with the expressiveness of Haskell while leveraging the legendary robustness of the BEAM.

Hamler – PureScript’s younger brother with some syntax modifications more closely resembling Erlang. Based on the GitHub repository, though, the development seems to have stopped some 2-3 years ago.

Caramel – an OCaml-inspired language compiling to Erlang. It aimed to bring OCaml’s strictness to the BEAM. Sadly, the project is discontinued with Leandro Ostera, the author, pointing at Gleam as the spiritual successor.

Alpaca – another attempt at bridging the gap between ML and BEAM. The project, though, was discontinued in 2019.

With that out of the way, let’s focus on tools we can use in Erlang or Elixir. But how can we even compare programming languages and their type systems?

Type system design 101

I don’t want to quote too much academic material here, but I especially like the short introduction to type systems presented in “Bidirectional Typing” by Jana Dunfield and Neel Krishnaswami. If you’re already familiar with the theory, feel free to skip this section.

> Type systems serve many purposes. They allow programming languages to reject nonsensical programs. They allow programmers to express their intent, and to use a type checker to verify that their programs are consistent with that intent. Type systems can […] even [be used] to guide program synthesis.

On top of that, I’d add that type systems can make it significantly easier to build IDEs that provide good language support, code completion, and focused and detailed error messages, increasing development velocity.

Dynamic or static? It doesn’t matter!

Every programming language has some system of preventing errors. However, not all these systems are the same. A crucial property a programming language should help enforce is “program safety” (it’s certainly not the only one). It’s about whether I as a programmer can do something completely stupid, like divide by zero. Can I use a variable that was never instantiated? Can I store a number, yet later try to use it as text? Ideally programming languages should not allow any of that.

One major distinction between such error-prevention systems is when they kick in to prevent the nonsensical operation. This might happen when the program is already running, just before the operation happens at runtime. We call programming languages using such systems “dynamically checked”, or just “dynamic”. They might also do it as soon as we run the compiler, way before a program is even deployed, not to mention being started. These are called “statically checked”, “statically typed”, or sometimes just “static”. Such an approach is called static analysis. Strictly speaking, only those latter – statically checked – programming languages use formal type systems.
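
To make the timing difference concrete, here’s a minimal Python sketch (Python being a dynamically checked language; the function names `shout` and `call_safely` are made up for illustration). The type annotation on `shout` is not enforced at runtime, so the mistake only surfaces when the offending line actually runs. A static checker such as mypy would be expected to reject a direct call like `shout(42)` before the program ever starts.

```python
def shout(text: str) -> str:
    """Upper-case a piece of text. The annotation is only a hint at runtime."""
    return text.upper()


def call_safely(value) -> str:
    """Call shout() and report what the dynamic check did, if anything."""
    try:
        return shout(value)
    except AttributeError as error:
        # The dynamic check fires only now, when .upper() is looked up on an int.
        return f"runtime error: {error}"


print(call_safely("hello"))  # HELLO
print(call_safely(42))       # the error is reported only at this point
```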

Statically typed languages have historically been more verbose and less expressive than dynamically typed ones. Dynamically typed languages have usually been easier to learn and quicker for delivering working code when starting from scratch, but they also make it easier to shoot oneself in the foot. They have been on the rise since the 2000s, when programming for the web took off. However, with the size of computer systems growing, it’s becoming evident that building buggy systems fast is not the way to go. The question is: how can we make programming languages more expressive and more correct at the same time? It’s not dynamic vs static, it’s expressiveness vs correctness.


Solving the equation

In basic algebra classes we learn how to solve linear equations like:

x + 3 = 5

We know that with a single equation it’s possible to solve for only one variable. If x was also given, there would be nothing to solve – we could still check if the equation holds, though.

A type system’s task is similar, but the equation it solves looks different:

Γ ⊢ e : T

Γ is a typing context, that is the set of all our variables together with their types
e is our expression
T is its type

If we know all three, we’re talking about “type checking”. It’s relatively easy, similar to checking if an equation holds.

If we only know Γ and e, we’re talking about “type inference”. We’re trying to come up with a type of our expression e. Type inference is one of the techniques used to improve expressiveness and reduce verbosity of statically typed languages. The type system has more work to do in this case as it has to fill in the types that we, as programmers, didn’t have to provide. It saves us work at the cost of extra computation. However, that computation is hard, way harder than just type checking, and in some cases it’s just impossible. In practical type systems, inference is often undecidable unless the programmer provides at least some type annotations.

If we know Γ and T and want to solve for e, we’re talking about program synthesis also known as code generation. This can be useful, for example, for implementing data serialization or in some cases just to save us some typing. Code generated from types usually is not complete, so it requires details to be filled in by a programmer.
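
As a loose illustration of solving for e given a type, the following Python sketch generates the source of a serializer function from a record-like type. The `User` type and the `synthesize_serializer` helper are hypothetical, and real program synthesis tools are far more sophisticated; this only shows the direction of the derivation, from type to code.

```python
from dataclasses import dataclass, fields


@dataclass
class User:  # hypothetical example type
    name: str
    age: int


def synthesize_serializer(cls) -> str:
    """Generate the source of a <type>_to_dict function from a dataclass's fields."""
    lines = [f"def {cls.__name__.lower()}_to_dict(value):", "    return {"]
    for field in fields(cls):
        # One dictionary entry per field declared in the type.
        lines.append(f"        {field.name!r}: value.{field.name},")
    lines.append("    }")
    return "\n".join(lines)


source = synthesize_serializer(User)
namespace = {}
exec(source, namespace)  # compile the generated code into a callable
user_to_dict = namespace["user_to_dict"]

print(source)
print(user_to_dict(User("Joe", 42)))  # {'name': 'Joe', 'age': 42}
```

The generated function covers only the mechanical part. Anything the type doesn’t say, such as how to format a date field, would still have to be filled in by hand.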

In practice, one of the common approaches to constructing type checkers is “bidirectional typing”, which means that when a checker traverses our program it alternates between “checking” and “inference” modes depending on the available type information. This approach makes inference possible in some situations where it otherwise wouldn’t be, yet still allows programmers to omit the vast majority of tedious-to-write type annotations.
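
The alternation between the two modes can be sketched in a few dozen lines of Python. This is a toy checker for a tiny lambda calculus, not the algorithm of any tool discussed here; the term classes (`Lit`, `Var`, `Lam`, `App`, `Ann`) and the representation of types (the strings "int" and "bool" plus a `Fun` class) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fun:
    """Function type arg -> res; base types are the strings "int" and "bool"."""
    arg: object
    res: object


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Ann:
    term: object
    annotation: object


def infer(ctx, term):
    """Inference mode: compute a type for `term` from the context alone."""
    if isinstance(term, Lit):
        # bool is tested first because bool is a subclass of int in Python.
        if isinstance(term.value, bool):
            return "bool"
        if isinstance(term.value, int):
            return "int"
        raise TypeError(f"unsupported literal: {term.value!r}")
    if isinstance(term, Var):
        return ctx[term.name]
    if isinstance(term, Ann):
        check(ctx, term.term, term.annotation)  # switch to checking mode
        return term.annotation
    if isinstance(term, App):
        fn_type = infer(ctx, term.fn)
        if not isinstance(fn_type, Fun):
            raise TypeError(f"not a function: {fn_type}")
        check(ctx, term.arg, fn_type.arg)  # the argument type is now known
        return fn_type.res
    raise TypeError("cannot infer this term, an annotation is needed")


def check(ctx, term, expected):
    """Checking mode: verify that `term` has the `expected` type."""
    if isinstance(term, Lam):
        if not isinstance(expected, Fun):
            raise TypeError(f"lambda checked against non-function type {expected}")
        # The expected type provides the parameter's type, so no annotation is needed.
        check({**ctx, term.param: expected.arg}, term.body, expected.res)
        return
    actual = infer(ctx, term)  # switch to inference mode
    if actual != expected:
        raise TypeError(f"expected {expected}, got {actual}")


identity = Lam("x", Var("x"))
print(infer({}, App(Ann(identity, Fun("int", "int")), Lit(1))))  # int
```

Note that the bare `identity` lambda cannot be inferred on its own in this sketch, but checking it against a known function type works without annotating its parameter.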

Tell me the truth

We already know that type systems aren’t all the same. They aren’t perfect either. Firstly, “they can only guarantee that well-typed programs are free from certain kinds of misbehavior.” Secondly, static analysis tools can be divided into two categories: underapproximating and overapproximating. An underapproximating checker will not detect some bugs that exist in a given code base. An overapproximating checker might raise warnings even for perfectly valid code. An ideal checker, which cannot exist, would detect all bugs and raise no false alarms. There’s no escaping that; it’s just a matter of which side of the fence we’re on. Type systems, in general, fall in the overapproximating category.
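
Here’s a deliberately simplified Python sketch of the two attitudes. It assumes an analysis has already computed the set of types a value might have at some operation (`possible_types`) and that we know which types the operation accepts (`required`); both names, and the two functions, are made up for this example and don’t mirror any real tool’s implementation.

```python
def overapproximating_warns(possible_types, required):
    """Warn unless every type the value might have is acceptable."""
    return not possible_types <= required


def underapproximating_warns(possible_types, required):
    """Warn only if none of the types the value might have is acceptable."""
    return possible_types.isdisjoint(required)


addition_accepts = {"int", "float"}

# The value is an int on one code path and a string on another.
maybe_int = {"int", "str"}
print(overapproximating_warns(maybe_int, addition_accepts))   # True
print(underapproximating_warns(maybe_int, addition_accepts))  # False

# The value is a string on every code path: both checkers agree.
always_str = {"str"}
print(overapproximating_warns(always_str, addition_accepts))   # True
print(underapproximating_warns(always_str, addition_accepts))  # True
```

In the mixed case the first checker’s warning may be a false alarm if the string path never runs, while the second checker’s silence may hide a real bug if it does.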

Type checker survey

At the 2022 edition of Lambda Days I presented a side by side comparison of a few type checkers for Erlang. The landscape has radically changed since then! How much? Let’s take a look.

Dialyzer

Dialyzer is quite likely the most widely used tool in this domain. Strictly speaking it’s not a type system, but a discrepancy analyser: a program consistency checker based on flow analysis. The theory underpinning it is described in detail in “Practical type inference based on success typings”. It should be highlighted that it’s meant for use on existing codebases, i.e. no source code modification is necessary to start using it. It’s one of the two tools I presented at Lambda Days 2022 that are also covered in this survey.

It’s known for the “Dialyzer is never wrong” slogan, which sounds like snake oil, but is actually true. But what does it really mean? It boils down to the distinction between under- and overapproximating checkers – Dialyzer is of the former kind. This means it never returns false positives, i.e. never reports errors that actually aren’t there in the source code.

The other side of the coin, though, is that it might return a series of somewhat confusing non-local errors, only one of which is the root cause of the entire report – however, that one is then a real problem that needs to be fixed. Due to its underapproximating nature, it might also not warn about potential pitfalls that do not occur 100% of the time.

Dialyzer requires a clever approach when checking libraries meant for inclusion in other code. To deliver the best results, it requires client code to be analyzed together with main code. When writing a library, this means we need to put library API callers, not just the library implementation, under analysis. Usually, including our library tests in the analysis does the trick.

Traditionally, Dialyzer required generating a persistent lookup table (aka PLT) before running an analysis. It was quite a time-consuming step, with the analyses not being very fast either. However, since Erlang/OTP 26, the situation has greatly improved thanks to the “incremental” mode implemented by Tom Davies from WhatsApp.

A boatload of information on how to use Dialyzer effectively can be learnt from Jesper Eskilson’s “Slaying the Type Hydra” talk from Code BEAM 2022.


Thomas Davies spoke that same year about “Incremental Dialyzer: How we made Dialyzer 7x Faster”. At this year’s Code BEAM edition Marc Sugiyama will present “How I grew to love Erlang Type Specs”.

Because of the historically slow analyses, confusing errors, and limited guarantees, even quite prominent community figures have voiced skepticism about Dialyzer’s efficacy. On the other hand, it’s a battle tested tool with integrations built into Rebar3, Mix, Erlang and Elixir language servers, and therefore practically any editor or IDE. You cannot go wrong with Dialyzer when looking for the extra bit of confidence in your codebase.

Gradualizer

For quite a while Dialyzer was leaving part of the BEAM community unsatisfied. One of the attempts to address this was Josef Svenningsson’s Code BEAM 2018 talk “A gradual type system”, where he unveiled Gradualizer. A gradual type system is one which generally behaves like a static one, but leaves some checks to be done at runtime – the fewer checks left for runtime, the more guarantees about code correctness can be offered before running it. Probably the best known example of a language with a gradual type system is TypeScript.
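
What makes a type system gradual can be sketched with the “consistency” relation from the gradual typing literature: it works like type equality, except that the dynamic type is consistent with everything. The Python below is only an illustration of that relation, not Gradualizer’s code; representing base types as strings and compound types as tuples is an assumption made for the example.

```python
DYNAMIC = "dynamic"


def consistent(left, right):
    """Return True if the two types are consistent in the gradual sense."""
    if left == DYNAMIC or right == DYNAMIC:
        return True  # the dynamic type matches anything; the check is deferred
    if isinstance(left, tuple) and isinstance(right, tuple):
        # Compound types are consistent if they match component by component.
        return len(left) == len(right) and all(
            consistent(l, r) for l, r in zip(left, right)
        )
    return left == right


print(consistent("int", "int"))                                   # True
print(consistent("int", "bool"))                                  # False
print(consistent(("fun", "int", DYNAMIC), ("fun", "int", "bool")))  # True
print(consistent(("fun", "int", "int"), ("fun", "bool", "int")))    # False
```

Unlike equality, consistency is not transitive: "int" is consistent with the dynamic type, and the dynamic type with "bool", yet "int" and "bool" are not consistent with each other. That’s what keeps the statically known parts of a program checked strictly.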

Gradualizer is the other one of the two type checkers I covered in the Lambda Days 2022 talk. It’s also the one yours truly has invested the most time into, so keep in mind I might be a bit biased! I’m trying to stay level-headed, though.

Gradualizer’s theoretical underpinnings are based on type system literature such as B. Pierce’s classic “Types and Programming Languages” or J. Siek’s and W. Taha’s “Gradual Typing for Functional Languages”, as well as some inspiration from G. Castagna’s work on set-theoretic types (which we’ll discuss separately in a while). Sadly, there’s no accompanying whitepaper describing it in detail. The project also doesn’t have dedicated funding, which means it’s developed as a community effort with – for better or worse – all its implications.

The pros, however, are that it’s relatively fast, especially in comparison with pre-incremental-mode Dialyzer. Since Josef’s initial announcement, thanks to the community effort, it’s gained some features like partial type inference, polymorphism support and extended its syntax coverage to almost all legal Erlang constructs, including records and maps. It’s got a Rebar3 plugin as well as a Mix task. Thanks to its Elixir spinoff, Gradient, it can be used to check both Erlang and Elixir codebases. I presented a very early version of Gradient at ElixirConf EU 2021 in Warsaw – it’s still experimental, though.

One milestone Gradualizer reached in early 2023 was cleanly passing a self-check. This means the project, while experimental in nature, can be used in practice to test a non-trivial codebase of a significant size. A caveat to keep in mind, though, is that it requires writing in a certain style, e.g. by sometimes adding inline type assertions, so it’s slightly opinionated. Another takeaway from its ability to self typecheck is that it’s a reference on how using it impacts the coding style. Gradualizer generally follows the “no spec, no check” rule, meaning that only code annotated with function specs is checked. All in all, it means it might be a bit of an effort to use in existing codebases as is, but should be easy enough to apply when starting a new project from scratch.

Elixir Set-Theoretic types

We’ve already mentioned “set-theoretic types” in one of the previous paragraphs. It’s a theory that’s been developed and extended by G. Castagna, a professor at CNRS – Université de Paris, and his team, for over 20 years. It’s used in CDuce, an expressive functional programming language purpose built for manipulating XML. Incidentally, its semantics match Erlang’s semantics, and therefore Elixir’s, exceptionally well.

That’s the reason why José Valim, the creator of Elixir, has been cooperating with Giuseppe Castagna and Guillaume Duboc, a PhD student focusing on set-theoretic types for Elixir, at least since 2022, as outlined by José’s blog post, “My Future with Elixir: set-theoretic types”.

As explained in Castagna’s “Programming with union, intersection, and negation types”, set-theoretic types give the programmer unparalleled freedom of expressing intent in type annotations. This is nicely captured in the Scala 3 Book chapter on Union Types – see the lengths to which one has to go if a language doesn’t have union types! Non-discriminated union types usually are not available in traditional statically typed languages like OCaml or Haskell. Scala 3 does have them, as do Erlang and Elixir, but set-theoretic types also offer the intersection and negation connectives, which together provide even more expressive power. Guillaume Duboc’s Elixir Prototype Showcase allows us to play with set-theoretic types in Elixir with an ad-hoc, prototype syntax. Please note the prototype is powered by the CDuce type checker, not the final Elixir one – that’s still a work in progress.
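
The intuition behind the three connectives can be shown with a toy model in Python, where a type is literally a set of values and subtyping is set inclusion. This is only a sketch: the tiny `UNIVERSE` of values is made up, atoms are written as strings with a leading colon, and real set-theoretic type checkers work symbolically on infinite sets rather than enumerating values.

```python
# A small, made-up universe of values; atoms are modelled as ":name" strings.
UNIVERSE = frozenset({1, 2, 3.5, ":ok", ":error", "hello"})

INTEGER = frozenset({1, 2})
FLOAT = frozenset({3.5})
ATOM = frozenset({":ok", ":error"})
ERROR = frozenset({":error"})


def union(a, b):
    return a | b


def intersection(a, b):
    return a & b


def negation(a):
    return UNIVERSE - a


def is_subtype(a, b):
    """A type is a subtype of another if all of its values belong to the other."""
    return a <= b


number = union(INTEGER, FLOAT)
atom_but_not_error = intersection(ATOM, negation(ERROR))

print(is_subtype(INTEGER, number))  # True
print(is_subtype(number, INTEGER))  # False
print(sorted(atom_but_not_error))   # [':ok']
```

With negation available, “any atom except :error” becomes an ordinary type, something that unions alone cannot express without listing every other atom.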

However, this comes at a cost. Set-theoretic types are extremely expressive, but global type inference with set-theoretic connectives is undecidable (original result by Coppo and Dezani-Ciancaglini, 1980, here via Tommaso Petrucciani’s PhD thesis “Polymorphic set-theoretic types for functional languages”). This means that in the edge cases, a programming language with set-theoretic types requires the programmer to put in at least some type annotations. Local type inference should still allow for omitting most of them.

This project, with José, the Elixir creator, Giuseppe, a researcher with over 20 years of experience in the field, and Guillaume, working on it full-time and writing a PhD thesis, has a huge potential! Listen to Guillaume and Giuseppe talking about “The Design Principles of the Elixir Type System” at this year’s Code BEAM in Berlin!

However, it’s likely only going to target Elixir, with changes being gradually rolled out with new Elixir versions. Erlang is likely not going to benefit from it directly, though the research results will certainly be applicable. Another point worth mentioning is that the plan for Elixir is to abandon the traditional Erlang-inspired type and spec syntax to enable the full expressiveness of set-theoretic types. If this happens, it might mean some extra difficulties for projects trying to target or leverage both Elixir and Erlang at the same time (for example, using dependencies written in one language from a project in another). Nonetheless, the benefits seem to outweigh the costs, so let’s keep our fingers crossed for Elixir with set-theoretic types to copy (or surpass!) the success of TypeScript!

Etylizer – Erlang Set-Theoretic types

Since we’re on the topic of set-theoretic types and we’ve already mentioned that they fit Erlang just as well as Elixir, it makes sense to ask whether there’s any work targeting the former. Apparently, the answer is yes! “Set-theoretic Types for Erlang” by Albert Schimpf, Stefan Wehr, and Anette Bieniusa is a paper announcing the development of etylizer, a new set-theoretic type checker for Erlang. The paper itself is a summary of Castagna and others’ work that best applies to Erlang and Elixir, with extensions on Erlang-specific pattern matching that doesn’t exist in CDuce.

Etylizer is a very new project with not much user documentation and still little coverage of the Erlang language syntax. For example, features like records or maps are not implemented yet, which makes it practically impossible to use on real-world code. This makes it look paler in comparison to Gradualizer. On the other hand, it’s based on a very strong theoretical underpinning, so there’s great opportunity ahead if the enthusiasm doesn’t fade. It can be set up with ease, though currently it only has a command line interface. It’s rather fast. It accepts contributions, and being written completely in Erlang, it should be quite easy for the BEAM community to contribute to. Come listen to Anette and Albert talk about “etylizer: Set-theoretic Types for Erlang” at Code BEAM 2023!

eqWAlizer

Talking about ease of contribution for the BEAM community, eqWAlizer is an outlier – it’s the only tool in this comparison implemented in non-BEAM languages, namely Scala with some Rust here and there.

With a dedicated team at WhatsApp backing it up, it’s currently the only comparably polished contender to the BEAM ecosystem’s baseline, Dialyzer. And there’s no doubt that a strong contender it is!

It’s fast. It’s easy to set up – it has a Rebar3 plugin, but it can be used without it on non-Rebar projects. It was announced at the ICFP Erlang Workshop keynote in late 2022 by Ilya Klyuchnikov, its main author. It seems to work just as well as Gradualizer on the latter’s test suite (more on that to come in another blog post). It’s slower when run on a single file, but still significantly faster than Dialyzer when run on an entire project. It’s a gradual type system supporting a dynamic() type. The treatment of that type depends on the mode – gradual or strict – with the gradual mode making it compatible with any type.

On the cons side, it seems to be blind to some nuances that a set-theoretic type checker would catch. It’s also quite opinionated, not distinguishing between integers and floating point numbers, introducing concepts such as “shapes” and “dictionaries” to deal with maps, or not caring about the difference between proper and improper lists. Dialyzer and Gradualizer, in comparison, pay more attention to these aspects of the language. This might make eqWAlizer a bit hard to adopt in established codebases. Fortunately, eqWAlizer documents these design decisions well and thanks to the simplifications they bring, it seems to be a well polished and very handy tool in an Erlang programmer’s belt. Yes, just an Erlang programmer’s – there are no official plans to support Elixir.

Roberto Aloi, Michał Muskała, and Robin Morisset from WhatsApp, as well as Alan Zimmerman from Meta, will be speaking at Code BEAM this year about various tools they develop for the Erlang ecosystem. While none of the talks is going to be about eqWAlizer directly, there’s a chance they could share some of their impressions in the “hallway track”.

To conclude

That’s the end of our survey of Erlang and Elixir type checkers available on the BEAM platform. As you can see, some of the mentioned projects change almost as we speak and we can watch their development very closely – these are truly interesting times! It’s even better that we can speak to the authors thanks to events like Code BEAM or even contribute directly since all the mentioned projects are open-source.

Does it mean the BEAM community has grown tired of “let it crash”? Probably not, as that’s a valid mitigation strategy for errors which cannot be avoided, such as network links dropping, storage failing, or hardware malfunctioning.

However, there are errors like programming mistakes, logical flaws, design shortcomings, or just simple API misuse that can be avoided or mitigated ahead of time. Moreover, such errors usually don’t go away just because of a process restart and have to be fixed in code. The sooner we realise they’re there, the sooner we can react. It would be short-sighted not to leverage approaches like static analysis or type checking if they can bring improved developer productivity, foster type-driven development, encourage better design, enable easier and more aggressive refactoring, or just make the developer experience better.

Radek Szymczyszyn

Radek is an Erlang/Elixir/XMPP Engineer at Erlang Solutions