Gluon VM - Virtual machine for BEAM
Gluon — A pocket-size BEAM VM
Presenting a hobby project that I have been working on for several months. It is named after the gluon, the particle that carries the strong force between quarks. Just as gluons are tiny and bind subatomic particles together, the Gluon VM is small and connects the world of large Erlang systems to the tiny embedded world of heavily constrained resources.
- A virtual machine which runs BEAM bytecode -- Erlang, Elixir, LFE, Joxa, you name it.
- Really small: 10-20% of the original BEAM VM right now, and determined to go even smaller.
- Single core — simple!
- Some features can be disabled to shed more weight.
- Written in standard C++ — platform independent.
- Clean source, modern standard (C++14). Keeping code simple and typesafe.
There is definitely a vacant niche in the Erlang ecosystem: we lack a VM that can run in constrained environments. Existing projects try to shrink to get there, but they were never designed to be small in the first place. This project attempts to occupy that niche and deliver a small, simple, memory-savvy VM that covers enough of BEAM's features to run complex projects (such as a web server). As a side effect of the programming style, it is also possible to embed the virtual machine into one's own application.
Even with the VM rewritten to be small, the next step is to minimize memory usage at startup, rewrite parts of the startup code and cut down the library. I have found references to such attempts, but even after reduction they still require fairly powerful hardware just to boot an empty VM into a shell. For example, "On maintaining a small memory footprint" mentions that the Ling VM required between 16 and 20 MB of RAM to boot properly, and "Is Erlang small enough for embedded systems?" states that Ericsson's VM can fit into 2 MB of storage and run on a 16 MB system.
We want to go smaller in the not so distant future, which is why I plan extensive work on an embedded, size-optimized standard library derived from the existing one. The boot sequence can be simplified. Libraries can be sorted into optional categories so that users select what they actually want, cleaned of things that are never used on small machines, and shrunk significantly. The code itself can be packed more efficiently by compressing it and cutting away extra bits.
I started with a single core for simplicity's sake. The hours I can put into the project are limited, so it benefits from being delivered as soon as possible and does not benefit from rich features that take too many hours, are rarely wanted and never get finished.
Because Gluon VM is small, it opens possibilities that were out of reach before. The main idea is to use it on IoT platforms that are too constrained for the regular BEAM VM. Platforms that cannot run Linux should also come within reach. Replacing the OS layer, or even rewriting it for an OS-less chip, is a fairly short project: only a few hooks are required to manage the VM process and to reach files, sockets or memory.
In the image, "estimated bloat" is the approximate growth of the compiled binary expected once all missing features are developed. Libraries are counted as 1 MB in every row, assuming a cut-down version of OTP. The 128 KB and 64 KB bars are there only for scale; we do not target them yet and they remain a distant goal.
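To make the idea of "several hooks" concrete, here is a minimal sketch of what such a platform layer could look like. Every name in it (PlatformHooks, BareMetalPlatform, add_rom_file and so on) is an assumption made for illustration and is not taken from the Gluon source. Sockets and process control are left out to keep it short. The OS-less variant uses a fixed arena with a bump allocator and serves files from a table in memory, standing in for modules stored in ROM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical platform layer: the only calls the VM core would make
// to reach the outside world. Not the actual Gluon interface.
struct PlatformHooks {
  virtual ~PlatformHooks() = default;
  // Memory: hand out raw blocks for VM heaps, and take them back.
  virtual void* alloc(std::size_t bytes) = 0;
  virtual void release(void* block) = 0;
  // Files: load a whole file (for example a .beam module) into `out`.
  virtual bool read_file(const std::string& path,
                         std::vector<std::uint8_t>& out) = 0;
};

// Illustrative OS-less implementation: fixed arena, files kept in memory.
class BareMetalPlatform : public PlatformHooks {
 public:
  void* alloc(std::size_t bytes) override {
    // Round up to 8 bytes so every returned block stays aligned.
    std::size_t rounded = (bytes + 7) & ~static_cast<std::size_t>(7);
    if (rounded > sizeof(arena_) - used_) return nullptr;  // arena is full
    void* block = arena_ + used_;
    used_ += rounded;
    return block;
  }

  // A bump allocator never frees individual blocks.
  void release(void*) override {}

  bool read_file(const std::string& path,
                 std::vector<std::uint8_t>& out) override {
    auto it = rom_.find(path);
    if (it == rom_.end()) return false;
    out = it->second;
    return true;
  }

  // Registers a file image, as a build step might do for ROM contents.
  void add_rom_file(const std::string& path, std::vector<std::uint8_t> bytes) {
    assert(!path.empty());
    rom_[path] = std::move(bytes);
  }

  std::size_t used() const { return used_; }

 private:
  alignas(8) unsigned char arena_[4096];
  std::size_t used_ = 0;
  std::map<std::string, std::vector<std::uint8_t>> rom_;
};
```

With a layer like this, porting to a new board would mean writing one more subclass, while the VM core only ever talks to PlatformHooks.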
Ranges of system sizes, where Erlang VM is (and could be) used:
- 32+ MB memory systems. These can run the original Ericsson VM or Ling without any problems today.
- 16 to 20 MB systems. These cannot run the original Ericsson VM or Ling, but can run trimmed-down versions according to "On maintaining a small memory footprint" and "Is Erlang small enough for embedded systems?".
- 2 to 16 MB systems. Here begins the dark area, where Erlang does not come that often. Gluon VM will be able to run on systems above 2-4 MB right after the first release.
- 512 KB to 1 MB systems. These will become reachable after some effort and rewriting parts of the standard library. This is low-hanging fruit.
- 64 to 256 KB systems. These may become reachable after a heavy rewrite and optimization. Smaller fruits hang higher: a lot of functionality will have to be given up and the library simplified considerably.
Possible areas of use include but are not limited to:
- Projects which run on memory constrained devices, possibly without OS.
- POS terminals, handhelds.
- Robots, such as those used in warehouse management or manufacturing.
- Smart card readers and RFID trackers.
- Many sorts of remotely managed devices, like hardware controllers, railroad signaling, mobile network hardware, vehicle tracking and controllers.
- Scripting for a larger application. VM is easily embedded as a library and is self-contained.
Some of these areas have never considered Erlang before, some have, and hopefully some will now.
How can I use it
Please note that the project is still under development and has not reached "ready to use" phase.
On the 10th of November the project was able to:
- Read multiple BEAM files and run most of the code in the lists module, plus some simple demos.
- Run the famous ring example and return its result (spawn multiple processes and chain send/receive a number).
- Switch between multiple processes, handle mailboxes.
- Pattern-match on functions.
- Manipulate lists, tuples, use list comprehensions, do simple math.
- Handle lambdas and function objects in a way that made sense.
- Run a notable chunk of erts/src/init.erl before crashing with a TODO(notimpl catch).
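The combination of a single core, mailboxes and the ring example can be pictured with a small simulation. This is only a sketch of the general technique (cooperative round-robin scheduling over processes with mailboxes), not code from Gluon. The names Process and run_ring are made up for illustration, and a "process" here is just a mailbox plus the index of its neighbour.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// One lightweight process: a mailbox plus the process it forwards to.
struct Process {
  std::deque<int> mailbox;
  std::size_t next = 0;  // index of the next process in the ring
};

// Builds a ring of `nprocs` processes and passes a counter around it
// `laps` times. Each process "receives" the counter and "sends"
// counter + 1 to its neighbour. Returns the final counter value.
int run_ring(std::size_t nprocs, int laps) {
  assert(nprocs > 0 && laps >= 0);
  std::vector<Process> procs(nprocs);
  for (std::size_t i = 0; i < nprocs; ++i) procs[i].next = (i + 1) % nprocs;

  const int total_hops = static_cast<int>(nprocs) * laps;

  // Run queue of process indices that have mail waiting.
  std::deque<std::size_t> run_queue;
  procs[0].mailbox.push_back(0);
  run_queue.push_back(0);

  while (!run_queue.empty()) {
    // Single core: exactly one process runs at a time, in FIFO order.
    std::size_t pid = run_queue.front();
    run_queue.pop_front();
    Process& self = procs[pid];

    int token = self.mailbox.front();  // receive
    self.mailbox.pop_front();
    if (token == total_hops) return token;  // the ring is done

    Process& target = procs[self.next];
    bool was_idle = target.mailbox.empty();
    target.mailbox.push_back(token + 1);  // send
    // Wake the receiver only if it was not already scheduled.
    if (was_idle) run_queue.push_back(self.next);
  }
  return -1;  // not reached: the token always arrives somewhere
}
```

For example, run_ring(5, 3) performs 15 hops across 5 processes and returns 15. A real VM would switch processes after a budget of executed instructions rather than after each message.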
Currently working on the ability to enter the shell by starting init:boot. This will require implementing exceptions, IO ports, links and monitors.
To run the project, clone the repository and follow the README steps: enter the emulator directory and run make. This will build the system and enter GDB. The entry point is defined in src/main.cpp and library search paths are hardcoded in the vm.cpp constructor. Take your own test .erl file, mention it in main.cpp and compile it to .beam, edit the paths in vm.cpp and you're good to go!
Features still to be implemented:
- Exceptions and stack unwinding. This is work in progress during November.
- GC. I am using "The garbage collection handbook" from 2012 and will start working on that soon.
- Ports, sockets, file access for Erlang code, and I/O in general
- Binary opcodes
- Bignums and floats
- More compact BEAM format
To go further down the path of shrinking:
- The standard C library has to be replaced with an embedded-oriented one, such as dietlibc or similar.
- Check how a 32-bit build affects the binary size (hint: it doesn't) and memory footprint (hint: it shrinks).
- Check how instruction sets other than AMD64 affect size and memory footprint.
- Shrink the BEAM file size, possibly by developing a new, smaller format. Possibly optimize out unused functions in the standard library and produce a 'release-like' output directory with a shrunk Erlang library.
- Possibly optimize out unused BIF functions from the VM source to produce smaller VM.
- Optimize Erlang startup: load the start modules temporarily, then remove them from memory.
- Tweak startup memory usage, aggressively reduce memory allocations in init, reduce amount of work done on startup, throw away extra code.
- JIT? Not there yet.
- Reduce Erlang and its library to a small subset of desktop Erlang, call it, say, Embedded Erlang, and run that. This is similar to what was done with Java Card to run on 64 KB smart cards and SIMs.
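Dropping unused BIFs from the VM at build time could be done with plain preprocessor switches around the BIF table. The sketch below shows the general technique only. The switch SKETCH_FEATURE_REM, the table layout and the BIF names are hypothetical and do not reflect how Gluon actually organizes its BIFs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical build switch: compile with -DSKETCH_FEATURE_REM=1 to
// include the optional BIF. It defaults to off, for the smallest binary.
#ifndef SKETCH_FEATURE_REM
#define SKETCH_FEATURE_REM 0
#endif

// Simplified BIF signature: two integer arguments, one integer result.
using BifFn = long (*)(long, long);

static long bif_plus(long a, long b) { return a + b; }
static long bif_minus(long a, long b) { return a - b; }
#if SKETCH_FEATURE_REM
static long bif_rem(long a, long b) {
  assert(b != 0);
  return a % b;
}
#endif

struct BifEntry {
  const char* name;
  BifFn fn;
};

// Disabled BIFs vanish from the table and from the binary, because
// neither the function nor its table entry is compiled.
static const BifEntry kBifTable[] = {
    {"plus", bif_plus},
    {"minus", bif_minus},
#if SKETCH_FEATURE_REM
    {"rem", bif_rem},
#endif
};

std::size_t bif_count() { return sizeof(kBifTable) / sizeof(kBifTable[0]); }

// Looks a BIF up by name; returns nullptr when it was compiled out.
BifFn find_bif(const char* name) {
  assert(name != nullptr);
  for (std::size_t i = 0; i < bif_count(); ++i) {
    if (std::strcmp(kBifTable[i].name, name) == 0) return kBifTable[i].fn;
  }
  return nullptr;
}
```

In a default build find_bif("rem") returns nullptr, so the loader can reject a module that needs a missing BIF at load time instead of failing later at run time.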
Further work will have to focus on reducing BEAM file sizes and rewriting libraries into a very compact and limited form. In their current state the libraries take a significant amount of memory. Many of them are not used in any one specific system, and many of those that are used are run only once, at init time.
Comparisons and License
Comparing Gluon VM against Ericsson's VM or Ling, I can say they are different. The aim is not an exact implementation of the original library and VM, but one that is good enough while shedding extra weight and saving space here and there to become really small. I will consider benchmarking and comparing features once the project has taken shape and the first release is done.
Benchmarks make little sense at the moment, as the project is a work in progress. The VM prints a LOT of debug output. It is slow because everything is double- and triple-checked with assertions. On the other hand, it may look fast because there is no garbage collection.
The code is licensed under the Apache License, version 2.0. Contributions are not welcome yet and will not be accepted until the code has taken shape and the first release is near.