Advent of Code 2023

Hello! I’m Piotr from Erlang Solutions Poland and I have the pleasure of saving Christmas this year with the power of Erlang for you!

This is the second time we participate in the amazing event called the Advent of Code. Last year’s edition was solved by my colleague Aleksander and as far as I know – many of you enjoyed following his efforts. I hope you’ll like my tale of helping Santa too!

I’m going to publish my solutions in my GitHub repository. They will be accompanied by a commentary, added to this page on a daily basis. I will add solutions for each day in an individual folder, with my input file downloaded from the AoC website.

I’m also going to include a bit of microbenchmarking in every solution, with and without JIT. Firstly, to measure the overall performance of the code and secondly to see how much the efficiency improves thanks to JIT. I’m going to measure the computation time only with `timer:tc/3` call, as I consider time needed to compile a module and load the input file irrelevant. By “load” I mean: read it and split it into lines. Any further processing of individual lines is considered a computation. I will provide min, max and arithmetic average of 100 runs.

I’ll be implementing solutions as EScripts, so running them is a bit more straightforward. And frankly – I think they are underrated and sometimes I prefer them over writing BASH scripts. I’ll always include `-mode(compile).` directive to avoid the interpretation performance penalty. For those who are not aware of this capability, I’ll also run Day 1 without this option to show you how the timings change.

I’m going to run every piece of the code on Linux Mint 21.2 VirtualBox machine with 4 cores and 8GB of memory, hosted on my personal PC with Ryzen 3700X and DDR4 at 3200MHz. I will use OTP 26.1.1.

Day 1

Part 1

I would never suspect that I’ll begin the AoC challenge with being loaded onto a trebuchet. I’d better do the math properly! Or rather – have Erlang do the calibration for me.

FYI: I do have some extra motivation to repair the snow production: my kids have been singing “Do You Want to Build a Snowman?” for a couple of days already and there is still nowhere enough of it where I live.

I considered three approaches to the first part of the puzzle:

  1. Run a regular expression on each line.
  2. Filter characters of a line with binary comprehension and then get the first and last digit from the result.
  3. Iterate over characters of a line and store digits in two accumulators.

I chose the last one, as (1) felt like shooting a mosquito with a M61 Vulcan Cannon. Second one felt kind of less Erlang-ish than the third one. After all, matching binaries and recursive solutions are very natural in this language.

Timings

MinAvgMax
Compiled + JIT0.000091s0.000098s0.000202s
Compiled + no JIT0.000252s0.000268s0.000344s
Interpreted0.091494s0.094965s0.111017s

Part 2

By choosing the method of matching binaries, I was able to add support for digits as words pretty easily. If there were more mappings than just nine, I’d probably use a map to store all possible conversions and maybe even compile a regular expression from them.

Eventually, the temptation of violating the DRY rule a bit was too strong and I just went for individual function clauses.

And my solution was invalid. Shame on me but I admit I needed a hint from other participants – it turned out that some words can overlap and they have to be treated as individual digits. It wasn’t explicitly specified and ignoring overlaps in the example did not lead to an invalid result – a truly evil decision of AoC maintainers!

Simply put, at first I thought such a code would be enough:

parse(<<"one", Rest/binary>>, First, _Last) -> store(1, Rest, First);

But the actual Rest must be defined as <<_:8, Rest/binary>>.

Timings

MinAvgMax
Compiled + JIT0.000212s0.000225s0.000324s
Compiled + no JIT0.000648s0.000679s0.000778s
Interpreted0.207670s0.213344s0.242223s

JIT does make a difference, doesn’t it?

Day 2

Part 1

I’m not complaining but maybe if the elves spent more time on polishing engineering skills than inventing games, perhaps the Snow Island would have never broken. However, an army of technically skilled elves could render our jobs obsolete. All of our customers and employers would only have to write a letter to Santa. …don’t you dare to mention AI here.

Lucky for us, there were no traps in the puzzle input and the specification today. Actually, parsing each line was probably more challenging than computing the result itself.

In part 1 I decided to retain all information from the file in the internal structure. I was expecting that I may need it in part 2. Plus, I obviously began the work with testing the parser itself and printing a complete structure that was friendly to a human debugger’s eye.

Using classic regular expressions was tempting, just like in Day 1, but I was a bit afraid that a small mistake in the pattern or my reasoning could lead to hours of debugging and losing a bit of sanity. Laziness won and I just decomposed each line with a chain of `binary:split` calls.

MinAvgMax
JIT0.000692s0.000732s0.000802s
No JIT0.000799s0.000849s0.000987s

JIT influence was less noticeable this time but still measurable.

Part 2

Thanks to building the internal data structure with complete information from the input, the second part was very straightforward. For every game I simply needed to find the biggest number of red, blue and green draws and multiply them.

Implementing it was very fast so I decided it’s a nice occasion to experiment with performance of splitting vs. scanning. To be more specific: the code so far used lots of `binary:split` calls and I’ve replaced it with sequential parsing of a binary, (almost) character after character. My intuition told me that it will be most likely similar or even a bit slower but I wanted to have an empirical proof.

I’ve also tried to merge parsing and computations to avoid unnecessary iterations over the full game list. After all, I needed only the “power” of a game and maximums of each color could be found at the time of reading the input. Actually, there even was no need to distinguish between individual draws!

MinAvgMax
Binary split, JIT0.000703s0.000730s0.000790s
Recursive scan, JIT0.000136s0.000156s0.000206s
Scan and compute, JIT0.000097s0.000105s0.000148s
Binary split, no JIT0.000868s0.000938s0.001229s
Recursive scan, no JIT0.000250s0.000278s0.000470s
Scan and compute, no JIT0.000203s0.000224s0.000331s

Maybe it’s just me but I think it was definitely a surprise! Recursive scan of a binary was about 4-5x faster than `binary:split` approach, even with such short lines. It seems that more efficient usage of sub binaries and match contexts makes quite a difference.

Computing the solution in one pass also brought measurable improvement but the code might be considered less understandable if read without any comments. Well, we do sometimes have to balance between readability and performance, right?

Day 3

I did try regular expressions this time.

Part 1

Today’s puzzle had a very interesting input to parse. Its “syntax” was simple but at first I wasn’t sure what would be the most suitable representation of the provided data internally. The most straightforward one, that came to my mind, was to simply keep two maps: for locations of numbers and engine parts. The value in the numbers map was the number itself and its length.

Since the task was to find isolated part numbers, my code iterates over all of them and checks all possible part locations candidates in a bit naive way – because it doesn’t bother to check the engine plan boundaries and doesn’t take into account that parts cannot be located inside the number. It is a small waste of CPU cycles but I’ve considered part 1 just a warmup – the optimizations will come later.

MinAvgMax
JIT0.001688s0.001915s0.002737s
No JIT0.002685s0.002879s0.003540s

Part 2

My expectation was that the task for part 2 will be to find the correct engine elements’ numbers in a manner similar to the Minesweeper. Fortunately, the actual puzzle was easier.

So, the data model that I’ve chosen for part 1 wasn’t that great in the second problem. It was possible to solve it nevertheless, but it was a bit awkward, because it required filtering out all parts that are not ’*’ and then collecting lists of all adjacent numbers for each possible gear. It did work and allowed me to pass the second stage of the puzzle but I definitely wanted to do it better, with significantly better performance.

So I decided to give regular expressions a try, since I was confident I can get it right with such simple objects as “nonempty series of digits” or “anything other than digits and periods”. And I did but unfortunately the performance got significantly worse! It may be partially caused by a fact, that `re:run` returns either `{Offset, Length}` pairs or the matched string but not both. In this case I had to take the former and use `binary:part` and `binary:at` to get the actual values. I tried to precompile the expressions too, as I thought that repeated compilation might accumulate into a significant delay but it didn’t bring that much improvement.

The next version involved iterating over ‘*’ instead of numbers. Also, this time the map of part numbers became an index: for each number there are as many keys as digits in it. As an example: the previous implementation encoded `2137` at coordinates `{4, 2}` as `{4, 2} => {2137, 4}`, while this time it is stored in the map as: `{4, 2} => 2137, {5, 2} => 2137, {6, 2} => 2137, {7, 2} => 2137`. Now, for each potential gear the following coordinates are checked:

  • Adjacent to the left and right of the part symbol.
  • Directly above the symbol. If there is no number there, upper left and upper right are checked, as a number might end and begin there respectively. If there is a number directly above, there is no need to check diagonally because it would be the continuation of the same number.
  • Same but below the symbol.

Final optimisation step introduced different means of storage than two maps. For the gears, a plain list of coordinates is used (as the values from the map weren’t even used in the previous version) and in case of the numbers index – an ETS table. I suspected that the latter will be especially beneficial, as the index can become quite a large map.

MinAvgMax
No optimisations0.002262s0.002519s0.003287s
RegEx0.004212s0.004541s0.006180s
Precompiled RegEx0.003902s0.004199s0.005541s
Numbers index0.001990s0.002254s0.002577s
Index in ETS0.001355s0.001487s0.002081s
No optimisations, no JIT0.003488s0.003774s0.005147s
RegEx, no JIT0.004811s0.005070s0.008774s
Precompiled RegEx, no JIT0.004687s0.005065s0.006503s
Numbers index, no JIT0.002694s0.003005s0.003677s
Index in ETS, no JIT0.002032s0.002156s0.002382s

Once again, my intuition misled me. I expected the RegEx version to be at least as good as plain scan but likely faster… but it was almost 2x slower! Precompiling the expression did help a bit but it was still nowhere near scan’s performance.

Also, using the numbers index brought much smaller improvement than I thought, given that iterating over engine parts was bound to have less misses than iterating over numbers. Also, it allowed the code to calculate the sum of “gear ratios” on the fly. Still, it did perform slightly better.

Replacing the map with the ETS table gave the calculations a significant boost. It was expected, as it avoided repeated copying of the map in the parsing stage.

Well, off we go! The gondola lift works again and maybe we’ll finally find out what’s broken in the water source of the Snow Island. Ironically, there is a water outage where I live at the moment of writing these words…

Day 4

OK, I won’t try parsing with regular expressions in the next days. Binary matching.

Part 1

Until today, I considered two options of retiring as a developer one day: become a truck driver or a teacher at school. But now I’m seriously considering the career of gambling addiction therapist for elves. A game on day 2, scratchcards on day 4… are we going to deal with Lapland Hold’em and slot machines too?

Overall, it was a pretty easy task – the only performance risk was the fact that for every game card an intersection of two sets of numbers was needed. However, thanks to them being relatively short, the result was calculated quite fast.

MinAvgMax
JIT0.000452s0.000495s0.000603s
No JIT0.001110s0.001167s0.001368s

JIT is particularly strong today, bringing over 2x improvement!

Part 2

I’ve chosen to keep the counts of each card in a list, because it would be sequentially processed anyway:

  • We need the head to get the count of the current card in the loop.
  • We need to update at most 10 (maximum count of winning numbers) following values in the list, so the tail doesn’t have to be copied, making the whole operation fairly fast.

I’ve tried once again to make the parsing faster with regular expressions, assuming that with as simple pattern as `\\d+`, it could be even faster than matching binaries one character at a time. Eventually, binary matching won once again.

It was an interesting experiment to check if the new type of sets (“version 2”), introduced in OTP 24, could provide better performance. I wasn’t sure about it, since we’re operating on very short lists. Empirical proof has shown that indeed the overhead of creating new structures, finding the intersection and retrieving its size is more expensive after all than plain membership check in short lists.

MinAvgMax
Initial, JIT0.000431s0.000460s0.000511s
Regular expression, JIT0.006323s0.006593s0.008358s
Sets V2, JIT0.000618s0.000721s0.000943s
Initial, no JIT0.001104s0.001144s0.001325s
Regular expression, no JIT0.007620s0.008014s0.011486s
Sets V2, no JIT0.001285s0.001420s0.001826s

Day 5

I never really liked gardening. Too many spiders and insects. I’m glad that AoC only expects me to solve a puzzle, not weed the weeds or dig through an acre of land.

Part 1

When I successfully solved the example and saw the real input file, I was SO HAPPY that I did not decide to store the mappings as discrete pairs and kept them as tuples. First of all, the calculations would be less efficient and secondly they might not even fit in the memory.

I implemented the parsing of mappings in a way that would allow just any series of transitions, as long as they begin with seeds and finish with locations. After all, I was very suspicious of AoC maintainers’ malicious ideas (Day 1, part 2 took its toll!) and I preferred to write a code that can handle mappings provided out of order and that are not restricted to the ones from specification (seed, soil, fertilizer, water, light, temperature, humidity and location), instead of being forced to make any necessary adjustments later.

MinAvgMax
JIT0.000086s0.000095s0.000126s
No JIT0.000216s0.000240s0.000389s

Part 2

I was scared that I might not be able to describe and present the solution to you on time. Spoiler alert: I made two trivial mistakes and I was able to spot them after a careful analysis of the code.

To keep the code efficient, I did not expand the seed ranges and took the challenge to keep them as `{Start, End}` tuples. My implementation gave the correct answer for the example input but was wrong for the real one. In order to get it right, I had to realize two facts:

  1. If I ensure everything (both lists of mappings and the list of ranges) is sorted before each processing step, I can do each mapping iteration in a single pass, making the sort operation the most expensive part of the whole algorithm (so pretty cheap after all). Here I made the first of the mistakes mentioned earlier: I added the sorting for all but the humidity => location mapping.
  2. The result of each iteration may include overlapping ranges – they have to be merged. My second “trivial mistake” was about improper merging method.

I’ve included the detailed explanation of the recursive solution inside the code as comments, so I encourage you to take a look. It does feel nice to have a solution with sub-millisecond execution time.

MinAvgMax
JIT0.000107s0.000125s0.000196s
No JIT0.000262s0.000283s0.000331s

Upon inspection, you’ll notice that the secret ingredient is the decomposition of the problem: when you consider all possible relative “positions” of a range, it’s pretty straightforward to deal with one case at a time and just delegate (possibly split the range) the rest to the other function clauses.

Day 6

Part 1

Hello boys and girls! Do you remember math classes at school? Good, because we’ll be using the square function today!

I had a hunch that besides a brute force approach, today’s puzzle should be solvable with an analytical solution. Initially I thought I’ll have to dig out knowledge about function derivatives but eventually much simpler tools were needed.

So, ignore units completely and assume the following symbols:

  • t – time of a race
  • tP – time of pressing the button, so actually a velocity – this is our primary unknown
  • tM – time of boat movement
  • d – distance to beat

t = tP + tM

tM = t – tP

tP * tM = d

tp * (t – tP) = d

-tP2 + t * tP – d = 0

Now, we need to know the lowest and the highest tP, that match the race’s best distance. We calculate this by finding the square function zeros.

Δ = t2 – 4 * (-1) * (-d) = t2 – 4 * d

tPmin = ( -t + sqrt(t2 – 4 * d) ) / -2

tPmax = ( -t – sqrt(t2 – 4 * d) ) / -2

Of course this gives us the answer in real numbers space, so we need to get a ceiling of the minimum and floor of the maximum.

Today’s task doesn’t seem to be that hard to implement with brute force too, as the numbers aren’t as big as they sometimes tend to be in these challenges. However, it’s always a good idea to avoid this approach in AoC because usually there is a smarter way to crack the problem.

MinAvgMax
JIT0.000000s0.000001s0.000023s
No JIT0.000001s0.000002s0.000010s

Part 2

Thanks to the analytical solution, part 2 was as easy and fast as part 1 and it required only minor changes to the code.

MinAvgMax
JIT0.000000s0.000001s0.000030s
No JIT0.000000s0.000001s0.000007s

It’s interesting that in both parts there was at least one execution that took more with JIT than without it. It may be caused by the initial cost of native compilation of the code.

Day 7

I did mention “Lapland Hold’em” on day 4, didn’t I? They do have gambling issues and when I become a therapist for elves – I’m going to be insanely rich!

Part 1

When parsing, I decided to convert everything to numbers for easy comparisons (and there will be lots to match and compare!). I suspect that the AoC invented their own poker rules, so people won’t be able to reuse existing poker libraries (I’m sure there are some out there!). Lucky for us, because these are fairly simple. Hierarchy (not to mention card colors – we don’t have these in this puzzle!) in the classic game is more complex and would be more challenging to implement.

MinAvgMax
JIT0.000593s0.000649s0.000832s
No JIT0.001191s0.001299s0.001626s

Part 2

It was a real fun to figure out the correct logic to take Joker into account. At first I tried to match on the counts list, just like in part 1 but I quickly realized that it should be easier with Jokers’ count taken from the list. With these pieces of data:

  • If a list is empty, then it’s 5 of Jokers.
  • We can have five of a kind with the following combinations of the highest number and jokers respectively: 5+0, 4+1, …, 1+4, 0+5
  • Similar story in case of “four of a kind”: 4+0, …, 0+4
  • Full house may be achieved in a classic way or with two pairs and a Joker.
  • Three of a kind: classic three, a pair and a Joker or a pair of Jokers.
  • Pair: classic pair or a Joker and all other unique cards.
  • High card: none of the above.
MinAvgMax
JIT0.000626s0.000675s0.000858s
No JIT0.001250s0.001303s0.001440s

Day 8

Part 1

When I started reading the description of the puzzle, I expected that we’ll need to do a graph search to find the correct route. Fortunately, it was way easier, as we are provided with the sequence of movements, it only has to be repeated until we reach the goal.

MinAvgMax
JIT0.000885s0.000935s0.001099s
No JIT0.000975s0.001030s0.001167s

Part 2

My workmate Dennis wrote today: “They should call this the Advent of Math.”

As one of the steps towards the final solution, I made the code print the individual paths’ lengths for each starting node. In my case these were [19637, 21251, 12643, 15871, 19099, 11567]. It was very likely it will require lots of iterations for these to “meet”, so I tried to find a shortcut. I decided to use school math again and this time it was even more basic knowledge than square functions: lowest common multiple.

Lucky for me – it did work! Later, my colleagues made me aware that if the input was more tricky, this wouldn’t work. One of them was even disappointed that there were no ugly traps in the input data today.

To be more specific – LCM can be used only with paths repeating the same way every time. If, for example, going from AAA to ZZZ took 12345 steps in the first iteration and 54321 in the next one, things would get more complicated.

MinAvgMax
JIT0.005665s0.005769s0.005987s
No JIT0.006812s0.006934s0.007138s

… is it going to be a gambling puzzle again tomorrow? Perhaps dice or roulette simulator? Blackjack?

Day 9

Huh? No gambling today? I’m almost disappointed. Well, today we’re analysing the environment of an oasis and I suppose tomorrow we’ll be calculating how to properly use hot air to reach the metal island with hang glider… so perhaps on Monday…?

Part 1

The core of the solution was the proper decomposition of the problem into recursive steps. At the top, we need to know what is the next value in the sequence of numbers (the original input line). In order to compute it, we need to add the next delta to the last number of the sequence. The next delta can be acquired by treating the list of deltas as a normal input sequence, of course if there is at least one non-zero value.

I fell into two traps today:

  • I reused my escript template with an existing function for parsing integers. However, it was unable to parse signed integers. The symptom was the script running infinitely. For a moment I was afraid that my solution was so inefficient that I needed to find some really clever workaround. Fortunately, after adding a few debug `io:format` I quickly understood what’s wrong.
  • For some reason, I was sure that it’s perfectly fine to sum all deltas to find out if all of them are zeros. I forgot that some of them could be negative and indeed the sum for some of the lines in my input equalled zero, while it wasn’t just a sequence of them. I’m not sure if this is a deliberate edge case planted by AoC deities but using a plain boolean is a way better idea anyway.
MinAvgMax
JIT0.000398s0.000414s0.000520s
No JIT0.000961s0.000985s0.001103s

Part 2

This was most likely one of the easiest modifications required to solve part 2. All I had to do was to reverse the input lists.

MinAvgMax
JIT0.000402s0.000427s0.000571s
No JIT0.000966s0.000985s0.001118s

Day 10

Maybe it’s just my impression but the difficulty progression this year is seriously broken. After quite trivial day 9, today it took me several hours to get everything right. Obviously, traveling by train and programming isn’t the most concentration-friendly combination, but still I kind of expected to complete the puzzle in 3 hours at most.

Part 1

I do think that probably there is a more efficient data structure for the parsed input of today’s task. Certain pieces of code can most likely be refactored to not attempt being unnecessarily generic. For example: we do know that we won’t have more than 4 paths outgoing from the starting point.

Ultimately, I believe that the way I used elaborate structures and a certain dose of boilerplate, will make the whole implementation easier to understand while maintaining decent performance.

MinAvgMax
JIT0.004062s0.004421s0.005392s
No JIT0.005224s0.005607s0.006058s

Part 2

Oooh, this one gave me more than a bit of a headache. I suspect that the problem of calculating the area of an arbitrary 2D shape has already been long solved but I took the challenge to figure this one out by myself.

First of all I had to modify the code from part 1 to store the full path traversed, so all pipes that belong to the loop are known.

I decided that it’s probably a good idea to check the area inside the loop by “tracing” one vertical slice at a time, as it’s possible to come up with rules to identify which parts are “inner” and which are “outer”.

This approach did work but I rewrote the logic several times from scratch until I figured out the correct collection of function clauses:

  • We begin in the “outer” state (we are NOT inside the loop area)
  • Whenever we see pure vertical borders, we can remove them, as only the “edge” pipes are relevant to us.
  • If in the outer state:
    • If it’s a pure horizontal border, enter the “inner” state.
    • If we have two subsequent edges and they ultimately lead from left to right (or the other way around), we skip it and remain in the outer state.
    • If these edges lead to the same direction (both point left or both point right), we enter the inner state.
  • If in the inner state:
    • If it’s a pure horizontal border, enter the “outer” state.
    • If we have two subsequent edges and they ultimately lead from left to right (or the other way around), we enter the outer state.
    • If these edges lead to the same direction, we skip it and enter the outer state.
    • If none of the above apply, add the distance between two first pipes in the queue to the total area.

Of course the actual implementation is a bit more nuanced but these are the core rules to follow.

The most significant mistakes I made today:

  • I forgot to deduct the type of the starting point from its adjacent pipes.
  • I thought that it’s sufficient to know the coordinates of the pipes to calculate the inner area – it’s not.
  • I did not distinguish between “inner” and “outer” states.
  • I treated both kinds of continuous edges (left-right, left-left and right-right) the same within a single state.
  • Having only about 4 hours of sleep.
MinAvgMax
JIT0.010449s0.010908s0.011680s
No JIT0.012210s0.012841s0.013636s

Day 11

Part 1

There was nothing overly complex with today’s puzzle. After the classic operation of reading the locations of galaxies into the map, applying a proper shift is pretty straightforward. Then all we have to do is get the taxicab distance between every pair.

Since today’s description is pretty short and the puzzle is about observing the galaxies, let’s spend a minute or two on contemplating November picture of the month from JWST:

[source: ESA/Webb]

I suppose I’m not the only one to see a serpent in the lower part. It seems that part of our Universe is used by some cosmic entity to play Snake and a fraction of it is in fact a huge Nokia 3310 emulator.

MinAvgMax
JIT0.008142s0.008496s0.011024s
No JIT0.017631s0.018599s0.020832s

Part 2

As easy as part 2 on day 9. Offset used for calculating final Xs and Ys has to be increased by 999,999 instead of 1 for every “expansion”, so it’s not even a full one-liner.

MinAvgMax
JIT0.008112s0.008603s0.010933s
No JIT0.017386s0.018262s0.023179s

Day 12

I started today’s challenge with a brute force approach – I tend to begin the puzzle solving early in the morning, before it’s time to wake up my kids to school. At the same time – it’s definitely not my optimal time of the day to think. Therefore, brute force was the best I could come up with at first. And I hoped for the best.

Foolish…

While this approach was very slow but acceptable for part 1, obviously part 2 had to include a much more complex input. The superficial simplicity of the raw input file was a trap!

I came up with a recursive solution that takes the first “checksum” (number of damaged hot springs) and finds out all possible ways it can fit into the sequence. For each of them, the rest of the sequence is checked against the rest of the checksums. And so on, until we exhaust all checksums – if the remaining sequence doesn’t contain any “bad” springs, we’re good.

This approach was very fast in part 1 but when I tried it for part 2, it was painfully slow. With some extra debugging, I saw that it’s chewing through each line but some of them take many seconds to process.

I came to the conclusion that I should try memoization – there was a very good chance that certain sequence+checksums combinations do repeat. Adding a simple cache in ETS was very straightforward. And BOOM! Part 2 cracked! Although, I admit, when I saw 850504257483930 as my output, I was seriously afraid it’s way too high.

Part 1

MinAvgMax
Brute, JIT0.970235s1.005418s1.101696s
Brute, No JIT1.595984s1.672389s1.925581s
Efficient traverse, JIT0.003736s0.004082s0.005317s
Efficient traverse, No JIT0.008716s0.009372s0.011776s
Memoized traverse, JIT0.009096s0.010023s0.013193s
Memoized traverse, No JIT0.011752s0.012488s0.014494s

It’s noteworthy that the memoized version is noticeably slower in this case. It’s a very good example that adding a cache doesn’t always guarantee improved performance (although often it does). In unlucky cases – it may even slow things down!

Part 2

MinAvgMax
Memoized traverse, JIT0.771392s0.823118s0.909711s
Memoized traverse, No JIT0.887931s0.911738s1.002154s

It’s a bit late when I’m writing it but I’m going to check tomorrow how the memoized version performs with parallel processing! I’d like to try the following:

  • 4 processors, shared cache
  • 4 processors, individual caches
  • processor per line, shared cache
  • processor per line, individual caches

Do you have your favourite? My bet is on “4 processors, shared cache” – less context switching and it’s probably more likely to find a value cached by another processor. Of course on the other hand we’ll have a certain overhead of parallel access to ETS. Perhaps it’s a good idea to try Persistent Terms too?

Day 13

Part 1

Since the task involved lots of comparisons between lines that consisted only of two possible values, I decided to play with bitmaps today. Simply put, “..#.” would be converted to 0010. Integer comparisons are way cheaper than matching two lists, maps or other complex data types.

MinAvgMax
JIT0.000969s0.001066s0.001594s
No JIT0.001432s0.001507s0.001705s

Part 2

At first, it seemed to be very expensive to introduce mirror cleaning to the current logic. However, when I examined the provided examples once again, I checked if there was exactly one reflection in each pattern in part 1 after all. Since it turned out that it is the case – it made the process way easier, as I could discard all the ideas I had about multiple reflection candidates, looking for the longest one etc.

I chose to continue my fun with binary operations, as verification whether two lines/rows have only one symbol different can be performed by XORing the values and checking if the result has only 1 bit set (i.e. only one bit was different in the input integers).

MinAvgMax
JIT0.000995s0.001046s0.001255s
No JIT0.001513s0.001561s0.001741s

Bonus

I promised to benchmark the parallelized version of day 12 part 2 solution. To keep the table at a reasonable size, I’m not going to test “No JIT” variants this time. Here it comes!

Note: Persistent term has only “Avg”, as it is very time consuming to clean the created persistent terms. Therefore, I’d have to start the script from scratch on every attempt and it won’t be comparable to the other results.

MinAvgMax
Original0.817489s0.896030s1.093369s
4 processors, shared cache0.676303s0.832595s1.285685s
4 processors, shared cache, read concurrency0.735774s0.858655s1.215173s
4 processors, shared cache,write concurrency0.293589s0.329503s0.404031s
4 processors, individual cache0.195259s0.237037s0.267828s
4 processors, persistent term1.227170s
one per line, shared cache0.617219s0.727700s1.056201s
one per line, shared cache, read concurrency0.619448s0.832244s1.472503s
one per line, shared cache, write concurrency0.265507s0.304517s0.419459s
one per line, individual cache0.164978s0.181643s0.208810s
one per line, persistent term19.110257s

Conclusions

  • Persistent term is not suitable for caching purposes, at least not in this scale.
  • `write_concurrency` flag was a positive surprise – I expected that this implementation is more likely to have bursts of reads than writes. Of course the latter are guaranteed too but I expected the ratio between them to be different.
  • In fact – `read_concurrency` decreased the performance by a measurable factor!
  • In some applications per process caches may be more efficient than a shared one.
  • I lost the bet. 🙁 It seems that sometimes spawning as many workers as necessary may be a bit better strategy. However in the real world it would make throttling and resource control more challenging.
  • DO NOT update Persistent Term Storage in bursts concurrently.

Day 14

Part 1

Mirrors’ alignment, just like after the launch of JWST! Well, the telescope is not calibrated with rocks. Fortunately.

MinAvgMax
JIT0.000548s0.000627s0.000764s
No JIT0.000680s0.000787s0.001001s

Part 2

I had enough off-by-one mistakes for a lifetime.

So we have to perform lots of rotations in this part. It’s quite obvious that 4 billion transformations of a large table is not something we can brute force. At least not with an average hardware.

We had to implement moving the rocks in all 4 directions. I decided it’s probably better to rotate the platform clockwise and always slide the rocks “up”. Now I’m pretty sure that I made my life harder with this, since rotating arrays represented as a list of lists of positions is possible but requires more mental effort than just sliding in 4 different directions.

Nevertheless, the number of iterations clearly indicated that the platform arrangements had to loop at some point. But the important “gotcha” today was the fact that the loop doesn’t begin from the first iteration – there is an offset.

And this is where I made lots of tiny mistakes in my implementation: offset was off by one, the loop length was off by one, the number of remaining steps was off by one. Or by two or three at some point too.

The good news is that this is the last island! Now we’re slowly going to retrace our steps!

MinAvgMax
JIT0.251737s0.268959s0.315577s
No JIT0.296287s0.318066s0.370740s

Day 15

Today’s puzzle was so easy that I’m seriously concerned about the possible difficulty level on Saturday. I learned my lesson yesterday and I did not try to over engineer the solution with ultra efficient data structures and I went for a plain map of lists and used standard list manipulation functions from OTP. Lucky for me, there were absolutely no performance traps in the input (or at least I didn’t encounter any).

Since the solution is pretty straightforward, let’s visit our science corner again and in reference to Day 14, talk about “Archimedes’ Heat Ray”. Some of you may be familiar with the concept but let’s recap.

Certain sources describe a powerful weapon used during the Siege of Syracuse by the Greek to defend against Roman ships. It is claimed that 70 mirrors were used to focus the sunlight in order to set enemy vessels on fire. It was verified more than once by “Mythbusters” that this was not only unlikely (only a minor damage was done over a relatively long time and under perfect conditions) but also impractical (it’s easier to burn a ship with more conventional means).

I suppose that the usage of focusing lenses could help to a degree but I think we can conclude that the Lava Island’s devices are heavily enchanted to enhance sun rays power output!

Sources:

Part 1

MinAvgMax
JIT0.000231s0.000276s0.000444s
No JIT0.000549s0.000569s0.000709s

Part 2

MinAvgMax
JIT0.002239s0.002878s0.004472s
No JIT0.003118s0.003938s0.007493s

Day 16

Due to the relaxed mood of Saturday and the nature of today’s puzzle, I decided to have some fun.

Part 1

Have you ever played these puzzle games where you have to use various optical instruments to manipulate the laser light to reach certain targets? Usually it involved not only plain mirrors or splitters but also prisms, filters etc. Although I don’t remember any specific name, I did enjoy solving them and AoC brought back these memories!

I am aware that probably it would be more optimal to keep a list of concurrent rays (their position and movement vector) but we’re implementing the solution in Erlang – let’s model each ray as a process!

We begin with a single ray/process. Bouncing off the mirrors is easy. In case of encountering a splitter, two new processes are spawned, heading in opposite directions. The “old” process waits for them to finish. A common ETS table is maintained as a global list of energized tiles (if a ray passes a tile that was energized in the same direction – the ray may be terminated).

MinAvgMax
JIT0.017631s0.105174s0.221526s
No JIT0.018842s0.024031s0.045180s

Part 2

Part 2 uses exactly the same ray tracing method, with the only difference being that now we have many more initial, parallel rays! Each initial ray has an individual ETS table assigned, so states do not collide.

Solution is picked from the complete list of all results from all initial rays.

MinAvgMax
JIT5.526677s5.796180s6.208749s
No JIT5.783146s6.083252s6.757137s

Day 17

The 17th puzzle made me question my CS degree. Although I understood pretty quickly that I’m dealing with a directed weighted graph and I’m going to need an algorithm to find the shortest path. It’s been too long since I needed one but a quick search refreshed my memory – I had to use Dijkstra’s Algorithm.

Part 1

However, Mr Dijkstra didn’t have to solve Advent coding problems. At least I think he didn’t. I struggled quite a lot with finding a good modification of the base version of the algorithm. I did suspect that I can’t operate purely on `{X, Y}` vertices but what would be a good representation that won’t explode the memory?

With a trial and error and a bit of inspiration from Reddit, I understood that each vertex consists of:

  • 2D coordinates
  • Current movement vector
  • How many vertices were visited already with the movement vector

This means that each 2D coordinate can have 4 (up, left, right, down) * 3 (crucible can move in a specific direction only between 1 and 3 tiles) = 12 variations. Fortunately we don’t have to consider all of them but instead of naively populating the priority queue with all combinations we need to generate neighbors of the current node on the fly: one along the movement vector if less than 3 moves were made, plus node to the left and to the right of the vector (with new vectors and tile counter reset to 1).

When all vertices are visited (i.e. no more valid neighbors to generate), we have to pick the lowest distance of all “flavors” of the bottom right corner from the distance table.

MinAvgMax
JIT10.285737s10.386970s11.041682s
No JIT13.660166s13.760896s14.031004s

Part 2

Fortunately the only required change here is the way neighbors are generated. Instead of always generating “left” and “right” turns plus “forward” until the 3rd tile in a row, now we don’t generate “left” and “right” until the 4th tile (exclusive). Also, “forward” moves are generated until the 10th tile (exclusive).

While the solution worked, I considered its performance unacceptable. My primary suspect was the priority queue implemented on top of an ordinary sorted list. When I replaced it with the implementation that uses `gb_trees`, the script became over 30x faster!

MinAvgMax
Priority queue on list, JIT*95.956005s
GBT priority queue, JIT2.857002s3.075149s3.250199s
GBT priority queue, No JIT3.127588s3.467205s3.767187s

* Since a single execution of this version takes about 90-100s, I’m providing the result of just a single run as “Avg”.

Day 18

Part 1

I’ve started solving today’s puzzle with printing my input trench in the terminal – just to ensure there are no nasty edge cases. Lucky for me – I saw a “boring” polygon with no surprises like intersecting or overlapping edges.

Staying true to my recent conclusion to avoid premature optimizations, I’ve reused the algorithm from day 10. Actually, I’ve rewritten it to look nicer, but the core remained almost the same. The important difference is the fact that edges count into the area.

MinAvgMax
JIT0.002791s0.003466s0.005258s
No JIT0.003157s0.003568s0.004584s

Part 2

It’s SO OBVIOUS that AoC deities love throwing huge integers at us out of nowhere. I knew these hexadecimal “colors” in the input file look suspicious but I admit this caught me a bit off guard.

Human beings learn every day and after today I’m richer with knowledge about Shoelace Formula and Pick’s Theorem. It is possible I came across them in the past but well… it’s easy to forget about things we don’t use every now and then.

After all, it wasn’t rocket science – simply a matter of translating a well-known and proven formula into Erlang.

MinAvgMax
JIT0.000226s0.000272s0.000474s
No JIT0.000353s0.000396s0.000511s

Day 19

When an Erlang developer sees the requirement to “send parts between workflows”, it’s very hard to resist the temptation to model each workflow as an independent process. I couldn’t.

Part 1

I chose to register each workflow process under its name from the input file. Besides the defined ones, my code also spawns special ‘A’ and ‘R’ processes and they forward all parts to the parent including their name as a tag.

MinAvgMax
JIT0.003807s0.005190s0.010223s
No JIT0.004412s0.007273s0.015742s

Part 2

I’ve kept the original architecture of running a process per workflow but their implementation had to significantly change.

Instead of a stream of parts from the input we send a single part to the “in” process and instead of specific ratings, it includes valid ranges for each parameter (so at first all ranges are [1, 4000]). Then, each workflow for each condition in its list checks if:

  • The part fully satisfies the condition;
  • The part fully misses the condition (whole range fails the check);
  • The part satisfies the condition only partially.

The last condition is the one that I got wrong at first and it led to wasted hour or two on the attempts to properly calculate the number of combinations. The part can “partially” satisfy the condition when the compared value is inside the range. In this case the part needs to be “split” into a variant that passes the check and one that fully fails it. I made the mistake of using the unchanged range for the latter one, while we need two new ranges. As a result, I got lots of overlapping ranges.

Collection of results is less elegant than in part 1, because there is only one input part and a completely unknown number of output ones. I solved it by keeping a shared counter in ETS. It keeps the number of parts still processed by workflows.

MinAvgMax
JIT0.003988s0.006356s0.012356s
No JIT0.004298s0.008020s0.022086s

Day 20

Part 1

It was so tempting to implement a parallel solution today as well! Unfortunately due to the cycles in the input and the explicit requirement to process the signals in order, I had to stick to the boring loop with a queue of signals.

MinAvgMax
JIT0.014104s0.014916s0.017333s
No JIT0.023655s0.024460s0.028135s

Part 2

Once again my attempt to be too clever and cautious backfired.

  1. I thought that the input is actually a register (link: https://en.wikipedia.org/wiki/Hardware_register) and I can detect a pattern by printing N states of all flip flops. Nope.
  2. I tried to come up with an analytical solution. It would propagate all the possible sequences from the broadcaster through the remaining modules. However, I wasn’t sure how to deal with cycles and I was afraid they would introduce excessive complexity.
  3. I tried to come up with a generic algorithm that works with any combination of cycles in the graph and an arbitrary location of the `rx` node in the graph. I considered ideas to simplify parts of the graph to make simulation more predictable and/or efficient.

Ultimately, all of these approaches failed. Then I saw a visualisation of the input made by one of the Redditors and it looked just like the graph I started sketching on a sheet of paper. I realised that the layout of the graph has a pattern and the whole problem can be decomposed into simulating 4 subgraphs.

Conveniently, each subgraph emitted a single low signal after a long series of high signals. Also conveniently, modules between `rx` and subgraphs are always connected in a way that emits low signal to `rx` only when all subgraphs emit low at the same time.

It means that everything narrows down to calculating Least Common Multiple of all 4 cycles. Generic solutions are absolutely not necessary!

MinAvgMax
JIT0.027580s0.030409s0.037126s
No JIT0.054880s0.057583s0.063418s

Day 21

Part 1

Breadth First Search again but with a twist: we can go backwards. Naive solution was efficient enough.

MinAvgMax
JIT0.055365s0.067503s0.099729s
No JIT0.063782s0.069720s0.086406s

Part 2

This was probably the most exhausting task so far.

At first, I tried to figure out a generic solution. Well, I had one but it was absolutely not fast enough. I saw at least one claim in a Reddit megathread, that one exists but I didn’t investigate the provided code. I focused on my implementation.

It became quite clear that certain properties of the input data have to be exploited. These are:

  • Each visited location behaves like a flip-flop afterwards.
  • The outer set of locations is diamond shaped due to:
    • all walkable plots on the edge of the garden map
    • direct route without rocks from starting point to the edges
  • There are only a few types of garden copies that have to be considered:
    • the original, center one
    • directly to the up, down, left and right: they are always entered in the middle of bottom, up, right and left edge respectively
    • diagonal ones: they are always entered in the bottom right, bottom left, top left and top right
  • Due to the size of the garden (131×131), subsequent instances of the same type alternate between “odd” and “even” number of steps they have to go through.
  • The number of states each garden type can have is not strictly equal to 131: the last two instances in each series have to be treated individually.

For performance reasons, instead of keeping the list of complete states of each garden, I’m using the list of numbers of the flip-flops added in each simulation step. It’s way more compact and it’s relatively easy to calculate the final state after an arbitrary number of steps. You only need to be VERY careful about “off by one” mistakes.

Then, in order to get the result, we need a sum of “on” flip-flops in:

  • Center garden after 26501365 steps
  • Gardens adjacent vertically and horizontally: in their case subtract 66 from the number of steps, because they are not reached until this step. Then we can consider pairs of them and simply add the number of ALL walkable plots. Thus the division by 262 instead of 131. For the final 2 gardens they have to be calculated individually, as IIRC their flip-flop patterns loop after 190 cycles.
  • Similar story for the diagonal gardens but we need to calculate the number of “even” and “odd” tiles, with the former having rows of 1, 3, 5, … and the latter: 2, 4, 6, …, so simple sums of sequences.
MinAvgMax
JIT0.132134s0.150313s0.209061s
No JIT0.158194s0.176707s0.241650s

Day 22

Part 1

The key to the correct solution is to understand what “snapshot” means. It’s the state of all bricks in the same point in time, not the list of subsequent falling bricks. It means that in order to perform the correct 3D Tetris simulation, we need to sort the bricks by their bottom Z coordinates.

For the space model I concluded that I don’t need to store a complete map of occupied locations, because to find where a brick stops and what bricks are going to support it, I only need to remember the highest Z for every {X, Y} pair and what brick is already there.

This worked pretty efficiently, because the sizes of the bricks were not that big. Otherwise, we’d probably need to play around with ranges again.

Also, for the purpose of this solution my code maintains the complete map from brick ID to the list of bricks that support it and those that are supported. What heavily payed off in part 2.

MinAvgMax
JIT0.002166s0.002955s0.009769s
No JIT0.003778s0.004995s0.011249s

Part 2

Remember kids – always ensure you’re executing the correct file!

My first implementation had a bug that caused it to run over 2 minutes (although it produced a correct result!). I thought the slowness is caused by using lists a lot and I wanted to try to improve the performance by replacing them with sets and maybe to add some caching. So I copied the original to the new file. I’m sure you can guess which one I kept running by mistake and what level of frustration I reached?

By the way: The faulty line was the recursive call to the fall function. I forgot to deduplicate the list of fall candidates for the next iteration.

All in all, given the second parts of the recent puzzles, this one was really benign! The core of the solution is to check if the supporting bricks for the current candidates are already removed. Then the candidate is removed as well.

MinAvgMax
JIT0.065106s0.075434s0.100548s
No JIT0.093616s0.109218s0.138036s

Day 23

Part 1

Naive BFS with traversing one tile at a time worked reasonably fast. I did not perform any extra processing of the input in this part.

MinAvgMax
JIT0.223056s0.272931s0.359681s
No JIT0.288227s0.410099s0.576530s

Part 2

I underestimated how the “slopes” reduced the complexity of the problem. The major optimisation I could think of was to transform the “map” into an undirected weighted graph. The vertices are the “crossroads” and, obviously, the weights are the distances between them (excluding starting tile, including the next crossroad one). It is possible to acquire both vertices and distances in a single pass but I didn’t want to introduce the extra complexity and increase the chance of making a mistake.

The solution was produced in a bit over 20s with sets being used to store visited vertices and with a somewhat better result with plain lists. There aren’t that many vertices so we’re still in the sweet spot where list operations (adding a head and checking membership) are faster than sets’ API.

I was wondering if there is any optimization I can introduce. I even drew the complete graph and I couldn’t see any reasonable pattern. Then I checked Reddit megathread and it seems that there is no other way than to perform a plain brute force DFS. In the generic case the longest path problem is NP-hard.

Note: The benchmark of today’s solution has a sample of 10 runs instead of 100.

MinAvgMax
JIT14.803970s15.28991s15.694286s
No JIT20.148084s21.007705s22.846742s

Day 24

Part 1

In order to solve this puzzle, we just need to play around a bit with algebraic expressions. Since we only need to take ‘x’ and ‘y’ coordinates into account, we need to find where each pair meets after adding a vector multiplied by a certain factor.

When there is no solution, then it means the lines are parallel. If either ‘a’ or ‘b’ (or both) are negative, it means the lines intersect in the “past”.

MinAvgMax
JIT0.003238s0.003538s0.005159s
No JIT0.008181s0.008480s0.011210s

Part 2

This time the expressions are way more convoluted!

MinAvgMax
JIT0.000162s0.000181s0.000266s
No JIT0.000469s0.000515s0.000615s

It may be difficult to track what’s happening so let’s explain step by step:

  1. We begin with two equations for the moment where the rock and a hailstone meet in the X and Y axis. “t” stands for the time passed from the moment the rock is thrown, “S” in subscript means the parameter is related to the rock (Stone).
  2. It turns out that we can get rid of nonlinearity by transforming the expressions so we have “t” on the left in both of them and we can join them.
  3. After multiplying and simplifying everything, we have the nonlinear part to the right. Now, if we take any two hailstones, put their values into instances of the expression, we can join them again and get rid of the nonlinear part, just like in the previous step.
  4. We do some more grouping and simplifying. Finally, we have a linear expression of 4 variables (rock’s starting X and Y plus its velocity vector’s X and Y value)! Its coefficients are calculated from any pair of hailstones.
  5. We take 4 unique pairs and voila! We have a system of four equations with four unknowns! I’ve used Gauss’ algorithm to solve it.
  6. Now we are able to calculate the “t” of collision with any hailstone. It also means that it’s possible to find “Z” axis values for the rock – it’s a simple system of two equations with two variables.
MinAvgMax
JIT0.000162s0.000181s0.000266s
No JIT0.000469s0.000515s0.000615s

Day 25

I’ve immediately discarded the idea of choosing random edges and testing the graph without them. In the case of my input file there were over 1500 edges! It means that in the worst case I would have to try over 560 million triplets!

At some point I tried to find all adjacent triangles, because this is what would work with the example data. Guess what – it failed miserably with the real file…

My next idea was to perform BFS from every vertex and count how many each edge was traversed in total. With a bit of luck – the three bridges would stand out more or less significantly. After all, they would have to be traversed more often than other edges to reach all vertices.

However, before I implemented this concept, I decided to google a bit about this problem and I learned about Karger’s algorithm. The problem is, it is probabilistic (I don’t think that AoC deities would like such a solution) and has potentially quite high complexity. I saw someone employ algebra and matrix operations to get the solutions but I decided that I left such advanced math long behind me in my university days and it would probably take too much time to relearn all the necessary information.

So, eventually, I tried the “edge heat” approach. To my delight – the bridges were quite nicely visible in the example input and were clearly outstanding in the real one! With a set of three edges it was easy to adjust the incidence map and perform BFS one last time to get the size of both subgraphs.

MinAvgMax
JIT3.564212s3.867032s5.548451s
No JIT4.533271s4.794619s5.873216s

Conclusion

It was quite a ride!

Some days were pure fun, some were a pure headache. If I were to choose the ones I enjoyed the most, these would be: ex aequo first place goes to day 6 (boat race) and day 24 (hailstorm) – for an amusing usage of algebra. On the second place we have day 12 (hot springs) – because it reminds us about the vast benefits of clever caching. The third puzzle I’d like to highlight is day 19 – parts workflows, because the parallel solution was very natural and looked cool in Erlang code.

Among the most dreadful ones, the ultimate winner is the day 21 (infinite gardens), as it relied heavily on the properties of the input and was very difficult to solve without painful and obscure off by one errors. All kinds of 2D plane operations were tiresome too, as these aren’t the kind of problems I deal with in my daily work so I definitely lack practice and knowledge about essential algorithms. Second place belongs to day 17 and its “Clumsy Crucible” – a very unorthodox usage of the classic Dijkstra algorithm was mindblowing in both good and bad ways. Last but not least on the podium is day 14 – dish calibration with rocks. Again, it was too easy to make “off by one” mistakes and it took me a long time to eliminate them.

Also I need to mention that almost every time JIT brough more or less significant improvement. Although most of the software we write in Erlang has a rather different focus than AoC puzzles, it is quite obvious that there is a very high chance your applications will benefit from switching to the most recent OTP releases!

Will we see each other again in the journey through the Advent of Code? Who knows… a lot depends on you and your feedback! Don’t hesitate to share on LinkedIn and other social media and tell us what you think!

Keep reading

Advent of Code 2024

Advent of Code 2024

Join Lorena in this years Advent of Code 2024. She'll be solving daily puzzles throughout the month of December.

Optimising for Concurrency: Comparing and contrasting the BEAM and JVM virtual machines

Optimising for Concurrency: Comparing and contrasting the BEAM and JVM virtual machines

Attila Sragli explores the BEAM VM's inner workings, comparing them to the JVM to highlight their importance.

MongooseIM 6.3: Prometheus, CockroachDB and more

MongooseIM 6.3: Prometheus, CockroachDB and more

Pawel Chrząszcz introduces MongooseIM 6.3.0 with Prometheus monitoring and CockroachDB support for greater scalability and flexibility.