pwmarcz.pl

Remote Chaos Experience 2020

2021-01-02

Remote Chaos Experience (rC3) was 2020's remote version of Chaos Communication Congress. Apart from the online talks, there was a huge 2D world to explore, and various online sessions. I mostly watched the talks, though, so here are the ones I liked best.

You can find the recordings on media.ccc.de.

How to survive in spacecraft is a talk about spacecraft life support systems: generating oxygen, scrubbing CO2, the chemical reactions involved. Pretty fascinating.

Operation Mindfuck Vol. 4 is a collection of random interesting facts. The speakers talk about unusual music keyboards, DIY plotters, April Fools internet RFCs ("scenic routing for IP packets"), and many others.

Scientific Literacy 101 explains how the scientific system works, and gives some tips on reading papers, looking for sources, and training your internal "bullshit detector".

Hacking the Nintendo Game & Watch was about Nintendo's recently released mini-console that allowed you to play some NES-era games. The speaker managed to reverse-engineer it (turns out it includes a NES emulator), open it up for homebrew games (there is a convenient library for programming) and emulation, and even run DOOM. The talk is well-prepared and easy to follow.

Unconventional HDL synthesis experiments - if you played with FPGAs and programming Verilog, you will enjoy this talk about compiling Excel spreadsheets to logic gates, and synthesizing circuits out of 7400-series chips.

RAMN: Resistant Automotive Miniature Network talks about hardware for experimenting with automotive software. They have a programmable board with a few cores connected by a bus, and connect them to CARLA, an open-source driving simulator. It was cool to have a look into the world of car electronics and the standards that they have to follow.

Hacking German Elections should serve as your reminder that using computers for elections is a bad idea. The authors focus on a case of vote counting (not voting itself), and show how the procedure turned out to be deeply flawed: the software was full of holes, but also there were multiple opportunities to tamper with the vote counts, and the counting was completely non-transparent to any external observer.

Escape the macOS sandbox and TCC is a security talk about new mechanisms in macOS. There are various protections (fine-grained permissions instead of root, code signing, making the base system partition read-only, etc.), but some of them are relatively recent and there are interesting ways of defeating them.

Attacking CPUs with Power Side Channels from Software: Warum leaked hier Strom? is a talk about power side-channel attacks: if we can detect how much power the CPU uses, we can not only detect what instruction is being run, but also what's the data (for example, the multiply instruction will take up more power if the numbers have more ones - with some preparation, this is enough to extract an encryption key).

Anchorhead

2020-07-25

Anchorhead jest grą interactive fiction (przygodówką tekstową) napisaną w 1998 roku przez Michaela Gentry, opartą na motywach z twórczości H.P. Lovecrafta. To bardzo ważna przedstawicielka swojego gatunku - zdaniem wielu osób jest najważniejszym przykładem Lovecraftowskiego horroru w gatunku interactive fiction, wiele razy zajmowała też wysokie miejsca na listach najlepszych gier tekstowych w ogóle. Na mnie również zrobiła bardzo mocne wrażenie. Opowiem, dlaczego.

Witajcie w Anchorhead

Bohaterka gry przyjeżdża do miasteczka Anchorhead ze swoim mężem, żeby zamieszkać w posiadłości odziedziczonej po jego krewnych. Miasteczko wita nas deszczową pogodą, mieszkańcy patrzą podejrzliwie i oczywiście różne rzeczy nie są tym, czym się wydają. W miarę, jak dowiadujemy się coraz więcej, odkryte szczegóły zaczynają składać się w straszliwą całość…

Bardzo mi się podoba, że w miasteczku Anchorhead - i w grze - znajduje się wszystko, czego oczekiwalibyśmy od tego gatunku. Jest ponura pogoda i podejrzliwi mieszkańcy. Jest też do zwiedzenia olbrzymi dom, pełen historii, tajnych przejść, i dziwnych obrazów przedstawiających nieziemskie krajobrazy oraz przodków o strasznych, czerwonych oczach. Jest opuszczony kościół, kanały, latarnia morska. Jest zagadka, którą powoli odkrywamy znajdując gazety, czytając kroniki, i rozmawiając z ludźmi. Jest sklep pełen dziwnych przedmiotów, którego właściciel jest dla nas bardzo przyjazny, ale który na następny dzień jakoś nie daje się odnaleźć. Są kultyści, którzy odprawiają bluźniercze rytuały. Wszystko jest na swoim miejscu.

Świat gry jest ogromny (121 lokacji!). Co prawda na początku fabuła jest dość liniowa (musimy zdobyć klucze do domu, wybrać się tam, i pójść spać), ale nic nie stoi na przeszkodzie, żeby od razu porozglądać się po całym mieście.

Tak duży świat w grze tekstowej oznacza, od gracza wymagane jest trochę więcej - konkretnie, trzeba samemu rysować mapę. Takie rzeczy długo mnie zniechęcały, ale kiedy już się przełamałem, okazało się, że zupełnie niesłusznie. Rysowanie własnej mapy sprawiło mi dużo satysfakcji - bardzo przyjemnie było dodawać nowe miejsc do mapy w miarę, jak je odkrywałem, i sądzę, że doświadczenie z "gotową" mapą byłoby o wiele płytsze.

fragment mapy

Topologia gry nie jest specjalnie skomplikowana na standardy gier interactive fiction (północ, południe, wschód, zachód, góra, dół), ale skala robi wrażenie. Bardzo też podobały mi się "efekty specjalne". W różnych momentach gry zmienia się pogoda, co odzwierciedlają opisy. Podczas burzy, wiatr na ulicy potrafi wyrwać nam parasol. Z różnych miejsc mamy widok na zmieniający się krajobraz. Interactive fiction to zupełnie inne medium niż "graficzne" gry, więc i efekty specjalne polegają na czym innym, ale nadal jest co podziwiać!

Mechanika gry

Anchorhead jest "tradycyjną" interactive fiction, czyli grą opartą na parserze, jak Zork i inne gry Infocomu. W przeciwieństwie do gier hipertekstowych (np. opartych na popularnym Twine), takie gry wymagają od nas wpisywania prostych poleceń z klawiatury, takich jak west, get lamp albo give keys to Michael.

Z początku ten sposób interakcji z grą może być trudniejszy, i na pewno wymaga nauczenia się obowiązujących konwencji. Ale też w zamian dostajemy więcej - w takiej grze można chodzić w różne miejsca, nosić i używać przedmiotów, rozmawiać z ludźmi… a przede wszystkim, ma się poczucie, że właściwą czynność należy wymyślić, a nie próbować wszystkich możliwości po kolei.

Trudność

Wychowałem się na przygodówkach (graficznych) LucasArts, których główną cechą była "nieprzegrywalność". Niezależnie od tego, co zrobiliśmy, w grze nie dało się zginąć ani się zablokować, więc można było próbować różnych rzeczy bez strachu.

Anchorhead… niestety nie idzie aż tak daleko. W grze można umrzeć (czasami widać nawet, jak dużą satysfakcję autor miał pisząc niektóre bardziej brutalne sceny śmierci). Daje się też zablokować (choć jest to trudne). Przydaje się zapisywanie gry od czasu do czasu.

Na szczęście gra stara się być uczciwa. Jeśli coś jest złym pomysłem, prawdopodobnie będziemy o tym wiedzieć. Specyfika gatunku sprawia też, że wczytywanie sejwa i powtarzanie części swojej drogi nie jest bolesne - nie musimy przeczekać żadnej długiej sekwencji, bo nie trzeba przecież czytać tego samego tekstu na nowo. Wystarczy, że wpiszemy kilka poleceń jeszcze raz.

Jeden, moim zdaniem, spory błąd projektowy to zarządzanie zasobami. W różnych miejscach trzeba oświetlać sobie otoczenie. Mamy kilka źródeł światła, ale bateria w latarce się wyczerpuje, a zapałki mogą się skończyć, i zostajemy wtedy w "niewygrywalnej" sytuacji. W praktyce, o ile pamiętamy o wyłączaniu latarki na czas, nie jest to dużym problemem, ale widać, że bez tego elementu gra byłaby lepsza.

Przyszło mi też do głowy, że gdyby Anchorhead rzeczywiście było "nieprzegrywalne", mogłoby sobie na więcej pozwolić - na przykład na niszczenie graczowi ekwipunku, albo odcięcie dostępu do różnych lokacji, bez wzbudzania w nim obawy, że właśnie zablokował sobie dalszy postęp.

Zagadki i fabuła

Zagadki w grze dosyć mi się podobały - w ramach gatunku (przygodówek) to raczej standard, ale trzeba było użyć różnych przedmiotów w ciekawy sposób, przekonać do różnych rzeczy postaci, uważnie czytać wycinki z gazet oraz opisy otoczenia.

Bardzo też podobało mi się, w jaki sposób wskutek naszych działań odsłaniają się kolejne części tajemnicy. Co prawda nie zawsze jest zupełnie jasne, czego w danej chwili oczekuje od nas gra (nie ma na przykład spisanej "listy questów"), ale jednak motywacje bohaterki są zwykle zrozumiałe, a dalsze działanie to kwestia pociągnięcia dalej wątków, jakie w grze się pojawiły.

Nie grałem co prawda nigdy w papierowe RPG Zew Cthulhu, ale podejrzewam, że Anchorhead byłoby dobrą przygodą do tego systemu. Zwiedzanie miasta, rozmawianie z postaciami i rozwiązywanie zagadek trochę przypominało mi granie w papierowe RPG. Gra jest też przyzwoitym symulatorem Mistrza Gry, który w jednych miejscach zostawia sporo swobody, w innych mocno podpowiada i naprowadza, a czasami ma już obmyśloną sekwencję dalszych wydarzeń, przed którą nie pozwala uciec graczom, w zamian zapewniając im na przykład widowiskowe sceny.

Anchorhead nie jest grą bardzo innowacyjną. Wiele konwencji pochodzi w prostej linii od przygodówek Infocomu. Jedna z recenzji twierdziła wręcz, że jest to jedna z ostatnich przygodówek starej daty, tuż przed tym, jak w interactive fiction zaczęto więcej eksperymentować z formą. Jeśli graliście w przygodówki (tekstowe albo graficzne) oparte na zbieraniu przedmiotów i rozwiązywanie zagadek, odnajdziecie się szybko.

(To nie znaczy, że innowacji nie ma w ogóle! Na przykład Anchorhead było nominowane do nagrody za najciekawiej skonstruowanego NPC. Jest nim Michael, mąż bohaterki.)

Na pewno za to jest to gra doskonale dopracowana. W wielu miejscach widać dużą dbałość o szczegóły - wspomniane efekty atmosferyczne, podpowiedzi, możliwości dane graczowi. Na ówczesnym silniku do przygodówek autor uderzał już mocno w limit objętości, więc w pewnym sensie Anchorhead jest tak duże, jak to było możliwe.

Jak zagrać

Oryginalna gra Anchorhead z 1998 roku jest dostępna za darmo. Można ją ściągnąć ze strony na IFDB, można tam też zagrać w przeglądarce (przycisk "Play on-line"). To jest wersja, w którą grałem i na której podstawie spisałem swoje wrażenia.

W 2018 roku autor wydał nową wersję, Anchorhead - The Illustrated Edition. Ta wersja jest już do kupienia za pieniądze (10 dolarów na Steamie lub itch.io). Nie sugerowałbym się słowem "illustrated" - ilustracji jest niewiele i nie wpływają za bardzo na rozgrywkę - ale można o tej wersji myśleć jak o "reżyserskiej" wersji gry. Została przepisana w nowej technologii (w języku Inform 7), wiele rzeczy zostało poprawionych albo uczynionych wygodniejszymi. Jeśli jesteście gotowi zapłacić, pewnie warto wybrać tę wersję.

Jeśli Anchorhead to wasz pierwszy kontakt z "tradycyjnymi" interactive fiction, może przydać się mały wstęp od Andrew Plotkina - na jego stronie znajdziecie m.in. ściągę z najczęściej dostępnymi poleceniami.

Warto też pamiętać, że jako gra z 1998 roku Anchorhead niekoniecznie jest łatwe. Kilka razy pomagałem sobie solucją (również dostępną na stronie IFDB), i nie czuję, żeby to był powód do wstydu.

Linki

Anchorhead na IFDB - gra, informacje, recenzje
Oficjalna strona gry - odnowiona wersja z 2018 roku
IF Theory Reader (PDF) - książka na temat interactive fiction, zawiera m.in. artykuł autora Anchorhead pod tytułem "Parser at the Threshold: Lovecraftian Horror in Interactive Fiction".
Wywiad z autorem z 2006 roku w serwisie Game Couch
Recenzja wersji z 2018 roku w serwisie PC Gamer

Making Autotable

2020-05-24

Autotable is an online tabletop simulator for Riichi Mahjong. It's a project that I really enjoyed working on, and despite the simple core idea I put a lot of effort into improvements and tweaks. I hope this will make for an interesting story.

Why make this?

I am not very good at mahjong.

I like playing it casually, and I don't particularly care about improving my skill. I really enjoy the social aspect of playing with people and talking with them, handling the tiles, and the overall atmosphere of the game.

That's why I don't like mahjong computer games, or platforms like Tenhou. They make the game nothing like the "real thing". The play is fast and streamlined. They pause to ask me for calls (pon / chi), or propose 3 different ways to declare riichi before I realize I can do it. The computer knows my hand better than me, and what I really liked in the game gets automated away.

So when we were all staying at home social distancing, and I got a craving for mahjong, I investigated online tabletop simulators. There is Tabletopia and Tabletop Simulator. You can play mahjong in both. Tabletop Simulator is especially known for its modding community, and there are several player-made setups for mahjong.

On closer look, I'm not satisfied. Tabletopia, for instance, wants you to treat your hands of tiles like hands of cards in poker, and instead of building a wall of tiles, you pull them from a bag. Functionally, it's the same, but it doesn't feel like mahjong anymore.

Tabletop Simulator is better, because its engine is very flexible. Still, I wasn't happy with how your own tiles have to go in a special "hidden" zone. All I wanted was to have them standing on the table, facing in my direction, with no way for the opponents to peek.

This is how I arrived at my idea. What if there was a simulator geared specifically for mahjong? A game that allowed you to do all the things required in a mahjong game, but do them by yourself.

Compared to existing, heavily "automated" mahjong games, I wanted to have something that didn't care about turns, winning and rules. I'm okay with restricting some players' freedom - there is no need for flipping the table, throwing the tiles or building a pyramid out of them - but on the other hand, the player should never be forced to take this or that specific tile, and there shouldn't be any things happening out of the blue.

"No rules" means also no need to implement the rules! Minefield Mahjong was already pretty challenging, and programming a regular four-player game, with all the different special cases, would be much, much harder.

Or, to put it in yet another way: this should be a game that allows at least for some real-life mahjong manners, "anti-manners" and idiosyncracies (such as pulling your tile too soon, or doing some things in a specific order).

What about some more flashy cheating, like stacking the wall and exchanging it with your hand? Maybe someday! :)

Appearance

While a mahjong game is three-dimensional, with some tiles stacked on each other, it still has a rather simple layout. In the beginning, I thought I could fake it with drawing simple sprites, in a strategy-game-like view.

I had really great tile pictures already, but drawing them at an angle meant I also had to draw sides and colored backs. I had some early designs and a prototype for that:

However, I got discouraged when I realized that if I'm manually specifying how to draw the tiles, I would have to do so for every possible rotation: standing, lying face-up, lying face-down, sideways…

So I scrapped the idea of 2D graphics and used a 3D engine called three.js. I figured that I could still use it to generate simple "2D-looking" visuals, but at least all the math related to objects' positions an geometry would be taken care of.

As it turns out, I'm still using a simple orthographic projection for the game. I think it looks nice and clean. There is also a "perspective mode", which some players prefer for the immersion. I think it's harder to use but hey, it's a feature that I got basically for free.

The orthographic view has one problem. The far bottom tile in the wall is hard to reach:

far bottom tile

To make it somewhat usable, I had to keep the view angle at 45 degrees (with a steeper angle, the bottom tile is hidden completely), and color it darker. It still could work better - I'm planning to increase the "hitbox" so that this tile is easier to grab even when your mouse cursor is not touching it.

Textures and models

The textures for the game are generated from SVGs. I like that because I can freely adjust the texture resolution, edit them in Inkscape, add details and so on.

The models aren't that simple. Initially I just had some boxes. However, drawing the tiles next to each other caused them to bleed together. My first solution for that was to make some gaps between them. Later, I decided to add some gradients to the textures instead. That separated the tiles from each other visually, but gave them a cartoonish look.

Finally, I bit the bullet and used Blender to edit the model. I hadn't used Blender too much before, and had an impression that it was a very complicated and hard to use program. But actually, for my use case I was able to learn relatively quickly all that I needed.

Now I have nice models with rounded corners. Adding a few lights to the scene makes them look pretty good, and more realistic than the previous cartoon version.

shaded models

I was also able to construct a pretty nice data pipeline for building the models. With a single make command, I'm able to get Blender to export a glTF file containing all the models and the textures, which three.js is then able to import.

Gameplay

Here is the core idea for game actions. The mahjong table has several different places for tiles: your hand, the wall, the discard pond. Call them slots. The tiles are placed in slots, and you can drag a tile from one slot to another.

slots

As you can see, the tiles can be rotated in different directions, depending on a slot. Each slot has a list of allowed rotations: in hand, the tile can be upright or face-up (revealed); in wall, it can be face-down or face-up, and so on.

Corner cases

Although simple, this design already has some corner cases. For example, in a few places in mahjong you need to play a tile sideways. That means the next tiles need to be pushed to the right, or they will collide with the sideways one:

discards pushed

"To the right" is of course relative to the player position. And sometimes the push is actually from right to left:

discards pushed

The way I do this is there is a "push" relationship between slots: slot A pushes slot B. When a tile is rotated, I go through the relationships and check if any in these slots tiles are pushing each other.

Another, more interesting corner case is the action of drawing a tile to your hand. Previously, it worked like this:

drawing a tile

The tile is hidden while dragged, and rotates when you drop it to destination. This is usually what you want, but in this specific case, it means you need to first drop the tile into your hand, then pick it up again once you see what it is.

A better solution is to rotate a tile immediately when you hover it over your hand, like so:

drawing a tile with immediate reveal

Now drawing and discarding becomes a single motion. It looks like a small thing, but it's really a bigger deal than it looks like. It also means that if you do decide to keep the tile, you can put it in the right place in your hand:

drawing a tile with immediate reveal

This auto-rotation feels so natural that when I added it, the players got used to it without noticing anything has changed!

Another corner case is "secret" slots that are normally not active. For example, when you run out of space for discards, you are supposed to continue on the last line:

last discards

However, this situation happens rarely, and allowing you to use these places right away would be confusing and unnecessary. They are activated only when slots to their left are filled.

Paying

In Riichi Mahjong, you pay other players with point sticks. You also use a 1000 point stick as a deposit when declaring riichi. This felt important to the feeling of the game so I wanted to keep it.

Unfortunately, it would be too much to try and fit all the players' point sticks on the screen. Instead, I added a special "look down" key to switch the view. I regret not being able to keep all of the game information on a single screen, but in a way, this is also pretty realistic. Most of the time, you are focused on the game, and in a real-life automatic tables, the sticks are actually in a drawer that you need to open first.

riichi

What about paying other players? You can do that by placing the sticks on the table, but where? For some time, I was afraid I need to allow free placement everywhere on the table. While that's not hard to implement by itself, the implications are difficult: you need to have a collision system, and probably allow the players to rotate the sticks.

Instead, I extended the "slots" system and made some dedicated places on the table for the exchange between players. I think it's a good compromise.

paying

Mouse position

The player needs to be able to drag the tiles with mouse. So, apart from projecting the image to the screen, an opposite operation is needed: given the mouse coordinates on screen, which object are we pointing at?

A simple technique for that is ray casting: projecting a ray from the camera and through the mouse coordinates on the screen, then seeing if it intersects any object.

Once we have picked an object, we have to decide where in the 3D space the player is dragging it to. This is easy if we assume that the object is being dragged in the same horizontal plane.

mouse positions when dragging

An interesting implication of that is that the mouse coordinates are three-dimensional. In fact, they have to be, because we have different players and they all see the same mouse pointers. Otherwise, what looks good to you, would look bad to a player viewing the scene from opposite side of the table:

mouse positions for two players

As a result, when you move the mouse in a straight line, the pointer as seen by the other players might jump around. I don't think it looks too bad, and in a sense, it's completely accurate:

3d cursor translation

To sum up, these are the rules for determining your mouse positions in 3D:

If you're not moving over any tile, the mouse is at "ground" (table) level (intersection of ray and table).
If you're moving over a tile, the mouse is whenever you point at the tile (intersection of ray and tile).
If you're dragging a tile across screen, the mouse is at whatever level you started dragging (intersection of ray and a plane at the same elevation as initial mouse position).

Mouse motion

I mentioned synchronizing the mouse position with other players. When I started the project, I wasn't even sure if that was necessary, but it ended up being a very important effect. It really contributed to the realism of the game.

In the beginning, I sent the mouse position on each mousemove event. That looked pretty smooth - on localhost. When running the game on a server, the mouse started to jump around as the packets were arriving in uneven intervals.

I haven't measured very rigorously what the problem was, but I thought I was being pretty wasteful with bandwidth, and with server CPU time. So I rate-limited the mouse position updates to 10 per second. The effect was somewhat choppy, but good enough. It allowed me to play the first game with friends.

motion without smoothing

Then, I tried smoothly interpolating the position. The improvement was drastic. It really looked as if the other person was moving the mouse!

motion with smoothing

Behind the scenes, what's happening is that the game receives waypoints and shows the mouse as moving between them. New waypoints arrive 10 times a second, so the mouse motion is always played with a 100 ms delay.

waypoints

As I mentioned before, the messages might come in uneven intervals. There is no guarantee that the waypoints will arrive exactly every 100 milliseconds, and if you just render them as they arrive, the movement will also be uneven. Because of that, the messages carry a timestamp. Of course, I cannot use a timestamp from remote machine directly - your timestamp is not the same as my timestamp - but it's enough to calculate the interval between the remote timestamps (usually 100 ms), and maybe apply some limits if a waypoint arrives suspiciously late or early.

waypoint transfer

I was hesitant to add mouse smoothing at first. The problem with this technique is that I'm introducing a universal 100 ms delay to all mouse motion. I was worried it would be noticeable, and also that it would "infect" other interactions, like dropping the tiles.

My worries turned out to be mostly unfounded. This is not a fast-paced game like an FPS, and nobody notices the delay. Adding delays in other parts of the code was not necessary: the player drops the tile slightly sooner than the mouse motion would suggest, but dropping a tile usually doesn't happen "at full speed" anyway - the player slows down the mouse first, and verifies it's in the right place. There is some "landing time", so to speak.

Tile sorting

When playing mahjong, it's likely you will want to sort your tiles by suit and number value. This is a more complicated interaction: you're not just dragging a tile from one place to another, but moving other tiles in the process.

Unfortunately it's also an interaction I couldn't do without. If the game auto-sorted the hand, it would be really convenient for the player, but that would no longer be the game I intended to make.

sorting tiles

It really took me a long time. I spent several 1 am sessions trying to get it right, each time only to realize halfway through I made a horrible mess out of my code and I'm too sleepy to write good code, and to throw away everything I wrote with git reset --hard. Finally, on the third or fourth attempt I got something that worked and didn't inflict too much damage on surrounding code.

The final algorithm works something like this:

Take the list of tiles being moved (called held tiles).
Find if they can be dropped somewhere. Call it a list of target slots. If the target slots are all empty, then we're done.
If the target slots are not empty, check if the tiles in these slots can be shifted to the left or to the right. If so, temporarily display the shifted tiles in the new location.
If the player doesn't decide to drop the tiles to target slot, return the shifted tiles to their positions.

I experimented with moving the shifted tiles in a smooth motion, instead of having them jump around, but it looked very unnatural.

Initially, the "shifted tiles" were only shifted locally, until you dropped the held tiles to destination. Later I decided to send them over the network as well, so that everybody sees you sorting the tiles. And here's where the 100 millisecond lag turned up! Because the mouse movement is delayed compared to everything else, the tiles appear to move ahead of time:

sorting without delay

It looked pretty weird, so I added a 100ms delay to the shifted tiles as well. So far, all the other parts of the game work correctly without any added delays. This one might have been special because the tiles so closely follow the mouse cursor.

sorting with delay

Optimization

My mahjong table is not a very complicated 3D scene, but when you count the objects - 136 tiles, 60 point sticks, some assorted simple meshes - it does add up. On a powerful machine, that's not a problem, but when I first tried the game on my laptop, the framerate was pretty bad. In power saving mode (on battery), the game barely reached 20 FPS. I wanted to have a buttery-smooth 60 whenever possible!

It turns out that a common advice is to "merge your geometry". Static meshes, like scenery, can be often combined into one bigger object, and sent to GPU in a single draw call.

My tiles move rarely, but they do move. So I tried the second common advice, which was to use geometry instancing. With geometry instancing, if you need to draw many copies of a model, you send the model to GPU once, along with a separate array with all the positions for that mesh.

There were two complications, though:

The tiles are not all the same! Their faces are different. Fortunately, this can be worked around by writing a custom shader - a piece of GLSL code running on the GPU. My shader overrides the UV (texture) coordinates for some of the vertices:

if (vUv.x <= TILE_DU && vUv.y <= TILE_DV) {
  // If this is part of tile front, apply the
  // offset to draw the right front.
  vUv.x += offset.x;
  vUv.y += offset.y;
} else if (vUv.y >= 4*TILE_DV) {
  // If this is part of tile back, or side,
  // apply the offset to draw the right back
  // (there are yellow and blue tiles).
  vUv.y += offset.z;
}

There are many visual effects. A tile can be…
- Hovered-over with cursor: draw with a slightly lighter color.
- Held by player: draw it on the top of everything else.
- Non-droppable: if we cannot drop the tile here, draw it as slightly translucent.
- Selected: draw a glowing outline around the tile.
- Bottom row: for a usability hack mentioned earlier, the bottom tiles are drawn darker.
It's really hard to customize so many attributes of the object when using instancing, and in some cases (like drawing the objects on the top), probably impossible. Instead, I use an instanced mesh for all the "normal" tiles, and separate, regular meshes for the "custom" tiles (of which there are usually only a few on the screen).

framebuffers

Two consecutive framebuffer states, as captured by SpectorJS - you can see that the darker tiles were drawn before, and all the other tiles are drawn in a single batch.

The instanced tiles were a huge improvement. Combine that with some places where I avoid recalculating too many things (since, well, the tiles rarely change positions), and I now reach 60 FPS on most computers I test the game on. When I disable the limit, Google Chrome on my desktop computer is able to calculate almost 1500 frames per second!

chrome FPS

Of course, I might have had a much easier time with performance if I didn't use a 3D engine, but 2D sprites. Using something else than JavaScript/TypeScript and browser could also help. Still, I'm very happy with my technology choices - it's hard to get more portable than a HTML5 in-browser game, and the graphics engine is pretty flexible.

Database sync

All of that game state needs to be synchronized between players. The server is a relatively small part of the project, but it still had to undergo some evolution.

At first, I thought it would be simple. Synchronize a set of Things (objects on the board) between player. And each player controls its own data, like nickname, so also synchronize a set of Players.

Later, as it grew bigger, these two "object types" turned out not to be enough. There were more pieces of data - the match state (who is the dealer), what kind of tiles we're using, the mouse position, and so on.

Instead of hardcoding more things into the protocol, I decided to make the server a simple database. It became a simple key-value store that allowed you to modify objects like "nick 2" or "thing 151". The objects are divided into different collections (here: nicks, things, and so on). On the client code, I can refer to these collections.

For example, in the client code, I might write something like:

client.mouse.set(client.playerId(), {x, y, z, time});

Which sends the following data over the websocket:

{
    "type": "UPDATE",
    "entries": [
        ["mouse", "X123", { "x": 0.5, "y": 0.75, "z": 2.0, "time": 1590335831556 }]
    ]
}

The nice thing is that the server is a relatively simple program that doesn't know anything about the data. When I add a new feature, usually I only update the client (in-browser game), and I don't have to change the server at all.

Wait, did I say simple? Well, there is a number of features that I needed to add to my key-value database:

Rate limiting (for the mouse updates). Okay, this one is client-side only.
Weak references to players. When a player disconnects, I want the server to clean up some data (like their nickname or mouse position).
Unique attributes. This is to prevent conflicts: in a rare case, two players might attempt to put different tiles in the same slot. That would break the game, so the server rejects a update that violates this constraint.
Ephemeral messages. I'm using the server to also propagate sounds (such as dropping the tile on the board) to other players. I achieve it, like everything else, by sending an object to the server. However, I don't want the object to be remembered there, because then any player that connects later will hear the sound.

The setup sounds weird, and I might be reinventing something already existing (like Firebase, maybe?) that would suit me better here. Oh well… it does what I want, and it's a local optimum.

Reconnecting

At first, I was worried about publishing the game and letting visitors from the internet play it, because I didn't have a procedure for deploying a new server version. I would have to restart the server, throwing away all the games stored in memory! For Minefield Mahjong, a solution to that problem involved an SQLite database and a separate "replay" subsystem.

…Then I realized it's not a problem at all. Everything about the game is already stored on players' computers! If I restart the server, they can reconnect and re-populate the database.

Of course, storing all state on the client is a cheating risk. There is nothing stopping you from hacking the game. However, the game is a sandbox anyway, and all kinds of illegal actions are possible. So I could argue that this is by design, and trying to harden it is not worth the added complexity - nobody is going to play a serious competitive game on my platform anyway. The important part is that the players are having fun. :)

Speaking of cheating, an interesting bug came up. If you selected some tiles, and someone clicked "Deal", they got shuffled with all the others, but stayed selected for you:

I guess a real-life version would be marking your tiles somehow.

Summing up

Here is the page for the game again. The project is now mostly finished. I might add some gameplay improvements and new modes, but what I have so far is enough to play a complete game. I haven't open-sourced it, but I'll probably do so in the future.

Update: The project is now open source: https://github.com/pwmarcz/autotable

What really got me hooked is that in this project, I had users from the start. As soon as there were enough features to play a game, we started testing it with friends. Then, I built new features based on their feedback, and things that we all noticed were missing.

It was great to have this kind of "product pressure". I did a lot of refactoring and cleanup, and I'm still doing it, but it was always in order to build more features. Often, I added something in a quick-and-dirty way, even piled a few changes like that, and refactored later because I realized it's becoming necessary.

Thanks to everyone who played and helped me develop! I really, really enjoyed working on this project.

Rewriting Minefield Mahjong in Rust

2020-04-15

In order to learn Rust better, I decided to rewrite one of my previous projects, Minefield Mahjong. The game was originally written in Python, so I had an opportunity to see which ideas translate well and what needs to be different.

As a disclaimer, I'm pretty new to Rust. If something didn't work out for me, or I made a mess, it's very likely that there is a better way that I haven't found. Still, I thought it might be interesting to share what I noticed.

What is Minefield Mahjong?

Not long after Kaiji wins his freedom from the sinister Teiai Corporation, he is pulled into another crazy gamble. Invited by two of his friends, Kaiji is going to take on Takashi Muraoka, a shady casino owner, in a game of Minefield: a high-stakes two-player variant of Riichi Mahjong.

Kaiji-kun… that won't be a problem! You see, my mahjong games take no time at all!

Minefield Mahjong is a quick and tense game. After the two players prepare their initial tiles, they have to take "17 steps over the mine field", trying to play a tile that will not cause the other player to win. One misstep and you lose a huge amount of money.

Nobuyuki Fukumoto, author of Kaiji and Akagi, can make even a game of Rock Paper Scissors look incredibly captivating, but mahjong is where he really spreads his wings. The match goes on for 13 manga volumes, and there really are enough clever plays and plot twists, dramatic moments and detailed analyses, that you don't feel like it overstays its welcome.

The project

The Minefield Mahjong project started back in 2013 as a hackathon I organized with a few friends. While the game is pretty niche, the idea sounded interesting: an HTML5 game, multiplayer, over the network. I remember we used socket.io because that was the popular technology for network communication (websockets weren't fully supported everywhere yet), and Python on the server because most of us didn't know JavaScript well enough to be comfortable with it on the server.

During one day, we managed to build a "mahjong rules" engine and start the UI. We never repeated the hackathon, and the development slowed down a bit, but a friend finished the rules engine and was able to play using a terminal.

I picked up the project a few months later and was able to get it to a workable state. Drawing on the work experience from Codility candidate interface, I wrote a hacky "jQuery soup" style frontend, and the project was ready to be published.

After that, Minefield turned out to be a great playground for new ideas. One of them was persistence: the games were stored in memory, but I wanted to handle server restarts. I ended up storing games in an SQLite database and implementing a "replay" scheme for reconnecting clients.

Later, after I learned a bit of React and Redux, I rewrote the frontend. I was really happy with the improvement: not only rendering was much easier, but the whole state and network message processing code was completely separate and could be tested in isolation.

Some other improvements followed. I replaced socket.io with a "pure" websocket engine. A friend ported the code to Python 3. Still, the codebase is kind of a mess: there is a spaghetti of objects calling each other, and I never hunted down a leak that caused some abandoned games to stay on the list, or a bug that caused the bot to die occasionally. I just restarted the server from time to time.

Minefield always used gevent, a Python library that executes "normal" code asynchronously: this is good for websockets, because you don't have to spawn a separate thread for every connected user. Nowadays, the canonical way is to do that using asyncio (with special functions defined as async), so I started to port the code over, but soon ran into a wall. Because of the callback-driven style, the game logic is actually indirectly calling the server code, and as a result the async thing is infecting everything: if sending a message is an asynchronous operation, then every other function also has to be async.

It's not such a big deal as it sounds, it wouldn't be that hard to refactor the project. However, it gave me a good excuse to try out something completely different. I decided to try and rewrite the whole server in Rust. The project was big enough for me to learn something new, but also well-defined and easy to test: I would be connecting the same JavaScript frontend to the new server.

Welcome to Rust

What do you notice first when coming from Python to Rust?

Of course, first of all, the code is statically typed. Every function has to be annotated, every type has to fit. There is type inference for local variables, so you can still declare them with just let, but the type has to be unambiguous from the context.

I really like how everything is immutable by default. You cannot modify a variable unless it's declared as mut:

fn powers_of_two(n: usize) -> Vec<usize> {
    let mut result = vec![];
    let mut current = 1;
    for i in 0..n {
        result.push(current);
        current *= 2;
    }
    result
}

That's much better than the other way around (having to use const). It's not like you shouldn't use mutable variables, but you probably won't use them if you don't need them. Same with public/private: the default is private, and you need to use pub if you want your code to be visible outside.

There are methods, sort of, in that you can declare a function on a type with the familiar foo.bar() syntax. There is no run-time polymorphism, though (at least not until you ask for it). If something has a type Foo, you can be sure it has these specific fields and so on.

However, there are both generic types (such as Vec<usize> above), and traits which can be implemented by many types. For instance, you can create your own type implementing Iterator<Item = usize>, and it will work with any code that expects T: Iterator<Item: usize>.

The most famous are Rust's value semantics and borrowing rules. You can pass values around, but they will be moved, not copied. And you can create a reference to them, but they follow a readers-writer lock pattern: at a given time, you can either have many immutable references, or a single mutable reference, but not both.

let mut foo = vec![1, 2, 3];
use_foo(foo);
// foo.push(4); // can't use foo anymore, it has been passed to use_foo()

let mut bar = vec![1, 2, 3];
for i in bar.iter() {
    ...
    // bar.push(4); // can't modify bar, it's borrowed by iter() above
}
bar.push(4); // can modify bar again

The rules about references get more complicated when you have objects referring to each other, but for simple cases, they are still easy to follow. The interesting part is that they push you towards a specific style of programming. For instance, you will be much more hesitant to write methods that mutate your data (take &mut self), because it's harder to combine them.

Implementing the rules

I started the rewrite by working on the mahjong rules. There is a lot of code there that analyses all possible arrangements of tiles. In Python, you can often get by using generators:

def find_pairs(tiles):
    for i in range(len(tiles)-1):
        if tiles[i] == tiles[i+1]:
            yield ((tiles[i], tiles[i+1]), tiles[:i]+tiles[i+2:])

find_pairs([1, 2, 2, 3, 4, 4])
# -> [2, 2], [1, 3, 4, 4]
# -> [4, 4], [1, 2, 2, 3]

Combining a few functions like that is a pretty easy way to implement Prolog-style backtracking: call find_pairs (and find_groups) repeatedly, and you can generate all possible decompositions of a set of tiles.

Unfortunately, generator functions do not really exist in Rust (although there are experiments). There are iterators, and combining them using .map(), .filter() and so on works well, but defining your own iterator from scratch is harder.

Look at Python's itertools library - there are a lot of different combinators like zip(a, b), repeat(a), all defined as simple Python functions (even if the real implementation is something else, for efficiency). There are also recipes for more. There is a similar library for Rust, also called itertools, but most of the combinators there define their own type - Zip<A, B>, RepeatN<A> and so on. Doable, but much less convenient.

So that's all to say I really miss the generators from Python. In many cases, where I could just generate all possibilities at once, I wrote functions that return a Vec, because that seemed easiest. For the backtracking algorithm, I wrote a generic function that takes a BacktrackStrategy:

pub trait BacktrackStrategy {
    type Result;

    // Generate all the possibilities to try out in a given state
    fn generate(&self, state: &State) -> Vec<Possibility>;

    // Check if there are results
    fn check(&self, state: &State) -> Vec<Self::Result>;
}

// Run the strategy
fn backtrack<T>(&mut state, strategy: &T) -> Vec<Result>
where T: BacktrackStrategy
{ ... }

The backtrack function will repeatedly call generate and check, collecting all the results.

I suspect it's not a very Rusty solution. It's something more like the Strategy pattern from OOP. I could also define generate and check as function arguments passed to backtrack directly, but it doesn't make the code any simpler.

Enums and error handling

Some of the original Minefield's more problematic code was in network message handling. It looked something like this:

class Game:
    def on_message(self, msg_json):
        msg = json.loads(msg)
        type = msg['type']
        args = msg['args']
        getattr(self, 'on_' + type)(*args)

    def on_new_game(self, nick):
        ...

    def on_discard(self, tile):
        ...

It does the job, but it's very… dynamic: there is no parameter validation, no easy way to check if you missed or misspelled something, and so on. And probably passing something else in the JSON (like an array where a string is expected, or vice versa) can cause all kinds of funny behaviour.

In Rust, I created an enum type for the message, with all the possible variant specified:

enum Msg {
    NewGame { nick: String },
    Discard { tile: Tile },
    ...
}

fn on_message(..., msg: Msg) {
    match msg {
        NewGame { nick } => ...,
        Discard { tile } => ...,
    }
}

If a JSON library gives me a Msg, all the fields will have the right type. Also, the match is required to be exhaustive: if I add a new message type, and I'm not handling it in the match, the compiler will tell me to go back and fix that.

Rust's type system is quite good at making invalid states unrepresentable. If something cannot happen, maybe you can modify your type system so that it cannot even be constructed. For example, look at the following definition:

struct Hand {
    // 4 triples and a pair, OR seven pairs
    groups: Vec<Group>
}

enum Group {
    Triple(Tile), Pair(Tile)
}

You can add run-time assertions to check if "4 triples and a pair OR seven pairs" holds, but it doesn't guarantee that the code will not crash. But what if you could ensure that only these types of hands can be represented?

enum Hand {
    Normal { triples: [Triple; 4], pair: Pair },
    Pairs { pairs: [Pair: 7] },
}

struct Triple(Tile)
struct Pair(Tile)

Constructing a Hand like this is going to be more complicated, but now instead of the English-language definition of a valid hand, you have the same encoded in Rust and checked for you automatically.

Having to be always so explicit is not always easy. In some cases, I wasn't able to "encode only the possible states", and ended up, for instance, with an Option value that I manually unpack by calling value.unwrap():

struct GameRoom {
    game: Option<Game>;
}

// I can check if the game is there:
if let Some(game) = room.game {
    ...
}

// Or, if I'm sure it has:
let game = room.game.unwrap();

This is less elegant, as failure will instantly abort the program, but usually it's exactly what I want in this case: I have a programmer mistake, not a user error, and there's no easy way to recover.

Serialization

Okay, so I created the pretty enum above, but how do I read and write it to JSON? Over the websocket, I'm sending messages in the following format:

{ "type": "new_game", "args": ["Akagi"] }
{ "type": "join", "args": ["Kaiji", "key"] }

In the Rust world everyone uses a serialization framework called serde. It's very modular: you specify serialization in terms of Serde's own "type system", and it generates a series of macros for you. Then, you use one of the many "backend" libraries for a specific format like YAML, JSON or MessagePack.

Still, the existing format of messages looks pretty far from my new enum definition. I tinkered a bit with Serde's options, and it turns out I can get pretty close by passing a few options:

#[derive(Serialize, Deserialize)]
#[serde(tag = "type", content = "args")]
#[serde(rename_all = "snake_case")]
pub enum Msg {
    NewGame(String),
    Discard(String, String),
    ...
}

That gives me… almost what I want, but the args is different. Sometimes.

{ "type": "new_game", "args": "Akagi" }
{ "type": "join", "args": ["Kaiji", "key"] }

Unfortunately customizing the serialization further wasn't that simple. If I wanted the format to stay exactly the same, I would have to either:

Write my own serializer, with a match over all possible messages. There's about 20 of them, so that's not a good idea.
Do a pre/post-processing phase on the "args" array to handle it correctly.

I tried the approach number 2 for a while, and ended up with some awful code full of special cases for 1-element array, 0-element array, some random empty dict {} in one of the cases, and on. It turns out that I really wasn't careful with sending the messages in a regular format, just wrote whatever made sense at the time, in the language I was using - so messages originating from Python had one format, the ones originating from JavaScript had another, and the code wasn't exactly very strict on what fields it required.

So I scrapped all of that preprocessing, and changed the format slightly in Python and JavaScript so that it followed Rust conventions. Now I'm able to serialize messages with no custom code.

By the way, this is where I noticed I didn't have the protocol written down anywhere. I had to fish out the possible messages and their fields from different parts of the code. So I'm really happy I now have all the messages in the protocol defined in one central place.

I haven't done a lot with Serde, but I'm pretty happy with it. Later, when I wanted to save the games to disk, it literally took me only half an hour to write saving and loading the games.

Game architecture

Back when I was writing the Python Minefield, I wanted to have a "game" class that would just handle the game rules, and keep it isolated from all the real-life stuff like socket communication. So my idea was to use a callback:

class Game:
    def __init__(self, callback):
        self.callback = callback

    def start(self):
        # Send start_game to both players
        self.callback(0, 'start_game')
        self.callback(1, 'start_game')
    ...

That worked reasonably well, and I was able to write tests for many different game scenarios before working on rest of the code. But then, I realized my server would handle multiple games, and each of the games would have two players, so I needed one more class: a game room. Very roughly, it was something like this:

class Room:
    def __init__(self):
        self.game = Game(callback=self.on_game_message)
        self.sockets[0] = Socket(callback=self.on_player_message)
        self.sockets[1] = Socket(callback=self.on_player_message)

    def on_player_message(self, message):
        self.game.on_message(message)

    def on_game_message(self, i, message):
        self.sockets[i].send(message)

It's starting to look ugly. Here is a full path of a message and responses between objects:

We receive message (msg = socket.receive()) and pass it to room
- room.on_player_message(msg) executes, and passes the message do game
  - game.on_message(msg) executes, processes the message, and prepares 2 messages in response
  - The first response is passed to a callback registered in game, which is in room again…
    - room.on_game_message(response_1) executes, and passes the message to its own callback, which is in socket
      - socket.send(response_1) sends the response back to player
  - The second response is also passed to the same callback…
    - room.on_game_message(response_1) executes
      - socket.send(response_2) sends the response back to the other player

And that's not even taking into account that players can join and leave rooms, create new ones, reconnect, and so on.

Fortunately, this is something that Rust will not allow us to do easily. See the room calling game which calls room again? That looks like we are trying to use ("borrow") room again, while it's still in use! Maybe after a lot of fighting I would find a way to make it work, but at this point I'd rather look for a simpler solution.

What's simpler is to return the response: instead of registering callbacks, give back the messages to the layer above to handle. Here is how it would look like:

We receive message (msg = socket.receive()) and pass it to room
- room.on_player_message(msg) executes, and passes the message do game
  - game.on_message(msg) executes, processes the message, and prepares 2 responses
  - … the responses [response_1, response_2] are returned to room
- … the responses [response_1, response_2] are returned to socket
We send all the responses back:
- socket.send(response_1)
- socket.send(response_2)

In Python, I would probably write generators and yield the messages back, but in Rust I'm content to just return an array (Vec) of messages.

In the code, it looks a little clumsy, but it does the job. I am very happy that I managed to write a lot of "server" code that is still pure and does not involve any network communication yet.

Websockets

Network communication was the next part. It turned out that it's not that hard to do websockets in Rust, but it is a bit harder if you want to have an HTTP server that serves static files and also allows upgrading a connection to websockets. Especially if you want the code to be asynchronous - async is still quite young in Rust.

I ended up using two libraries (hyper for HTTP and tungstenite) for Websockets, but gluing them together myself. Now I know what it means to support websockets! You have to:

Verify the right headers in the HTTP request (Connection: upgrade, Upgrade: websocket, …),
Reply with a special response (101 Switching Protocols and right headers),
Spawn another thread / async task that will use the existing connection as websocket stream.

The websocket connection will need some state. Rust will not allow me to just create a struct Server and pass it to another task - that could be unsafe! There are special markers (Send and Sync) that say what kind of data I'm allowed to send or share between tasks. As a result, I had to wrap my data in:

a thread-safe reference-counting pointer (Arc), so that it's not dropped too soon, and
a Mutex that will ensure only one task will hold the data at a time.

Same as C++, mutexes follow Resource Acquisition Is Initialization, which means I don't have to worry about unlocking them manually. They are unlocked after a variable goes out of scope:

// a handle to my Game
let mutex: Mutex<Game> = ...;
{
    let game_handle = mutex.lock().unwrap();
    // now I can use game_handle
    game_handle.start_game();
    ...
}
// game_handle goes out of scope, and the mutex is unlocked

So how does it work together? I have the following pieces of code (tasks) running:

The HTTP server that is receiving requests,
For each websocket sonnection, a "reader task" that is reading the messages and passing them to Game,
For each websocket sonnection, a "writer task" that is pumping the responses back to the user,
A special "beat task" that fires every second and handles things like time limits.

What's great is that this part does not need to know anything about game rules, joining rooms and so on. The network code and the logic code are completely separate.

Launching

The ultimate test of the game was connecting it to the existing JavaScript client. Fortunately, there were also plenty of opportunities to test before that. For many scenarios, I didn't even have to think too hard about the tests, just translate them from Python. Thanks, 2013 Minefield authors!

I was also able to write a bot and connect it to the Python server. That was an important check for the protocol. The Rust implementation, once it compiled, more or less worked, but network communication was crashing all the time because of JSON format mismatches that I had to fix.

What's nice is that I could already see the difference in performance. The old Python bot sometimes took 30-60 seconds to prepare its initial hand. The Rust bot, following the same algorithm, was done in one second.

When everything was done, I published the Minefield bot and the game server on my website (https://pwmarcz.pl/minefield/). That was also a welcome change from Python: before, I had to push all the source files using git, then run pip install to upgrade the Python packages. The new deployment is just copying two binary files.

sync

Trait methods

I am a bit unhappy about the amount of "implicitness" in Rust. When you call foo.bar(), the bar method does not have to come only from the type of foo, but any other traits that are implemented by that type.

That makes a lot of sense because it's how you can have arithmetics, comparison, ordering, I/O and so on, on many different types, including the new ones that you define. However, it also means that it's easy to add new methods, and there are many libraries that add convenient traits like StreamExt for stream handling, or SliceRandom for permutations:

use rand::seq::SliceRandom;

let mut rng = rand::thread_rng();
let mut numbers: Vec<usize> = vec![1, 2, 3, 4, 5];

// the shuffle() method comes from SliceRandom
numbers.shuffle(&mut rng);

To be fair, you do have to import the trait before the method is available. Still, just looking at the code, there is no way of telling where the method came from. That's a pretty big departure from Python's "explicit is better than implicit" in a language that otherwise tries to follow this rule.

Libraries

Another difference with Python is the approach toward standard library. Python is famously "batteries included" and a lot of things can be implemented without installing anything extra. In contrast, Rust's library is deliberately small, and you need a separate library for things like asynchronous programming (futures), random number generation (rand, then rand_distr for things like normal distribution), logging (log)…

And of course all of these libraries pull their own dependencies. As a result, Minefield declares use of about 15 libraries (crates), and the build process pulls about 200 of them. While I suspect the libraries' quality is better than over at NPM, and the ecosystem seems pretty healthy overall, it makes me a bit uncomfortable.

Summary

Rewriting an existing project was a weird experience. I had the Python implementation to lean on, so I'm sure some of my solutions were a bit lazy and more Pythonic than Rusty. Still, I feel that the project was big enough to allow me to learn a fair chunk of Rust and get a feeling for the language.

Many things ended up more verbose than Python, and that didn't surprise me. However, what was a bit surprising is how much Rust is a C++, not a C. I had an image in my mind of something simple and low-level like C, with all the "unsafe" bits solved somehow. Instead, I see there is a lot of effort to introduce proper abstractions and allow the user to program on a higher level.

If you want to play the game, here is the link again: Minefield Mahjong. And you can find the source code at Github. Thanks for reading!

Kaboom: an unusual Minesweeper

2019-11-21

This is a post about development of Kaboom, a Minesweeper clone with a twist.

Minesweeper for Windows 3.1

Apparently Minesweeper has a pretty long history for a computer game, but I guess most people remember the versions bundled with Windows. I was never good at Minesweeper but I enjoy a game from time to time. Some people play more seriously, see for yourself if you want to enter that rabbit hole.

And if you just want to get in the mood, go watch Minesweeper - The Movie.

The idea

Recently, I had an idea: what if you had to play Minesweeper against the computer?

Normally, the arrangement of mines is decided at the start of the game (except for some trickery so that you cannot lose on the first click). But what if there was no pre-determined arrangement, and the game was allowed to choose after you play?

It could be quite cruel: if you are playing on a square that could contain a mine, it would always contain one! So the you have to always prove the square is safe.

Board with hints

(On the above board, the squares marked with . are guaranteed not to contain a mine, and the squares marked ! are guaranteed to contain one. The question marks are uncertain: maybe if you reveal more squares, you can deduce something about them.)

On the other hand, there are situations where you are forced to guess:

Need to guess

One of the bottom squares contain a mine, but it's impossible to say which one. You have to select one of them. But according to what I just said, that would mean certain death! I wanted the game to be cruel, but now it's unwinnable.

So I'll modify the idea a bit and say you are allowed to guess, but only if there are no safe squares left. This way, the game will be cruel, but fair.

In other words:

If you play a square that is guaranteed safe, it's empty.
If you play a square that has a guaranteed mine, it will contain a mine and you will blow up.
If you play a square that is uncertain, then:
- If there are any safe squares on the board, you are punished for guessing and that square will contain a mine.
- Otherwise, guessing is allowed and that square will be empty.

Mines at the boundary

How to implement such a game? I could try computing all possible boards, but that doesn't sound realistic: even a small 10x10 board means 2^100 possibilities. Selecting just the ones that contain exactly N mines doesn't help us much.

Fortunately, I don't have to care about the whole board. We don't known anything about mines not adjacent to labels. I just care about the ones at the boundary, the rest could be determined completely randomly.

Labels, boundary, outside

Then, I can compute all possible arrangements of mines at the boundary, consistent with the labels. Backtracking is a good technique that will brute-force all combinations but also quickly back out as soon as we determine a branch of computation is impossible.

Combining the possibilities

Above, there are 2 possible arrangements of mines on the boundary. By combining them, we know which squares are guaranteed to be empty or full.

I also need to track total number of mines. So the arrangements are really like "5 mines at the boundary, 5 mines remaining on the outside". This is important because otherwise I might generate too many mines on the boundary (or too few!)

So, I have all possibilities. What happens when the user chooses to reveal a square?

Select a random possibility (but one that satisfies the "cruel, but fair" rules). This will determine mines at the boundary.
Randomly scatter the remaining outside of the boundary.
If the selected square contains a mine, game over.
Otherwise:
- Compute the new label for revealed square
- Reveal additional squares if it happens to be a 0
- Forget the possibility we selected earlier! Only the labels will be binding from now on.

This is very inefficient

For smaller boards, this is OK. Usually there is only a few possible combinations… hang on, what is this?

1712 possibilities

Oh no.

18 million possibilities

Somehow I managed to unlock 18 million possible mine arrangements. My Firefox is taking up 12 gigabytes of memory and revealing a square takes half a minute. Clearly, I need a better algorithm.

You might say that since Minesweeper is NP-complete, I cannot escape exponential running times. And that's true in the general case, there will be "evil" positions that take a lot of time to compute. But most of the time, for random positions and a small board, I can do much better than traversing millions of possibilities.

I don't need to store all the combinations. I don't even need to compute all the combinations. What I need is a way to:

check if a given square is guaranteed safe, guaranteed dangerous, or uncertain,
find any valid possibility (possibly with additional requirement that a given square is empty, or full).

And if you look at the screenshots, there are many clusters of ? ? ?, but they are possibly independent of each other. Maybe I can reason about parts of the board in isolation. In fact, there are already tools for automated reasoning that implement all kinds of clever tricks…

Finding a solver

Instead of implementing the clever tricks myself, I am going to use a SAT solver. These are tools that take a formula consisting of boolean variables, and search for a set of values that would make the formula true. Which is more or less what we need here.

A more powerful class of software is SMT solvers which operate on richer set of values and formulas, such as first-order logic (quantifiers), arrays, integers and so on. It would certainly help to at least be able to specify some equations on integer numbers. However, I am looking for something working in a browser. People managed to port sophisticated tools like the Z3 prover to browser, but the WebAssembly version is a 17 MB download and that sounds like an overkill here.

So I found MiniSat, a small SAT solver, that has been compiled to Javascript by Joel Galenson. The compiled file is just 200 kilobytes so I'm going to go ahead and grab it.

CNF formulas

SAT solvers operate on conjunctive normal form (CNF) formulas. A CNF formula is "an AND of ORs", for example:

(a | ~b | ~c) & (c | d ~e) & f

You can convert any propositional logic formula (variables, and, or, not, implication) into CNF, so it's something like a universal format.

So how do we use it? Let's say we have a board:

? ? 1
2 ? 1

If I create variables for the unknown squares (clockwise: x1, x2, x3), they will need to satisfy these equations:

x1 + x2 + x3 = 2
x2 + x3 = 1
x2 + x3 = 1  (same as previous)

But how to express "a sum of variables is 2" in CNF?

I figured out a way, which I learned later is called "binomial encoding" and is the most straightforward encoding people use. You need to consider all possible subsets of variables. For example, for x1 + x2 + x3 = 2, you need the following formulas:

For every subset of 2 variables, at least one is true. That ensures the sum is greater than 1.
- (x1 | x2) & (x1 | x3) & (x2 | x3)
At least one variable is false. That ensures the sum is less than 3.
- (~x1 | ~x2 | ~x3)

For x2 + x3 = 1, I need a similar set of formulas:

At least one of the variables is true: (x2 | x3)
At least one of the variables is false (~x2 | ~x3).

Putting it together, I will have a CNF formula with 6 clauses (parts). In the standard DIMACS format:

The clauses are all terminated by 0, and negation is marked with a minus. If I plug it into MiniSat (try that for yourself), I'll get:

SAT 1 2 -3

That means that MiniSat found a solution where x1 and x2 are true, and x3 is false. Here is how the board would look like:

! ! 1
2 . 1

The whole program is a bit more complicated than that: this is just a single solution, another one exists. So in order to find out whether x1, x2, x3 can ever be true (or false), I need to make more queries. I need to ask "given the above formula and also x1, is it satisfiable? what about the above formula and also ~x1"?

The encoding means that I need to find all possible combinations (e.g. all subsets of 3) of a set of variables. However, for a given equation there will be only up to 8 variables, and so the formula is usually small enough for MiniSat to solve quickly.

Keeping track of number of mines

Unfortunately, that is not a complete solution! I still need to track how many mines are left. Some combinations should be impossible, because otherwise you can generate more mines than is allowed and the game will become unwinnable.

11/10

In fact, the opposite case is also possible: if an arrangement contains too few possible mines, the game will crash because there will be no place left to put the extra ones.

So I need to specify in the SAT formula that "the number of mines is at least X and at most Y". Initially, I thought I could just use the earlier trick with all combinations. Unfortunately it doesn't work too well with larger numbers. If there are, say, 20 squares and 10 mines, then after plugging the numbers into binomial coefficient we'll find out the number of combinations is already into 6 digits!

This is when I learned there are many many other ways of encoding the sum of variables as a SAT formula. You need to create a circuit that will combine the individual variables. See for instance this StackExchange answer or this one.

I ended up implementing the one from a paper called Efficient CNF encoding of Boolean cardinality constraints, by Olivier Bailleux and Yacine Boufkhad. It's a tree that recursively adds unary numbers (or, depending on how you look at it, sorts the bits so that all ones are at the beginning):

Counter circuit

At the end of this circuit, you get a sorted set of "output" variables. To assert that the sum is at least X, check that first X output variables are 1. To assert that the sum is at most Y, check than the last N - Y output variables are 0.

Unfortunately, while much better than using all possible combinations, this circuit is still pretty wasteful as it generates Ө(N^2) clauses. When the number of open squares is around 100, the game becomes sluggish. We can still optimize.

Reducing the number of queries

After implementing all of that, I noticed I could still reduce the number of queries to the solver. I wanted to determine the status of all the squares (i.e. check if they are guaranteed dangerous, guaranteed safe, or neither). I did that using a simple loop. Let's say the board is described using a formula F:

Solve for F & ~x1 to check if x1 can ever be 0
Solve for F & x1 to check if x1 can ever be 1
Solve for F & ~x2 to check if x2 can ever be 0
Solve for F & x2 to check if x2 can ever be 1
and so on.

What I noticed is that if I do get a solution for F & ~x1, the solution will contain assignments of all the other variables as well. This already answers many other questions: if the solution contains x2 = 0, I don't need to ask if x2 can ever be 0 because I already know that. (If I don't get a solution for a given query, well, it doesn't give me any extra information). This allows me to reduce the number of queries by about 2 to 5 times, depending on the arrangement.

Caching

This still doesn't solve the problem of a huge formula generated by the "counter" circuit. As I said before, the number of clauses is on the order of N^2. On a big board, the formula can be as big as 10,000 clauses.

Fortunately, most of the time we know many fields for certain. If a field is guaranteed empty, or guaranteed full, it will never change! That means we can cache its value and not include it in the SAT formula. Once we determine a status of a field, we don't ever need to include it in calculation again. Only the uncertain fields (?) will be kept as long as they are uncertain.

This optimization is a bit scary: we no longer have a formula stating the correctness of the entire board. If everything else works as planned, that's not a problem, but it might make bugs harder to track.

Another corner case: playing outside

Should you be allowed to click anywhere on the map, outside of boundary?

Initially I thought that it should be treated the same as guessing: if there are no safe squares, you can just click anywhere on the board. But some friends thought it was weird that it would guarantee you an empty spot anywhere.

So I changed it so that clicking outside is always punished (as long as the game has not run out of bombs to punish you, that is). With the exception of game start, of course, because then the whole board is "outside".

But it turns out that there is another corner case related to that: what if all your boundary fields are deadly?

3 . .
. . .
. . .

You have no choice but to reveal something else. This case could make the game unwinnable right at the start. So now there is another exception. You are allowed to play outside of boundary when:

nothing is revealed yet, or
all bombs in the game can be proven to be on boundary (clicking outside must be safe), or
all fields on the are certain to have bombs (you are forced to click outside).

Update: The change turned out to be controversial, as the restriction is somewhat artificial. I added a switch that will allow/disallow outside guesses.

That's all

You can play Kaboom here. Try enabling the debug mode: it makes the game trivial, but shows well how it works underneath.

The source code is at Github. It's not very pretty.

You might be also interested in a related minesweeper game by Simon Tatham, the creator of PuTTY. His version has a different twist: it's always solvable without guessing.

« Prev 1 2 3 4 5 Next »