A new chess engine : m8

For more than a decade now, I’ve been programming chess engines as a hoby on and off. However I’ve not seriously worked on my current engine, MatMoi VII for almost two years now. But I’m thinking about comming back to computer chess programming.

If I do pick up chess programming again I will start over wrtiting an 8th iteration of my engine. I think it will be more enjoyable for me to work on a new code base than to work again on MatMoi with its flaws. For the last couples of weeks I’ve been thinking about what I could do differently in my next engine, m8 (read mate). I’ve isolated the following designs goals/ideas/principles that I will try to follow:

1. KISS

Keep it simple stupid. This is a principle I always try to follow while writing code, but I’ll make a special effeort to keep it in mind while writing m8. A chess engine usually contains lots of micro-optimizations that make the code less readable. In m8 I’ll make sure that each of those is really worth the increase in complexity.

2. Unit-testing and Assertions

I don’t plan on doing TDD or have 100% test coverage, but any test that can be automated will be. I also plan on heavily using assert throughout the code.

3. Testing

For a couples of years now the trend in computer chess programming is on testing everything with thousands of games. With m8 I plan on doing this early and also plan on building multiples tests utilities. There is three types of test I want to be able to perform early :

  1. Positions tests suits (EPD files);
  2. Rating tests where m8 plays hundreds or thousands of games against a couples of opponent in order to measure it’s strength;
  3. Rapid self play tests. Theses are less common. I want m8 to be able to play really fast self play games using differents sets of parameters. This functionality could be use to tests differents evaluation function against each others for example.

4. Parametrization

Everything that can be parameterized should be. Especially search and evaluation parameter. Also the user should be allowed to set theses parameter in a configuration file, on the command line or in the command line interface. The idea is to make it as easy as possible to test differents values for any parameters of the engine.

5. Open formats

If possible any files read or written by m8 should be in an open format. Things like the opening book should not be in binary format if possible (I say if possible because I’m not sure it would be efficient to have a plain text opening book). An opening book in text format can be read and modified by a human. If text format is not an option an open binary format should be used (Think SQLite).

6. Fisher Random Chess

I’ll implement FRC from the start. It’s not too hard to implement and since there is not many FRC engine it could make m8 more appealing.

7. YBWC

I’ll try to make m8 my first multiprocessor engine.

8. Blog

I’ll try to keep a journal of my ideas and experiments and if I think some of theses could be useful or interesting to others I’ll try to blog about it.

Should we sort files before or after merging them?

This weekend I produced a very large database of chess positions. This database is useful to chess engine programmers and researcher. The database is a text file containing one position per line encoded using the Forsyth-Edwards Notation.

At some point in the process of creating this file I had multiple files containing more than 600 million lines with lots of duplicates (positions occurring in more than one game). I needed to use the GNU sort utility to merge those files and remove duplicates. I had two options : I could join the files in one single large file and sort it, or I could sort the files individually and use the sort utility to merge them together and remove duplicates at the same time. After some research on the internet I could not find any advice about which route might be the best. So I decided to test it.

I’ve taken a more manageable subset of my file containing 9.5 million lines, I shuffled it with :

sort --randomized-sort -o testdata testdata

Then I splited this file into roughly 10 equals parts :

split -n l/10 -d testdata testdata

I’ve run the first scenario (join the files and sort the result) four times with the command below. The average time for the four run is 69.4 seconds with the slowest run taking 70.9 seconds.

cat testdata* > testdata; sort -o testdata testdata

I’ve run the second scenario with the small bash script below. The four run average is 70.9 with the slowest run at 71.1.

#!/bin/bash
for f in testdata*
do
 sort -o $f $f
done
sort --merge -o testdata testdata*

While the second method is marginally slower it’s not really significant. It kinds of make sense. GNU sort implements a merge sort which is already a recursive sort. By sorting the individual files before hands we are doing exactly what GNU sort is doing internaly, that is merge sorted chunks of the data.

 

FEN Database

Back in 2003, I collaborated with a friend to generate a collection of chess positions in FEN format. Back then we were creating this collection to test the efficiency of the Zobrist hash function for chess positions. Since we only needed the pieces positions on the board we only kept the first field of the FEN notation.

In 2005 I mentioned on rec.games.chess.computer that I had such a database and since then a couple of persons contacted me by email asking me for a copy of the database. Unfortunately, most of them could not make use of it because important information about the positions (such as whose side is to play next) were missing.

This week, someone contacted me again asking for the database and I decided I would build it again, this time keeping all FEN fields.

The database is created using all the PGN files I have collected over time. I also updated my TWIC collection up to last week before I started. I used the pgn2fen.exe tool available on Tim Foden’s site to create huge files containing every FEN positions of every games in my PGN files. I then used the GNU sort utility to sort the files and remove any duplicate lines. That gave me a file with 286,623,962 unique positions, weighting 16.8 gb. I used the GNU split tool to split this file in five more « manageable » files. I used the round robin distribution capabilities of split so that each file would contains position representative of the whole collection even if they are sorted. Lastly, I compressed the files with 7za, because it was the most effective and free compression method I found in my (limited) researches. The files went from 3.4 gb to 0.65 gb each for a total of 3.3 gb.

You can dowload each separate files below. On linux you can expand them using :

7za e uniqueXX.fen.7z

On Windows you will need to use 7zip wich is free and has a graphical interface.

MatMoi 7.15.0-cct

Je vient de publier pour la première fois le moteur d’échecs que je développe dans mes temps libre depuis quelques années. Vous pouvez le télécharger sur cette page. Vous pouvez y télécharger des versions 32 et 64 bits, autant pour Windows que pour Linux. Pour jouer contre le moteur, vous aurez besoin d’une interface graphique comme Arena (Windows) ou XBoard (Linux).

MatMoi, dans sa version actuelle à un classement aux alentours de 2250 ce qui surpasse la majorité des joueurs de club, mais qui reste quand même relativement faible en comparaison des moteurs commerciaux.

Depuis le tournoi CCT de 2010, j’ai repris le développement de MatMoi de façon plus sérieuse et j’espère l’améliorer de façon significative dans les prochains mois.

Vous trouverez plus d’informations sur la page de MatMoi (en anglais) de ce site.

MathieuPagé.com

Voilà, il était un peu temps que j’ai une présence en ligne. J’ai donc concocté ce site dont voici la recette :

  • Un peu de WordPress pour la section blog,
  • Façonner le thème par défaut de WordPress pour utiliser votre propre en-tête,
  • Un peu d’html/javascript pour les autres pages,
  • Une photo, dont vous êtes l’auteur en provenance de votre profile Flickr.

Bien mélanger dans VIM pendant 4 heures, utiliser le GIMP au besoin. Et voilà!

Donne une portion

Je n’ait pas l’intention de devenir un blogeur assidue, d’autres le font déjà beaucoup mieux que je ne pourrais le faire. Ceci dit, j’utiliserais peut-être ce blog occasionnellement, si j’ai quelque chose à dire. Quoique je sais pas si ça intéresse quelqu’un ce que j’ai à dire 🙂

Dans les prochaine semaine, je vais publier le jeux d’échecs, MatMoi, que je développe depuis quelques années. Si ça vous êtes intéressé, vous devriez regarder la section MatMoi.