Still not totally convinced this is finished, since it took such a long time. I started with the 2025-01 FIDE XML players list and culled it with a ChatGPT-generated Python script so that only players rated 2000+ were included. I then converted that to XLSX (amongst other formats) and copied out the spreadsheet column containing the FIDE IDs of those players. With another Python script, I was able to download a JSON file for each player from the FIDE API, using a wrapper from a GitHub repository. These JSON files have all the information for those players: not just general information, but the rating and number of games for every month in which they have a rating, so the files turned out to be pretty long. The next step was converting all of those into a new XML players list, which includes all the history as well as the general information. Even though the number of players is drastically reduced to only a bit more than 19,000, the new XML players list is still about twice the size of the old one. I did make sure to streamline the elements so that, as much as possible, they resemble the elements in the regular players list.
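
For anyone curious, here's roughly what that first culling step can look like. This is only a sketch, not the actual script: the element names (playerslist, player, fideid, rating, rapid_rating, blitz_rating) and the filenames are assumptions about the combined FIDE XML list, so they may need adjusting.

```python
# Sketch: stream the FIDE players list XML, keep players rated 2000+ in any
# time control, and collect their FIDE IDs along the way.
import xml.etree.ElementTree as ET

INPUT_XML = "players_list_xml_foa.xml"   # hypothetical filename
OUTPUT_XML = "players_2000plus.xml"
OUTPUT_IDS = "fide_ids.txt"
MIN_RATING = 2000

def rating_of(player, tag):
    # Treat a missing or non-numeric rating as 0
    text = player.findtext(tag)
    return int(text) if text and text.strip().isdigit() else 0

kept, ids = [], []
# iterparse streams the file instead of loading hundreds of MB of XML at once
for event, elem in ET.iterparse(INPUT_XML, events=("end",)):
    if elem.tag == "player":
        best = max(rating_of(elem, t) for t in ("rating", "rapid_rating", "blitz_rating"))
        if best >= MIN_RATING:
            kept.append(ET.tostring(elem, encoding="unicode"))
            ids.append(elem.findtext("fideid") or "")
        elem.clear()  # free the element's children once it has been handled

with open(OUTPUT_XML, "w", encoding="utf-8") as f:
    f.write("<playerslist>\n" + "".join(kept) + "</playerslist>\n")
with open(OUTPUT_IDS, "w", encoding="utf-8") as f:
    f.write("\n".join(ids) + "\n")

print(f"Kept {len(kept)} players rated {MIN_RATING}+")
```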

https://www.mediafire.com/file/enoh1ljui47g305/fide-ratings-and-history-2000+-240115.zip/file

The ZIP is about 40 MB, while the uncompressed XML file is about 1.2 GB. I'm not sure how it compressed that well, but it did; highly repetitive XML, with the same element names repeated for every player and every month, tends to compress very well.

The players list is provided with each month, but it isn't included in the archives for previous months. While it would be somewhat useful to have an archive of them, there's no reliable way to build one. However, the three ratings lists are provided as XML and can be combined with a script. So, using a Python script written by ChatGPT, I'm able to create combined XML files which don't contain the inactive players but could nonetheless easily stand in for the players list. These can be provided for every month for which there are XML files. Here is December 2024.

https://www.mediafire.com/folder/86n30ijqd3cos/2024-12
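
As a rough idea of how such a merge can work, here's a sketch that keys everything on the FIDE ID. The filenames, element names, and the "i" flag convention for inactive players are assumptions, not taken from the actual script.

```python
# Sketch: merge the three monthly FIDE XML lists (standard, rapid, blitz)
# into one combined file, skipping inactive players.
import xml.etree.ElementTree as ET

FILES = {
    "standard": "standard_rating_list.xml",  # hypothetical filenames
    "rapid": "rapid_rating_list.xml",
    "blitz": "blitz_rating_list.xml",
}

players = {}  # fideid -> merged record
for control, path in FILES.items():
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag != "player":
            continue
        fid = elem.findtext("fideid")
        flag = (elem.findtext("flag") or "").lower()
        # Assumption: flags containing "i" (i, wi) mark inactive players
        if fid and "i" not in flag:
            entry = players.setdefault(fid, {"fideid": fid,
                                             "name": elem.findtext("name") or ""})
            entry[f"{control}_rating"] = elem.findtext("rating") or ""
            entry[f"{control}_games"] = elem.findtext("games") or ""
        elem.clear()

root = ET.Element("playerslist")
for entry in players.values():
    p = ET.SubElement(root, "player")
    for tag, value in entry.items():
        ET.SubElement(p, tag).text = value
ET.ElementTree(root).write("combined_list.xml", encoding="utf-8", xml_declaration=True)
print(f"Combined {len(players)} active players")
```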

I've made many attempts now to create the perfect Python script for tracking Stockfish commits (via ChatGPT, of course), and finally realized that the problem was that almost everything necessary is already at Abrok. Official builds are over at the official site anyway, but all the development commits are there, going back to 2018 or so. Here is a list of those commits, with direct links to the executables.

https://www.mediafire.com/file/v3y5kjnvxuj5528/stockfish-data-250113.txt/file

The FIDE players list is a combination of the three ratings lists (standard, rapid, and blitz), but it is very large and includes many players who are registered but have no rating. After running a Python script to keep only the players rated 2000 or above, I can provide a much smaller list that might also be more useful. In the MediaFire directory for 2025-01, you can find it in XML, CSV, JSON, XLSX, ODS, and TXT.

https://www.mediafire.com/folder/kpyvsijjhsiiw/2025-01
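
Producing the other formats from the filtered XML is mostly a matter of flattening it into a table and letting pandas write everything out. Here is a sketch with hypothetical filenames; it assumes openpyxl and odfpy are installed for the XLSX and ODS output.

```python
# Sketch: fan the filtered 2000+ list out into CSV, JSON, XLSX, ODS, and TXT.
import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse("players_2000plus.xml")   # hypothetical filename
rows = [{child.tag: child.text for child in player}
        for player in tree.getroot().iter("player")]

df = pd.DataFrame(rows)
df.to_csv("players_2000plus.csv", index=False)
df.to_json("players_2000plus.json", orient="records", force_ascii=False)
df.to_excel("players_2000plus.xlsx", index=False)                 # needs openpyxl
df.to_excel("players_2000plus.ods", index=False, engine="odf")    # needs odfpy
df.to_csv("players_2000plus.txt", sep="\t", index=False)          # plain tab-separated text
```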

CSVs made from the full tables downloadable from:
https://training.lczero.org/matches/?show_all=1
https://training.lczero.org/networks/?show_all=1

These are tables of data related to the Leela Chess Zero self-test matches and networks. Aside from all the relevant data in the matches table, you can also turn the first value in each record (the ID field) into a URL for downloading the PGN itself, if you put it in the following form:

https://storage.lczero.org/files/match_pgns/1/ID-NUM.pgn

It should be noted that while most of the matches come from run 1, some don't, so the /1/ before ID-NUM is variable as well; the run number is also listed in the table along with the ID.
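
A short sketch of that conversion, assuming the exported CSV has columns named id and run (check the actual header and adjust):

```python
# Sketch: turn rows of the matches CSV into PGN download URLs.
import csv

with open("matches.csv", newline="", encoding="utf-8") as f:  # hypothetical filename
    for row in csv.DictReader(f):
        match_id = row["id"]
        run = row.get("run", "1")  # most matches are from run 1, but not all
        url = f"https://storage.lczero.org/files/match_pgns/{run}/{match_id}.pgn"
        print(url)
```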

The data for the networks is similarly relevant, but it can't be turned into URLs as easily. The links come with the table, but converting it to CSV strips them out, and so far I don't have a way around that. So instead I have the networks storage page itself, which is available at: https://storage.lczero.org/files/networks/

https://www.mediafire.com/file/ki2bt9vgbq55298/networks_storage_output.csv/file

The above link is to a CSV I just made of that networks page. Since it's a static snapshot, it's of limited use. In the next iteration of the script, I'll try to put all this functionality together. Or, rather, get ChatGPT to do so, as I couldn't code my way out of a sack of pythons.
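
One possible workaround is to scrape the links straight off that storage page. Here is a sketch, assuming the page is a simple directory-style listing whose anchor tags point at the network files; if the markup differs, the parsing will need adjusting.

```python
# Sketch: collect network download links from the storage listing into a CSV.
import csv
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

BASE = "https://storage.lczero.org/files/networks/"

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href != "../":  # skip the parent-directory link
                self.links.append(urljoin(BASE, href))

parser = LinkCollector()
parser.feed(urlopen(BASE).read().decode("utf-8", errors="replace"))

with open("networks_links.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["url"])
    writer.writerows([[u] for u in parser.links])
print(f"Wrote {len(parser.links)} links")
```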

NOTE: Those initial HTML links should be saved directly to the hard drive. Attempting to view either page will likely crash your browser tab, as both are far too large to display. Hence downloading them directly via the browser and then converting them to CSV, which is what I did, and what these files are:

https://www.mediafire.com/file/fmjc441b7y9dvht/lc0_data_241220.zip/file

And here is the Python script ChatGPT wrote to generate the CSVs. Using PyInstaller (and with ChatGPT's help, of course) I managed to get it into EXE form. All you have to do is double-click it, and it will automagically download both HTML files for you, convert each to CSV, and put everything into whatever folder the EXE happens to be in:

https://www.mediafire.com/file/8abgkivf5tl5fl9/lc0_tables_generator.zip/file
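
For reference, here's a minimal sketch of that download-and-convert step, not the packaged script itself. It assumes the data is the first table on each page and that pandas.read_html can parse it; on pages this size, both the download and the parse will be slow and memory-hungry.

```python
# Sketch: fetch both "show_all" pages and convert the table on each to CSV.
from io import StringIO

import pandas as pd
import requests

PAGES = {
    "matches.csv": "https://training.lczero.org/matches/?show_all=1",
    "networks.csv": "https://training.lczero.org/networks/?show_all=1",
}

for out_name, url in PAGES.items():
    html = requests.get(url, timeout=600).text        # these pages are huge
    tables = pd.read_html(StringIO(html))             # assumes the first <table> holds the data
    tables[0].to_csv(out_name, index=False)
    print(f"Saved {out_name} ({len(tables[0])} rows)")
```

Packaging something like this as a single EXE is typically just a matter of running pyinstaller --onefile on the script.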