Overview of pgn-extract
pgn-extract is a command-line tool for searching, cleaning, and manipulating chess game files in Portable Game Notation (PGN). Below is a concise outline showing core usage and features.
1. Basic Usage
- Syntax:
pgn-extract [flags] [input-game-files]
- If no arguments are given, it reads games from standard input and writes valid games (in SAN) to standard output.
- Example:
pgn-extract -o clean.pgn raw.pgn
- Use -r to check for errors without writing output:
pgn-extract -r raw.pgn
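As a minimal sketch of the usage described above, the following writes a one-game file, pipes it through pgn-extract via standard input, and then runs an error check with -r. The file names sample.pgn and clean.pgn and the game itself are illustrative assumptions; the calls are guarded so nothing runs if pgn-extract is not on the PATH.

```shell
# Write a tiny illustrative PGN file (Fool's Mate). The name sample.pgn is hypothetical.
cat > sample.pgn <<'EOF'
[Event "Example"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "White"]
[Black "Black"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1
EOF

if command -v pgn-extract >/dev/null 2>&1; then
    # No file arguments: games are read from standard input and
    # the valid ones are written to standard output.
    pgn-extract < sample.pgn > clean.pgn
    # Report errors only, without writing any games.
    pgn-extract -r sample.pgn
else
    echo "pgn-extract not found on PATH; skipping" >&2
fi
```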
2. Output Control
- -o <file>, --output <file>: Write matched games to a new file (overwrite).
- -a <file>, --append <file>: Append matched games to an existing file.
- -n <file>: Write unmatched games to a file.
- -C, --nocomments: Remove comments.
- -N, --nonags: Remove NAG symbols.
- -V, --novars: Remove variations.
- --nomovenumbers, --noresults, --notags: Suppress move numbers, results, or tags.
3. Searching and Filtering
- -x <file>: Positional variations. Matches if a game reaches the position described by moves in that file.
- -v <file>: Textual variations. Matches sequences of moves (supports wildcards *, !, etc.).
- -t <file>: Tag-based criteria, e.g. players, dates, results, Elo, or FEN positions.
- -T...: Limited command-line tag filter (player, date, result, etc.).
- --fenpattern <string> / --fenpatterni <string>: FEN-based pattern match in the game.
- --materialy, --materialz: Endgame or material-based filtering (e.g., R vs N endgames).
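The criteria files used by -t and -v are plain text. The sketch below creates one of each and applies them to a hypothetical raw.pgn. It assumes that a tag file holds one tag criterion per line, that criteria on different tags are combined, and that a variation file holds one move sequence per line with * standing for any single move. All file names are illustrative, and the details are worth checking against the manual shipped with your version.

```shell
# Tag criteria for -t: games involving a player named Fischer with result 1-0.
# (Assumption: criteria on different tags must all match.)
cat > tags.txt <<'EOF'
Player "Fischer"
Result "1-0"
EOF

# Textual variations for -v, one move sequence per line.
# (Assumption: '*' matches any single move.)
cat > vars.txt <<'EOF'
e4 c5 Nf3 d6
* b6
EOF

# Run only if the tool and the hypothetical input file are both present.
if command -v pgn-extract >/dev/null 2>&1 && [ -f raw.pgn ]; then
    pgn-extract -t tags.txt -o tag_matches.pgn raw.pgn
    pgn-extract -v vars.txt -o variation_matches.pgn raw.pgn
else
    echo "pgn-extract or raw.pgn not available; skipping" >&2
fi
```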
4. Limiting Game Length
- -b[elu]N / -p[elu]N: Filter by number of moves (-b) or plies (-p); l sets a lower bound, u an upper bound, and e an exact value (e.g. -bu20).
- --minply, --maxply, --minmoves, --maxmoves: Modern equivalents of the above.
- --startply <N>: Start matching only after N plies.
- --matchplylimit <N>: Stop searching for matches after N plies.
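A length filter can be tried on a small file with games of different lengths. The sketch below assumes that --minmoves and --maxmoves each take a number as a separate argument. The file names are hypothetical. With this sample, the bounds are intended to select only the second, four-move game.

```shell
# Two illustrative games: a 2-move game and a 4-move game.
cat > lengths.pgn <<'EOF'
[Event "Short"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "A"]
[Black "B"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1

[Event "Longer"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "C"]
[Black "D"]
[Result "1-0"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0
EOF

if command -v pgn-extract >/dev/null 2>&1; then
    # Keep games with at least 3 and at most 40 moves.
    pgn-extract --minmoves 3 --maxmoves 40 -o midlength.pgn lengths.pgn
else
    echo "pgn-extract not found on PATH; skipping" >&2
fi
```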
5. Duplicate Detection & ECO Classification
- -d <file>, --duplicates <file>: Write duplicates to that file.
- -D, --noduplicates: Skip duplicates in output.
- -U, --nounique: Suppress unique games (so only duplicates appear in output).
- -e <eco-file>: Add/replace ECO codes in the output. Defaults to eco.pgn.
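The duplicate flags can be explored with a file that contains the same game twice. This is a sketch under assumptions: the file names are hypothetical, and the two commands are shown separately because how -D and -d interact when combined is not covered here.

```shell
# One illustrative game, then the same game appended again.
cat > once.pgn <<'EOF'
[Event "Example"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "White"]
[Black "Black"]
[Result "1-0"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0

EOF
cat once.pgn once.pgn > doubled.pgn

if command -v pgn-extract >/dev/null 2>&1; then
    # Skip duplicates in the output.
    pgn-extract -D -o unique.pgn doubled.pgn
    # Alternatively, divert duplicates to their own file.
    pgn-extract -d dupes.pgn -o firsts.pgn doubled.pgn
else
    echo "pgn-extract not found on PATH; skipping" >&2
fi
```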
6. Splitting Output
- -# <N>: Split output into files of N games each (named 1.pgn, 2.pgn,...).
- -E <N>: Split output by ECO code (A.pgn, B.pgn,..., or A00.pgn,... etc.).
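The splitting flags write their output files into the current directory, so it is safest to try them in a scratch directory. The sketch below assumes that the count and level are attached directly to the flag (-#1, -E1) and that -E is combined with -e so that each game has an ECO code to split on. The file names are hypothetical, and the ECO step runs only if an eco.pgn file is present.

```shell
# Two illustrative games to split.
cat > pair.pgn <<'EOF'
[Event "First"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "A"]
[Black "B"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1

[Event "Second"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "C"]
[Black "D"]
[Result "1-0"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0
EOF

if command -v pgn-extract >/dev/null 2>&1; then
    # One game per output file: expected to create 1.pgn and 2.pgn.
    pgn-extract -#1 pair.pgn
    # Split by the first ECO letter, if a classification file is available.
    if [ -f eco.pgn ]; then
        pgn-extract -eeco.pgn -E1 pair.pgn
    fi
else
    echo "pgn-extract not found on PATH; skipping" >&2
fi
```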
7. Specialized Flags
- --checkmate, --stalemate, --fifty, --repetition, --underpromotion: Match only games that exhibit these features.
- -F: Output FEN string after the final move or replace placeholder comments with FEN strings.
- --fencomments: Place FEN after every move.
- --hashcomments: Place a hashcode after every move.
- --addhashcode: Add HashCode tag to each output game.
- --splitvariants: Output each variation as a separate game.
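Feature flags can be combined with the output options from section 2. As a hedged sketch, the following selects only games ending in checkmate and asks for the final position as a FEN string with -F. The file names and sample games are illustrative assumptions.

```shell
# One game ending in mate and one short draw without mate.
cat > mixed.pgn <<'EOF'
[Event "Mate"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "A"]
[Black "B"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1

[Event "Draw"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "C"]
[Black "D"]
[Result "1/2-1/2"]

1. e4 e5 1/2-1/2
EOF

if command -v pgn-extract >/dev/null 2>&1; then
    # Keep only checkmate games and add the final position as FEN.
    pgn-extract --checkmate -F -o mates.pgn mixed.pgn
else
    echo "pgn-extract not found on PATH; skipping" >&2
fi
```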
8. Helpful Examples
1) Convert raw PGN to cleaned SAN:
pgn-extract -o clean.pgn raw.pgn
2) Extract only games by Fischer (tags.txt contains the line: Player "Fischer"):
pgn-extract -t tags.txt raw.pgn
3) ECO classify a file:
pgn-extract -e eco.pgn -o ecoclass.pgn bigfile.pgn
4) Remove duplicates, keeping unique games:
pgn-extract -D -o unique.pgn input.pgn
5) Extract short games (at most 20 moves):
pgn-extract -bu20 -o shortgames.pgn input.pgn
9. Further Documentation
- Run pgn-extract --help or -h for a brief flag summary.
- The tool’s source code and full manual are included with the distribution.