Thousands of enciphered historical manuscripts are buried in libraries and archives. They include diplomatic correspondence, intelligence reports, private letters and diaries, and manuscripts related to secret societies. Most of these manuscripts will remain undeciphered unless the processes involved in decoding them can be automated. Our aim is to develop resources and computer-aided tools for decoding historical source material, using AI and cross-disciplinary research that involves computational linguistics, computer vision, cryptology, history, linguistics and philology.
Within the DECRYPT project, we release open-access resources and tools to facilitate research in historical cryptology, supporting the collection, analysis and decipherment of historical ciphertexts. The resources comprise collections of encrypted sources and historical texts with language models. The tools support the processing of encrypted sources from transcription to decipherment. Our resources and tools are listed below and described in our scientific publications.
The DECODE database contains a collection of digitized images of ciphertexts and encryption keys, along with metadata about their provenance, location, transcription, and any available cryptanalysis or commentary. The database is searchable, and all records are open to the public. HistCorp is a collection of historical corpora and other useful resources and tools for researchers working with historical text.
We provide tools for the transcription and decipherment of historical ciphers using advanced machine learning algorithms. Historical cipher images can be transcribed, i.e., transformed into a computer-readable text format, with the help of the TranscriptTool. The transcribed ciphertext can then be corrected and used as input to CrypTool, which assists you in breaking a wide range of historical ciphertexts.
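To make "computer-readable text format" concrete, here is a minimal sketch of a first analysis step that becomes possible once a cipher image has been transcribed: counting how often each cipher symbol occurs. It assumes a hypothetical transcription in which every symbol is written as a whitespace-separated token; this is an illustration only, not necessarily the format produced by the TranscriptTool or expected by CrypTool, and the function name `symbol_frequencies` is made up for this example.

```python
from collections import Counter


def symbol_frequencies(transcription: str) -> list[tuple[str, float]]:
    """Return (symbol, relative frequency) pairs, most frequent first.

    Assumes (hypothetically) that cipher symbols are separated by whitespace.
    """
    symbols = transcription.split()
    counts = Counter(symbols)
    total = len(symbols)
    # An empty transcription yields an empty list (no division takes place).
    return [(sym, n / total) for sym, n in counts.most_common()]


# Invented sample of a numeric cipher, for illustration only.
sample = "12 45 7 12 88 45 12 3"
for sym, freq in symbol_frequencies(sample):
    print(f"{sym}\t{freq:.3f}")
```

A frequency table of this kind is only a starting point; the project's tools rely on considerably more advanced methods, such as the language models mentioned above.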
University of Siegen, Germany, cryptanalysis
Eötvös Loránd University, Hungary, history
Uppsala University, Sweden, natural language processing
Universität der Bundeswehr München, Germany, cryptanalysis
Computer Vision Center, Universitat Autònoma de Barcelona, Spain
Budapest University of Technology and Economics, Hungary, software design and development
University of Siegen, Germany, cryptanalysis
The CrypTool Team, Germany, cryptanalysis
University of Amsterdam, history
Eötvös Loránd University, Hungary, history
University of Siegen, Germany, cryptanalysis
Computer Vision Center, Universitat Autònoma de Barcelona, Spain, computer vision
MSc student at TU Munich, Germany, software design and development
Uppsala University, Sweden, data processing
We provide a collection of encrypted historical sources and AI-based tools for their automatic analysis and decryption.
Resources
DECODE database: a collection of thousands of historical ciphertexts and keys
HistCorp: a collection of historical texts and language models for 16 European languages
Tools
TranscriptTool: transcribe images with this interactive online tool
CrypTool: break advanced historical (and modern) ciphers with this desktop tool
The source code of the platform and tools is being released as open source under the Apache License, Version 2.0, with the exception of the DECODE database, which has its own terms and conditions.
Filip Fornmark (2022) Models, Keys and Cryptanalysis - Evaluating historical statistical language models in cryptanalysis of homophonic substitution ciphers. Bachelor's thesis in Linguistics, Gothenburg University, Sweden.