Thousands of enciphered historical manuscripts are buried in libraries and archives. Such material includes diplomatic correspondence, intelligence reports, private letters, diaries, and manuscripts related to secret societies. The bulk of these historical manuscripts will remain undeciphered unless we can automate the processes involved in decoding them. Our aim is to develop resources and computer-aided tools for decoding historical source material by using AI and cross-disciplinary research involving computational linguistics, computer vision, cryptology, history, linguistics and philology.
Within the DECRYPT project, we release open-access resources and tools to facilitate research in historical cryptology, allowing collection, analysis and decipherment of historical ciphertexts. The resources are collections of encrypted sources and of historical texts with language models. The tools support the processing of encrypted sources from transcription to decipherment. Our resources and tools are listed below and described in our scientific publications.
The DECODE database contains a collection of digitized images of ciphertexts and encryption keys along with metadata about their provenance, location, transcription, and possible cryptanalysis or commentary. The database is searchable, and all of its records are open to the public. HistCorp is a collection of historical corpora and other useful resources and tools for researchers working with historical text.
We provide tools for transcription and decipherment of historical ciphers using advanced machine learning algorithms. Historical cipher images can be transcribed, i.e. transformed into a computer-readable text format, with the help of the TranscriptTool. The transcribed ciphertext can be corrected and used as input to CrypTool, which assists you in breaking a wide range of historical ciphertexts.
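As a rough illustration of what can be done with a transcribed ciphertext before it is handed to a cryptanalysis tool, the sketch below counts symbol frequencies and computes the index of coincidence, two standard first steps in classical cryptanalysis. It is only a minimal sketch: it assumes a transcription stored as whitespace-separated symbol codes, which is a hypothetical format chosen for the example and not necessarily the format produced by the TranscriptTool or expected by CrypTool. The function names are likewise illustrative.

```python
from collections import Counter


def symbol_frequencies(transcription: str) -> list[tuple[str, int, float]]:
    """Return (symbol, count, relative frequency), most frequent first.

    Assumption: symbols are separated by whitespace (hypothetical format).
    """
    symbols = transcription.split()
    total = len(symbols)
    counts = Counter(symbols)
    # Sort by descending count, then by symbol so that ties are stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(symbol, count, count / total) for symbol, count in ordered]


def index_of_coincidence(transcription: str) -> float:
    """Probability that two distinct randomly chosen positions hold the same symbol."""
    symbols = transcription.split()
    n = len(symbols)
    if n < 2:
        return 0.0
    counts = Counter(symbols)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    # Made-up symbol sequence, used only to show the calls.
    sample = "12 7 33 12 5 7 12"
    for symbol, count, share in symbol_frequencies(sample):
        print(f"{symbol}\t{count}\t{share:.3f}")
    print(f"index of coincidence: {index_of_coincidence(sample):.3f}")
```

A strongly uneven frequency profile and a high index of coincidence are typical of simple substitution ciphers, while flatter profiles may point to homophonic or polyalphabetic systems.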
University of Siegen, Germany, cryptanalysis
Budapest University of Technology and Economics, Hungary, software design and development
University of Siegen, Germany, cryptanalysis
Eötvös Loránd University, Hungary, history
Universität der Bundeswehr München, Germany, cryptanalysis
Stockholm University, Sweden, natural language processing
Computer Vision Center, Universitat Autònoma de Barcelona, Spain
The CrypTool Team, Germany, cryptanalysis
University of Amsterdam, history
Eötvös Loránd University, Hungary, history
University of Siegen, Germany, cryptanalysis
Uppsala University, Sweden, natural language processing
Computer Vision Center, Universitat Autònoma de Barcelona, Spain, computer vision
MSc student at TU Munich, Germany, software design and development
Stockholm University, Sweden, data processing
Universität der Bundeswehr München, Germany, cryptanalysis
We provide a collection of encrypted historical sources, together with AI-based tools for their automatic analysis and decryption.
Resources
a collection of thousands of historical ciphertexts and keys
a collection of historical texts and language models for 16 European languages
a desktop tool for manual transcription
a web-based tool for semi-automatic transcription using AI
a desktop tool for breaking historical and modern ciphers
an online tool for breaking (simpler) historical ciphers
The source code of the platform and tools is being released as open source under the Apache License 2.0, with the exception of the DECODE database, which has its own terms and conditions.
Fornés, A., Chen, J., Torras, P., Badal, C., Megyesi, B., Waldispühl, M., Kopal, N., Lasry, G. (2024). ICDAR 2024 Competition on Handwriting Recognition of Historical Ciphers. In: Barney Smith, E.H., Liwicki, M., Peng, L. (eds) Document Analysis and Recognition - ICDAR 2024. Lecture Notes in Computer Science, vol 14809. Springer, Cham. [https://doi.org/10.1007/978-3-031-70552-6_20]
Goio, G., Torras, P., Fornés, A., and Megyesi, B. (2024) Exploring the Alignment of Transcriptions to Images of Encrypted Manuscripts. In Proceedings of the 7th International Conference on Historical Cryptology (HistoCrypt 2024) [https://hdl.handle.net/10062/98471]
Láng, B. (2023) Transfer of knowledge in the field of universal language schemes 18th-19th centuries. In Krász, Lilla (ed.): Sciences between Tradition and Innovation – Historical Perspectives/Wissenschaften zwischen Tradition und Innovation – historische Perspektiven. Wien: Praesens Verlag, 2023, 37-55
Láng, B. (2023) Valódi és ál-kódfejtések a titkosírások történetében (Real and fake codebreaking in the history of cryptology). In: Bárdos, Dániel; Tuboly, Ádám Tamás (eds.) Emberarcú tudomány: Áltudományok és összeesküvés-elméletek szorításában. Budapest, Hungary: Typotex Kiadó.
Lasry, G. (2023) Armand de Bourbon Poly-Homophonic Cipher - 1649. HistoCrypt 2023. Published by Linköping Electronic Press. [https://doi.org/10.3384/ecp195699]
Lasry, G., Biermann, N. & Tomokiyo, S. (2023) Deciphering Mary Stuart’s lost letters from 1578-1584, Cryptologia, [https://doi.org/10.1080/01611194.2022.2160677]
Mikhalev, V., Kopal, N., Esslinger, B., Waldispühl, M., Láng, B., and Megyesi, B. (2023) What is the Code for the Code? – Historical Cryptology Terminology. In Proceedings of the 6th International Conference on Historical Cryptology. HistoCrypt 2023. pp. 130-138. [https://doi.org/10.3384/ecp195702]
Souibgui, M. A., Torras, P., Chen, J., and Fornés, A. (2023) An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts. Proceedings of 7th International Workshop on Historical Document Imaging and Processing (HIP). [https://doi.org/10.1145/3604951.3605509]
Torras, P., Souibgui, M. A., Chen, J., Biswas, S., and Fornés, A. (2023) Segmentation-Free Alignment of Arbitrary Symbol Transcripts to Images. Proceedings of 15th IAPR International Workshop on Graphics Recognition (GREC). [https://doi.org/10.1007/978-3-031-41498-5_6]
Waldispühl, M. (2023) Variation and Change. In Condorelli, M. & Rutkowska, H. (eds.), The Cambridge Handbook of Historical Orthography. Cambridge: Cambridge University Press. [book.pdf]
Láng, B. (2020) Was it a Sudden Shift in Professionalization? Austrian Cryptology and a Description of the Staatskanzlei Key Collection in the Haus-, Hof- und Staatsarchiv of Vienna. In: Megyesi, B. (ed.) Proceedings of the 3rd International Conference on Historical Cryptology. HistoCrypt 2020. Linköping University Electronic Press, p. 87. [https://doi.org/10.3384/ecp2020171012]
Lasry, G. (2020) Solving a Tunny Challenge with Computerized Testery Methods. In Proceedings of the 3rd International Conference on Historical Cryptology. HistoCrypt 2020. [https://doi.org/10.3384/ecp2020171013]
Megyesi, B., Blomqvist, N., and Pettersson, E. (2019) The DECODE Database: Collection of Historical Ciphers and Keys. In Proceedings of the 2nd International Conference on Historical Cryptology. HistoCrypt 2019, June 23-25, 2019, Mons, Belgium. NEALT Proceedings Series 37, Linköping Electronic Press. [https://ep.liu.se/ecp/158/008/ecp19158008.pdf]
The linguist who cracks historical riddles, article and film (2024) by Stockholm University: https://www.su.se/english/news/the-computational-linguist-who-cracks-historical-riddles-1.748013
New publication gives an introduction to historical cryptology (Ny publikation ger en introduktion till Historisk kryptologi) (2024) by Stockholm University: https://www.su.se/institutionen-for-lingvistik/nyheter/ny-publikation-ger-en-introduktion-till-historisk-kryptologi-1.713255
Codebreakers find and decode lost letters of Mary, Queen of Scots (2023) by Ashley Strickland, CNN: https://edition.cnn.com/2023/02/07/world/mary-queen-of-scots-lost-letters-scn/index.html
Code breakers discover – and decipher – Long-Lost Letters by Mary, Queen of Scots (2023) by Meilan Solly, Smithsonian magazine: https://www.smithsonianmag.com/history/codebreakers-discoverand-decipherlong-lost-letters-by-mary-queen-of-scots-180981613/
How scientists are cracking historical codes to reveal lost secrets (2023), in New Scientist by Joshua Howgego: https://www.newscientist.com/article/mg25934570-900-how-scientists-are-cracking-historical-codes-to-reveal-lost-secrets/
Chiffer – en historisk gåta, article in GU Journalen (2023/3) by Eva Lundgren: https://issuu.com/universityofgothenburg/docs/guj3-2023/46
Kopal, N. and Dinnissen, J. (2022) „Konzerngeheimnisse: Entschlüsselt: Ein Brief der Niederländischen Ostindien-Kompanie“. Published in c't 16/22 on page 130. Outreach: https://www.heise.de/select/ct/2022/16/2208814061698077413
Kopal, N. and Megyesi, B. (2022) „Die Kryptografen des Papstes: Entschlüsselt: Geheime Nachrichten aus dem Vatikan“. Published in c't 3/22 on page 134. Outreach: https://www.heise.de/select/ct/2022/3/2118013133459359622
Kopal, N. and Waldispühl, M. (2022). „Kryptische Propaganda: Entschlüsselt: Briefe von Kaiser Maximilian II“. Published in c't 4/22 on page 130. Outreach: https://www.heise.de/select/ct/2022/4/2134408482322873782
Kopal, N. and Esslinger, B. (2021). „Krypto ganz unkryptisch – Mit CrypTool 2 moderne Kryptografie ausprobieren und verstehen“. Published in c't 15/21 on page 142. Outreach: https://www.heise.de/ratgeber/Moderne-Kryptografie-ausprobieren-und-verstehen-mit-CrypTool-2-6129885.html
Tillsammans knäcker forskarna historisk kod in Curie (2019) by Carina Järvenhag: https://www.tidningencurie.se/nyheter/tillsammans-knacker-forskarna-historisk-kod
Historiska chiffer ska knäckas med algoritmer, in Populär historia (2019) by Anna Larsdotter: https://popularhistoria.se/vetenskap/forskare-avslojar-historiska-chiffer
Kryptologer från hela världen träffas på historisk konferens in Sveriges Radio P4 Uppland: https://sverigesradio.se/artikel/6980215