Scientists at the University of Chicago are developing a machine learning system that can automatically transcribe text found on ancient clay tablets.
The DeepScribe system will initially focus on transcribing the Cuneiform writing system used in the ancient Iranian Achaemenid Empire (550–330 BC), the University of Chicago News reports.
Existing computer systems struggle to translate this script, due to its complex characters and the 3D form of the tablets on which they’re written.
The team of researchers from the University of Chicago’s Oriental Institute and its Department of Computer Science thinks their system could do better.
To build the model, they’re training it on more than 6,000 annotated images from the Persepolis Fortification Archive. This will teach the system to read tablets in the collection that have never been analyzed before.
They believe the system could uncover new secrets about Achaemenid history, society, and language.
Eventually, it could even be adapted to other ancient forms of writing.
“If we could come up with a tool that is flexible and extensible, that can spread to different scripts and time periods, that would really be field-changing,” said Susanne Paulus, associate professor of Assyriology at the University of Chicago.
Translating the past
The training data is built on a dictionary of the language developed by researchers and a database of more than 100,000 individual signs built by students.
Professor Sanjay Krishnan of the Department of Computer Science at the University of Chicago used this annotated dataset to train a machine learning model to read other tablets. The model managed to decipher the signs with an accuracy of around 80%.
The team will conduct further research to improve the accuracy rate.
In time, it could even help determine the source of artifacts whose origins are currently unknown. However, they will need to act fast, as the tablets may soon return to their own country of origin: Iran.