Interesting article from the MIT Technology Review:
Machine learning has been used to automatically translate long-lost languages
But in discussing new research into AI translation of lost languages, the article notes, “In this paper, Linear A is conspicuous by its absence.” Indeed, the two scripts the AI decoded are Linear B and Ugaritic. We can already read both, if not perfectly. And the AI used what was known about related languages like Ancient Greek and Classical Hebrew to construct models to test Linear B and Ugaritic against.
With languages like Linear A or, God help us, whatever the Voynich Manuscript is written in, we don’t have the information needed to guess that language Y is close enough to language X to make a brute-force comparison worthwhile. Indeed, the decipherment of Linear B owes much to Ventris’ guess that it was a form of Greek. Had that guess been wrong, Ventris’ work would have wound up where Linear A stands now: some idea of how the elements go together, but no idea what they actually mean.
One bit of food for thought, though: How similar are these AI processes to what a Spanish speaker does to learn Italian? And could it turn out, as with Linear B, that computers can do a better and faster job of investigating a human hunch, but still need that hunch to know where to begin?