This article was automatically generated by AI and may contain omissions or inaccurate descriptions. Information sources are X posts, GitHub updates, and the Current Awareness Portal.

We compile weekly information on new tool development and releases related to Digital Humanities (DH).

Minna de Honkoku Kuzushiji OCR Released

"Minna de Honkoku Kuzushiji OCR" has been newly released by @yuta1984. This is a web-based tool that uses kuzushiji (cursive script) recognition AI to recognize characters from digital images of historical documents and classical texts, supporting transcription work.

The layout recognition function lets users adjust line positions and reading order, and once OCR has been run, the results can be checked in the transcription panel. The tool supports multiple images and is said to work on smartphones as well.

Recognition runs on ONNX Runtime Web and completes entirely within the browser, so image data is not transmitted externally.

Digital Soseki: TEI-Tagged Text Dataset of the Complete Full-length Novels Released

Yoshitaka Hibi has released a text dataset of all 14 full-length novels by Natsume Soseki tagged in TEI (Text Encoding Initiative) format. The 14 works included are "I Am a Cat," "Botchan," "Kusamakura," "Nowaki," "The Miner," "Gubijinso," "Sanshiro," "And Then," "Mon," "Kokoro," "The Wayfarer," "To the Spring Equinox and Beyond," "Michikusa," and "Light and Darkness."

The data is provided under a CC BY-SA license and can be used for research and educational purposes.
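
As a rough illustration of how TEI-tagged text like this might be read programmatically, the following Python sketch pulls the title and the paragraph text out of a TEI document using only the standard library. The sample XML is invented for this example and is not taken from the dataset; the dataset's actual file layout and markup choices are not described here, so the element paths below are assumptions based on common TEI structure (teiHeader, titleStmt, body, p). The function name extract_title_and_paragraphs is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace; real TEI files normally declare it on the root element.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature TEI document, invented for illustration only.
SAMPLE_TEI = """<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Work</title></titleStmt>
      <publicationStmt><p>Illustrative only.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter" n="1">
        <p>First <hi rend="bold">paragraph</hi>.</p>
        <p>Second paragraph.</p>
      </div>
    </body>
  </text>
</TEI>
"""


def extract_title_and_paragraphs(tei_xml: str):
    """Return (title, list of paragraph strings) from a TEI document string."""
    # Parse from bytes so the XML encoding declaration is handled by the parser.
    root = ET.fromstring(tei_xml.encode("utf-8"))

    # Assumed location of the work title in the TEI header.
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS)

    # Collect paragraphs in the body only; itertext() flattens inline tags such as <hi>.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind(".//tei:body//tei:p", TEI_NS)
    ]
    return title, paragraphs


if __name__ == "__main__":
    title, paragraphs = extract_title_and_paragraphs(SAMPLE_TEI)
    print(title)
    for number, paragraph in enumerate(paragraphs, start=1):
        print(number, paragraph)
```

For a real file, the same function could be given the file's contents as a string; the paths would need adjusting if the dataset organizes chapters or paragraphs differently.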