A Benchmark Dataset for Manipuri Meetei-Mayek Handwritten Character Recognition
Dataset posted on 28.09.2019 by Pangambam Singh
A benchmark dataset is essential for developing and comparing machine-learning-based classification and recognition systems. To the best of our knowledge, no benchmark dataset for handwritten character recognition of the Manipuri Meetei-Mayek script has been available in the public domain so far. In this work, we introduce a handwritten Manipuri Meetei-Mayek character dataset of more than 5000 samples, collected during March and April 2019 from a diverse group of people across three districts of Manipur, India (Imphal East District, Thoubal District and Kamjong District). Each participant was asked to write all the Manipuri characters on one A4-size sheet of paper. The sheets were then scanned, and each character was manually segmented from the scanned image. The dataset is divided into five categories: (1) Mapi Mayek, (2) Lonsum Mayek, (3) Cheitap Mayek, (4) Cheising Mayek and (5) Khutam Mayek. The scanned images of the handwritten characters are provided in both .JPG and .MAT formats.
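As a rough illustration of how the five categories and two file formats might be organised for an experiment, the Python sketch below builds an index of files per category. It assumes, purely for illustration, that the downloaded archive has one folder per category named exactly as in the description; the actual layout is not stated here, and the function name `index_dataset` and the root path are hypothetical.

```python
from pathlib import Path

# The five category names come from the dataset description. Using them as
# folder names is an ASSUMPTION about how the archive is laid out.
CATEGORIES = [
    "Mapi Mayek",
    "Lonsum Mayek",
    "Cheitap Mayek",
    "Cheising Mayek",
    "Khutam Mayek",
]


def index_dataset(root):
    """Map each category to the sorted .jpg and .mat files found under it."""
    root = Path(root)
    index = {}
    for category in CATEGORIES:
        folder = root / category
        files = {"jpg": [], "mat": []}
        if folder.is_dir():
            # Search subfolders too, and match extensions case-insensitively.
            for path in sorted(folder.rglob("*")):
                suffix = path.suffix.lower()
                if path.is_file() and suffix in (".jpg", ".mat"):
                    files[suffix[1:]].append(path)
        index[category] = files
    return index


if __name__ == "__main__":
    # Hypothetical location of the extracted dataset.
    index = index_dataset("meetei_mayek_dataset")
    for category, files in index.items():
        print(category, len(files["jpg"]), "jpg,", len(files["mat"]), "mat")
```

A category whose folder is missing simply gets empty lists, so the counts printed at the end give a quick check of whether the assumed layout matches the download. Reading the files themselves is left out: the .JPG images can be opened with any image library, and .MAT files are commonly read with `scipy.io.loadmat`, but the variable names stored inside them are not described here.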