2012 ALBAYZIN EVALUATIONS: HANDWRITING RECOGNITION

The Albayzin 2012 Handwriting Recognition Evaluation (Albayzin 2012 HRE) is organised by the Pattern Recognition and Human Language Technology (PRHLT) group of the Universitat Politècnica de València. The evaluation will be part of the IberSPEECH 2012 conference, which is supported by the Spanish Thematic Network on Speech Technology (RTTH) and the ISCA Special Interest Group on Iberian Languages (SIG-IL) and will take place in Madrid (Spain) from November 21st to 23rd.

The goal of this evaluation is to promote the Handwriting Recognition discipline at the Iberian level, showing the current Natural Language and Speech Processing community that Handwriting Recognition is another field of Natural Language Processing, of interest for multimedia and multimodal applications.

Task details

Current state-of-the-art Handwriting Recognition is mainly based on the same technologies as Speech Recognition: each handwritten symbol is usually represented by a Hidden Markov Model (HMM), and the relations among words are modelled by n-gram Language Models. Handwritten documents are divided into lines, and each line is coded into a sequence of feature vectors (which serve the same purpose as cepstral coefficients in Speech Recognition). These sequences can be used for training (with the usual Baum-Welch procedure) or for recognition (with the Viterbi algorithm). Nevertheless, other approaches that are based on neither HMMs nor n-grams can also be used.
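As an illustration of the recognition step mentioned above, the following is a minimal sketch of Viterbi decoding for a toy HMM. It is not the evaluation baseline: the state names, the discrete observation symbols and all probabilities are hypothetical, and a real handwriting recogniser would use continuous feature vectors with density-based emission models and combine the HMM scores with an n-gram language model.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence and its log-probability.

    All model parameters are given as log-probabilities in nested dicts.
    This is a toy discrete-observation HMM, used only for illustration.
    """
    if not observations:
        raise ValueError("observations must not be empty")

    # best[t][s]: best log score of any state path ending in s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = (best[t - 1][prev] + log_trans[prev][s]
                          + log_emit[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model: state "a" mostly emits "x", state "b" mostly "y".
states = ["a", "b"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.5) for s in states} for p in states}
log_emit = {
    "a": {"x": math.log(0.9), "y": math.log(0.1)},
    "b": {"x": math.log(0.1), "y": math.log(0.9)},
}

path, score = viterbi(["x", "x", "y"], states, log_start, log_trans, log_emit)
print(path)  # ['a', 'a', 'b']
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters for text lines coded as several hundred feature vectors.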

This evaluation proposes the recognition of a historical handwritten text: the RODRIGO database, which corresponds to a single-writer Spanish text written in 1545, "Historia de España del arçobispo Don Rodrigo". The book has 853 pages of historical chronicles of Spain; most of the pages consist of a single block of well-separated lines of calligraphic text.

The evaluated systems must produce the most accurate recognition possible of the test data for that document. The available data will consist of:

  • Training and validation transcriptions for each line.
  • Image for each line.
  • Standard feature vector sequence for each line.

There will be two types of evaluations:

  • From images: participants may use their own feature extraction method, in addition to their own training and decoding processes.
  • From feature vectors: participants must use the provided feature vectors, applying their own training and decoding processes to that data.

In both cases, the evaluation metric will be based on the final recognised words, and Word Error Rate (WER) will be used to determine the performance of the systems. An official WER evaluator for the validation data will be provided to the participants via the web.
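For reference, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the recognised text and the reference transcription, divided by the number of reference words. The sketch below illustrates that conventional definition for a single line. It is an assumption, not the official evaluator: the function name is hypothetical, and the official tool may tokenise or normalise text differently (for example, case and punctuation) and will typically aggregate errors over the whole data set rather than per line.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j]: edit distance between the reference prefix processed so far
    # and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of four reference words
print(word_error_rate("el rey don rodrigo", "el rey rodrigo"))  # 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, since the error count is normalised by the reference length only.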

Detailed plan: Albayzin HRE

Registration

Deadline: July 16th, 2012 (tentative)

Procedure: send an e-mail with the subject "Albayzin 2012 HRE" to the organiser

  • Carlos Martínez-Hinarejos, cmartine_AT_dsic.upv.es

with copy to the evaluation chairs of the Albayzin 2012 Evaluations:

  • Javier González, javier.gonzalez_AT_uam.es
  • Javier Tejedor, javier.tejedor_AT_uam.es

Please provide the following information:

  • Group name and acronym
  • Institution
  • Participants and their e-mail addresses
  • Contact person

Schedule (tentative)

  • May 23rd, 2012: publication of the evaluation.
  • June 4th, 2012: registration open.
  • June 11th, 2012: training and development data available, standard baseline results available, web evaluation tool available.
  • July 16th, 2012: registration deadline.
  • September 3rd, 2012: test data available.
  • September 21st, 2012: deadline for submitting system results and system descriptions.
  • October 15th, 2012: global results sent to participants.
  • November 21st-23rd, 2012: IberSPEECH 2012, Madrid, including the HRE Workshop with system presentations and discussion of results.

Contact

Carlos D. Martínez Hinarejos
Pattern Recognition and Human Language Technology (PRHLT) group
Departamento de Sistemas Informáticos y Computación
Universitat Politècnica de València
Camino de Vera, s/n
46022, Valencia, Spain

Personal web page: http://www.dsic.upv.es/~cmartine
PRHLT web page: http://prhlt.iti.es
e-mail: cmartine_AT_dsic.upv.es
phone: (+34) 96 387 7007 - Ext: 73529
fax: (+34) 96 387 7359

Additional information