[HASS-RG] Job vacancy at KCL

Tobias Blanke tobias.blanke at kcl.ac.uk
Mon Oct 5 02:48:11 CDT 2009


http://www.kcl.ac.uk/depsta/pertra/vacancy/external/pers_detail.php?jobindex=8277

The Project

The Centre for e-Research at King's College London is seeking to appoint 
a Software Developer to work on the OCRopodium project, a JISC-funded 
project investigating the use of the open source OCRopus software 
(http://sites.google.com/site/ocropus/) for applying Optical Character 
Recognition (OCR) to historical and archival material.

The Centre has been involved in several collaborative projects that 
digitised historical and archival material, and OCR is a key part of such 
digitisation processes. However, in the past we have used proprietary 
OCR software, which raised a number of issues: (i) commercial OCR 
software and consultancy may be costly; (ii) the closed, “black box” 
nature of commercial software makes it less flexible and adaptable; (iii) 
digitisation staff at the university are de-skilled, as the OCR 
expertise is concentrated in commercial hands; (iv) commercial OCR 
software, which is typically developed for commercial applications, can 
be inappropriate for historical and archival material.

The OCRopodium project aims to address these issues by:

• Trialling and evaluating an open source approach to OCR, using the 
OCRopus software.
• Developing and training OCRopus components for historical and archival 
material.
• Developing a centre of expertise in the use of OCRopus for historical 
and archival material.
• Integrating OCR activities within semi-automated digitisation workflows.

The Role

The successful applicant will be the key technical staff member for the 
project, and will be responsible for:

• Carrying out technical investigations into the functionality and 
architecture of OCRopus. As OCRopus is an actively growing open source 
project and thus imperfectly documented, this will in itself require an 
ability to understand the source code and debug the software.
• Developing and integrating software components for OCRing historical 
material, and enhancing existing components.
• Contributing new and enhanced components to the OCRopus open source 
project.
• Benchmarking and evaluating OCRopus, in collaboration with our 
project partners at Queen's University, Belfast (QUB).
• Integrating OCR within broader digitisation and digital library workflows.

For an informal discussion of the post please contact Katrin Tiedau on 
Katrin.tiedau at kcl.ac.uk

-- 
Dr Tobias Blanke
Research Fellow
Arts and Humanities e-Science Support Centre
Centre for e-Research, King's College London
26-29 Drury Lane, London WC2B 5RL

+44 (0)20 7848 1975
tobias.blanke at kcl.ac.uk
http://www.ahessc.ac.uk


