WeOCR is a platform for Web-enabled OCR (Optical Character Reader/Recognition) systems that enables people to use character recognition over networks.
A WeOCR server receives document images from users, recognizes the text in the images, and returns the recognition results to the users.
WeOCR does not have its own character recognition engine. Instead, it is intended to accommodate various character recognition engines. WeOCR provides a simplified user interface so that more people can benefit from OCR easily.
The main functions of a WeOCR server are:
- Receive a document image from each client computer, pass the image to the back-end OCR engine, generate HTML data from the result data, and send the data back to the client.
- Uncompress the incoming image file if required.
- Limit the size of the input data to protect the server from excessively large uploads.
- Examine the integrity of image file headers.
- Convert the input image into a common image format (PNM).
- Limit the number of jobs to prevent the server from processing too many documents at once and to maintain acceptable server response.
- Terminate the OCR engine if it is still running after a specified time limit, for example because of unexpected input data or a bug in the engine.
- Support a server search function based on spec files written in XML.
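
The protective steps in the list above can be illustrated with a minimal Python sketch. This is not WeOCR's actual code: every name and limit here (`MAX_INPUT_BYTES`, `MAX_JOBS`, `ENGINE_TIMEOUT_SEC`, `handle_upload`, and the idea that the engine reads the image on standard input and writes plain text to standard output) is an assumption made for illustration. Conversion to PNM and the XML spec files are left out, since they depend on tools the text does not name.

```python
import gzip
import html
import io
import subprocess
import sys
import threading

# Hypothetical limits; a real server would make these configurable.
MAX_INPUT_BYTES = 5 * 1024 * 1024
MAX_JOBS = 4
ENGINE_TIMEOUT_SEC = 30

# Leading bytes ("magic numbers") of a few common image formats.
IMAGE_MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"P4": "pnm",
    b"P5": "pnm",
    b"P6": "pnm",
}

# At most MAX_JOBS engine processes may run at the same time.
job_slots = threading.BoundedSemaphore(MAX_JOBS)


class RejectedInput(Exception):
    """Raised when a request is refused or abandoned."""


def check_size(data):
    if len(data) > MAX_INPUT_BYTES:
        raise RejectedInput("input too large")


def maybe_uncompress(data):
    """Gunzip the upload if it carries the gzip signature."""
    if data[:2] == b"\x1f\x8b":
        # Read one byte past the limit at most, so a highly compressed
        # file cannot expand without bound in memory.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as f:
            data = f.read(MAX_INPUT_BYTES + 1)
        check_size(data)
    return data


def sniff_format(data):
    """Return a format name based on the file header, or reject."""
    for magic, name in IMAGE_MAGIC.items():
        if data.startswith(magic):
            return name
    raise RejectedInput("unrecognized image header")


def run_engine(engine_cmd, image_bytes, timeout=ENGINE_TIMEOUT_SEC):
    """Run the back-end engine under the job limit and time limit."""
    if not job_slots.acquire(blocking=False):
        raise RejectedInput("server busy")
    try:
        # subprocess.run kills the child process when the timeout expires.
        proc = subprocess.run(engine_cmd, input=image_bytes,
                              stdout=subprocess.PIPE, timeout=timeout)
        return proc.stdout.decode("utf-8", errors="replace")
    except subprocess.TimeoutExpired:
        raise RejectedInput("engine timed out")
    finally:
        job_slots.release()


def to_html(text):
    """Wrap recognized text in escaped HTML."""
    return "<pre>" + html.escape(text) + "</pre>"


def handle_upload(data, engine_cmd):
    """Size check, uncompress, header check, OCR, then HTML output."""
    check_size(data)
    data = maybe_uncompress(data)
    sniff_format(data)
    return to_html(run_engine(engine_cmd, data))
```

Running the engine as a separate process is what makes the time limit enforceable: a process can be killed from outside, whereas a recognition routine running inside the server itself could not be stopped as cleanly. The non-blocking semaphore rejects extra requests immediately instead of queuing them, which is one possible way to keep response times acceptable.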