Hello people, this is my first post on this forum and I hope someone can help.
I'm in my final year at uni (BSc Internet Technology) and have a project idea.
I would like to find out how to migrate large volumes of paper records into electronic records which can then be stored in a database.
When I say large volumes, I'm talking about something on the scale of government departments.
How did I come up with this idea?
On a recent visit to Kenya in Africa, I went to a government office to get a new birth certificate (I had lost mine, and I live in the UK). I was shocked to find that everything was still done on paper. It took around two weeks for them to find the original birth certificate.
That's what prompted me to do this project, but I'm not sure where to start.
I'm aware of the security and political issues regarding public records, but I would like to find out where I can get more information: historical records of how other governments and organisations did it in the past, how it's done at present, the types of software used, and so on.
Typically, the documents are scanned, and then the image is either stored directly in the database as a BLOB, or the image is left as a file and the path to the image is stored in the database.
Either way, retrieving the image is easy. The challenge is to find a way of searching the documents for key text or subjects.
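To make the two storage options concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and helper functions are all made up for illustration, and the "search" is just a LIKE match on keywords that somebody has typed in by hand. A real system would probably want proper full-text indexing.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open the database and create the (hypothetical) document table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scanned_document (
            id         INTEGER PRIMARY KEY,
            image_blob BLOB,  -- option 1: the scan itself
            image_path TEXT,  -- option 2: where the scan lives on disk
            keywords   TEXT NOT NULL DEFAULT '',
            -- exactly one of the two storage options must be used
            CHECK ((image_blob IS NULL) != (image_path IS NULL))
        )
        """
    )
    return conn


def add_as_blob(conn, image_bytes, keywords):
    """Store the scanned image directly in the database."""
    cur = conn.execute(
        "INSERT INTO scanned_document (image_blob, keywords) VALUES (?, ?)",
        (image_bytes, keywords),
    )
    conn.commit()
    return cur.lastrowid


def add_as_path(conn, image_path, keywords):
    """Leave the image as a file and store only its path."""
    cur = conn.execute(
        "INSERT INTO scanned_document (image_path, keywords) VALUES (?, ?)",
        (image_path, keywords),
    )
    conn.commit()
    return cur.lastrowid


def load_image(conn, doc_id):
    """Return the image bytes, whichever way the document was stored."""
    row = conn.execute(
        "SELECT image_blob, image_path FROM scanned_document WHERE id = ?",
        (doc_id,),
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    blob, path = row
    if blob is not None:
        return blob
    with open(path, "rb") as f:
        return f.read()


def find_by_keyword(conn, word):
    """Return ids of documents whose keywords contain the given word."""
    rows = conn.execute(
        "SELECT id FROM scanned_document WHERE keywords LIKE ? ORDER BY id",
        (f"%{word}%",),
    ).fetchall()
    return [r[0] for r in rows]
```

Either way the retrieval code looks the same to the caller. The hard part is still where the keywords come from.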
On one project for a major rail company, we had to receive faxes of variable quality (some in longhand, some typed), capture the information and place it in a database. What we came up with was a program that received the fax, displayed the document on one half of the screen and displayed a data entry form on the other half. The database was populated and the originals were not lost.
In your case, a dedicated operator would scan all the original documents and store the images as BLOBs in the database. Next, a dedicated operator would read the data from the BLOB images and key it into the database. That way you have the data in the correct format and the original image available for reference, all within the same application.
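The scan-then-key workflow described above could be sketched roughly like this, again with Python's sqlite3. Everything here is an assumption for illustration: the table layout, the function names, and the choice of full name and date of birth as the keyed fields (borrowed from the birth certificate example). It is not how the rail project was actually built.

```python
import sqlite3


def open_registry(db_path=":memory:"):
    """Create a (hypothetical) table holding each scan and its keyed data."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS certificate (
            id            INTEGER PRIMARY KEY,
            scan          BLOB NOT NULL,  -- original image, kept for reference
            full_name     TEXT,           -- filled in at the keying stage
            date_of_birth TEXT,           -- filled in at the keying stage
            keyed         INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    return conn


def store_scan(conn, image_bytes):
    """Stage 1: the scanning operator stores the original image."""
    cur = conn.execute(
        "INSERT INTO certificate (scan) VALUES (?)", (image_bytes,)
    )
    conn.commit()
    return cur.lastrowid


def next_unkeyed(conn):
    """Return (id, image) for the oldest scan still waiting to be keyed."""
    return conn.execute(
        "SELECT id, scan FROM certificate WHERE keyed = 0 ORDER BY id LIMIT 1"
    ).fetchone()


def key_record(conn, doc_id, full_name, date_of_birth):
    """Stage 2: the keying operator types in what they read off the image."""
    conn.execute(
        "UPDATE certificate SET full_name = ?, date_of_birth = ?, keyed = 1 "
        "WHERE id = ?",
        (full_name, date_of_birth, doc_id),
    )
    conn.commit()


def find_by_name(conn, full_name):
    """Look up keyed records and return the original scan alongside them."""
    return conn.execute(
        "SELECT id, date_of_birth, scan FROM certificate "
        "WHERE keyed = 1 AND full_name = ? ORDER BY id",
        (full_name,),
    ).fetchall()
```

The data entry screen would call next_unkeyed to show the image on one half and a form on the other, then key_record when the operator saves. Searches run against the typed fields, and the scan comes back with the result so the original can always be checked.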
I've never had to do this, but I have used software for the visually impaired (scan in a document -> the computer reads it out to you in a myriad of configurable voices). As a handy extra (and something I've used on occasion), it can create a very accurate text file of the scanned document - typed text only, of course; it's no good for handwriting.
I just thought this interesting given the presented solution of putting the document on screen for a user to manually type in. Surely character recognition is used for this sort of thing too, no?