Tag: Quality


Microfilm or Digitization: Simplified View (Continued…)

This article continues my earlier article, “Microfilm or Digitization: Simplified View”. There I discussed why digitization should be preferred over microfilm, even though microfilm has certain advantages over digitization: it has a life expectancy of 500 years, it is eye readable, and it is an analog technology.

Today, digitization has overtaken the advantages of microfilm. For example, storage media are now cheap, and transferring data from one medium to another (when technology is upgraded) is simple or even automatic. Scanned/digitized images cannot be read directly from the storage media with the naked eye (without attaching the media to a computer system), but computer systems are now available in every corner of the globe and their reach is expanding rapidly.

Below is a comparison of microfilming and scanning (digitization) to explain the concept in detail.

Sl. No | Activities | Microfilm | Scanning
(A) COST
1 | Cost of conversion | More than scanning | Less than microfilm
2 | Space requirement | More (as physically stored) | Less (as electronically stored)
3 | Preservation / storage method | Special physical method | Virtual / electronic
4 | Maintenance method | Special physical method | Virtual / electronic
5 | Technology refreshing cost | High | Low
6 | Effect of environmental changes | Sensitive (because it is tangible) | Insensitive (non-tangible)
7 | Security procedures | Special physical method | Virtual / electronic
8 | Transfer / transportation cost | More (physical) | Less (electronic)
(B)
1 | Physical damage | Possible | Not possible
2 | Technological essence | Outdated | Latest
3 | User acceptance | Not user friendly | User friendly
4 | Document sharing | Not possible | Possible
5 | Rapid retrieval | Takes more time | Quick
6 | Concurrent access | Not possible | Possible
7 | Access method | Sequential and physical (time consuming) | Random (quick)
8 | Retrieval method | Requires special device | By basic computer
9 | Query resolution based on documents | Takes more time | Instantaneous
10 | Remote access (across world) | Not possible (as it is physical) | Easily possible (as it is electronic)
11 | Online access (network, Internet etc.) | Not possible | Easily possible
12 | Viewing images | Requires special device | By click of mouse
13 | Printing | Poor image and delayed | Good image and quick
14 | Reproducing images | Limited | Unlimited
15 | Alternate to photocopy | Costly method (physical) | Economical method (digital)
16 | Alternate to printing | Costly method (physical) | Economical method (digital)
17 | Workflow compatibility by software | Not possible | Possible
18 | Software integration (document mgt.) | Difficult | Easy
19 | Sensitivity to light | More | Insensitive (non-tangible)
20 | Method to view | Requires special device | On any computer
21 | Magnification percentage | Limited | Unlimited
22 | Inter-office movement of records | Physical (first printed, then hand carried) | By computer network
(C) QUALITY
1 | Quality control | Requires close monitoring | Easy
2 | Color digitization | Poor quality | Good quality
3 | Image enhancement | Not possible | Possible
4 | Conversion / duplication | Quality degrades | Remains same
5 | Visual clarity of images | Can’t be seen by naked eye | Visible to naked eye
6 | Optical character recognition | Not possible | Possible
7 | Rearranging captured sequence | Not possible | Possible


Name: Hemant

Web Site: http://www.newgensoft.com

Bio: Hemant is Senior Manager - Processing Services with Newgen Software Technologies Limited


Document Quality Analyzer: Automated Quality Checking without Operators

In production-level digitization, a manual quality check of scanned images never achieves 100% correct results; in the best case, 90% accuracy can be achieved. The residual 10% can sometimes lead to huge business risks, especially where regulatory requirements or compliance are involved. Newgen’s Document Quality Analyzer (DQA) (patent pending) and Automatic Document Correction reduce this risk to less than 1% by automatically flagging and correcting more than 25 types of errors.

DQA is a fully automated system for analyzing the quality of scanned documents. It focuses exclusively on ascertaining document quality against a predetermined list of configurable parameters.

The system comprises a unique set of features, each with specific parameter values, to gauge the image quality of a document. Once the quality is known, the document can either be accepted for further processing or a request can be sent immediately to rescan the poor-quality document.
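The accept-or-rescan decision described above can be illustrated with a minimal sketch. This is not Newgen's DQA implementation: the parameter names, the threshold values, and the functions `analyze_page` and `decide` are all hypothetical, and only two of the image checks (too dark / too light and blank page) are shown, on a page represented as rows of 8-bit grayscale values.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical, configurable thresholds on 8-bit grayscale values
# (0 = black, 255 = white). Real values would be tuned per document type.
DEFAULT_PARAMS: Dict[str, float] = {
    "too_dark_below": 60.0,    # mean brightness below this is "too dark"
    "too_light_above": 245.0,  # mean brightness above this is "too light"
    "ink_level": 128.0,        # pixels darker than this count as ink
    "blank_ink_ratio": 0.005,  # less ink than this means "blank page"
}


def analyze_page(pixels: List[List[int]],
                 params: Dict[str, float] = DEFAULT_PARAMS) -> List[str]:
    """Return the quality flags raised for one grayscale page."""
    flat = [value for row in pixels for value in row]
    brightness = mean(flat)
    ink_ratio = sum(1 for v in flat if v < params["ink_level"]) / len(flat)

    flags = []
    if ink_ratio < params["blank_ink_ratio"]:
        flags.append("blank page")
    elif brightness > params["too_light_above"]:
        flags.append("too light")
    if brightness < params["too_dark_below"]:
        flags.append("too dark")
    return flags


def decide(pixels: List[List[int]],
           params: Dict[str, float] = DEFAULT_PARAMS) -> str:
    """Accept the page for further processing, or request a rescan."""
    return "rescan" if analyze_page(pixels, params) else "accept"


# A white 10x10 page with ten black "ink" pixels passes both checks.
normal_page = [[0] + [255] * 9 for _ in range(10)]
print(decide(normal_page))
```

Keeping the thresholds in a plain dictionary mirrors the idea of configurable parameters: a different document type can be checked by passing a different `params` mapping.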

Why Document Quality Analyzer?

  • Ensure legible, good-quality documents for business applications, even if acquired through
    • Single / Multiple Outsourced Vendors
    • Different Internal Departments
  • Avoid hefty penalties imposed by regulators
  • Service customers on time
  • Service decision makers on time
  • Avoid high rescan costs when an unusable document is found at a later date

Inconsistent Scanning Quality – Causes

  •   Different scanning vendors and scanner models
  •   Faulty Scanners
  •   Improper Environment
  •   No single right configuration for all document types
  •   Production pressure
  •   Storage limitation
  •   Unskilled scanning operator
  •   Manual QC done on a sample basis
  •   Unavailability of automatic scanning quality analyzer


(A) Image related Parameters:

  1. Too Dark/Too Light  
  2. Skew 
  3. Wrong Orientation    
  4. Too much Noise
  5. Double Page
  6. Photo on B&W
  7. Blank Page    
  8. Error in Automatic Cropping
  9. Readability Index
  10. Quality of Photographs
  11. Punch Hole Marks    
  12. Stapler Marks
  13. Proper Margins        
  14. Black Bands

(B) Scanning related Parameters:

  1. Piggy Back/Multi-feed
  2. Folded corner
  3. Torn document
  4. Dark/Light    
  5. Out of Focus

(C) Format/Data

  1. Improper resolution
  2. Format/ compression not proper
  3. Dimensions
  4. Size of File


  1. Improper resolution
  2. Format/ compression not proper
  3. Dimensions
  4. Size of File
  5. Skew   
  6. Wrong Orientation
  7. Noise
  8. Double Page
  9. Error in Automatic Cropping
  10. Punch Hole Marks
  11. Stapler Marks
  12. Proper Margins
  13. Black Bands
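The format and data parameters listed above (resolution, format/compression, dimensions, size of file) lend themselves to simple rule checks on a scan's metadata. The sketch below is only an illustration: the `ScanInfo` and `FormatPolicy` types, the A4 page size, and every limit in the policy are assumptions, not values taken from DQA.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScanInfo:
    """Metadata of one scanned image (hypothetical structure)."""
    dpi: int
    file_format: str
    compression: str
    width_px: int
    height_px: int
    size_bytes: int


@dataclass
class FormatPolicy:
    """Hypothetical acceptance limits; real projects would configure these."""
    min_dpi: int = 200
    allowed: Tuple[Tuple[str, str], ...] = (("tiff", "group4"), ("jpeg", "jpeg"))
    expected_size_in: Tuple[float, float] = (8.27, 11.69)  # A4 in inches
    tolerance_in: float = 0.25
    max_size_bytes: int = 1_000_000


def check_format(info: ScanInfo, policy: FormatPolicy = FormatPolicy()) -> List[str]:
    """Return the format/data issues found for one scan."""
    issues = []
    if info.dpi < policy.min_dpi:
        issues.append("improper resolution")
    if (info.file_format, info.compression) not in policy.allowed:
        issues.append("format/compression not proper")
    # Physical page size follows from pixel size divided by resolution.
    width_in = info.width_px / info.dpi
    height_in = info.height_px / info.dpi
    exp_w, exp_h = policy.expected_size_in
    if (abs(width_in - exp_w) > policy.tolerance_in
            or abs(height_in - exp_h) > policy.tolerance_in):
        issues.append("dimensions")
    if info.size_bytes > policy.max_size_bytes:
        issues.append("size of file")
    return issues


good = ScanInfo(300, "tiff", "group4", 2481, 3507, 80_000)
bad = ScanInfo(100, "bmp", "none", 850, 1100, 5_000_000)
print(check_format(good))
print(check_format(bad))
```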

DIGITIZATION OUTPUT: Operator, Not Scanner, Defines Production

Digitization is now an integral part of the business universe and is growing at lightning speed. Just as money has gone virtual, we are not far from a time when paper documents will form a negligible part of our day-to-day life.

Newgen, with its 15 years of experience in the document management space, has found that the physical condition of documents plays a major role in the cost of digitization and can lead many organizations to delay digitization.

Example: A scanner rated at 50 PPM can produce about 24,000 images in simplex mode and 48,000 images in duplex mode in an 8-hour shift, whereas in practice we get a maximum of 10,000 to 12,000 images per 8-hour shift in a multiple-mode scanning scenario (explained below). That is only about 41% of the rated output, leaving about 59% scope for improvement, and a similar scope for profit in commercial terms.
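The arithmetic behind this example can be checked with a few lines of code. The function names are illustrative only; the inputs (50 PPM, an 8-hour shift, 10,000 to 12,000 images achieved) come from the example, and utilization is measured here against the simplex rating, which is an assumption about how the 41% figure was derived.

```python
def rated_capacity(pages_per_minute: int, shift_hours: float,
                   duplex: bool = False) -> int:
    """Images a scanner could produce in one shift if it never stopped."""
    sides = 2 if duplex else 1
    return int(pages_per_minute * 60 * shift_hours * sides)


def utilization(actual_images: int, rated_images: int) -> float:
    """Fraction of the rated capacity actually achieved."""
    return actual_images / rated_images


simplex = rated_capacity(50, 8)
duplex = rated_capacity(50, 8, duplex=True)
low = utilization(10_000, simplex)
high = utilization(12_000, simplex)

print(f"rated capacity: {simplex} simplex, {duplex} duplex")
print(f"utilization: {low:.1%} to {high:.1%}")
```

The lower bound of 10,000 images works out to roughly 41% of the simplex rating, which matches the figure quoted in the example.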

Keeping this 59% in mind, we analyzed the various bottlenecks in reaching the 100% target. Although multiple factors affect daily production, here we describe automated methods to regenerate images, which can increase scanning output by 30% and save around 80% of the quality-checking effort.

In fact, a quality check would never be required in a fully efficient digitization environment, because a quality check is nothing but the correction of errors made at the scanning stage. If we make no errors at scanning, no quality check is needed. Although it is very difficult to do away with the quality check process, we can at least work to minimize the effort it takes.

Here we propose ways to increase scanning output and speed up the quality check process while reducing errors.

 (a) Scanning Output

Scope-1: Most projects in insurance, banking, telecom, etc. require multiple-mode scanning (color for photographs and important documents, monochrome/B&W for other supporting documents). However, the pages to be scanned in each mode are not at fixed locations, so the scanning software cannot run continuously to get the maximum output from the scanner.

Now, if we tightly integrate the scanning software with the scanner’s driver, we can use control sheets with barcodes, where the barcode value tells the scanner which mode (color or monochrome) to use, automatically and without stopping. Because the mode change happens at the driver level rather than at the scanning-utility level, we are free to place these control sheets wherever we need them in a bundle and can use the scanner continuously without interruption.
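The control-sheet idea can be sketched as a small simulation. This does not talk to any real scanner driver: the barcode values (`MODE-COLOR`, `MODE-BW`), the `Sheet` type, and the `assign_modes` function are hypothetical, and the sketch only shows how a barcode read from a control sheet would switch the mode applied to the pages that follow it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical barcode values printed on control sheets.
MODE_BY_BARCODE = {"MODE-COLOR": "color", "MODE-BW": "monochrome"}


@dataclass
class Sheet:
    name: str
    barcode: Optional[str] = None  # set only on control sheets


def assign_modes(bundle: Iterable[Sheet],
                 default_mode: str = "monochrome") -> Iterator[Tuple[str, str]]:
    """Yield (page name, scan mode) for every document page in the bundle.

    A control sheet changes the mode for all following pages and is
    itself dropped from the output.
    """
    mode = default_mode
    for sheet in bundle:
        if sheet.barcode in MODE_BY_BARCODE:
            mode = MODE_BY_BARCODE[sheet.barcode]
            continue
        yield sheet.name, mode


bundle = [
    Sheet("application-form"),
    Sheet("control", "MODE-COLOR"),
    Sheet("photo-id"),
    Sheet("control", "MODE-BW"),
    Sheet("address-proof"),
]
for page, mode in assign_modes(bundle):
    print(page, mode)
```

Because the mode is carried as state across the bundle, the control sheets can sit at any position and the feed never has to stop for a manual mode change.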

Scope-2: With many scanning utilities, the operator cannot view the images while scanning is in progress and has to wait for the whole batch in the feeder tray to finish. If the operator scans 50 pages at a time, he spends one minute scanning and one minute checking the images afterwards, so 50 pages take 2 minutes in total. If he could see the images during scanning, only one minute would be needed, which doubles the output at the scanning stage (a 100% increase in speed). In practice, achieving even 50% would be a success.

This can be achieved with an ISIS scanning interface, which allows images to be viewed while scanning is in progress. So go ahead and shift to a scanning utility with an ISIS viewer.
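The Scope-2 reasoning can be expressed as a small throughput model. The function name is illustrative, and the model assumes that when images are reviewed during scanning, a batch takes as long as the slower of the two activities; the 50-page batch and the one-minute figures come from the example above.

```python
def pages_per_hour(batch_pages: int, scan_minutes: float,
                   review_minutes: float, review_while_scanning: bool) -> float:
    """Hourly output for one operator under a simple batch-time model."""
    if review_while_scanning:
        # Assumption: review overlaps fully with scanning.
        minutes_per_batch = max(scan_minutes, review_minutes)
    else:
        minutes_per_batch = scan_minutes + review_minutes
    return batch_pages * 60 / minutes_per_batch


sequential = pages_per_hour(50, 1, 1, review_while_scanning=False)
overlapped = pages_per_hour(50, 1, 1, review_while_scanning=True)
gain = overlapped / sequential - 1

print(f"sequential: {sequential:.0f} pages/hour")
print(f"overlapped: {overlapped:.0f} pages/hour")
print(f"gain: {gain:.0%}")
```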

 (b) Quality Checking Output

Scope-1: The QC operator views every image alongside the physical document to verify errors; if an image requires rescanning, it is marked and sent for rescanning. This means extra effort and time wasted in performing the same task again and again. Deploying a flatbed scanner at every QC workstation to avoid this repeated effort is costly and time consuming, as scanning a single page on a flatbed takes at least 15 to 30 seconds.

If we instead use an entry-level professional digital camera (with a remote-shooting option) together with the scanning utility/software, we can save a lot of time, effort and money. A digital camera costs approximately one third of an A3 flatbed scanner, and its capture speed is also better than that of a flatbed scanner. Moreover, digital cameras are easy to transport because of their compact size, and they can capture bound documents too, right at the quality-checking stage. The dedicated rescanning activity can then be dropped.


DIGITIZATION – King Without Crown of Business Universe

Why should a business have computers when all of the business’s information is on paper?

Wouldn’t it make more sense to have all the documents available for viewing on the computer, or to have scanned images of documents on a device that can be connected to the computer easily and quickly? This is increasingly practical, since PDF files can be viewed by anyone who has a free PDF viewer, and going forward we can use PDF/A to make documents more secure and long-lasting, with many add-on features.

Going by various industry reports, we found that:

On average, an office employee uses 7,000 sheets a year, which costs about Rs 2,500 per employee per year. About 10% of employee effort goes into filing and storing documents, and about 20% is wasted in retrieving them. 5% of documents are lost or misfiled, and about 30% of effort is wasted in reconstructing these documents. Most important is the space used to store these documents at premium office locations, which is almost 15% to 20% of the total space. So around 70% of the effort is wasted in order to perform 30% of the job efficiently.

Storing these physical files requires filing cabinets costing as much as Rs 15,000 for a standard five-drawer lateral filing cabinet. The average filing cabinet uses 15.7 square feet at an average cost of Rs 25 to Rs 30 per square foot, which comes to about Rs 350 per filing cabinet per year. Furthermore, 40% of the files in those cabinets are duplicate information, and 85% are never accessed again.

By contrast, with digital storage a single CD can hold twelve thousand documents, or one full filing cabinet, and a DVD can hold up to 1.5 lakh (150,000) documents, or ten full filing cabinets. For companies that need bulk document scanning, document conversion can be a big problem unless a proper document scanning solution is in place.

But how many of us have a broad view of the digitization process (document scanning)? And how many realize that decisions made at every step, especially in the procurement of the scanner itself, will affect the total cost of digitization? In fact, the failure to evaluate the right equipment and associated software is often the cause of higher-than-expected scanning costs. These costs include the “cost of poor quality,” which an organization might incur months or even years after a document has been scanned, when the document is found to be unreadable and unusable.

The best way to avoid this kind of additional cost is to automate the digitization process and so minimize the dependency on man and machine. To achieve the optimum combination, invest appropriate time in selecting scanning equipment and digitization software. Remember that the price of even the world’s best workflow and document management software (the storage/archival and retrieval system for scanned documents) is negligible compared with the cost of scanning the documents, so put more emphasis on reducing the cost of scanning. Even a paisa (Indian currency) saved on every page can add up to savings of millions or billions over the total scanning volume; set the cost of workflow or document management software against that. This does not mean that any workflow or document management software will do: pay attention to its features, robustness and scalability.
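To see how a per-page saving scales with volume, the relationship can be written out directly. The page volumes below are hypothetical round numbers chosen only for illustration; the one fixed fact is that 100 paise make one rupee.

```python
PAISE_PER_RUPEE = 100


def total_saving_rupees(pages: int, saving_paise_per_page: float) -> float:
    """Total saving in rupees for a given volume and per-page saving."""
    return pages * saving_paise_per_page / PAISE_PER_RUPEE


# Hypothetical scanning volumes, each with 1 paisa saved per page.
for pages in (1_000_000, 100_000_000, 1_000_000_000):
    saving = total_saving_rupees(pages, 1)
    print(f"{pages:>13,} pages -> Rs {saving:,.0f}")
```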

The selection criteria for scanning equipment (the scanner) depend on features such as dynamic image processing technology, automatic de-skewing, border removal and other functions that ensure clean, crisp, usable images, as well as a straight paper path, image capturing technology, speed, paper handling mechanism, duty cycle, after-sales support, etc. The up-front investment in quality will reduce the need for rescans, speed up the quality assurance step in the value chain and minimize the chance that a poor-quality image will slip through undetected, creating problems down the road.

Similarly, the criteria for selecting digitization software depend on features such as ISIS/TWAIN driver support, viewing images while scanning, easy replacement, insertion and deletion of pages at any location, auto-categorization of documents, automatic scanning in different modes, image compression, available file formats, advanced image processing options, customized scanning parameters for different types of documents, and easy, customizable indexing/metadata entry.

Most important of all is to judge whether all our effort has delivered the right results. The best way to ensure quality while also reducing manpower is to look for an Automatic Document Quality Analyzer, which performs all types of image quality analysis automatically on more than 50 parameters and ensures that no below-standard image moves on to the final archival software.

