The following points should be considered when creating a lifecycle:
1. Where do the documents come from? - Documents can be captured from multiple sources, and although each lifecycle supports only one capture source, multiple lifecycles can be "daisy chained" together to impose consistency.
2. Do the documents require separation? - In some scenarios, multiple documents arrive in one physical document and require separation because they need to be classified at a more granular level. This is typical if documents are received via fax or scanned in by a multi-function device. The type of separation is also an important point to consider, as it may require a process change. For example, if you want to use a barcode or a separator sheet to identify new documents, then the business and/or vendor may need to add these to the document stack before capture. DocuNECT also supports "contextual" separation, which uses rules applied to the content of the document.
3. How are the indexes assigned? - Once the documents are separated, or classified, the data can be associated with or extracted from the document. This stage can be time-consuming, so look to automate this step as much as possible. Data can be typed in by the user, retrieved from an external database, or extracted from the document content itself using business rules. A combination of all methods can also be supported.
4. Where do the documents/data go? - Documents and data can stay in DocuNECT or be distributed to another document management system or a third-party business application.
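The four questions above can be written down as a simple planning record before a lifecycle is configured. The sketch below is only an illustration: `LifecyclePlan`, `Separation`, `IndexMethod`, and all field names and sample values are hypothetical and are not part of DocuNECT or its configuration format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Separation(Enum):
    NONE = "none"
    BARCODE = "barcode"
    SEPARATOR_SHEET = "separator sheet"
    CONTEXTUAL = "contextual"


class IndexMethod(Enum):
    MANUAL = "typed in by the user"
    DATABASE = "retrieved from an external database"
    EXTRACTION = "extracted from the document content"


@dataclass
class LifecyclePlan:
    """Hypothetical planning record; not a DocuNECT configuration format."""
    name: str
    capture_source: str                  # question 1: one source per lifecycle
    separation: Separation               # question 2
    index_methods: List[IndexMethod]     # question 3: methods can be combined
    destination: str                     # question 4
    next_lifecycle: Optional["LifecyclePlan"] = None  # daisy-chained lifecycle

    def needs_process_change(self) -> bool:
        # Barcodes and separator sheets must be added to the stack before capture.
        return self.separation in (Separation.BARCODE, Separation.SEPARATOR_SHEET)

    def chain(self) -> List[str]:
        # Walk the daisy chain and return the lifecycle names in order.
        names, plan = [], self
        while plan is not None:
            names.append(plan.name)
            plan = plan.next_lifecycle
        return names


# Illustrative values only.
archive = LifecyclePlan("Archive", "scan folder", Separation.NONE,
                        [IndexMethod.DATABASE], "DocuNECT")
fax = LifecyclePlan("Fax intake", "fax", Separation.BARCODE,
                    [IndexMethod.MANUAL, IndexMethod.EXTRACTION],
                    "DocuNECT", next_lifecycle=archive)
```

Calling `fax.needs_process_change()` flags the barcode requirement early, which is the point at which the business or vendor would need to be involved.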
Understanding the Document Structure
To facilitate document classification and data extraction, the DocuNECT document structure (called DocInfo) has the following components:
Component
|
Description
|
Document Content
|
Documents are stored in their native format in a designated storage location.
|
Document Rendition
|
Optionally, a rendition of a document can be stored for viewing purposes. Although part of the document structure it is not required.
|
Pages Info
|
Stores information about the pages, which allows DocuNECT to apply a logical structure to documents. Two scenarios where this is used:
•Soft Separation - A blob contains multiple document types, and the lifecycle is configured to use contextual separation. If the document is physically separated and the separation turns out to be incorrect, it is hard to glue it back together. Pages Info allows the document to be “soft” separated, so it appears separated without the physical document being touched, and changes can then be made easily.
•Logical Documents - If you are migrating from a third-party document management system that has a single-page architecture, Pages Info allows different pages to point to different storage files.
Pages Info stores information about:
•Page orientation
•Annotations
•Separation information
•Barcode information
•Storage file
|
DocText
|
The raw text that has been extracted from the document.
|
Text Info
|
If the document has been OCR’d then this component stores the positional information for the extraction engine.
|
Index Info
|
This is generated by the auto-indexing rules and stores information about each index:
•Rule used to extract information
•Data location information
•Barcode information
•Percentage confidence
•Page number the data was extracted from
•OCR block information
This information can be viewed through the UI by hovering over a blue information icon.
|
Classification Info
|
This is generated by the auto-classification rules and stores information about the classification:
•Rules used to extract information and associated confidence levels
•Data location information
•Barcode information
|
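To make the component list more concrete, the following is a minimal sketch of how such a structure could be modeled, with soft separation expressed purely as page metadata. It is not DocuNECT's actual schema or API: every class, field, and file name here is an assumption made for illustration, and the Text Info and Classification Info components are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PageInfo:
    storage_file: str               # file that physically holds this page
    orientation: int = 0            # rotation in degrees
    starts_document: bool = False   # soft separation marker
    annotations: List[str] = field(default_factory=list)
    barcode: Optional[str] = None


@dataclass
class IndexInfo:
    name: str
    value: str
    rule: str           # rule used to extract the value
    confidence: float   # percentage confidence, 0-100
    page_number: int    # page the data was extracted from


@dataclass
class DocInfo:
    content_path: str                       # document in its native format
    rendition_path: Optional[str] = None    # optional viewing rendition
    pages: List[PageInfo] = field(default_factory=list)
    doc_text: str = ""                      # raw extracted text
    indexes: List[IndexInfo] = field(default_factory=list)

    def logical_documents(self) -> List[List[int]]:
        """Group 1-based page numbers into soft-separated documents."""
        groups: List[List[int]] = []
        for number, page in enumerate(self.pages, start=1):
            if page.starts_document or not groups:
                groups.append([])
            groups[-1].append(number)
        return groups


# A five-page fax that was soft separated at pages 1 and 3.
doc = DocInfo(
    content_path="fax_0001.tif",
    pages=[PageInfo("fax_0001.tif", starts_document=(n in (1, 3)))
           for n in range(1, 6)],
)
print(doc.logical_documents())   # [[1, 2], [3, 4, 5]]

# Correcting a wrong split only changes metadata; the stored file is untouched.
doc.pages[2].starts_document = False
doc.pages[3].starts_document = True
print(doc.logical_documents())   # [[1, 2, 3], [4, 5]]
```

Because each page carries its own `storage_file`, the same model also covers the logical-document case, where pages migrated from a single-page system point to different files.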
Viewing the Document Structure
If you are creating/debugging auto-indexing rules, make sure you have the Designer permission as this will expose the following function in the Work Center.