Therefore™ Smart Capture Classifier

Right-clicking on the Smart Capture node brings up a context menu. This page relates solely to the content surrounding the Classifier tool. For information regarding the other content in this context menu, see:
Therefore™ Smart Capture Dialogs

New Classifier
Opens a dialog to configure a new classifier. This dialog has two tabs: Classifier and Settings. The following settings are under the Classifier tab.

Classifier name
Enter a name for the classifier.

Input category name
Enter a name for the input category. The input category will be automatically generated after the classifier is created. Documents placed in this category are then re-categorized by the classifier and placed into one of the data type categories.

A minimum of two categories must be selected to define the data types used by the classifier.

The following settings are under the Settings tab.

Automation level
Sets the confidence level required for the classifier to process documents automatically. The default is 90%, meaning that a document is processed automatically if the classifier is at least 90% sure it has assigned the correct category. Any document classified with a confidence below the threshold is sent for manual classification instead.

Note:

When a document is classified manually, it is set aside for a future re-training.
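
The threshold rule for the automation level can be sketched in a few lines of Python. This is an illustration only, not Therefore™ code: the function name `route_document`, its parameters, and the return labels are hypothetical, and the confidence is assumed to be expressed as a fraction between 0 and 1.

```python
def route_document(confidence: float, automation_level: float = 0.90) -> str:
    """Return how a document is handled for a given classifier confidence.

    Hypothetical helper: the names and return values are assumptions made
    for illustration and are not part of the Therefore™ product.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # At or above the automation level: the document is processed automatically.
    if confidence >= automation_level:
        return "automatic"
    # Below the automation level: the document goes to manual classification
    # and is later set aside for re-training.
    return "manual"
```

With the default of 90%, a confidence of 0.90 is still handled automatically, while 0.89 is routed to manual classification.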

Training Documents
Sets the number of documents used to train the AI classification process. The default is 20, and the minimum is 5.

Pages used for classification
Sets how many pages of the document are used for classification. The default is one, meaning only the first page will be used. This can be configured up to a maximum of 100 pages.
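
The limits described for the Classifier and Settings tabs can be summarized as a small validation sketch. The class `ClassifierSettings` and its field names are hypothetical and do not correspond to any product API. The lower bound of one page is also an assumption, since only the default (1) and the maximum (100) are documented.

```python
from dataclasses import dataclass


@dataclass
class ClassifierSettings:
    # Hypothetical container; field names are assumptions, not product API.
    categories: list[str]
    automation_level: int = 90          # percent, default 90
    training_documents: int = 20        # default 20, minimum 5
    pages_for_classification: int = 1   # default 1, maximum 100

    def problems(self) -> list[str]:
        """Return the violated rules (an empty list means the settings are valid)."""
        found = []
        if len(set(self.categories)) < 2:
            found.append("select at least two categories")
        if self.training_documents < 5:
            found.append("training documents must be at least 5")
        # Lower bound of 1 is assumed; the documented maximum is 100.
        if not 1 <= self.pages_for_classification <= 100:
            found.append("pages used for classification must be between 1 and 100")
        return found
```

For example, `ClassifierSettings(categories=["Invoices", "Contracts"]).problems()` returns an empty list because the defaults satisfy every rule; the category names here are made up.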

Opening an existing classifier opens its configuration dialog. This dialog has the same tabs as the New Classifier configuration dialog, with the addition of a Training tab. The following items are under the Training tab.

State
Displays the current status of the classifier. This can be:

  • Ready - the classifier is ready to classify documents.

  • In Progress - training or re-training is still in progress.

  • Training Failed - training was unsuccessful due to an error. Error details can be found in the 'Training error' log.

  • Re-training Failed - re-training was unsuccessful due to an error. Error details can be found in the 'Training error' log.

Last training
Displays the date of the last training.

Number of documents waiting for training
Displays how many documents are currently queued for training. These are documents the classifier could not previously process without manual intervention.

Training error
Lists any training errors that occurred in the last training.

Start training
Starts a re-training. In case of overuse, the re-training is scheduled for the next day instead. The classifier can still be used while it is being re-trained; however, documents are classified with the version from before the re-training until it completes.

Creating a Classifier
When a classifier is created, a configuration import dialog is opened. This import dialog will display an automatically generated workflow task and a keyword dictionary with automatically generated names related to the classifier's name.

Note:

Both the workflow and the keyword dictionary may be renamed after creation.

While the configuration import happens, the classifier will automatically begin training by selecting the defined number of training documents from each of the chosen data type categories. Training is expected to take two to three minutes. Once it is complete, opening the classifier's Training tab will show the state as 'Ready'.
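
Because the configured number of training documents is taken from each category, the size of the initial training set grows with the number of categories. The following is a minimal sketch with made-up category and document names and a hypothetical helper called `pick_training_documents`. How the product chooses documents within a category is not described here, so the sketch simply takes the first ones.

```python
def pick_training_documents(documents_by_category: dict[str, list[str]],
                            per_category: int = 20) -> dict[str, list[str]]:
    """Take up to `per_category` documents from each category.

    Hypothetical helper for illustration; taking the first documents is an
    assumption, not documented product behavior.
    """
    return {name: docs[:per_category]
            for name, docs in documents_by_category.items()}


# Example: two categories with 30 documents each and the default of 20.
library = {
    "Invoices": [f"invoice-{i}" for i in range(30)],
    "Contracts": [f"contract-{i}" for i in range(30)],
}
selected = pick_training_documents(library)
total = sum(len(docs) for docs in selected.values())
print(total)  # 40
```

With two categories and the default of 20, the sketch selects 40 documents in total; a third category would raise that to 60.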

Once ready, the classifier can begin processing documents. Any time a document is added to the defined input category, the auto-generated workflow task is triggered. If a document requires manual processing, it is placed into the re-training queue after it has been manually assigned.