Recognition Template

The idea

The Templater is the tool used to create all kinds of SMARTdoc templates: templates that recognize documents, templates that create workflows, and security templates that keep your data well protected.

To start the Templater, right-click the SMARTdoc icon in the taskbar and choose Templater.

The purpose of a recognition template is to make the document recognizable by SMARTdoc and, if desired, to fetch information from the document as indexes.

The more recognitions you build into a template, the better the result will be.

The quality of the document to be recognized is also very important: poor-quality scans give poor results. Be sure to test your template before you start using it.

 Recognition_nieuw.JPG

Create a new recognition template

In the Templater window, choose Recognition Template.

Fill in the name, version, description, group and author of the template. The group is useful because it is also used in the search window.

Recognition2.jpg

Click Next.

Parts of the recognition template

recognition3.jpg

At the bottom of the screen you see a Select... button. This button is important: here you select an example of the document to be recognized, which is then used to test the recognitions and the autofetches.

  • The upper part of the screen is used to define how the document will be recognized.
  • The second part is used for the indexes you want to fetch.

First we select an example, then we click + in the Recognitions part.

Important remark: check whether the text is searchable before you start on the template. If the document is an image, send it through an OCR program first to make the text searchable.

Recognitions

A recognition always has two parts: the value to look for and the expected result. Every recognition has a unique name; you can’t have two recognitions with the same name.

We will walk through the different possibilities of recognition.

Metadata: extension

The upper part of the recognition contains the metadata linked to the file.

You can have metadata connected to the:

  • file
  • image
  • user (this means that you can link a recognition to the user who is logged in at that moment)

We use the example of the file - extension:

Recognition4.jpg

Expected value

We have indicated the kind of recognition; now we have to define the expected extension. This is done in the bottom part of the Templater. In our example we set it to a PDF document.

Recognition5.jpg

We have different options:

  • Contains: the typed value is part of the found result.
  • = (text): the typed value is exactly the same as the found result.
  • Starts with, ends with: the found text has to start with or end with the typed value.
  • Exclude: the typed value is not part of the found text.
  • <, <=, >, >=: are all used for numbers.

We can indicate whether our value is case sensitive or not.
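
The comparison options above can be mimicked in a few lines of Python. This is only a sketch of the described behavior, not SMARTdoc's own implementation: the function name `matches`, the operator labels, and the `case_sensitive` flag are hypothetical names chosen for illustration.

```python
def matches(found, expected, operator, case_sensitive=False):
    """Return True if `found` satisfies `operator` against `expected`.

    Hypothetical helper: the names and operator labels are illustrative only.
    """
    numeric_ops = {
        "<": lambda a, b: a < b,
        "<=": lambda a, b: a <= b,
        ">": lambda a, b: a > b,
        ">=": lambda a, b: a >= b,
    }
    if operator in numeric_ops:
        # The comparison operators are meant for numbers.
        return numeric_ops[operator](float(found), float(expected))

    if not case_sensitive:
        # Ignore case by comparing lower-cased copies.
        found, expected = found.lower(), expected.lower()

    text_ops = {
        "contains": lambda f, e: e in f,
        "=": lambda f, e: f == e,
        "starts with": lambda f, e: f.startswith(e),
        "ends with": lambda f, e: f.endswith(e),
        "exclude": lambda f, e: e not in f,
    }
    return text_ops[operator](found, expected)


print(matches(".PDF", "pdf", "contains"))                       # True
print(matches(".PDF", "pdf", "contains", case_sensitive=True))  # False
print(matches("125.50", "100", ">="))                           # True
```

The second call shows why the case-sensitivity setting matters for extensions: ".PDF" and ".pdf" only match when case is ignored.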

After defining the result we save the recognition index.

Test the recognition

Each time we define a recognition we test to see if it works:

  • To do that we click the Test button at the bottom of our Templater screen.
  • SMARTdoc asks for a document to test.
  • If the test is successful, the result is shown in green; otherwise it is shown in red.
  • In our example everything is fine, so we can continue with the other recognitions.

Recognition6.jpg

Text Recognition: Location

When you have a searchable PDF or a Word document, you can fetch text out of it and check if it is recognized.

In the Templater, you create a new recognition and choose Text - Location. Make sure that the PDF example is selected. Click Get coördinates.

In the window that opens, select the location where the text is found:

Recognition7.jpg

The Adobe coordinates for the selected text are saved:

Recognition8.jpg

On the expected value line, type the text to be found; in our example this could be "Construction Project Report".

Don’t forget to save the index and test it.

Excel recognition:

In Excel, the x and y coordinates refer to cells: if the text is found in cell B3, you indicate ‘x=3;y=2’.
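
As a small illustration, the B3 example can be turned into a conversion helper. The sketch below assumes that x is the row number and y is the column number (A=1, B=2, ...), which is an interpretation of that single example rather than a documented rule; the function name `cell_to_coordinates` is hypothetical.

```python
import re


def cell_to_coordinates(cell):
    """Convert an Excel cell reference such as "B3" to the "x=..;y=.." form.

    Assumption taken from the single B3 example: x is the row number
    and y is the column number (A=1, B=2, ...).
    """
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", cell)
    if match is None:
        raise ValueError(f"not a cell reference: {cell!r}")
    letters, row = match.group(1).upper(), int(match.group(2))

    # Column letters form a base-26 number without a zero digit:
    # A=1 ... Z=26, AA=27, and so on.
    column = 0
    for letter in letters:
        column = column * 26 + (ord(letter) - ord("A") + 1)

    return f"x={row};y={column}"


print(cell_to_coordinates("B3"))  # x=3;y=2
```

It is worth checking the result against a test document before relying on it in a template.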

Text Recognition: Direction

It is not always possible to define the region where the text has to be found: not every document type has the same layout, and OCR text recognition can be unreliable.

You can also use Direction.

Direction works in three ways:

  • Left: takes the indicated number of characters before the typed text.
  • Right: takes the indicated number of characters after the typed text.
  • Between: takes all the text between the first and second text.

You can even use Direction in combination with a location selection. In that case you don’t have to define a number of characters: the recognition stops at the end of the selection.
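
The three directions can be pictured as simple string operations. The following Python sketch only illustrates the idea; the function names and the sample line are made up, and details such as how SMARTdoc treats surrounding spaces or repeated occurrences of the typed text are not covered by this article.

```python
def fetch_left(text, anchor, count):
    """Return up to `count` characters immediately before `anchor`."""
    position = text.find(anchor)
    if position == -1:
        return None
    return text[max(0, position - count):position]


def fetch_right(text, anchor, count):
    """Return up to `count` characters immediately after `anchor`."""
    position = text.find(anchor)
    if position == -1:
        return None
    start = position + len(anchor)
    return text[start:start + count]


def fetch_between(text, first, second):
    """Return all text between `first` and the next occurrence of `second`."""
    start = text.find(first)
    if start == -1:
        return None
    start += len(first)
    end = text.find(second, start)
    if end == -1:
        return None
    return text[start:end]


# Made-up sample line for illustration.
line = "Invoice number: 20240017 Date: 12/03/2024"
print(fetch_right(line, "Invoice number: ", 8))          # 20240017
print(fetch_between(line, "Invoice number: ", " Date"))  # 20240017
print(fetch_left(line, " Date", 8))                      # 20240017
```

All three calls return the same invoice number here, which shows that the choice of direction mostly depends on which surrounding text is stable across your documents.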

Mail recognition

An email is most easily recognized by its extension: a Microsoft Outlook mail has the extension ".msg". In addition, you can use the sender, the receiver, or the subject as a recognition.

Path recognition

SMARTdoc looks at the folder structure where the document was originally located and checks the folder name(s).

In the Path field you indicate the level number within the path. Level 0 is the highest level (the drive itself, for example "C:\"). Counting on from there you get 1, 2, etc. On a PC, "Users" is level 1.

Recognition on path can be very useful because folder names usually stay the same.
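
The level numbering can be checked with Python's standard `pathlib` module, which splits a Windows path into parts in the same order. This is only an illustration of the numbering; the helper name `path_level` and the example path are made up.

```python
from pathlib import PureWindowsPath


def path_level(path, level):
    """Return the path component at `level`, where 0 is the drive itself."""
    parts = PureWindowsPath(path).parts
    if level < 0 or level >= len(parts):
        return None
    return parts[level]


# Made-up example path; the last part is the file name itself.
example = r"C:\Users\anna\Documents\Invoices\2024\invoice.pdf"
print(path_level(example, 0))  # C:\
print(path_level(example, 1))  # Users
print(path_level(example, 4))  # Invoices
```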

Barcode

SMARTdoc recognizes barcodes. You can indicate that a certain barcode has to be found and use it to recognize a document.

AutoFetch Indexes

AutoFetch indexes work almost the same way as recognitions, except that you do not indicate an expected value.

We will point out the differences between the two.

Special options in the header of the autofetch

 Recognition9.jpg

When creating an autofetch, there are some options available in the header of the index:

  • Is Workflow Template: if there is a workflow with the same name as the fetched value, it will be activated.
  • Is Security Template: if there is a security template with the same name as the fetched value, it will be activated.
  • Add to AutoComplete Values.
  • Hidden: the index is not shown but is added to the document.
  • Rename file to this index: the file name is changed to the fetched value.
  • Mandatory: the index has to be filled in or the file is not saved in SMARTdoc. If the file is saved through an automatic watchfolder, the user sees a red exclamation mark in SMARTdoc to show that a value is missing.
  • Show existing index values: all existing index values given to other documents are shown in a drop-down menu.
  • Multi-index: several values can be given for one index.
  • Not editable: the fetched index value cannot be changed.
  • Auto-complete: when using an auto-completing list from the database, the rest of the values will automatically be filled in.
  • Add to tree view: adds the index to the tree view.

Metadata

Works the same as the recognition metadata, with the difference that there is no expected value.

This is also used to fetch the user name as metadata: the name of the logged-in person who saved the file in SMARTdoc.

Text

Works exactly the same as in the recognition part. In autofetch we often use the combination of location and direction to fetch exactly what we want.

Mail

When adding mails to SMARTdoc we can fetch the sender, the addressee, the ‘cc’, the ‘bcc’ and the attachments.

Path

The path fetches the name of the defined folder. This can be used to carry an existing tree structure over into SMARTdoc.

Fixed

An index in which the user can type free text. You can provide a value in advance, or you can use fixed indexes that are connected to this index.

Counter

You type the name of the counter, and SMARTdoc adds a number each time you archive a document. When documents are deleted, their numbers are not reused.

OGM

Used for the structured message in a payment:

Recognition10.jpg

SMARTdoc fetches the numbers between the pluses and can replace the "/" with other characters.
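
To make the idea concrete, here is a minimal Python sketch of such a fetch. It assumes the usual Belgian layout of a structured message, +++NNN/NNNN/NNNNN+++; the function name `fetch_ogm`, the `replacement` parameter, and the sample reference are made up for illustration and do not describe SMARTdoc's internals.

```python
import re

# Three plus signs, then 3, 4 and 5 digits separated by slashes, then three plus signs.
OGM_PATTERN = re.compile(r"\+\+\+(\d{3}/\d{4}/\d{5})\+\+\+")


def fetch_ogm(text, replacement="/"):
    """Return the digits between the pluses, with "/" replaced by `replacement`."""
    match = OGM_PATTERN.search(text)
    if match is None:
        return None
    return match.group(1).replace("/", replacement)


# Made-up reference, used only to show the layout.
sample = "Payment reference: +++123/4567/89012+++"
print(fetch_ogm(sample))                   # 123/4567/89012
print(fetch_ogm(sample, replacement=""))   # 123456789012
print(fetch_ogm(sample, replacement="-"))  # 123-4567-89012
```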

Barcode

Fetches the value of a barcode found in the document. You can indicate the length of the used barcode to make sure you fetch the right value.

Database Query

Here you can type an SQL statement to fetch information directly from the SMARTdoc database.

Remove whitespace

If there are blanks in the found result, SMARTdoc will remove them. This is useful when the original document is of poor quality and the OCR has added extra spaces to the text.

Is case sensitive

This option distinguishes between upper- and lower-case letters.
For example: "Run" is not the same as "run", because the first letter differs in case.

Regular expressions

You can indicate the format of the expected value.

For example, you can indicate that the fetched number has to be 8 characters long. This is done with a regular expression.
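
As an illustration of the 8-character example, the pattern `[0-9]{8}` accepts exactly eight digits. The sketch below uses Python's `re` module to show the effect; this article does not state which regular expression dialect SMARTdoc uses, so treat the exact syntax as an assumption and verify it with the Test button. The function name is hypothetical.

```python
import re

# Exactly eight digits, nothing before or after.
EIGHT_DIGITS = re.compile(r"[0-9]{8}")


def has_expected_format(value):
    """Return True if `value` consists of exactly eight digits."""
    return EIGHT_DIGITS.fullmatch(value) is not None


print(has_expected_format("20240017"))   # True
print(has_expected_format("2024001"))    # False (only seven digits)
print(has_expected_format("2024 0017"))  # False (contains a space)
```

The last case shows why the Remove whitespace option can matter: a stray space added by OCR is enough to make a strict format check fail.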

By clicking the Help button you can get some assistance:

Recognition11.jpg

By typing the expression yourself, you can go much further:

Recognition12.jpg

If you need help building your regular expressions, we are happy to assist.

Saving the template

Once you are finished, you can save the template.

The template is an XML file. This XML file shows the recognitions and the indexes fetched.

Recognition13.jpg

Add the template to SMARTdoc

A template is added to SMARTdoc the same way as a document: drag and drop the template onto the SMARTdoc icon. When you add the recognition template, you are asked which security and workflow to use with it.
