TableKit
Table Kit Plugin For T-Plan Robot Enterprise
Contents:
1. Overview
2. Plugin installation
3. Usage
3.1 GetTables Script
3.2 GetCell Script
3.3 FindCell Script
3.4 ExportTable Script
4. Change Log
1. Overview
The Table Kit Plugin finds tables on the screen using graphical screen analysis. This includes:
- Recognition of tables and their properties (row and column count, number of cells, location, width and height)
- Exposing of cell properties (bounds, color, text)
- Optional retrieval of text from individual table cells using the Tesseract OCR engine.
The plugin is primarily designed for test scripts written in the TPR language. To call the plugin scripts from Java, either use the DefaultJavaTestScript.run() method or use the script instance approach described in the Java Test Script Basics document.
The table recognition algorithm has a few limitations. Please consider them before relying on it.
- All cells must have solid color background and must be separated by a grid line (border) of another color.
- There must be at least one pixel distance (padding) between the cell content (text) and its border, i.e. the text may not touch the border.
- When the algorithm evaluates whether a set of cells forms a table, it prefers regular tables with cells of the same size. As a result, merged cells spanning multiple standard cells are sometimes recognized as a separate table.
Should you have any questions or suggestions, contact T-Plan support. For the list of other plugins provided by T-Plan go here.
2. Plugin installation
To install the plugin download the appropriate archive from the following location:
- Version 0.8 (T-Plan Robot 7 and higher): https://downloads.t-plan.com/releases/robot/plugins/tablekit_v0.8.zip
- Version 0.7 (T-Plan Robot 4.1.1 and higher): https://downloads.t-plan.com/releases/robot/plugins/tablekit_v0.7.zip
- Version 0.5 (T-Plan Robot 3.5.2 - 4.1): https://downloads.t-plan.com/releases/robot/plugins/tablekit_v0.5.zip
For version differences see the Change Log.
OPTION 1:
- Unzip the file to a location on your hard drive.
- Add the tablekit.jar file to the class path of the Robot start command. For details see the Robot release notes.
- Start or restart Robot. The test scripts will be exposed to Robot. When you create a Run command in your TPR script the property window will list the plugin scripts.
OPTION 2 (NOT FOR MAC APP):
- Unzip the file to the plugins/ directory under the Robot installation directory. This will make Robot load the classes on start up. Make sure to remove any older versions of the plugin.
IMPORTANT: Do not do this if you are running a Mac app because you would invalidate the app signature. Use Option 3 instead.
- Start or restart Robot. The test scripts will be exposed to Robot. When you create a Run command in your TPR script the property window will list the plugin scripts. The plugin will also add the table viewer to the Plugins menu of the Robot GUI.
OPTION 3:
NOTE: This option will not install the table viewer into the menu. The viewer will however be available from the Properties window of the GetTables script. For the viewer screenshot see the Usage chapter below.
- Unzip the file to a location on your hard drive.
- If you plan on using the plugin in TPR scripts, put the following command at the beginning of each test script:
Include "<location>/tablekit.jar"
- If you plan on using the plugin in Java test scripts, put the tablekit.jar file onto the Java class path (see OPTION 1).
To uninstall the plugin, delete the file and remove any Include references.
3. Usage
The plugin contains four Java test scripts:
Script Name | Description |
---|---|
com.tplan.table.GetTables | Recognize table(s) on the screen. |
com.tplan.table.GetCell | Retrieve information about a cell of the previously recognized table. |
com.tplan.table.FindCell | Find a cell by text in the previously recognized table(s). Supported since v0.8. |
com.tplan.table.ExportTable | Export previously recognized table(s) to a text, CSV or MS Excel file. |
The plugin scripts are to be called from TPR test scripts using the Run command. The command instances may be easily created using the Command Wizard tool. To edit an existing Run command, right click it and select Properties in the context menu. The GetTables script also provides the table viewer.
A few examples showing typical usage:
EXAMPLE #1:
Recognize a table and save text of its cell at the second row and third column to a variable called TEXT:
// Presume that tablekit.jar is in the script folder
Include "tablekit.jar"
// Recognize table(s) on the screen
Run "com.tplan.table.GetTables"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Get details of the cell at row=2 and column=3 and perform OCR
Run "com.tplan.table.GetCell" row="2" column="3" tocr="true"
if ({_EXIT_CODE} > 0) {
Exit 2 desc="Failed to retrieve cell at [2,3]. See the log for details."
}
// Save the cell text to the TEXT variable
Var TEXT="{_TABLE_CELL_TEXT}"
EXAMPLE #2:
Click cell at [row, column] = [3,5].
// Recognize table(s) on the screen
Run "com.tplan.table.GetTables"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Get coordinates of the cell at [3,5]
Run "com.tplan.table.GetCell" row="3" column="5"
if ({_EXIT_CODE} > 0) {
Exit 2 desc="Failed to retrieve cell at [3,5]. See the log for details."
}
// Click at the cell coordinates plus 5 pixels in each direction
Mouse click to="x:{_TABLE_CELL_X}+5,y:{_TABLE_CELL_Y}+5"
EXAMPLE #3:
Verify whether the cells in the first column have the same color and set the SAMECOLOR variable to true or false accordingly.
// Declare the _TABLE1_ROWS variable to suppress compiler warnings in for()
Var _TABLE1_ROWS=0
// Recognize table(s) on the screen
Run "com.tplan.table.GetTables"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Get color of the first cell
Run "com.tplan.table.GetCell"
if ({_EXIT_CODE} > 0) {
Exit 2 desc="Failed to retrieve cell at [1,1]. See the log for details."
}
Var COLOR={_TABLE_CELL_COLOR}
// Iterate over the rest of the cells in the first column and test
// whether their color is the same
Var SAMECOLOR=true
for (i=2; {i}<{_TABLE1_ROWS}+1; i={i}+1) {
Run "com.tplan.table.GetCell" row={i} column=1
if ("{_TABLE_CELL_COLOR}" != "{COLOR}") {
// Not the same color
Var SAMECOLOR=false
break
}
}
EXAMPLE #4:
Recognize table(s) and save them to a MS Excel file.
// Recognize tables and perform OCR to retrieve text of all cells
Run "com.tplan.table.GetTables" tocr="true"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Export the tables to MS Excel
Run "com.tplan.table.ExportTable" file="C:\test.xlsx"
EXAMPLE #5:
Find a cell labeled "Price:" and read the value from the cell next to it.
// Recognize tables and perform OCR to retrieve text of all cells
Run "com.tplan.table.GetTables" tocr="true"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Find a cell labeled "Price:"
Run "com.tplan.table.FindCell" text="Price:"
if ({_EXIT_CODE} > 0) {
Exit 2 desc="The 'Price:' cell was not found"
}
// Use variables produced by FindCell to get value in the cell in the next column.
Run "com.tplan.table.GetCell" table="{_TABLE_CELL_TABLE}" row="{_TABLE_CELL_ROW}" column="{_TABLE_CELL_COL}+1"
// Log the cell value.
Log "The price is: {_TABLE_CELL_TEXT}"
3.1 GetTables Script
DESCRIPTION
The com.tplan.table.GetTables script performs analysis of the screen in order to find a table or tables. The basic data of detected table(s) is exposed through the following variables:
Variable Name | Description |
---|---|
_TABLE_COUNT=<number> | Number of recognized tables. |
_TABLE<n>_X=<X-coordinate> _TABLE<n>_Y=<Y-coordinate> _TABLE<n>_W=<width> _TABLE<n>_H=<height> | The bounding rectangle of the n-th table where <n> is between one and the number of tables. |
_TABLE<n>_ROWS=<number> _TABLE<n>_COLUMNS=<number> _TABLE<n>_CELLS=<number> | Number of rows, columns and cells of the n-th table where <n> is between one and the number of tables. |
_TABLE<n>_TEXT=<text> | Summary text recognized in the n-th table. This variable is populated only if the OCR is on (tocr=true). The text is composed of the individual cells in reading order, with the tabulator (tab, \t) and new line (\n) serving as the cell and row separators. To get the text of a single cell use GetCell. |
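The tab and new line layout of the summary text can be split back into rows and cells with a few lines of plain Java. The following is a minimal sketch, not part of the plugin: the class name TableTextParser and the sample text are hypothetical, and it assumes the text does not end with a trailing new line (which would produce one extra empty row).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of TableKit): splits the summary text
// into rows and cells using the documented separators.
final class TableTextParser {

    // Rows are separated by \n, cells by \t.
    // The -1 limit keeps trailing empty strings so blank cells are not lost.
    static List<List<String>> parse(String summary) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : summary.split("\n", -1)) {
            rows.add(List.of(line.split("\t", -1)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample text standing in for the value of _TABLE1_TEXT
        String summary = "Item\tPrice\nApple\t1.20\nPear\t";
        List<List<String>> rows = parse(summary);
        System.out.println(rows.size());            // prints 3
        System.out.println(rows.get(1).get(1));     // prints 1.20
    }
}
```

In a Java test script the input would come from the _TABLE<n>_TEXT variable rather than a literal.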
NOTE: The scale parameter is obsolete since v0.5. It is accepted on the command line to preserve compatibility with the previous releases but its value is ignored. The command will identify the best scaling factor on its own.
SYNOPSIS (TPR SCRIPTS)
Run com.tplan.table.GetTables [cw=<width_in_pixels>] [ch=<height_in_pixels>] [minrows=<number>] [mincols=<number>] [spacing=<spacing_in_pixels>] [cmparea=[x:<x>][,y:<y>][,w:<width>][,h:<height>]] [tocr=<true|false>] [language=<3-char_lang_code>] [fixocr=<true|false>]
SYNOPSIS (JAVA SCRIPTS)
run("com.tplan.table.GetTables" [, "cw", "<pixels>"] [, "ch", "<pixels>"] [, "minrows", "<number>"] [, "mincols", "<number>"] [, "spacing", "<pixels>"] [, "cmparea", "[x:<x>][,y:<y>][,w:<width>][,h:<height>]"] [, "tocr", < "false" | "true" >] [, "language", "<lang_code>"] [, "fixocr", < "false" | "true" >] );
* All parameters are optional.
OPTIONS
cw=<pixels>
- Minimum cell width in pixels. Defaults to 15.
ch=<pixels>
- Minimum cell height in pixels. Defaults to 10.
minrows=<number>
- Minimum required number of table rows. Tables that don't meet this criterion will be omitted. Defaults to 2 rows.
mincols=<number>
- Minimum required number of table columns. Tables that don't meet this criterion will be omitted. Defaults to 2 columns.
spacing=<spacing>
- Maximum spacing (grid thickness) between two neighboring cells. Defaults to 4 pixels.
cmparea=[x:<x>][,y:<y>][,w:<width>][,h:<height>]
- The rectangular area of the desktop to limit the table recognition to. The syntax and functionality are the same as for the cmparea parameter of the CompareTo command. If you omit this parameter the whole remote desktop will be processed. The area coordinates have the format 'x:<x>,y:<y>,w:<width>,h:<height>', where each coordinate can be specified in pixels (for example, "x:225") or as a percentage ("x:23%"). If any of x, y, width or height are omitted, T-Plan Robot will use the full screen values to determine the missing parameters (x:0, y:0, w:<screen_width>, h:<screen_height>). For example, "cmparea=y:50%,height:50%" will process the bottom half of the screen.
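To make the defaulting and percentage rules concrete, here is a rough Java sketch of how such an area string could be resolved to pixel values. It is not Robot's parser. The class name CmpAreaSketch is hypothetical, and it assumes that x and w percentages are relative to the screen width, y and h percentages to the screen height, and that both the short (w, h) and long (width, height) key spellings are acceptable.

```java
import java.util.Arrays;

// Hypothetical illustration of the cmparea rules (not Robot's own parser).
final class CmpAreaSketch {

    // Resolves one coordinate; a trailing % is taken relative to `full`.
    private static int resolve(String value, int full) {
        String v = value.trim();
        if (v.endsWith("%")) {
            double pct = Double.parseDouble(v.substring(0, v.length() - 1));
            return (int) Math.round(full * pct / 100.0);
        }
        return Integer.parseInt(v);
    }

    // Returns {x, y, w, h}; omitted coordinates keep the full-screen defaults.
    static int[] parse(String area, int screenW, int screenH) {
        int[] rect = {0, 0, screenW, screenH};
        if (area == null || area.trim().isEmpty()) {
            return rect;
        }
        for (String part : area.split(",")) {
            String[] kv = part.split(":", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Bad token: " + part);
            }
            switch (kv[0].trim()) {
                case "x": rect[0] = resolve(kv[1], screenW); break;
                case "y": rect[1] = resolve(kv[1], screenH); break;
                case "w": case "width": rect[2] = resolve(kv[1], screenW); break;
                case "h": case "height": rect[3] = resolve(kv[1], screenH); break;
                default: throw new IllegalArgumentException("Unknown key: " + kv[0]);
            }
        }
        return rect;
    }

    public static void main(String[] args) {
        // Bottom half of a 1920x1080 screen
        System.out.println(Arrays.toString(parse("y:50%,h:50%", 1920, 1080)));
        // prints [0, 540, 1920, 540]
    }
}
```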
tocr=<true|false>
- The value of "true" will perform Tesseract OCR for each recognized cell to retrieve its text. The text may then be obtained through a call of the GetCell script. The parameter defaults to false (do not perform OCR). WARNING: This OCR mode may take a long time, especially for large tables. It is suitable only if you plan on iterating over the texts of all cells or exporting the table to a file. Should you need to retrieve the text of a single cell or a few cells, it is more efficient to skip the OCR in GetTables and perform single cell OCR using the tocr parameter of GetCell.
language=<lang_code>
- A valid 3-character language code of a properly installed Tesseract language data file. The parameter is used only if OCR is on (tocr=true). If the parameter is omitted it defaults to "eng" (English). See the Tesseract OCR comparison method for details.
fixocr=<true|false>
- The value of true will make an attempt to fix some known Tesseract OCR accuracy errors such as:
- Redundant spaces in numbers ("3. 14" instead of "3.14")
- Digits confused with letters where the cell value appears to be a number, such as recognition of capital 'O' instead of the zero digit '0' ("2.O5" instead of "2.05")
- Slash confused with another character in date values ("29|10|2013" instead of "29/10/2013")
The default value is true (do fix). Set it to false if the command fixes texts that are not supposed to be fixed, such as strings similar to numbers or dates (product codes, special character and digit sequences and so on).
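The three kinds of correction listed above can be pictured with a small Java sketch. This is only an illustration of the idea and not the plugin's actual logic, which is not published here. The class name OcrFixSketch and the regular expressions are assumptions chosen to cover the three quoted examples.

```java
// Hypothetical illustration of the fixocr idea (not the plugin's code).
final class OcrFixSketch {

    static String fix(String text) {
        String s = text.trim();
        // Date-like value with pipes where slashes are expected, e.g. 29|10|2013
        if (s.matches("\\d{1,2}\\|\\d{1,2}\\|\\d{4}")) {
            return s.replace('|', '/');
        }
        // Number-like value: only digits, capital O, dots, commas and spaces,
        // and at least one real digit
        if (s.matches("[0-9O., ]+") && s.matches(".*[0-9].*")) {
            return s.replace('O', '0').replace(" ", "");
        }
        // Anything else is left untouched
        return text;
    }

    public static void main(String[] args) {
        System.out.println(fix("3. 14"));       // prints 3.14
        System.out.println(fix("2.O5"));        // prints 2.05
        System.out.println(fix("29|10|2013"));  // prints 29/10/2013
        System.out.println(fix("AB-1O5"));      // prints AB-1O5 (not number-like)
    }
}
```

A rule set this simple also shows why the option can misfire: a value such as "10 20" looks number-like and would be joined into "1020", which is the sort of case where switching fixocr off is the safer choice.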
RETURNS
The command returns 0 (zero) when at least one table is found and no error is experienced. Otherwise it returns 1 (one) when no tables get recognized or 2 (two) when the OCR fails to execute for an I/O error, typically when Tesseract is not installed or misconfigured. The failure is also logged into the execution log.
EXAMPLES
See the Usage chapter.
3.2 GetCell Script
DESCRIPTION
The com.tplan.table.GetCell script retrieves properties of a table cell recognized previously through GetTables. The data is exposed to the script in the form of variables:
Variable Name | Description |
---|---|
_TABLE_CELL_X=<X-coordinate> _TABLE_CELL_Y=<Y-coordinate> _TABLE_CELL_W=<width> _TABLE_CELL_H=<height> | The bounding rectangle of the cell. |
_TABLE_CELL_ROWSPAN=<number> _TABLE_CELL_COLSPAN=<number> | The cell row span and column span. |
_TABLE_CELL_TEXT=<text> | The cell text. This variable is populated only if the OCR is on (tocr=true) or if OCR has been performed before either by GetTables or by a call of GetCell for the same cell row and column number. |
_TABLE_CELL_BLANK=<true|false> | Indicates whether the cell is blank or not. This is verified independently from OCR through testing whether the cell area is made of a single color. |
_TABLE_CELL_COLOR=<color> | The cell background color in the HTML notation. For example, white is "ffffff" while black is "000000". |
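When the color value is processed in Java, the six hexadecimal digits can be decoded into red, green and blue components. The sketch below is a hypothetical helper, not a plugin API. The tolerant comparison is an added idea for cases where an exact string comparison of two colors is too strict, and the class and method names are made up for the illustration.

```java
// Hypothetical helper for working with _TABLE_CELL_COLOR values.
final class CellColorSketch {

    // Decodes a 6-digit HTML-style color such as "ffffff" into {red, green, blue}.
    static int[] toRgb(String html) {
        String hex = html.startsWith("#") ? html.substring(1) : html;
        if (hex.length() != 6) {
            throw new IllegalArgumentException("Expected 6 hex digits: " + html);
        }
        int value = Integer.parseInt(hex, 16);
        return new int[] {(value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF};
    }

    // True when every color channel differs by at most `tolerance`.
    static boolean similar(String a, String b, int tolerance) {
        int[] x = toRgb(a);
        int[] y = toRgb(b);
        for (int i = 0; i < 3; i++) {
            if (Math.abs(x[i] - y[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(toRgb("ffffff")[0]);                // prints 255
        System.out.println(similar("fefefe", "ffffff", 2));    // prints true
        System.out.println(similar("000000", "ffffff", 2));    // prints false
    }
}
```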
NOTE: The scale parameter is obsolete since v0.5. It is accepted on the command line to preserve compatibility with the previous releases but its value is ignored. The command will identify the best scaling factor on its own.
SYNOPSIS (TPR SCRIPTS)
Run com.tplan.table.GetCell [row=<number>] [column=<number>] [table=<number>] [tocr=<true|false>] [language=<3-char_lang_code>] [fixocr=<true|false>]
SYNOPSIS (JAVA SCRIPTS)
run("com.tplan.table.GetCell" [, "row", "<number>"] [, "column", "<number>"] [, "table", "<number>"] [, "tocr", < "false" | "true" >] [, "language", "<lang_code>"] [, "fixocr", < "false" | "true" >] );
* All parameters are optional.
OPTIONS
row=<number>
- The cell row number. When omitted it defaults to the first row (1).
column=<number>
- The cell column number. When omitted it defaults to the first column (1).
table=<number>
- The table number. It makes sense only if GetTables recognizes more than one table. When omitted it defaults to the first table (1).
tocr=<true|false>
- The value of "true" will perform Tesseract OCR to retrieve the cell text unless the cell is blank (solid color). The parameter defaults to false (do not perform OCR).
language=<lang_code>
- A valid 3-character language code of a properly installed Tesseract language data file. The parameter is used only if OCR is on (tocr=true). If the parameter is omitted it defaults to "eng" (English). See the Tesseract OCR comparison method for details.
fixocr=<true|false>
- The value of true will make an attempt to fix some known Tesseract OCR accuracy errors such as:
- Redundant spaces in numbers ("3. 14" instead of "3.14")
- Digits confused with letters where the cell value appears to be a number, such as recognition of capital 'O' instead of the zero digit '0' ("2.O5" instead of "2.05")
- Slash confused with another character in date values ("29|10|2013" instead of "29/10/2013")
The default value is true (do fix). Set it to false if the command fixes texts that are not supposed to be fixed, such as strings similar to numbers or dates (product codes, special character and digit sequences and so on).
RETURNS
The command returns 0 (zero) on success. When it fails it returns 1 (one) when the cell is not available or 2 (two) when the OCR fails to execute for an I/O error, typically when Tesseract is not installed or misconfigured. The failure is also logged into the execution log.
EXAMPLES
See the Usage chapter.
3.3 FindCell Script
DESCRIPTION
The com.tplan.table.FindCell script finds a cell or cells by their text in a table or tables recognized previously through GetTables. It is supported since v0.8.
The search may be performed through an exact text match (the "text" parameter) or a regular expression ("pattern"). It involves all tables and cells unless the "tables", "columns" and/or "rows" parameters are specified to limit the scope. Each of these three parameters accepts a single number or a comma separated list of numbers.
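The difference between the two search modes, and the shape of the scope lists, can be tried out in plain Java. The sketch below is hypothetical and does not call the plugin. It assumes that a pattern must match the whole cell text (Matcher.matches()), which is consistent with the "Price.*" example given in the OPTIONS but is not confirmed by it.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical illustration of the FindCell matching rules (not the plugin's code).
final class FindCellSketch {

    // Exact match, as with the "text" parameter.
    static boolean matchesText(String cellText, String text) {
        return cellText.equals(text);
    }

    // Regular expression match, as with the "pattern" parameter.
    // ASSUMPTION: the whole cell text must match the expression.
    static boolean matchesPattern(String cellText, String regex) {
        return Pattern.compile(regex).matcher(cellText).matches();
    }

    // Parses a scope value such as "1" or "1,3,5" into a sorted set of numbers.
    static Set<Integer> parseNumbers(String list) {
        Set<Integer> numbers = new TreeSet<>();
        for (String token : list.split(",")) {
            numbers.add(Integer.parseInt(token.trim()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(matchesText("Price:", "Price:"));            // prints true
        System.out.println(matchesPattern("Price: 12.50", "Price.*"));  // prints true
        System.out.println(matchesPattern("Total Price", "Price.*"));   // prints false
        System.out.println(parseNumbers("1, 3,5"));                     // prints [1, 3, 5]
    }
}
```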
Since the script works with the cell text it requires OCR to be set up and working. If the text has been recognized already, for example by running GetTables with the "tocr=true" parameter, the script will use it as is regardless of whether the OCR parameters are specified or not. If the text is not yet available it performs the OCR on its own on the fly. You may use this to improve the performance of your script. For example, there's a large table on the screen but you are only looking for a cell called "Total:" in the very first column. Performing a full scale OCR with GetTables could be a long operation. You may skip it and leave the OCR to the FindCell script while limiting the search to the first column ("columns=1").
The data of cell(s) found is exposed to the script in form of variables:
Variable Name | Description |
---|---|
_TABLE_CELL_COUNT=<count> | Number of cells found. |
_TABLE_CELL_TABLE=<table_number> | Number of the table the first cell was found in. |
_TABLE_CELL_TABLE<n>=<table_number> | Number of the table the n-th cell was found in. Numbering starts with 1. |
_TABLE_CELL_ROW=<row_number> _TABLE_CELL_COL=<column_number> | Row and column numbers of the first cell found. |
_TABLE_CELL_ROW<n>=<row_number> | Row and column numbers of the n-th cell found. |
_TABLE_CELL_X=<X-coordinate> | The bounding rectangle of the first cell found. |
_TABLE_CELL_X<n>=<X-coordinate> | The bounding rectangle of the n-th cell found. Numbering starts with 1. |
_TABLE_CELL_TEXT=<text> | Text of the first cell found. |
_TABLE_CELL_TEXT<n>=<text> | Text of the n-th cell found. |
SYNOPSIS (TPR SCRIPTS)
Run com.tplan.table.FindCell [text=<text>] [pattern=<regular_expression>] [rows=<numbers>] [columns=<numbers>] [tables=<numbers>] [language=<3-char_lang_code>] [fixocr=<true|false>]
SYNOPSIS (JAVA SCRIPTS)
run("com.tplan.table.FindCell", "text"|"pattern", "text_or_regular_expression" [, "rows", "<number(s)>"] [, "columns", "<number(s)>"] [, "tables", "<number(s)>"] [, "language", "<lang_code>"] [, "fixocr", < "false" | "true" >] );
* One of "text" or "pattern" must be specified. All other parameters are optional.
OPTIONS
text=<text>
- The exact cell text to look for.
pattern=<regular_expression>
- A java.util.regex.Pattern compliant regular expression to match the cell text against. For example, to look for a cell with text starting with "Price" use the expression of "Price.*".
tables=<number(s)>
- A table number or a comma separated list of numbers to limit the search to. If this parameter is omitted the search will be performed in all tables.
rows=<number(s)>
- A row number or a comma separated list of numbers to limit the search to. If this parameter is omitted the search will be performed in all rows.
columns=<number(s)>
- A column number or a comma separated list of numbers to limit the search to. If this parameter is omitted the search will be performed in all columns.
language=<lang_code>
- A valid 3-character language code of a properly installed Tesseract language data file. The parameter will be used only if the cell text hasn't been recognized through OCR yet. If the parameter is omitted it defaults to "eng" (English). See the Tesseract OCR comparison method for details.
fixocr=<true|false>
- The value of true will make an attempt to fix some known Tesseract OCR accuracy errors such as:
- Redundant spaces in numbers ("3. 14" instead of "3.14")
- Digits confused with letters where the cell value appears to be a number, such as recognition of capital 'O' instead of the zero digit '0' ("2.O5" instead of "2.05")
- Slash confused with another character in date values ("29|10|2013" instead of "29/10/2013")
The default value is true (do fix). Set it to false if the command fixes texts that are not supposed to be fixed, such as strings similar to numbers or dates (product codes, special character and digit sequences and so on).
RETURNS
The command returns 0 (zero) if at least one cell is found. When it fails it returns 1 (one) when no cell is found or 2 (two) when the OCR fails to execute for an I/O error or the input parameters are not matching the GetTables results. The failure is also logged into the execution log.
EXAMPLES
See the Usage chapter.
3.4 ExportTable Script
DESCRIPTION
The com.tplan.table.ExportTable script exports table(s) recognized previously through GetTables to a text, CSV or MS Excel file. The format is decided from the file extension:
.txt - text file
- Cells will be separated by tabulators (tabs, \t)
- Rows will be separated by new lines (\n)
- Tables will be separated by two blank lines.
.csv - CSV file
- The file will be IETF RFC 4180 compliant.
- Tables will be separated by two blank lines.
.xls, .xlsx - MS Excel. The supported formats are Microsoft Excel 97/2000/XP (.xls) and MS Excel 2007 XML documents (.xlsx).
- There will be one sheet per table. Each sheet will be named after the table number ("1", "2", ...)
- The table will start from the top left sheet corner (the "A1" cell). It will be bordered by a solid black line.
- The cells will contain the recognized text provided that OCR has been performed. No other cell attributes will be applied to the Excel table (color, text alignment, font size/style...). Should you need any extra functionality in this area, ask T-Plan support.
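For readers who post-process the .csv output, the quoting rules of RFC 4180 are worth keeping in mind: a field containing a comma, a double quote or a line break is enclosed in double quotes, and any embedded double quote is doubled. The Java sketch below shows these rules on their own. It is not the plugin's exporter, and the class name CsvSketch is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of RFC 4180 field quoting (not the plugin's exporter).
final class CsvSketch {

    // Quotes a field only when it contains a comma, a double quote or a line break.
    static String field(String value) {
        boolean quote = value.contains(",") || value.contains("\"")
                || value.contains("\r") || value.contains("\n");
        return quote ? "\"" + value.replace("\"", "\"\"") + "\"" : value;
    }

    // Joins the cells of one row with commas and ends the record with CRLF.
    static String record(List<String> cells) {
        return cells.stream().map(CsvSketch::field).collect(Collectors.joining(",")) + "\r\n";
    }

    public static void main(String[] args) {
        // prints: Item,"1,5","say ""hi""" (followed by CRLF)
        System.out.print(record(List.of("Item", "1,5", "say \"hi\"")));
    }
}
```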
If the file has no or unrecognized extension the script will append .xls to the file name and it will default to Microsoft Excel 97/2000/XP. To ensure that the file contains all cell texts call the GetTables script with the OCR on (tocr=true). A typical sequence of commands to recognize table(s) and store them to a MS Excel file is:
// Recognize tables and perform OCR to retrieve text of all cells
Run "com.tplan.table.GetTables" tocr="true"
if ({_EXIT_CODE} > 0) {
Exit 1 desc="No tables were found."
}
// Export the tables to MS Excel
Run "com.tplan.table.ExportTable" file="C:\test.xlsx"
SYNOPSIS (TPR SCRIPTS)
Run com.tplan.table.ExportTable file=<target_file> [tables=<number(s)>]
SYNOPSIS (JAVA SCRIPTS)
run("com.tplan.table.ExportTable", "file", "<target_file>" [, "tables", "<number(s)>"]);
* The file parameter is obligatory.
OPTIONS
file=<target_file>
- The target file to save the tables to (mandatory). The format will be judged from the extension (.txt, .csv, .xls or .xlsx). If the file has no recognized extension the script will default to the Microsoft Excel 97/2000/XP (.xls) format and it will append the .xls extension to the file name. The specified file may be absolute or relative. Relative paths will be resolved against the user home folder.
tables=<number(s)>
- The table number or a semicolon or comma separated list of table numbers (optional). When omitted the script exports all tables recognized by the most recent GetTables call.
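The extension rule for the file parameter can be summarized in a few lines of Java. This is a sketch of the documented behavior only: the class name ExportTargetSketch is hypothetical, and treating the extension as case insensitive is an assumption the text does not confirm.

```java
import java.util.Locale;

// Hypothetical illustration of the documented extension rule (not the plugin's code).
final class ExportTargetSketch {

    // Returns the file name the export would write to: names with a recognized
    // extension are kept, anything else gets ".xls" appended.
    static String resolve(String file) {
        // ASSUMPTION: the extension check ignores letter case
        String lower = file.toLowerCase(Locale.ROOT);
        boolean recognized = lower.endsWith(".txt") || lower.endsWith(".csv")
                || lower.endsWith(".xls") || lower.endsWith(".xlsx");
        return recognized ? file : file + ".xls";
    }

    public static void main(String[] args) {
        System.out.println(resolve("report.csv"));  // prints report.csv
        System.out.println(resolve("report"));      // prints report.xls
        System.out.println(resolve("report.pdf"));  // prints report.pdf.xls
    }
}
```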
RETURNS
The command returns 0 (zero) on success. It returns 1 (one) when it fails for an I/O error or if the table list is invalid. The failure is also logged into the execution log.
EXAMPLES
See the Usage chapter.
4. Change Log
Version 0.8 released on 26 April 2024
- V0.7 used exact matching of start and end coordinates of horizontal and vertical lines. This failed to detect table cells which have a pixel of a different color in one or more cell corners. The matching was changed to a tolerant one (+- 1px).
- Tesseract OCR failed for a NullPointerException when performed from the TableKit viewer. This was due to a change in the Tesseract OCR module on the Robot side. The TableKit code was updated to become backward compatible with all recent Robot versions.
- Export to Excel was fixed. It was failing because it relied on the legacy POI libs we shipped last with Robot 5.x.
- The TableKit viewer has been redesigned. Parameters were moved from the tool bar to a panel at the bottom left. The "tolerance" parameter has been added. There are two new buttons at the tool bar, "Zoom In" and "Zoom Out".
- A new script called FindCell has been introduced.
- The GetCell parameters of "row", "table" and "column" were redefined as STRING types to support variable calls where the variables used don't exist at the time of compilation.
Version 0.7 released on 14 June 2018
- Table recognition logic updated to improve accuracy where one or more columns fail to be recognized due to custom content.
- Minor viewer enhancements
Version 0.6 released on 15 November 2016
- New version for Robot 4.1.1 and higher (the 0.5 version fails to open the viewer for the Local Desktop connection due to a change in the Robot framework).
Version 0.5 released on 19 June 2014
- First public release with the initial set of GetTables, GetCell and ExportTable scripts.