Okapi Components - Table Filter

- Overview
- Filter Properties
- Processing Details
- Parameters - Options Tab
- Parameters - Columns Tab
- Parameters - Inline Codes Tab

Overview

The Table Filter is an Okapi component that implements the Okapi Filter Interface for table-type files where the translatable text lies in specific columns. The files processed with this filter must be text-based.

Filter Properties

The properties for the Table Filter are the following:

Property	This Filter
INPUTFILE	Yes
INPUTSTRING	No
BILINGUALINPUT	Yes
TEXTBASED	Yes
OUTPUTFILE	Yes
OUTPUTSTRING	No
ANCILLARYOUTPUT	No
XMLOUTPUT	No
RTFOUTPUT	Yes
USEKEY	No
ISINDEMOMODE	No

Processing Details

Input Encoding

The filter uses the input encoding defined by the user. Unicode files with a Byte-Order-Mark are detected automatically, and the proper encoding is used regardless of the user-defined encoding.
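The rule above can be illustrated with a minimal Python sketch. The function names `pick_encoding` and `decode_input` are hypothetical and are not part of the Okapi API; the sketch only assumes that a Byte-Order-Mark, when present, overrides the user-defined encoding.

```python
import codecs
from typing import Tuple

# Checked in this order so that the UTF-32 LE mark (which begins with the
# same two bytes as the UTF-16 LE mark) is recognized first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def pick_encoding(raw: bytes, user_encoding: str) -> Tuple[str, int]:
    """Return (encoding, bom_length); a BOM overrides the user's choice."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return user_encoding, 0


def decode_input(raw: bytes, user_encoding: str) -> str:
    """Decode the file content, skipping the BOM if there is one."""
    encoding, skip = pick_encoding(raw, user_encoding)
    return raw[skip:].decode(encoding)
```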

Output Encoding

The output file is generated in the encoding defined by the user.

Line-Breaks

The output uses the same type of line-break as the input. Embedded line-breaks are supported for fields with delimiters.
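As an illustration only, preserving the line-break type could look like the Python sketch below. The helpers `detect_line_break` and `write_rows` are hypothetical names, and the sketch assumes a file without embedded line-breaks and with a single line-break type throughout.

```python
def detect_line_break(text, default="\n"):
    """Return the first line-break sequence found in text."""
    for i, ch in enumerate(text):
        if ch == "\r":
            # A carriage return followed by a line feed is one Windows break.
            return "\r\n" if text[i + 1:i + 2] == "\n" else "\r"
        if ch == "\n":
            return "\n"
    return default


def write_rows(lines, line_break):
    """Join the rows again using the line-break type of the input."""
    return "".join(line + line_break for line in lines)
```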

White Spaces

If both the Columns are separated by a special character option and the Ignore white-spaces after separators option are set, the white-spaces that follow a separator are not written to the output file.

Parameters - Options Tab

Columns are separated by a special character -- Select this option for tables made of fields separated by a delimiter.

Column separator -- Enter the character that serves as separator between columns. For the tab character or any other control character use the form \xHHHH where HHHH is the Unicode value of the character. For example use \x0009 for the tabulation character.
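The \xHHHH notation can be decoded as in the following Python sketch. The function name `decode_separator` is hypothetical; the sketch assumes exactly four hexadecimal digits after `\x`, as described above, and leaves any other text unchanged.

```python
import re

# Matches \x followed by exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{4})")


def decode_separator(entry: str) -> str:
    """Replace each \\xHHHH form with the character of that Unicode value."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), entry)
```

For example, `decode_separator(r"\x0009")` returns the tab character, while `decode_separator(";")` returns the semi-colon unchanged.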

Ignore white-spaces after separators -- Set this option to exclude from the fields any white-spaces that follow a separator.

Text Qualifier -- Enter the character that starts and ends text fields. If none of the text fields of your table has text delimiters, leave this entry empty.
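To make the interaction of these options concrete, here is a rough Python sketch that reads a separated table with the standard `csv` module. It is not the filter's implementation: `read_rows` and its parameter names are hypothetical, and `skipinitialspace` only skips space characters, which may be narrower than what the filter treats as white-space.

```python
import csv
import io


def read_rows(text, separator="\t", qualifier='"', ignore_ws_after_sep=False):
    """Split a delimited table into rows of fields.

    An empty qualifier means the table has no text delimiters.
    """
    reader = csv.reader(
        io.StringIO(text, newline=""),  # keep the original line-breaks
        delimiter=separator,
        quotechar=qualifier if qualifier else None,
        quoting=csv.QUOTE_MINIMAL if qualifier else csv.QUOTE_NONE,
        skipinitialspace=ignore_ws_after_sep,
    )
    return [row for row in reader]
```

A qualified field may contain the separator or an embedded line-break: `read_rows('1,"line one\nline two"\n', separator=",")` yields a single row whose second field holds both lines.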

Each column has a fixed width -- Select this option for tables made of fixed-width fields. You must declare the width of each column. Note that fixed-width tables are not expected to have text qualifiers.

List of widths -- Enter the width of each column (using Unicode characters as the unit), in the order of the columns. The values must be separated by at least one space, comma, or semi-colon.
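A minimal Python sketch of how such a list of widths could be parsed and applied to one line is shown below. The names `parse_widths` and `split_fixed` are hypothetical, and the sketch assumes that one Python string position corresponds to one unit of width.

```python
import re


def parse_widths(entry):
    """Parse widths separated by one or more spaces, commas, or semi-colons."""
    return [int(tok) for tok in re.split(r"[ ,;]+", entry.strip()) if tok]


def split_fixed(line, widths):
    """Cut a line into fields, taking the widths in column order."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width])
        pos += width
    return fields
```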

Row where the data start -- Enter the row number where the data of the table start. For example, if the first line of the file is made of the titles of the columns, enter 2 as the actual data start at the second line. If the data start at the top of the file, enter 1. Empty lines are not counted as rows.
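The row-counting rule can be sketched in Python as follows. The function name `split_header_and_data` is hypothetical, and the sketch assumes that an empty line is a line with no characters at all.

```python
def split_header_and_data(lines, start_row):
    """Separate the rows before start_row (1-based) from the data rows.

    Empty lines are dropped first, so they are never counted as rows.
    """
    rows = [line for line in lines if line != ""]
    return rows[:start_row - 1], rows[start_row - 1:]
```

With the lines `["", "ID\tText", "", "1\tHello"]` and a start row of 2, the title line becomes the only header row and `"1\tHello"` is the first data row.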

Use localization directives when they are present -- Set this option to enable the filter to recognize localization directives. If this option is not set, any localization directive in the input file will be ignored.

Extract items outside the scope of localization directives -- Set this option to extract any translatable item that is not within the scope of a localization directive. Choosing whether or not to extract items outside the scope of the directives lets you mark up fewer parts of the source document. This option is enabled only when the Use localization directives when they are present option is set.

In this filter, localization directives are expected to be in the Comment columns (if one is associated with a translatable column).

Parameters - Columns Tab

Extract the following columns -- Select this option to extract only a set of specified columns.

Extract all columns with a text qualifier -- Select this option to extract all columns that have a text qualifier. The character used as the text qualifier is defined in the Options Tab.

Add -- Click this button to add a new column definition.

Remove -- Click this button to remove the column definition currently selected.

Source -- Enter the index or the name of the column for which you are specifying properties. The column contains the source text. The column numbers start at 1 (not 0).

Identifier -- Enter the index or the name of the column containing the ID associated with the current source column. The first column is 1. Enter 0 if no identifier is associated with the current source column. In addition, you can set a suffix to add at the end of the ID value. For tables that have only one ID per row, this allows you to have a unique ID for each field of a row. If a row does not have the column selected for the ID, no ID is provided.
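As a sketch of the suffix mechanism, the Python function below builds the ID for one source column of a row. The name `unit_id` and the sample values are hypothetical; the sketch assumes a 1-based column index and returns `None` when no ID can be provided.

```python
def unit_id(row, id_column, suffix=""):
    """Return the ID for a source field, or None when no ID is available.

    id_column is 1-based; 0 means no ID column is associated.
    """
    if id_column == 0 or id_column > len(row):
        return None
    return row[id_column - 1] + suffix
```

With the row `["btn42", "OK", "Confirm the action"]`, two source columns sharing ID column 1 can be given the suffixes `"_label"` and `"_tip"`, which produces the distinct IDs `btn42_label` and `btn42_tip`.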

Target -- Enter the index or the name of the column containing the translation corresponding to the current source column (or where the translation should go). The first column is 1. Enter 0 if no translation is associated with the current source column.

Comment -- Enter the index or the name of the column containing the comments associated with the current source column. The first column is 1. Enter 0 if no comments are associated with the current source column. This is also the column where you can set localization directives. If a row does not have the column selected for comments, no comment is processed.

Trim trailing white-spaces -- Set this option to remove any trailing white-spaces from the text of the field.

About column index and column names

You can specify a column using either its index or its name (if names are available).

Indexes are always available. The first column is index 1 (not 0). Use 0 to specify that a property is not associated with any column.
Column names are available when the option Row where the data start is set to a value higher than 1. The names are always taken from the first row of the table (note that empty lines are not counted as rows). Names are not case-sensitive.
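The two ways of specifying a column can be resolved as in this Python sketch. The function name `resolve_column` is hypothetical; the sketch assumes that an entry made only of digits is an index, and that any other entry is a name looked up in the first row without regard to case.

```python
def resolve_column(spec, names=None):
    """Return the 1-based index for a column given by index or by name.

    names is the list of column names from the first row, or None when
    the data start at row 1 and no names are available.
    """
    spec = spec.strip()
    if spec.isdecimal():
        return int(spec)  # 0 means "not associated with any column"
    if names is None:
        raise ValueError("column names are not available")
    lowered = [name.strip().lower() for name in names]
    try:
        return lowered.index(spec.lower()) + 1
    except ValueError:
        raise ValueError(f"unknown column name: {spec!r}") from None
```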

Parameters - Inline Codes Tab

Mark as inline codes the text parts matching this regular expression -- Set this option to apply the specified regular expression to the text of the extracted items. Any match is converted to an inline code. By default the expression is:

((%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn])
|((\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v)

This matches the C-style printf variables (e.g. "%s", "%2.3f", "%04X", "%1$d"), the escaped sequences "\r\n", "\a", "\b", "\f", "\n", "\r", "\t", and "\v", as well as the .NET String.Format patterns (e.g. "{0}").
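The Python sketch below shows how the printf and escaped-sequence alternatives of the expression behave on a sample string. It is an illustration, not the filter's code: `find_inline_codes` is a hypothetical name, the two alternatives are copied from the expression as shown above and recombined with balanced parentheses, and no alternative for the .NET patterns is included because none appears in the expression as shown.

```python
import re

# C-style printf variables, copied from the first alternative above.
PRINTF = r"%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn]"
# Escaped sequences written literally in the text (a backslash and a letter).
ESCAPES = r"(\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v"

INLINE_CODES = re.compile("(" + PRINTF + ")|(" + ESCAPES + ")")


def find_inline_codes(text):
    """Return the parts of text that would become inline codes."""
    return [match.group(0) for match in INLINE_CODES.finditer(text)]
```

Note that the sample text must contain the two characters backslash and "n", not an actual line-break, for the escaped sequence to match.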

Edit Expression -- Click this button to edit the regular expression and its options.

See the Regular Expressions section for more information about the syntax and rules for building regular expression patterns.