- Overview
- Types of ITS Rules
- ITS Markup
- How To...

UNDER CONSTRUCTION

Overview

This section provides an introduction and a quick reference to ITS, the W3C's Internationalization Tag Set as implemented by Okapi's XML Filter. For a complete and official description of ITS, see the W3C ITS specifications.

ITS is a set of elements and attributes you can use in any XML document to specify different internationalization-related aspects of your documents, for example: which elements should be translated, notes for the translators, how elements should affect sentence segmentation, etc.

Each type of ITS feature is called a "data category". Depending on its function, a tool implements one or more data categories. The ITS features currently supported by the Okapi XML Filter are the following:

Types of ITS Rules

There are two ways to apply ITS information to your document: global rules and local rules.

There are two ways to associate external rules with a document:

Note:

To some degree you can compare the way ITS rules are declared to CSS styles: Global rules in standalone files are the equivalent of a CSS file. Embedded global rules are the equivalent of the HTML <style> elements. And local rules are the equivalent of the HTML style attributes.

ITS rules are always applied in the following order:

A later rule overrides a previous one when both apply to the same parts of the document (in other words: the last rule always wins).
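The "last rule wins" behavior can be illustrated with a minimal sketch. The element name <code> and its type attribute are hypothetical, chosen only for this illustration; they are not part of ITS or of any Okapi format.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- First rule: no <code> element is translatable -->
 <its:translateRule selector="//code" translate="no"/>
 <!-- Later rule: overrides the first one, but only for the
      <code> elements that carry type="msg" -->
 <its:translateRule selector="//code[@type='msg']" translate="yes"/>
</its:rules>
```

With these two rules, a <code type="msg"> element is selected by both, and the second rule decides the outcome. Swapping the two rules would make every <code> element non-translatable.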

Example of external global rules:

<its:rules xmlns:its="http://www.w3.org/2005/11/its"
 version="1.0">
 <its:translateRule selector="//fexp" translate="no"/>
 <its:withinTextRule selector="//fexp|//strong" withinText="yes"/>
</its:rules>

Example of a document with embedded global rules (the <its:rules> element) and local rules (the its:translate attribute):

<book xmlns:its="http://www.w3.org/2005/11/its"
 its:version="1.0">
 <head>
  <its:rules>
   <its:translateRule ... />
  </its:rules>
  <title>The Life of a Simple Man</title>
 </head>
 <body>
  <p>Everything started when Zebulon discovered that he had
a <fexp>doppelgänger</fexp> who was a serious
baseball <fexp its:translate="yes">aficionado</fexp>.</p>
 </body>
</book>

ITS Markup

Translate

<translateRule>
  Required Attributes:
    selector: Absolute XPath expression that indicates which nodes are affected by the rule.
    translate: "yes" if the selected nodes should be translated, "no" if they should not. The property applies to the child elements of the selected nodes, but not to their attributes. By default, elements are translatable and attributes are not.
  Optional Attributes:
    None

For example, in the following XML document, the <docID> element is declared as not translatable, and the alt attribute is declared as translatable.

<myDoc>
 <head>
  <docID>ABC123-456-987</docID>
  <title>Title</title>
  <its:rules xmlns:its="http://www.w3.org/2005/11/its"
   version="1.0">
   <its:translateRule selector="//docID" translate="no"/>
   <its:translateRule selector="//@alt" translate="yes"/>
  </its:rules>
 </head>
 <body>
  <para>Look at this picture:
<img href="e1.png" alt="Elephants in the river"/></para>
 </body>
</myDoc>

Localization Note

TODO

 

Terminology

The Terminology data category is implemented, but no Okapi tool takes advantage of it for now.

Element Within Text

TODO

 

How To...

How to set an attribute as translatable

TODO

How to set ITS properties depending on context

TODO

How to deal with language conditions?

In some documents you may have the same text in different languages. Usually the language information in XML is specified with the xml:lang attribute. XPath provides a lang() function that lets you test the language of a given node. This function is case-insensitive and takes into account the inheritance of xml:lang.

So, for example, the expression selector="//seg[lang('en-us')]" will match the first <seg> in:

<item>
 <var xml:lang='en-US'>
  <seg>Text</seg>
 </var>
 <var xml:lang='fr'>
  <seg>Texte</seg>
 </var>
</item>
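A rule set built on such an expression might look like the following sketch. It assumes, purely for illustration, that only the English segments should be translated; the <seg> element is the one from the example above. In XPath 1.0, lang('en') also matches sublanguages such as en-US.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- Start by marking every <seg> as not translatable -->
 <its:translateRule selector="//seg" translate="no"/>
 <!-- Then re-enable the English ones: this later rule overrides
      the previous one for the nodes it selects -->
 <its:translateRule selector="//seg[lang('en')]" translate="yes"/>
</its:rules>
```

Applied to the document above, the first <seg> (under xml:lang='en-US') would stay translatable and the second (under xml:lang='fr') would not.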

How to work with namespaces?

ITS handles namespaces through prefixes in its XPath expressions. You can declare the namespace URIs and their prefixes anywhere, as long as the rule that uses a prefix is within the scope of its declaration.

For example, the following XML document is composed of elements that belong to two namespaces: the main one is "myDocumentNamespace", and there are also elements from the XHTML namespace ("html").

<myDoc xmlns="myDocumentNamespace"
 xmlns:h="html">
 <head>
  <docID>ABC123-456-987</docID>
 </head>
 <body>
  <para>This text is <h:b>bolded</h:b>, and this is
some <h:i>text in italics</h:i>.</para>
 </body>
</myDoc>

To specify that <docID> is not translatable and that <b> and <i> are inline codes, use XPath expressions with prefixes that are mapped to the proper namespaces:

<its:rules xmlns:its="http://www.w3.org/2005/11/its"
 xmlns:m="myDocumentNamespace"
 xmlns:h="html"
 version="1.0">
 <its:translateRule selector="//m:docID" translate="no"/>
 <its:withinTextRule selector="//h:*" withinText="yes"/>
</its:rules>
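Because a prefix only needs to be in scope for the rule that uses it, the declarations could also sit on the individual rule elements. The variant below is a sketch under that assumption, and it has not been checked against a specific version of the XML Filter. The prefix x is arbitrary: what matters is the namespace URI it maps to, not the prefix used in the document.

```xml
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="1.0">
 <!-- 'm' is declared on the rule itself, so it is in scope
      for this selector only -->
 <its:translateRule xmlns:m="myDocumentNamespace"
  selector="//m:docID" translate="no"/>
 <!-- 'x' maps to the same URI ("html") as the 'h' prefix
      used in the document -->
 <its:withinTextRule xmlns:x="html"
  selector="//x:b|//x:i" withinText="yes"/>
</its:rules>
```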