Okapi Framework
Format Specifications
IMPORTANT NOTICE - Apr-30-2009:
This .NET implementation of Okapi is no longer actively developed.
Instead, a NEW JAVA IMPLEMENTATION IS
AVAILABLE and is being actively developed.
The material on this Web site is kept for archival purposes. Some applications of the old .NET implementation
(e.g. Olifant) will be maintained to some degree until they have a replacement in the Java project.
Industry Standards
The localization industry uses several standards to exchange data among
various tools. The applications and components of the Okapi Framework support
these formats to the extent possible. Note that the formats are at various
states of standardization. They include:
- XLIFF - XML Localisation Interchange File Format
Maintained by the XLIFF Technical Committee at OASIS, XLIFF provides a common
markup language for extracted localizable text.
Status (as of Jun-2007): v 1.2 Committee Recommendation, many consistent
implementations exist.
- XLIFF Specification
- TMX - Translation Memory eXchange
Maintained by the OSCAR Committee at LISA, TMX covers the exchange of
translation memory data.
Status (as of Jun-2007): v 1.4b LISA Standard. Many implementations, some
consistent.
- TMX Specification
- SRX - Segmentation Rules eXchange
Maintained by the OSCAR Committee at LISA, SRX addresses the exchange of
segmentation rules among tools.
Status (as of Jun-2007): v 1.0 LISA Standard. Few implementations, and they are
not consistent with each other. The upcoming v 2.0 is intended to resolve this.
- SRX Specification
- ITS - Internationalization Tag Set
A W3C namespace to provide internationalization information and support
in XML documents.
Status (as of Jun-2007): W3C Recommendation.
- ITS Specification
- TBX - TermBase eXchange
Maintained by the OSCAR Committee at LISA, TBX is a format to exchange
terminological data. It is based on MARTIF.
Status (as of Jun-2007): Not clear. Very few implementations.
- TBX Specification
- GMX-V - GIM (Global Information Management) Metrics eXchange - Volume
Maintained by the OSCAR Committee at LISA, GMX aims at storing metrics in XML
documents (for example, word-counts).
Status (as of Jun-2007): v 1.0 LISA Standard. Very few implementations.
- GMX Specification
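As an illustration of what "extracted localizable text" looks like in one of the formats above, here is a minimal sketch that reads the translation units of an XLIFF 1.2 document using Python's standard library. It is not part of Okapi: the sample document and the `read_trans_units` helper are hypothetical and made up for this example, and inline codes inside segments are simply flattened to their text.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.txt" source-language="en" target-language="fr"
        datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_trans_units(xliff_text):
    """Return (id, source, target) tuples; target is None when not translated."""
    root = ET.fromstring(xliff_text)
    units = []
    for tu in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = tu.find("x:source", XLIFF_NS)
        target = tu.find("x:target", XLIFF_NS)
        # itertext() keeps only the text and drops any inline code elements.
        source_text = "".join(source.itertext()) if source is not None else ""
        target_text = "".join(target.itertext()) if target is not None else None
        units.append((tu.get("id"), source_text, target_text))
    return units


if __name__ == "__main__":
    for unit_id, src, tgt in read_trans_units(SAMPLE):
        print(unit_id, repr(src), repr(tgt))
```

A real tool would also need to preserve inline codes, notes, and state attributes; this sketch only shows the basic source/target pairing that the format is built around.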
Okapi Formats
Several file formats are specific to Okapi's components and applications.
- Language List
The Language List file is used to share information related to language identification
across Okapi components and applications. For
example: RFC-3066 codes, language names, Windows LCIDs, etc.
- [Specification not yet available]
- Encoding List
The Encoding List file is used to share information related to encodings
across Okapi components and applications. For example:
Encoding names, codepage numbers, encoding family, etc.
- [Specification not yet available]