The CAT tool should be fully capable of handling the same classification system used by WorldServer, e.g. ICE Match, Repaired matches, etc.
We've dealt with this before, and my memory is that it's not really supported well in the Okapi XLIFF filter. You will need to check for markup specific to SDLXLIFF and also to the older Idiom XLIFF (iws:), if you want to support both.
There are two items of metadata which seem relevant: <iws:segment-metadata tm-score="76"></iws:segment-metadata> and <iws:status match-quality="fuzzy"></iws:status>.
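A rough sketch of pulling those two values out of a trans-unit, in case it helps the discussion. This is Python with the standard library only, not Ocelot or Okapi code. The namespace URI bound to iws: and the nesting of iws:status inside iws:segment-metadata are assumptions made for the sample; check both against a real WorldServer export. The function name read_iws_match is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the iws: prefix; verify against a real export.
IWS_NS = "http://www.idiominc.com/ws/asset"
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hand-written sample; the nesting of the iws: elements is an assumption.
SAMPLE = f"""<trans-unit xmlns="{XLIFF_NS}" xmlns:iws="{IWS_NS}" id="1">
  <source>Hello</source>
  <target>Bonjour</target>
  <iws:segment-metadata tm-score="76">
    <iws:status match-quality="fuzzy"/>
  </iws:segment-metadata>
</trans-unit>"""


def read_iws_match(trans_unit):
    """Return (tm_score, match_quality); either is None when absent."""
    ns = {"iws": IWS_NS}
    # Search at any depth so the result does not depend on the exact nesting.
    meta = trans_unit.find(".//iws:segment-metadata", ns)
    status = trans_unit.find(".//iws:status", ns)

    score = None
    if meta is not None and meta.get("tm-score") is not None:
        score = float(meta.get("tm-score"))

    quality = status.get("match-quality") if status is not None else None
    return score, quality


score, quality = read_iws_match(ET.fromstring(SAMPLE))
print(score, quality)
```

A trans-unit without any iws: markup simply yields (None, None), so the caller can fall back to other sources such as state-qualifier.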
Ocelot already reads and renders <target state-qualifier="fuzzy-match"></target> attributes via a configuration in the rules.properties file where the colour defined there is applied to the segment labels.
Is your requirement to be able to "see" the exact match score (e.g. 76%), or is just the category of the match enough (e.g. "fuzzy")? If just the category, is colouring the segment labels a satisfactory mechanism?
Phil, I would like translators to be able to:
1) If it's a fuzzy match see the percentage (e.g. 76%).
2) Follow the same color scheme as WorldServer (e.g. 100% Repaired Matches displayed by a striped blue line).
I think the repair status can be determined from looking at the iws:is-repaired-match value in the alt-trans for the specific match. The rest will require handling additional iws:status/@match-quality values such as guaranteed.
If we did that we could probably expand the segment labelling mechanism that Phil mentioned to get the color-coding. Those colors are already customizable, although supporting a dotted line (which I think WS does?) would require extra work.
The other issue is that as with everything related to the iws: metadata, this only works for XLIFF generated from files that were filtered with the legacy WS filters. Files filtered with the newer FTS filters will generate SDLXLIFF and use a different set of flags for things like ICE matches.
The best place to support this metadata is probably in the Okapi XLIFF filter, because otherwise Ocelot will need to use regular expressions to scrape the data out of the skeleton.
I like Phil's suggestion of starting from the state-qualifier color-coding functionality that we already have.
However, I propose that we take it a step further and turn it into a feature called "match quality". Ocelot will use various sources of information from XLIFF to assign each segment to one of several match categories:
No match
Fuzzy match
Exact match
Repaired exact match
Context match
ID match
MT match
Then, we can assign a segment to a category in various ways:
For generic XLIFF, via state-qualifier (when present)
For IWS XLIFF, based on the iws:status information
For SDLXLIFF, based on the sdl:seg information
Not all categories are possible for all types of XLIFF.
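To make the proposal concrete, here is a minimal sketch of the category assignment in Python (standard library only, not Ocelot code). The state-qualifier values are the ones defined by XLIFF 1.2. The iws:status/@match-quality mapping is an assumption: only "fuzzy" and "guaranteed" come up in this thread, "exact" is a hypothetical value added for illustration, and treating "guaranteed" as a context match is a guess. SDLXLIFF is left out because the sdl:seg attributes have not been pinned down here. The names MatchCategory and categorize are hypothetical.

```python
from enum import Enum


class MatchCategory(Enum):
    NONE = "No match"
    FUZZY = "Fuzzy match"
    EXACT = "Exact match"
    REPAIRED_EXACT = "Repaired exact match"
    CONTEXT = "Context match"
    ID = "ID match"
    MT = "MT match"


# state-qualifier values from XLIFF 1.2 that correspond to a match category.
STATE_QUALIFIER_MAP = {
    "exact-match": MatchCategory.EXACT,
    "fuzzy-match": MatchCategory.FUZZY,
    "id-match": MatchCategory.ID,
    "leveraged-mt": MatchCategory.MT,
    "mt-suggestion": MatchCategory.MT,
}

# Assumed mapping for iws:status/@match-quality (see the caveats above).
IWS_QUALITY_MAP = {
    "fuzzy": MatchCategory.FUZZY,
    "exact": MatchCategory.EXACT,         # hypothetical value
    "guaranteed": MatchCategory.CONTEXT,  # assumption: guaranteed ~ ICE match
}


def categorize(state_qualifier=None, iws_quality=None, iws_repaired=False):
    """Pick a category, preferring IWS metadata over the generic state-qualifier."""
    if iws_quality is not None:
        category = IWS_QUALITY_MAP.get(iws_quality, MatchCategory.NONE)
        # A repaired exact match gets its own category.
        if iws_repaired and category is MatchCategory.EXACT:
            return MatchCategory.REPAIRED_EXACT
        return category
    if state_qualifier is not None:
        return STATE_QUALIFIER_MAP.get(state_qualifier, MatchCategory.NONE)
    return MatchCategory.NONE


print(categorize(state_qualifier="fuzzy-match").value)
print(categorize(iws_quality="exact", iws_repaired=True).value)
```

Preferring the format-specific metadata over state-qualifier is a design choice in this sketch, on the theory that it carries more detail; the opposite order would work just as well if state-qualifier turns out to be more reliable.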
Depending on match quality, we will assign a highlight color based on configuration data, just like we do now with state-qualifier.
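As an illustration of the configuration-driven colouring, the sketch below (Python, standard library only) reads a colour per category from properties-style text and builds a segment label that also shows the percentage for fuzzy matches. The key names such as matchQuality.fuzzy.color are hypothetical and are not the actual syntax of Ocelot's rules.properties; the colours and the function names load_colors and segment_label are likewise made up for the example.

```python
# Hypothetical configuration text; the key names are illustrative only and
# do not reflect the real rules.properties syntax.
CONFIG = """
# category colours
matchQuality.fuzzy.color = #FFD700
matchQuality.exact.color = #32CD32
matchQuality.repaired.color = #1E90FF
"""


def load_colors(text):
    """Parse 'matchQuality.<category>.color = <value>' lines into a dict."""
    colors = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        parts = key.strip().split(".")
        if len(parts) == 3 and parts[0] == "matchQuality" and parts[2] == "color":
            colors[parts[1]] = value.strip()
    return colors


def segment_label(category, score, colors, default_color="#FFFFFF"):
    """Return (label text, colour); fuzzy matches also show their score."""
    text = category
    if category == "fuzzy" and score is not None:
        text = f"{category} {score:.0f}%"
    return text, colors.get(category, default_color)


colors = load_colors(CONFIG)
print(segment_label("fuzzy", 76, colors))
print(segment_label("context", None, colors))
```

Categories with no configured colour fall back to the default, which keeps the behaviour predictable for XLIFF flavours that cannot produce every category.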