  • Steve Crook 28 posts 160 karma points
    Jul 06, 2021 @ 02:39

    Overriding default mappers

    I've got some content in RTEs (both in grids and not) which has some span tags in it for formatting. These tags are causing the content to be fragmented when exported to XLIFF, which is causing problems for the translators who would much prefer to have whole sentences to translate.

    The formatting isn't necessary for any of the translations, so I'd like to ignore it on export. I'm trying to write a value mapper to do this, which I think should be used instead of the default Inline Tag Mapper. I've set the Editors like this:

    public override string[] Editors
    {
        get
        {
            return new string[] { "Umbraco.TinyMCEv3.Tag", "Umbraco.TinyMCEv3.Image" };
        }
    }
    

    But my GetSourceValue method never gets called when exporting. How can I set my value mapper to be used instead of the default one?

  • Kevin Jump 2342 posts 14889 karma points MVP 8x c-trib
    Jul 06, 2021 @ 08:02

    Hi Steve,

    I've never done it, but the value mappers are collections, so I believe you can exclude the mapper you don't want:

    composition.WithCollectionBuilder<ValueMapperCollectionBuilder>()
          .Exclude<InlineTagMapper>();
    

    (all IValueMappers are loaded so yours should already be there).

    However, I don't think this is what you want to do (but I am not sure how you can do it).

    The splitting of the text is done by the XmlSerializer. The value mappers get the text out of Umbraco, but if you look at the values in a job you will see they are not split in the UI. The mapper does split out tag title/alt into its own values, but only because most translation tools prefer these to be separate.

    The XmlSerializer does the HTML splitting (by tag), and it does it based on the Split Html setting in the connector options. With that setting off, nothing is split (so you do end up with lots of HTML in the text).

    What to split on isn't really configurable at the moment, so it's split or not :( . But we could look at this. Do you have an example of the text you want to split and what you expect it to look like in the XLIFF?

    You could use a value mapper to remove the bits of HTML before they get to the XLIFF serializer, but you would have to code some way of putting them back into the translated text. I'm not sure how that might work, given the text will be translated.

  • Steve Crook 28 posts 160 karma points
    Jul 07, 2021 @ 01:36

    Hi Kevin,

    Thanks for replying, that all makes sense.

    This is some of the HTML I've got:

    <h2>Run your shipments <span class="underline">smoothly</span> from port to port.</h2>
    

    It's exported as:

    <group id="u1219-1-g" name="h2">
      <unit id="u1219-1-1" name="#text">
        <segment>
          <source>Run your shipments </source>
        </segment>
      </unit>
      <unit id="u1219-1-2" name="span">
        <mda:metadata>
          <mda:metaGroup id="span_attributes">
            <mda:meta type="class">underline</mda:meta>
          </mda:metaGroup>
        </mda:metadata>
        <segment>
          <source>smoothly</source>
        </segment>
      </unit>
      <unit id="u1219-1-3" name="#text">
        <segment>
          <source> from port to port.</source>
        </segment>
      </unit>
    </group>
    

    I'd like all spans with class=underline to be ignored, so it would be exported as:

    <group id="u1219-1-g" name="h2">
      <unit id="u1219-1-1" name="#text">
        <segment>
          <source>Run your shipments smoothly from port to port.</source>
        </segment>
      </unit>
    </group>
    

    I've managed to exclude the default mapper, although I think that replacing the HtmlDocumentMapper and removing all the spans from the whole block might be easier than trying to do it with a new InlineTagMapper. Not sure if that's the "right" way, but it seems like it should work?
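
    A minimal sketch of the span-stripping step itself, assuming it would run on the raw RTE HTML before export. UnderlineSpanStripper and Strip are hypothetical names, not part of Translation Manager or Umbraco, and wiring this into a value mapper is not shown. It only handles spans whose sole attribute is class="underline" and that contain no nested tags, and it is one-way: nothing here puts the formatting back into the translated text.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not a Translation Manager API.
public static class UnderlineSpanStripper
{
    // Matches <span class="underline">text</span> where the span content
    // has no nested tags. Spans with other or additional attributes are left alone.
    private static readonly Regex UnderlineSpan = new Regex(
        @"<span\s+class=""underline""\s*>([^<]*)</span>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the HTML with matching spans replaced by their inner text.
    public static string Strip(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        return UnderlineSpan.Replace(html, "$1");
    }
}
```

    Called on the h2 above, Strip returns the heading with the span tags removed and the text intact.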

  • Kevin Jump 2342 posts 14889 karma points MVP 8x c-trib
    Jul 07, 2021 @ 12:32

    Hi Steve,

    TL;DR: I think we have made this possible for you in the latest release of Translation Manager, with no custom code.

    More detail:

    XLIFF does have the functionality to do this for well-known tag types. For example, with the underline tag:

    <h2>Run your shipments <u>smoothly</u> from port to port.</h2>
    

    the xliff will be:

    <unit id="u3-1" name="h2">
      <originalData>
        <data id="d1">&lt;u&gt;</data>
        <data id="d2">&lt;/u&gt;</data>
      </originalData>
      <segment>
        <source>Run your shipments <pc dataRefEnd="d2" dataRefStart="d1" id="1" subType="xlf:u" type="fmt">smoothly</pc> from port to port.</source>
      </segment>
    </unit>
    

    In a translation tool (such as SDL Trados), the translator sees this as shown below:

    [Image: screenshot of the segment in SDL Trados]

    This lets the translator see the emphasis, and it maintains the formatting of the translation on the way back.

    Spans are a little bit harder because they are generic and can be anything (from dividers to inline elements). Based on a lot of feedback, we actually changed from not splitting on spans to splitting on spans for this very reason. However, it's still not ideal for everyone.

    If we just removed the spans then you would lose the formatting on the returned translation, so there would be no underline coming back in. It's not possible to put it back, because what happens if the underlined word is in a different place in the translation, or it splits into two words, for example?

    The possible solution is for us to stop splitting on the spans and treat them as inline again, which would result in the following XLIFF:

    <unit id="u3-1" name="h2">
      <originalData>
        <data id="d1">&lt;span class="underline"&gt;</data>
        <data id="d2">&lt;/span&gt;</data>
      </originalData>
      <segment>
        <source>Run your shipments <pc dataRefEnd="d2" dataRefStart="d1" id="1" type="fmt">smoothly</pc> from port to port.</source>
      </segment>
    </unit>
    

    Again, in SDL Trados this shows:

    [Image: screenshot of the segment in SDL Trados]

    This way the translators can see what is in the span, and the formatting is preserved.

    The only real downside is that if you treat spans as inline, it will do this for all spans in your HTML, and there might be places where you actually want to split on the spans and that will no longer happen (it will very much depend on the implementation/site code).

    So what we've done for the latest Translation Manager release is add an option for the XLIFF connector to specify additional 'inline' codes that you want Translation Manager to treat as inline on your site.

    In this example you would add span to the XLIFF provider section of the translations.config file:

     <add key="inlineCodes" value="span" />
    

    (comma delimited - if you want to add more)

  • Steve Crook 28 posts 160 karma points
    Jul 08, 2021 @ 06:20

    Hi Kevin,

    Thanks for the new version. I'm just checking with our translators that it will work with their software, but I think it should. Splitting on all spans shouldn't be a problem for our content.
