Inline styles in rich text editor lost on translation with xliff provider
The translation is currently stripping out inline styles added via the RTE. See the debug output below for an example of the problem.
The original value has the inline styles, but they are stripped out in the translated version. See the xliff file I uploaded below.
I suspect the problem is with the generated xliff file, as it doesn't include the inline styles in there. Seems like it just needs to add the inline styles to the originalData field to fix that though.
Hi
At the moment the serializers that are producing the xliff content are going for a fairly clean xliff file if possible (we've had lots of random files being produced by translators when we add everything).
If you want to preserve the HTML in an RTE as best as possible, it might be better to turn off the split option for the xliff provider.
This will have the effect of not splitting the HTML up into the individual elements, so the translator gets the RTE content as one field, but all the styling and any other HTML markup you may have will be preserved.
That does preserve the inline styles, though like you said it's not as clean. Unfortunately I can't easily make the switch, as I already have some translations in the split HTML format. It also introduces potential issues with translation agencies, depending on how they process the xliff file. Would it be possible to add support for including additional attributes in the split HTML version? I know you are concerned about bloat, but it's a feature that's needed for the RTE to work 100% of the time, especially since this setting is currently true by default.
Perhaps it can be handled like TinyMceConfig.config does it and just have a whitelist of allowed attributes in Translations.config? This way you can minimize bloat in the xliff file, but still support various attributes when needed.
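To make the whitelist idea concrete, here is a minimal sketch in Python (not the package's own code or language). It keeps only whitelisted attributes on each element and drops the rest. The names `ALLOWED_ATTRIBUTES`, `AttributeWhitelistFilter` and `keep_whitelisted_attributes` are made up for illustration, and how such a list would be read from Translations.config is an assumption not shown here.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; in the suggestion above it would come from Translations.config.
ALLOWED_ATTRIBUTES = {"style", "class"}


class AttributeWhitelistFilter(HTMLParser):
    """Re-emits HTML, keeping only whitelisted attributes on each tag."""

    def __init__(self, allowed):
        # convert_charrefs=False so entities such as &amp; pass through unchanged.
        super().__init__(convert_charrefs=False)
        self.allowed = {name.lower() for name in allowed}
        self.parts = []

    def _render_attrs(self, attrs):
        # The parser lowercases attribute names and unescapes values,
        # so values are escaped again when written back out.
        out = []
        for name, value in attrs:
            if name not in self.allowed:
                continue
            if value is None:
                out.append(f" {name}")
            else:
                out.append(f' {name}="{escape(value, quote=True)}"')
        return "".join(out)

    def handle_starttag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._render_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._render_attrs(attrs)} />")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def keep_whitelisted_attributes(html, allowed=ALLOWED_ATTRIBUTES):
    """Return html with every non-whitelisted attribute removed."""
    parser = AttributeWhitelistFilter(allowed)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

With the default list, `keep_whitelisted_attributes('<h3 style="color: red" data-id="7">Title</h3>')` keeps the `style` attribute and drops `data-id`. Comments and doctype declarations are not re-emitted by this sketch.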
Yeah, we actually have something very similar that does work for things that are inside the HTML (so links, images, etc.):
https://our.umbraco.com/packages/backoffice-extensions/translation-manager/translation-manager-feedback/95623-is-it-possible-to-extract-the-title-tag-of-links-in-rte-when-translating#comment-302428
but it doesn't do it on the things that are also split out (so your h3 in the example is split, and the actual HTML node has gone).
The xliff spec definition of how originalData works would require it to then be wrapped in something to get the class, which would be inside the h3 tag when we then exported it. So it would need more work, and a good re-reading of the xliff spec to get our heads around whether it was doable.
We have been keeping it simple for the very reason that multiple translation agencies are indeed all very different in how they treat the XML - a surprising number actually just take the HTML, which I think is odd, given the translation tools are built for the splitting.
Also, as an aside, you can actually turn split on/off while other translation jobs are out for processing.
It only affects how the xliff is generated. The importing process will merge split items back together before Translation Manager gets near them.
(Translation companies might sometimes take the HTML version but return a split one, so we always have to run through the merge just in case.)
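As a rough illustration of why the split setting can change mid-job, here is a minimal Python sketch of an import step that accepts either shape and always merges. The `Segment` type and `merge_on_import` function are hypothetical and are not Translation Manager's real data model. The sketch assumes a split value arrives as a list of tag/text pairs and an unsplit value arrives as one HTML string.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Segment:
    """Hypothetical shape of one split item: the element name and its translated text."""
    tag: str
    text: str


def merge_on_import(value: Union[str, List[Segment]]) -> str:
    """Return a single HTML string whether the translation came back whole or split."""
    if isinstance(value, str):
        # Unsplit: the translator returned the RTE content as one field.
        return value
    # Split: wrap each segment in its element again and join them in order.
    return "".join(f"<{seg.tag}>{seg.text}</{seg.tag}>" for seg in value)
```

Because every import goes through the same merge, the content that reaches the rest of the pipeline has the same shape regardless of how the xliff was generated.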
So would this work to preserve the styles?
If so, that can be a workaround for me, though I would need to recreate the job instead of just reverting the approved job to the submitted state.
Hi
Just coming back to this one, there is actually a small update in v2.3.0 that might help with this.
On a split xliff file, if the element has custom attributes we will attempt to preserve them now as part of the split,
so
becomes