LibreOffice plugin that pipes whole Writer documents through Google Translate and aims to keep most of the page formatting.

⌈⌋ ⎇ branch:  PageTranslate


Check-in [b1b7271886]


Overview
Comment: Notes about ArgosTranslate, CLI alternative
SHA1: b1b727188677578b51e7bcfbb2baea398f77a0ad
User & Date: mario 2021-06-10 14:52:08
Context
2021-06-10
14:53
Change exception names, and use `LangSelection` for dialog (makes error more understandable). Leaf check-in: c62b11cb0e user: mario tags: trunk
14:52
Notes about ArgosTranslate, CLI alternative check-in: b1b7271886 user: mario tags: trunk
14:50
Move MessageBox() to unocompat (not actually used anymore, doesn't work in LO-dev-7.2 anyway), sys.excepthook doesn't suffice for dialog hookup. Add config btn_map{} for external tools from settings dialog. check-in: 3f945d5495 user: mario tags: trunk
Changes

Changes to NEWS.

Old version (lines 1-9):
2.0 (unreleased)
 * Add SYSTRAN backend.

 * ...

1.9 (2021-06-02)
 * Add builtin PONS Text Translation backend (full text, automatic langdetect).
 * Add Google Translate Ajax API endpoint as alternative.
 * Allow behaviour configuration of secondary 🏴 flag button.
 * Menu shortcut to configuration dialog working.


New version (lines 1-10):
2.0 (unreleased)
 * Add SYSTRAN backend.
 * Introduce ArgosTranslate binding.
 * ...

1.9 (2021-06-02)
 * Add builtin PONS Text Translation backend (full text, automatic langdetect).
 * Add Google Translate Ajax API endpoint as alternative.
 * Allow behaviour configuration of secondary 🏴 flag button.
 * Menu shortcut to configuration dialog working.

Changes to help/en/vnd.include-once.pagetranslate/config.page.

Old version (lines 68-82 and 183-203):
      <p>ArgosTranslate is an offline translation library based on
      CTranslate2 and OpenNMT models. It's thus independent from online
      services and connections, but requires prior setup. Specifically
      you need to run <cmd>pip3 install argos-translate</cmd> and
      <cmd>argos-translate-gui</cmd> to download language packs beforehand.
      And this usually just works with LibreOffice installations provided
      through Linux distro package managers (due to the way bundled Python

      is configured). Notably this option might be slower for long documents,
      but provides fairly good results.
      </p>
    </item>
    <item>
      <title>DeepL API</title>
      <p>Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
      Pro API</link> to translate documents.  As of yet untested.  Requires
................................................................................
      translations.</p>
    </item>
    <item>
      <title>Command</title>
      <p>This field defines the CLI tool to use for translating. Placeholders
      can be noted with {text} curly braces, or shell $lang and %from% percent
      syntax. The Python
      <link href="https://pypi.org/project/translate/">translate</link> or
      <link href="https://pypi.org/project/deep-translator/">deep-translator</link>
      packages provide following CLI wrappers:</p>
      <terms>
        <item><p><cmd>translate-cli -o -f auto -t {lang} {text}</cmd></p></item>
        <item><p><cmd>deep_translator -trans "google" -src "auto" -tg {lang} -txt {text}</cmd></p></item>
      </terms>
    </item>
  </terms>
</section>

<section id="flags">
  <title>Options / Flags</title>
  <terms>







New version (lines 68-83 and 184-202):
      <p>ArgosTranslate is an offline translation library based on
      CTranslate2 and OpenNMT models. It's thus independent from online
      services and connections, but requires prior setup. Specifically
      you need to run <cmd>pip3 install argos-translate</cmd> and
      <cmd>argos-translate-gui</cmd> to download language packs beforehand.
      And this usually just works with LibreOffice installations provided
      through Linux distro package managers (due to the way bundled Python
      is configured). You can utilize the cmdline tool in any case however.
      Notably this backend might be slower for long documents,
      but provides fairly good results.
      </p>
    </item>
    <item>
      <title>DeepL API</title>
      <p>Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
      Pro API</link> to translate documents.  As of yet untested.  Requires
................................................................................
      translations.</p>
    </item>
    <item>
      <title>Command</title>
      <p>This field defines the CLI tool to use for translating. Placeholders
      can be noted with {text} curly braces, or shell $lang and %from% percent
      syntax. The Python
      <link href="https://pypi.org/project/translate/">translate</link>,
      <link href="https://pypi.org/project/deep-translator/">deep-translator</link> and
      <link href="https://pypi.org/project/argostranslate/">argos-translate</link>
      packages provide CLI wrappers. Each having a sample configuration in the combobox
      dropdown.</p>


    </item>
  </terms>
</section>

<section id="flags">
  <title>Options / Flags</title>
  <terms>
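
The placeholder syntax described for the Command field ({text} curly braces, shell $lang, and %from% percent style) could be expanded roughly as in the following Python sketch. The function name `fill_command`, the regular expression, and the shell quoting via `shlex.quote` are assumptions for illustration only; the extension's actual substitution code is not part of this check-in and may work differently.

```python
import re
import shlex

# Matches {name}, $name, or %name%.
# Hypothetical pattern, not taken from the extension source.
PLACEHOLDER = re.compile(r"\{(\w+)\}|\$(\w+)|%(\w+)%")

def fill_command(template, **values):
    """Substitute placeholders in a command template with shell-quoted values."""
    def replace(match):
        name = match.group(1) or match.group(2) or match.group(3)
        if name not in values:
            return match.group(0)  # leave unknown placeholders untouched
        return shlex.quote(values[name])
    return PLACEHOLDER.sub(replace, template)

cmd = fill_command(
    "translate-cli -o -f auto -t {lang} {text}",
    lang="de",
    text="Hello world",
)
print(cmd)  # translate-cli -o -f auto -t de 'Hello world'
```

Quoting each value keeps document text with spaces or shell metacharacters from being split or interpreted when the command line is handed to a shell.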

Changes to help/en/vnd.include-once.pagetranslate/config.xhp.

Old version (lines 3-17, 86-100 and 212-290):
   <meta>
      <topic id="topic_d1e3" indexer="include" status="PUBLISH">
         <title xml-lang="en" id="title_d1e3">Translation settings</title>
         <filename>/help/vnd.include-once.pagetranslate/config.xhp</filename>
      </topic>
      <history>
         <created date="2020-02-02T22:22:22"/>
         <lastedited date="2021-06-06T22:33:51.659+02:00"/>
      </history>
   </meta>
   <body>

      <bookmark id="bm_d1e7" branch="hid/vnd.include-once.pagetranslate:OptionsPageTranslate"
                xml-lang="en">
         <bookmark_value>PageTranslate settings</bookmark_value>
................................................................................
               <paragraph id="par_d1e118" role="paragraph" xml-lang="en">ArgosTranslate is an offline translation library based on
      CTranslate2 and OpenNMT models. It's thus independent from online
      services and connections, but requires prior setup. Specifically
      you need to run <item type="command">pip3 install argos-translate</item> and
      <item type="command">argos-translate-gui</item> to download language packs beforehand.
      And this usually just works with LibreOffice installations provided
      through Linux distro package managers (due to the way bundled Python

      is configured). Notably this option might be slower for long documents,
      but provides fairly good results.
      </paragraph>
            </listitem>
            <listitem id="item_d1e129" xml-lang="en">
               <emph>DeepL API</emph>
               <br/>
               <paragraph id="par_d1e134" role="paragraph" xml-lang="en">Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
................................................................................
            </listitem>
            <listitem id="item_d1e304" xml-lang="en">
               <emph>Command</emph>
               <br/>
               <paragraph id="par_d1e309" role="paragraph" xml-lang="en">This field defines the CLI tool to use for translating. Placeholders
      can be noted with {text} curly braces, or shell $lang and %from% percent
      syntax. The Python
      <link href="https://pypi.org/project/translate/">translate</link> or
      <link href="https://pypi.org/project/deep-translator/">deep-translator</link>
      packages provide following CLI wrappers:</paragraph>


            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e335" role="section" xml-lang="en">
         <paragraph id="hd_d1e337" role="heading" level="2" xml-lang="en">Options / Flags</paragraph>
         <list id="terms_d1e340" xml-lang="en">
            <listitem id="item_d1e342" xml-lang="en">
               <emph>❏ quick linebreak handling</emph>
               <br/>
               <paragraph id="par_d1e347" role="paragraph" xml-lang="en">Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply conjoins multiple lines temporarily with <item type="command">"/#§/"</item> in place of a
      linebreak (and then rejoins them), so there are less requests. Primarily helps with
      tables, but less for documents with lengthy paragraphs.</paragraph>
            </listitem>
            <listitem id="item_d1e354" xml-lang="en">
               <emph>❏ also iterate over TextFrames</emph>
               <br/>
               <paragraph id="par_d1e359" role="paragraph" xml-lang="en">Handles normal and floating TextFrames. Those are essentially subdocuments in a Writer page.
      But you probably don't need this option for standard office documents.</paragraph>
            </listitem>
            <listitem id="item_d1e363" xml-lang="en">
               <emph>❏ super slow mode</emph>
               <br/>
               <paragraph id="par_d1e368" role="paragraph" xml-lang="en">Iterates over paragraph segments, to keep more inline formatting - but seriously harms mid-sentence translations.
      And currently the formatting still bleeds into adjoining paragraph segments, so not very useful in practice yet.</paragraph>
            </listitem>
            <listitem id="item_d1e372" xml-lang="en">
               <emph>☑ debug mode</emph>
               <br/>
               <paragraph id="par_d1e377" role="paragraph" xml-lang="en">Will fill up the <item type="fileitem">/tmp/pagetranslate-libreoffice.txt</item> log file quicker.
      Currently the debug mode is enabled by default anyway.</paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e386" role="section" xml-lang="en">
         <paragraph id="hd_d1e388" role="heading" level="2" xml-lang="en">🏴 button default behaviour/target language</paragraph>
         <list id="terms_d1e391" xml-lang="en">
            <listitem id="item_d1e393" xml-lang="en">
               <emph>locale</emph>
               <br/>
               <paragraph id="par_d1e398" role="paragraph" xml-lang="en"> Per default uses the Office/system language as target. </paragraph>
            </listitem>
            <listitem id="item_d1e402" xml-lang="en">
               <emph>paragraph</emph>
               <br/>
               <paragraph id="par_d1e407" role="paragraph" xml-lang="en"> Uses the "paragraph" locale as set in the Writer/language status bar. </paragraph>
            </listitem>
            <listitem id="item_d1e411" xml-lang="en">
               <emph>select</emph>
               <br/>
               <paragraph id="par_d1e416" role="paragraph" xml-lang="en"> Always brings up the explicit From→To🗺  language selection popup (useful for MyMemory or Pons backends).</paragraph>
            </listitem>
            <listitem id="item_d1e420" xml-lang="en">
               <emph>en, de, it, fr, ...</emph>
               <br/>
               <paragraph id="par_d1e425" role="paragraph" xml-lang="en"> You can set this field to any two-letter language code - to be used as default target. </paragraph>
            </listitem>
            <listitem id="item_d1e429" xml-lang="en">
               <emph>mri-debug</emph>
               <br/>
               <paragraph id="par_d1e434" role="paragraph" xml-lang="en"> Requires the MRI extension, and brings up an introspection dialog on the document when invoked. </paragraph>
            </listitem>
         </list>
      </paragraph>

   </body>
</helpdocument>







New version (lines 3-17, 86-101 and 213-293):
   <meta>
      <topic id="topic_d1e3" indexer="include" status="PUBLISH">
         <title xml-lang="en" id="title_d1e3">Translation settings</title>
         <filename>/help/vnd.include-once.pagetranslate/config.xhp</filename>
      </topic>
      <history>
         <created date="2020-02-02T22:22:22"/>
         <lastedited date="2021-06-06T23:05:36.572+02:00"/>
      </history>
   </meta>
   <body>

      <bookmark id="bm_d1e7" branch="hid/vnd.include-once.pagetranslate:OptionsPageTranslate"
                xml-lang="en">
         <bookmark_value>PageTranslate settings</bookmark_value>
................................................................................
               <paragraph id="par_d1e118" role="paragraph" xml-lang="en">ArgosTranslate is an offline translation library based on
      CTranslate2 and OpenNMT models. It's thus independent from online
      services and connections, but requires prior setup. Specifically
      you need to run <item type="command">pip3 install argos-translate</item> and
      <item type="command">argos-translate-gui</item> to download language packs beforehand.
      And this usually just works with LibreOffice installations provided
      through Linux distro package managers (due to the way bundled Python
      is configured). You can utilize the cmdline tool in any case however.
      Notably this backend might be slower for long documents,
      but provides fairly good results.
      </paragraph>
            </listitem>
            <listitem id="item_d1e129" xml-lang="en">
               <emph>DeepL API</emph>
               <br/>
               <paragraph id="par_d1e134" role="paragraph" xml-lang="en">Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
................................................................................
            </listitem>
            <listitem id="item_d1e304" xml-lang="en">
               <emph>Command</emph>
               <br/>
               <paragraph id="par_d1e309" role="paragraph" xml-lang="en">This field defines the CLI tool to use for translating. Placeholders
      can be noted with {text} curly braces, or shell $lang and %from% percent
      syntax. The Python
      <link href="https://pypi.org/project/translate/">translate</link>,
      <link href="https://pypi.org/project/deep-translator/">deep-translator</link> and
      <link href="https://pypi.org/project/argostranslate/">argos-translate</link>
      packages provide CLI wrappers. Each having a sample configuration in the combobox
      dropdown.</paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e325" role="section" xml-lang="en">
         <paragraph id="hd_d1e327" role="heading" level="2" xml-lang="en">Options / Flags</paragraph>
         <list id="terms_d1e330" xml-lang="en">
            <listitem id="item_d1e332" xml-lang="en">
               <emph>❏ quick linebreak handling</emph>
               <br/>
               <paragraph id="par_d1e337" role="paragraph" xml-lang="en">Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply conjoins multiple lines temporarily with <item type="command">"/#§/"</item> in place of a
      linebreak (and then rejoins them), so there are less requests. Primarily helps with
      tables, but less for documents with lengthy paragraphs.</paragraph>
            </listitem>
            <listitem id="item_d1e344" xml-lang="en">
               <emph>❏ also iterate over TextFrames</emph>
               <br/>
               <paragraph id="par_d1e349" role="paragraph" xml-lang="en">Handles normal and floating TextFrames. Those are essentially subdocuments in a Writer page.
      But you probably don't need this option for standard office documents.</paragraph>
            </listitem>
            <listitem id="item_d1e353" xml-lang="en">
               <emph>❏ super slow mode</emph>
               <br/>
               <paragraph id="par_d1e358" role="paragraph" xml-lang="en">Iterates over paragraph segments, to keep more inline formatting - but seriously harms mid-sentence translations.
      And currently the formatting still bleeds into adjoining paragraph segments, so not very useful in practice yet.</paragraph>
            </listitem>
            <listitem id="item_d1e362" xml-lang="en">
               <emph>☑ debug mode</emph>
               <br/>
               <paragraph id="par_d1e367" role="paragraph" xml-lang="en">Will fill up the <item type="fileitem">/tmp/pagetranslate-libreoffice.txt</item> log file quicker.
      Currently the debug mode is enabled by default anyway.</paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e376" role="section" xml-lang="en">
         <paragraph id="hd_d1e378" role="heading" level="2" xml-lang="en">🏴 button default behaviour/target language</paragraph>
         <list id="terms_d1e381" xml-lang="en">
            <listitem id="item_d1e383" xml-lang="en">
               <emph>locale</emph>
               <br/>
               <paragraph id="par_d1e388" role="paragraph" xml-lang="en"> Per default uses the Office/system language as target. </paragraph>
            </listitem>
            <listitem id="item_d1e392" xml-lang="en">
               <emph>paragraph</emph>
               <br/>
               <paragraph id="par_d1e397" role="paragraph" xml-lang="en"> Uses the "paragraph" locale as set in the Writer/language status bar. </paragraph>
            </listitem>
            <listitem id="item_d1e401" xml-lang="en">
               <emph>select</emph>
               <br/>
               <paragraph id="par_d1e406" role="paragraph" xml-lang="en"> Always brings up the explicit From→To🗺  language selection popup (useful for MyMemory or Pons backends).</paragraph>
            </listitem>
            <listitem id="item_d1e410" xml-lang="en">
               <emph>en, de, it, fr, ...</emph>
               <br/>
               <paragraph id="par_d1e415" role="paragraph" xml-lang="en"> You can set this field to any two-letter language code - to be used as default target. </paragraph>
            </listitem>
            <listitem id="item_d1e419" xml-lang="en">
               <emph>mri-debug</emph>
               <br/>
               <paragraph id="par_d1e424" role="paragraph" xml-lang="en"> Requires the MRI extension, and brings up an introspection dialog on the document when invoked. </paragraph>
            </listitem>
         </list>
      </paragraph>

   </body>
</helpdocument>
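
The "quick linebreak handling" option described in the help text joins several lines with the "/#§/" marker, sends them as a single request, and splits the result again. A minimal Python sketch of that idea follows. The names `translate_lines` and `fake_translate` are hypothetical, and the sketch assumes the backend returns the marker unchanged, which a real translation service may not always do.

```python
SEPARATOR = "/#§/"  # marker quoted in the help text

def translate_lines(lines, translate):
    """Send all lines in one request, then split the result back into lines."""
    joined = SEPARATOR.join(lines)
    translated = translate(joined)  # a single backend call for all lines
    return [part.strip() for part in translated.split(SEPARATOR)]

def fake_translate(text):
    # Stand-in for a real backend call; it merely upper-cases the text.
    return text.upper()

print(translate_lines(["first cell", "second cell"], fake_translate))
# ['FIRST CELL', 'SECOND CELL']
```

With many short table cells this replaces one request per line with one request per batch, which is where the help text expects the speed-up.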

Changes to help/en/vnd.include-once.pagetranslate/errors.duck.

Old version (lines 62-75 and 95-108):

== uno.RuntimeException/AttributeError: ... object has no attribute 'getTypes'

Such errors often result from one of the translation backends
not returning a proper text string / translation. Please file a bug report
with your log, and use a different service for the time being.






== Options dialog is empty

This might be a development bug only. If the log shows nothing but the __init__
call, the extension usually requires removal and reinstall (per CLI *and*
GUI extension manager perhaps). Presumably there's some old schema version
in the OO registry, which prevents updated versions from taking effect.

................................................................................
for MyMemory is 'en' and not 'auto'.

== ImportError: No module named deep_translator

You'll have to install this Python package first, before using
some of the translation backends: `pip install deep-translator`.
Or use a larger OXT with bundled extensions.














== How to report a bug

[--
@link[seealso >>https://fossil.include-once.org/pagetranslate/] repository
--]
The project repository https://fossil.include-once.org/pagetranslate/







New version (lines 62-80 and 100-126):

== uno.RuntimeException/AttributeError: ... object has no attribute 'getTypes'

Such errors often result from one of the translation backends
not returning a proper text string / translation. Please file a bug report
with your log, and use a different service for the time being.

== AttributeError: 'LangSelection' object has no attribute 'to'

That's the shallow error message when the From→To language selection
was canceled by closing the dialog.

== Options dialog is empty

This might be a development bug only. If the log shows nothing but the __init__
call, the extension usually requires removal and reinstall (per CLI *and*
GUI extension manager perhaps). Presumably there's some old schema version
in the OO registry, which prevents updated versions from taking effect.

................................................................................
for MyMemory is 'en' and not 'auto'.

== ImportError: No module named deep_translator

You'll have to install this Python package first, before using
some of the translation backends: `pip install deep-translator`.
Or use a larger OXT with bundled extensions.

== ImportError: No module named argostranslate

The ArgosTranslate backend requires installing the according Python
package.

== IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE! / Original error was: No module named 'numpy.core._multiarray_umath'

This is a problem of the OpenNMT backend. It can't be used with the
LibreOffice-bundled Python. It's not compatible with the numpy binary
extension. This backend only works with a distro-supplied Office, which
ties to the system-installed Python. // Alternatively use the CLI tool
with "command line tool" as backend.

== How to report a bug

[--
@link[seealso >>https://fossil.include-once.org/pagetranslate/] repository
--]
The project repository https://fossil.include-once.org/pagetranslate/
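
The two ImportError entries above (missing argostranslate, and the numpy binary extension failing under the LibreOffice-bundled Python) both suggest falling back to the command line tool backend. One way to express that check is sketched below in Python. The function `pick_backend` and its return values are hypothetical and not taken from the extension; the sketch only assumes that both failures surface as ImportError, as the error titles indicate.

```python
import importlib

def pick_backend(module_name="argostranslate", fallback="command line tool"):
    """Return the module name if it imports cleanly, else the fallback label."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a dependency that fails to
        # import, such as the numpy binary extension named in the error text.
        return fallback
    return module_name

print(pick_backend())
```

Probing the import once at startup lets the extension report a clear message or switch backends, instead of failing in the middle of a document translation.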