LibreOffice plugin that pipes whole Writer documents through Google Translate and ought to keep most of the page formatting.

⌈⌋ ⎇ branch:  PageTranslate


Check-in [293badd94c]


Overview
Comment: Expand D-T backend hooks, abbreviate any ln-CT language specifier, document new backends.
SHA1: 293badd94caefe4ba87d2c48b09f7fec0450d5f3
User & Date: mario 2021-05-13 15:41:11
Context
2021-05-14
03:19
Simplified params["backend"] string instead of individual flags, shorten mapping and parameterization in deep_translator backend, abbreviate D-T and T-P in new config dialog, add DT duplicates, minor manual updates. check-in: c28be6ec87 user: mario tags: trunk
2021-05-13
15:41
Expand D-T backend hooks, abbreviate any ln-CT language specifier, document new backends. check-in: 293badd94c user: mario tags: trunk
05:28
support all 4 new Deep-Translate backends, still needs some rework to omit "auto" source language check-in: 7ad6ace92e user: mario tags: trunk
Changes

Changes to help/en/vnd.include-once.pagetranslate/config.page.

<title>Translation settings</title>
<p>The options page can be found under <guiseq><gui>Tools</gui> →
<gui>Options</gui> → <gui>🗔</gui>  → <gui>Language Settings</gui> →
<gui>PageTranslate</gui></guiseq>.</p>

<section id="service">
  <title>Translation service to use</title>

  <terms>
    <item>
      <title>☑ Google Translate</title>

      <p>That's the default, and suitable for both text selection and translating whole pages.

      Provides pretty good machine translations. It incurs some delays for longer texts, as
      each 1900 characters (sentences/paragraphs) have to be transferred individually (managed
      automatically, no user interaction necessary).</p>
    </item>
    <item>
      <title>☐ DeepL web</title>
      <p>Only makes sense for translating a single-paragraph / text selection, because it quickly
      blocks with "error 429 - too many requests" otherwise.</p>
    </item>
    <item>
      <title>☐ DeepL per API</title>
      <p>Use the speedier API to translate documents. As of yet untested.
         Requires an API key and paid subscription. No XML mode (to retain full
         inline formatting) yet, still translates each text segment/paragraph/sentence
         individually.
      </p>
    </item>
    <item>
      <title>☐ Microsoft Translator</title>
      <p>Requires an authorization key, and a `pip install translate` system package.
      There's also a free/test <link href="https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/">subscription for an API key</link>.
      Not tested within PageTranslate yet.</p>
    </item>
    <item>
      <title>☐ MyMemory</title>

      <p>Should have an email address in the corresponding input box (though
      optional).  No longer requires the python-translate module, but
      <file>langdetect</file> (for supplying the correct source language).
      Which is why it sometimes fails, and possibly requires the Tools →
      PageTranslate → From ➜ To option.  Doesn't yield quite as good machine
      translations.  But it's an open source service.  </p>
    </item>
    <item>
      <title>☐ Command line tool</title>
      <p>Allows sending each text paragraph to a local application. To use it, set the
      command in the corresponding input field. Placeholders are `{lang}` for the
      target language, and `{text}` for the paragraphs or current text section. (Both get
      automatically escaped.) For <cmd>translate-cli</cmd> you might need
      the <var>-p</var> provider option as well.
      See also the <link href="https://pypi.org/project/translate/">translate-python
      documentation</link> on how to prepare a separate <file>~/.python-translate.cfg</file>.

      Or use <link href="https://github.com/nidhaloff/deep-translator">deep-translator cli</link>
      with for example <cmd>deep_translator -trans "google" -src "auto" -tg {lang} -txt {text}</cmd>.
      </p>
    </item>
  </terms>
</section>

<section id="flags">
  <title>Operation flags</title>
  <terms>
    <item>
      <title>☐ quick linebreak handling</title>
      <p>Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply conjoins multiple lines temporarily with <cmd>"/#§/"</cmd> in place of a
      linebreak (and restores them afterwards), so there are fewer requests. Primarily helps with
      tables, but less for documents with lengthy paragraphs.</p>

<title>Translation settings</title>
<p>The options page can be found under <guiseq><gui>Tools</gui> →
<gui>Options</gui> → <gui>🗔</gui>  → <gui>Language Settings</gui> →
<gui>PageTranslate</gui></guiseq>.</p>

<section id="service">
  <title>Translation service to use</title>
  <p>There are a few built-in backends:</p>
  <terms>
    <item>
      <title>☑ Google Translate</title>
      <p><link href="https://translate.google.com/">Google Translate</link>
      is the default option, and suitable for both text selection and
      translating whole pages.  Provides pretty good machine translations.
      It incurs some delays for longer texts, as each 1900 characters
      (sentences/paragraphs) have to be transferred individually (managed
      automatically, no user interaction necessary).</p>
    </item>
    <item>
      <title>☐ MyMemory</title>
      <p>For <link href="https://mymemory.translated.net/">MyMemory</link>
      you should specify an email address in the corresponding input box (though
      it's optional, it unlocks more requests).  No longer requires the
      python-translate module, but <file>langdetect</file> (for supplying
      the correct source language).  Which is why it sometimes fails, and
      possibly requires the Tools → PageTranslate → From ➜ To option.
      Doesn't yield quite as good machine translations.  But it's an open
      source service.  </p>
    </item>
    <item>
      <title>☐ Command line tool</title>
      <p>Allows sending each text paragraph to a local application.  To use
      it, set the command in the corresponding input field.  Placeholders
      are `{lang}` for the target language, and `{text}` for the paragraphs
      or current text section.  (Both get automatically escaped.)  For
      <cmd>translate-cli</cmd> you might need the <var>-p</var> provider
      option as well.
      See also the <link href="https://pypi.org/project/translate/">translate-python
      documentation</link> on how to prepare a separate
      <file>~/.python-translate.cfg</file>.  Or use <link
      href="https://github.com/nidhaloff/deep-translator">deep-translator
      cli</link> with for example <cmd>deep_translator -trans "google" -src
      "auto" -tg {lang} -txt {text}</cmd>.  </p>
    </item>
    <item>
      <title>☐ DeepL API</title>
      <p>Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
      Pro API</link> to translate documents.  As of yet untested.  Requires
      an API key and paid subscription.  No XML mode (to retain full inline
      formatting) yet, still translates each text segment/paragraph/sentence
      individually.</p>
    </item>
    <item>
      <title>☐ DeepL Free API</title>
      <p>You can now get a free API key for limited usage (500K characters
      per month - around 1 or 2 documents per day).  This secondary API
      might not be as well maintained.  And signup still requires a credit
      card (use one of the privacy or temporary online credit card
      services).</p>
    </item>
    <item>
      <title>☐ DeepL web interface</title>
      <p>Utilizes web scraping on the <link
      href="https://www.deepl.com/translator/">DeepL online
      translator</link>.  Only suitable for testing and translating single
      paragraphs or text selection, because it quickly blocks with "error
      429 - too many requests".  It's also kinda redundant now that there's
      a Free API option.</p>
    </item>
  </terms>
  <p>Some provided via <cmd>pip install <link href="https://pypi.org/project/translate/">translate-python</link></cmd>:</p>
  <terms>
    <item>
      <title>☐ Microsoft Translator</title>
      <p>Requires an authorization key. There's also a free/test <link
      href="https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/">subscription
      for an API key</link>.  Not tested within PageTranslate yet.</p>
    </item>
  </terms>
  <p>And more via <cmd>pip install <link href="https://pypi.org/project/deep-translator/">deep-translator</link></cmd>:</p>
  <terms>
    <item>
      <title>☐ QCRI Machine Translation</title>
      <p>Requires a <link href="https://mt.qcri.org/api/">free API
      key</link>, but is suitable for whole-document translations.  </p>
    </item>
    <item>
      <title>☐ Yandex Translation</title>
      <p>Also requires its own <link
      href="https://translate.yandex.com/">API key</link>.</p>
    </item>
    <item>
      <title>☐ Linguee Dictionary</title>
      <p>Performs word-wise <link
      href="https://www.linguee.com/">translation</link> lookups, so not
      suitable for translating whole documents, but just text selections. 
      Albeit PageTranslate will split up sentences and pipe each word
      through the service; that won't yield a readable machine translation. 
      </p>
    </item>
    <item>
      <title>☐ Pons Dictionary</title>
      <p>Also is more of a <link href="https://de.pons.com/">dictionary</link>
      than a translation service.  Suitable for text-selections, but
      probably not paragraphs or whole documents.  PageTranslate will
      split-process longer selections word-wise through the Pons Translation
      interface.</p>
    </item>
  </terms>
</section>

<section id="service">
  <title>Parameters</title>
  <terms>
    <item>
      <title>API key</title>
      <p>You can set an API or OAuth key for services that require one.  The
      same input field serves for all backends, so you can't switch between
      them without also changing this entry first.  (Not a common use case
      to have multiple API subscriptions really).</p>
    </item>
    <item>
      <title>Email adr</title>
      <p>An email address is only required by MyMemory.  And strictly
      speaking it's not even required; it just allows for more
      translations.</p>
    </item>
    <item>
      <title>Command</title>
      <p>This field defines the CLI tool to use for translation.  You can
      use something other than `translate-cli` or `deep-translator` of
      course.  Placeholders like {lang} and {text} can be used here.</p>
    </item>
  </terms>
</section>

<section id="flags">
  <title>Options / Flags</title>
  <terms>
    <item>
      <title>☐ quick linebreak handling</title>
      <p>Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply conjoins multiple lines temporarily with <cmd>"/#§/"</cmd> in place of a
      linebreak (and restores them afterwards), so there are fewer requests. Primarily helps with
      tables, but less for documents with lengthy paragraphs.</p>
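
The quick linebreak handling described in the help text can be sketched in Python roughly as follows. This is a minimal illustration and not PageTranslate's actual code: the names `translate_joined` and `fake_backend` are hypothetical, only the `/#§/` marker is taken from the help page, and the sketch assumes the backend passes the marker through unchanged.

```python
# Hypothetical sketch of "quick linebreak handling"; names are illustrative only.
MARKER = "/#§/"  # placeholder string quoted in the help text


def translate_joined(text, translate):
    """Send a multi-line text as one request instead of one request per line."""
    joined = text.replace("\n", MARKER)      # temporarily conjoin the lines
    translated = translate(joined)           # single backend call
    return translated.replace(MARKER, "\n")  # restore the line breaks


calls = []


def fake_backend(chunk):
    """Stand-in for a real translation service (assumption): records each
    request and upper-cases the text, leaving the marker untouched."""
    calls.append(chunk)
    return chunk.upper()


result = translate_joined("cell one\ncell two\ncell three", fake_backend)
```

Three table cells go out as a single request here. If a real service alters the marker, the line breaks cannot be restored, which may be one reason the option is off by default.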

Changes to help/en/vnd.include-once.pagetranslate/config.xhp.

<?xml version="1.0" encoding="UTF-8"?>
<helpdocument version="1.0">
   <meta>
      <topic id="topic_d1e3" indexer="include" status="PUBLISH">
         <title xml-lang="en" id="title_d1e3">Translation settings</title>
         <filename>/help/vnd.include-once.pagetranslate/config.xhp</filename>
      </topic>
      <history>
         <created date="2020-02-02T22:22:22"/>
         <lastedited date="2021-05-13T06:34:46.817+02:00"/>
      </history>
   </meta>
   <body>

      <bookmark id="bm_d1e7" branch="hid/vnd.include-once.pagetranslate:OptionsPageTranslate"
                xml-lang="en">
         <bookmark_value>PageTranslate settings</bookmark_value>
      </bookmark>
      <bookmark id="helpindex_d1e9" branch="index" xml-lang="en">
         <bookmark_value>translation; pagetranslate; options</bookmark_value>
      </bookmark>

      <paragraph id="hd_d1e15" role="heading" level="1" xml-lang="en">Translation settings</paragraph>
      <paragraph id="par_d1e18" role="paragraph" xml-lang="en">The options page can be found under <item type="gui">Tools</item> →
<item type="gui">Options</item> → <item type="gui">🗔</item>  → <item type="gui">Language Settings</item> →
<item type="gui">PageTranslate</item>.</paragraph>

      <paragraph id="sect_d1e37" role="section" xml-lang="en">
         <paragraph id="hd_d1e39" role="heading" level="2" xml-lang="en">Translation service to use</paragraph>

         <list id="terms_d1e42" xml-lang="en">
            <listitem id="item_d1e44" xml-lang="en">
               <emph>☑ Google Translate</emph>
               <br/>
               <paragraph id="par_d1e49" role="paragraph" xml-lang="en">That's the default, and suitable for both text selection and translating whole pages.

      Provides pretty good machine translations. It incurs some delays for longer texts, as
      each 1900 characters (sentences/paragraphs) have to be transferred individually (managed
      automatically, no user interaction necessary).</paragraph>
            </listitem>
            <listitem id="item_d1e53" xml-lang="en">
               <emph>☐ DeepL web</emph>
               <br/>
               <paragraph id="par_d1e58" role="paragraph" xml-lang="en">Only makes sense for translating a single-paragraph / text selection, because it quickly
      blocks with "error 429 - too many requests" otherwise.</paragraph>
            </listitem>
            <listitem id="item_d1e62" xml-lang="en">
               <emph>☐ DeepL per API</emph>
               <br/>
               <paragraph id="par_d1e67" role="paragraph" xml-lang="en">Use the speedier API to translate documents. As of yet untested.
         Requires an API key and paid subscription. No XML mode (to retain full
         inline formatting) yet, still translates each text segment/paragraph/sentence
         individually.
      </paragraph>
            </listitem>
            <listitem id="item_d1e71" xml-lang="en">
               <emph>☐ Microsoft Translator</emph>
               <br/>
               <paragraph id="par_d1e76" role="paragraph" xml-lang="en">Requires an authorization key, and a `pip install translate` system package.
      There's also a free/test <link href="https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/">subscription for an API key</link>.
      Not tested within PageTranslate yet.</paragraph>
            </listitem>
            <listitem id="item_d1e83" xml-lang="en">
               <emph>☐ MyMemory</emph>
               <br/>
               <paragraph id="par_d1e88" role="paragraph" xml-lang="en">Should have an email address in the corresponding input box (though
      optional).  No longer requires the python-translate module, but
      <item type="fileitem">langdetect</item> (for supplying the correct source language).
      Which is why it sometimes fails, and possibly requires the Tools →
      PageTranslate → From ➜ To option.  Doesn't yield quite as good machine
      translations.  But it's an open source service.  </paragraph>
            </listitem>
            <listitem id="item_d1e96" xml-lang="en">
               <emph>☐ Command line tool</emph>
               <br/>
               <paragraph id="par_d1e101" role="paragraph" xml-lang="en">Allows sending each text paragraph to a local application. To use it, set the
      command in the corresponding input field. Placeholders are `{lang}` for the
      target language, and `{text}` for the paragraphs or current text section. (Both get
      automatically escaped.) For <item type="command">translate-cli</item> you might need
      the <item type="variable">-p</item> provider option as well.
      See also the <link href="https://pypi.org/project/translate/">translate-python
      documentation</link> on how to prepare a separate <item type="fileitem">~/.python-translate.cfg</item>.
      Or use <link href="https://github.com/nidhaloff/deep-translator">deep-translator cli</link>
      with for example <item type="command">deep_translator -trans "google" -src "auto" -tg {lang} -txt {text}</item>.
      </paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e126" role="section" xml-lang="en">
         <paragraph id="hd_d1e128" role="heading" level="2" xml-lang="en">Operation flags</paragraph>
         <list id="terms_d1e131" xml-lang="en">
            <listitem id="item_d1e133" xml-lang="en">
               <emph>☐ quick linebreak handling</emph>
               <br/>
               <paragraph id="par_d1e138" role="paragraph" xml-lang="en">Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply conjoins multiple lines temporarily with <item type="command">"/#§/"</item> in place of a
      linebreak (and restores them afterwards), so there are fewer requests. Primarily helps with
      tables, but less for documents with lengthy paragraphs.</paragraph>
            </listitem>
            <listitem id="item_d1e145" xml-lang="en">
               <emph>☐ also iterate over TextFrames</emph>
               <br/>
               <paragraph id="par_d1e150" role="paragraph" xml-lang="en">Handles normal and floating TextFrames. Those are essentially subdocuments in a Writer page.
      But you probably don't need this option for standard office documents.</paragraph>
            </listitem>
            <listitem id="item_d1e154" xml-lang="en">
               <emph>☐ super slow mode</emph>
               <br/>
               <paragraph id="par_d1e159" role="paragraph" xml-lang="en">Iterates over paragraph segments, to keep more inline formatting - but seriously harms mid-sentence translations.
      And currently the formatting still bleeds into adjoining paragraph segments, so not very useful in practice yet.</paragraph>
            </listitem>
            <listitem id="item_d1e163" xml-lang="en">
               <emph>☑ debug mode</emph>
               <br/>
               <paragraph id="par_d1e168" role="paragraph" xml-lang="en">Will fill up the <item type="fileitem">/tmp/pagetranslate-libreoffice.txt</item> log file quicker.
      Currently the debug mode is enabled by default anyway.</paragraph>
            </listitem>
         </list>
      </paragraph>

   </body>
</helpdocument>
<?xml version="1.0" encoding="UTF-8"?>
<helpdocument version="1.0">
   <meta>
      <topic id="topic_d1e3" indexer="include" status="PUBLISH">
         <title xml-lang="en" id="title_d1e3">Translation settings</title>
         <filename>/help/vnd.include-once.pagetranslate/config.xhp</filename>
      </topic>
      <history>
         <created date="2020-02-02T22:22:22"/>
         <lastedited date="2021-05-13T17:38:33.093+02:00"/>
      </history>
   </meta>
   <body>

      <bookmark id="bm_d1e7" branch="hid/vnd.include-once.pagetranslate:OptionsPageTranslate"
                xml-lang="en">
         <bookmark_value>PageTranslate settings</bookmark_value>
      </bookmark>
      <bookmark id="helpindex_d1e9" branch="index" xml-lang="en">
         <bookmark_value>translation; pagetranslate; options</bookmark_value>
      </bookmark>

      <paragraph id="hd_d1e15" role="heading" level="1" xml-lang="en">Translation settings</paragraph>
      <paragraph id="par_d1e18" role="paragraph" xml-lang="en">The options page can be found under <item type="gui">Tools</item> →
<item type="gui">Options</item> → <item type="gui">🗔</item>  → <item type="gui">Language Settings</item> →
<item type="gui">PageTranslate</item>.</paragraph>

      <paragraph id="sect_d1e37" role="section" xml-lang="en">
         <paragraph id="hd_d1e39" role="heading" level="2" xml-lang="en">Translation service to use</paragraph>
         <paragraph id="par_d1e42" role="paragraph" xml-lang="en">There are a few built-in backends:</paragraph>
         <list id="terms_d1e45" xml-lang="en">
            <listitem id="item_d1e47" xml-lang="en">
               <emph>☑ Google Translate</emph>
               <br/>
               <paragraph id="par_d1e52" role="paragraph" xml-lang="en">
                  <link href="https://translate.google.com/">Google Translate</link>
      is the default option, and suitable for both text selection and
      translating whole pages.  Provides pretty good machine translations.
      It incurs some delays for longer texts, as each 1900 characters
      (sentences/paragraphs) have to be transferred individually (managed
      automatically, no user interaction necessary).</paragraph>
            </listitem>
            <listitem id="item_d1e58" xml-lang="en">
               <emph>☐ MyMemory</emph>
               <br/>
               <paragraph id="par_d1e63" role="paragraph" xml-lang="en">For <link href="https://mymemory.translated.net/">MyMemory</link>
      you should specify an email address in the corresponding input box (though
      it's optional, it unlocks more requests).  No longer requires the
      python-translate module, but <item type="fileitem">langdetect</item> (for supplying
      the correct source language).  Which is why it sometimes fails, and
      possibly requires the Tools → PageTranslate → From ➜ To option.
      Doesn't yield quite as good machine translations.  But it's an open
      source service.  </paragraph>
            </listitem>
            <listitem id="item_d1e73" xml-lang="en">
               <emph>☐ Command line tool</emph>
               <br/>
               <paragraph id="par_d1e78" role="paragraph" xml-lang="en">Allows sending each text paragraph to a local application.  To use
      it, set the command in the corresponding input field.  Placeholders
      are `{lang}` for the target language, and `{text}` for the paragraphs
      or current text section.  (Both get automatically escaped.)  For
      <item type="command">translate-cli</item> you might need the <item type="variable">-p</item> provider
      option as well.
      See also the <link href="https://pypi.org/project/translate/">translate-python
      documentation</link> on how to prepare a separate
      <item type="fileitem">~/.python-translate.cfg</item>.  Or use <link href="https://github.com/nidhaloff/deep-translator">deep-translator
      cli</link> with for example <item type="command">deep_translator -trans "google" -src
      "auto" -tg {lang} -txt {text}</item>.  </paragraph>
            </listitem>
            <listitem id="item_d1e101" xml-lang="en">
               <emph>☐ DeepL API</emph>
               <br/>
               <paragraph id="par_d1e106" role="paragraph" xml-lang="en">Utilizes the speedier <link href="https://www.deepl.com/pro">DeepL
      Pro API</link> to translate documents.  As of yet untested.  Requires
      an API key and paid subscription.  No XML mode (to retain full inline
      formatting) yet, still translates each text segment/paragraph/sentence
      individually.</paragraph>
            </listitem>
            <listitem id="item_d1e113" xml-lang="en">
               <emph>☐ DeepL Free API</emph>
               <br/>
               <paragraph id="par_d1e118" role="paragraph" xml-lang="en">You can now get a free API key for limited usage (500K characters
      per month - around 1 or 2 documents per day).  This secondary API
      might not be as well maintained.  And signup still requires a credit
      card (use one of the privacy or temporary online credit card
      services).</paragraph>
            </listitem>
            <listitem id="item_d1e123" xml-lang="en">
               <emph>☐ DeepL web interface</emph>
               <br/>
               <paragraph id="par_d1e128" role="paragraph" xml-lang="en">Utilizes web scraping on the <link href="https://www.deepl.com/translator/">DeepL online
      translator</link>.  Only suitable for testing and translating single
      paragraphs or text selection, because it quickly blocks with "error
      429 - too many requests".  It's also kinda redundant now that there's
      a Free API option.</paragraph>
            </listitem>
         </list>
         <paragraph id="par_d1e136" role="paragraph" xml-lang="en">Some provided via <item type="command">pip install <link href="https://pypi.org/project/translate/">translate-python</link>
            </item>:</paragraph>
         <list id="terms_d1e144" xml-lang="en">
            <listitem id="item_d1e146" xml-lang="en">
               <emph>☐ Microsoft Translator</emph>
               <br/>
               <paragraph id="par_d1e151" role="paragraph" xml-lang="en">Requires an authorization key. There's also a free/test <link href="https://azure.microsoft.com/en-us/pricing/details/cognitive-services/translator/">subscription
      for an API key</link>.  Not tested within PageTranslate yet.</paragraph>
            </listitem>
         </list>
         <paragraph id="par_d1e160" role="paragraph" xml-lang="en">And more via <item type="command">pip install <link href="https://pypi.org/project/deep-translator/">deep-translator</link>
            </item>:</paragraph>
         <list id="terms_d1e168" xml-lang="en">
            <listitem id="item_d1e170" xml-lang="en">
               <emph>☐ QCRI Machine Translation</emph>
               <br/>
               <paragraph id="par_d1e175" role="paragraph" xml-lang="en">Requires a <link href="https://mt.qcri.org/api/">free API
      key</link>, but is suitable for whole-document translations.  </paragraph>
            </listitem>
            <listitem id="item_d1e182" xml-lang="en">
               <emph>☐ Yandex Translation</emph>
               <br/>
               <paragraph id="par_d1e187" role="paragraph" xml-lang="en">Also requires its own <link href="https://translate.yandex.com/">API key</link>.</paragraph>
            </listitem>
            <listitem id="item_d1e194" xml-lang="en">
               <emph>☐ Linguee Dictionary</emph>
               <br/>
               <paragraph id="par_d1e199" role="paragraph" xml-lang="en">Performs word-wise <link href="https://www.linguee.com/">translation</link> lookups, so not
      suitable for translating whole documents, but just text selections. 
      Albeit PageTranslate will split up sentences and pipe each word
      through the service; that won't yield a readable machine translation. 
      </paragraph>
            </listitem>
            <listitem id="item_d1e206" xml-lang="en">
               <emph>☐ Pons Dictionary</emph>
               <br/>
               <paragraph id="par_d1e211" role="paragraph" xml-lang="en">Also is more of a <link href="https://de.pons.com/">dictionary</link>
      than a translation service.  Suitable for text-selections, but
      probably not paragraphs or whole documents.  PageTranslate will
      split-process longer selections word-wise through the Pons Translation
      interface.</paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e220" role="section" xml-lang="en">
         <paragraph id="hd_d1e222" role="heading" level="2" xml-lang="en">Parameters</paragraph>
         <list id="terms_d1e225" xml-lang="en">
            <listitem id="item_d1e227" xml-lang="en">
               <emph>API key</emph>
               <br/>
               <paragraph id="par_d1e232" role="paragraph" xml-lang="en">You can set an API or OAuth key for services that require one.  The
      same input field serves for all backends, so you can't switch between
      them without also changing this entry first.  (Not a common use case
      to have multiple API subscriptions really).</paragraph>
            </listitem>
            <listitem id="item_d1e236" xml-lang="en">
               <emph>Email adr</emph>
               <br/>
               <paragraph id="par_d1e241" role="paragraph" xml-lang="en">An email address is only required by MyMemory.  And strictly
      speaking it's not even required; it just allows for more
      translations.</paragraph>
            </listitem>
            <listitem id="item_d1e245" xml-lang="en">
               <emph>Command</emph>
               <br/>
               <paragraph id="par_d1e250" role="paragraph" xml-lang="en">This field defines the CLI tool to use for translation.  You can
      use something other than `translate-cli` or `deep-translator` of
      course.  Placeholders like {lang} and {text} can be used here.</paragraph>
            </listitem>
         </list>
      </paragraph>

      <paragraph id="sect_d1e257" role="section" xml-lang="en">
         <paragraph id="hd_d1e259" role="heading" level="2" xml-lang="en">Options / Flags</paragraph>
         <list id="terms_d1e262" xml-lang="en">
            <listitem id="item_d1e264" xml-lang="en">
               <emph>☐ quick linebreak handling</emph>
               <br/>
               <paragraph id="par_d1e269" role="paragraph" xml-lang="en">Might speed up table processing with Google Translate, as it avoids sending each newline-split sentence separately.
      It simply joins multiple lines temporarily with <item type="command">"/#§/"</item> in place of a
      linebreak (and splits them again afterwards), so there are fewer requests. Primarily helps with
      tables, less so with documents that have lengthy paragraphs.</paragraph>
            </listitem>
            <listitem id="item_d1e276" xml-lang="en">
               <emph>☐ also iterate over TextFrames</emph>
               <br/>
               <paragraph id="par_d1e281" role="paragraph" xml-lang="en">Handles normal and floating TextFrames. Those are essentially subdocuments in a Writer page.
      But you probably don't need this option for standard office documents.</paragraph>
            </listitem>
            <listitem id="item_d1e285" xml-lang="en">
               <emph>☐ super slow mode</emph>
               <br/>
               <paragraph id="par_d1e290" role="paragraph" xml-lang="en">Iterates over paragraph segments, to keep more inline formatting - but seriously harms mid-sentence translations.
      And currently the formatting still bleeds into adjoining paragraph segments, so not very useful in practice yet.</paragraph>
            </listitem>
            <listitem id="item_d1e294" xml-lang="en">
               <emph>☑ debug mode</emph>
               <br/>
               <paragraph id="par_d1e299" role="paragraph" xml-lang="en">Will fill up the <item type="fileitem">/tmp/pagetranslate-libreoffice.txt</item> log file quicker.
      Currently the debug mode is enabled by default anyway.</paragraph>
            </listitem>
         </list>
      </paragraph>

   </body>
</helpdocument>
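
For illustration, the "quick linebreak handling" option described in the help text above could work roughly like the following Python sketch. This is an assumption about the approach, not the plugin's actual code: `translate_linewise`, `translate_joined` and the stand-in `fake_translate` are hypothetical names, and only the "/#§/" marker is taken from the help text.

```python
# Hypothetical sketch of the "quick linebreak handling" idea; not PageTranslate's code.
SEPARATOR = "/#§/"  # marker quoted in the help text


def translate_linewise(text, translate):
    """Slow path: one backend call per line."""
    return "\n".join(translate(line) for line in text.split("\n"))


def translate_joined(text, translate):
    """Quick path: swap linebreaks for the marker, translate once, swap back."""
    joined = text.replace("\n", SEPARATOR)
    translated = translate(joined)
    return translated.replace(SEPARATOR, "\n")


if __name__ == "__main__":
    calls = []

    def fake_translate(chunk):
        # Stand-in for a real backend: records each request and upper-cases the text.
        calls.append(chunk)
        return chunk.upper()

    print(translate_joined("cell one\ncell two\ncell three", fake_translate))
    print(len(calls), "request(s) sent")
```

The quick path only holds up if the backend returns the marker unchanged; a service that rewrites or drops "/#§/" would merge the lines.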

Changes to pythonpath/translationbackends.py.

@@ -274,15 +274,15 @@
 # Registration is broken (error 10040 or whatever, "contact support" lel), even though
 # it seems to create an account regardless; but API yields SSL or connection errors.
 # Thus STILL UNTESTED.
 #
 class deepl_free_api(deepl_api):
     def __init__(self, params):
         self.params = params
-        self.api_url = "http://api-free.deepl.com/v2/translate"
+        self.api_url = "https://api.deepl.com/v2/translate"
 
 
 # Translate-python
 # requires `pip install translate`
 #
 #  · provides "microsoft" backend (requires OAuth secret in api_key)
 #
@@ -300,15 +300,15 @@
         try:
             from translate import Translator
         except:
             log.error(format_exc())
             raise Exception("Run `pip install translate` to use this module.")
             
         # interestingly this backend function might just work as is.
-        if params.get("mymemory"):
+        if re.search("mymemory", params.get("backend", ""), re.I):
             self.translate = Translator(
                 provider="mymemory", to_lang=params["lang"], email=params.get("email", "")
             ).translate
         else:
             self.translate = Translator(
                 provider="microsoft", to_lang=params["lang"], secret_access_key=params["api_key"]
             ).translate
@@ -322,32 +322,51 @@
     #linebreakwise = None
 
 
 
 # deep-translator
 # requires `pip install deep-translator`
 #  · more backends than pytranslate,
-#    though PONS etc. are just dictionaries
+#    though PONS+Linguee are just dictionaries
 #  → https://github.com/nidhaloff/deep-translator
 #
 class deep_translator(google):
 
     def __init__(self, params={}):
-        self.params = params  # config+argparse
-        import deep_translator as dt
-        import functools as ft
-        w = params.get("backend", "Pons")
-        if re.search("linguee", w, re.I):
-            self.translate = self.from_words(dt.LingueeTranslator(source="auto", target=params["lang"]).translate)
-        elif re.search("pons", w, re.I):
-            self.translate = self.from_words(dt.PonsTranslator(source="auto", target=params["lang"]).translate)
-        elif re.search("QCRI", w, re.I):
-            self.translate = ft.partial(dt.QCRI(params["api_key"]).translate, source="auto", target=params["lang"])
-        elif re.search("yandex", w, re.I):
-            self.translate = ft.partial(dt.YandexTranslator(params["api_key"]).translate, source="auto", target=params["lang"])
+        # config+argparse
+        self.params = params
+        backend = params.get("backend", "Pons")
+        source = self.coarse_lang(params.get("from", "auto"))
+        target = self.coarse_lang(params.get("lang", "en"))
+        # import
+        import functools
+        import deep_translator
+        # map to backends / uniform decorators
+        if re.search("linguee", backend, re.I):
+            self.translate = self.from_words(
+                deep_translator.LingueeTranslator(source=source, target=target).translate
+            )
+        elif re.search("pons", backend, re.I):
+            self.translate = self.from_words(
+                deep_translator.PonsTranslator(source=source, target=target).translate
+            )
+        elif re.search("QCRI", backend, re.I):
+            self.translate = functools.partial(
+                deep_translator.QCRI(params["api_key"]).translate, source=source, target=target
+            )
+        elif re.search("yandex", backend, re.I):
+            self.translate = functools.partial(
+                deep_translator.YandexTranslator(params["api_key"]).translate, source=source, target=target
+            )
+
+    # shorten language co-DE to just two-letter moniker
+    def coarse_lang(self, id):
+        if id.find("-") > 0:
+            id = re.sub("(?<!zh)-\w+", "", id)
+        return id
     
     # decorator to translate word-wise
     def from_words(self, fn):
         def translate(text):
             words = re.findall("(\w+)", text)
             words = { w: fn(w) for w in list(set(words)) }
             text = re.sub("(\w+)", lambda m: words.get(m[0], m[0]), text)