Commit 3fbc863

Merge pull request #620 from kdmukai/initial_multilanguage_0.8.0
Initial l10n / multilanguage support
2 parents b0d213b + a2ffa7d commit 3fbc863

53 files changed, +4917 -1811 lines changed

.github/workflows/tests.yml

+12 -5
@@ -27,6 +27,9 @@ jobs:
 
     steps:
       - uses: actions/checkout@v3
+        with:
+          # Needs to also pull the seedsigner-translations repo
+          submodules: recursive
       - name: Set up Python ${{ matrix.python-version }}
         uses: actions/setup-python@v4
         with:
@@ -45,16 +48,20 @@ jobs:
             --cov=seedsigner \
             --cov-append \
             --cov-branch \
-            --cov-report term \
-            --cov-report html \
-            --cov-report html:./artifacts/cov_html \
-            --cov-report xml \
             --durations 5 \
             -vv
       - name: Generate screenshots
         run: |
-          python -m pytest tests/screenshot_generator/generator.py
+          python -m pytest tests/screenshot_generator/generator.py \
+            --color=yes \
+            --cov=seedsigner \
+            --cov-append \
+            --cov-branch \
+            --cov-report html:./artifacts/cov_html \
+            -vv
           cp -r ./seedsigner-screenshots ./artifacts/
+      - name: Coverage report
+        run: coverage report
       - name: Archive CI Artifacts
         uses: actions/upload-artifact@v3
         with:

.gitignore

+4 -3
@@ -5,6 +5,7 @@ src/seedsigner.egg-info/
 .vscode
 src/seedsigner/models/settings_definition.json
 .idea
-*.mo
-.coverage
-seedsigner-screenshots
+.coverage*
+
+*.po
+*.mo

.gitmodules

+9
@@ -0,0 +1,9 @@
+[submodule "src/seedsigner/resources/seedsigner-translations"]
+	path = src/seedsigner/resources/seedsigner-translations
+	url = https://github.com/SeedSigner/seedsigner-translations.git
+	branch = dev
+[submodule "seedsigner-screenshots"]
+	path = seedsigner-screenshots
+	url = https://github.com/SeedSigner/seedsigner-screenshots.git
+	branch = dev
+	update = none

l10n/README.md

+316
@@ -0,0 +1,316 @@
# Localization (l10n) Developer Notes

## High-level overview
1. Python code indicates text that needs to be translated.
1. Those marked strings are extracted into a master `messages.pot` file.
1. That file is uploaded to [Transifex](https://app.transifex.com/seedsigner/seedsigner).
1. Translators work within Transifex on their respective languages.
1. Completed translations are downloaded as `messages.po` files for each language.
1. Python "compiles" them into `messages.mo` files ready for use.
1. The `*.po` and `*.mo` files are written to the [seedsigner-translations](https://github.com/SeedSigner/seedsigner-translations) repo.
1. That repo is linked as a submodule here as `seedsigner.resources.seedsigner-translations`.
1. Python code retrieves a translation on demand.


## "Wrapping" text for translation
Any text that we want to be presented in multiple languages needs to be "wrapped".

The CORE CONCEPT to understand is that wrapping is used in TWO different contexts:
1. Pre-translation: This is how we identify text that translators need to translate. Any wrapped string literals will appear in translators' Transifex UI.
2. Post-translation: Return the locale-specific translation for that source string (defaults to the English string if no translation is found).

We have three techniques to wrap text, depending on which of the above contexts we're in and where we are in the code:


#### Technique 1: `ButtonOption`
Most `View` classes will render themselves via some variation of the `ButtonListScreen` which takes a `button_data` list as an input. Each entry in
`button_data` must be a `ButtonOption`. The first argument for `ButtonOption` is the `button_label` string. This is the English string literal that
is displayed in that button. If you look at `setup.cfg` you'll see that `ButtonOption` is listed as a keyword in `extract_messages`. That means
that the first argument for `ButtonOption` -- its `button_label` string -- will be marked for translation (by default the `extract_messages`
integration will only look at the first argument of any method listed in `keywords`).

```python
class SomeView(View):
    # These string literals will be marked for translation
    OPTION_1 = ButtonOption("Option 1!")
    OPTION_2 = ButtonOption("Option 2!")

    def run(self):
        button_data = [self.OPTION_1, self.OPTION_2]

        # No way for `extract_messages` to know what's in `some_var`; won't be marked for
        # translation unless it's specified elsewhere.
        some_var = some_value
        button_data.append(ButtonOption(some_var))
```

These `ButtonOption` values are generally specified in class-level attributes, as in the above example. Classes in Python are imported once, after
which class-level attributes are never re-evaluated; **the value at import time for a class-level attribute is its value for the duration of
the program execution.**

This means that we must assume that `ButtonOption.button_label` strings are ALWAYS the original English string. This is crucial because the English
values are the lookup keys for the translations:

* `ButtonOption.button_label` = "Hello!" in the Python code.
* Run the code; the class that contains our `ButtonOption` as a class-level attribute is imported.
* Regardless of language selection, that `ButtonOption` will always return "Hello!".
* `Screen` then uses "Hello!" as a key to find the translation "¡Hola!".
* User sees "¡Hola!".

IF `ButtonOption` were wired to return the translated string, we'd have a problem:
* User sets their language to Spanish and enables persistent settings.
* Launch SeedSigner. At import time the `button_label`'s value is translated to "¡Hola!".
* User sees "¡Hola!" in the UI. All good.
* User changes language to English (or any other language).
* Now the `Screen` must find the matching string in a different translation file.
* But the `button_label` value was fixed at import time; it's still providing "¡Hola!" as the lookup key.
* Since all the translation files map English -> translation, no such "¡Hola!" match exists in any translation file.
* So the translation falls back to just displaying the unmatched key: "¡Hola!"

tldr: `ButtonOption` marks its `button_label` English string literal for translation, but NEVER provides a translated value.
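
The import-time pitfall described above can be reproduced in a few lines. This is only a minimal sketch: the `ButtonOption` here is a simplified stand-in, and `CATALOGS`, `SETTINGS`, `translate`, and `render` are hypothetical names invented for illustration, not SeedSigner's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical in-memory catalogs; real translations live in compiled *.mo files.
CATALOGS = {"es": {"Hello!": "¡Hola!"}, "en": {}}
SETTINGS = {"locale": "es"}


def translate(key: str) -> str:
    """Look up `key` in the active catalog; fall back to the key itself."""
    return CATALOGS.get(SETTINGS["locale"], {}).get(key, key)


@dataclass
class ButtonOption:
    """Simplified stand-in: stores the English label and never translates it."""
    button_label: str


class GoodView:
    # Evaluated once at import time; stays "Hello!" for the whole run.
    GREETING = ButtonOption("Hello!")


class BadView:
    # Anti-pattern: translated once at import time, while the locale was "es".
    GREETING = ButtonOption(translate("Hello!"))


def render(option: ButtonOption) -> str:
    """Stand-in for the Screen: translate at display time."""
    return translate(option.button_label)


spanish = (render(GoodView.GREETING), render(BadView.GREETING))

SETTINGS["locale"] = "en"  # the user switches language at runtime
english = (render(GoodView.GREETING), render(BadView.GREETING))

print(spanish)  # ('¡Hola!', '¡Hola!')
print(english)  # ('Hello!', '¡Hola!')  <- BadView is stuck on the stale key
```

Keeping the English literal as the stored value is what lets the display layer re-resolve it against whichever catalog is active.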

---

#### Technique 2: `seedsigner.helpers.l10n.mark_for_translation`
You'll see that `mark_for_translation` is imported as `_mft` for short.

As far as translations are concerned, `_mft` serves the same purpose as `ButtonOption`. The only difference is that `_mft` is for all other
(non-`button_data`) class-level attributes.

```python
from seedsigner.helpers.l10n import mark_for_translation as _mft

class SomeView(View):
    title: str = _mft("Default Title")
    text: str = _mft("My default body text")

    def run(self):
        self.run_screen(
            SomeScreen,
            title=self.title,
            text=self.text
        )
```

In general we try to avoid using `_mft` at all, but some class-level attributes just can't be avoided.

---

#### Technique 3: `gettext`, aka `_()`
This is the way you'll see text wrapping handled in the vast majority of tutorials.

```python
import os
from gettext import gettext as _

# `_()` looks up the translation each time it is called, so call it at print time.
# Specify Spanish (assumes a compiled `es` catalog is available to gettext)
os.environ['LANGUAGE'] = "es"
print(_("Hello!"))
# >> ¡Hola!

# Specify English
os.environ['LANGUAGE'] = "en"
print(_("Hello!"))
# >> Hello!
```

This approach marks string literals for translation AND retrieves the translated text.

We do the same in SeedSigner code, but only when the string literal is in a part of the code that is dynamically evaluated:

```python
from gettext import gettext as _

class SomeView(View):
    def __init__(self):
        # Mark string literal for translation AND dynamically retrieve its translated value
        self.some_var = _("I will be dynamically fetched")
```

Though note that there are times when we use `_()` only for the retrieval side:

```python
from dataclasses import dataclass
from gettext import gettext as _

from seedsigner.helpers.l10n import mark_for_translation as _mft

class SomeView(View):
    message = _mft("Hello!")  # mark for translation, but always return "Hello!"

    def run(self):
        self.run_screen(
            SomeScreen,
            message=self.message
        )

# elsewhere...
@dataclass
class SomeScreen(Screen):
    message: str = None

    def __post_init__(self):
        message_display = TextArea(
            text=_(self.message)  # The _() wrapping here now retrieves the translated value, if one is available
        )
```

---

## Basic rules
* English string literals in class-level attributes should be wrapped with either `ButtonOption` (for `button_data` entries) or `_mft` (for misc class-level attrs) so they'll be picked up for translation.
* English string literals anywhere else should be wrapped with `_()` to be marked for translation AND provide the dynamic translated value.
* In general, don't go out of your way to translate text before passing it into `Screen` classes.
    * The `Screen` itself should do most of the `_()` calls to fetch translations for final display.
    * Minor risk of double-translation weirdness otherwise.

Mark for translation in the `View`. Retrieve translated values in the `Screen`. Pass final display text into the basic gui `Component`s.

---

## Provide translation context hints
In many cases the English string literal on its own does not provide enough context for translators to understand how the word is being used.

For example, is "change" referring to altering a value OR is it the amount coming back to you in a transaction?

Whenever necessary, add explanatory context as a comment. This applies to all three ways of marking strings for translation.

The `extract_messages` command is explicitly looking for the exact string: `# TRANSLATOR_NOTE:` in comments.

```python
class SeedAddressVerificationView(View):
    # TRANSLATOR_NOTE: Option when scanning for a matching address; skips ten addresses ahead
    SKIP_10 = ButtonOption("Skip 10")
```

Note that the comment MUST be on the line immediately preceding the line of code that contains the wrapped string for it to work:

```python
class SettingsConstants:
    # TRANSLATOR_NOTE: QR code density option: Low, Medium, High     <-- ✅ Correct way to add context
    density_low = _mft("Low")

    ALL_DENSITIES = [
        (DENSITY__LOW, density_low),
        # TRANSLATOR_NOTE: QR code density option: Low, Medium, High     <-- ❌ Note will NOT be picked up
        (DENSITY__MEDIUM, "Medium"),
        (DENSITY__HIGH, "High"),
    ]
```

```python
# TRANSLATOR_NOTE: Refers to the user's change output in a psbt
some_var = _("change")
```

---

## `_()` Wrapping syntax details
* Use `.format()` to wrap strings with variable injections. Note that `.format()` is OUTSIDE the `_()` wrapping.
    ```python
    # Instead of:
    mystr = f"My dad's name is {dad.name} and my name is {self.name}."
    # use:
    mystr = _("My dad's name is {} and my name is {}").format(dad.name, self.name)
    ```

    The translators will only see: "My dad's name is {} and my name is {}" in Transifex. Often the English string literal is
    basically incomprehensible on its own so always provide an explanation for what is being injected:

    ```python
    # TRANSLATOR_NOTE: Address verification success message (e.g. "bc1qabc = seed 12345678's receive address #0.")
    text = _("{} = {}'s {} address #{}.").format(...)
    ```

    If there are a lot of variables to inject, placeholder names can be used (TODO: how does Transifex display this?):
    ```python
    mystr = _("My dad's name is {dad_name} and my name is {my_name}").format(dad_name=dad.name, my_name=self.name)
    ```
* Use `ngettext` to dynamically handle singular vs plural forms based on an integer quantity:
    ```python
    from gettext import ngettext

    n = 1
    print(ngettext("apple", "apples", n))
    # >> apple

    n = 5
    print(ngettext("apple", "apples", n))
    # >> apples
    ```

    Transifex will ask translators to provide the singular and plural forms on a language-specific basis (e.g. Arabic has SIX plural forms!).
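
The two rules above combine naturally: pick the plural form with `ngettext`, then inject the number with `.format()` outside the call. Here is a minimal sketch using only the standard library; `apple_count_text` is a hypothetical helper, and with no catalog installed `ngettext` simply falls back to the English forms.

```python
from gettext import ngettext


def apple_count_text(n: int) -> str:
    """Choose singular/plural via ngettext, then inject the count."""
    # TRANSLATOR_NOTE: Number of apples in the basket (e.g. "3 apples")
    return ngettext("{} apple", "{} apples", n).format(n)


print(apple_count_text(1))  # 1 apple
print(apple_count_text(5))  # 5 apples
```

Keeping the placeholder inside both forms lets each language decide where the number belongs in the sentence.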

---

## Set up localization dependencies
```bash
pip install -r l10n/requirements-l10n.txt
```

Make sure that your local repo has fetched the `seedsigner-translations` submodule. It is configured to be checked out under `src/seedsigner/resources`.
```bash
# Need --remote in order to respect the target branch listed in .gitmodules
git submodule update --remote
```


### Pre-configured `babel` commands
The `setup.cfg` file in the project root specifies params for the various `babel` commands discussed below.

You should have already added the local code as an editable project in pip:
```bash
# From the repo root
pip install -e .
```


### Rescanning for text that needs translations
Re-generate the `messages.pot` file:

```bash
python setup.py extract_messages
```

This will rescan all wrapped text, picking up new strings as well as updating existing strings that have been edited.

_TODO: Github Action to auto-generate messages.pot and fail a PR update if the PR has an out of date messages.pot?_


### Making new text available to translators
Upload the master `messages.pot` to Transifex. It will automatically update each language with the new or changed source strings.

_TODO: Look into Transifex options to automatically pull updates?_


### Once new translations are complete
The translation file for each language will need to be downloaded via Transifex's "Download for use" option (sends you a `messages.po` file for that language).

This updated `messages.po` should be added to the seedsigner-translations repo in l10n/`{TARGET_LOCALE}`/LC_MESSAGES.


### Compile all the translations
The `messages.po` files must be compiled into `*.mo` files:

```bash
python setup.py compile_catalog

# Or target a specific language code:
python setup.py compile_catalog -l es
```
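
Once compiled, a `*.mo` catalog can be loaded with the standard library's `gettext.translation`. The sketch below is an illustration, not SeedSigner's actual loader: the `LOCALE_DIR` path and the `get_translator` helper are assumptions based on the directory layout described above, and the `messages` domain is inferred from the `messages.mo` filename.

```python
import gettext
from pathlib import Path

# Assumed layout: {LOCALE_DIR}/{locale}/LC_MESSAGES/messages.mo
LOCALE_DIR = Path("src/seedsigner/resources/seedsigner-translations/l10n")


def get_translator(locale: str):
    """Return a gettext function for `locale`, falling back to the English source strings."""
    translation = gettext.translation(
        "messages",                  # domain, i.e. messages.mo
        localedir=str(LOCALE_DIR),
        languages=[locale],
        fallback=True,               # no catalog found -> NullTranslations (returns the key)
    )
    return translation.gettext


_ = get_translator("es")
print(_("Hello!"))  # the Spanish translation if an es catalog is present, otherwise "Hello!"
```

With `fallback=True` a missing or uncompiled catalog degrades to the English string instead of raising, which matches the "defaults to the English string" behavior described earlier.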

### Unused babel commands
Transifex eliminates the need for the `init_catalog` and `update_catalog` commands.


## Keep the seedsigner-translations repo up to date
The `*.po` files for each language and their compiled `*.mo` files should all be kept up to date in the seedsigner-translations repo.

_TODO: Github Actions automation to regenerate / verify that the `*.mo` files have been updated after `*.po` changes._

---

## Generate screenshots in each language
Simply run the screenshot generator:

```bash
pytest tests/screenshot_generator/generator.py

# Or target a specific language code:
pytest tests/screenshot_generator/generator.py --locale es
```
