|
| 1 | +# Localization (l10n) Developer Notes |
| 2 | + |
| 3 | +## High-level overview |
| 4 | +1. Python code indicates text that needs to be translated. |
| 5 | +1. Those marked strings are extracted into a master `messages.pot` file. |
| 6 | +1. That file is uploaded to [Transifex](https://app.transifex.com/seedsigner/seedsigner). |
| 7 | +1. Translators work within Transifex on their respective languages. |
| 8 | +1. Completed translations are downloaded as `messages.po` files for each language. |
| 9 | +1. Python "compiles" them into `messages.mo` files ready for use. |
| 10 | +1. The `*.po` and `*.mo` files are written to the [seedsigner-translations](https://github.com/SeedSigner/seedsigner-translations) repo. |
| 11 | +1. That repo is linked as a submodule here as `seedsigner.resources.seedsigner-translations`. |
| 12 | +1. Python code retrieves a translation on demand. |
| 13 | + |
| 14 | + |
| 15 | +## "Wrapping" text for translation |
| 16 | +Any text that we want to be presented in multiple languages needs to be "wrapped". |
| 17 | + |
| 18 | +The CORE CONCEPT to understand is that wrapping is used in TWO different contexts: |
| 19 | +1. Pre-translation: This is how we identify text that translators need to translate. Any wrapped string literals will appear in translators' Transifex UI. |
| 20 | +2. Post-translation: Return the locale-specific translation for that source string (defaults to the English string if no translation is found). |
| 21 | + |
| 22 | +We have three techniques to wrap text, depending on which of the above contexts we're in and where we are in the code: |
| 23 | + |
| 24 | + |
| 25 | +#### Technique 1: `ButtonOption` |
| 26 | +Most `View` classes will render themselves via some variation of the `ButtonListScreen` which takes a `button_data` list as an input. Each entry in |
| 27 | +`button_data` must be a `ButtonOption`. The first argument for `ButtonOption` is the `button_label` string. This is the English string literal that |
| 28 | +is displayed in that button. If you look at `setup.cfg` you'll see that `ButtonOption` is listed as a keyword in `extract_messages`. That means |
| 29 | +that the first argument for `ButtonOption` -- its `button_label` string -- will be marked for translation (by default the `extract_messages` |
| 30 | +integration will only look at the first argument of any method listed in `keywords`). |
| 31 | + |
| 32 | +```python |
| 33 | +class SomeView(View): |
| 34 | + # These string literals will be marked for translation |
| 35 | + OPTION_1 = ButtonOption("Option 1!") |
| 36 | + OPTION_2 = ButtonOption("Option 2!") |
| 37 | + |
| 38 | + def run(self): |
| 39 | + button_data = [self.OPTION_1, self.OPTION_2] |
| 40 | + |
| 41 | + # No way for `extract_messages` to know what's in `some_var`; won't be marked for |
| 42 | + # translation unless it's specified elsewhere. |
| 43 | + some_var = some_value |
| 44 | + button_data.append(ButtonOption(some_var)) |
| 45 | +``` |
| 46 | + |
| 47 | +These `ButtonOption` values are generally specified in class-level attributes, as in the above example. Python executes a class body once, when its |
| 48 | +module is first imported, after which class-level attributes are never re-evaluated; **the value at import time for a class-level attribute is its |
| 49 | +value for the duration of the program execution.** |
| 50 | + |
| 51 | +This means that we must assume that `ButtonOption.button_label` strings are ALWAYS the original English string. This is crucial because the English |
| 52 | +values are the lookup keys for the translations: |
| 53 | + |
| 54 | +* `ButtonOption.button_label` = "Hello!" in the python code. |
| 55 | +* When the code runs, the class that contains our `ButtonOption` as a class-level attribute is imported. |
| 56 | +* Regardless of language selection, that `ButtonOption` will always return "Hello!". |
| 57 | +* `Screen` then uses "Hello!" as a key to find the translation "¡Hola!". |
| 58 | +* User sees "¡Hola!". |
| 59 | + |
| 60 | +IF `ButtonOption` were wired to return the translated string, we'd have a problem: |
| 61 | +* User sets their language to Spanish and enables persistent settings. |
| 62 | +* Launch SeedSigner. At import time the `button_label`'s value is translated to "¡Hola!". |
| 63 | +* User sees "¡Hola!" in the UI. All good. |
| 64 | +* User changes language to English (or any other language). |
| 65 | +* Now the `Screen` must find the matching string in a different translation file. |
| 66 | +* But the `button_label` value was fixed at import time; it's still providing "¡Hola!" as the lookup key. |
| 67 | +* Since all the translation files map English -> translation, no such "¡Hola!" match exists in any translation file. |
| 68 | +* So the translation falls back to just displaying the unmatched key: "¡Hola!" |
| 69 | + |
| 70 | +tldr: `ButtonOption` marks its `button_label` English string literal for translation, but NEVER provides a translated value. |
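
The stale-key problem described above can be reproduced in a few lines. This is a minimal sketch, not SeedSigner code: `CATALOGS`, `state`, `translate`, `render`, `EagerView`, and `LazyView` are all hypothetical stand-ins (a plain dict replaces the real gettext catalogs), used only to show why a class-level attribute has to hold the English key.

```python
# Hypothetical stand-ins for the per-language catalogs (English -> translation).
CATALOGS = {
    "es": {"Hello!": "¡Hola!"},
    "en": {},
}
state = {"locale": "es"}  # the user's current language selection


def translate(key: str) -> str:
    """Look up `key` in the active catalog; fall back to the key itself."""
    return CATALOGS[state["locale"]].get(key, key)


class EagerView:
    # Translated once, when the class body runs, and frozen from then on.
    LABEL = translate("Hello!")


class LazyView:
    # Keeps the English key; translation is deferred until display time.
    LABEL = "Hello!"


def render(label: str) -> str:
    """Plays the role of the Screen: translate right before display."""
    return translate(label)


# The user switches from Spanish to English after the classes were created.
state["locale"] = "en"
eager_after_switch = render(EagerView.LABEL)  # key is "¡Hola!", which matches nothing
lazy_after_switch = render(LazyView.LABEL)    # key is "Hello!", which resolves normally
```

`EagerView` keeps showing the Spanish text after the switch because its lookup key is no longer English, while `LazyView` follows the language selection.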
| 71 | + |
| 72 | +--- |
| 73 | + |
| 74 | +#### Technique 2: `seedsigner.helpers.l10n.mark_for_translation` |
| 75 | +You'll see that `mark_for_translation` is imported as `_mft` for short. |
| 76 | + |
| 77 | +As far as translations are concerned, `_mft` serves the same purpose as `ButtonOption`. The only difference is that `_mft` is for all other |
| 78 | +(non-`button_data`) class-level attributes. |
| 79 | + |
| 80 | +```python |
| 81 | +from seedsigner.helpers.l10n import mark_for_translation as _mft |
| 82 | + |
| 84 | +class SomeView(View): |
| 85 | + title: str = _mft("Default Title") |
| 86 | + text: str = _mft("My default body text") |
| 87 | + |
| 88 | + def run(self): |
| 89 | + self.run_screen( |
| 90 | + SomeScreen, |
| 91 | + title=self.title, |
| 92 | + text=self.text |
| 93 | + ) |
| 94 | +``` |
| 95 | + |
| 96 | +In general we try to avoid using `_mft` at all, but some class-level attributes just can't be avoided. |
| 97 | + |
| 98 | +--- |
| 99 | + |
| 100 | +#### Technique 3: `gettext`, aka `_()` |
| 101 | +This is the way you'll see text wrapping handled in the vast majority of tutorials. |
| 102 | + |
| 103 | +```python |
| 104 | +import os |
| 105 | +from gettext import gettext as _ |
| 106 | + |
| 107 | +# _() looks up the translation on every call (assumes an "es" catalog is installed) |
| 108 | +# Specify Spanish |
| 109 | +os.environ['LANGUAGE'] = "es" |
| 110 | +print(_("Hello!")) |
| 111 | +# >> ¡Hola! |
| 112 | + |
| 113 | +# Specify English |
| 114 | +os.environ['LANGUAGE'] = "en" |
| 115 | +print(_("Hello!")) |
| 116 | +# >> Hello! |
| 117 | +``` |
| 118 | + |
| 119 | +This approach marks string literals for translation AND retrieves the translated text. |
| 120 | + |
| 121 | +We do the same in SeedSigner code, but only when the string literal is in a part of the code that is dynamically evaluated: |
| 122 | + |
| 123 | +```python |
| 124 | +from gettext import gettext as _ |
| 125 | + |
| 126 | +class SomeView(View): |
| 127 | + def __init__(self): |
| 128 | + # Mark string literal for translation AND dynamically retrieve its translated value |
| 129 | + self.some_var = _("I will be dynamically fetched") |
| 130 | +``` |
| 131 | + |
| 132 | +Though note that there are times when we use `_()` only for the retrieval side: |
| 133 | + |
| 134 | +```python |
| 135 | +from seedsigner.helpers.l10n import mark_for_translation as _mft |
| 136 | + |
| 137 | +class SomeView(View): |
| 138 | + message = _mft("Hello!") # mark for translation, but always return "Hello!" |
| 139 | + |
| 140 | + def run(self): |
| 141 | + self.run_screen( |
| 142 | + SomeScreen, |
| 143 | + message=self.message |
| 144 | + ) |
| 145 | + |
| 146 | +# elsewhere... |
| 147 | +@dataclass |
| 148 | +class SomeScreen(Screen): |
| 149 | + message: str = None |
| 150 | + |
| 151 | + def __post_init__(self): |
| 152 | + message_display = TextArea( |
| 153 | + text=_(self.message) # The _() wrapping here now retrieves the translated value, if one is available |
| 154 | + ) |
| 155 | +``` |
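
The retrieval side, including the fallback to the English string, can be tried with nothing but the standard library. This is a sketch under stated assumptions: `load_translator` is a hypothetical helper and the `localedir` value is a deliberately empty placeholder path, so the lookup finds no compiled `messages.mo` and falls back.

```python
import gettext


def load_translator(locale: str, localedir: str) -> gettext.NullTranslations:
    """Hypothetical helper: load the catalog for `locale` from `localedir`."""
    # fallback=True returns a pass-through NullTranslations object instead of
    # raising FileNotFoundError when no compiled messages.mo is found.
    return gettext.translation(
        "messages", localedir=localedir, languages=[locale], fallback=True
    )


MESSAGE = "Hello!"  # English source string, as stored on a View

translator = load_translator("es", localedir="path/without/catalogs")
displayed = translator.gettext(MESSAGE)  # no catalog found, so the key comes back
```

Pointing `localedir` at a directory that contains `es/LC_MESSAGES/messages.mo` would make the same call return the Spanish text instead.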
| 156 | + |
| 157 | +--- |
| 158 | + |
| 159 | +## Basic rules |
| 160 | +* English string literals in class-level attributes should be wrapped with either `ButtonOption` (for `button_data` entries) or `_mft` (for misc class-level attrs) so they'll be picked up for translation. |
| 161 | +* English string literals anywhere else should be wrapped with `_()` to be marked for translation AND provide the dynamic translated value. |
| 162 | +* In general, don't go out of your way to translate text before passing it into `Screen` classes. |
| 163 | + * The `Screen` itself should do most of the `_()` calls to fetch translations for final display. |
| 164 | + * Minor risk of double-translation weirdness otherwise. |
| 165 | + |
| 166 | +Mark for translation in the `View`. Retrieve translated values in the `Screen`. Pass final display text into the basic gui `Component`s. |
| 167 | + |
| 168 | +--- |
| 169 | + |
| 170 | +## Provide translation context hints |
| 171 | +In many cases the English string literal on its own does not provide enough context for translators to understand how the word is being used. |
| 172 | + |
| 173 | +For example, is "change" referring to altering a value OR is it the amount coming back to you in a transaction? |
| 174 | + |
| 175 | +Whenever necessary, add explanatory context as a comment. This applies to all three ways of marking strings for translation. |
| 176 | + |
| 177 | +The `extract_messages` command is explicitly looking for the exact string: `# TRANSLATOR_NOTE:` in comments. |
| 178 | + |
| 179 | +```python |
| 180 | +class SeedAddressVerificationView(View): |
| 181 | + # TRANSLATOR_NOTE: Option when scanning for a matching address; skips ten addresses ahead |
| 182 | + SKIP_10 = ButtonOption("Skip 10") |
| 183 | +``` |
| 184 | + |
| 185 | +Note that the comment MUST be on the line immediately preceding the line of code that contains the wrapped string for it to work: |
| 186 | + |
| 187 | +```python |
| 188 | +class SettingsConstants: |
| 189 | + # TRANSLATOR_NOTE: QR code density option: Low, Medium, High <-- ✅ Correct way to add context |
| 190 | + density_low = _mft("Low") |
| 191 | + |
| 192 | + ALL_DENSITIES = [ |
| 193 | + (DENSITY__LOW, density_low), |
| 194 | + # TRANSLATOR_NOTE: QR code density option: Low, Medium, High <-- ❌ Note will NOT be picked up |
| 195 | + (DENSITY__MEDIUM, "Medium"), |
| 196 | + (DENSITY__HIGH, "High"), |
| 197 | + ] |
| 198 | +``` |
| 199 | + |
| 200 | +```python |
| 201 | +# TRANSLATOR_NOTE: Refers to the user's change output in a psbt |
| 202 | +some_var = _("change") |
| 203 | +``` |
| 204 | + |
| 205 | +--- |
| 206 | + |
| 207 | +## `_()` Wrapping syntax details |
| 208 | +* Use `.format()` rather than f-strings to wrap strings with variable injections. Note that `.format()` is OUTSIDE the `_()` wrapping. |
| 209 | + ```python |
| 210 | + mystr = f"My dad's name is {dad.name} and my name is {self.name}." |
| 211 | + mystr = _("My dad's name is {} and my name is {}").format(dad.name, self.name) |
| 212 | + ``` |
| 213 | + |
| 214 | + The translators will only see: "My dad's name is {} and my name is {}" in Transifex. Often the English string literal is |
| 215 | + basically incomprehensible on its own so always provide an explanation for what is being injected: |
| 216 | + |
| 217 | + ```python |
| 218 | + # TRANSLATOR_NOTE: Address verification success message (e.g. "bc1qabc = seed 12345678's receive address #0.") |
| 219 | + text = _("{} = {}'s {} address #{}.").format(...) |
| 220 | + ``` |
| 221 | + |
| 222 | + If there are a lot of variables to inject, placeholder names can be used (TODO: how does Transifex display this?): |
| 223 | + ```python |
| 224 | + mystr = _("My dad's name is {dad_name} and my name is {my_name}").format(dad_name=dad.name, my_name=self.name) |
| 225 | + ``` |
| 226 | +* Use `ngettext` to dynamically handle singular vs plural forms based on an integer quantity: |
| 227 | + ```python |
| 228 | + from gettext import ngettext |
| 229 | + |
| 230 | + n = 1 |
| 231 | + print(ngettext("apple", "apples", n))  # >> apple |
| 232 | + |
| 233 | + n = 5 |
| 234 | + print(ngettext("apple", "apples", n))  # >> apples |
| 235 | + ``` |
| 236 | + |
| 237 | +Transifex will ask translators to provide the singular and plural forms on a language-specific basis (e.g. Arabic has SIX plural forms!). |
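
The two rules above combine naturally when the quantity itself appears in the text. A minimal sketch using only the standard library; `describe_apples` is a hypothetical helper, and with no catalog installed `ngettext` simply falls back to the English forms.

```python
from gettext import ngettext


def describe_apples(count: int) -> str:
    """Pick the singular or plural template, then inject the number."""
    # .format() stays OUTSIDE ngettext() so the lookup key is the raw template.
    return ngettext("{} apple", "{} apples", count).format(count)


one = describe_apples(1)
many = describe_apples(5)
```

Keeping the `{}` placeholder inside both forms lets translators move the number to wherever their language needs it.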
| 238 | + |
| 239 | +--- |
| 240 | + |
| 241 | +## Set up localization dependencies |
| 242 | +```bash |
| 243 | +pip install -r l10n/requirements-l10n.txt |
| 244 | +``` |
| 245 | + |
| 246 | +Make sure that your local repo has fetched the `seedsigner-translations` submodule. It's configured to add it in src/seedsigner/resources. |
| 247 | +```bash |
| 248 | +# Need --remote in order to respect the target branch listed in .gitmodules |
| 249 | +git submodule update --remote |
| 250 | +``` |
| 251 | + |
| 252 | + |
| 253 | +### Pre-configured `babel` commands |
| 254 | +The `setup.cfg` file in the project root specifies params for the various `babel` commands discussed below. |
| 255 | + |
| 256 | +You should have already added the local code as an editable project in pip: |
| 257 | +```bash |
| 258 | +# From the repo root |
| 259 | +pip install -e . |
| 260 | +``` |
| 261 | + |
| 262 | + |
| 263 | +### Rescanning for text that needs translations |
| 264 | +Re-generate the `messages.pot` file: |
| 265 | + |
| 266 | +```bash |
| 267 | +python setup.py extract_messages |
| 268 | +``` |
| 269 | + |
| 270 | +This will rescan all wrapped text, picking up new strings as well as updating existing strings that have been edited. |
| 271 | + |
| 272 | +_TODO: Github Action to auto-generate messages.pot and fail a PR update if the PR has an out of date messages.pot?_ |
| 273 | + |
| 274 | + |
| 275 | +### Making new text available to translators |
| 276 | +Upload the master `messages.pot` to Transifex. It will automatically update each language with the new or changed source strings. |
| 277 | + |
| 278 | +_TODO: Look into Transifex options to automatically pull updates?_ |
| 279 | + |
| 280 | + |
| 281 | +### Once new translations are complete |
| 282 | +The translation file for each language will need to be downloaded via Transifex's "Download for use" option (sends you a `messages.po` file for that language). |
| 283 | + |
| 284 | +This updated `messages.po` should be added to the seedsigner-translations repo in l10n/`{TARGET_LOCALE}`/LC_MESSAGES. |
| 285 | + |
| 286 | + |
| 287 | +### Compile all the translations |
| 288 | +The `messages.po` files must be compiled into `*.mo` files: |
| 289 | + |
| 290 | +```bash |
| 291 | +python setup.py compile_catalog |
| 292 | + |
| 293 | +# Or target a specific language code: |
| 294 | +python setup.py compile_catalog -l es |
| 295 | +``` |
| 296 | + |
| 297 | +### Unused babel commands |
| 298 | +Transifex eliminates the need for the `init_catalog` and `update_catalog` commands. |
| 299 | + |
| 300 | + |
| 301 | +## Keep the seedsigner-translations repo up to date |
| 302 | +The *.po files for each language and their compiled *.mo files should all be kept up to date in the seedsigner-translations repo. |
| 303 | + |
| 304 | +_TODO: Github Actions automation to regenerate / verify that the *.mo files have been updated after *.po changes._ |
| 305 | + |
| 306 | +--- |
| 307 | + |
| 308 | +## Generate screenshots in each language |
| 309 | +Simply run the screenshot generator: |
| 310 | + |
| 311 | +```bash |
| 312 | +pytest tests/screenshot_generator/generator.py |
| 313 | + |
| 314 | +# Or target a specific language code: |
| 315 | +pytest tests/screenshot_generator/generator.py --locale es |
| 316 | +``` |