diff --git a/README.md b/README.md
index 3dd8cfcecbc1e13e8b1d23b3efcec4408713ce8b..bdcee0f83ad00febcd48876c6f1129ac335369b1 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,7 @@
 # Translating the Basisklassifikation
 
-Needs Polylang; taxonomy must be marked as being translated first!
+This assumes that you are using Polylang and that the taxonomy has already
+been translated.
 
 To translate the Basisklassifikation, download a copy pf the whole
 Basisklassifkation in the [JSKOS](https://gbv.github.io/jskos/), feed
@@ -24,7 +25,8 @@ This number, in keeping with JSKOS terminology, is hereafter referred to as
 "notation".
 
 Classes also have names. JSKOS distinguishes "preferred" and "alternative"
-names, which are referred to as "labels". This plugin ignores alternative labels, every mention of a "label" always refers to "preferred" one.
+names, which are referred to as "labels". This plugin ignores alternative labels,
+every mention of a "label" always refers to "preferred" one.
 
 ## Get a JSKOS file
 
@@ -46,9 +48,9 @@ For the remainder of this guide, it is assumed that the JSKOS file that
 contains the German labels of the Basisklassifikation is named
 "01-bk-de.jskos"
 
-`sh
+```sh
 jskos_de=01-bk-de.jskos
-`
+```
 
 ## Extract German labels
 
@@ -65,20 +67,20 @@ You should find it in the "scripts" directory.
 
 Run:
 
-`sh
+```sh
 csv_de=02-bk-de.csv
 scripts/jskos2csv de "$jskos_de" | sort -u >"$csv_de"
-`
+```
 
 "jskos2csv" should work without issue, but the earlier we catch errors, the better. So we will check whether classes are missing:
 
-`sh
+```sh
 jskos_de_sn=jskos-de-sub-notations
 csv_de_sn=csv-de-sub-notations
 grep -oE '[0-9]+\.[0-9]+' "$jskos_de" | sort -u >"$jskos_de_sn"
 grep -oE '[0-9]+\.[0-9]+' "$csv_de" | sort -u >"$csv_de_sn"
 diff "$jskos_de_sn" "$csv_de_sn"
-`
+```
 
 `diff "$jskos_de_sn" "$csv_de_sn"` prints the notation of each missing class,
 save for top-level classes, which use a different syntax; if this command
@@ -92,20 +94,20 @@ The CSV file can be converted to an Office Open XML file using Pandoc and a
 custom CSV reader "csv.lua", which ships with this plugin and is also located
 in the scripts folder:
 
-`sh
+```sh
 docx_de=03-bk-de.docx
 pandoc -fscripts/csv.lua -o"$docx_de" "$csv_de"
-`
+```
 
 Again, we check whether any classes are missing in the created file:
 
-`sh
+```sh
 csv_de_n=csv-de-notations
 docx_de_n=docx-de-notations
 sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_de" >"$csv_de_n"
 pandoc -tscripts/csv.lua "$docx_de" | sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_de_n"
 diff "$csv_de_n" "$docx_de_n"
-`
+```
 
 `diff "$csv_de_n" "$docx_de_n"` prints the notation of each missing class,
 including top-level classes, provided that they are not missing from the CSV
@@ -124,13 +126,13 @@ the Basisklassifikation to (e.g., "en").
 
 Yet again, we check whether any classes are missing from the translation:
 
-`sh
+```sh
 lang=XX
 docx_trans="04-bk-$lang.docx"
 docx_trans_n="csv-$lang-notations"
 pandoc -tscripts/csv.lua "$docx_trans" | sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_trans_n"
 diff "$csv_de_n" "$docx_trans_n"
-`
+```
 
 Again, missing classes need to be added manually.
 
@@ -141,27 +143,27 @@ The Office Open XML file can be converted back to a CSV file using Pandoc and
 a custom CSV writer "csv.lua" (the same script as before), which ships with
 this plugin and is also located in the scripts folder:
 
-`sh
+```sh
 csv_trans="05-bk-$lang.csv"
 pandoc -tscripts/csv.lua -o"$csv_trans" "$docx_trans"
-`
+```
 
 Check whether classes are missing:
 
-`sh
+```sh
 csv_trans_n="csv-$lang-notations"
 sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_trans" >"$csv_trans_n"
 diff "$csv_de_n" "$csv_trans_n"
-`
+```
 
 ## Correct translations errors
 
 This is the best time to fix the errors that DeepL made.
 
-`
+```
 csv_trans_corrected="06-bk-$lang-corrected.csv"
 cp "$csv_trans" "$csv_trans_corrected"
-`
+```
 
 Open 06-bk-XX-corrected.csv with your favourite text editor and edit away.
 