From c3f5a0da17bb68c11fbc4dcf9bc528c0fd75e284 Mon Sep 17 00:00:00 2001
From: Odin Kroeger <odkr@users.noreply.github.com>
Date: Mon, 3 Mar 2025 13:25:27 +0100
Subject: [PATCH] docs(README): Mentioned what the repo is about

---
 README.md | 82 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/README.md b/README.md
index bdcee0f..5398873 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,16 @@
-# Translating the Basisklassifikation
+# Basisklassifikation WordPress plugin
+
+This is a WordPress plugin that provides a taxonomy for the so-called
+Basisklassifikation and an admin panel for uploading JSKOS files that contain
+a current version of that classification scheme.
+
+Note that processing a JSKOS file will likely take longer than PHP scripts are
+permitted to run on your server, because WordPress was not designed to create
+thousands of taxonomy terms at once. You will need to raise PHP's execution
+time limit (`max_execution_time`) before uploading a JSKOS file.
+
+
+## Translating the Basisklassifikation
 
 This assumes that you are using Polylang and that the taxonomy has already
 been translated.
@@ -18,7 +30,7 @@ You will need access to a modern-ish Unix-like system with:
 * [Pandoc](https://www.pandoc.org/) >= v3.0
 
 
-## Terminology
+### Terminology
 
 Basisklassifikation classes have an organisational number (e.g., "08.00").
 This number, in keeping with JSKOS terminology, is hereafter referred to as
@@ -29,7 +41,7 @@ names, which are referred to as "labels". This plugin ignores alternative labels
 every mention of a "label" always refers to "preferred" one.
 
 
-## Get a JSKOS file
+### Get a JSKOS file
 
 The JSKOS format defines a JSON format (hence "JS") for knowledge
 organisation systems (hence "KOS").
@@ -49,10 +61,10 @@ that contains the German labels of the Basisklassifikation is named
 "01-bk-de.jskos"
 
 ```sh
-	jskos_de=01-bk-de.jskos
+jskos_de=01-bk-de.jskos
 ```
 
-## Extract German labels
+### Extract German labels
 
 DeepL only accepts word processing documents as input, so we will
 perform some conversions. The route we will take is:
@@ -68,18 +80,18 @@ You should find it in the "scripts" directory.
 Run:
 
 ```sh
-	csv_de=02-bk-de.csv
-	scripts/jskos2csv de "$jskos_de" | sort -u >"$csv_de"
+csv_de=02-bk-de.csv
+scripts/jskos2csv de "$jskos_de" | sort -u >"$csv_de"
 ```
 
 "jskos2csv" should work without issue, but the earlier we catch errors,
 the better. So we will check whether classes are missing:
 
 ```sh
-	jskos_de_sn=jskos-de-sub-notations csv_de_sn=csv-de-sub-notations
-	grep -oE '[0-9]+\.[0-9]+' "$jskos_de" | sort -u >"$jskos_de_sn"
-	grep -oE '[0-9]+\.[0-9]+' "$csv_de"   | sort -u >"$csv_de_sn"
-	diff "$jskos_de_sn" "$csv_de_sn"
+jskos_de_sn=jskos-de-sub-notations csv_de_sn=csv-de-sub-notations
+grep -oE '[0-9]+\.[0-9]+' "$jskos_de" | sort -u >"$jskos_de_sn"
+grep -oE '[0-9]+\.[0-9]+' "$csv_de"   | sort -u >"$csv_de_sn"
+diff "$jskos_de_sn" "$csv_de_sn"
 ```
 
 `diff "$jskos_de_sn" "$csv_de_sn"` prints the notation of each missing class,
@@ -88,25 +100,25 @@ does *not* print anything, then no non-top-level class is missing;
 otherwise, you have to add the missing classes manually.
 
 
-## Convert to Office Open XML (aka ".docx")
+### Convert to Office Open XML (aka ".docx")
 
 The CSV file can be converted to an Office Open XML file using Pandoc
 and a custom CSV reader "csv.lua", which ships with this plugin and is
 also located in the scripts folder:
 
 ```sh
-    docx_de=03-bk-de.docx
-    pandoc -fscripts/csv.lua -o"$docx_de" "$csv_de"
+docx_de=03-bk-de.docx
+pandoc -fscripts/csv.lua -o"$docx_de" "$csv_de"
 ```
 
 Again, we check whether any classes are missing in the created file:
 
 ```sh
-    csv_de_n=csv-de-notations docx_de_n=docx-de-notations
-    sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_de" >"$csv_de_n"
-    pandoc -tscripts/csv.lua "$docx_de" |
-        sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_de_n"
-    diff "$csv_de_n" "$docx_de_n"
+csv_de_n=csv-de-notations docx_de_n=docx-de-notations
+sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_de" >"$csv_de_n"
+pandoc -tscripts/csv.lua "$docx_de" |
+    sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_de_n"
+diff "$csv_de_n" "$docx_de_n"
 ```
 
 `diff "$csv_de_n" "$docx_de_n"` prints the notation of each missing class,
@@ -114,7 +126,7 @@ including top-level classes, provided that they are not missing from the CSV
 file. Again, missing classes need to be added manually.
 
 
-## Translate using DeepL
+### Translate using DeepL
 
 Upload "03-bk-de.docx" to <https://www.deepl.com/translator/files>,
 make sure that the document's language is set to German, and let DeepL
@@ -127,42 +139,42 @@ the Basisklassifikation to (e.g., "en").
 Yet again, we check whether any classes are missing from the translation:
 
 ```sh
-    lang=XX
-    docx_trans="04-bk-$lang.docx" docx_trans_n="csv-$lang-notations"
-    pandoc -tscripts/csv.lua "$docx_trans" |
-        sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_trans_n"
-    diff "$csv_de_n" "$docx_trans_n"
+lang=XX
+docx_trans="04-bk-$lang.docx" docx_trans_n="csv-$lang-notations"
+pandoc -tscripts/csv.lua "$docx_trans" |
+    sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' >"$docx_trans_n"
+diff "$csv_de_n" "$docx_trans_n"
 ```
 
 Again, missing classes need to be added manually.
 
 
-## Convert back to CSV
+### Convert back to CSV
 
 The Office Open XML file can be converted back to a CSV file using Pandoc
 and a custom CSV writer "csv.lua" (the same script as before), which ships
 with this plugin and is also located in the scripts folder:
 
 ```sh
-    csv_trans="05-bk-$lang.csv"
-    pandoc -tscripts/csv.lua -o"$csv_trans" "$docx_trans"
+csv_trans="05-bk-$lang.csv"
+pandoc -tscripts/csv.lua -o"$csv_trans" "$docx_trans"
 ```
 
 Check whether classes are missing:
 
 ```sh
-    csv_trans_n="csv-$lang-notations"
-    sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_trans" >"$csv_trans_n"
-    diff "$csv_de_n" "$csv_trans_n"
+csv_trans_n="csv-$lang-notations"
+sed -n 's/^"*\([0-9]*\.[0-9]*\).*/\1/p' "$csv_trans" >"$csv_trans_n"
+diff "$csv_de_n" "$csv_trans_n"
 ```
 
-## Correct translations errors
+### Correct translation errors
 
 This is the best time to fix the errors that DeepL made.
 
 ```
-    csv_trans_corrected="06-bk-$lang-corrected.csv"
-    cp "$csv_trans" "$csv_trans_corrected"
+csv_trans_corrected="06-bk-$lang-corrected.csv"
+cp "$csv_trans" "$csv_trans_corrected"
 ```
 
 Open 06-bk-XX-corrected.csv with your favourite text editor and edit away.
@@ -180,7 +192,7 @@ You will want to look out for:
 * Inconsistently translated abbreviations.
 
 
-## Create a new ndJSON JSKOS file
+### Create a new ndJSON JSKOS file
 
 Create a new JSKOS file that contains the German as well as your
 translated labels:
-- 
GitLab