diff --git a/.docs/examples/air.md b/.docs/examples/air.md
new file mode 100644
index 0000000000000000000000000000000000000000..1e4d9ddeddf83d5861ad421ba8d00f7397508f46
--- /dev/null
+++ b/.docs/examples/air.md
@@ -0,0 +1,38 @@
+---
+author: Martin Weise
+---
+
+## tl;dr
+
+[:fontawesome-solid-database: Dataset](https://handle.stage.datacite.org/10.82556/gd17-aq82){ .md-button .md-button--primary target="_blank" }
+[:material-file-document: Archive](https://doi.org/10.48436/mtha8-w2406){ .md-button .md-button--secondary target="_blank" }
+
+## Description
+
+This digital record contains historical air pollution and air quality data from approximately 20 air monitoring stations
+in Vienna, spanning the years from 1980 to 2021. The data was provided by the Umweltbundesamt and is stored in its
+original form in this record. This record forms the basis of an analysis carried out in a bachelor's thesis at the TU
+Wien.
+
+## Solution
+
+
+<figure markdown>
+{ .img-border }
+<figcaption>Figure 1: Jupyter Notebook accessing data on DBRepo using the Python Library.</figcaption>
+</figure>
+
+## DBRepo Features
+
+- [x] Import complex dataset
+- [x] System versioning
+- [x] Subset exploration
+- [x] Aggregated views
+- [x] Precise & PID of queries tables
+- [x] External data access for analysis
+
+## Acknowledgement
+
+This work was part of a cooperation with the [Umweltbundesamt](https://www.umweltbundesamt.at/).
+
+<img src="../../images/logos/umweltbundesamt.png" width=100 />
\ No newline at end of file
diff --git a/.docs/examples/covid-19.md b/.docs/examples/covid-19.md
new file mode 100644
index 0000000000000000000000000000000000000000..ba986e76e9275de13e0be646e908ad8e38256d65
--- /dev/null
+++ b/.docs/examples/covid-19.md
@@ -0,0 +1,25 @@
+---
+author: Martin Weise
+---
+
+## tl;dr
+
+[:fontawesome-solid-database: Dataset](https://dbrepo1.ec.tuwien.ac.at/pid/15){ .md-button .md-button--primary target="_blank" }
+[:simple-github: Archive](https://github.com/CSSEGISandData/COVID-19){ .md-button .md-button--secondary target="_blank" }
+
+## Description
+
+This dataset contains the daily COVID-19 data provided publicly
+by [Johns Hopkins University](https://coronavirus.jhu.edu/about/how-to-use-our-data).
+
+## Solution
+
+We imported their daily snapshots provided as 1145 versioned .csv files from their Git repository archive and imported
+them daily into DBRepo as system-versioned data that can be queried. During the time of this project the COVID-19
+pandemic was still ongoing and therefore daily snapshots demanded a correct import script to be maintained.
+
+## DBRepo Features
+
+- [x] Data pipeline from Git repository
+- [x] System versioning
+- [x] Subset exploration
diff --git a/.docs/examples/influenza.md b/.docs/examples/influenza.md
deleted file mode 100644
index 074c413d31f3f5e1d590d2491c6562677ee75a39..0000000000000000000000000000000000000000
--- a/.docs/examples/influenza.md
+++ /dev/null
@@ -1,21 +0,0 @@
----
-author: Martin Weise
----
-
-## tl;dr
-
-tbd
-
-## Description
-
-TBD
-
-## Solution
-
-TBD
-
-## DBRepo Features
-
-- [x] Import through CSV-dataset upload
-- [x] Data views implementing embargo period (24 hours)
-- [x] External access from Grafana Dashboard
diff --git a/.docs/examples/music.md b/.docs/examples/music.md
new file mode 100644
index 0000000000000000000000000000000000000000..02a848d59d02142a20a1ffeb403ca634eeb7fc59
--- /dev/null
+++ b/.docs/examples/music.md
@@ -0,0 +1,34 @@
+---
+author: Martin Weise
+---
+
+## tl;dr
+
+[:fontawesome-solid-database: Dataset](https://dbrepo1.ec.tuwien.ac.at/pid/34){ .md-button .md-button--primary target="_blank" }
+[:material-file-document: Archive](https://gitlab.tuwien.ac.at/martin.weise/fairnb){ .md-button .md-button--secondary target="_blank" }
+
+## Description
+
+We use a dataset collected by [Aljanaki et al.](https://www2.projects.science.uu.nl/memotion/emotifydata/), consisting
+of 400 MP3 music files, each having a playtime of one minute and labeled with one of four genres: rock, pop, classical
+and electronic. Each genre contains 100 files, and the genre is used as the label for the ML model. By generating MFCC
+vectors and training an SVM, the ML model can classify emotions of the provided .mp3 files with an accuracy of 76.25%.
+
+<figure markdown>
+{ .img-border }
+<figcaption>Figure 1: Accuracy of predictions matrix in Jupyter Notebook.</figcaption>
+</figure>
+
+## Solution
+
+DBRepo is used as relational data storage of the raw- and aggregated features, prediction results and the splits of the
+training- and test data.
For each of the 400 .mp3 files, 40 MFCC feature vectors are generated. This data is stored
+in aggregated form in the [`aggregated_features`](https://dbrepo1.ec.tuwien.ac.at/pid/47) table.
+
+## DBRepo Features
+
+- [x] Database as storage for machine learning data
+- [x] System versioning
+- [x] Subset exploration
+- [x] Precise & PID of database tables
+- [x] External data access for analysis
diff --git a/.docs/examples/power.md b/.docs/examples/power.md
index 4e86f6c30a2a313e34b6b9616c5968db62b30ea5..e85b1b98ce32b675829a67866fb3bef4340cd965 100644
--- a/.docs/examples/power.md
+++ b/.docs/examples/power.md
@@ -15,7 +15,7 @@ maintenance, quality of products and ultimately process efficiency and -producti
 <figure markdown>
-<figcaption>Figure 1: aaaa from <a href="https://publik.tuwien.ac.at/files/PubDat_252294.pdf">Hacksteiner (2016)</a>.</figcaption>
+<figcaption>Figure 1: Total power usage of machine floor TU Pilotfabrik, image from <a href="https://publik.tuwien.ac.at/files/PubDat_252294.pdf">Hacksteiner (2016)</a>.</figcaption>
 </figure>
 ## Solution
diff --git a/.docs/examples/survey.md b/.docs/examples/survey.md
new file mode 100644
index 0000000000000000000000000000000000000000..88eea632e066e4ce3ed250d47e7ba6e5c46b5c8f
--- /dev/null
+++ b/.docs/examples/survey.md
@@ -0,0 +1,38 @@
+---
+author: Martin Weise
+---
+
+## tl;dr
+
+[:fontawesome-solid-database: Dataset](https://handle.stage.datacite.org/10.82556/g2ac-vh88){ .md-button .md-button--primary target="_blank" }
+[:simple-jupyter: Notebook](https://binder.science.datalab.tuwien.ac.at/v2/git/https%3A%2F%2Fgitlab.tuwien.ac.at%2Fmartin.weise%2Ftres/HEAD){ .md-button .md-button--secondary target="_blank" }
+
+## Description
+
+This digital record contains historical air pollution and air quality data from approximately 20 air monitoring stations
+in Vienna, spanning the years from 1980 to 2021. The data was provided by the Umweltbundesamt and is stored in its
+original form in this record.
This record forms the basis of an analysis carried out in a bachelor's thesis at the TU
+Wien.
+
+## Solution
+
+
+<figure markdown>
+{ .img-border }
+<figcaption>Figure 1: Jupyter Notebook accessing data on DBRepo using the Python Library.</figcaption>
+</figure>
+
+## DBRepo Features
+
+- [x] Import complex dataset
+- [x] System versioning
+- [x] Subset exploration
+- [x] Aggregated views
+- [x] Precise & PID of queries tables
+- [x] External data access for analysis
+
+## Acknowledgement
+
+This work was part of a cooperation with the [Umweltbundesamt](https://www.umweltbundesamt.at/).
+
+<img src="../../images/logos/umweltbundesamt.png" width=100 />
\ No newline at end of file
diff --git a/.docs/images/logos/umweltbundesamt.png b/.docs/images/logos/umweltbundesamt.png
new file mode 100644
index 0000000000000000000000000000000000000000..4230cd264615bf50399d9742efaf49dfca445da6
Binary files /dev/null and b/.docs/images/logos/umweltbundesamt.png differ
diff --git a/.docs/images/screenshots/air-notebook.png b/.docs/images/screenshots/air-notebook.png
new file mode 100644
index 0000000000000000000000000000000000000000..8f1a21e405da9e2ac01d8b79475d40f8aa61924f
Binary files /dev/null and b/.docs/images/screenshots/air-notebook.png differ
diff --git a/.docs/images/screenshots/mfcc-jupyter.png b/.docs/images/screenshots/mfcc-jupyter.png
new file mode 100644
index 0000000000000000000000000000000000000000..36905661efce77254652ea60ea41629426695ba2
Binary files /dev/null and b/.docs/images/screenshots/mfcc-jupyter.png differ
diff --git a/mkdocs.yml b/mkdocs.yml
index f64ac77869749912be823e4e76682e715fa76928..8939c5fa446d9921fe20b21f9ca51d9713287e2d 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -45,9 +45,13 @@ nav:
 - UI:
 - Customization: api/ui.md
 - Examples:
- - Hazardous Materials: examples/hazard.md
- - Power in Industry 4.0: examples/power.md
- - Transportation Monitoring: examples/transportation.md
+ - Air Quality Data: examples/air.md
+ - COVID-19 Data:
examples/covid-19.md
+ - Hazard Data: examples/hazard.md
+ - Industry 4.0 Power Data: examples/power.md
+ - Survey Data: examples/survey.md
+ - Music-ML Data: examples/music.md
+ - Transportation Data: examples/transportation.md
 - XPS Data: examples/xps-data.md
 - publications.md
 - contact.md