Verified commit 7009df14 authored by Martin Weise

Added more examples

parent c936d771
2 merge requests: !296 Dev, !293 Dev
---
author: Martin Weise
---
## tl;dr
[:fontawesome-solid-database:  Dataset](https://handle.stage.datacite.org/10.82556/gd17-aq82){ .md-button .md-button--primary target="_blank" }
[:material-file-document:  Archive](https://doi.org/10.48436/mtha8-w2406){ .md-button .md-button--secondary target="_blank" }
## Description
This digital record contains historical air pollution and air quality data from approximately 20 air monitoring stations
in Vienna, spanning the years from 1980 to 2021. The data was provided by the Umweltbundesamt and is stored in its
original form in this record. This record forms the basis of an analysis carried out in a bachelor's thesis at the TU
Wien.
## Solution
<figure markdown>
![Jupyter Notebook](../../images/screenshots/air-notebook.png){ .img-border }
<figcaption>Figure 1: Jupyter Notebook accessing data on DBRepo using the Python Library.</figcaption>
</figure>
## DBRepo Features
- [x] Import complex dataset
- [x] System versioning
- [x] Subset exploration
- [x] Aggregated views
- [x] Precise identification & PID of queried tables
- [x] External data access for analysis
## Acknowledgement
This work was part of a cooperation with the [Umweltbundesamt](https://www.umweltbundesamt.at/).
<img src="../../images/logos/umweltbundesamt.png" width=100 />
\ No newline at end of file
---
author: Martin Weise
---
## tl;dr
[:fontawesome-solid-database: &nbsp;Dataset](https://dbrepo1.ec.tuwien.ac.at/pid/15){ .md-button .md-button--primary target="_blank" }
[:simple-github: &nbsp;Archive](https://github.com/CSSEGISandData/COVID-19){ .md-button .md-button--secondary target="_blank" }
## Description
This dataset contains the daily COVID-19 data provided publicly
by [Johns Hopkins University](https://coronavirus.jhu.edu/about/how-to-use-our-data).
## Solution
We imported the daily snapshots, provided as 1145 versioned .csv files in their Git repository archive, into DBRepo as
system-versioned data that can be queried. Because the COVID-19 pandemic was still ongoing during this project, new
snapshots arrived every day and the import script had to be maintained to keep the imports correct.
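
The import of daily snapshots into system-versioned data can be illustrated with a minimal sketch. It does not use
DBRepo or its import script; it only emulates the idea of system versioning with explicit `valid_from` / `valid_to`
columns in an in-memory SQLite database. The table name `cases`, the columns `region` and `confirmed`, the helper
functions and the sample values are all hypothetical and chosen purely for illustration.

```python
import csv
import io
import sqlite3


def import_snapshot(conn, snapshot_date, csv_text):
    """Load one daily snapshot and close row versions that it supersedes."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        region, confirmed = row["region"], int(row["confirmed"])
        current = conn.execute(
            "SELECT confirmed FROM cases WHERE region = ? AND valid_to IS NULL",
            (region,),
        ).fetchone()
        if current is not None and current[0] == confirmed:
            continue  # value unchanged, the current version stays open
        # Close the currently open version (if any) and open a new one.
        conn.execute(
            "UPDATE cases SET valid_to = ? WHERE region = ? AND valid_to IS NULL",
            (snapshot_date, region),
        )
        conn.execute(
            "INSERT INTO cases (region, confirmed, valid_from, valid_to) "
            "VALUES (?, ?, ?, NULL)",
            (region, confirmed, snapshot_date),
        )
    conn.commit()


def as_of(conn, date):
    """Return the data as it was visible on the given ISO date."""
    rows = conn.execute(
        "SELECT region, confirmed FROM cases "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) "
        "ORDER BY region",
        (date, date),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases (region TEXT, confirmed INTEGER, valid_from TEXT, valid_to TEXT)"
)
# Made-up sample snapshots, not real COVID-19 figures.
import_snapshot(conn, "2020-03-01", "region,confirmed\nA,10\nB,5\n")
import_snapshot(conn, "2020-03-02", "region,confirmed\nA,12\nB,5\n")
```

Querying `as_of(conn, "2020-03-01")` returns the state of the first snapshot even after the second one has been
imported, which is the property that makes earlier states of the data citable.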
## DBRepo Features
- [x] Data pipeline from Git repository
- [x] System versioning
- [x] Subset exploration
---
author: Martin Weise
---
## tl;dr
TBD
## Description
TBD
## Solution
TBD
## DBRepo Features
- [x] Import through CSV-dataset upload
- [x] Data views implementing embargo period (24 hours)
- [x] External access from Grafana Dashboard
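
A data view with a 24-hour embargo period could look roughly like the following sketch. It uses an in-memory SQLite
database rather than DBRepo, and the table `measurements`, the view `measurements_public` and their columns are
hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, value REAL, recorded_at TEXT)"
)
# The view only exposes rows that are at least 24 hours old.
conn.execute(
    """
    CREATE VIEW measurements_public AS
    SELECT id, value, recorded_at
    FROM measurements
    WHERE recorded_at <= datetime('now', '-24 hours')
    """
)
# One row outside the embargo period and one row still under embargo.
conn.execute(
    "INSERT INTO measurements (value, recorded_at) "
    "VALUES (1.5, datetime('now', '-2 days'))"
)
conn.execute(
    "INSERT INTO measurements (value, recorded_at) "
    "VALUES (2.5, datetime('now', '-1 hours'))"
)
public_ids = [
    row[0] for row in conn.execute("SELECT id FROM measurements_public ORDER BY id")
]
```

An external client such as a dashboard would then read from the view instead of the base table, so recent rows become
visible automatically once they are older than the embargo period.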
---
author: Martin Weise
---
## tl;dr
[:fontawesome-solid-database: &nbsp;Dataset](https://dbrepo1.ec.tuwien.ac.at/pid/34){ .md-button .md-button--primary target="_blank" }
[:material-file-document: &nbsp;Archive](https://gitlab.tuwien.ac.at/martin.weise/fairnb){ .md-button .md-button--secondary target="_blank" }
## Description
We use a dataset collected by [Aljanaki et al.](https://www2.projects.science.uu.nl/memotion/emotifydata/), consisting
of 400 MP3 music files, each with a playtime of one minute and labeled with one of four genres: rock, pop, classical
and electronic. Each genre contains 100 files, and the genre is used as the label for the ML model. By generating MFCC
vectors and training an SVM, the ML model can classify emotions of the provided .mp3 files with an accuracy of 76.25%.
<figure markdown>
![](../../images/screenshots/mfcc-jupyter.png){ .img-border }
<figcaption>Figure 1: Accuracy of predictions matrix in Jupyter Notebook.</figcaption>
</figure>
## Solution
DBRepo is used as relational data storage for the raw and aggregated features, the prediction results and the splits
into training and test data. For each of the 400 .mp3 files, 40 MFCC feature vectors are generated. This data is stored
in aggregated form in the [`aggregated_features`](https://dbrepo1.ec.tuwien.ac.at/pid/47) table.
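
How per-file features might be reduced to one aggregated row can be sketched as follows. The aggregation actually used
for the `aggregated_features` table is not stated here, so the sketch assumes a simple mean per MFCC coefficient across
frames. The function name `aggregate_mfcc` and the tiny three-coefficient input are hypothetical and only keep the
example short.

```python
from statistics import fmean


def aggregate_mfcc(frames):
    """Collapse per-frame MFCC vectors into one mean value per coefficient.

    Assumption: the mean is used as the aggregate; other statistics
    (variance, min, max) would follow the same column-wise pattern.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_coefficients = len(frames[0])
    if any(len(frame) != n_coefficients for frame in frames):
        raise ValueError("all frames must have the same number of coefficients")
    # zip(*frames) turns the list of frames into one column per coefficient.
    return [fmean(column) for column in zip(*frames)]


# Two made-up frames with three coefficients each (real input would have 40).
frames = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
row = aggregate_mfcc(frames)
```

Each resulting row has a fixed length regardless of how many frames a file produces, which is what allows the features
to be stored in a relational table with one column per coefficient.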
## DBRepo Features
- [x] Database as storage for machine learning data
- [x] System versioning
- [x] Subset exploration
- [x] Precise identification & PID of database tables
- [x] External data access for analysis
......@@ -15,7 +15,7 @@ maintenance, quality of products and ultimately process efficiency and -producti
<figure markdown>
![](../../images/screenshots/power.png)
<figcaption>Figure 1: aaaa from <a href="https://publik.tuwien.ac.at/files/PubDat_252294.pdf">Hacksteiner (2016)</a>.</figcaption>
<figcaption>Figure 1: Total power usage of machine floor TU Pilotfabrik, image from <a href="https://publik.tuwien.ac.at/files/PubDat_252294.pdf">Hacksteiner (2016)</a>.</figcaption>
</figure>
## Solution
......
---
author: Martin Weise
---
## tl;dr
[:fontawesome-solid-database: &nbsp;Dataset](https://handle.stage.datacite.org/10.82556/g2ac-vh88){ .md-button .md-button--primary target="_blank" }
[:simple-jupyter: &nbsp;Notebook](https://binder.science.datalab.tuwien.ac.at/v2/git/https%3A%2F%2Fgitlab.tuwien.ac.at%2Fmartin.weise%2Ftres/HEAD){ .md-button .md-button--secondary target="_blank" }
## Description
This digital record contains historical air pollution and air quality data from approximately 20 air monitoring stations
in Vienna, spanning the years from 1980 to 2021. The data was provided by the Umweltbundesamt and is stored in its
original form in this record. This record forms the basis of an analysis carried out in a bachelor's thesis at the TU
Wien.
## Solution
<figure markdown>
![Jupyter Notebook](../../images/screenshots/air-notebook.png){ .img-border }
<figcaption>Figure 1: Jupyter Notebook accessing data on DBRepo using the Python Library.</figcaption>
</figure>
## DBRepo Features
- [x] Import complex dataset
- [x] System versioning
- [x] Subset exploration
- [x] Aggregated views
- [x] Precise identification & PID of queried tables
- [x] External data access for analysis
## Acknowledgement
This work was part of a cooperation with the [Umweltbundesamt](https://www.umweltbundesamt.at/).
<img src="../../images/logos/umweltbundesamt.png" width=100 />
\ No newline at end of file
.docs/images/logos/umweltbundesamt.png

9.24 KiB

.docs/images/screenshots/air-notebook.png

124 KiB

.docs/images/screenshots/mfcc-jupyter.png

79.5 KiB

......@@ -45,9 +45,13 @@ nav:
- UI:
- Customization: api/ui.md
- Examples:
- Hazardous Materials: examples/hazard.md
- Power in Industry 4.0: examples/power.md
- Transportation Monitoring: examples/transportation.md
- Air Quality Data: examples/air.md
- COVID-19 Data: examples/covid-19.md
- Hazard Data: examples/hazard.md
- Industry 4.0 Power Data: examples/power.md
- Survey Data: examples/survey.md
- Music-ML Data: examples/music.md
- Transportation Data: examples/transportation.md
- XPS Data: examples/xps-data.md
- publications.md
- contact.md
......