Commit 20a96477 authored by Janos Bekesi

workflow example

parent 295fa6b8
Tags 1.0.0
# Newseye
Newseye Jupyter Notebook: János Békési, Martin Gasteiner
## Data
22 MB of Transkribus JSON data (down to article level), or 12 MB of the same data as CSV
## Workflow
Generating topic models can be rather tedious: difficulties can arise with the input data
(datetime series with different start or end points, unforeseen gaps, etc.), and the output
formatting has to account for presentation quirks or sequence fitting. The time spent will
usually exceed the estimate, but most of these obstacles can be overcome with a bit of
patience and thoughtfulness.
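
As an illustration of the datetime issue, here is a minimal sketch (not the notebook's own code) that puts several series with different start and end points onto one shared daily axis and fills the gaps. The function name `align_series`, the topic names, and the fill value `0.0` are assumptions made for this example.

```python
from datetime import date, timedelta


def align_series(series_by_name, fill=0.0):
    """Reindex every series onto one shared daily date axis.

    series_by_name maps a name to a {date: value} dict (hypothetical layout).
    Dates missing from a series are filled with `fill`.
    """
    all_dates = [d for s in series_by_name.values() for d in s]
    start, end = min(all_dates), max(all_dates)
    n_days = (end - start).days + 1
    axis = [start + timedelta(days=i) for i in range(n_days)]
    aligned = {
        name: [s.get(d, fill) for d in axis]
        for name, s in series_by_name.items()
    }
    return axis, aligned


# Two made-up topic weight series with different ranges and gaps.
topic_a = {date(1914, 7, 28): 0.4, date(1914, 7, 30): 0.1}
topic_b = {date(1914, 7, 29): 0.3, date(1914, 7, 31): 0.2}

axis, aligned = align_series({"topic_a": topic_a, "topic_b": topic_b})
for name, values in aligned.items():
    print(name, values)
```

Whether a gap should be filled with zero, interpolated, or left empty depends on what the series measures; zero is only a placeholder choice here.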
When preparing data for topic modelling, especially scanned data, it is crucial to account for
some document structure (pages, sections, documents).
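
One way to keep that structure is to flatten the nested data into one record per article while carrying the issue and page identifiers along. The sketch below assumes a hypothetical nesting (`pages`, `articles`, `id`, `number`, `text`); the actual field names in the Transkribus JSON export may differ.

```python
def flatten_articles(issue):
    """Turn one nested issue into flat article records.

    The keys used here are hypothetical and only illustrate the idea of
    keeping issue/page/article identifiers next to each text.
    """
    records = []
    for page in issue.get("pages", []):
        for article in page.get("articles", []):
            records.append({
                "issue_id": issue.get("id"),
                "page": page.get("number"),
                "article_id": article.get("id"),
                # Collapse line breaks and repeated spaces left by OCR.
                "text": " ".join(article.get("text", "").split()),
            })
    return records


# Made-up sample with two pages.
sample_issue = {
    "id": "issue-1",
    "pages": [
        {"number": 1, "articles": [
            {"id": "a1", "text": "First  article\ntext"},
            {"id": "a2", "text": "Second article"},
        ]},
        {"number": 2, "articles": [
            {"id": "a3", "text": "Third article"},
        ]},
    ],
}

records = flatten_articles(sample_issue)
for record in records:
    print(record)
```

With records in this shape, texts can later be grouped per page, per article, or per issue without going back to the source files.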