Merged: Martin Weise requested to merge `dev` into `master` (45 files changed: +878 −667; this file: +57 −68).
This package supports Python 3.11+.

## Quickstart
Get public data from a table as pandas `DataFrame`:

```python
from dbrepo.RestClient import RestClient

client = RestClient(endpoint="https://dbrepo1.ec.tuwien.ac.at")
# Get a small data slice of just three rows
df = client.get_table_data(database_id=7, table_id=13, page=0, size=3, df=True)
print(df)
#     x_coord         component   unit  ...  value stationid meantype
# 0  16.52617  Feinstaub (PM10)  µg/m³  ...   21.0   01:0001      HMW
# 1  16.52617  Feinstaub (PM10)  µg/m³  ...   23.0   01:0001      HMW
# 2  16.52617  Feinstaub (PM10)  µg/m³  ...   26.0   01:0001      HMW
#
# [3 rows x 12 columns]
```
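Because `df=True` returns a plain pandas `DataFrame`, the slice can be analysed with standard pandas calls. A minimal sketch on a stand-in frame with the same columns (the values below are illustrative, not real sensor readings):

```python
import pandas as pd

# Stand-in for the slice returned by get_table_data(..., df=True)
df = pd.DataFrame({
    "x_coord": [16.52617, 16.52617, 16.52617],
    "component": ["Feinstaub (PM10)"] * 3,
    "unit": ["µg/m³"] * 3,
    "value": [21.0, 23.0, 26.0],
    "stationid": ["01:0001"] * 3,
})

# Ordinary pandas operations work as usual on the result
mean_value = df["value"].mean()
print(mean_value)
```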
Import data into a table:

```python
import pandas as pd

from dbrepo.RestClient import RestClient

client = RestClient(endpoint="https://dbrepo1.ec.tuwien.ac.at", username="foo",
                    password="bar")
df = pd.DataFrame(data={'x_coord': 16.52617, 'component': 'Feinstaub (PM10)',
                        'unit': 'µg/m³', ...})
client.import_table_data(database_id=7, table_id=13, file_name_or_data_frame=df)
```
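Since `import_table_data` accepts a `DataFrame`, an existing CSV file can be loaded with pandas first and then uploaded. A self-contained sketch (the in-memory CSV payload stands in for a file on disk; column names are taken from the example above):

```python
import io

import pandas as pd

# Illustrative CSV payload; in practice this would be open("sensor.csv")
csv_payload = io.StringIO(
    "x_coord,component,unit,value\n"
    "16.52617,Feinstaub (PM10),µg/m³,21.0\n"
    "16.52617,Feinstaub (PM10),µg/m³,23.0\n"
)
df = pd.read_csv(csv_payload)
# df can now be passed as file_name_or_data_frame to import_table_data
print(df.shape)
```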
## Supported Features & Best-Practices

- Manage user account ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/1.4.4/api/#create-user-account))
- Manage databases ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo//usage-overview/#create-database))
- Manage database access & visibility ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/1.4.4/api/#create-database))
- Import dataset ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/1.4.4/api/#import-dataset))
- Create persistent identifiers ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/1.4.4/api/#assign-database-pid))
- Execute queries ([docs](https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/1.4.4/api/#export-subset))
- Get data from tables/views/subsets
## Configure

All credentials can optionally be set/overridden with environment variables. This is especially useful when sharing
Jupyter Notebooks by creating an invisible `.env` file and loading it:

```properties title=".env"
REST_API_ENDPOINT="https://dbrepo1.ec.tuwien.ac.at"
REST_API_USERNAME="foo"
REST_API_PASSWORD="bar"
REST_API_SECURE="True"
AMQP_API_HOST="https://dbrepo1.ec.tuwien.ac.at"
AMQP_API_PORT="5672"
AMQP_API_USERNAME="foo"
AMQP_API_PASSWORD="bar"
AMQP_API_VIRTUAL_HOST="dbrepo"
REST_UPLOAD_ENDPOINT="https://dbrepo1.ec.tuwien.ac.at/api/upload/files"
```
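In a notebook, the `.env` file still has to be loaded into the process environment before the client is constructed; `python-dotenv`'s `load_dotenv()` is the usual choice. A minimal stdlib-only sketch of the same idea (the file name and `load_env` helper are illustrative, not part of the dbrepo API, and quoting edge cases are ignored):

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: put KEY="value" lines into os.environ."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # skip blanks, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')

# Example: write and load a throwaway env file
with open("demo.env", "w", encoding="utf-8") as fh:
    fh.write('REST_API_USERNAME="foo"\nREST_API_PASSWORD="bar"\n')
load_env("demo.env")
print(os.environ["REST_API_USERNAME"])  # prints foo
```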
You can reduce logging verbosity by raising the log level to e.g. `INFO`:

```python
import logging

from dbrepo.RestClient import RestClient

logging.getLogger().setLevel(logging.INFO)
...
client = RestClient(...)
```
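Level filtering works the same way on any logger: once the level is `INFO`, `DEBUG` records are dropped before they reach the handlers. A self-contained sketch demonstrating this (the logger name and messages are made up for illustration):

```python
import logging

records = []

class ListHandler(logging.Handler):
    # Collect emitted messages so we can inspect what got through
    def emit(self, record):
        records.append(record.getMessage())

logger = logging.getLogger("dbrepo-demo")
logger.addHandler(ListHandler())
logger.propagate = False
logger.setLevel(logging.INFO)

logger.debug("request payload ...")  # filtered out at INFO level
logger.info("request sent")          # passes the level check

print(records)  # prints ['request sent']
```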
## Future