diff --git a/docs/deployment.md b/docs/deployment.md index 45d793c39b32c992bd33a29de8c1af827151f756..3a5bd3e9f6fe66cd5e54417b1b67365357ddcbd1 100644 --- a/docs/deployment.md +++ b/docs/deployment.md @@ -29,20 +29,101 @@ it is *highly encouraged* to use your own certificate, properly issued by a trus you can use the self-signed certificate. You need to accept the risk in most browsers when visiting the [admin panel](https://localhost:8443/admin/). -<figure markdown> - -<figcaption>Google Chrome warning about the self-signed certificate</figcaption> -</figure> - Sign in with the default credentials (username `fda`, password `fda`) or the one you configured during set-up. Be default, users are created using the frontend and the sign-up page. But it is also possible to create users from Keycloak, they will still act as "self-sign-up" created users. Since we do not support all features of Keycloak, leave out required user actions as they will not be enforced, also the temporary password. -<figure markdown> - -<figcaption>Alternative user creation via Keycloak</figcaption> -</figure> +### Obtain Access Token + +=== "Terminal" + + ``` console + curl -X POST \ + -d "username=foo&password=bar&grant_type=password&client_id=dbrepo-client&scope=openid&client_secret=MUwRc7yfXSJwX8AdRMWaQC3Nep1VjwgG" \ + http://localhost/api/auth/realms/dbrepo/protocol/openid-connect/token + ``` + +=== "Python" + + ``` py + import requests + + auth = requests.post("http://localhost/api/auth/realms/dbrepo/protocol/openid-connect/token", data={ + "username": "foo", + "password": "bar", + "grant_type": "password", + "client_id": "dbrepo-client", + "scope": "openid", + "client_secret": "MUwRc7yfXSJwX8AdRMWaQC3Nep1VjwgG" + }) + print(auth.json()["access_token"]) + ``` + +### Refresh Access Token + +=== "Terminal" + + ``` console + curl -X POST \ + -d "grant_type=refresh_token&client_id=dbrepo-client&refresh_token=THE_REFRESH_TOKEN&client_secret=MUwRc7yfXSJwX8AdRMWaQC3Nep1VjwgG" \ + 
http://localhost/api/auth/realms/dbrepo/protocol/openid-connect/token + ``` + +=== "Python" + + ``` py + import requests + + auth = requests.post("http://localhost/api/auth/realms/dbrepo/protocol/openid-connect/token", data={ + "grant_type": "refresh_token", + "client_id": "dbrepo-client", + "client_secret": "MUwRc7yfXSJwX8AdRMWaQC3Nep1VjwgG", + "refresh_token": "THE_REFRESH_TOKEN" + }) + print(auth.json()["access_token"]) + ``` + +## Broker Service + +### Authentication + +The RabbitMQ client can be authenticated through plain (username, password) and OAuth2 mechanisms. Note that the access +token already contains a field `client_id=foo`, so the username is optional in `PlainCredentials()`. + +=== "Plain" + + ``` py + import pika + + credentials = pika.credentials.PlainCredentials("foo", "bar") + parameters = pika.ConnectionParameters('localhost', 5672, '/', credentials) + connection = pika.BlockingConnection(parameters) + channel = connection.channel() + channel.queue_declare(queue='test', durable=True) + channel.basic_publish(exchange='', + routing_key='test', + body=b'Hello World!') + print(" [x] Sent 'Hello World!'") + connection.close() + ``` + +=== "OAuth2" + + ``` py + import pika + + credentials = pika.credentials.PlainCredentials("", "THE_ACCESS_TOKEN") + parameters = pika.ConnectionParameters('localhost', 5672, '/', credentials) + connection = pika.BlockingConnection(parameters) + channel = connection.channel() + channel.queue_declare(queue='test', durable=True) + channel.basic_publish(exchange='', + routing_key='test', + body=b'Hello World!') + print(" [x] Sent 'Hello World!'") + connection.close() + ``` ## Identifier Service diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css index 4dce5a10fd61b9dc7ebbea2cf75dba0806a4763a..18179736230f70ba31abd3461adac20ed870788d 100644 --- a/docs/stylesheets/extra.css +++ b/docs/stylesheets/extra.css @@ -8,14 +8,14 @@ border: 1px solid var(--md-primary-fg-color) } -.md-main .md-content a, -.md-main 
.md-content a { +.md-main .md-content a:not(.action-button), +.md-main .md-content a:not(.action-button) { color: var(--md-typeset-color); border-bottom: 2px solid var(--md-primary-fg-color); } -.md-main .md-content a:focus, -.md-main .md-content a:hover { +.md-main .md-content a:not(.action-button):focus, +.md-main .md-content a:not(.action-button):hover { color: var(--md-typeset-color); border-bottom: 2px solid var(--md-primary-fg-color--dark); } diff --git a/docs/system.md b/docs/system.md index a1d29301f5894e48998d867dd67d883699f4db65..bc043644dc1f40942b5694cdf00a824a91e63f64 100644 --- a/docs/system.md +++ b/docs/system.md @@ -7,6 +7,10 @@ hide: # System +!!! info "Abstract" + + This is the full system description from a technical/developer view. + We invite all open-source developers to help us fixing bugs and introducing features to the source code. Get involved by sending a mail to Prof. Andreas Rauber and Projektass. Martin Weise. @@ -20,28 +24,18 @@ technologies. The conceptualized microservices operate the basic database operat View the docker images for the documentation of the service. -### Discovery Service - -This microservice allows service discovery and registration of containers that provide services. It configures -a [Spring Cloud Netflix Eureka Server](https://cloud.spring.io/spring-cloud-netflix/reference/html/) to discover -services. - -!!! debug "Debug Information" - - * Port(s): 9090 - * Docker Image: [dbrepo/discovery-service](https://hub.docker.com/repository/docker/dbrepo/discovery-service) - * Swagger: not configured - -### Gateway Service +### Analyse Service -Provides a single point of access to the *application programming interface* (API) and configures -the [Spring Cloud Gateway](https://spring.io/projects/spring-cloud-gateway) to route traffic to the services. +It suggests data types for the FAIR Portal when creating a table from a *comma separated values* (CSV) file. It +recommends enumerations for columns and returns e.g. 
a list of potential primary key candidates. The researcher is able +to confirm these suggestions manually. Moreover, the *Analyse Service* determines basic statistical properties of +numerical columns. !!! debug "Debug Information" - * Port(s): 9095 - * Docker Image: [dbrepo/gateway-service](https://hub.docker.com/repository/docker/dbrepo/gateway-service) - * Swagger: not configured + * Ports: 5000/tcp + * Prometheus: `http://:5000/metrics` + * Swagger UI: `http://:5000/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/analyse) ### Authentication Service @@ -51,113 +45,122 @@ through an encrypted channel. !!! debug "Debug Information" - * Port(s): 9097 - * Docker Image: [dbrepo/authentication-service](https://hub.docker.com/repository/docker/dbrepo/authentication-service) - * Swagger UI: [/swagger/authentication](/swagger/authentication) + * Ports: 8080/tcp, 8443/tcp + * Admin Console: `http://:8443/` -### Metadata Database - -It is the core component of the project. It is a relational database that contains metadata about all researcher databases -created in the database repository like column names, check expressions, value enumerations or key/value constraints and -relevant data for citing data sets. Additionally, the concept, e.g. URI of units of measurements of numerical columns is -stored in the Metadata Database in order to provide semantic knowledge context. We use MariaDB for its rich capabilities -in the reference implementation. +### Broker Service -The default credentials are `root:dbrepo` for the database `fda`. Connect to the database via the JDBC connector. +It holds exchanges and topics responsible for holding AMQP messages for later consumption. We +use [RabbitMQ](https://www.rabbitmq.com/) in the reference implementation. The AMQP endpoint listens on port `5672` for +regular declares and offers a management interface on port `15672`. !!!
debug "Debug Information" - * Port(s): 3306 + * Ports: 5672/tcp, 15672/tcp + * RabbitMQ Management Plugin: `http://:15672` + * RabbitMQ Prometheus Plugin: `http://:15692/metrics` -### Semantics Service +### Container Service -It is designed to map terms in the domain of units of measurement to controlled vocabulary, modelled in -the [ontology of units of measure](https://github.com/HajoRijgersberg/OM). This service validates researcher provided in -units and provides a *uniform resource identifier* (URI) to the related concept, which will be stored in the system. -Furthermore, there is a method for auto-completing text and listing a description as well as commonly used unit symbols. +It is responsible for Docker container lifecycle operations and updating the local copy of the Docker images. !!! debug "Debug Information" - * Port(s): 5010 - * Docker Image: [dbrepo/semantics-service](https://hub.docker.com/repository/docker/dbrepo/semantics-service) - * Swagger UI: [/swagger/semantics](/swagger/semantics) + * Ports: 9091/tcp + * Info: `http://:9091/actuator/info` + * Health: `http://:9091/actuator/health` + * Prometheus: `http://:9091/actuator/prometheus` + * Swagger UI: `http://:9091/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/container) -### Identifier Service +### Database Service -This microservice is responsible for creating and resolving a *persistent identifier* (PID) attached to a query to -obtain the metadata attached to it and allow re-execution of a query. We store both the query and hashes of the query -and result set to allow equality checks of the originally obtained result set and the currently obtained result set. In -the reference implementation we currently only use a numerical id column and plan to integrate *digital object -identifier* (DOI) through our institutional library soon. +It creates the databases inside a Docker container and the Query Store. 
Currently, we only +support [MariaDB](https://mariadb.org/) images that allow table versioning with low programmatic effort. !!! debug "Debug Information" - * Port(s): 9096 - * Docker Image: [dbrepo/identifier-service](https://hub.docker.com/repository/docker/dbrepo/identifier-service) - * Swagger UI: [/swagger/identifier](/swagger/identifier) + * Ports: 9092/tcp + * Info: `http://:9092/actuator/info` + * Health: `http://:9092/actuator/health` + * Prometheus: `http://:9092/actuator/prometheus` + * Swagger UI: `http://:9092/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/database) -### Search Service +### Discovery Service -It processes search requests from the Gateway Service for full-text lookups in the Metadata Database. We use -[Elasticsearch](https://www.elastic.co/) in the reference implementation. +This microservice allows service discovery and registration of containers that provide services. It configures +a [Spring Cloud Netflix Eureka Server](https://cloud.spring.io/spring-cloud-netflix/reference/html/) to discover +services. -The Search Service implements ElasticSearch and creates a retrievable index on all databases that is getting updated -with each save operation on databases in the metadata database. The database name can be queried with ElasticSearch -to e.g. match the term "Airquality" +!!! debug "Debug Information" + + * Ports: 9090/tcp + * Info: `http://:9090/actuator/info` + * Health: `http://:9090/actuator/health` + * Prometheus: `http://:9090/actuator/prometheus` + * Eureka Dashboard: `http://:9090/` -```console -curl http://localhost:9200/databaseindex/_search?q=name:Airquality -``` +### Gateway Service + +Provides a single point of access to the *application programming interface* (API) and configures +the [Spring Cloud Gateway](https://spring.io/projects/spring-cloud-gateway) to route traffic to the services. !!! 
debug "Debug Information" - * Port(s): 9200, 9600 - * Docker Image: [elasticsearch](https://hub.docker.com/_/elasticsearch) - * ElasticSearch Index + * Ports: 9095/tcp + * Info: `http://:9095/actuator/info` + * Health: `http://:9095/actuator/health` + * Prometheus: `http://:9095/actuator/prometheus` -### Container Service +### Identifier Service -It is responsible for Docker container lifecycle operations and updating the local copy of the Docker images. +This microservice is responsible for creating and resolving a *persistent identifier* (PID) attached to a query to +obtain the metadata attached to it and allow re-execution of a query. We store both the query and hashes of the query +and result set to allow equality checks of the originally obtained result set and the currently obtained result set. In +the reference implementation we currently only use a numerical id column and plan to integrate *digital object +identifier* (DOI) through our institutional library soon. !!! debug "Debug Information" - * Port(s): 9091 - * Docker Image: [dbrepo/container-service](https://hub.docker.com/repository/docker/dbrepo/container-service) - * Swagger UI: [/swagger/container](/swagger/container) + * Ports: 9096/tcp + * Info: `http://:9096/actuator/info` + * Health: `http://:9096/actuator/health` + * Prometheus: `http://:9096/actuator/prometheus` + * Swagger UI: `http://:9096/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/identifier) -### Database Service +### Metadata Database -It creates the databases inside a Docker container and the Query Store. Currently we only -support [MariaDB](https://mariadb.org/) images that allow table versioning with low programmatic effort. +It is the core component of the project. 
It is a relational database that contains metadata about all researcher databases +created in the database repository such as column names, check expressions, value enumerations or key/value constraints and +relevant data for citing data sets. Additionally, the concept (e.g. the URI) of the unit of measurement of numerical columns is +stored in the Metadata Database in order to provide semantic knowledge context. We use MariaDB for its rich capabilities +in the reference implementation. + +The default credentials are `root:dbrepo` for the database `fda`. Connect to the database via the JDBC connector on port `3306`. !!! debug "Debug Information" - * Port(s): 9092 - * Docker Image: [dbrepo/database-service](https://hub.docker.com/repository/docker/dbrepo/database-service) - * Swagger UI: [/swagger/database](/swagger/database) + * Ports: 3306/tcp, 9100/tcp + * Prometheus: `http://:9100/metrics` -### Table Service +### Metadata Service -This microservice handles table operations inside a database that is managed by the Database Service. We -use [Hibernate](https://hibernate.org/orm/) for schema and data ingest operations. +This service provides an OAI-PMH endpoint for metadata crawlers. !!! debug "Debug Information" - * Port(s): 9094 - * Docker Image: [dbrepo/table-service](https://hub.docker.com/repository/docker/dbrepo/table-service) - * Swagger UI: [/swagger/table](/swagger/table) + * Ports: 9099/tcp + * Info: `http://:9099/actuator/info` + * Health: `http://:9099/actuator/health` + * Prometheus: `http://:9099/actuator/prometheus` + * Swagger UI: `http://:9099/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/metadata) -### Broker Service +### Proxy -It holds exchanges and topics responsible for holding AMQP messages for later consumption. We -use [RabbitMQ](https://www.rabbitmq.com/) in the reference implementation. The AMQP endpoint listens to port 5672 for -regular declares and offers a management interface at port 15672.
+The NGINX reverse proxy bundles the services and enables SSL/TLS communication for all endpoints. !!! debug "Debug Information" - * Port(s): 9098, 5672, 15672 - * Docker Image: [dbrepo/broker-service](https://hub.docker.com/repository/docker/dbrepo/broker-service) - * RabbitMQ Management Plugin + * Ports: 80/tcp, 443/tcp ### Query Service @@ -167,67 +170,70 @@ Service. !!! debug "Debug Information" - * Port(s): 9093 - * Docker Image: [dbrepo/query-service](https://hub.docker.com/repository/docker/dbrepo/query-service) - * Swagger UI: [/swagger/query](/swagger/query) -### FAIR Portal + * Ports: 9093/tcp + * Info: `http://:9093/actuator/info` + * Health: `http://:9093/actuator/health` + * Prometheus: `http://:9093/actuator/prometheus` + * Swagger UI: `http://:9093/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/query) -It provides a *graphical user interface* (GUI) for a researcher to interact with the database repository's API. +### Search Service + +It processes search requests from the Gateway Service for full-text lookups in the Metadata Database. We use +[Elasticsearch](https://www.elastic.co/) in the reference implementation. The Search Service uses Elasticsearch +and creates a retrievable index on all databases that is updated with each save operation on databases in the +Metadata Database. + +All requests need to be authenticated; by default, the credentials `elastic:elastic` are used. !!! debug "Debug Information" - * Port(s): 3000 - * Docker Image: [dbrepo/ui](https://hub.docker.com/repository/docker/dbrepo/ui) - * FAIR Portal + * Ports: 9200/tcp + * Indices: `http://:9200/_all` -### Analyse Service +### Semantics Service -It suggests data types for the FAIR Portal when creating a table from a *comma separated values* (CSV) file. It -recommends enumerations for columns and returns e.g. a list of potential primary key candidates. The researcher is able -to confirm these suggestions manually.
Moreover, the *Analyze Service* determines basic statistical properties of -numerical columns. +It is designed to map terms in the domain of units of measurement to controlled vocabulary, modelled in +the [ontology of units of measure](https://github.com/HajoRijgersberg/OM). This service validates units provided by +researchers and provides a *uniform resource identifier* (URI) to the related concept, which will be stored in the system. +Furthermore, there is a method for auto-completing text and listing a description as well as commonly used unit symbols. + +!!! debug "Debug Information" + + * Ports: 5010/tcp + * Prometheus: `http://:5010/metrics` + * Swagger UI: `http://:5010/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/semantics) + +### Table Service + +This microservice handles table operations inside a database that is managed by the Database Service. We +use [Hibernate](https://hibernate.org/orm/) for schema and data ingest operations. !!! debug "Debug Information" - * Port(s): 5000 - * Docker Image: [dbrepo/analyse-service](https://hub.docker.com/repository/docker/dbrepo/analyse-service) - * Swagger UI: [/swagger/analyse](/swagger/analyse) + * Ports: 9094/tcp + * Info: `http://:9094/actuator/info` + * Health: `http://:9094/actuator/health` + * Prometheus: `http://:9094/actuator/prometheus` + * Swagger UI: `http://:9094/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/table) -## Database +### UI -This container runs a relational database engine that allows data versioning and contains the Query Store, a special -table that stores all queries issued to the Researcher Database along with metadata. We store the queries here and not -in the metadata database level to ensure that they are preserved along with the original database for a regular backup -and archival together with the original database once the container is retired.
+It provides a *graphical user interface* (GUI) for a researcher to interact with the database repository's API. -### Container +!!! debug "Debug Information" -Currently, we only support databases with -the [MariaDB engine](https://hub.docker.com/_/mariadb?tab=tags&page=1&name=10.5&ordering=-name). -DBRepo creates a *root* user for managing the tables, inserting data, etc. and provides a *mariadb* user that is only -granted `select` access to all tables. The default passwords need to be changed at -[`AbstractSeeder.java`](https://gitlab.phaidra.org/fair-data-austria-db-repository/fda-services/-/blob/master/fda-container-service/services/src/main/java/at/tuwien/seeder/impl/AbstractSeeder.java#L39-L51) + * Ports: 3000/tcp, 9100/tcp + * Prometheus: `http://:9100/metrics` + * UI: `http://:3000/` -### Query Store +### User Service -The Query Store is a special table (`qs_queries`) that stores all queries issued to the database via the HTTP API. It -stores meta-information about the queries directly in the database container: +This microservice handles user information. -<figure markdown> -| Name | Type | Constraint | Default | Comment | -|------------------|--------------|-------------|-------------------------|-------------------------------| -| id | bigint | primary key | nextval(qs_queries_seq) | | -| cid | bigint | | | Column ID | -| dbid | bigint | | | Database ID | -| created | datetime | | now() | | -| created_by | bigint | | | Creator User-ID | -| execution | datetime | | | | -| last_modified | datetime | | | | -| query | text | | | | -| query_normalized | text | | | removing *, randomness | -| query_hash | varchar(255) | | | sha256 hash of `query` field | -| result_hash | varchar(255) | | | sha256 hash of the result set | -| result_number | bigint | | | | +!!! 
debug "Debug Information" -<figcaption>Query Store table <code>qs_queries</code> schema</figcaption> -</figure> + * Ports: 9098/tcp + * Info: `http://:9098/actuator/info` + * Health: `http://:9098/actuator/health` + * Prometheus: `http://:9098/actuator/prometheus` + * Swagger UI: `http://:9098/swagger-ui/index.html` [:fontawesome-solid-square-up-right: view online](/swagger/user)
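
The `Debug Information` blocks added to `docs/system.md` list the same three actuator paths (`/actuator/info`, `/actuator/health`, `/actuator/prometheus`) for several services, differing only in the port. A minimal sketch of how those URLs could be assembled is given below. The helper name `debug_endpoints`, the `ACTUATOR_PORTS` mapping and the default host `localhost` are assumptions for illustration only and are not part of DBRepo; the ports are copied from the lists above.

```python
# Ports copied from the "Debug Information" blocks in docs/system.md for the
# services that list /actuator endpoints. The mapping itself is hypothetical.
ACTUATOR_PORTS = {
    "container": 9091,
    "database": 9092,
    "discovery": 9090,
    "gateway": 9095,
    "identifier": 9096,
    "metadata": 9099,
    "query": 9093,
    "table": 9094,
    "user": 9098,
}


def debug_endpoints(service: str, host: str = "localhost") -> dict[str, str]:
    """Build the info/health/prometheus URLs for one service.

    Hypothetical helper; `host` is an assumption, since the documentation
    leaves the hostname open.
    """
    port = ACTUATOR_PORTS[service]  # raises KeyError for unknown services
    base = f"http://{host}:{port}/actuator"
    return {name: f"{base}/{name}" for name in ("info", "health", "prometheus")}


if __name__ == "__main__":
    # Print the health URL of every service, sorted by service name.
    for name in sorted(ACTUATOR_PORTS):
        print(name, debug_endpoints(name)["health"])
```

Keeping the ports in one mapping means a changed port only needs to be updated in a single place; the resulting URLs could then be passed to `requests.get` or a monitoring configuration.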