From 99d11d0d4ff001e4cfbb040bfe2666f9070fbdad Mon Sep 17 00:00:00 2001 From: MB <michael.blaschek@univie.ac.at> Date: Wed, 9 Oct 2024 13:14:12 +0200 Subject: [PATCH] updated ECMWF infos --- ECMWF/README.md | 285 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 236 insertions(+), 49 deletions(-) diff --git a/ECMWF/README.md b/ECMWF/README.md index 96dae08..ccf3768 100644 --- a/ECMWF/README.md +++ b/ECMWF/README.md @@ -2,7 +2,8 @@ # European Center for Medium-Range Weather Forecast {: width="400px"} -[website](https://www.ecmwf.int) / [service status](https://www.ecmwf.int/en/service-status) / [confluence](https://confluence.ecmwf.int) / [support](https://support.ecmwf.int) / [accounting](https://www.ecmwf.int/user) +[website](https://www.ecmwf.int) / [service status](https://www.ecmwf.int/en/service-status) / [confluence](https://confluence.ecmwf.int) / [support](https://support.ecmwf.int) / [accounting](https://www.ecmwf.int/user) / [jupyterhub](https://jupyterhub.ecmwf.int) + If you need access, talk to your supervisor to create an account for you. You will get a username and a password as well as an OTP device (hardware or smartphone). Accounts are handled via [www.ecmwf.int](https://www.ecmwf.int/user/login) @@ -10,19 +11,49 @@ Available Services @ IMGW: - [ecaccess](#connecting-via-ecaccess) - [ecgateway](#ecaccess-gateway) - + ## Connecting to ECMWF Services ???+ warning "Teleport supported versions" Please note that ECMWF does currently only support teleport version up to 13. [ecmwf information](https://confluence.ecmwf.int/display/UDOC/Teleport+SSH+Access) +### from home + +You need to download the appropriate package from [teleport](https://goteleport.com/download/), selecting **major version 13** and whatever sub-version is available, e.g. 13.4.26 (as of 9.10.2024). Choose your OS (Linux, Mac, Windows) and download the package. 
On Linux you can also use the teleport install script: + +```sh +# adjust to recent version +curl https://goteleport.com/static/install.sh | bash -s 13.4.26 + +# login using the ECMWF teleport server +# this opens a browser and you need to log in to ECMWF with OTP. +tsh login --proxy=jump.ecmwf.int +If browser window does not open automatically, open it by clicking on the link: + http://127.0.0.1:38615/f5df50f4-35bf-4f88-a2dc-2a271df6e1d5 + +# finally you should get a confirmation +> Profile URL: https://jump.ecmwf.int:443 + Logged in as: [MAIL ADDRESS] + Cluster: jump.ecmwf.int + Roles: + Logins: [ECMWF USERNAME] + Kubernetes: disabled + Valid until: 2024-10-09 23:18:33 +0200 CEST [valid for 11h58m0s] + Extensions: permit-X11-forwarding, permit-agent-forwarding, permit-port-forwarding, permit-pty +``` + +Look at the [SSH config](#configuration) below and you should be able to connect. + + +### from IMGW + A ECMWF user can connect to the ECS/ATOS using teleport, first load the teleport module and start the ssh-agent: ```shell title="Using teleport" module load teleport ** INFO: Default jumphost now: jump.ecmwf.int -** INFO: Module loaded. SSH Agent required for login, run 'ssh-agentstart', +** INFO: Module loaded. SSH Agent required for login, run 'ssh-agentstart', ** or 'ssh-agentreconnect' ro reconnect to an existing agent. ** run 'ssh-agent -k' to kill the agent. Login run: 'python3 -m teleport.login' and your ECMWF credentials. @@ -35,14 +66,14 @@ ssh-agentstart ssh-add -l ``` -once you have a running ssh-agent, run a browserless login via python +once you have a running ssh-agent, run a browserless login via python ```shell title="Connecting to ECMWF" # Login to the default teleport jump host (shell.ecmwf.int) Reading python3 -m teleport.login tsh status # run ssh agent again -ssh-add -l +ssh-add -l # now there should be two keys!!! 
# Login to ECaccess in Bologna ssh -J [user]@jump.ecmwf.int [user]@ecs-login @@ -52,11 +83,14 @@ ssh -J [user]@jump.ecmwf.int [user]@hpc-login tsh logout ``` + +### Configuration + Environment variables configuration: -- `ECMWF_USERNAME` - The ECMWF Username -- `ECMWF_PASSWORD` - The ECMWF Password -- `TSH_EXEC` - The Teleport binary tsh path +- `ECMWF_USERNAME` - The ECMWF Username +- `ECMWF_PASSWORD` - The ECMWF Password +- `TSH_EXEC` - The Teleport binary tsh path - `TSH_PROXY` - The ECMWF Teleport proxy You can set these variables in your `~/.bashrc` file to avoid typing these at every login. Please do not save your `ECMWF_PASSWORD` like this! @@ -64,22 +98,17 @@ You can set these variables in your `~/.bashrc` file to avoid typing these at ev It is highly advised to add this to your `.ssh/config`, although ECMWF has added some [information](https://confluence.ecmwf.int/display/UDOC/Teleport+SSH+Access+-+Linux+configuration) on that too: ```conf title=".ssh/config" -Host jump.ecmwf.int shell.ecmwf.int - HostKeyAlgorithms +ssh-rsa*,rsa-sha2-512 - PubkeyAcceptedKeyTypes +ssh-rsa* - User [ECMWF USERNAME] -# For ecgate and Cray HPCF -Host ecg* cc* - HostKeyAlgorithms +ssh-rsa*,rsa-sha2-512 - PubkeyAcceptedKeyTypes +ssh-rsa* - User [ECMWF USERNAME] - ProxyJump shell.ecmwf.int -# For Atos HPCF +Host jump.ecmwf.int a?-* a??-* hpc-* hpc2020-* ecs-* + User [ECMWF USERNAME] + IdentityFile ~/.tsh/keys/jump.ecmwf.int/[MAIL ADDRESS] + CertificateFile ~/.tsh/keys/jump.ecmwf.int/[MAIL ADDRESS]/jump.ecmwf.int-cert.pub + HostKeyAlgorithms +ssh-rsa*,rsa-sha2-512 + PubkeyAcceptedKeyTypes +ssh-rsa* + ServerAliveInterval 60 + TCPKeepAlive yes + Host a?-* a??-* hpc-* hpc2020-* ecs-* - HostKeyAlgorithms +ssh-rsa*,rsa-sha2-512 - PubkeyAcceptedKeyTypes +ssh-rsa* - User [ECMWF USERNAME] - ProxyJump jump.ecmwf.int + ProxyJump jump.ecmwf.int ``` @@ -95,7 +124,7 @@ ssh-agentstart ssh-agentreconnect # unsure about agents? 
userservices sshtools -h -# kill all +# kill all agents userservices sshtools -k ``` @@ -128,36 +157,194 @@ using a local installation of ecaccess tools can be used to submit jobs and moni ```bash title="ECAccess module" # load the ecaccess module module load ecaccess-webtoolkit/6.3.1 -# all available tools -ecaccess ecaccess-file-delete -ecaccess-association-delete ecaccess-file-dir -ecaccess-association-get ecaccess-file-get -ecaccess-association-list ecaccess-file-mdelete -ecaccess-association-protocol ecaccess-file-mget -ecaccess-association-put ecaccess-file-mkdir -ecaccess-certificate-create ecaccess-file-modtime -ecaccess-certificate-list ecaccess-file-move -ecaccess-cosinfo ecaccess-file-mput -ecaccess-ectrans-delete ecaccess-file-put -ecaccess-ectrans-list ecaccess-file-rmdir -ecaccess-ectrans-request ecaccess-file-size -ecaccess-ectrans-restart ecaccess-gateway-connected -ecaccess-event-clear ecaccess-gateway-list -ecaccess-event-create ecaccess-gateway-name -ecaccess-event-delete ecaccess-job-delete -ecaccess-event-grant ecaccess-job-get -ecaccess-event-list ecaccess-job-list -ecaccess-event-send ecaccess-job-restart -ecaccess-file-chmod ecaccess-job-submit -ecaccess-file-copy ecaccess-queue-list +# all available tools +ecaccess ecaccess-file-delete +ecaccess-association-delete ecaccess-file-dir +ecaccess-association-get ecaccess-file-get +ecaccess-association-list ecaccess-file-mdelete +ecaccess-association-protocol ecaccess-file-mget +ecaccess-association-put ecaccess-file-mkdir +ecaccess-certificate-create ecaccess-file-modtime +ecaccess-certificate-list ecaccess-file-move +ecaccess-cosinfo ecaccess-file-mput +ecaccess-ectrans-delete ecaccess-file-put +ecaccess-ectrans-list ecaccess-file-rmdir +ecaccess-ectrans-request ecaccess-file-size +ecaccess-ectrans-restart ecaccess-gateway-connected +ecaccess-event-clear ecaccess-gateway-list +ecaccess-event-create ecaccess-gateway-name +ecaccess-event-delete ecaccess-job-delete +ecaccess-event-grant 
ecaccess-job-get +ecaccess-event-list ecaccess-job-list +ecaccess-event-send ecaccess-job-restart +ecaccess-file-chmod ecaccess-job-submit +ecaccess-file-copy ecaccess-queue-list +``` + +```bash title="ECAccess certificate" # First get a valid certificate to get access -ecaccess-certificate-create -# +$ ecaccess-certificate-create +Please enter your user-id: [ECMWF short username] +Your passcode: [OTP Code] +# Check if the certificate is fine +$ ecaccess-certificate-list +chmod 168h Oct 15 14:26 change file mode +deleteFile 168h Oct 15 14:26 delete file +deleteJob 168h Oct 15 14:26 delete a job +getFileList 168h Oct 15 14:26 get file list +getFileSize 168h Oct 15 14:26 get file size +getJobList 168h Oct 15 14:26 job list +getJobResult 168h Oct 15 14:26 job result +getTempFile 168h Oct 15 14:26 create temporary file +getTransferList 168h Oct 15 14:26 get transfer list +mkdir 168h Oct 15 14:26 make directory +moveFile 168h Oct 15 14:26 move file +readFile 168h Oct 15 14:26 read file +rmdir 168h Oct 15 14:26 remove directory +spoolTransfer 168h Oct 15 14:26 ectrans request +submitJob 168h Oct 15 14:26 job submission +writeFile 168h Oct 15 14:26 write file ``` +If you run into trouble, or want to start over for any other reason, remove the file `~/.eccert.crt`; this discards your current certificate. + +```sh title="ECAccess gateway" +# check what server you are connected to +$ ecaccess-gateway-name +boaccess.ecmwf.int +# connected ? +$ ecaccess-gateway-connected +yes + +# the associations are defined with path and username, password to access the server. See below. 
+# now it is time to check associations on that server +$ ecaccess-association-list +aurora aurora.img.univie.ac.at active scratch +# NAME GATEWAY STATUS COMMENT + +# check on a different ecaccess server +$ ecaccess-association-list -gateway ecaccess.img.univie.ac.at +jet jet01.img.univie.ac.at active scratch +# NAME GATEWAY STATUS COMMENT +``` + +```sh title="ECAccess ectrans" +# transfer some files using these associations +# +# ecaccess-ectrans-request -lifeTime 1h -overwrite -onFailure [ASSOCIATION NAME] [SOURCE FILE/DIR] +# lifeTime: how long it will retry to do so +# overwrite: overwrites any files existing +# onFailure: reports back to you. +# there are more options available: ecaccess-ectrans-request --help +$ ecaccess-ectrans-request -lifeTime 1h -overwrite -onFailure aurora [SOURCE FILE/DIR] +# this is an async process. It does not happen right away. +$ ecaccess-ectrans-list +176674485 INIT aurora boaccess.ecmwf.int Oct 09 09:42 +# check again +$ ecaccess-ectrans-list +176674485 COPY aurora boaccess.ecmwf.int Oct 09 09:42 +``` +if you encounter a STOP or ERROR, then you can also check the gateway ([boaccess](https://boaccess.ecmwf.int), [imgaccess](https://ecaccess.wolke.img.univie.ac.at)) to have a look at the message (file transfers). + + + +### How to create ecaccess associations + +There are two ways to create these associations: +1. via the web interface: + - [boaccess](https://boaccess.ecmwf.int) + - [imgaccess](https://ecaccess.wolke.img.univie.ac.at) +2. via the ecaccess-webtoolkit + +After creating **new associations** it takes a while before they actually work (about 10min). + +#### web interface + +Depending on where you want to transfer files to, go to: + + - AURORA > [boaccess](https://boaccess.ecmwf.int) + - JET > [imgaccess](https://ecaccess.wolke.img.univie.ac.at) + +Steps: + +1. Login with ECMWF username and OTP +2. Go to **ECtrans setup** +3. Click **add association** (at bottom) +4. 
Fill in the association + - `name` + - `hostname` (login.img.univie.ac.at or jet01 or jet02) + - `directory` (`/srvfs/scratch/[USERNAME]` or something else) + - `comment` (a hint telling you where the files are dropped) + - `login` (this is your imgw server username) + - `password` (this is your imgw server password) +5. Click on *Create* + +Later you can also change the password for your associations. + +#### toolkit + +You need to have access to an installation of ecaccess-webtoolkit. + +```sh title="Create an association on one gateway server" +# load module to give access to the commands +$ module load ecaccess-webtoolkit/6.3.1 +# create a certificate +$ ecaccess-certificate-create +Please enter your user-id: [ECMWF short username] +Your passcode: [OTP Code] +# create the template +$ ecaccess-association-get -template [ASSOCIATION NAME] new-association +# now you need to edit that file: new-association +``` + +This file serves as a template; only the first part (the main parameters) is important. + +Change: + - `active='yes'` + - `comment='Something that explains where it will send the data to'` + - `directory='/srvfs/scratch/[USERNAME]'` or another directory. + - `hostName='login.img.univie.ac.at'` or `jet01...` or `jet02...` + - `login='[USERNAME]'` + - `protocol='genericSftp'` + +Save the file; you can then add it to the correct gateway. **Remember that JET is only available from the gateway (ecaccess.img.univie.ac.at)**, which is available only from inside the VPN@UNIVIE under [ecaccess.wolke.img.univie.ac.at](https://ecaccess.wolke.img.univie.ac.at). 
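The field changes listed above can also be applied from the shell. The following is only a minimal sketch and not part of the ecaccess-webtoolkit: `set_field` is a hypothetical helper, and it assumes the template stores each parameter on its own line in the form `$key='value';` and that the new values contain none of the characters `|`, `&`, `\` or `'`.

```sh
# hypothetical helper: set one parameter in an association template
# usage: set_field FILE KEY VALUE
set_field() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    # rewrite the line  $KEY='...';  and copy every other line unchanged
    sed "s|^\\\$${key}='[^']*';|\$${key}='${value}';|" "$file" > "$tmp" && mv "$tmp" "$file"
}

# example calls (assumed values, adjust to your account):
#   set_field new-association active yes
#   set_field new-association directory "/srvfs/scratch/$USER"
#   set_field new-association protocol genericSftp
```

Because the helper writes to a temporary file and moves it into place, the template is only replaced when `sed` succeeded; check the result with `cat new-association` before uploading it.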
+ + +```conf title="new-association" +############################################################## +# Main Parameters +############################################################## +$name='[ASSOCIATION NAME]'; +$active='no'; +$comment=''; +$grantedUserList=''; +$directory=''; +$hostName=''; +$login=''; +$protocol=''; +``` +Finally, you can add your newly created association to the gateway: +```sh +# add to IMGW ecaccess server (password: your imgw server password) +$ ecaccess-association-put -password -gateway ecaccess.img.univie.ac.at new-association +New password: +# add to boaccess (password: your imgw server password) +$ ecaccess-association-put -password new-association +New password: +# test the association +$ ecaccess-association-list +aurora login.img.univie.ac.at active scratch +$ ecaccess-association-list -gateway ecaccess.img.univie.ac.at +jet jet01.img.univie.ac.at active scratch +# send a file to both + +``` + + + ## ECaccess Gateway -The department is running a member state ecaccess gateway service. +The department is running a member state ecaccess gateway service. **The purpose of an individual access server is to bridge ECMWF's network with IMGW's network** while keeping both networks protected. For example, the JET cluster can be reached from the department ecaccess server but not from the boaccess server, whereas aurora can be reached from boaccess. There are two gateways: -- GitLab