diff --git a/ECMWF/README.md b/ECMWF/README.md
index a14c6adc34a4da323972a5fe634079ede1123157..c2a69374b4879bb9240a1ed087ac77061e3578d8 100644
--- a/ECMWF/README.md
+++ b/ECMWF/README.md
@@ -1,10 +1,9 @@
-
 # European Center for Medium-Range Weather Forecast
+
 {: width="400px"}
 [website](https://www.ecmwf.int) / [service status](https://www.ecmwf.int/en/service-status) / [confluence](https://confluence.ecmwf.int) / [support](https://support.ecmwf.int) / [accounting](https://www.ecmwf.int/user) / [jupyterhub](https://jupyterhub.ecmwf.int)
-
 If you need access, talk to your supervisor to create an account for you. You will get a username and a password as well as an OTP device (hardware or smartphone). Accounts are handled via [www.ecmwf.int](https://www.ecmwf.int/user/login)
 Available Services @ IMGW:
@@ -15,8 +14,7 @@ Available Services @ IMGW:
 ## Connecting to ECMWF Services
 ???+ warning "Teleport supported versions"
- Please note that ECMWF does currently only support teleport version up to 13. [ecmwf information](https://confluence.ecmwf.int/display/UDOC/Teleport+SSH+Access)
-
+Please note that ECMWF currently supports Teleport only up to version 13. [ecmwf information](https://confluence.ecmwf.int/display/UDOC/Teleport+SSH+Access)
 ### from home
@@ -38,7 +36,7 @@ If browser window does not open automatically, open it by clicking on the link:
 > Profile URL: https://jump.ecmwf.int:443
 Logged in as: [MAIL ADDRESS]
 Cluster: jump.ecmwf.int
- Roles:
+ Roles:
 Logins: [ECMWF USERNAME]
 Kubernetes: disabled
 Valid until: 2024-10-09 23:18:33 +0200 CEST [valid for 11h58m0s]
@@ -47,7 +45,6 @@ If browser window does not open automatically, open it by clicking on the link:
 look at the [SSH config](#configuration) below and you should be fine to connect.
- ### from IMGW A ECMWF user can connect to the ECS/ATOS using teleport, first load the teleport module and start the ssh-agent: @@ -85,7 +82,6 @@ ssh -J [user]@jump.ecmwf.int [user]@hpc-login tsh logout ``` - ### Configuration Environment variables configuration: @@ -108,13 +104,13 @@ Host jump.ecmwf.int a?-* a??-* hpc-* hpc2020-* ecs-* PubkeyAcceptedKeyTypes +ssh-rsa* ServerAliveInterval 60 TCPKeepAlive yes - + Host a?-* a??-* hpc-* hpc2020-* ecs-* ProxyJump jump.ecmwf.int ``` - ### SSH-agent + It is required to have an SSH-agent running in order to connect to the ECMWF servers. The teleport module includes a `startagent` function to allow to reconnect to an existing ssh-agent. Do not start too many agents! ```bash title="start ssh-agent" @@ -130,7 +126,6 @@ userservices sshtools -h userservices sshtools -k ``` - ## ECMWF Access Server (ECS) There is an issue with ssh-keys @@ -151,6 +146,23 @@ Sometimes there are also different issues with the connection. You can [search t - Example - [Teleport permission denied](https://gitlab.phaidra.org/imgw/computer-resources/-/issues/61#note_36373) - Example - [Teleport authentication failed](https://gitlab.phaidra.org/imgw/computer-resources/-/issues/58) +## Storage at ECMWF + +There is the information for the new ATOS system: [ecmwf wiki](https://confluence.ecmwf.int/display/UDOC/HPC2020%3A+Filesystems). + +A short summart + +| File System | Suitable for ... 
| Technology | Features | Quota | +| ----------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------ | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| HOME | permanent files, e.g. profile, utilities, sources, libraries, etc. | NFS | It is backed up. | **10GB** | +| PERM | permanent files. no backup. | NFS | *DO NOT USE IN PARALLEL APPLICATIONS. DO NOT USE FOR JOB STANDARD OUTPUT/ERROR* | **500 GB** | +| HPCPERM | permanent files. no backup | Lustre | | 100 GB for users without HPC access. **1 TB** for users with HPC access | +| SCRATCH | all temporary (large) files. Main storage for your jobs and experiments input and output files. | Lustre | Automatic deletion after 30 days of last access. no backup. | **50 TB** for users with HPC access. 2 TB for users without HPC access | +| SCRATCHDIR | Big temporary data for an individual session or job, not as fast as TMPDIR but higher capacity. Files accessible from all cluster. | Lustre | Deleted at the end of session or jobCreated per session/ job as a subdirectory in SCRATCH | part of SCRATCH quota | +| TMPDIR | Fast temporary data for an individual session or job, small files only. Local to every node. | SSD | Deleted at the end of session or job. Created per session/ job | space and limits are shared with LOCALSSD | + +Typically all users from Austria a grouped together into the `at` group and can share data with each other. 
+
 ## Connecting via ECaccess
@@ -212,7 +224,7 @@ if you have troubles or for some other reason, if you remove this file `~/.eccer
 ```sh title="ECAccess gateway"
 # check what server you are connected to
-$ ecaccess-gateway-name
+$ ecaccess-gateway-name
 boaccess.ecmwf.int
 # connected ?
 $ ecaccess-gateway-connected
@@ -246,16 +258,16 @@ $ ecaccess-ectrans-list
 $ ecaccess-ectrans-list
 176674485 COPY aurora boaccess.ecmwf.int Oct 09 09:42
 ```
-if you encounter a STOP or ERROR, then you can also check the gateway ([boaccess](https://boaccess.ecmwf.int), [imgaccess](https://ecaccess.wolke.img.univie.ac.at)) to have a look at the message (file transfers).
-
+if you encounter a STOP or ERROR, then you can also check the gateway ([boaccess](https://boaccess.ecmwf.int), [imgaccess](https://ecaccess.wolke.img.univie.ac.at)) to have a look at the message (file transfers).
 ### How to create ecaccess associations
 There are two ways to create these associations:
+
 1. via the web interface:
- - [boaccess](https://boaccess.ecmwf.int)
- - [imgaccess](https://ecaccess.wolke.img.univie.ac.at)
+ - [boaccess](https://boaccess.ecmwf.int)
+ - [imgaccess](https://ecaccess.wolke.img.univie.ac.at)
 2. via the ecaccess-webtoolkit
 After creating **new associations** it takes a while before they actually work (about 10min).
@@ -264,8 +276,8 @@ After creating **new associations** it takes a while before they actually work (
 Depending on where you want to transfer files to, go to:
- - AURORA > [boaccess](https://boaccess.ecmwf.int)
- - JET > [imgaccess](https://ecaccess.wolke.img.univie.ac.at)
+- AURORA > [boaccess](https://boaccess.ecmwf.int)
+- JET > [imgaccess](https://ecaccess.wolke.img.univie.ac.at)
 Steps:
@@ -273,13 +285,13 @@ Steps:
 2. Go to **ECtrans setup**
 3. Click **add association** (at bottom)
 4. Fill in the association
- `name`
- `hostname` (login.img.univie.ac.at or jet01 or jet02)
- `directory` (`/srvfs/scratch/[USERNAME]` or something else)
- `comment` (giving you a hint where it drops the file sto)
- `login` (this is your imgw server username)
- `password` (this is your imgw server password)
-5. Click on *Create*
+ - `name`
+ - `hostname` (login.img.univie.ac.at or jet01 or jet02)
+ - `directory` (`/srvfs/scratch/[USERNAME]` or something else)
+ - `comment` (giving you a hint where it drops the files to)
+ - `login` (this is your imgw server username)
+ - `password` (this is your imgw server password)
+5. Click on _Create_
 Later you can also change the password for your associations.
@@ -302,15 +314,15 @@ $ ecaccess-association-get -template [ASSOCIATION NAME] new-association
 This file serves as a template; only the first part is important. Change:
- - `active='yes'`
- - `comment='Ssomething that explains where is will send the data to'`
- - `directory='/srvfs/scratch/[USERNAME]'` or another directory.
- - `hostName='login.img.univie.ac.at'` or `jet01...` or `jet02...`
- - `login='[USERNAME]'`
- - `protocol='genericSftp'`
-save the file and then you can add this to the correct gateway. **Remember that JET is only available from the gateway (ecaccess.img.univie.ac.at)**, which is available only from inside the VPN@UNIVIE under [ecaccess.wolke.img.univie.ac.at](https://ecaccess.wolke.img.univie.ac.at).
+- `active='yes'`
+- `comment='Something that explains where it will send the data to'`
+- `directory='/srvfs/scratch/[USERNAME]'` or another directory.
+- `hostName='login.img.univie.ac.at'` or `jet01...` or `jet02...`
+- `login='[USERNAME]'`
+- `protocol='genericSftp'`
+save the file and then you can add this to the correct gateway. **Remember that JET is only available from the gateway (ecaccess.img.univie.ac.at)**, which is available only from inside the VPN@UNIVIE under [ecaccess.wolke.img.univie.ac.at](https://ecaccess.wolke.img.univie.ac.at).
 ```conf title="new-association"
 ##############################################################
@@ -325,14 +337,16 @@ $hostName='';
 $login='';
 $protocol='';
 ```
+
 finally you can add your newly created association to the gateway:
+
 ```sh
 # add to IMGW ecaccess server (password: your imgw server password)
 $ ecaccess-association-put -password -gateway ecaccess.img.univie.ac.at new-association
-New password:
+New password:
 # add to boaccess (password: your imgw server password)
 $ ecaccess-association-put -password new-association
-New password:
+New password:
 # test the association
 $ ecaccess-association-list
 aurora login.img.univie.ac.at active scratch
@@ -342,8 +356,6 @@ jet jet01.img.univie.ac.at active scratch
 ```
-
-
 ## ECaccess Gateway
 The department is running a member state ecaccess gateway service. **The purpose of an individual access server is to bridge ECMWF's network with IMGW's network.** This arrangement protects both networks. For example, you can access the JET cluster from the department ecaccess server but not from the boaccess server; from boaccess you can access aurora.
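
The gateway rule in the README text (JET is reachable only through the department gateway, aurora through boaccess) can be sketched as a small shell helper. This is only an illustration under assumptions: `gateway_for_target` is a hypothetical function name and not part of the ECaccess tools, and the host names are the ones quoted in the README.

```sh
#!/bin/sh
# Hypothetical helper (not part of the ECaccess tools): print the gateway
# that can reach a given transfer target, following the README's mapping.
gateway_for_target() {
    case "$1" in
        aurora|login.img.univie.ac.at)
            # aurora is reachable from the ECMWF-side gateway
            echo "boaccess.ecmwf.int" ;;
        jet|jet01*|jet02*)
            # JET is reachable only from the department gateway
            echo "ecaccess.img.univie.ac.at" ;;
        *)
            echo "unknown target: $1" >&2
            return 1 ;;
    esac
}

# Print the gateway for each known target.
for target in aurora jet; do
    printf '%s -> %s\n' "$target" "$(gateway_for_target "$target")"
done
```

The result could be passed to the `-gateway` option used in the README, for example `ecaccess-association-put -password -gateway "$(gateway_for_target jet)" new-association`.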