UploadingFiles:Frontenac

Uploading / Downloading Files

Using a Secure File Transfer (sftp) client

We recommend using FileZilla to transfer files to and from the cluster. The FileZilla Client can be downloaded from the FileZilla project website. Once you have installed and opened it, use the following instructions to connect.

In the "Quickconnect" bar at the top of FileZilla, enter the following information:

  • Host: sftp://login.cac.queensu.ca
  • User: (your username)
  • Password: (your password, don't use temporary passwords)
  • Port: (leave blank, in case of problems use "22")
  • Hit "Quickconnect" to connect

Once connected, you should see your files on the cluster on the right-hand side, and the files on your computer on the left. To transfer files between your computer and the cluster, drag and drop the files from one side to the other (or to and from your desktop).

Using Globus through a command-line interface

Globus provides a means to transfer large amounts of data in batch mode, i.e. without "standing by" while the transfer is ongoing. Since this requires setting up an individual "endpoint", we don't recommend this method if only small amounts of data need to be transferred. However, if you are planning to move large amounts of data (in the TB range), Globus is a reliable and convenient method.

If you decide to go this route, follow the steps below.

Installing Globus Command-Line Interface (CLI)

We recommend doing the following installs in a separate directory.

$ mkdir globus
$ cd globus

The Globus CLI needs to be installed individually by each user. This is very simple using the Python "pip" tool:

$ module load python
$ pip install --upgrade --user globus-cli
Collecting globus-cli [...response from pip installer...]

In addition, the "Globus Connect Personal CLI" needs to be installed. After unpacking it, we add its directory to the path.

$ wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
[...download response from wget...]
2018-11-12 09:57:00 (24.3 MB/s) - ‘globusconnectpersonal-latest.tgz’ saved [14501379/14501379]
$ tar xzf globusconnectpersonal-latest.tgz
$ cd globusconnectpersonal-2.3.6/
$ export PATH=`pwd`:$PATH
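
The export above only lasts for the current shell session and adds another copy of the directory each time it is repeated. As a minimal sketch, the helper below adds a directory to the path only if it is not there yet. The function name "path_prepend" and the directory used are assumptions for illustration; use the directory that tar actually created on your system.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Stand-in location; replace with the directory that tar actually created.
gcp_dir="$HOME/globus/globusconnectpersonal-2.3.6"
path_prepend "$gcp_dir"
path_prepend "$gcp_dir"   # a second call leaves PATH unchanged
```

Placing the function and the call in a shell startup file would make the setting available in later sessions as well.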

Login to Globus

Once the CLI is installed, it can be used to log in to your Globus account. You need a Globus ID, which you can create yourself or (more likely) obtain through Compute Canada. Authentication is done through a browser. The globus login command will provide a link to a Globus page, which you copy and paste into a browser. On that page you will be asked to provide your Globus ID and authorize some access. Eventually you will be given an authorization code, which you can copy and paste back into the login session:

hpc1005@caclogin03$ globus login --no-local-server
Please authenticate with Globus here:
------------------------------------
https://auth.globus.org/v2/oauth2/authorize?[...etc...]
------------------------------------

Enter the resulting Authorization Code here: qLdfgbsbhdfugisbsusidfgsdbu

You have successfully logged in to the Globus CLI!

You can check your primary identity with
  globus whoami

For information on which of your identities are in session use
  globus session show

Logout of the Globus CLI with
  globus logout

"globus --help" provides a list of available commands that are used from the Globus CLI to initiate transfer sessions etc.
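
If the later steps are collected in a script, it can be useful to check first that the CLI is installed and a login session exists. The following is a sketch only. The function name "globus_logged_in" is made up for illustration, and we assume that "globus whoami" returns a non-zero exit status when you are not logged in.

```shell
# Return 0 if the globus command is on PATH and reports a logged-in identity.
# Assumption: "globus whoami" exits with a non-zero status when not logged in.
globus_logged_in() {
    command -v globus >/dev/null 2>&1 || return 1
    globus whoami >/dev/null 2>&1
}

if globus_logged_in; then
    echo "Globus CLI is installed and logged in"
else
    echo "Run 'globus login --no-local-server' first"
fi
```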

Creating, connecting, and verifying a personal endpoint

Globus works on the basis of "endpoints" between which any file transfer takes place. We need to create such an endpoint, then connect and verify it. First the creation. Make sure you are logged into Globus when you do this:

$ globus endpoint create --personal test-endpoint
Message:     Endpoint created successfully
Endpoint ID: cb8eed54-e72e-1e28-8aca-0a1edd5c824a
Setup Key:   b2224504-e78d-4a87-b8e5-679164e0877f

The Endpoint ID is used to initiate any transfer from the present system. The setup key is needed to connect the endpoint and verify it using the "globusconnectpersonal" command (make sure the directories for both the Globus CLI and the Globus Connect Personal CLI are in the path):

hpc1005@caclogin03$ globusconnectpersonal -setup b2224504-e78d-4a87-b8e5-679164e0877f
Configuration directory: $HOME .globusonline/lta
Contacting relay.globusonline.org:2223
Done!
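
When these steps are scripted, the Endpoint ID and the Setup Key can be picked out of the text printed by the create command instead of being copied by hand. The sketch below works on sample text in the format shown above. The helper name "field_value" is an assumption for illustration, and the parsing relies on the "Label: value" layout of that output.

```shell
# Sample text in the format shown above; in practice it could be captured with
#   create_output=$(globus endpoint create --personal test-endpoint)
create_output='Message:     Endpoint created successfully
Endpoint ID: cb8eed54-e72e-1e28-8aca-0a1edd5c824a
Setup Key:   b2224504-e78d-4a87-b8e5-679164e0877f'

# Hypothetical helper: print the value that follows a given label.
field_value() {
    awk -F': *' -v label="$1" '$1 == label {print $2}'
}

endpoint_id=$(printf '%s\n' "$create_output" | field_value "Endpoint ID")
setup_key=$(printf '%s\n' "$create_output" | field_value "Setup Key")

echo "endpoint: $endpoint_id"
echo "key:      $setup_key"
```

The two variables can then be passed on, for example as "globusconnectpersonal -setup $setup_key".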

At this point, your new endpoint should appear in the list of endpoints that you can generate with the "globus endpoint search" command:

$ globus endpoint search --filter-scope my-endpoints
ID                                   | Owner                     | Display Name
------------------------------------ | ------------------------- | -------------------
6345e4d2-5aab-1ab8-9565-0426a3d44368 | hschmide@computecanada.ca | Hartmut's PC at CAC
cb8eed54-e72e-1e28-8aca-0a1edd5c824a | hschmide@computecanada.ca | test-endpoint

The second line is the endpoint we just created.

Find and verify the remote endpoint

An endpoint search can be used to find the system you want to transfer to (or from). We use the Compute Canada system "Cedar" as an example:

hpc1005@caclogin03$ globus endpoint search cedar
ID                                   | Owner                      | Display Name
------------------------------------ | -------------------------- | -----------------------------------
c99fd40c-5545-11e7-beb6-22000b9a448b | computecanada@globusid.org | computecanada#cedar-dtn
a962d108-7b4b-11e8-9446-0a6d4e044368 | computecanada@globusid.org | computecanada#cedar-mial
[...more lines...]

The first line (the one with -dtn) is a data transfer node, so that is the one we want. To be allowed to transfer to that node, you need to authenticate to it.

hpc1005@caclogin03$ globus endpoint activate --no-browser --web c99fd40c-5545-11e7-beb6-22000b9a448b
Autoactivation succeeded with message: Endpoint activated successfully using cached credential

In this case, the credentials are already available to Globus because of earlier usage. If you are doing this for the first time, you will be given a "Web activation url" that you can copy and paste into a browser to authenticate. If the endpoint has already been activated and is still usable, you will be shown the expiry date of the activation.

Starting Globus Connect

Finally, "globusconnectpersonal" can be started in the background. Again, be sure to have the executable in your path.

$ nohup globusconnectpersonal -start &
[1] 116748

The shell prints a job number in brackets, followed by the process ID. It is a good idea to note the process ID down.
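
Instead of noting the process ID by hand, it can be written to a file when the process is started. This is a minimal sketch: the helper name "start_and_record" and the file location are assumptions for illustration, and "sleep 30" stands in for the real command so that the example runs anywhere.

```shell
# Hypothetical helper: start a command in the background with nohup and
# save its process ID in a file so it can be found again later.
start_and_record() {
    pid_file=$1
    shift
    nohup "$@" >"$pid_file.log" 2>&1 </dev/null &
    echo "$!" >"$pid_file"
}

# "sleep 30" is a stand-in so the sketch runs anywhere; on the cluster the
# arguments would be: globusconnectpersonal -start
demo_pid_file="${TMPDIR:-/tmp}/gcp-demo.$$.pid"
start_and_record "$demo_pid_file" sleep 30
echo "background process ID: $(cat "$demo_pid_file")"
```

The saved number can be used later to look the process up again, for example with "ps -p".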

Initiating a file transfer

File transfer itself is now done with the globus transfer command:

$ globus transfer --encrypt cb8eed54-e72e-1e28-8aca-0a1edd5c824a:wfn.tar c99fd40c-5545-11e7-beb6-22000b9a448b:wfn.tar
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: 0d2a128c-e695-11e8-8c9a-0a1d4c5c824a

The first argument is the endpoint ID and file name of the source; the second argument gives the same for the target of the transfer. The progress of the transfer can be monitored from the Globus portal.
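
Because the endpoint IDs are long, it can help to keep them in variables and build the two "ENDPOINT:PATH" arguments from them. The sketch below uses the IDs from the example above and only prints the resulting command. The variable names and the helper "endpoint_path" are assumptions for illustration.

```shell
# Endpoint IDs from the example above: our personal endpoint and cedar-dtn.
src_endpoint="cb8eed54-e72e-1e28-8aca-0a1edd5c824a"
dst_endpoint="c99fd40c-5545-11e7-beb6-22000b9a448b"

# Hypothetical helper: join an endpoint ID and a path as ENDPOINT:PATH.
endpoint_path() {
    printf '%s:%s' "$1" "$2"
}

src=$(endpoint_path "$src_endpoint" "wfn.tar")
dst=$(endpoint_path "$dst_endpoint" "wfn.tar")

# Print the command for inspection; remove "echo" to submit the transfer.
echo globus transfer --encrypt "$src" "$dst"
```

Besides the portal, the CLI commands "globus task show" and "globus task wait", which take the Task ID, can be used to check on a submitted transfer.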

Shutting down a globus process

The "globusconnectpersonal" process that was started in the background before the transfer can be shut down by bringing it into the foreground and stopping it with Ctrl-C:

$ fg
nohup globusconnectpersonal -start
^C
$

The Globus portals & Help

Compute Canada operates a Globus Portal, which can be used to create a Globus account (if you don't already have one) or to initiate file transfers using a GUI. For the latter to work, the personal endpoint has to be set up as described above, and the "globusconnectpersonal" process has to be running.

Alternatively, Globus offers a similar portal that can be accessed with the same credentials.

Extensive documentation about Globus is available at https://docs.computecanada.ca/wiki/Globus.

If you need assistance with using Globus on our systems, please send an email to cac.help@queensu.ca; we can guide you through the process.