Large Transfers with Globus
For large data transfers both within Yale and to external collaborators, we recommend using Globus. Globus is a file transfer service that is efficient and easy to use. It has several advantages:
- Robust and fast transfers of large files and/or large collections of files.
- Files can be transferred between your computer and the clusters.
- Files can be transferred between Yale and other sites.
- Web and command-line interfaces for starting and monitoring transfers.
- Secure sharing of specific files or directories with external collaborators.
Globus transfers data between computers set up as "endpoints". The official YCRC endpoints are listed below. Transfers can be to and from these endpoints or those you have defined for yourself with Globus Connect.
Course Accounts
Globus does not work for course accounts (<course_id>_<netid>).
Please try the other transfer methods listed in our Transfer documentation instead.
Cluster Endpoints
We currently support endpoints for the following clusters.
Cluster | Globus Endpoint |
---|---|
Grace | Yale CRC Grace |
McCleary | Yale CRC McCleary |
Milgram | Yale CRC Milgram |
For Grace and McCleary, these endpoints provide access to all files you normally have access to.
For security reasons, Milgram Globus uses a staging area (/gpfs/milgram/globus/$NETID).
Once uploaded, data should be moved from this staging area to its final location within Milgram.
Files in the staging area are purged after 21 days.
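Because staged files are purged after 21 days, it can help to check which files are getting old before they disappear. The following is a minimal Python sketch, not an official YCRC tool. The function name `files_older_than` and the 14-day threshold are hypothetical, and it assumes that file age is measured by modification time, which may not be the criterion the purge actually uses.

```python
import time
from pathlib import Path


def files_older_than(root, days, now=None):
    """Return files under `root` last modified more than `days` days ago.

    Assumption: age is judged by modification time (mtime); the real purge
    policy may use a different timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

For example, calling `files_older_than("/gpfs/milgram/globus/<netid>", days=14)` with your own NetID in the path would list the staged files that have roughly a week left under that assumption.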
Get Started with Globus
- In a browser, go to app.globus.org.
- Use the pull-down menu to select Yale and click "Continue".
- If you are not already logged into CAS, you will be prompted to log in.
- [First login only] Do not associate with another account yet unless you are familiar with doing this.
- [First login only] Select "non-profit research or educational purposes".
- [First login only] Click "Allow" to authorize the Globus Web App.
- From the file manager interface, enter the name of the endpoint you would like to browse in the collection field (e.g. Yale CRC Grace).
- Click on the right-hand side menu option "Transfer or Sync to...".
- Enter the second endpoint name in the right search box (e.g. another cluster or your personal endpoint).
- Select one or more files you would like to transfer and click the appropriate start button at the bottom.
- To complete a partial transfer, select the "sync" checkbox in the Transfer Setting window on the Globus page. Globus should then resume the transfer where it left off.
Manage Your Endpoints
To manage your endpoints, for example to delete or rename an endpoint or to share it with additional people (be aware that they will be able to access your storage), go to Manage Endpoint on the Globus website.
Set Up an Endpoint on Your Computer
You can set up your own endpoint for transferring data to and from your own computer with Globus Connect Personal.
To transfer or share data between two personal endpoints, you will need to request access to the YCRC's Globus Plus subscription on this page.
Set Up a Microsoft OneDrive Endpoint
- Click on the following link: Globus OneDrive Endpoint
- Log into Globus, if needed.
- The first time you log into the endpoint, you will be asked to grant access to your OneDrive account. Click to allow access and be taken through the approval process.
- After granting approval, you will be able to access the top level of your Yale OneDrive via the Globus Collection "Yale OneDrive".
Set Up a Dropbox Endpoint
- Click on the following link: Globus Dropbox Endpoint
- Log into Globus, if needed.
- The first time you log into the endpoint, you will be asked to grant access to your Dropbox account. Click to allow access and be taken through the approval process.
- After granting approval, you will be able to access the top level of your Dropbox storage via the Globus Collection "Yale Dropbox".
Set Up a Google Drive Endpoint
The Globus connector is configured to only allow data to be uploaded into EliApps (Yale's GSuite for Education) Google Drive accounts. If you don't have an EliApps account, request one as described above.
- Click on the following link: Globus Google Drive Endpoint
- Log into Globus, if needed.
- The first time you log in to the Globus Google Drive endpoint, you will be asked to grant access to your Google Drive. Click to allow access and be taken through the approval process.
- You may see your Yale EliApps account expressed in an uncommon format, such as netid@yale.edu@accounts.google.com. This is normal and expected.
- After granting approval, you will be able to access your Google Drive via the Globus Collection "YCRC Globus Google Drive Collection". The default view is "/My Drive". To see "/Team Drives" and other Google Drive features, use the "up one folder" arrow icon in the File Manager.
Note
There are "rate limits" on how much data and how many files you can transfer in any 24-hour period. If you hit your rate limit, Globus should automatically resume the transfer during the next 24-hour period. You may see an "Endpoint Busy" error during this time.
Google has a 400,000 file limit per Shared Drive, so if you are archiving data to Google Drive, it is better to compress folders that contain lots of small files (e.g. using tar) before transferring.
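As one way to bundle a folder of small files before transfer, the sketch below uses Python's standard `tarfile` module to write a gzip-compressed archive and report how many files it packed. The function name `archive_directory` is hypothetical, and running `tar` on the command line achieves the same result.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir, archive_path):
    """Pack `src_dir` into a gzip-compressed tar file; return the file count."""
    src = Path(src_dir).resolve()
    # Count regular files so the total can be compared against a per-drive limit.
    n_files = sum(1 for p in src.rglob("*") if p.is_file())
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores paths relative to the folder name, not the absolute path.
        tar.add(str(src), arcname=src.name)
    return n_files
```

The resulting single archive counts as one file against the Shared Drive limit, at the cost of having to download and unpack it to reach individual files later.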
In our testing, we have seen upload speeds of up to 10 MB/s and download speeds of up to 100 MB/s.
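To get a rough sense of what these speeds mean in practice, the following sketch estimates transfer time for a hypothetical 500 GB dataset. It assumes the peak rates above are sustained for the whole transfer, uses decimal units (1 GB = 1000 MB), and ignores the 24-hour rate limits, so real transfers will likely take longer.

```python
def estimated_hours(size_gb, rate_mb_per_s):
    """Hours needed to move `size_gb` gigabytes at a sustained rate in MB/s."""
    seconds = size_gb * 1000 / rate_mb_per_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600


# Hypothetical 500 GB dataset at the best-case speeds quoted above.
print(f"upload:   {estimated_hours(500, 10):.1f} h")   # prints "upload:   13.9 h"
print(f"download: {estimated_hours(500, 100):.1f} h")  # prints "download: 1.4 h"
```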
Set Up an S3 Endpoint
We support creating Globus S3 endpoints. To request a Globus S3 Endpoint, please contact YCRC. Please include in your request:
- S3 bucket name
- The Amazon Region for that bucket
- An initial list of Yale NetIDs who should be able to access the bucket
Warning
Please DO NOT send us the Amazon login credentials through an insecure method such as email or our ticketing system.
After we have created your Globus S3 endpoint, you will be able to manage your own access controls through the Globus portal.