AI on Yale Research Computing

This section documents supported workflows for running large language models and related AI tools on Yale Research Computing resources.

This documentation builds on existing YCRC documentation, including:

Scope

Covered workflows:

Future workflows:

Use the pages below based on your workflow and questions:

Where should my workflow run? What are some beginner tips for using LLMs on YCRC systems?
See Available Resources and Recommendations for guidance on clusters, GPU characteristics, and workflow placement.
Are my GPUs actually being used? Did I receive an email from Jobstats about not using a GPU?
See GPU Monitoring with Jobstats for how to validate GPU utilization and memory usage.
I want to use Ollama
See Ollama for supported usage and multi-GPU considerations.
I want structured practice and validation examples
See Exercises for examples of using Ollama in Jupyter Notebooks.
I want to use Hugging Face
See Hugging Face for supported environment setup and inference workflows.
I want to use multiple GPUs
See Multi-GPU Usage in Miniconda Environments for common causes and diagnostics.
I need help installing AI/ML Python packages like FlashAttention or vLLM
See Common Package installation Methods for instructions.
I need access to closed-source models (Claude, etc.) while keeping my data secure
See Clarity API for Yale-managed access and usage guidance.
I want to use AI coding tools
See AI Coding Tools on YCRC Systems for recommendations and data security concerns.

Last update: January 30, 2026