AI on Yale Research Computing

This section documents supported workflows for running large language models and related AI tools on Yale Research Computing resources.

This documentation builds on existing YCRC documentation.

Scope

Covered workflows:

  • Open-source inference using Hugging Face and Ollama
  • Interactive use via OpenWebUI, terminals, and Jupyter Lab
  • GPU monitoring using Jobstats
  • Multi-GPU usage on a single node
  • Yale-managed access to closed-source models via Clarity
  • Recommendations and warnings for AI coding tools
  • Installation instructions for vLLM and Flash Attention
  • Resource recommendations

Future workflows:

  • Retrieval-augmented generation
  • Fine-tuning and training

Use the pages below based on your workflow and questions:

  • Where should my workload run? What are some beginner tips for using LLMs on YCRC systems?
    See Available Resources and Recommendations for guidance on clusters, GPU characteristics, and workflow placement.

  • Are my GPUs actually being used? Did I receive an email from Jobstats about not using a GPU?
    See GPU Monitoring with Jobstats for how to validate GPU utilization and memory usage.

  • I want to use Ollama
    See Ollama for supported usage and multi-GPU considerations.

  • I want structured practice and validation examples
    See Exercises for examples of using Ollama in Jupyter notebooks.

  • I want to use Hugging Face
    See Hugging Face for supported environment setup and inference workflows.

  • I want to use multiple GPUs
    See Multi-GPU Usage in Miniconda Environments for common causes and diagnostics.

  • I need help installing AI/ML Python packages such as Flash Attention or vLLM
    See Common Package Installation Methods for instructions.

  • I need access to closed-source models (Claude, etc.) and need to keep my data secure
    See Clarity API for Yale-managed access and usage guidance.

  • I want to use AI coding tools
    See AI Coding Tools on YCRC Systems for recommendations and data security concerns.


Last update: January 30, 2026