Glossary

VRAM (Video RAM)

The dedicated graphics memory on a GPU.

GPU servers with 24–32 GB VRAM

VRAM (Video Random Access Memory) is the dedicated memory on a graphics card. For AI workloads, VRAM size determines how large a model you can load: an LLM's weights, together with the memory needed for its context, must fit in VRAM. An RTX 4090 has 24 GB, an RTX 5090 has 32 GB. When VRAM is short, quantization (e.g. 4-bit), which stores each weight in fewer bits, or a card with more memory helps.
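As a rough illustration of the sizing rule, the sketch below estimates the memory taken by model weights alone as parameter count times bits per weight. The function names, the 13-billion-parameter example model, and the flat `overhead_gb` allowance for context are assumptions made for illustration; real usage varies with the runtime, context length, and quantization format.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_in_vram(num_params: float, bits_per_weight: int,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Rough check: weights plus an assumed flat allowance for context.

    overhead_gb is a hypothetical placeholder, not a measured value.
    """
    return weight_memory_gb(num_params, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    params = 13e9  # hypothetical 13-billion-parameter model
    for bits in (16, 8, 4):
        size = weight_memory_gb(params, bits)
        ok = fits_in_vram(params, bits, vram_gb=24)  # 24 GB, as on an RTX 4090
        print(f"{bits}-bit: {size:.1f} GB of weights, fits in 24 GB: {ok}")
```

Under these assumptions, the 16-bit weights alone exceed 24 GB, while the 4-bit version fits with room to spare, which is why quantization is the usual first step when VRAM is short.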