Jeffrey P.

Commented on Deploying Llama2 on AWS EC2: Solutions for GPU Mem... · Posted in Arize News

If you are running the quantized version of Llama 2, it should work on an EC2 instance. However, if you are trying to run the 16-bit or 32-bit versions, you will run out of GPU memory.
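
For a rough sense of why, here is a back-of-the-envelope sketch that estimates the memory needed for the model weights alone at different precisions. The function names and the 24 GB GPU size are my own assumptions for illustration (pick the number that matches your instance), and the estimate ignores the KV cache, activations, and framework overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = n_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits(n_params_billion: float, bits_per_param: int, gpu_memory_gb: float) -> bool:
    """True if the weights alone fit in the given GPU memory (no headroom counted)."""
    return weight_memory_gb(n_params_billion, bits_per_param) <= gpu_memory_gb


if __name__ == "__main__":
    GPU_MEMORY_GB = 24  # assumption: a single 24 GB GPU; change to match your instance
    for size in (7, 13, 70):  # Llama 2 parameter counts, in billions
        for bits in (4, 16, 32):  # 4-bit quantized, half precision, full precision
            need = weight_memory_gb(size, bits)
            verdict = "fits" if fits(size, bits, GPU_MEMORY_GB) else "does not fit"
            print(f"{size}B @ {bits}-bit: ~{need:.1f} GB of weights, {verdict}")
```

The takeaway is that each parameter costs 2 bytes at 16-bit and 4 bytes at 32-bit, versus roughly half a byte at 4-bit, so quantization cuts the weight footprint by 4x to 8x.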