I am trying to run Llama 2 on an EC2 instance with a T4 GPU and I'm encountering errors. The GPU has about 14 GB of usable memory. What is the best way to deploy Llama 2 on AWS without running into memory constraints and issues?
If you are running a quantized version of Llama 2, it should fit on that EC2 instance. However, if you are trying to run the 16-bit or 32-bit weights, you will run out of memory: even the smallest 7B model needs roughly 14 GB for the weights alone at 16-bit precision, before accounting for the KV cache and activations.
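A quick back-of-the-envelope check makes this concrete. This is a rough weights-only estimate (the helper function is my own, and it ignores KV cache and activation memory):

```python
def weights_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    """Rough memory needed just to hold the model weights, in GB."""
    bytes_per_param = bits_per_param / 8
    return n_params_billion * bytes_per_param  # billions of params * bytes each = GB

# Llama 2 7B at different precisions:
print(weights_memory_gb(7, 32))  # 28.0 GB -> far too big for a 14 GB T4
print(weights_memory_gb(7, 16))  # 14.0 GB -> already fills the whole GPU
print(weights_memory_gb(7, 4))   #  3.5 GB -> 4-bit quantization fits comfortably
```

So a 4-bit quantized 7B model leaves plenty of headroom on a T4, while fp16 weights alone consume everything the GPU has.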