Blog · July 19, 2023

Run Llama 2 on Petals - Step by Step Local Installation

Fahd Mirza

The following steps let you install and run Llama 2 on Petals in a Jupyter or AWS SageMaker notebook, or on a plain Linux instance.



Prerequisites:

- Request access with your email at Meta's website here.

- Log in with the same email at Hugging Face and submit a request to access the Llama 2 model here.

- Generate a Hugging Face token here.


Then run the following commands in order (make sure to replace the placeholder with your own Hugging Face token):


%pip install petals


import torch

from transformers import AutoTokenizer

from petals import AutoDistributedModelForCausalLM


model_name = "meta-llama/Llama-2-70b-hf"


# Authenticate so the gated Llama 2 weights can be downloaded
!huggingface-cli login --token <Your huggingface Token>


tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, add_bos_token=False)

# Loads only a small part of the model locally; the rest runs on the public Petals swarm
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

model = model.cuda()
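
Once the model is loaded, you can generate text with it. The helper below is a minimal sketch of that step, assuming the `model` and `tokenizer` objects created above; the prompt text and `max_new_tokens` value are illustrative choices, not fixed by Petals.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=50, device="cuda"):
    """Tokenize a prompt, run generation over the Petals swarm, and decode the result."""
    # Move the input ids to the same device as the local part of the model
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_text(model, tokenizer, "A cat sat on")` returns the prompt continued by the model. The first call can take a while, since Petals has to route your request through remote servers hosting the model's layers.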


I hope this helps.
