Blog · July 19, 2023

Run Llama 2 on Petals - Step by Step Local Installation

Fahd Mirza

The following steps let you install and run Llama 2 on Petals in a Jupyter or AWS SageMaker notebook, or on a plain Linux instance.



Prerequisites:

- Request access with your email at Meta's website here.

- Log in with the same email at Hugging Face and submit a request to access the Llama 2 model here.

- Generate a Hugging Face token here.


Then run the following commands in order (make sure to replace the placeholder with your own Hugging Face token):


%pip install petals


import torch

from transformers import AutoTokenizer

from petals import AutoDistributedModelForCausalLM


model_name = "meta-llama/Llama-2-70b-hf"


# Authenticate so the gated Llama 2 weights can be downloaded
!huggingface-cli login --token <Your huggingface Token>


tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, add_bos_token=False)

# Loads only a small part of the model locally; the rest runs on the public Petals swarm
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

model = model.cuda()
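
Once the model is loaded, you can generate text with it. The helper below is a minimal sketch of that step, assuming the `model` and `tokenizer` objects created above; the prompt text and `max_new_tokens` value are illustrative choices, not fixed by Petals.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=50, device="cuda"):
    """Tokenize a prompt, run generation over the Petals swarm, and decode the result."""
    # Move the input ids to the same device as the local part of the model
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_text(model, tokenizer, "A cat sat on")` returns the prompt continued by the model. The first call can take a while, since Petals has to route your request through remote servers hosting the model's layers.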


I hope this helps.
