Blog · May 24, 2024

How to Install and Run a Model with Llama-cpp-Python Locally

Fahd Mirza

This video is a step-by-step tutorial on locally installing the Alphex 118B model with llama-cpp-python. According to its model card, it beats GPT-4 on most benchmarks and performs close to GPT-4o.




Code:

conda create -n alphex python=3.11
conda activate alphex

pip install llama-cpp-python
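By default, pip builds a CPU-only wheel, so `n_gpu_layers=-1` in the script below would have no effect. If you have an NVIDIA GPU, llama-cpp-python can be rebuilt with CUDA support through CMake flags. A sketch, assuming a recent release (older versions used `-DLLAMA_CUBLAS=on` instead) and an installed CUDA toolkit:

```shell
# Rebuild llama-cpp-python with CUDA so model layers can be offloaded
# to the GPU. Assumes the CUDA toolkit is installed and on PATH.
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```

On Apple Silicon, Metal acceleration is enabled by default in recent wheels, so this step is only needed for CUDA machines.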

wget

from llama_cpp import Llama

llm = Llama(
      model_path="./Alphex-118b.Q2_K.gguf",
      n_gpu_layers=-1,  # offload all layers to the GPU (requires a GPU build)
      seed=1337,        # fixed seed for reproducible sampling
      n_ctx=2048,       # context window size in tokens
)
output = llm(
      "Q: What is the capital of Australia? A: ",
      max_tokens=120,
      stop=["Q:", "\n"],  # stop before the model starts a new question
      echo=True           # include the prompt in the returned text
)
print(output['choices'][0])
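The call returns an OpenAI-style completion dict, and because `echo=True` was set, the generated text includes the prompt. A minimal sketch of the result's shape and how to extract just the answer (the field values below are illustrative placeholders, not real model output):

```python
# Illustrative only: a hand-built dict mirroring the completion schema
# llama-cpp-python returns. The "text" value is an assumed example.
output = {
    "object": "text_completion",
    "choices": [
        {
            "text": "Q: What is the capital of Australia? A: Canberra",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
}

# Take the generated text, then strip off the echoed prompt.
text = output["choices"][0]["text"]
answer = text.split("A: ", 1)[1]
print(answer)  # → Canberra
```

If you only want the completion itself, an alternative is to leave `echo` at its default of `False`, in which case `text` contains just the generated tokens.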