Blog · May 24, 2024

How to Install and Run a Model with Llama-cpp-Python Locally

Fahd Mirza

This video is a step-by-step tutorial on locally installing the Alphex 118B model with llama-cpp-python. According to its model card, it beats GPT-4 on most benchmarks and performs close to GPT-4o.




Code:

conda create -n alphex python=3.11
conda activate alphex

pip install llama-cpp-python
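By default, pip builds a CPU-only wheel, so `n_gpu_layers=-1` in the script below would have no effect. If you have an NVIDIA GPU, llama-cpp-python can be rebuilt with CUDA support through CMake flags. A sketch, assuming a recent release (older versions used `-DLLAMA_CUBLAS=on` instead) and an installed CUDA toolkit:

```shell
# Rebuild llama-cpp-python with CUDA so model layers can be offloaded
# to the GPU. Assumes the CUDA toolkit is installed and on PATH.
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```

On Apple Silicon, Metal acceleration is enabled by default in recent wheels, so this step is only needed for CUDA machines.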

wget

from llama_cpp import Llama

llm = Llama(
      model_path="./Alphex-118b.Q2_K.gguf",
      n_gpu_layers=-1,  # offload all layers to the GPU (requires a GPU build)
      seed=1337,        # fixed seed for reproducible sampling
      n_ctx=2048,       # context window size in tokens
)
output = llm(
      "Q: What is the capital of Australia? A: ",
      max_tokens=120,
      stop=["Q:", "\n"],  # stop before the model starts a new question
      echo=True           # include the prompt in the returned text
)
print(output['choices'][0])
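The call returns an OpenAI-style completion dict, and because `echo=True` was set, the generated text includes the prompt. A minimal sketch of the result's shape and how to extract just the answer (the field values below are illustrative placeholders, not real model output):

```python
# Illustrative only: a hand-built dict mirroring the completion schema
# llama-cpp-python returns. The "text" value is an assumed example.
output = {
    "object": "text_completion",
    "choices": [
        {
            "text": "Q: What is the capital of Australia? A: Canberra",
            "index": 0,
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
}

# Take the generated text, then strip off the echoed prompt.
text = output["choices"][0]["text"]
answer = text.split("A: ", 1)[1]
print(answer)  # → Canberra
```

If you only want the completion itself, an alternative is to leave `echo` at its default of `False`, in which case `text` contains just the generated tokens.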