Blog, January 13, 2024

Talk with Comics Using AI in Any Language

Fahd Mirza

This video shows a step-by-step demo, with code, of how to analyze comics in any language and talk to them using LlamaIndex and ChatGPT.




Code Used:

%pip install llama_index ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client

import os

# Read the OpenAI API key from the environment; raises KeyError if it is not set.
openai_api_key = os.environ['OPENAI_API_KEY']

from PIL import Image
import matplotlib.pyplot as plt

# Collect the path of every entry in the ./urdu folder that holds the comic images.
image_paths = []
for img_path in os.listdir("./urdu"):
    image_paths.append(os.path.join("./urdu", img_path))


def plot_images(image_paths):
    # Show at most 9 images on a 3x3 grid (a smaller grid would raise a
    # ValueError once the subplot index exceeded the number of cells).
    images_shown = 0
    plt.figure(figsize=(25, 12))
    for img_path in image_paths:
        if os.path.isfile(img_path):
            image = Image.open(img_path)

            plt.subplot(3, 3, images_shown + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            images_shown += 1
            if images_shown >= 9:
                break


plot_images(image_paths)
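
The plotting helper above uses a fixed grid. As a minimal sketch, assuming you would rather size the grid from the number of comic pages, a small helper could compute the rows and columns first. The name grid_shape and the max_cols parameter are hypothetical and not part of the original code.

```python
import math


def grid_shape(n_images, max_cols=3):
    """Return (rows, cols) large enough to hold n_images subplots."""
    if n_images <= 0:
        return (0, 0)
    # Never use more columns than there are images.
    cols = min(n_images, max_cols)
    # Round up so a partially filled last row still gets a row.
    rows = math.ceil(n_images / cols)
    return (rows, cols)


rows, cols = grid_shape(4)
print(rows, cols)
```

The result could then be passed to plt.subplot(rows, cols, index) in place of hard-coded grid dimensions.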


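# These import paths match the llama_index package layout from before the 0.10 release.
from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index import SimpleDirectoryReader

# Load every image in ./urdu as an image document.
image_documents = SimpleDirectoryReader("./urdu").load_data()

# Multi-modal LLM wrapper around the GPT-4 vision model.
openai_mm_llm = OpenAIMultiModal(
    model="gpt-4-vision-preview", api_key=openai_api_key, max_new_tokens=1500
)

# Ask the model for an alt-text style description of the comic panels.
response_eng = openai_mm_llm.complete(
    prompt="Describe the comic strip panels as an alternative text",
    image_documents=image_documents,
)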

print(response_eng)
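
The call above asks for a description without naming an output language. A minimal sketch of asking for the description in a specific language, assuming the model follows a plain instruction appended to the prompt, might look like the following. The helper build_alt_text_prompt is hypothetical and not part of the original code, and the commented-out call reuses openai_mm_llm and image_documents from the code above.

```python
def build_alt_text_prompt(language="English"):
    """Build the describe-the-comic prompt, asking for a reply in `language`."""
    # Fall back to English if an empty or whitespace-only name is passed.
    language = language.strip() or "English"
    return (
        "Describe the comic strip panels as an alternative text. "
        f"Write the description in {language}."
    )


prompt_urdu = build_alt_text_prompt("Urdu")
print(prompt_urdu)

# Hypothetical usage with the objects defined earlier in this post:
# response_urdu = openai_mm_llm.complete(
#     prompt=prompt_urdu,
#     image_documents=image_documents,
# )
# print(response_urdu)
```

Keeping the prompt construction in its own function makes it easy to try the same comic with several target languages without touching the model setup.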

