January 13, 2024

Talk with Comics Using AI in Any Language

Fahd Mirza

This video shows a step-by-step demo, with code, of how to analyze comics in any language and talk to them using LlamaIndex and ChatGPT.

Code Used:

# The llama_index imports used below follow the pre-0.10 package layout, hence the pin
%pip install "llama_index<0.10" ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client

import os

openai_api_key = os.environ['OPENAI_API_KEY']

from PIL import Image
import matplotlib.pyplot as plt

# Collect the path of every file in the comics folder
image_paths = []
for img_path in os.listdir("./urdu"):
    image_paths.append(os.path.join("./urdu", img_path))


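def plot_images(image_paths):
    """Show up to 9 images from image_paths in a 3x3 grid."""
    images_shown = 0
    plt.figure(figsize=(25, 12))
    for img_path in image_paths:
        if os.path.isfile(img_path):
            image = Image.open(img_path)

            # A 3x3 grid has room for the 9-image limit below;
            # a 2x2 grid would raise an error on the fifth image
            plt.subplot(3, 3, images_shown + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            images_shown += 1
            if images_shown >= 9:
                break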


plot_images(image_paths)
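
The plotting helper above uses a fixed grid, so it only fits a set number of images. As a minimal sketch (the helper name grid_shape is hypothetical and not part of the original code), the rows and columns could instead be derived from how many images there are:

```python
import math


def grid_shape(n_images):
    """Return (rows, cols) for a near-square grid that holds n_images."""
    if n_images < 1:
        raise ValueError("need at least one image")
    # Smallest column count whose square is at least n_images
    cols = math.isqrt(n_images - 1) + 1
    # Enough rows to hold every image at that column count
    rows = math.ceil(n_images / cols)
    return rows, cols


rows, cols = grid_shape(9)
print(rows, cols)  # prints: 3 3
```

The result could then be passed to plt.subplot(rows, cols, images_shown + 1) in place of hard-coded grid values.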


from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index import SimpleDirectoryReader

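# Load every image in the comics folder as a LlamaIndex document
image_documents = SimpleDirectoryReader("./urdu").load_data()

# Multimodal GPT-4 model that accepts images together with a text prompt
openai_mm_llm = OpenAIMultiModal(
    model="gpt-4-vision-preview", api_key=openai_api_key, max_new_tokens=1500
)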

response_eng = openai_mm_llm.complete(
    prompt="Describe the comic strip panels as an alternative text",
    image_documents=image_documents,
)

print(response_eng)
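
To get the description in a particular language, one option is to name the language in the prompt. The following is a minimal sketch under the assumption that the model follows such an instruction; the helper build_alt_text_prompt and its exact wording are hypothetical and not part of the original code:

```python
def build_alt_text_prompt(language="English"):
    """Build the alt-text prompt, asking for the answer in `language`."""
    return (
        "Describe the comic strip panels as an alternative text. "
        f"Write the description in {language}."
    )


prompt_urdu = build_alt_text_prompt("Urdu")
print(prompt_urdu)

# Assumed usage with the objects defined above (needs an API key and network):
# response_urdu = openai_mm_llm.complete(
#     prompt=prompt_urdu,
#     image_documents=image_documents,
# )
```

Keeping the prompt in a small function makes it easy to request the same description in several languages and compare the answers.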
