Visualize Image
Answer a prompt based on the content of an image.
Description
This endpoint returns a response based on the content of an image and a prompt. The prompt can be a question, a statement, or any other text about the image. The API analyzes the content of the image and generates a response to the prompt using a pre-trained model. Three models are currently available for this endpoint:
- Uform-Gen: UForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering.
- Llava: LLaVA is a large multimodal model that can generate text based on images and text prompts.
- Gemini: Gemini is a multimodal model with advanced capabilities for understanding and generating text based on images.
Authorizations
API Key for authentication
Path Parameters
The unique identifier of the image. This identifier is used to reference the image in subsequent requests.
Body
The prompt to answer based on the content of the image. This is a natural language question or instruction that the model will respond to.
Length: 1 - 1000 characters
The model to use for the visualization. Supported models are uform-gen, llava, and gemini. If not provided, the default model is used.
Allowed values: uform-gen, llava, gemini
Response
The API will return the Image object in the response body.
Response object for the visualize endpoint.
The response from the AI model. This is the description of the image based on the prompt provided.
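Example
The request described above can be sketched in Python as follows. This is a minimal illustration, not the official client: the host, the URL path, the Bearer-style Authorization header, and the JSON field names "prompt" and "model" are all assumptions, since this page does not state them. Only the prompt length limit and the list of model names come from the reference above.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: this page does not give the real host or path.
BASE_URL = "https://api.example.com"

# Taken from this reference: the supported models and the prompt length limit.
SUPPORTED_MODELS = {"uform-gen", "llava", "gemini"}
MAX_PROMPT_LENGTH = 1000


def build_visualize_request(image_id, prompt, api_key, model=None):
    """Build (but do not send) a POST request for the visualize endpoint."""
    if not 1 <= len(prompt) <= MAX_PROMPT_LENGTH:
        raise ValueError("prompt must be between 1 and 1000 characters long")
    if model is not None and model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")

    # Assumed field names; omit the model so the API falls back to its default.
    body = {"prompt": prompt}
    if model is not None:
        body["model"] = model

    # Assumed path layout: the image identifier is a path parameter.
    quoted_id = urllib.parse.quote(image_id, safe="")
    url = f"{BASE_URL}/images/{quoted_id}/visualize"

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the page only says an API key is required.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Once the real host, path, and header name are filled in, the request could be sent with urllib.request.urlopen(request) and the response body parsed with json.load. Validating the prompt length and model name on the client side avoids a round trip for requests the API would reject anyway.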

