OpenAI has introduced vision access in ChatGPT, allowing users to upload images and receive analysis and answers about them.
The vision access feature is slowly rolling out to ChatGPT Plus subscribers over the next 2 weeks.
Although images cannot be uploaded directly to DALL·E 3, users can keep a DALL·E 3 chat and a separate image-upload chat open side by side to generate ideas and receive advice.
🦁 The layered hairstyle resembles a lion's mane.
👀 The almond-shaped eyes have brown irises.
📷 The image depicts a stylized animated character with a rounded yellow head, white glasses, and a cheerful smile.
The transcript discusses the AI's ability to analyze images and provide accurate descriptions of facial features.
It mentions that the AI is programmed not to identify real people, but can provide general information about their appearance.
The AI is unable to determine a person's actual state or emotion based solely on an image.
With GPT-4 Vision access, ChatGPT can view and describe the visual content users upload, without storing or accessing past images or using facial recognition.
The AI can recognize general visual attributes of people, like someone wearing a blue shirt, but doesn't make subjective judgments or speculate on personal characteristics.
The AI can analyze images and provide accurate information about objects, such as identifying car models, translating text in images, and suggesting upgrades for food items.
🍽️ The AI can analyze a photo of a messy fridge and suggest meal ideas based on the available ingredients.
👩‍🍳 The AI can provide detailed recipes and instructions for the suggested meals, even accommodating dietary preferences and restrictions.
😄 The AI has a sense of humor and can understand and analyze memes, providing insightful commentary.
🤖 GPT-4 Vision has the ability to recognize humor, which is mind-blowing for a machine.
🖼️ The goal is to use ChatGPT and DALL·E 3 together to improve image generation, with a focus on creating a complex image of a school band of cats playing instruments.
🌟 DALL·E 3 shows impressive results with the initial image of the cat band, but there is room for improvement in terms of cat poses and instrument diversity.
🔍 GPT-4 Vision is applied to the image to get feedback and generate variations. The generated images show improvements in cat diversity, instrument selection, and incorporating music notes.
🔍 The video demonstrates the use of GPT-4 Vision Access in ChatGPT.
🖼️ The generated images align well with the given prompts, but there is room for improvement in the details of the instruments and in facial expressions.
🤔 The model's ability to evaluate the images accurately is limited, but the third image stands out the most.