Google introduces PaliGemma 2 vision-language AI models

December 5, 2024

Google has introduced a new family of PaliGemma vision-language models, offering scalable performance, long captioning, and support for specialized tasks.

PaliGemma 2 was announced December 5, nearly seven months after the initial version launched as the first vision-language model in the Gemma family. Building on Gemma 2, PaliGemma 2 models can see, understand, and interact with visual input, according to Google.

PaliGemma 2 makes it easier for developers to add more sophisticated vision-language features to apps, Google said. It also enables more sophisticated captioning abilities, including identifying emotions and actions in images. Scalable performance in PaliGemma 2 means that performance can be optimized for any task through multiple model sizes (3B, 10B, and 28B parameters) and resolutions (224px, 448px, and 896px). Long captioning in PaliGemma 2 generates detailed, contextually relevant captions for images, going beyond simple object identification to describe actions, emotions, and the overall narrative of the scene, Google said.
