How to Build a Calorie Advisor App Using GenAI?

April 2, 2024


Introduction

Artificial Intelligence has many use cases, and some of the best ones are in the health industry, where it can genuinely help people maintain a healthier life. With the rapid growth of generative AI, certain applications can now be built with much less complexity. One very useful application that can be built is a Calorie Advisor App. In this article, we’ll look at exactly that, inspired by taking care of our health. We will build a simple Calorie Advisor App where we can input images of food, and the app will help us calculate the calories of each item present in the food. This project is a part of NutriGen, focusing on health through AI.

Learning Objective

The app we will create in this article is based on basic prompt engineering and image processing techniques.
We will use the Google Gemini Pro Vision API for our use case.
Then, we will create the code’s structure, where we will perform image processing and prompt engineering. Finally, we will work on the user interface using Streamlit.
After that, we will deploy our app to the Hugging Face platform for free.
We will also see some of the problems we will face in the output, where Gemini fails to identify a food item and gives the wrong calorie count for that food. We will also discuss different solutions for this problem.

Pre-Requisites

Let’s start implementing our project, but before that, please ensure you have a basic understanding of generative AI and LLMs. It’s okay if you know very little because, in this article, we will implement things from scratch.

For essential Python prompt engineering, a basic understanding of generative AI and familiarity with Google Gemini is required. Additionally, basic knowledge of Streamlit, GitHub, and Hugging Face libraries is necessary. Familiarity with libraries such as PIL for image preprocessing is also beneficial.

This article was published as a part of the Data Science Blogathon.

Project Pipeline

In this article, we will build an AI assistant that assists nutritionists and individuals in making informed decisions about their food choices and maintaining a healthy lifestyle.

The flow will be like this: input image -> image processing -> prompt engineering -> final function call to get the output for the input image of the food. This is a brief overview of how we will approach this problem statement.
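To make the flow concrete before the real implementation, here is a minimal sketch of the same four stages. The function names prepare_image, build_prompt, and run_pipeline are hypothetical, and fake_model is only a stand-in for the actual model call so the sketch runs without an API key.

```python
def prepare_image(image_bytes, mime_type):
    """Image processing step: wrap raw bytes and MIME type for the model."""
    return [{"mime_type": mime_type, "data": image_bytes}]


def build_prompt():
    """Prompt engineering step: the instructions the model should follow."""
    return (
        "You are an expert nutritionist. List each food item in the image "
        "with its calorie count, then give the total calories."
    )


def run_pipeline(image_bytes, mime_type, call_model):
    """Final step: send the prompt and the image part to the model."""
    image_parts = prepare_image(image_bytes, mime_type)
    return call_model(build_prompt(), image_parts[0])


def fake_model(prompt, image_part):
    """Stand-in for the real model call; it only reports what it received."""
    return f"received {len(image_part['data'])} bytes of {image_part['mime_type']}"


print(run_pipeline(b"\x89PNG", "image/png", fake_model))
```

Passing the model call in as an argument keeps the sketch runnable offline; in the real app this role is played by the Gemini call.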

Overview of Gemini Pro Vision

Gemini Pro is a multimodal LLM built by Google. It was trained to be multimodal from the ground up. It can perform well on various tasks, including image captioning, classification, summarisation, question-answering, etc. One of the interesting facts about it is that it uses the well-known Transformer decoder architecture. It was trained on multiple kinds of data, reducing the complexity of handling multimodal inputs and providing quality outputs.

Step1: Creating the Virtual Environment

Creating a virtual environment is good practice: it isolates our project and its dependencies so that they don’t clash with others, and we can keep different versions of the libraries we need in different virtual environments. So, we will create a virtual environment for the project now. To do this, follow the steps below:

Create an empty folder on the desktop for the project.
Open this folder in VS Code.
Open the terminal.

Write the following commands:

pip install virtualenv
python -m venv genai_project

You can use the following command if you’re getting an execution policy error:

Set-ExecutionPolicy RemoteSigned -Scope Process

Now we need to activate our virtual environment. For that, use the following command:

.\genai_project\Scripts\activate

We’ve successfully created our virtual environment.

Create Virtual Environment in Google Colab

We can also create our virtual environment in Google Colab; here’s the step-by-step procedure to do that:

Create a new Colab notebook.
Use the commands below step by step.

!which python
!python --version
# check whether Python is installed

%env PYTHONPATH=
# set the PYTHONPATH environment variable to an empty value so that Python
# won't search for modules and packages in additional directories. This helps
# avoid conflicts or unintended module loading.

!pip install virtualenv

# create the virtual environment
!virtualenv genai_project

!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# this downloads the Miniconda installer script, which is used to create
# and manage virtual environments in Python

!chmod +x Miniconda3-latest-Linux-x86_64.sh
# this command makes the Miniconda installer script executable inside
# the Colab environment

!./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
# this runs the Miniconda installer script and
# specifies the path where Miniconda should be installed

!conda install -q -y --prefix /usr/local python=3.8 ujson
# this installs Python 3.8 and ujson into the environment

import sys
sys.path.append('/usr/local/lib/python3.8/site-packages/')
# this allows Python to locate and import modules from the environment's directory

import os
os.environ['CONDA_PREFIX'] = '/usr/local/envs/myenv'

# used to activate the Miniconda environment

!python --version
# checks the version of Python inside the activated Miniconda environment

Hence, we have also created our virtual environment in Google Colab. Now, let’s check and see how we can make a basic .py file there.

!source myenv/bin/activate
# activating the virtual environment

!echo "print('Hello, world!')" >> my_script.py
# writing code using echo and saving this code in the my_script.py file

!python my_script.py
# running the my_script.py file

This will print Hello, world! in the output. So, that’s it. That was all about working with virtual environments in Google Colab. Now, let’s proceed with the project.

Step2: Importing Necessary Libraries

import os

import streamlit as st
import google.generativeai as genai
from dotenv import load_dotenv
from PIL import Image

# load the variables defined in the .env file into the environment
load_dotenv()

If you are having trouble importing any of the above libraries, you can always use the command “pip install library_name” to install it.

We’re using the Streamlit library to create the basic user interface. The user will be able to upload an image and get the outputs based on that image.

We use Google Generative AI to access the LLM and analyze the image to get the item-wise calorie count of our food.

The Image module from PIL is used to perform some basic image preprocessing.

Step3: Setting Up the API Key

Create a new .env file in the same directory and store your API key in it. You can get the Google Gemini API key from Google MakerSuite.
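The .env file would hold a single line of the form GOOGLE_API_KEY="your_key_here", matching the variable name the app reads later with os.getenv. As a small sketch, assuming you want the app to fail early with a readable message when the key is missing, a hypothetical helper such as require_api_key could look like this:

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `env` defaults to os.environ; passing a plain dict makes this easy to try out.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


# Example with an explicit mapping instead of the real environment.
print(require_api_key({"GOOGLE_API_KEY": "dummy-key-for-illustration"}))
```

This helper is not part of the original project; it only illustrates checking the variable before handing it to the API client.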

Step4: Response Generator Function

Here, we will create a response generator function. Let’s break it down step by step:

Firstly, we use genai.configure to configure the API key we created on the Google MakerSuite website. Then, we define the function get_gemini_response, which takes 2 input parameters: the input prompt and the image. This is the primary function that will return the output as text.

genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

def get_gemini_response(input_prompt, image):
    # load the multimodal Gemini Pro Vision model
    model = genai.GenerativeModel('gemini-pro-vision')
    # send the prompt together with the first image part
    response = model.generate_content([input_prompt, image[0]])
    # return only the generated text
    return response.text

Here, we’re using the ‘gemini-pro-vision’ model because it’s multimodal. After creating our model with genai.GenerativeModel, we simply pass in our prompt and the image data. Finally, based on the instructions provided in the prompt and the image data we fed in, the model returns text that gives the calorie count of the different food items present in the image.

Step5: Image Preprocessing

This function checks whether the uploaded_file parameter is not None, meaning the user has uploaded a file. If a file has been uploaded, the code reads the file content into bytes using the getvalue() method of the uploaded_file object. This returns the uploaded file’s raw bytes.

The bytes obtained from the uploaded file are stored in a dictionary under the keys “mime_type” and “data.” The “mime_type” key stores the uploaded file’s MIME type, which indicates the type of content (e.g., image/jpeg, image/png). The “data” key stores the uploaded file’s raw bytes.

The image data is then stored in a list named image_parts, which contains a dictionary with the uploaded file’s MIME type and data. If no file was uploaded, the function raises a FileNotFoundError.

def input_image_setup(uploaded_file):
    if uploaded_file is not None:
        # read the file into bytes
        bytes_data = uploaded_file.getvalue()
        image_parts = [
            {
                "mime_type": uploaded_file.type,
                "data": bytes_data
            }
        ]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")
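To see what this function returns without running Streamlit, the sketch below calls a compact version of it with a stand-in object. FakeUpload is a hypothetical class that only mimics the two things the function relies on: a type attribute and a getvalue() method.

```python
class FakeUpload:
    """Stand-in for Streamlit's uploaded file object (illustration only)."""

    def __init__(self, data, mime_type):
        self._data = data
        self.type = mime_type

    def getvalue(self):
        return self._data


def input_image_setup(uploaded_file):
    """Compact version of the article's function, repeated so the sketch runs alone."""
    if uploaded_file is not None:
        return [{"mime_type": uploaded_file.type, "data": uploaded_file.getvalue()}]
    raise FileNotFoundError("No file uploaded")


parts = input_image_setup(FakeUpload(b"fake image bytes", "image/jpeg"))
# Show the MIME type and the number of bytes that would be sent to the model.
print(parts[0]["mime_type"], len(parts[0]["data"]))
```

The result is a one-element list, which is why the response function indexes it with image[0].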

Step6: Creating the UI

So, finally, it’s time to create the user interface for our project. As mentioned before, we will use the Streamlit library to write the code for the front end.

## initialising the streamlit app
st.set_page_config(page_title="Calories Advisor App")
st.header("Calories Advisor App")
uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "jpeg", "png"])
image = ""
if uploaded_file is not None:
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded Image", use_column_width=True)
submit = st.button("Tell me about the total calories")

Initially, we set up the page configuration using set_page_config and give the app a title. Then, we create a header and add a file uploader box where users can upload images. st.image shows the image that the user uploaded in the UI. Finally, there is a submit button; once it is clicked, we get the outputs from our large language model, Gemini Pro Vision.

Step7: Writing the System Prompt

Now is the time to be creative. Here, we will create our input prompt, asking the model to act as an expert nutritionist. It’s not necessary to use the prompt below; you can also provide your own custom prompt. For now, we ask the model to behave in a certain way: based on the input image of the food, it should read the image data and generate output that gives the calorie count of the food items present in the image and a judgment of whether the food is healthy or unhealthy. If the food is unhealthy, we ask it to suggest more nutritious alternatives to the food items in our image. You can customise the prompt further according to your needs and get a good way to keep track of your health.

Sometimes the model might not be able to read the image data properly; we will discuss solutions for this at the end of this article.

input_prompt = """
You are an expert nutritionist where you need to see the food items from the
image and calculate the total calories, also give the details of all
the food items with their respective calorie count in the below format.

        1. Item 1 - no of calories
        2. Item 2 - no of calories
        ----
        ----

Finally you can also mention whether the food is healthy or not and also mention
the percentage split ratio of carbohydrates, fats, fibers, sugar, protein and
other important things required in our diet. If you find that food is not healthy
then you must provide some alternative healthy food items that user can have
in diet.
"""

if submit:
    # convert the uploaded file into the format expected by the model
    image_data = input_image_setup(uploaded_file)
    response = get_gemini_response(input_prompt, image_data)
    st.header("The Response is: ")
    st.write(response)

Finally, we check whether the user has clicked the Submit button. If so, we get the image data from the input_image_setup function we created earlier. Then, we pass our input prompt and this image data to the get_gemini_response function. In other words, we call all the functions we created earlier to get the final output, which is stored in response.
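Because the prompt asks for lines in the form “1. Item 1 - no of calories”, the text response can be post-processed, for example to total the calories yourself. The following is a minimal sketch under the assumption that the model actually follows that format, which is not guaranteed; parse_calorie_lines is a hypothetical helper and the sample text is made up for illustration.

```python
import re

# Matches lines such as "1. Rice - 200 calories" (the format requested in the prompt).
ITEM_LINE = re.compile(
    r"^\s*\d+\.\s*(?P<item>.+?)\s*-\s*(?P<cal>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def parse_calorie_lines(text):
    """Return (item, calories) pairs for every line that follows the format."""
    items = []
    for line in text.splitlines():
        match = ITEM_LINE.match(line)
        if match:
            items.append((match.group("item"), float(match.group("cal"))))
    return items


# Made-up sample text, only to show the parsing; not real model output.
sample = "1. Rice - 200 calories\n2. Dal - 150 calories\nTotal: 350 calories"
items = parse_calorie_lines(sample)
print(items, sum(cal for _, cal in items))
```

Lines that do not match, such as the total or the health commentary, are simply skipped, so the helper degrades quietly when the model deviates from the format.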

Step8: Deploying the App on Hugging Face

Now is the time for deployment. Let’s begin.

I will explain the simplest way to deploy the app that we created. There are two options we can look into if we want to deploy our app: one is Streamlit Share, and the other is Hugging Face. Here, we will use Hugging Face for the deployment; you can try exploring deployment on Streamlit Share if you want. Here’s the reference link for that – Deployment on Streamlit Share

First, let’s quickly create the requirements.txt file we need for the deployment.

Open the terminal and run the command below to create a requirements.txt file.

pip freeze > requirements.txt

This will create a new text file named requirements.txt. All the project dependencies will be listed there. If this causes an error, it’s okay. You can always create a new text file in your working directory and copy and paste the requirements.txt file from the GitHub link I’ll provide next.

Now, make sure that you have these files handy (because that’s what we need for the deployment):

app.py
.env (for the API credentials)
requirements.txt

Take all these files and, if you don’t have one already, create an account on Hugging Face. Then, create a new Space and upload the files there. That’s all. Your app will be deployed automatically this way. You will also be able to see how the deployment is progressing in real time. If some error occurs, you can always figure it out with the simple interface and, of course, the Hugging Face community, which has a lot of content on resolving common bugs during deployment.

After some time, you will be able to see the app working. Woo hoo! We’ve finally created and deployed our calorie predictor app. Congratulations! You can share the working link of the app you just built with your friends and family.

Here’s the working link to the app that we just created – The Calorie Advisor App

Let’s test our app by providing an input image to it:

Before:

After:

Full Project GitHub Link

Here’s the complete GitHub repository link that includes the source code and other helpful information regarding the project.

You can clone the repository and customise it according to your requirements. Try to be more creative and clear in your prompt, as this will give your model more power to generate correct and accurate outputs.

Scope of Improvement

Problems that can occur in the outputs generated by the model, and their solutions:

Sometimes, there could be situations where you will not get the correct output from the model. This may happen because the model was not able to identify the contents of the image correctly. For example, if you give input images of your food and one of the items is pickles, our model might consider it to be something else. This is the primary concern here.

One way to tackle this is through effective prompt engineering techniques, like few-shot prompting, where you feed the model with examples, and it then generates outputs based on those examples and the prompt you provided.
Another solution that can be considered here is creating our own custom data and fine-tuning the model on it. We can create data containing an image of the food item in one column and a description of the food items present in the other column. This will help our model learn the underlying patterns and identify the items in the provided image correctly, which is essential for getting more correct calorie counts for the pictures of the food.
We can take it further by asking the user about his/her nutrition goals and asking the model to generate outputs based on them. (This way, we can tailor the outputs generated by the model and give more user-specific outputs.)
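The first and third ideas above can both be expressed as plain string construction before the prompt is sent to the model. Here is a minimal sketch; build_prompt, the example pair, and the goal text are all hypothetical, and the calorie value is left as a placeholder rather than a real figure.

```python
def build_prompt(base_prompt, examples=(), goal=None):
    """Append few-shot examples and an optional user goal to the base prompt."""
    sections = [base_prompt.strip()]
    if examples:
        shots = "\n".join(
            f"Example meal: {meal}\nExpected answer: {answer}"
            for meal, answer in examples
        )
        sections.append("Follow the style of these examples:\n" + shots)
    if goal:
        sections.append(
            f"The user's nutrition goal is: {goal}. Tailor the advice to it."
        )
    return "\n\n".join(sections)


prompt = build_prompt(
    "You are an expert nutritionist.",
    # Placeholder example; replace <calorie count> with a value you trust.
    examples=[("a bowl of mango pickle", "1. Mango pickle - <calorie count>")],
    goal="weight loss",
)
print(prompt)
```

The resulting string can be passed in place of input_prompt; whether the examples actually improve recognition of tricky items like pickles would need to be checked on real images.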

Conclusion

We’ve delved into the practical application of Generative AI in healthcare, focusing on the creation of the Calorie Advisor App. This project showcases the potential of AI to assist individuals in making informed decisions about their food choices and maintaining a healthy lifestyle. From setting up the environment to implementing image processing and prompt engineering techniques, we’ve covered the essential steps. The app’s deployment on Hugging Face demonstrates its accessibility to a wider audience. Challenges like image recognition inaccuracies were addressed with solutions such as effective prompt engineering. As we conclude, the Calorie Advisor App stands as a testament to the transformative power of Generative AI in promoting well-being.

Key Takeaways

We’ve discussed a lot so far, starting with the project pipeline and then a basic introduction to the large language model Gemini Pro Vision.
Then, we started with the hands-on implementation. We created our virtual environment and an API key from Google MakerSuite.
Then, we did all our coding in the created virtual environment. Further, we discussed how to deploy the app on multiple platforms, such as Hugging Face and Streamlit Share.
Apart from that, we considered the possible problems that can occur, and discussed solutions to those problems.
Hence, it was fun working on this project. Thank you for staying till the end of this article; I hope you got to learn something new.

Frequently Asked Questions

Q1. What is the Google Gemini Pro Vision Model?

A. Google developed Gemini Pro Vision, a renowned LLM known for its multimodal capabilities. It performs tasks like image captioning, generation, and summarization. Users can create an API key on the MakerSuite website to access Gemini Pro Vision.

Q2. How can Generative AI be applied to the Healthcare/Nutrition domain?

A. Generative AI has a lot of potential for solving real-world problems. In the health/nutrition domain, for example, it can help doctors give medicine prescriptions based on symptoms, and it can act as a nutrition advisor from which users get healthy recommendations for their diets.

Q3. How does prompt engineering solve the Generative AI use case?

A. Prompt engineering is an essential skill to master these days. The best place to learn prompt engineering from basic to advanced is here – https://www.promptingguide.ai/

Q4. How to improve the model’s ability to generate more correct outputs?

A. To increase the model’s ability to generate more correct outputs, we can use the following tactics: Effective Prompting, Fine-Tuning, and Retrieval-Augmented Generation (RAG).

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
