All About AI-powered Jupyter notebooks with JupyterAI

May 22, 2024


Introduction

Generative AI has been at the forefront of recent developments in artificial intelligence. It has become a part of every major sector, from tech and healthcare to finance and entertainment, and continues to transform the way we work. It has enabled us to create high-quality content and perform complex tasks in minutes.

Now, imagine a world where you can use simple text prompts to harness the power of generative AI, allowing you to write high-quality code or analyze complex data directly from a Jupyter Notebook. Welcome to Jupyter AI, which seamlessly integrates cutting-edge generative AI models into your notebooks, allowing you to perform all these complex tasks effortlessly while increasing productivity and efficiency.

Learning Objectives

By the end of this article, you will have a clear understanding of

The differences between traditional Jupyter notebooks and Jupyter AI
How to effectively use Jupyter AI to perform complex tasks and boost productivity
Using text prompts to generate code, visualize data, and automate manual tasks in Jupyter AI
Data and privacy concerns when using Jupyter AI
Limitations and drawbacks of using Jupyter AI

This article was published as a part of the Data Science Blogathon.

What is Jupyter AI?

Unlike traditional Jupyter notebooks, which require the user to perform all tasks manually, Jupyter AI can easily automate tedious and repetitive tasks. It allows users to write high-quality code and analyze data more effectively than ever by using simple text prompts. It has access to several large language model providers, including OpenAI, Google, Anthropic, and Cohere. The interface is simple, user-friendly, and accessible directly from a Jupyter Notebook.

In this article, I’ll walk you through the entire process of using Jupyter AI to become a more productive data scientist and boost your efficiency. Jupyter AI can be used in two different ways. The first method is to interact with an AI chatbot through JupyterLab, and the second is to run the magic commands from the `jupyter_ai_magics` package in a Jupyter notebook. We will be looking at both of these options in this article. So, let’s get started right away.

Generate API Keys

To use Jupyter AI with a specific model provider, we first need to provide the API keys so that the model provider can serve our requests. There are options for open-source models that won’t require an API key. However, you need to install all the configuration files on your system to run them, which may require additional storage space. Furthermore, in this case, the inference would run on your CPU, which would be much slower and take a long time to generate responses to even a single prompt. Unless you are dealing with highly confidential data, I recommend using cloud providers because they’re beginner-friendly and handle all the complex stuff.

I will be using TogetherAI and Google Gemini for this tutorial. TogetherAI provides seamless integration with several major LLMs and fast inference. Also, signing up with a new account gives you $25 in free credits. These are enough to run 110 million tokens on the Llama-2 13B model. To put this in perspective, 1 million tokens are roughly equal to 700,000 words. In comparison, the massive Lord of the Rings trilogy has a combined word count of roughly 500,000 only. This means you would need more than 150 of these trilogies to use up all the tokens. The free credits will be more than sufficient for any use case.
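If you want to double-check that estimate, here is a quick back-of-the-envelope sketch. It only uses the rough figures quoted above (110 million tokens, about 700,000 words per million tokens, about 500,000 words per trilogy); the function name is my own and purely illustrative.

```python
# Rough figures quoted in the paragraph above (approximations, not exact values)
FREE_TOKENS = 110_000_000            # tokens covered by the free credits
WORDS_PER_MILLION_TOKENS = 700_000   # approximate tokens-to-words ratio
TRILOGY_WORDS = 500_000              # approximate word count of the trilogy


def trilogies_covered(tokens: int) -> int:
    """Return how many trilogy-length texts fit into `tokens` tokens."""
    words = tokens * WORDS_PER_MILLION_TOKENS // 1_000_000
    return words // TRILOGY_WORDS


print(trilogies_covered(FREE_TOKENS))  # 154
```

That works out to about 77 million words, which is where the "more than 150" figure comes from.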

If you use a different model provider and already have an API key, feel free to skip this section.

TogetherAI API key

To generate a TogetherAI API key, follow the steps below:

Create an account on the together.ai platform
Sign in to your account
Go to together.ai to see your API keys

Google API key

You need to create an API key to use the Google Gemini model. The steps are:

Go to Google Dev
Select the “Get API key in Google AI Studio” option
Sign in with your Google account
In Google AI Studio, click on Get API key and generate your API key

Cohere API key

To tailor the model’s responses to our local data, we also need access to an embedding model. I will be using Cohere’s text embeddings for this. Follow the steps below to generate a Cohere API key:

Go to Cohere API
Create your Cohere account
Go to Trial keys and create your API key

Install necessary dependencies

Jupyter AI is compatible with any system that supports Python versions 3.8 to 3.11, including Windows, macOS, and Linux machines. You will also need a conda distribution to install the necessary packages. If you don’t already have a conda distribution installed on your computer, you must first install one. I prefer Anaconda, but other conda distributions are also viable options.

Create virtual environment

The next step is to create a virtual environment for our project. Before starting any project, you should create a virtual environment to avoid piling up packages in the default Python environment and potential conflicts with other packages. Copy the command below into your shell to create an isolated environment with Python 3.11.

$ conda create -n jupyter-ai-env python=3.11

This will create a new conda environment called `jupyter-ai-env` and install Python version 3.11 in this environment. Next, activate this environment using the command

$ conda activate jupyter-ai-env

Install JupyterLab and Jupyter AI

Next, install JupyterLab and Jupyter AI with the `conda install` command

$ conda install -c conda-forge jupyter-ai

This will install JupyterLab and Jupyter AI, along with all the other necessary dependencies, into your environment.

To use some of the model providers, such as OpenAI, Google, Anthropic, and NVIDIA, you must first install their required langchain dependencies. We also need to install two additional packages: `pypdf` for PDF support and `cohere` for the embedding model. To install these, run

$ pip install langchain-google-genai
$ pip install langchain-openai
$ pip install langchain-anthropic
$ pip install langchain_nvidia_ai_endpoints
$ pip install pypdf cohere

$ jupyter lab

You don’t need to install all of them. Simply select the ones you require based on your needs and the availability of API keys. Then start an instance of JupyterLab using `jupyter lab`.
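Since only some of these packages are needed, it can help to check which ones are actually present in the active environment. The snippet below is a minimal sketch of that check. The import names are my assumption about how the pip packages above are exposed as Python modules, and `installed_modules` is just an illustrative helper.

```python
from importlib.util import find_spec

# Assumed top-level import names for the pip packages listed above
OPTIONAL_MODULES = [
    "langchain_google_genai",
    "langchain_openai",
    "langchain_anthropic",
    "langchain_nvidia_ai_endpoints",
    "pypdf",
    "cohere",
]


def installed_modules(names):
    """Map each top-level module name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


for name, ok in installed_modules(OPTIONAL_MODULES).items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Run it inside the `jupyter-ai-env` environment so that it inspects the same packages JupyterLab will see.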

Jupyter AI in JupyterLab

On startup, the JupyterLab interface will look like this:

Chat Interface

On the left side is Jupyternaut, the chatbot with which we will interact. In addition to the basic chat functionality, it offers a variety of other features. It can also learn about our local data and then provide tailored responses to our prompts. As we will see in the later sections of this tutorial, it can even generate a complete Jupyter notebook from just a single text prompt. You can select the models by clicking on the settings icon at the top right of the Jupyternaut interface.

Language Model vs Embedding Model

There are two types of models here: the language model and the embedding model. Let’s understand the difference between the two. The language model is the one that powers the chat UI, which we will use to chat and generate responses to prompts. The embedding model, on the other hand, generates vector embeddings of our local data files and stores them in a vector database. This allows the language model to retrieve relevant information when asked specific questions about the data. Using Retrieval-Augmented Generation (RAG), the model can extract relevant information from the vector database and combine it with its existing knowledge to answer questions about a specific topic in a detailed and accurate manner.
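To make the retrieval step more concrete, here is a toy sketch of the idea. It is not how Jupyter AI or Cohere implement it: the "embedding" is a simple bag-of-words count standing in for a real embedding model, the "vector database" is a plain Python list, and all function names are illustrative.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, store: list) -> str:
    """Return the stored chunk whose vector is most similar to the question."""
    q = embed(question)
    return max(store, key=lambda item: cosine(q, item[1]))[0]


chunks = [
    "sleep paralysis occurs during transitions between sleep and wakefulness",
    "conda creates isolated python environments",
]
store = [(chunk, embed(chunk)) for chunk in chunks]  # the toy "vector database"

question = "what is sleep paralysis"
context = retrieve(question, store)

# The retrieved chunk is combined with the question before it goes to the language model
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

A real setup swaps `embed` for calls to an embedding model and the list for a proper vector store, but the flow stays the same: embed, find the nearest chunk, and add it to the prompt.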

Jupyter AI supports a wide range of model providers and their models. You can see the list of all the model providers in the dropdown.

Select your preferred model from the dropdown, enter your API keys into the appropriate boxes, and save the changes.

Simple Task

Let’s chat with our AI assistant and test its knowledge with a simple question.

It pretty much nailed it. Along with the definitions, it correctly provides the example of image classification for supervised learning and of clustering for customer segmentation, which falls under the unsupervised learning category.

Code Generation

Now, let us see how it performs on a coding problem.

The code above looks efficient and logically correct. Let us ask some follow-up questions to see if it knows what it is talking about.

It sure knows its concepts well. To test it further, we can add a notebook to the right-side panel and have it optimize our code for us.

Code Optimization

To do this, you can highlight a section of your notebook and include it with your prompt. Select the include selection option with your prompt to make the code visible to the chatbot. You can now ask questions about the selected code, as depicted in the image below

Jupyternaut can even replace your selection with its own response if you choose the replace selection option. Let us tell it to print a more optimized version of the code, along with comments explaining it.

Jupyternaut sends the code to your chosen language model and then replaces the selection with the model’s response. It optimizes the code correctly by using a set rather than a list and then explains it with proper comments, as shown above.
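The original code from the notebook isn’t reproduced here, so the snippet below is a hypothetical example of the same kind of optimization: swapping a list for a set where the code does repeated membership checks. The duplicate-finding task and both function names are my own illustration.

```python
def find_duplicates_list(items):
    """Baseline: each membership check against a list scans it, O(n) per check."""
    seen, duplicates = [], []
    for item in items:
        if item in seen and item not in duplicates:
            duplicates.append(item)
        seen.append(item)
    return duplicates


def find_duplicates_set(items):
    """Optimized: membership checks against a set are O(1) on average."""
    seen, flagged = set(), set()
    ordered = []  # keeps duplicates in the order they were first detected
    for item in items:
        if item in seen and item not in flagged:
            flagged.add(item)
            ordered.append(item)
        seen.add(item)
    return ordered


data = [3, 1, 3, 2, 1, 3]
print(find_duplicates_list(data))  # [3, 1]
print(find_duplicates_set(data))   # [3, 1]
```

Both versions return the same result; the set-based one simply avoids rescanning a growing list on every iteration.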

Learn from local data

So far, so good, but let us take it one step further. Let us now ask a few questions about our local data. To use this feature, we must add some documents, preferably in text format, such as .pdf or .txt files, to the current directory. Create a new folder named docs, and add your data files to this folder. After that, use the /learn docs command as depicted below:

I fed it a research paper on sleep paralysis. Now, use /ask to ask any specific questions about the data. You’ll notice a significant difference between the AI’s responses before and after learning from the documents. Here’s an example of me asking it about sleep paralysis

Before learning the specifics of the document, the chatbot provided a vague and generic response that conveyed no useful information. However, after learning the embeddings, it provided a much better response. This is the power of retrieval-augmented generation (RAG). It allows the language model to cater to the very specifics of the data, providing highly accurate results. The one thing to note here is that the accuracy and correctness of the responses depend entirely on the quality of the data we are feeding into the model. As famously said in data science, “Garbage in, garbage out.”

You can also delete the information it learned with the /learn -d command, which will cause it to forget everything it has learned about the local data.

Generate notebooks from scratch

To demonstrate the full potential of Jupyter AI, we will now let it create an entire notebook from scratch. As this is such a complex task, it will require a highly developed and nuanced model like GPT-4 or Gemini Pro. These models use their langchain libraries to deal with complex scenarios like these. I’m choosing Gemini Pro for this task. To generate a Jupyter Notebook from a text prompt, start the prompt with the /generate command. Let’s take a look at an example of this

It created a notebook demonstrating a classification use case from scratch in just one minute. You can check the timestamps for yourself to verify this. This is what the generated notebook looks like.

I was amazed to see this level of detail in the generated notebook. I wasn’t expecting this from Gemini, and after testing different models on the same task, nothing else even came close. The notebook generated by Gemini is simply excellent. It also followed all the instructions I provided in the prompt. This truly unleashes the ultimate power of LLMs. Data scientists, beware!!

Export chat historical past

JupyterLab provides one more useful feature. You can also save your chat data using the /export command. This command exports the entire chat history to a Markdown file and saves it in the current directory. This makes Jupyter AI an extremely versatile tool.

Jupyter AI in Jupyter notebooks

The chat interface is truly remarkable, but there’s more to Jupyter AI. If you cannot install JupyterLab or it doesn’t work properly on your system, there’s one more alternative for using Jupyter AI. It can also be used in notebooks through the Jupyter AI magics with the `%%ai` command. This means you can make use of Jupyter AI’s features without relying solely on JupyterLab. This works with any IPython interface, such as Google Colab, VSCode, or your local Jupyter installation.

Enable Jupyter AI magics

If you already have `jupyter_ai` installed, the magics package `jupyter_ai_magics` will be installed automatically. Otherwise, use the following command to install it:

pip install jupyter_ai_magics

To load Jupyter AI into your IPython interface, run the command below and the magics extension will be loaded into your notebook.

%load_ext jupyter_ai_magics

To take a look at the different model providers, type `%ai list`, or you can list only the models from a specific provider using `%ai list <provider-id>`. You will now see a long list of all the different model providers and their models.

Again, I will be using the TogetherAI models and Gemini Pro. But before going further, we need to provide our API key again and store it in an environment variable. To do this, write

%env TOGETHER_API_KEY={YOUR_API_KEY}
%env GOOGLE_API_KEY={YOUR_API_KEY}

If you are using a different model provider, simply change the provider name in the variable above, and you’ll be good to go.
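If you’d rather not type the key into a notebook cell in plain text, one alternative is to set the same environment variable from Python. This is a minimal sketch under the assumption that the magics read the key from the process environment, just as they do when it is set with `%env`; `set_api_key` is an illustrative helper, not part of Jupyter AI.

```python
import os
from getpass import getpass


def set_api_key(env_name: str, value: str) -> None:
    """Store an API key in the current process environment."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError(f"{env_name} must not be empty")
    os.environ[env_name] = cleaned


# In a notebook, you could prompt for the key instead of hard-coding it:
# set_api_key("TOGETHER_API_KEY", getpass("TogetherAI API key: "))
```

With `getpass`, the key is typed into a hidden prompt, so it doesn’t end up saved in the notebook file.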

The model’s full name contains the model provider, followed by the model name. We can use an alias instead of writing the full name every time we call a cell. To set an alias for our model name, use the code below:

%ai register raven togetherai:Nexusflow/NexusRaven-V2-13B
%ai register llama-guard togetherai:Meta-Llama/Llama-Guard-7b 
%ai register hermes togetherai:Austism/chronos-hermes-13b
%ai register mytho-max togetherai:Gryphe/MythoMax-L2-13b
%ai register llama2 togetherai:NousResearch/Nous-Hermes-Llama2-13b
%ai register gemini gemini:gemini-pro

You can now use these aliases like any other model name with the `%%ai` magic command. To enable Jupyter AI for a specific cell and send text prompts to our model, we first need to invoke the `%%ai` magic command with the model name and then provide the prompt below it

%%ai llama2
{Write your prompt here}

Jupyter AI assumes that a model will output markdown by default, so the output of a `%%ai` command will be in markdown format. This can sometimes cause problems, causing some models to output nothing. You can change this by adding the `-f` or `--format` flag to your magic command. Other valid formats include code, math, html, text, image, and json.

Text Generation

If you want plain text output, it is safer to set the format flag to text. An example of this is shown below:

Mathematical Equations

We can also use it to write mathematical equations by changing the format to math.

HTML Tables

It can also generate good-looking HTML tables when the format is changed to html.

Language Translation

Using curly braces, we can also include variables and other Python expressions in the prompt. Let’s understand it using an example of translating text from English to Hindi

Similar to f-strings, the `{lang}` and `{name}` placeholders are replaced with the values assigned to the respective variables. It didn’t spell my name correctly, but I’ll let it get away with that.
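For anyone new to this style of substitution, here is how the same idea looks in plain Python. This is only an analogy for what happens to the prompt before it is sent to the model; the variable values and the prompt text are hypothetical, not the ones from my notebook.

```python
lang = "Hindi"
name = "Alice"  # hypothetical value, used only for illustration

# Placeholders in curly braces are filled in from the variables above
template = "Translate 'Hello, my name is {name}' from English to {lang}"
prompt = template.format(lang=lang, name=name)

print(prompt)  # Translate 'Hello, my name is Alice' from English to Hindi
```

In a `%%ai` cell, the substitution happens for you, so you write the placeholders directly in the prompt text.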

Error Correction

It’s good at writing and optimizing code. Let us see how well it does at correcting errors in code.

Jupyter AI has a special `Err` variable that captures errors from cell executions. It can then be used in another cell to ask questions about the error. In the example above, it correctly detects the error and rewrites the corrected code.

Generating a report

Let’s now give it a rather more complicated task to test its caliber again. Here is an example where I instructed it to generate a report on COVID-19 and its impact on the world.

As shown in the image above, the report is well-structured, with distinct sections for introduction, global health impact, economic impact, and social impact. It also elaborated on ongoing challenges and how countries worldwide are addressing them.

Text Summarization

The interpolation functionality can be extended further by combining the input or output of a particular cell with our prompt. Here’s an example where I asked it to create a brief summary of the COVID-19 report.

It lists the summary of the report in crisp bullet points. Also, interpolation allows the model to read the report directly from Jupyter, saving us the pain of copying and pasting the text each time.

Data Visualization

Now, let’s put it to a final test. For this, I uploaded the Titanic CSV file and instructed it to write the code for univariate analysis on the Titanic dataset.

Wow! Not bad at all. There is not even a single error in the code. Whenever the AI generates code, it is labeled as AI-generated, as shown in the image above. The code it provided worked and resulted in some lovely plots.

It also used subplots in the implementation, as specified in the prompt. It’s amazing how well it adapts to the specifics of the prompt.

Limitations and Challenges

So far, we have looked at the positive aspects of Jupyter AI, but like anything else, it has limitations, too. Let’s look at these limitations one by one.

Biased Response

Because LLMs are trained on massive amounts of text data from all over the internet, they sometimes produce biased responses to questions. Let’s look at an example of this:

First, it didn’t attempt to point out that the question was biased before answering it. Second, it didn’t even consider the possibility that its points could be wrong. This is just typical biased behavior.

Hallucinations

When the model simply invents something nonexistent or makes stuff up, it is said to be hallucinating. Hallucinations are one of the most prominent problems with LLMs, considerably hampering their reliability.

It doesn’t ask for clarification and completes the sentence according to its own preference. That’s why it’s always advisable to fact-check every piece of information an LLM generates rather than blindly trusting everything it says.

Factual Inconsistency

When asked about a person who has been to Mars, this was the response:

This is one more example of the AI confidently stating wrong facts.

Jupyter AI poses some other challenges as well. These are:

It is difficult to select a single reliable model for every task because models that perform well on one task may perform poorly on others.
If the question isn’t well structured, the model may misinterpret the prompt, resulting in a suboptimal or hallucinated response.

Additional Information

Apart from these, here are some additional points to keep in mind when using Jupyter AI:

Jupyter AI sends data to third-party model providers. Review the provider’s privacy policy and pricing options to better understand data usage and payment obligations.
Including additional context in messages can increase token count and costs. Therefore, it is advised to check cost policies before making large requests.
AI-generated code may contain errors, so it’s always best to carefully review all generated code before running it.
Review the provider’s policies for third-party embedding models before sending any confidential or sensitive data.

Conclusion

In this article, we looked at the incredible power of Jupyter AI and how it can assist in various tasks, freeing us from tedious and repetitive work and allowing us to focus on the more creative aspects of our jobs. This is just a glimpse of what Jupyter AI and LLMs, in general, are capable of. They have limitless potential yet to be unfolded.

I hope you enjoyed this article. As always, thank you for reading, and I look forward to seeing you at another AI tutorial.

Key Takeaways

Jupyter AI provides chat support through a conversational assistant. This assistant can help summarize text, write good-quality code, and provide more specific information by learning about local data.
We solved highly complex tasks by writing simple text prompts, such as creating an entire notebook from scratch.
Then, we looked at how to transform our Jupyter notebooks into generative AI playgrounds using the `%%ai` magic command.
We used different models for various tasks, such as code optimization, data visualization, and generating a well-structured report.
Finally, we examined some of the language models’ limitations, including their tendency to occasionally generate inconsistent, biased, or hallucinated responses.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Frequently Asked Questions

Q1. What are the different ways to access Jupyter AI?

A. Jupyter AI can be accessed in two main ways. The first way is to use the chatbot as a conversational assistant in JupyterLab. The second method is to use the `%%ai` magic command in any IPython interface, such as Google Colab, Visual Studio Code, or your local Jupyter installation.

Q2. What are the different model providers supported by Jupyter AI?

A. Jupyter AI supports a wide range of model providers and models, including OpenAI, Cohere, Hugging Face, Anthropic, and Gemini. Visit the official documentation to see the complete list of supported model providers.

Q3. How does Jupyter AI ensure data privacy?

A. Jupyter AI only contacts an LLM when you specifically request it to. It does not read your data or transmit it to models without your explicit consent.

Q4. What are the different tasks that Jupyter AI can be used for?

A. Jupyter AI can be used for a wide range of tasks, ranging from answering simple questions to generating code, creating complex data visualizations, summarizing documents, composing creative content like stories or articles, translating text between languages, and many more.

Q5. Should I choose cloud models or host them locally?

A. The choice between cloud and locally hosted models boils down to the trade-off between privacy and faster inference. In other words, if you have sensitive or highly confidential data and want to ensure maximum privacy, you should use local models. If data privacy is not a major concern for you and you want quick inference, you should go for cloud model providers.
