Hugging Face LLMs - Data science and AI for Bio/medical applications using python

Sometimes you don’t want to, or can’t, use a corporate foundation model for your work. In that case, it makes sense to download an open source model. Hugging Face is an aggregator of open source models with a lot of supporting features.

It’s probably better to have a GPU runtime. To get one in Colab, click the runtime menu in the upper right-hand corner and select a GPU instance.
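A quick way to confirm that the runtime actually has a GPU is to ask PyTorch. This is a minimal sketch, assuming PyTorch is the backend (it comes in with the transformers[torch] install in these notes). The helper name gpu_available is my own, and it simply returns False if torch is not installed yet.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA GPU."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


print("GPU available:", gpu_available())
```

If this prints False on Colab, the runtime type is probably still set to CPU.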

I adapted these notes from this Colab tutorial, and this is as easy as I could make it. In addition to downloading the model, it’s possible to run the model from Hugging Face or from a server that you set up. In this case, I’m using an open source model, so you don’t need an HF access token. But if you want to use models that require a login and credentials, you’ll need to set that up.
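For models that do require credentials, or for running against a hosted endpoint instead of downloading, a token has to be supplied. The sketch below assumes the token is stored in the HF_TOKEN environment variable. The helper names get_hf_token and make_remote_llm are hypothetical, and make_remote_llm assumes the llama-index-llms-huggingface-api package is installed (it is imported lazily, so nothing is contacted until the function is called).

```python
import os
from typing import Optional


def get_hf_token() -> Optional[str]:
    """Read the Hugging Face access token from the environment, if set."""
    token = os.environ.get("HF_TOKEN", "").strip()
    return token or None


def make_remote_llm(model_name: str):
    """Build an LLM that calls a hosted model instead of running it locally."""
    # Lazy import: requires the llama-index-llms-huggingface-api package.
    from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

    return HuggingFaceInferenceAPI(model_name=model_name, token=get_hf_token())


print("Token configured:", get_hf_token() is not None)
```

Keeping the token in an environment variable (or a Colab secret) avoids pasting it into the notebook itself.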

Below I install the llama-index integrations for Hugging Face models (output omitted).

## Installs; I needed to restart the runtime and run this twice
%pip install llama-index-llms-huggingface
%pip install llama-index-llms-huggingface-api

Below I install the inference engine.

## Installs 2; installing inference engine
%pip install "transformers[torch]" "huggingface_hub[inference]"
%pip install llama-index
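Since the installs sometimes need a runtime restart before they take effect, it can help to check what is actually installed. This is a minimal sketch using the standard library’s importlib.metadata. The PACKAGES list and the helper name installed_versions are my own choices, not part of any of these libraries.

```python
from importlib import metadata
from typing import Dict, List, Optional

# Distribution names taken from the pip commands in these notes.
PACKAGES = [
    "llama-index",
    "llama-index-llms-huggingface",
    "llama-index-llms-huggingface-api",
    "transformers",
    "huggingface_hub",
]


def installed_versions(packages: List[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Anything reported as NOT INSTALLED after a restart means the corresponding install cell needs to be run again.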

Let’s do our imports.

import os
from typing import List, Optional
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

Consider a model hosted on Hugging Face, such as https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha; the code below uses microsoft/Phi-4-mini-instruct. The model is downloaded automatically on first invocation. On something like Colab, it has to be redownloaded every time the runtime is disconnected (which occurs often). On your own machine, it stays cached. Note again that it only makes sense to do this if you have an Nvidia GPU to run the models on.
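To see which models are already on disk, you can look in the Hugging Face hub cache. The sketch below is a simplified version of how the cache location is resolved: it assumes the default of ~/.cache/huggingface/hub, overridden by the HF_HUB_CACHE or HF_HOME environment variables, and it assumes cached models live in folders named models--org--name. The helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import List


def hf_hub_cache_dir() -> Path:
    """Resolve the hub cache directory (simplified; ignores XDG_CACHE_HOME)."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"]).expanduser()
    hf_home = os.environ.get("HF_HOME", "~/.cache/huggingface")
    return Path(hf_home).expanduser() / "hub"


def list_cached_models() -> List[str]:
    """Return repo ids such as 'org/name' for models already in the cache."""
    cache = hf_hub_cache_dir()
    if not cache.is_dir():
        return []
    prefix = "models--"
    # Folder names encode the repo id with '--' in place of '/'.
    return sorted(
        entry.name[len(prefix):].replace("--", "/")
        for entry in cache.iterdir()
        if entry.is_dir() and entry.name.startswith(prefix)
    )


print(list_cached_models())
```

On a fresh Colab runtime this list is empty, which is why the download repeats after every disconnect.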

Here we create an LLM instance called locally_run that references our model. The query usually needs a specific format, which is specified on the model’s Hugging Face page.


model_name = "microsoft/Phi-4-mini-instruct"

locally_run = HuggingFaceLLM(
    model_name=model_name,
    tokenizer_name=model_name,
    # Optionally, specify the prompt wrapper here (system_message must be defined first).
    # I prefer to build the prompt manually; here's an example of the wrapper:
    # query_wrapper_prompt=f"<|system|>{system_message}<|end|><|user|>{{query_str}}<|end|><|assistant|>",
    context_window=4096,
    max_new_tokens=256,
    device_map="auto",
)

Let’s set up a system prompt (instructions to the LLM) and a user prompt, in the correct format for the LLM. Then we pass the prompt to our engine and we get the result.

system = "<|system|>"+"You are a helpful friend"+"<|end|>"
user = "<|user|>"+"How can I address my plantar fasciitis"+"<|end|>"
prompt = system + user + "<|assistant|>"
print(prompt)
response = locally_run.complete(prompt)
print(response)

<|system|>You are a helpful friend<|end|><|user|>How can I address my plantar fasciitis<|end|><|assistant|>
I'm sorry to hear that you're dealing with plantar fasciitis. Here are some steps you can take to address and manage the condition:

1. **Rest and Ice**: Give your feet a break from activities that exacerbate the pain. Apply ice packs to the affected area for 15-20 minutes several times a day to reduce inflammation.

2. **Stretching Exercises**: Regular stretching can help alleviate pain and improve flexibility. Try the following stretches:
   - **Calf Stretch**: Stand facing a wall with one foot in front of the other. Bend the front knee and keep the back leg straight. Hold for 20-30 seconds and switch sides.
   - **Towel Stretch**: Sit on the floor with your legs extended. Loop a towel around the ball of one foot and gently pull it towards you while keeping your knee straight. Hold for 20-30 seconds and switch sides.

3. **Proper Footwear**: Wear supportive shoes with good arch support. Avoid high heels and flat, rigid shoes. Consider using orthotic inserts if recommended by your healthcare provider.

4. **Weight Management**: If you're overweight, losing weight can reduce the strain on your plantar fascia.

5. **Anti-inflammatory Medications**: Over-the-counter pain

This took only a few seconds on a GPU instance on Colab. The response stops mid-sentence, most likely because max_new_tokens=256 caps the output length.
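The manual prompt assembly used above can be wrapped in a small helper so the tags are written only once. This is a minimal sketch for a single-turn prompt using the same tags as the example; the function name build_prompt is my own, and other models will need a different template.

```python
def build_prompt(system_message: str, user_message: str) -> str:
    """Assemble a single-turn prompt with <|system|>, <|user|>, and <|assistant|> tags."""
    return (
        f"<|system|>{system_message}<|end|>"
        f"<|user|>{user_message}<|end|>"
        "<|assistant|>"
    )


prompt = build_prompt(
    "You are a helpful friend",
    "How can I address my plantar fasciitis",
)
print(prompt)
# With the model loaded as shown earlier, the call would then be:
# response = locally_run.complete(prompt)
```

The printed prompt is identical to the one built by string concatenation in the example.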