Sometimes you don’t want to, or can’t, use a corporate foundation model for your work. In that case, it makes sense to download an open-source model. Hugging Face is an aggregator of open-source models with a lot of features.
It’s probably better to have a GPU runtime. To get one in Colab, click the runtime selector in the upper right-hand corner and select a GPU instance.
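To confirm that the runtime actually has a GPU, a quick check like the one below can help. This is a minimal sketch that assumes PyTorch is the backend (it is installed later via "transformers[torch]"); `pick_device` is a helper name I made up, and it falls back to "cpu" if PyTorch isn't installed yet.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        # PyTorch is not installed yet, so we can't detect a GPU this way.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

If this prints "cpu" after the installs have run, the runtime type probably still needs to be switched to a GPU instance.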
I adapted these notes from this Colab tutorial, and this is as easy as I could make it. Besides downloading the model and running it locally, it’s possible to run a model hosted on Hugging Face or on a server that you set up. In this case, I’m using an open-source model, so you don’t need a Hugging Face (HF) access token. If you want to use models that require a login and credentials, though, you’ll need to set that up.
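For models that do need credentials, or for running a model hosted on Hugging Face instead of locally, the setup might look roughly like this. It's a sketch under a few assumptions: the token is stored in an environment variable named HF_TOKEN, and `get_hf_token` and `build_remote_llm` are hypothetical helper names. The `HuggingFaceInferenceAPI` import only succeeds once the packages below are installed, so it is kept inside the function.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read a Hugging Face access token from the environment.

    Returns None when the variable is unset or empty.
    """
    token = os.environ.get(env_var, "").strip()
    return token or None


def build_remote_llm(model_name, token=None):
    """Create an LLM that runs on Hugging Face's servers instead of locally."""
    # Imported lazily so this cell runs even before the installs below.
    from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

    return HuggingFaceInferenceAPI(model_name=model_name, token=token)


token = get_hf_token()
print("Token found" if token else "No token set; fine for open models")
```

Calling `build_remote_llm(model_name, token=get_hf_token())` would then give an object with the same `complete` method used later in these notes, assuming the chosen model is actually served by the hosted inference API.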
Below I set up the models (output omitted).
## Installs; I needed to restart the runtime and run this twice
%pip install llama-index-llms-huggingface
%pip install llama-index-llms-huggingface-api

Below I install the inference engine.
## Installs 2; installing inference engine
!pip install "transformers[torch]" "huggingface_hub[inference]"
!pip install llama-index

Let’s do our imports.
import os
from typing import List, Optional
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

Consider the model from https://
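If the imports above fail, it can help to check what is actually installed, since I had to restart the runtime and rerun the installs. Here's a minimal sketch using only the standard library. The package list simply mirrors the pip commands above, and `installed_versions` is a helper name I made up.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip commands in this post.
PACKAGES = [
    "llama-index",
    "llama-index-llms-huggingface",
    "llama-index-llms-huggingface-api",
    "transformers",
    "huggingface-hub",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Anything reported as NOT INSTALLED is a hint to rerun the install cells and restart the runtime.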
Here we create an LLM instance called locally_run that references our model.
The query usually needs a specific format, which is specified on the model’s Hugging Face page.
model_name = "microsoft/Phi-4-mini-instruct"
locally_run = HuggingFaceLLM(
    model_name=model_name,
    tokenizer_name=model_name,
    # You can specify the prompt wrapper here if you want. I prefer to build the prompt manually; here's an example:
    # query_wrapper_prompt=f"<|system|>{system_message}<|end|><|user|>{{query}}<|end|><|assistant|>",
    context_window=4096,
    max_new_tokens=256,
    device_map="auto",
)

Let’s set up a system prompt (instructions to the LLM) and a user prompt, in the correct format for the LLM. Then we pass the prompt to our engine and get the result.
system = "<|system|>"+"You are a helpful friend"+"<|end|>"
user = "<|user|>"+"How can I address my plantar fasciitis"+"<|end|>"
prompt = system + user + "<|assistant|>"
print(prompt)
response = locally_run.complete(prompt)
print(response)

<|system|>You are a helpful friend<|end|><|user|>How can I address my plantar fasciitis<|end|><|assistant|>
I'm sorry to hear that you're dealing with plantar fasciitis. Here are some steps you can take to address and manage the condition:
1. **Rest and Ice**: Give your feet a break from activities that exacerbate the pain. Apply ice packs to the affected area for 15-20 minutes several times a day to reduce inflammation.
2. **Stretching Exercises**: Regular stretching can help alleviate pain and improve flexibility. Try the following stretches:
- **Calf Stretch**: Stand facing a wall with one foot in front of the other. Bend the front knee and keep the back leg straight. Hold for 20-30 seconds and switch sides.
- **Towel Stretch**: Sit on the floor with your legs extended. Loop a towel around the ball of one foot and gently pull it towards you while keeping your knee straight. Hold for 20-30 seconds and switch sides.
3. **Proper Footwear**: Wear supportive shoes with good arch support. Avoid high heels and flat, rigid shoes. Consider using orthotic inserts if recommended by your healthcare provider.
4. **Weight Management**: If you're overweight, losing weight can reduce the strain on your plantar fascia.
5. **Anti-inflammatory Medications**: Over-the-counter pain
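The response is cut off mid-sentence, most likely because max_new_tokens=256 caps its length. Generating it took only a few seconds on a GPU instance on Colab.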
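The prompt assembled by hand above can be wrapped in a small helper so the tags don't have to be retyped for every question. This is just a sketch: `build_phi_prompt` is a hypothetical name, and the tag layout is copied from the example in this post, so check the model’s Hugging Face page before reusing it with another model.

```python
def build_phi_prompt(user_message, system_message=None):
    """Assemble a single-turn prompt using the tag layout from this post."""
    parts = []
    if system_message:
        parts.append(f"<|system|>{system_message}<|end|>")
    parts.append(f"<|user|>{user_message}<|end|>")
    # Ending with the assistant tag cues the model to write its reply.
    parts.append("<|assistant|>")
    return "".join(parts)


prompt = build_phi_prompt(
    "How can I address my plantar fasciitis",
    system_message="You are a helpful friend",
)
print(prompt)
```

The resulting string is the same one printed earlier, so it can be passed to `locally_run.complete(prompt)` in the same way.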