Ollama adds support for embedding models

9bow · April 10, 2024, 3:45 PM

~~(The image on the announcement blog was cute, so I moved it to the top.)~~

PyTorchKR

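With interest in RAG on the rise, Ollama, one of the major platforms for running LLMs locally, has announced support for embedding models. The announcement post also briefly explains what embedding models are and how to use them, so let's take a look.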

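Introduction

Embedding models are models trained specifically to generate vector embeddings: arrays of numbers that represent the meaning of a given sequence of text. These multidimensional vectors are typically stored in what is commonly called a vector DB and are used to search for semantically similar data.

Unlike conventional search methods, an embedding model lets retrieval work on the meaning of the text rather than its surface form, which tends to surface more relevant results. In RAG applications in particular, embedding models can substantially improve the accuracy and efficiency of retrieval.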
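To make "semantically similar" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-dimensional vectors are made up for illustration and are not the output of any real model; actual embedding models produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (assumed values, not real model output).
llama = [0.9, 0.1, 0.0]
alpaca = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Texts with related meaning should score higher than unrelated ones.
print(cosine_similarity(llama, alpaca) > cosine_similarity(llama, invoice))
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are close to orthogonal, which for embeddings usually indicates unrelated content.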

Key features

Supported embedding models: Ollama supports several embedding models of different sizes. For example, mxbai-embed-large has 334M parameters, nomic-embed-text has 137M, and all-minilm has 23M.

| Model | Parameter size |
| --- | --- |
| `mxbai-embed-large` | 334M |
| `nomic-embed-text` | 137M |
| `all-minilm` | 23M |

Usage: after pulling a model, you can generate vector embeddings from it using the REST API or the Python or JavaScript library.
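As a rough sketch of the REST route from Python without the `ollama` package, the request can be built with the standard library alone. The endpoint, the `model` and `prompt` fields, and the `embedding` key in the response follow the cURL and Python examples elsewhere in this post; the helper names `build_embedding_request` and `embed` are my own and are not part of any Ollama library.

```python
import json
import urllib.request

# Local endpoint as used in this post's cURL example.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(
    model: str, prompt: str, url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a single embedding."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model: str, prompt: str) -> list[float]:
    """Send the request; requires a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_embedding_request(model, prompt)) as resp:
        return json.loads(resp.read())["embedding"]
```

With a local server running, `embed("mxbai-embed-large", "Llamas are members of the camelid family")` should return the embedding as a list of floats.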

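How to use

Using an embedding model in a RAG application breaks down into three steps. First, install Ollama and ChromaDB, then generate and store embeddings for your documents. Second, retrieve the document most relevant to the user's prompt. Finally, use the retrieved document to generate a new response. Let's look at each step in a little more detail:

Step 1: Generate embeddings

After installing Ollama and ChromaDB, generate vector embeddings for several documents. In this step an Ollama embedding model converts the meaning of each document into a vector, which is then stored in the database.

Installation: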

pip install ollama chromadb

Example code containing the documents (example.py):

import ollama
import chromadb

documents = [
  "Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
  "Llamas were first domesticated and used as pack animals 4,000 to 5,000 years ago in the Peruvian highlands",
  "Llamas can grow as much as 6 feet tall though the average llama between 5 feet 6 inches and 5 feet 9 inches tall",
  "Llamas weigh between 280 and 450 pounds and can carry 25 to 30 percent of their body weight",
  "Llamas are vegetarians and have very efficient digestive systems",
  "Llamas live to be about 20 years old, though some only live for 15 years and others live to be 30 years old",
]

client = chromadb.Client()
collection = client.create_collection(name="docs")

# store each document in a vector embedding database
for i, d in enumerate(documents):
  response = ollama.embeddings(model="mxbai-embed-large", prompt=d)
  embedding = response["embedding"]
  collection.add(
    ids=[str(i)],
    embeddings=[embedding],
    documents=[d]
  )

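Step 2: Retrieve documents

Given a question or prompt from the user, retrieve the most relevant document using the vector embeddings generated in step 1. The prompt is embedded with the same model, and the database returns the stored document whose embedding is closest to the prompt's embedding.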

# an example prompt
prompt = "What animals are llamas related to?"

# generate an embedding for the prompt and retrieve the most relevant doc
response = ollama.embeddings(
  prompt=prompt,
  model="mxbai-embed-large"
)
results = collection.query(
  query_embeddings=[response["embedding"]],
  n_results=1
)
data = results['documents'][0][0]
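To show what "most relevant" means mechanically, here is a conceptual stand-in for the query above: a brute-force scan that ranks stored documents by distance to the query embedding. This is not ChromaDB's implementation (a real vector database typically uses an index rather than a full scan). It assumes plain Euclidean distance as the metric, and the function names and two-dimensional vectors are hypothetical.

```python
import math


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_documents(
    query_embedding: list[float],
    stored: list[tuple[str, list[float]]],
    n_results: int = 1,
) -> list[str]:
    """Return the n_results documents whose embeddings are closest to the query."""
    ranked = sorted(
        stored, key=lambda pair: euclidean_distance(query_embedding, pair[1])
    )
    return [doc for doc, _ in ranked[:n_results]]


# Hypothetical toy data standing in for (document, embedding) pairs.
stored = [
    ("camelid doc", [0.9, 0.1]),
    ("lifespan doc", [0.1, 0.9]),
]

print(nearest_documents([0.8, 0.2], stored))
```

The full scan costs time proportional to the number of stored documents, which is fine for six llama facts but is the reason dedicated vector databases exist for larger collections.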

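Step 3: Generate a response

Use the retrieved document to generate a new text response to the user's question or prompt. In this step an Ollama text generation model takes the information in the retrieved document and expands on it to produce the answer shown to the user.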

# generate a response combining the prompt and data we retrieved in step 2
output = ollama.generate(
  model="llama2",
  prompt=f"Using this data: {data}. Respond to this prompt: {prompt}"
)

print(output['response'])

Running the example code above, the Llama 2 model answers the question "What animals are llamas related to?" as follows:

Llamas are members of the camelid family, which means they are closely related to two other animals: vicuñas and camels. All three species belong to the same evolutionary lineage and share many similarities in terms of their physical characteristics, behavior, and genetic makeup. Specifically, llamas are most closely related to vicuñas, with which they share a common ancestor that lived around 20-30 million years ago. Both llamas and vicuñas are members of the family Camelidae, while camels belong to a different family (Dromedary).
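The prompt in step 3 interpolates a single retrieved document. If `n_results` in step 2 were raised above 1, the retrieved documents would have to be combined before being handed to the model. A minimal sketch of one way to do that follows; the helper name `build_rag_prompt` and the bullet-list layout are my own choices, while the surrounding wording mirrors the prompt used in step 3.

```python
def build_rag_prompt(documents: list[str], question: str) -> str:
    """Join retrieved documents into one context block ahead of the question."""
    if not documents:
        raise ValueError("at least one retrieved document is required")
    context = "\n".join(f"- {doc}" for doc in documents)
    return f"Using this data:\n{context}\nRespond to this prompt: {question}"


# Hypothetical retrieved documents for illustration.
example_prompt = build_rag_prompt(
    [
        "Llamas are members of the camelid family",
        "Llamas are vegetarians",
    ],
    "What animals are llamas related to?",
)
print(example_prompt)
```

The resulting string can be passed as the `prompt` argument of `ollama.generate` in place of the f-string from step 3.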

Embedding API usage examples

Example using cURL:

curl http://localhost:11434/api/embeddings -d '{
  "model": "mxbai-embed-large",
  "prompt": "Llamas are members of the camelid family"
}'

Example using the Python library:

import ollama

ollama.embeddings(
  model='mxbai-embed-large',
  prompt='Llamas are members of the camelid family',
)

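Upcoming features

More features are planned to support workflows that involve embeddings:

Batch embeddings: processing multiple input prompts at once
OpenAI API compatibility: support for OpenAI-compatible endpoints such as /v1/embeddings
More embedding model architectures: support for ColBERT, RoBERTa, and other embedding model architectures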
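Until batch embedding is available, several inputs can be embedded by calling the single-prompt API once per input, as step 1 does in its loop. The sketch below wraps that pattern; `embed_with_ollama` and `embed_many` are hypothetical helper names, and only the `ollama.embeddings(model=..., prompt=...)` call comes from the examples in this post.

```python
from typing import Callable


def embed_with_ollama(text: str, model: str = "mxbai-embed-large") -> list[float]:
    """Embed one text; needs the ollama package and a running local server."""
    import ollama  # imported lazily so embed_many can be used with other backends

    return ollama.embeddings(model=model, prompt=text)["embedding"]


def embed_many(
    texts: list[str],
    embed_one: Callable[[str], list[float]] = embed_with_ollama,
) -> list[list[float]]:
    """Embed each text in order with one call per input (no true batching)."""
    return [embed_one(text) for text in texts]
```

Passing the embedding function as a parameter keeps the loop easy to swap out later, for example for a batch call once one is supported.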

Further reading

Announcement post

Ollama embedding model API documentation

Ollama REST API documentation

github.com/ollama/ollama, docs/api.md (main branch)

Python library documentation
https://github.com/ollama/ollama-python

JavaScript library documentation
https://github.com/ollama/ollama-js
