OpenAI의 GPT 모범 사례: 전략5. 외부 도구 사용하기 / Strategy: Use external tools

9bow · 9월 2, 2023, 7:22오전

시작하기 전에 - GPT 모범 사례 문서 개요

OpenAI의 GPT 모범 사례 문서는 GPT 모델로부터 더 나은 답변을 얻기 위한 6가지 전략(starategy)들과 함께 각 전략별 세부 전술들을 소개하고 있습니다.

이 게시글은 그 중 두번째 전략인 참고 문헌 제공하기 (Provide reference text) 를 소개하겠습니다.

전체 GPT 모범 사례에 대한 설명이 궁금하신 분들께서는 아래 목차를 참고해주세요!

OpenAI의 GPT 모범 사례: 개요. 더 나은 답변을 얻기 위한 6가지 프롬프트 작성 전략 (Six strategies for getting better results)
OpenAI의 GPT 모범 사례: 전략1. 명확한 지침 작성하기 (Write clear instructions)
OpenAI의 GPT 모범 사례: 전략2. 참고 문헌 제공하기 (Provide reference text)
OpenAI의 GPT 모범 사례: 전략3. 복잡한 작업을 더 간단한 하위 작업들로 나누기 (Split complex tasks into simpler subtasks)
OpenAI의 GPT 모범 사례: 전략4. GPT 모델에게 "생각할" 시간 주기 / Strategy: Give GPTs time to "think"
OpenAI의 GPT 모범 사례: 전략5. 외부 도구 사용하기 / Strategy: Use external tools 현재 글
OpenAI의 GPT 모범 사례: 전략6. 변경 사항을 체계적으로 테스트하기 / Strategy: Test changes systematically

https://platform.openai.com/docs/guides/gpt-best-practices/gpt-best-practices

전략 5. 외부 도구 사용하기 / Strategy: Use external tools

세부 전략: 임베딩-기반 검색을 사용하여 효율적인 지식 검색 구현하기 / Tactic: Use embeddings-based search to implement efficient knowledge retrieval

입력의 일부로 외부 소스의 정보가 제공되는 경우 모델이 이를 활용할 수 있습니다. 이렇게 하면 모델이 더 많은 정보를 바탕으로 최신 답변을 생성하는데 도움이 될 수 있습니다. 예를 들어, 사용자가 특정 영화에 대해 질문하는 경우, 해당 영화에 대한 고품질 정보(예: 배우, 감독 등)를 모델의 입력에 추가하는 것이 유용할 수 있습니다. 임베딩을 사용하면 효율적인 지식 검색을 구현하여 실행 시점(run-time)에 관련 정보를 모델 입력에 동적으로 추가할 수 있습니다.

A model can leverage external sources of information if provided as part of its input. This can help the model to generate more informed and up-to-date responses. For example, if a user asks a question about a specific movie, it may be useful to add high quality information about the movie (e.g. actors, director, etc…) to the model's input. Embeddings can be used to implement efficient knowledge retrieval, so that relevant information can be added to the model input dynamically at run-time.

텍스트 임베딩은 텍스트 문자열 간의 연관성을 측정할 수 있는 벡터입니다. 유사하거나 관련성이 있는 문자열은 관련없는 문자열보다 가깝게 배치됩니다. 그렇기 때문에 빠른 벡터 검색 알고리즘이 있으면 임베딩을 효율적인 지식 검색을 구현하는 데 사용할 수 있습니다. 특히, 텍스트 말뭉치(corpus)를 청크(chunk)로 분할하고 각 청크를 임베딩하고 저장할 수 있습니다. 그런 다음 주어진 쿼리를 임베딩하고 벡터 검색을 수행하여 쿼리와 가장 관련성이 높은 말뭉치 청크를 찾을 수 있습니다. (즉, 임베딩 공간에서 가장 가까운 위치에 있음)

A text embedding is a vector that can measure the relatedness between text strings. Similar or relevant strings will be closer together than unrelated strings. This fact, along with the existence of fast vector search algorithms means that embeddings can be used to implement efficient knowledge retrieval. In particular, a text corpus can be split up into chunks, and each chunk can be embedded and stored. Then a given query can be embedded and vector search can be performed to find the embedded chunks of text from the corpus that are most related to the query (i.e. closest together in the embedding space).

이에 대한 구현 예제는 OpenAI Cookbook에서 확인할 수 있습니다. 지식 검색을 사용하여 모델이 잘못된 사실을 만들 가능성을 최소화하는 방법에 대한 예제는 "모델이 검색된 지식을 사용하여 쿼리에 답변하도록 지시하기" 전략을 참조하세요.

Example implementations can be found in the OpenAI Cookbook. See the tactic “Instruct the model to use retrieved knowledge to answer queries” for an example of how to use knowledge retrieval to minimize the likelihood that a model will make up incorrect facts.

세부 전략: 코드 실행을 사용하여 보다 정확히 계산하거나 외부 API를 호출하세요 / Tactic: Use code execution to perform more accurate calculations or call external APIs

GPT는 자체적으로 산술 연산이나 긴 연산을 정확하게 수행할 수 없습니다. 이러한 계산이 필요한 경우 모델에 모델이 직접 계산을 하기보다는 코드를 작성하고 실행할 수 있도록 지시할 수 있습니다. 특히, 모델이 실행할 코드를 역따옴표(백틱, `) 3개로 감싸는 식으로 지정한 형식으로 입력하도록 지시할 수 있습니다. 출력이 생성된 뒤, 코드 부분을 추출하여 실행할 수 있습니다. 마지막으로, 필요한 경우 코드 실행 엔진(예: Python 인터프리터)에서의 출력을 다음 쿼리의 모델 입력으로 제공할 수 있습니다.

GPTs cannot be relied upon to perform arithmetic or long calculations accurately on their own. In cases where this is needed, a model can be instructed to write and run code instead of making its own calculations. In particular, a model can be instructed to put code that is meant to be run into a designated format such as triple backtics. After an output is produced, the code can be extracted and run. Finally, if necessary, the output from the code execution engine (i.e. Python interpreter) can be provided as an input to the model for the next query.

시스템 메시지(SYSTEM)

```코드는 여기에 있습니다```와 같은 식으로 3개의 역따옴표로 코드를 감싸서 Python 코드를 작성하고 실행할 수 있습니다. 이를 사용하여 계산을 수행하세요.
> You can write and execute Python code by enclosing it in triple backticks, e.g. ```code goes here```. Use this to perform calculations.

사용자 메시지(USER)

다음 다항식의 모든 실수 해를 찾으세요: 3*x**5 - 5*x**4 - 3*x**3 - 7*x - 10.
> Find all real-valued roots of the following polynomial: 3*x**5 - 5*x**4 - 3*x**3 - 7*x - 10.

위 대화를 실행해보기 / Open in Playground

코드 실행을 위한 또 다른 좋은 사용 사례는 외부 API를 호출하는 것입니다. 모델이 API의 올바른 사용법을 알려주면, 이를 활용하는 코드를 작성할 수 있습니다. API를 사용하는 방법을 보여주는 문서 또는 코드 샘플(혹은 둘 다)을 제공하여 모델에게 API 사용법을 알려줄 수 있습니다.

Another good use case for code execution is calling external APIs. If a model is instructed in the proper use of an API, it can write code that makes use of it. A model can be instructed in how to use an API by providing it with documentation and/or code samples showing how to use the API.

시스템 메시지(SYSTEM)

파이썬 코드를 세 개의 백틱으로 묶어 작성하고 실행할 수 있습니다.
또한 사용자가 친구에게 메시지를 보내는 데 도움이 되는 다음 모듈에 액세스할 수 있습니다:
```python import message message.write(to="John", message="퇴근 후 만날래요?")```
> You can write and execute Python code by enclosing it in triple backticks.
> Also note that you have access to the following module to help users send messages to their friends:
> ```python import message message.write(to="John", message="Hey, want to meetup after work?")```

위 대화를 실행해보기 / Open in Playground

경고: 모델에서 생성된 코드를 실행하는 것은 본질적으로 안전하지 않으므로, 이를 실행하려는 어플리케이션에서는 주의를 기울여야 합니다. 특히, 신뢰할 수 없는 코드로 인해 생기는 피해를 제한하기 위해 샌드박스화된 코드 실행 환경이 필요합니다.

WARNING: Executing code produced by a model is not inherently safe and precautions should be taken in any application that seeks to do this. In particular, a sandboxed code execution environment is needed to limit the harm that untrusted code could cause.