[TLDR] 오늘의 AI 뉴스, 2023-05-15

9bow · 5월 16, 2023, 6:41오전

파이토치 한국 사용자 모임에서는 TLDR 뉴스레터의 승인을 받아 AI 소식을 DeepL로 번역하여 전합니다.

더 많은 AI 소식 및 정보를 공유하고 함께 성장하고 싶으시면 파이토치 한국 사용자 모임에 방문해주세요!

주요 뉴스 & 신규 출시 소식 / Headlines & Launches

OpenAI, ChatGPT 플러그인 출시 / OpenAI Rolling Out ChatGPT Plugins (2 minute read)

OpenAI는 이번 주 프리미엄 가입자들을 대상으로 ChatGPT 플러그인 70종 이상을 출시합니다. 이를 통해 사용자들은 ChatGPT를 통해 인터넷에 접근할 수 있습니다.

OpenAI is rolling out ChatGPT 70+ plugins to premium subscribers this week, allowing users to access the internet.

구글, 코드 작성 AI 코디(Codey) 공개 / Google Unveils Codey (1 minute read)

이 기사에서는 프로그래머의 코드 작성을 돕기 위해 설계된 새로운 생성형 AI 모델인 Codey의 도입에 대해 설명합니다. 코디는 사용자에게 제안을 하고, 오류를 식별하고, 전체 코드 블록을 완성할 수도 있는 코드 생성 모델입니다. 사용자가 빠르고 효율적으로 솔루션을 찾을 수 있도록 도와 코딩 프로세스를 간소화하고 개발자의 생산성을 높이는 것을 목표로 합니다.

The article discusses Google's introduction of a new generative AI model called Codey, designed to assist programmers in writing code. Codey is a code-generating model that can provide users with suggestions, identify errors, and even complete entire blocks of code. It aims to streamline the coding process and boost developer productivity by helping users find solutions quickly and efficiently.

연구 & 혁신 관련 소식 / Research & Innovation

추론을 통한 (객체) 검출 / Detection through reasoning (GitHub Repo)

객체를 감지할 때는 일반적으로 미리 정의된 클래스 집합에서 가져옵니다. 또한 장면에 대해 질문하는 것도 어렵습니다. 이 경우 명령어로 조정된 감지기와 함께 강력한 언어 모델(Vicuna)을 사용하여 쿼리를 추론하고 그 결과 오브젝트를 감지할 수 있습니다.

When detecting objects in a scene, you usually pull from a set of predefined classes. Also, asking questions about the scene is challenging. In this case, we can use powerful language models (Vicuna) with instruction tuned detectors to reason about queries and detect objects as a result.

Open-LLaMa (GitHub Repo)

오픈 소스 고성능 라마 모델의 전체 트레이닝 코드로, 사전 트레이닝부터 RLHF까지 전체 프로세스를 포함합니다.

The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.

PaLM (GitHub Repo)

Google의 PaLM 모델을 오픈소스로 구현한 것입니다.

An open-source implementation of Google's PaLM models.

엔지니어링 및 리소스 관련 소식 / Engineering & Resources

프롬프트 인젝션에 대한 설명 / Prompt Injection Explained (17 minute read)

AI 분야에서 원하는 출력을 얻기 위해 의도적으로 특정 입력을 AI 모델에 삽입하는 관행을 설명하기 위해 사용되는 용어인 프롬프트 인젝션에 대한 전체 소개와 함께 이것이 중요한 문제인 이유와 제안된 많은 솔루션이 효과적이지 않은 이유를 설명합니다.

A full introduction into prompt injection, a term used in the field of AI to describe the practice of deliberately inserting specific inputs into an AI model to obtain desired outputs, including why it’s an important issue and why many of the proposed solutions won’t be effective.

사전 지식으로 이미지 품질 향상 / Enhancing Image Quality with Prior Knowledge (16 minute read)

미리 설정된 텍스트-이미지 모델을 사용하여 흐릿한 이미지의 품질을 향상시키는 새로운 기술이 개발되었습니다. 이 방법은 특수 인코더를 스마트하게 사용하며 기존 이미지 생성 모델을 변경할 필요가 없어 학습 시간을 절약할 수 있습니다. 또한 사용자는 한 번의 간단한 조정으로 이미지 품질을 제어할 수 있습니다. 또한 이 전략은 이전 방법보다 큰 이미지를 더 잘 처리합니다. 인공 이미지와 실제 이미지를 모두 사용한 테스트에서 기존 솔루션보다 더 효과적이라는 것이 입증되었습니다.

A new technique has been created that enhances the quality of blurry images using pre-set text-to-image models. This method smartly uses a special encoder and doesn't need to change the existing image-making model, saving training time. Plus, users can control image quality with one simple adjustment. The strategy also handles larger images better than previous methods. Tests with both artificial and real-world images prove it's more effective than current solutions.

BLIP 사용법 / Instruct Blip (33 minute read)

Blip은 시각 및 언어 작업에 사용할 수 있는 Salesforce에서 개발한 모델입니다. 언어 모델이 지침을 따르도록 튜닝하는 일련의 작업에 따라 이러한 비전 언어 모델도 지침을 따르도록 튜닝할 수 있습니다. 이렇게 하면 성능이 크게 향상되며, 이 경우 공개된 GPT-4의 수치를 능가하기도 합니다.

Blip is a model developed by Salesforce that can be used for vision and language tasks. Following the line of work in language models that tunes them to follow instructions, we can also tune these vision language models to follow instructions. This dramatically improves the performance, and in this case it even outperforms the published numbers of GPT-4.

그 외 소식 / Miscellaneous

Cohere의 LLM 유니버시티 / Cohere LLM university (6 minute read)

Cohere는 다른 대형 업체들과 유사한 언어 모델 API를 제공하는 스타트업입니다. 애플리케이션을 구축하는 데 사용할 수 있는 강력한 모델 세트를 보유하고 있습니다. 이 언어 모델 대학은 최신 언어 모델에 대한 최신 정보를 제공하고 Cohere의 도구를 사용하여 이를 기반으로 구축하는 방법을 보여주기 위해 설계되었습니다.

Cohere is a startup that offers a language model API similar to other big players. They have a set of powerful models that you can use to build applications. This language model university is designed to bring you up to speed on modern language models and show how to build on them with cohere's tools.

빅테크들은 AI 자료를 얼마나 더 공개할까? / How Long Will Big Tech Open Source AI Handouts Last? (6 minute read)

이 글에서는 특히 최근 유출된 메모에 따르면 구글과 OpenAI가 오픈소스 커뮤니티에 밀릴 것이라는 전망에 비추어 빅테크 기업들이 얼마나 오랫동안 오픈소스 AI 자료를 제공할 것인지에 대해 자세히 살펴봅니다.

This article dives into how long big tech companies will continue to provide open-source AI material, especially in light of the recent leaked memo suggesting that Google and OpenAI will lose out to the open-source community.

챗봇은 여전히 부정과 싸우고 있다 / Chatbots Still Struggle With Negation (7 minute read)

ChatGPT와 같은 AI 모델의 놀라운 기능에도 불구하고 한계가 있습니다. 특히 상식적인 추론이 부족하고 처리 중인 텍스트 이외의 맥락을 이해하지 못합니다. 이러한 무능력은 무의미하거나 일관성이 없거나 편향된 응답으로 이어질 수 있습니다. 이는 방대한 양의 텍스트 데이터에서 학습하지만 세상에 대한 이해나 추론 능력이 포함되지 않은 훈련 방법 때문입니다. 연구자들은 AI의 이러한 측면을 개선하기 위한 방법을 모색하고 있지만 여전히 중요한 과제가 남아 있습니다.

Despite the remarkable capabilities of AI models like ChatGPT, they do have limitations. In particular, they lack common sense reasoning and are unable to understand context beyond the immediate text they are processing. This inability can lead to nonsensical, inconsistent, or biased responses. This is due to the training method, which involves learning from vast amounts of text data but doesn't include an understanding of the world or the ability to reason about it. Researchers are exploring ways to improve these aspects of AI, but significant challenges remain.

더 읽어보기 / Quick Links

순다르 피차이가 말하는 검색, AI, 그리고 Microsoft / Sundar Pichai Talks Search, AI, & Microsoft (20 minute read)

Google의 AI 전략에 대해 이야기하는 Google CEO 순다르 피차이와의 인터뷰 기사

An interview with Google CEO Sundar Pichai discussing Google’s AI push.

Bard API (Product Launch)

Google의 최신 LLM인 PaLM-2로 구동되는 Bard의 AI를 사용하기 위한 리버스 엔지니어링 API

A reverse-engineered API for using Bard's AI, which is powered by Google's newest LLM; PaLM-2.

Jax 샤드 맵을 사용한 간편한 병렬 처리 / Easy parallelism with Jax shard map (8 minute read)

최신 머신러닝은 GPU와 같은 다양한 하드웨어 가속기에서 계산을 수행해야 합니다. 이를 코드에서 올바르게 구현하는 것은 까다롭습니다. Jax는 많은 실험적인 기능으로 선도해 왔습니다. Shmap은 수많은 혁신 중 하나입니다. 강력하고 최신 알고리즘과 확장 가능한 훈련을 쉽게 구현할 수 있습니다.

Modern ML requires computation across many hardware accelerators like GPUs. Getting this right, in code, is tricky. Jax has been leading with many experimental features. Shmap is another in a long line of innovations. It is powerful and allows easy implementation of modern algorithms and scalable training.

랭체인의 새로운 검색 프레임워크 / New retrieval framework in langchain (4 minute read)

언어 모델은 검색을 사용하여 최신 정보나 문맥에 맞지 않는 정보를 가져옵니다. 이 미래 지향적 검색은 Google 검색 API와 Open AI의 대규모 언어 모델을 사용하여 강력한 검색 질문 답변 시스템을 구축합니다.

Language models use retrieval to get up to date information or information that doesn't fit in the context. This forward looking retrieval uses a Google search API and large language models from Open AI to build a robust retrieval question answering system.