[TLDR] Today's AI News, 2023-05-31: How to Fix ChatGPT Hallucinations ☁️, Statement on AI Risk 🚨, ControlNet for Video 📹

9bow · June 1, 2023, 8:17 AM

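With permission from the TLDR newsletter, the PyTorch Korea User Group shares AI news translated with DeepL.

If you would like to share more AI news and information and grow together, please visit the PyTorch Korea User Group!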

Headlines & Launches

South Korea’s Naver to target foreign governments with latest ChatGPT-like AI model (5 minute read)

South Korean internet giant Naver plans to offer foreign governments tailored versions of its latest ChatGPT-like AI model, with a focus on localization to address political and cultural needs. The company is also set to launch HyperClova X, an upgraded model intended to strengthen its domestic services.

Statement On AI Risk (1-minute read)

A statement from leading AI researchers imploring society to treat the risk of extinction from AI as seriously as pandemics and nuclear war.

ChatGPT ‘hallucinates.’ Some researchers worry it isn’t fixable. (9 minute read)

MIT researchers have proposed a novel "society of minds" approach to address AI "hallucinations," in which large language models like OpenAI's GPT-4 generate false or inaccurate information. The method has multiple chatbots give different responses to the same question and then debate until they reach a consensus. Despite the creativity of such solutions, hallucinations clearly remain a significant challenge in AI technology. They affect the reliability of AI-generated content and raise concerns about potential misuse in sensitive fields like medicine, law, and cybersecurity.
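
The debate loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the researchers' implementation: each "agent" is a plain Python callable standing in for a chatbot, and the names `debate`, `stubborn`, and `follower` are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Optional

# An "agent" is any callable that takes the question plus the other agents'
# latest answers and returns its own (possibly revised) answer. In practice
# it would wrap a chatbot API call; that wiring is assumed, not shown.
Agent = Callable[[str, List[str]], str]


def debate(question: str, agents: List[Agent], max_rounds: int = 3) -> Optional[str]:
    """Run debate rounds until all agents agree or the round budget runs out."""
    # Round 0: every agent answers independently.
    answers = [agent(question, []) for agent in agents]
    for _ in range(max_rounds):
        if len(set(answers)) == 1:
            return answers[0]  # consensus reached
        # Each agent sees the other agents' answers and may revise its own.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    # Check once more after the final round of revisions.
    return answers[0] if len(set(answers)) == 1 else None


def stubborn(answer: str) -> Agent:
    """Toy agent that never changes its answer."""
    return lambda question, others: answer


def follower(initial: str) -> Agent:
    """Toy agent that adopts the most common answer among the others."""
    def agent(question: str, others: List[str]) -> str:
        if not others:
            return initial
        return Counter(others).most_common(1)[0][0]
    return agent


if __name__ == "__main__":
    bots = [stubborn("4"), stubborn("4"), follower("5")]
    print(debate("What is 2 + 2?", bots))  # prints 4
```

Returning `None` when no consensus emerges keeps the disagreement visible to the caller instead of silently picking one answer.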

Research & Innovation

Large language models are not fair evaluators (GitHub Repo)

Comparing the performance of multiple models against one another is challenging. Traditional automated metrics are not robust, and human evaluation does not scale. Recently, people have been turning to GPT-4 as a proxy for human preference. After all, it was trained with RLHF. However, doing this naively results in miscalibration. This work proposes a few simple fixes that align GPT-4 evaluations more closely with those of humans.
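
The summary does not spell out the fixes. One frequently discussed source of miscalibration in LLM judges is position bias, where the answer shown first tends to score higher. The sketch below shows one way to counter it by scoring both orderings and averaging. It is an illustration only and not necessarily the repository's exact method; the `Judge` signature, `balanced_scores`, and the toy `biased_judge` are all hypothetical.

```python
from typing import Callable, Tuple

# A judge takes (question, first_answer, second_answer) and returns
# (score_for_first, score_for_second). In practice this would be a GPT-4
# call; the signature here is an assumption made for illustration.
Judge = Callable[[str, str, str], Tuple[float, float]]


def balanced_scores(judge: Judge, question: str,
                    answer_a: str, answer_b: str) -> Tuple[float, float]:
    """Score both orderings and average, so position effects cancel out."""
    a_first, b_second = judge(question, answer_a, answer_b)
    b_first, a_second = judge(question, answer_b, answer_a)
    return (a_first + a_second) / 2, (b_second + b_first) / 2


# Toy judge with a built-in bias: whichever answer is shown first gets +1.
QUALITY = {"A": 7.0, "B": 7.5}


def biased_judge(question: str, first: str, second: str) -> Tuple[float, float]:
    return QUALITY[first] + 1.0, QUALITY[second]


if __name__ == "__main__":
    # Naive single pass: A is shown first and wins purely because of position.
    print(biased_judge("q", "A", "B"))
    # Balanced pass: the position bonus is shared equally, so B comes out ahead.
    print(balanced_scores(biased_judge, "q", "A", "B"))
```

Averaging over both orderings doubles the number of judge calls, which is the price of removing the ordering effect.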

Gorilla: Enhancing the Power of LLMs in Generating API Calls (3 minute read)

Gorilla, a fine-tuned model based on large language models (LLMs), significantly outperforms existing models like GPT-4 at writing API calls. It helps language models use tools more effectively. Combined with a document retriever, Gorilla adapts well to updated documentation, minimizes inaccurate generations, and keeps its outputs reliable, as demonstrated on a new comprehensive dataset called APIBench.
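
To illustrate why pairing a model with a document retriever helps it track documentation changes, here is a minimal retrieval-then-prompt sketch. It does not use Gorilla's code or APIBench. The keyword-overlap retriever, the prompt wording, and the API names in `API_DOCS` are all hypothetical stand-ins.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query: str, docs: Dict[str, str], k: int = 1) -> List[str]:
    """Return the names of the k docs sharing the most words with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        docs,
        key=lambda name: len(query_tokens & tokenize(docs[name])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: Dict[str, str], k: int = 1) -> str:
    """Put the retrieved documentation in front of the task description."""
    context = "\n".join(f"{name}: {docs[name]}" for name in retrieve(query, docs, k))
    return f"Use this API documentation:\n{context}\n\nTask: {query}\nAPI call:"


# Hypothetical documentation store; editing it changes the prompt at once,
# with no retraining of the underlying model.
API_DOCS = {
    "image.classify": "classify an image into labels",
    "text.translate": "translate text between languages",
}

if __name__ == "__main__":
    print(build_prompt("translate this text to French", API_DOCS))
```

Because the documentation is looked up at prompt time, an updated entry in the store shows up in the very next prompt.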

Enhancing Text-to-3D Generation with ProlificDreamer (GitHub Repo)

ProlificDreamer is a new method for improving text-to-3D generation. By treating the 3D parameter as a random variable, it addresses problems such as over-saturation and low diversity found in the earlier Score Distillation Sampling (SDS) approach. This Variational Score Distillation (VSD) framework can produce higher-quality and more diverse results, allowing the creation of highly detailed and photo-realistic 3D images from text, including complex effects such as smoke and drops.
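
For readers who want the contrast in symbols, the two gradients are commonly written as below. The notation is an assumption borrowed from the score distillation literature, not from this summary: $g(\theta, c)$ is the image rendered from 3D parameters $\theta$ at camera pose $c$, $x_t = \alpha_t\, g(\theta, c) + \sigma_t\, \epsilon$ is its noised version, $y$ is the text prompt, and $\omega(t)$ is a time-dependent weight.

```latex
% SDS: compare the pretrained model's noise prediction with the true noise.
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon,c}\!\left[
      \omega(t)\,\bigl(\epsilon_{\mathrm{pretrain}}(x_t, t, y) - \epsilon\bigr)\,
      \frac{\partial g(\theta, c)}{\partial \theta}
    \right]

% VSD: replace the true noise with a second noise predictor \epsilon_\phi
% that is fitted to renderings of the current 3D samples.
\nabla_\theta \mathcal{L}_{\mathrm{VSD}}(\theta)
  = \mathbb{E}_{t,\epsilon,c}\!\left[
      \omega(t)\,\bigl(\epsilon_{\mathrm{pretrain}}(x_t, t, y)
        - \epsilon_\phi(x_t, t, c, y)\bigr)\,
      \frac{\partial g(\theta, c)}{\partial \theta}
    \right]
```

The only difference between the two expressions is the subtracted term: SDS subtracts the sampled noise $\epsilon$, while VSD subtracts a learned estimate that depends on the current distribution of 3D samples.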

Engineering & Resources

ControlVideo: Extending ControlNet to Videos (18 minute read)

ControlVideo is a new way to edit videos based on written instructions, making the video more faithful to and consistent with the text while maintaining the original video's structure. It achieves this by combining advanced techniques. ControlVideo improves upon previous methods in terms of accuracy and realism. It also lets users choose flexibly how much of the original video's information to use.

Boosting Language Model Performance with Self-Note Taking Method (6 minute read)

Large language models sometimes have trouble remembering things and thinking through multiple steps. A new method lets these models take notes on their own, helping them remember better and solve more complex problems.

Generate images with a language model (7 minute read)

Multimodal training across more than one data type has been gaining popularity recently. Usually, the model architecture requires complicated and somewhat brittle solutions. GILL is a cool new project that interleaves text tokens with image tokens to allow for truly multimodal input and output. The reported results are compelling, although there are still limitations.

Miscellaneous

Corporate VCs Ride The AI Wave (2 minute read)

Corporate VC investors, including Salesforce and Workday, are increasingly investing in AI startups.

StabilityAI’s Letter To The Senate (12 minute read)

A letter from StabilityAI to the U.S. Senate advocating for the promotion of open-source AI.

ChatGPT Shared Links (3 minute read)

Shared links is a new feature in ChatGPT that allows users to generate a unique URL for each conversation. Links can be shared with friends, colleagues, and collaborators. Shared links give users a new way to share their ChatGPT conversations, replacing the old and burdensome method of sharing screenshots.

Quick Links

Edtech Groups Insist AI Is A Friend (4 minute read)

Edtech groups such as Duolingo insist that AI will enhance their business, despite the possibility that cheap platforms could undercut it.

Threestudio (GitHub Repo)

Threestudio is a unified framework for creating 3D content from text prompts, single images, and few-shot images by lifting 2D text-to-image generation models.

Freepik AI image generator (Product Launch)

The AI Image Generator is a new digital art tool that lets you create unique images from your words. All you have to do is type text into the search bar, and you’ll get stunning images in one of the available art styles.

CapeChat (Product Launch)

CapeChat automatically encrypts your documents and redacts any sensitive data. It’s powered by the ChatGPT API, so you get the best language model while preserving your privacy.