[TLDR] Today's AI News, 2023-10-02: Llama 2 Long 🦙, Creating Videos from Sound 🎵, Vision Transformers 👓

9bow · October 3, 2023, 2:44 AM

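The PyTorch Korea User Group shares AI news from the TLDR newsletter, with the newsletter's permission, translated using DeepL.
Want to share more AI news and information and grow together? Visit the PyTorch Korean community now!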

Headlines & Launches

Meta Quietly Unveils Llama 2 Long (3 minute read)

Meta has released Llama 2 Long, a new AI model that outperforms GPT-3.5 Turbo and Claude 2 on long user prompts.

Shopify AI for product photo creation (3 minute read)

Building an online store requires lots of high-quality product and marketing images. This is an early demo of a background replacement tool, built by the Shopify AI team with Stable Diffusion XL, that makes it easy to turn your existing product images into something new.

AI and jobs (1 minute read)

AI is good at individual tasks, but much less so at doing whole jobs. Sam Altman argues that AI will enable humans to focus on different work that AI can't do.

Research & Innovation

Ollama: Local LLMs made easy (3 minute read)

With all the new models coming out, it can be challenging to find a way to pull and run them. It is also hard to remix and remake them easily. Ollama, a neat little tool, makes it easy to test the latest models locally.
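As a rough illustration of what "pull and run" can look like from code, here is a minimal sketch that talks to a locally running Ollama server over its HTTP API. It assumes the server is listening on its default local address and that a model named `llama2` has already been pulled; the helper names `build_generate_request` and `generate` are made up for this example and are not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming text generation request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like `generate("llama2", "Why is the sky blue?")` should return the model's reply as a string. Swapping the model name is all it takes to try a different model that has been pulled.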

TempoTokens: Creating Videos from Sound (4 minute read)

Researchers have developed a method to create videos that closely match the sound they are paired with, both in overall theme and in moment-by-moment details.

SapientML (GitHub Repo)

SapientML is an AutoML technology that learns from a corpus of existing datasets and their human-written pipelines, and can efficiently generate a high-quality pipeline for a predictive task on a new dataset.

Engineering & Resources

Vision Transformers need registers (30 minute read)

One of the coolest and simplest vision papers in recent weeks. Vision Transformers repurpose low-information ("useless") patch tokens as places to store global information, which makes attention maps hard to interpret. Adding a few extra register ([reg]) tokens to the input sequence gives the model a dedicated place for that information, so it no longer stores it in patch tokens.

(Read more: [2023/09/25 ~ 10/01] Top ML Papers of the Week)
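The register idea is mostly token bookkeeping, which the following minimal sketch tries to show. It only illustrates the input and output handling: extra tokens are attached to the sequence before the encoder and their outputs are thrown away afterwards. The function names, the token ordering, and the use of plain Python lists are assumptions for illustration, and in a real model the register tokens would be learned parameters.

```python
from typing import List

Token = List[float]  # one embedding vector


def add_registers(cls_token: Token, patch_tokens: List[Token],
                  register_tokens: List[Token]) -> List[Token]:
    """Build the encoder input as [CLS] + patch tokens + register tokens."""
    return [cls_token] + list(patch_tokens) + list(register_tokens)


def drop_registers(output_tokens: List[Token], num_registers: int) -> List[Token]:
    """Keep the [CLS] and patch outputs; discard the register outputs."""
    if num_registers == 0:
        return list(output_tokens)
    return list(output_tokens[:-num_registers])
```

The encoder itself is unchanged: it simply sees a slightly longer sequence, and downstream code only ever reads the [CLS] and patch outputs.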

Transformer-VQ: Efficient Linear-Time Attention (17 minute read)

This study presents Transformer-VQ, a new transformer design that computes attention in linear time thanks to vector-quantized keys and a caching mechanism.
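To make "vector-quantized keys" a little more concrete, here is a minimal sketch of plain nearest-neighbor vector quantization: each key vector is replaced by the closest entry in a small codebook. This shows only the quantization step, not the paper's attention or caching scheme, and the function name and toy codebook are invented for the example.

```python
from typing import List, Sequence, Tuple


def quantize(vec: Sequence[float],
             codebook: List[Sequence[float]]) -> Tuple[int, Sequence[float]]:
    """Return (index, codeword) of the codebook entry nearest to vec.

    Nearness is measured by squared Euclidean distance.
    """
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
    return idx, codebook[idx]


# Toy codebook with two codewords (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0]]
keys = [[0.1, 0.2], [0.9, 0.8], [0.7, 1.2]]
indices = [quantize(k, codebook)[0] for k in keys]
```

The intuition is that once many keys collapse onto a few shared codewords, a query's score against each codeword only needs to be computed once and can be reused for every key mapped to it.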

FLIP: Identifying Real vs Fake Faces (2 minute read)

This study uses vision- and language-based tools to improve how systems distinguish real faces from fake ones.

Miscellaneous

Perhaps AI Is Modern Alchemy (5 minute read)

ChatGPT and other AI technologies, while criticized as "unscientific," can be viewed as an "aspiring science," much as alchemy preceded chemistry.

Dataiku Unveils LLM Mesh and Announces LLM Mesh Launch Partners Snowflake, Pinecone, and AI21 Labs (5 minute read)

Dataiku has unveiled the LLM Mesh, addressing the critical need for an effective, scalable, and secure platform for integrating LLMs in the enterprise. Because the LLM Mesh sits between LLM service providers and end-user applications, companies can choose the most cost-effective models for their needs and ensure the safety of their data and responses.

Inside mind-reading AI (24 minute video)

More and more startups are appearing that read and manipulate our mental states to help us relax, learn, and reduce pain. Neuralink, Mendi, and FocusCalm are just a few of them. These companies gather data from users once they are given access to their brains. Who owns this data?

Quick Links

BulkCorrector (Product)

Never deal with ChatGPT's character limit again: BulkCorrector segments your text into manageable pieces for bulk corrections.
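How BulkCorrector does its segmentation is not described here, but the general idea of splitting text to fit a character limit can be sketched as follows. This is a hypothetical approach, not the product's algorithm: it packs whole sentences into chunks of at most `max_chars` characters and hard-splits any single sentence that is too long. The function name and the sentence-splitting rule are assumptions.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent for correction separately and the results joined back together in order.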

Getty Made An AI Generator Trained On Its Licensed Images (2 minute read)

Getty Images, in collaboration with Nvidia, has launched "Generative AI by Getty Images," a tool that lets users generate images with a model trained on Getty's extensive library of licensed photos.

Audio AI subreddit (Subreddit)

A subreddit about AI-driven music, speech, audio production, and all other emerging AI audio technologies.