[TLDR] Today's AI News, 2023-09-29: Google PaLM + Chroma 🤝, How AI Will Impact SaaS 💻, Meta's Text-to-Image Model 🖼️

9bow · September 29, 2023, 9:56 PM

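The PyTorch Korea User Group shares AI news from the TLDR newsletter, with the newsletter's permission, translated into Korean using DeepL.
Want to share more AI news and information and grow together? Visit the PyTorch Korean community today!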

Headlines & Launches

Chroma x Google for building on PaLM (2 minute read)

Chroma is a leading vector database for storing embeddings for AI applications. It has partnered with Google to support building applications on top of PaLM, Google's flagship model.
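To make "storing embeddings" concrete, here is a minimal sketch of what a vector database does at its core: keep (id, vector) pairs and return the stored vectors closest to a query, here by cosine similarity. This is a toy in-memory illustration and not Chroma's API; the class `TinyVectorStore`, its methods, and the 3-dimensional example vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical toy store: a dict of id -> embedding, searched by brute force."""

    def __init__(self):
        self._vectors = {}

    def add(self, doc_id, embedding):
        self._vectors[doc_id] = list(embedding)

    def query(self, embedding, k=1):
        """Return the ids of the k stored vectors most similar to `embedding`."""
        ranked = sorted(
            self._vectors,
            key=lambda doc_id: cosine_similarity(self._vectors[doc_id], embedding),
            reverse=True,
        )
        return ranked[:k]


# Made-up embeddings; a real application would get these from an embedding model.
store = TinyVectorStore()
store.add("cats", [0.9, 0.1, 0.0])
store.add("dogs", [0.8, 0.3, 0.1])
store.add("stocks", [0.0, 0.2, 0.9])

nearest = store.query([1.0, 0.0, 0.0], k=2)
print(nearest)
```

A production vector database replaces the brute-force sort with an approximate nearest-neighbor index and adds persistence and metadata filtering, but the add-then-query shape is the same.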

How Is AI Going to Impact The SaaS Landscape Over The Coming Years with a16z GP Kristina Shen (7 minute read)

We all know SaaS is going to be impacted by AI, but the question is what the next wave will look like. This in-depth discussion with Kristina Shen covers why it's more about insights than information, and how to unlock the real power of AI for SaaS.

Autoflows: New Forethought tool lets you build workflows with natural language (2 minute read)

Forethought has unveiled "Autoflows", a new AI-driven system for customer service tasks that goes beyond simple Q&A. Autoflows uses natural language prompts to determine steps and tasks across systems, which sets it apart from traditional no-code workflows. Powered by SupportGPT, which is based on OpenAI models, the system blends automation with human oversight.

Research & Innovation

SigLIP checkpoints released by Google (GitHub Repo)

Joint embedding models map two data types into a single shared space. CLIP is a popular one for images and text. Recently, Google researchers proposed SigLIP, a sigmoid-based CLIP model built on the Vision Transformer, which performed very well. They have now released more information about the models, including checkpoints, and updated their code and paper.
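The "sigmoid" part can be illustrated with a small sketch. As we understand the SigLIP formulation, every image-text pair in a batch is treated as an independent binary classification problem (match or no match) instead of being normalized with a softmax across the batch. The code below is a plain-Python illustration under that assumption, not the released implementation; the function names, the toy 2-dimensional embeddings, and the fixed `temperature` and `bias` values are illustrative (in training, both would be learned).

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Sum the binary log-loss over all image-text pairs, divided by batch size.

    Pair (i, j) is labeled +1 when i == j (a matching pair) and -1 otherwise.
    Embeddings are assumed to be L2-normalized already.
    """
    n = len(image_embs)
    total = 0.0
    for i, img in enumerate(image_embs):
        for j, txt in enumerate(text_embs):
            label = 1.0 if i == j else -1.0
            logit = temperature * dot(img, txt) + bias
            total -= log_sigmoid(label * logit)
    return total / n


images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]      # text i matches image i
mismatched_texts = [[0.0, 1.0], [1.0, 0.0]]   # texts swapped

aligned_loss = pairwise_sigmoid_loss(images, aligned_texts)
mismatched_loss = pairwise_sigmoid_loss(images, mismatched_texts)
print(aligned_loss < mismatched_loss)
```

Because each pair contributes its own term, the loss does not need a normalization over the whole batch, which is the property usually cited as the practical advantage of this formulation.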

InternLM-XComposer: The Future of Image-Text Understanding and Creation (GitHub Repo)

InternLM-XComposer is a cutting-edge tool that can create articles in which the text and images fit together.

Engineering & Resources

Emu: Meta's text-to-image model (6 minute read)

Pre-trained on 1.1B image-text pairs and fine-tuned on just a few thousand highly curated images, Emu outperforms SDXL in user preference surveys. It serves as the backbone for much of Meta's new AI assistant effort.

VectorQuantized-VAE: Simplifying VQ-VAE (30 minute read)

VectorQuantized-VAEs are typically the state of the art when trying to learn a specific discrete representation (e.g., tokens or codes). However, they are usually complex and fragile. This new paper proposes a simple quantization scheme that eliminates codebook collapse and removes the need for complicated machinery such as commitment losses, codebook reseeding, code splitting, and entropy penalties.
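For context, the classic vector-quantization step behind these models can be sketched as a nearest-neighbor lookup in a codebook: each continuous encoder output is replaced by the index of its closest code. This shows standard VQ only, not the new paper's scheme; the function name, the codebook values, and the sample encoder outputs are hypothetical.

```python
def quantize(vector, codebook):
    """Return (index, code) of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))

    index = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))
    return index, codebook[index]


# A made-up codebook of four 2-dimensional codes.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Made-up continuous encoder outputs to be turned into discrete tokens.
encoder_outputs = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]
tokens = [quantize(z, codebook)[0] for z in encoder_outputs]
print(tokens)
```

Codebook collapse is the failure mode where training ends up selecting only a few of these entries, so most of the codebook goes unused; the auxiliary losses and reseeding tricks listed above exist largely to counter it.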

JAM: Jointly training large multimodal models (23 minute read)

Models are usually trained separately on specific tasks (e.g., language vs. image generation). The newly proposed joint autoregressive mixture (JAM) algorithm combines disparate models using a clever interleaved cross-attention and gentle fine-tuning, achieving dominant performance on many multimodal tasks.
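Cross-attention, the mechanism named above, lets the hidden states of one model read from those of another. The sketch below is a single-head version in plain Python with no learned projection weights, meant only to show the data flow; it is not JAM's actual layer, and all names and values are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """For each query vector, return a softmax-weighted average of `values`.

    Queries come from one model; keys and values come from the other.
    """
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)]
        )
    return outputs


text_states = [[1.0, 0.0], [0.0, 1.0]]    # stand-in hidden states from a text model
image_states = [[4.0, 0.0], [0.0, 4.0]]   # stand-in hidden states from an image model

# Text positions attend over image positions.
mixed = cross_attention(text_states, image_states, image_states)
print(mixed)
```

Each output row is a blend of the image states, weighted toward whichever image state best matches the corresponding text state. Running the same operation in the other direction (image queries over text keys and values) gives the second half of a bidirectional exchange.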

Miscellaneous

A Leader's Guide to Generative AI in the Workplace (PDF) (20 minute read)

For technical leaders, embracing generative AI in the workplace brings a wealth of opportunity to drive productivity and transformation. It also means leaders need to consider how to design meaningful career pathways for team members. Workera's guide shows leaders how they can do it.

White House could force cloud companies to disclose AI customers (4 minute read)

The White House may require cloud companies to report large-scale computing purchases, a measure aimed at potential foreign AI threats. Critics cite surveillance concerns and changing compute costs. Companies such as Microsoft and OpenAI support similar oversight measures.

Anthropic partners with BCG (2 minute read)

Anthropic has partnered with Boston Consulting Group (BCG) to offer its AI assistant, Claude, to BCG customers globally. The collaboration focuses on responsible AI deployment, with use cases including market research, fraud detection, and business analysis. Both organizations aim to establish a new standard for ethically deployed enterprise AI.

Quick Links

Metaphor (Product)

Metaphor's API lets you connect your LLMs to the internet, enabling powerful search and research capabilities. In a few lines of code, you can get high-quality search results as well as instant HTML content.

Google Adds A Switch For Publishers To Opt Out Of Becoming AI Training Data (1 minute read)

Google introduced a tool, Google-Extended, that lets website publishers prevent their data from being used in Google's AI training while still appearing in Google Search results.
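In practice, Google-Extended is used as a user-agent token in a site's robots.txt file. The sketch below checks a sample rule set with Python's standard `urllib.robotparser`; the rules and the example.com URL are illustrative, and the parser only shows how a compliant crawler would read the file, not how Google enforces it.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: opt out of Google-Extended, leave other crawlers alone.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article.html"
ai_training_allowed = parser.can_fetch("Google-Extended", url)
search_allowed = parser.can_fetch("Googlebot", url)
print(ai_training_allowed, search_allowed)
```

The first group blocks only the Google-Extended token, so the ordinary search crawler still falls through to the catch-all group and remains allowed.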

Docera (Product)

An AI-powered tool that simplifies document creation for personal and professional use.