[2024/07/15 ~ 07/21] Top ML Papers of the Week

9bow · July 22, 2024, 1:02 AM

PyTorchKR

One major trend among this week's selected papers is that research on large language models (LLMs) dominates. Titles such as "Improving Legibility of LLM Outputs", "SpreadsheetLLM", "A Survey of Prompt Engineering Methods in LLMs", "Does Refusal Training in LLMs Generalize to the Past Tense?", and "Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?" show that researchers are exploring many different aspects of LLMs. LLMs have become a central area of AI research and applications, and this week's work concentrates on making these models more efficient, better at understanding their inputs, and more general.
Several papers also propose new methods for improving or understanding LLM performance. "Distilling System 2 into System 1" folds intermediate reasoning into direct generation to cut inference cost, and "Weak-to-Strong Reasoning" elicits reasoning ability in a strong model using only weak supervision. Both suggest that reasoning and knowledge representation are receiving growing emphasis in AI. "Beyond Euclid", by contrast, is a more theoretical review of non-Euclidean machine learning (geometry, topology, and algebra) rather than an LLM paper; it widens the range of data types that machine learning methods can handle.
In summary, this week's papers focus on exploring and extending the performance, efficiency, and applicability of LLMs. LLMs have become a central axis of AI research, with active work on both methodology and theory. As this research continues to grow, we can expect more innovative and practical advances in the field.

Prover-Verifier Games improve legibility of LLM outputs

Paper introduction

Iteratively trains small verifiers to predict solution correctness, helpful provers to produce correct solutions that the verifier accepts, and sneaky provers to produce incorrect solutions that fool the verifier. This process helps train models whose text is correct and easy to check for both humans and AI systems, which leads to more trustworthy systems.

Abstract

One way to increase confidence in the outputs of Large Language Models (LLMs) is to support them with reasoning that is clear and easy to check, a property we call legibility. We study legibility in the context of solving grade-school math problems and show that optimizing chain-of-thought solutions only for answer correctness can make them less legible. To mitigate the loss in legibility, we propose a training algorithm inspired by the Prover-Verifier Game from Anil et al. (2021). Our algorithm iteratively trains small verifiers to predict solution correctness, "helpful" provers to produce correct solutions that the verifier accepts, and "sneaky" provers to produce incorrect solutions that fool the verifier. We find that the helpful prover's accuracy and the verifier's robustness to adversarial attacks increase over the course of training. Furthermore, we show that legibility training transfers to time-constrained humans tasked with verifying solution correctness. Over the course of LLM training, human accuracy increases when checking the helpful prover's solutions and decreases when checking the sneaky prover's solutions. Hence, training for checkability by small verifiers is a plausible technique for increasing output legibility. Our results suggest legibility training against small verifiers as a practical avenue for increasing legibility of large LLMs to humans, and thus could help with alignment of superhuman models.
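
To make the three roles concrete, here is a toy sketch of how rewards could be wired up. The reward rule, the `Solution` fields, and the function names are assumptions made for illustration; the abstract does not specify the actual reward used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Solution:
    is_correct: bool       # ground-truth correctness of the final answer
    verifier_score: float  # verifier's estimated probability of correctness, in [0, 1]


def prover_reward(solution: Solution, role: str) -> float:
    """Hypothetical reward: a prover earns the verifier's score only when the
    solution's correctness matches its role (helpful -> correct,
    sneaky -> incorrect). Otherwise it earns nothing."""
    if role not in ("helpful", "sneaky"):
        raise ValueError(f"unknown role: {role}")
    role_aligned = solution.is_correct if role == "helpful" else not solution.is_correct
    return solution.verifier_score if role_aligned else 0.0


def verifier_loss(solution: Solution) -> float:
    """Binary cross-entropy between the verifier's score and the true label."""
    p = min(max(solution.verifier_score, 1e-7), 1 - 1e-7)  # clamp to avoid log(0)
    return -math.log(p) if solution.is_correct else -math.log(1 - p)
```

Under this toy rule, the sneaky prover is rewarded most when the verifier is confidently wrong, while the verifier's loss penalizes exactly those cases, which is the adversarial pressure the abstract describes.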

Paper link

Further reading

https://x.com/OpenAI/status/1813623470452064432

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper introduction

Presents an efficient encoding method to optimize an LLM's understanding and reasoning capability on spreadsheets; develops a sheet compressor consisting of structural-anchor-based compression, inverse index translation, and data-format-aware aggregation modules to efficiently compress and encode spreadsheets; in GPT-4's in-context learning, it improves performance in spreadsheet table detection by 25.6%.

Abstract

Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs' powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach was limited by LLMs' token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance on the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT4's in-context learning setting. Moreover, a fine-tuned LLM with SheetCompressor has an average compression ratio of 25 times, yet achieves a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream tasks of spreadsheet understanding and validate it in a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks.
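
As a rough illustration of why the vanilla serialization is token-hungry and what an inverted index buys, the sketch below encodes a tiny sheet both ways. The separator characters, the function names, and the choice to drop empty cells are assumptions for illustration, and cell formats are omitted for brevity; this is one plausible reading of "inverse index translation", not the paper's exact encoding.

```python
from collections import defaultdict


def vanilla_serialize(cells: dict[str, str]) -> str:
    """Emit one 'address,value' entry per cell, joined with '|'."""
    return "|".join(f"{addr},{value}" for addr, value in cells.items())


def inverted_index(cells: dict[str, str]) -> dict[str, list[str]]:
    """Map each distinct non-empty value to the addresses that hold it,
    so repeated values are written once and empty cells cost nothing."""
    index: defaultdict[str, list[str]] = defaultdict(list)
    for addr, value in cells.items():
        if value != "":
            index[value].append(addr)
    return dict(index)


sheet = {
    "A1": "Year", "B1": "Sales", "C1": "",
    "A2": "2023", "B2": "100",
    "A3": "2024", "B3": "100",
}
```

Here `vanilla_serialize(sheet)` spends an entry on the empty cell `C1` and writes the value `100` twice, while `inverted_index(sheet)` stores `100` once with both addresses.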

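Paper link

Further reading

SpreadsheetLLM: A study on encoding spreadsheets (Excel) for LLMs (feat. Microsoft) (category: Reading material & information sharing)

Introduction to SpreadsheetLLM [Figure: SpreadsheetLLM pipeline] Spreadsheets are widely used for data management and analysis on platforms such as Microsoft Excel and Google Sheets. They often contain vast two-dimensional grids, which makes it hard for large language models (LLMs) to understand and process them effectively. In particular, data that includes cell addresses, values, and formats can exceed an LLM's token limit. To address this, the authors propose a new framework called SpreadsheetLLM, an encoding method that helps LLMs understand and reason over spreadsheets efficiently. The existing serialization approach encodes the data sequentially, including cell addresses, values, and formats, but because of LLM token limits it struggles to handle large spreadsheets…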

https://x.com/_akhaliq/status/1812674543963578794

Context Embeddings for Efficient Answer Generation in RAG

Paper introduction

Proposes an effective context compression method to reduce long context and speed up generation time in RAG systems; the long contexts are compressed into a small number of context embeddings, which allows different compression rates that trade off decoding time for generation quality; reduces inference time by up to 5.69× and GFLOPs by up to 22× while maintaining high performance.

Abstract

Retrieval-Augmented Generation (RAG) allows overcoming the limited knowledge of LLMs by extending the input with external information. As a consequence, the contextual inputs to the model become much longer, which slows down decoding and directly increases the time a user has to wait for an answer. We address this challenge by presenting COCOM, an effective context compression method that reduces long contexts to only a handful of Context Embeddings, speeding up the generation time by a large margin. Our method allows for different compression rates, trading off decoding time for answer quality. Compared to earlier methods, COCOM allows for handling multiple contexts more effectively, significantly reducing decoding time for long inputs. Our method demonstrates a speed-up of up to 5.69× while achieving higher performance compared to existing efficient context compression methods.
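
The effect of a compression rate on decoder input length can be sketched with simple arithmetic. This assumes that "compression rate" means the ratio of context tokens to context embeddings and that each retrieved context is compressed separately; both the function names and that interpretation are assumptions, not details confirmed by the abstract.

```python
import math


def num_context_embeddings(context_tokens: int, compression_rate: int) -> int:
    """Embeddings left after compressing one context by the given rate."""
    if context_tokens < 0 or compression_rate < 1:
        raise ValueError("context_tokens must be >= 0 and compression_rate >= 1")
    return math.ceil(context_tokens / compression_rate)


def decoder_input_length(context_lengths: list[int],
                         question_tokens: int,
                         compression_rate: int) -> int:
    """Total positions the decoder attends to: the question tokens plus the
    compressed embeddings of every retrieved context."""
    compressed = sum(num_context_embeddings(n, compression_rate)
                     for n in context_lengths)
    return question_tokens + compressed
```

For three contexts of 512, 512, and 500 tokens and a 20-token question, a rate of 1 (no compression) gives 1544 input positions, while a rate of 16 gives 116. A higher rate shortens the input further, at the cost in answer quality that the abstract mentions.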

Paper link

Further reading

https://x.com/omarsar0/status/1812937765769867561

Weak-to-Strong Reasoning

Paper introduction

Demonstrates the use of weak supervision to elicit strong reasoning capabilities in LLMs without relying on human annotations or advanced models; reports that strong models can automatically refine their training data without explicitly being trained to do so; enables expanding a model's learning scope and scaling performance on reasoning.

Abstract

When large language models (LLMs) exceed human-level capabilities, it becomes increasingly challenging to provide full-scale and accurate supervision for these models. Weak-to-strong learning, which leverages a less capable model to unlock the latent abilities of a stronger model, proves valuable in this context. Yet, the efficacy of this approach for complex reasoning tasks is still untested. Furthermore, tackling reasoning tasks under the weak-to-strong setting currently lacks efficient methods to avoid blindly imitating the weak supervisor, including its errors. In this paper, we introduce a progressive learning framework that enables the strong model to autonomously refine its training data, without requiring input from either a more advanced model or human-annotated data. This framework begins with supervised fine-tuning on a selective small but high-quality dataset, followed by preference optimization on contrastive samples identified by the strong model itself. Extensive experiments on the GSM8K and MATH datasets demonstrate that our method significantly enhances the reasoning capabilities of Llama2-70b using three separate weak models. This method is further validated in a forward-looking experimental setup, where Llama3-8b-instruct effectively supervises Llama3-70b on the highly challenging OlympicArena dataset. This work paves the way for a more scalable and sophisticated strategy to enhance AI reasoning powers. All relevant code and resources are available at https://github.com/GAIR-NLP/weak-to-strong-reasoning.
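
One way to picture the "selective" first stage is to keep only the weak labels that the strong model independently reproduces, and to set the rest aside as candidates for contrastive pairs. The agreement rule and the function name below are assumptions for illustration; the abstract does not say how the selective dataset or the contrastive samples are actually chosen.

```python
def split_by_agreement(weak_answers: dict[str, str],
                       strong_answers: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (agreed, disputed) question ids.

    Hypothetical selection rule: questions where the weak and strong models
    give the same final answer go to the supervised fine-tuning set; all
    other questions are candidates for preferred-vs-rejected pairs.
    """
    agreed: list[str] = []
    disputed: list[str] = []
    for qid, weak in weak_answers.items():
        if strong_answers.get(qid) == weak:
            agreed.append(qid)
        else:
            disputed.append(qid)
    return agreed, disputed
```

The point of a filter like this is to avoid fine-tuning the strong model on the weak supervisor's mistakes, which is the "blind imitation" problem the abstract raises.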

Paper link

Further reading

https://github.com/GAIR-NLP/weak-to-strong-reasoning

https://x.com/omarsar0/status/1814130275485704597

A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks

Paper introduction

A collection of prompt engineering methods for a variety of NLP tasks.

Abstract

Large language models (LLMs) have shown remarkable performance on many different Natural Language Processing (NLP) tasks. Prompt engineering plays a key role in adding more to the already existing abilities of LLMs to achieve significant performance gains on various NLP tasks. Prompt engineering requires composing natural language instructions called prompts to elicit knowledge from LLMs in a structured way. Unlike previous state-of-the-art (SoTA) models, prompt engineering does not require extensive parameter re-training or fine-tuning based on the given NLP task and thus solely operates on the embedded knowledge of LLMs. Additionally, LLM enthusiasts can intelligently extract LLMs' knowledge through a basic natural language conversational exchange or prompt engineering, allowing more and more people, even without a deep mathematical machine learning background, to experiment with LLMs. With prompt engineering gaining popularity in the last two years, researchers have come up with numerous engineering techniques around designing prompts to improve the accuracy of information extraction from LLMs. In this paper, we summarize different prompting techniques and group them based on the NLP tasks that they have been used for. We further highlight the performance of these prompting strategies on various datasets belonging to each NLP task, talk about the corresponding LLMs used, present a taxonomy diagram, and discuss the possible SoTA for specific datasets. In total, we read and present a survey of 44 research papers which discuss 39 different prompting methods on 29 different NLP tasks, most of which were published in the last two years.

Paper link

Further reading

https://x.com/omarsar0/status/1814135222562165104

Does Refusal Training in LLMs Generalize to the Past Tense?

Paper introduction

Finds that simply reformulating an LLM request into the past tense can jailbreak many state-of-the-art LLMs; for example, "How to make a Molotov cocktail?" can be rephrased as "How did people make a Molotov cocktail?"; finds that the success rate on GPT-4o rises from 1% with direct requests to 88% with 20 past-tense reformulation attempts; concludes that current alignment techniques may not always generalize as intended.

Abstract

Refusal training is widely used to prevent LLMs from generating harmful, undesirable, or illegal outputs. We reveal a curious generalization gap in the current refusal training approaches: simply reformulating a harmful request in the past tense (e.g., "How to make a Molotov cocktail?" to "How did people make a Molotov cocktail?") is often sufficient to jailbreak many state-of-the-art LLMs. We systematically evaluate this method on Llama-3 8B, GPT-3.5 Turbo, Gemma-2 9B, Phi-3-Mini, GPT-4o, and R2D2 models using GPT-3.5 Turbo as a reformulation model. For example, the success rate of this simple attack on GPT-4o increases from 1% using direct requests to 88% using 20 past tense reformulation attempts on harmful requests from JailbreakBench with GPT-4 as a jailbreak judge. Interestingly, we also find that reformulations in the future tense are less effective, suggesting that refusal guardrails tend to consider past historical questions more benign than hypothetical future questions. Moreover, our experiments on fine-tuning GPT-3.5 Turbo show that defending against past reformulations is feasible when past tense examples are explicitly included in the fine-tuning data. Overall, our findings highlight that the widely used alignment techniques (such as SFT, RLHF, and adversarial training) employed to align the studied models can be brittle and do not always generalize as intended. We provide code and jailbreak artifacts in the GitHub repository tml-epfl/llm-past-tense.
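
The 88% figure is reported over 20 reformulation attempts, so it helps to see how much repeated sampling alone can inflate a success rate. The sketch below assumes that an attack counts as successful if any one attempt succeeds and that attempts are independent with the same per-attempt probability; both are simplifying assumptions for illustration, not claims about the paper's evaluation.

```python
def success_within_attempts(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds,
    given a per-try success probability of `p_single`."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - p_single) ** attempts
```

Under these assumptions, a per-attempt success probability of only 10% already yields roughly an 88% chance of at least one success in 20 tries. That is why defenses need to be evaluated against repeated attempts rather than a single query.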

Paper link

Further reading

https://x.com/maksym_andr/status/1813608842699079750

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

Paper introduction

Proposes a framework (NeedleBench) of progressively challenging tasks to assess the long-context retrieval and reasoning capabilities of LLMs; also presents the Ancestral Trace Challenge, which increases the need for the complex logical reasoning that is common in real-world long-context tasks; the findings suggest that current LLMs struggle to handle reasoning tasks with complex logical relationships, even with texts shorter than 2K tokens.

Abstract

In evaluating the long-context capabilities of large language models (LLMs), identifying content relevant to a user's query from original long documents is a crucial prerequisite for any LLM to answer questions based on long text. We present NeedleBench, a framework consisting of a series of progressively more challenging tasks for assessing bilingual long-context capabilities, spanning multiple length intervals (4k, 8k, 32k, 128k, 200k, 1000k, and beyond) and different depth ranges, allowing the strategic insertion of critical data points in different text depth zones to rigorously test the retrieval and reasoning capabilities of models in diverse contexts. We use the NeedleBench framework to assess how well the leading open-source models can identify key information relevant to the question and apply that information to reasoning in bilingual long texts. Furthermore, we propose the Ancestral Trace Challenge (ATC) to mimic the complexity of logical reasoning challenges that are likely to be present in real-world long-context tasks, providing a simple method for evaluating LLMs in dealing with complex long-context situations. Our results suggest that current LLMs have significant room for improvement in practical long-context applications, as they struggle with the complexity of logical reasoning challenges that are likely to be present in real-world long-context tasks. All code and resources are available at OpenCompass: https://github.com/open-compass/opencompass.
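
The "insertion of critical data points in different text depth zones" can be pictured as placing a needle sentence at a relative depth inside a long filler document. The sketch below is a generic needle-in-a-haystack helper with hypothetical names; it is not the OpenCompass implementation.

```python
def insert_needle(haystack: list[str], needle: str, depth: float) -> list[str]:
    """Insert `needle` into a list of haystack chunks at a relative depth,
    where 0.0 places it at the very start and 1.0 at the very end."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in [0, 1]")
    position = int(depth * len(haystack))  # index of the chunk the needle precedes
    return haystack[:position] + [needle] + haystack[position:]
```

Sweeping `depth` over a grid and the haystack length over the intervals listed in the abstract gives the familiar length-by-depth evaluation matrix.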

Paper link

Further reading

https://github.com/open-compass/opencompass

https://x.com/omarsar0/status/1813581074624070109

Distilling System 2 into System 1

Paper introduction

Investigates self-supervised methods to distill high-quality outputs from System 2 techniques and then fine-tune System 1 to match the predictions of the System 2 technique without generating intermediate steps; distilling reasoning into System 1 in this way lowers inference cost.

Abstract

Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed, such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023), and Branch-Solve-Merge (Saha et al., 2023). In this work we investigate self-supervised methods to "compile" (distill) higher quality outputs from System 2 techniques back into LLM generations without intermediate reasoning token sequences, as this reasoning has been distilled into System 1. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that such System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.
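
A minimal sketch of how a distillation set might be built without labels: sample several System 2 answers per prompt, keep the prompt only when the answers mostly agree, and store the majority answer with no reasoning attached. The majority-vote filter, the `min_agreement` threshold, and the function name are assumptions for illustration; the abstract only says the methods are self-supervised.

```python
from collections import Counter


def build_distillation_set(samples: dict[str, list[str]],
                           min_agreement: float = 0.75) -> list[tuple[str, str]]:
    """Turn sampled System 2 final answers into (prompt, answer) training pairs.

    `samples` maps each prompt to the final answers of several System 2 runs,
    with intermediate reasoning already stripped. A prompt is kept only if its
    most common answer reaches the agreement threshold.
    """
    pairs: list[tuple[str, str]] = []
    for prompt, answers in samples.items():
        if not answers:
            continue  # nothing sampled for this prompt
        answer, count = Counter(answers).most_common(1)[0]
        if count / len(answers) >= min_agreement:
            pairs.append((prompt, answer))
    return pairs
```

Fine-tuning on pairs like these teaches the model to emit the final answer directly, which is where the inference-cost saving comes from.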

Paper link

Further reading

https://x.com/willccbb/status/1813012865454121179

Exploring Advanced Large Language Models with LLMsuite

Paper introduction

Shares practical tips for developing with and evaluating LLMs; solutions covered range from ReAct to RAG to parameter-efficient methods.

Abstract

This tutorial explores the advancements and challenges in the development of Large Language Models (LLMs) such as ChatGPT and Gemini. It addresses inherent limitations like temporal knowledge cutoffs, mathematical inaccuracies, and the generation of incorrect information, proposing solutions like Retrieval Augmented Generation (RAG), Program-Aided Language Models (PAL), and frameworks such as ReAct and LangChain. The integration of these techniques enhances LLM performance and reliability, especially in multi-step reasoning and complex task execution. The paper also covers fine-tuning strategies, including instruction fine-tuning, parameter-efficient methods like LoRA, and Reinforcement Learning from Human Feedback (RLHF) as well as Reinforced Self-Training (ReST). Additionally, it provides a comprehensive survey of transformer architectures and training techniques for LLMs. The toolbox for implementing these techniques is publicly available at https://github.com/giorgioroffo/large_language_models_open_suite.
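
To see why LoRA counts as parameter-efficient, compare the trainable parameters of a full weight update with those of a rank-r update of the form B·A, where B is d_out × r and A is r × d_in. The function names and the 4096 × 4096 layer size are illustrative assumptions and are not taken from the LLMsuite toolbox.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` LoRA update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    if rank < 1:
        raise ValueError("rank must be at least 1")
    return rank * (d_out + d_in)
```

For a 4096 × 4096 projection with rank 8, the LoRA update trains 65,536 parameters against 16,777,216 for the full matrix, which is 1/256 of the original count.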

Paper link

Further reading

https://github.com/giorgioroffo/large_language_models_open_suite

https://x.com/omarsar0/status/1813980712346763589

Beyond Euclid

Paper introduction

Provides an illustrated guide and graphical taxonomy of recent advances in non-Euclidean machine learning.

Abstract

The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.

Paper link

Further reading

https://x.com/omarsar0/status/1812927886766010653

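Original source

This post was compiled with a GPT model and may contain errors, so please also refer to the original source at the bottom of the post! If you notice anything awkward or incorrect while reading, please let us know in the comments.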

