Topics tagged rlhf

Topic	Replies	Views	Activity
OAT 🌾: A Research-Friendly Framework for Online Alignment of Large Language Models (LLMs) (Online Alignment Toolkit for LLMs) Reading & Information Sharing rlhf , llm-alignment , reinforcement-learning , online-alignment , alignment-framework , oat , sample-efficient-alignment-for-llms , online-alignment-for-llms , active-learning	0	169	Oct 9, 2025
GLM-4.1V-Thinking: A General-Purpose Multimodal Reasoning Model Based on Reinforcement Learning (feat. Zhipu AI) Reading & Information Sharing rlhf , reinforcement-learning , tsinghua-university , rl-with-verifiable-rewards , glm-4-1v-thinking , zhipu-ai , rl-with-curriculum-sampling	0	400	Jul 7, 2025
[2024/12/09 ~ 12/15] This Week's Key ML Papers (Top ML Papers of the Week) Reading & Information Sharing rlhf , paper , top-ml-papers-of-the-week , survey-paper , llm-as-a-judge , chain-of-continuous-thought , phi-4 , llm-async-function-calling , mag-v , multi-agent-framework , clio , autoreason , byte-latent-transformer , granite-guardian	0	493	Dec 16, 2024
[GN] "RLHF is only a small part of RL." - Andrej Karpathy Reading & Information Sharing geeknews , rlhf , andrej-karpathy , hallucination , reinforcement-learning , reducing-hallucination	0	262	Aug 9, 2024
[2024/02/19 ~ 02/25] This Week's Key ML Papers (Top ML Papers of the Week) Reading & Information Sharing stable-diffusion , rlhf , paper , top-ml-papers-of-the-week , survey-paper , chain-of-thought , llm-survey , gemma , opencodeinterpreter , llm-planning , babilong , proximal-policy-optimization , lora-plus , gritlm , llm4annotation	0	741	Feb 26, 2024
[2023/09/25 ~ 10/01] This Week's Key ML Papers (Top ML Papers of the Week) Reading & Information Sharing rlhf , multimodal , paper , vision-transformer , top-ml-papers-of-the-week , chain-of-thought , llm-alignment , reversal-curse , long-context-scaling , graph-neural-prompting , boolformer , mentalllama , logicot	0	1044	Oct 2, 2023
[TLDR] Today's AI News, 2023-09-04: Contextual, $20 Million in Funding 💰, ChatGPT User Session Analysis 🤖, RLHF vs. RLAIF 👊 Reading & Information Sharing meta , rlhf , meta-ai , funding , wasm , mvdream , evolutionaryscale , contextual , gaussian-painters , modular-diffusion , textbase , remfx , rlaif , askmore , pipio	1	312	Dec 31, 2023
[GN] pykoi - A UI Library for Collecting Data & Feedback for LLMs Reading & Information Sharing geeknews , llm , rlhf , pykoi , ui , rag	0	294	Aug 22, 2023
[2023/07/31 ~ 08/06] This Week's Key ML Papers (Top ML Papers of the Week) Reading & Information Sharing rlhf , paper , openflamingo , metagpt , top-ml-papers-of-the-week , selfcheck , med-flamingo , toolllm , skeleton-of-thought , hydrea-effect , autorobotics-zero	0	394	Aug 8, 2023
[TLDR] Today's AI News, 2023-07-20: Apple-GPT 🍎, Keras Core Released 🤖, GPT-4's Wavering Mind 🧠 Reading & Information Sharing rlhf , tldr-ai , llama2 , sam-pt , superhuman-ai , sharcs , arctic , repvit , keras-core , apple-gpt	1	565	Dec 31, 2023
[TLDR] Today's AI News, 2023-05-23: OpenAI's Superintelligence Governance🏛️, Apple - ChatGPT Usage Restrictions🍎, Training Diffusion Models with Reinforcement Learning🦾 Reading & Information Sharing tldr-ai , mms , rlhf , codi , any-to-any , gpt-json , critic , flux-ai , zeda-io	1	362	Dec 31, 2023
Stability AI Releases StableVicuna, the World's First Open-Source RLHF LLM Chatbot Reading & Information Sharing llm , rlhf , stablevicuna , stability-ai	1	1707	Apr 29, 2023
