Topics tagged mixture-of-experts

Topic	Replies	Views	Activity
dots.llm1: An MoE model that performs comparably to Qwen2.5-72B with 14B active parameters Readings & Info Sharing mixture-of-experts, dots-llm1, rednote	0	566	Jun 12, 2025
DeepEP: An efficient Mixture-of-Experts parallel communication library (feat. DeepSeek) Readings & Info Sharing mixture-of-experts, deepseek, deepep, expert-parallelism	0	414	Feb 25, 2025
EPLB: A library for load balancing across GPUs in MoE models (feat. DeepSeek) Readings & Info Sharing mixture-of-experts, deepseek, expert-parallelism, expert-parallelism-load-balancer	0	295	Feb 27, 2025
Microsoft releases the Phi-3.5 model series, an improvement on the Phi-3 models (+ Phi-3.5-MoE-instruct) Readings & Info Sharing microsoft, mixture-of-experts, small-llm, small-multimodal, phi-3, phi-3-5	0	1018	Aug 22, 2024
[2024/07/08 ~ 07/14] Top ML Papers of the Week Readings & Info Sharing paper, mixture-of-experts, top-ml-papers-of-the-week, llm-reasoning, flashattention, routellm, flashattention-3, rankrag, lookback-lens, internet-of-agents, 3dgen, ttt-linear	0	580	Jul 14, 2024
Achieving high performance for Mixtral 8x7B with NVIDIA H100 & TensorRT-LLM (feat. NVIDIA blog post) Readings & Info Sharing nvidia, nvidia-h100, mixture-of-experts, tensorrt-llm, tensorrt, mixtral-8x7b	0	311	Jul 10, 2024
MoA (Mixture-of-Agents): A new technique for improving LLM performance Readings & Info Sharing together, paper, llm-framework, mixture-of-experts, mixture-of-depths, mixture-of-agents	0	1888	Jun 21, 2024
DeepSeek-V2: A strong, economical, and efficient Mixture-of-Experts (MoE) language model Readings & Info Sharing mixture-of-experts, deepseek-v2, deepseek, open-weights	0	1111	May 15, 2024
[GN⁺] Mistral AI releases its new open model Mixtral 8x22B Readings & Info Sharing geeknews, mistral-ai, mixture-of-experts, mistral, mixtral, mixtral-8x22b	0	488	Apr 19, 2024
MoD (Mixture-of-Depths): An approach to optimizing compute in Transformer-based language models, plus MoDE (MoD+MoE) Readings & Info Sharing paper, deepmind, mixture-of-experts, mod, mixture-of-depths, moe, mod-transformer, mode, expert-choice-mod-routing, mixture-of-depths-and-experts	0	2644	Apr 7, 2024
Jamba: A Mamba-based open MoE model released by AI21 (OpenLLM) Readings & Info Sharing mixture-of-experts, mamba, jamba, mamba-ssm, ai21labs, open-weights	0	1575	Mar 29, 2024
Qwen1.5-MoE: Qwen's new MoE model that performs comparably to 7B-scale models with 2.7B activated parameters Readings & Info Sharing mixture-of-experts, qwen, qwen-1-5-moe, qwen-moe, qwen-1-5-moe-a2-7b	0	558	Mar 29, 2024
[GN⁺] Google's next-generation model: Gemini 1.5 Readings & Info Sharing geeknews, mixture-of-experts, long-context, gemini, google-gemini	0	523	Feb 16, 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Readings & Info Sharing multimodal, vision-language, mixture-of-experts, llava, moe-llava	0	1421	Feb 6, 2024
[2024/01/01 ~ 01/07] Top ML Papers of the Week Readings & Info Sharing paper, top-ml-papers-of-the-week, llm-survey, hallucination, mobile-aloha, llm-agent, multimodal, llm-finetuning, mixture-of-experts, gpt-4v, code-llm, llama-pro, llm-augment, docllm, instruct-imagen	0	826	Jan 8, 2024
HyperRouter: A sparse Mixture-of-Experts (SMoE) model for efficient training and inference via a HyperNetwork Readings & Info Sharing mixture-of-experts, sparse-moe, hyperrouter, hypernetwork	0	272	Dec 15, 2023
[TLDR] Today's AI News, 2023-09-18: Adobe's generative AI Firefly becomes generally available 👋, a survey on AI copyright issues 📃, AI security 🔐 Readings & Info Sharing gpt-4, tldr-ai, adobe-firefly, mixture-of-experts, ai-security, ai-copyright, ai-regulation, syncdreamer, repilot, encodecmae, syntheworld, illusion-diffusion, generative-dynamics, trickle	1	330	Dec 31, 2023
[TLDR] Today's AI News, 2023-09-15: Microsoft open-sources EvoDiff 🌐, a guide to building RAG-based LLM apps 🤖, a dataset for spotting fake celebrity images 💃 Readings & Info Sharing tldr-ai, summit, mixture-of-experts, mlcommons, evodiff, patronus-ai, deepfakeface, llm-applications, differentiablejpeg, hamur, corqui, retfound, algomo, llm-alignment	1	505	Dec 31, 2023
[TLDR] Today's AI News, 2023-08-07: Alibaba's open-source AI models 💻, TPU makers found a chip company 💾, zero-shot image classification 🖼️ Readings & Info Sharing tldr-ai, babys-cothought, apple, tpu, mixture-of-experts, alibaba, lisa, tongyi-qianwen, functionary, magic123, perceptionclip, structure-of-reasoning, slidespeak, google	1	288	Dec 31, 2023
[TLDR] Today's AI News, 2023-07-10: Google's medical AI in hospitals 🏥, Alibaba's image generator 🖼️, how Christopher Nolan learned to love AI ❤️ Readings & Info Sharing tldr-ai, instructblip, mistral-ai, replit-ai, mindos, causal-ai, reasoning-or-reciting, inverse-reinforcement-learning, mixture-of-experts, test-time-adaptation, focused-transformer, chatgpt-js, med-palm-2, alibaba	1	554	Dec 31, 2023
