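Following MPT-7B, introduced in the post below, LLMs released under licenses that permit commercial use keep appearing. There is a GitHub repository that collects them (and is still being updated), so I'm sharing it here.
Below is the list of LLMs under the MIT, Apache 2.0, or OpenRAIL-M licenses, as of today (May 9, 2023).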
Open LLMs
These LLMs are all licensed for commercial use (e.g., Apache 2.0, MIT, OpenRAIL-M). Contributions welcome!
LLMs for code
Language Model | Checkpoints | Paper/Blog | Size | Context Length | Licence |
---|---|---|---|---|---|
SantaCoder | santacoder | SantaCoder: don't reach for the stars! | 1.1B | 2048 | OpenRAIL-M v1 |
StarCoder | starcoder | StarCoder: A State-of-the-Art LLM for Code, StarCoder: May the source be with you! | 15B | 8192 | OpenRAIL-M v1 |
Replit Code | replit-code-v1-3b | Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit | 2.7B | infinity? (ALiBi) | CC BY-SA-4.0 |
CodeGen2 | codegen2 1B-16B | CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | 1B - 16B | 2048 | Apache 2.0 |
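
For readers who want to filter the table programmatically, here is a minimal Python sketch. The `CodeLLM` record, the `CODE_LLMS` list, and the helper names are illustrative and not part of the repository. The values are copied from the rows above, with two assumptions: Replit Code's "infinity? (ALiBi)" entry is represented as `None` (read as "no fixed limit listed"), and CodeGen2's size is taken as its largest listed variant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CodeLLM:
    name: str
    max_params_b: float            # largest listed size, in billions of parameters
    context_length: Optional[int]  # None = no fixed limit listed (assumption for ALiBi)
    licence: str


# Rows transcribed from the "LLMs for code" table above.
CODE_LLMS = [
    CodeLLM("SantaCoder", 1.1, 2048, "OpenRAIL-M v1"),
    CodeLLM("StarCoder", 15.0, 8192, "OpenRAIL-M v1"),
    CodeLLM("Replit Code", 2.7, None, "CC BY-SA-4.0"),
    CodeLLM("CodeGen2", 16.0, 2048, "Apache 2.0"),
]


def with_min_context(models: List[CodeLLM], min_tokens: int) -> List[str]:
    """Names of models whose listed context length is at least min_tokens.

    Models with no fixed limit (None) are always included.
    """
    return [
        m.name
        for m in models
        if m.context_length is None or m.context_length >= min_tokens
    ]


def with_licence(models: List[CodeLLM], licence: str) -> List[str]:
    """Names of models released under exactly the given licence string."""
    return [m.name for m in models if m.licence == licence]
```

For example, `with_min_context(CODE_LLMS, 4096)` keeps only StarCoder and Replit Code, and `with_licence(CODE_LLMS, "Apache 2.0")` keeps only CodeGen2.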
Evals on open LLMs
- Leaderboard by lmsys.org
- Evals by MosaicML
- Holistic Evaluation of Language Models (HELM)
- LLM-Leaderboard
LLM datasets for fine-tuning
PENDING
Want to contribute? Just add a row above.
What do the licences mean?
- Apache 2.0: Allows users to use the software for any purpose, to distribute it, to modify it, and to distribute modified versions of the software under the terms of the license, without concern for royalties.
- MIT: Similar to Apache 2.0 but shorter and simpler. Also, in contrast to Apache 2.0, does not require stating any significant changes to the original code.
- CC BY-SA-4.0: Allows (i) copying and redistributing the material and (ii) remixing, transforming, and building upon the material for any purpose, even commercially. But if you do the latter, you must distribute your contributions under the same license as the original. (Thus, it may not be viable for internal teams.)
- OpenRAIL-M v1: Allows royalty-free access and flexible downstream use and sharing of the model and modifications of it, and comes with a set of use restrictions (see Attachment A).
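
The summaries above can also be kept as a small lookup when screening models. The following is a rough Python sketch, not legal guidance: the `LICENCE_FLAGS` table and the `caveats` helper are hypothetical names, and the two flags only encode what the bullet list above states (share-alike for CC BY-SA-4.0, use restrictions for OpenRAIL-M v1).

```python
from typing import Dict, List

# Flags derived only from the summaries above; this is a simplification,
# not a full reading of each license text.
LICENCE_FLAGS: Dict[str, Dict[str, bool]] = {
    "Apache 2.0": {"share_alike": False, "use_restrictions": False},
    "MIT": {"share_alike": False, "use_restrictions": False},
    "CC BY-SA-4.0": {"share_alike": True, "use_restrictions": False},
    "OpenRAIL-M v1": {"share_alike": False, "use_restrictions": True},
}


def caveats(licence: str) -> List[str]:
    """Return the caveats noted above for a licence.

    Raises KeyError for a licence that is not in the table.
    """
    flags = LICENCE_FLAGS[licence]
    notes: List[str] = []
    if flags["share_alike"]:
        notes.append("adaptations must be distributed under the same license")
    if flags["use_restrictions"]:
        notes.append("use restrictions apply (see Attachment A)")
    return notes
```

Calling `caveats("MIT")` returns an empty list, while `caveats("CC BY-SA-4.0")` returns the share-alike note. An unknown license name raises `KeyError` on purpose, so a new license is not silently treated as unrestricted.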
Disclaimer: The information provided in this repo does not, and is not intended to, constitute legal advice. Maintainers of this repo are not responsible for the actions of third parties who use the models. Please consult an attorney before using models for commercial purposes.