<aside> 💡

Summary

Investigated LLMs that can run in resource-constrained environments (such as on-device) and analyzed the accuracy and inference time of each model through various evaluation sets.

</aside>

ํŒ€ ๋งํฌ : โ€ฃ (์™ธ๋ถ€ ๋น„๊ณต๊ฐœ)

Tiny LLM


https://github.com/hoonably/TinyLLM

<aside> 💡

์‹œ์ž‘์€ Jetson Nano

However, ๋ฒ„์ „๊ณผ ์„ฑ๋Šฅ์ด ๋„ˆ๋ฌด ๋‚ฎ์•„ ๋น„๊ตํ•  ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์ด ๋ณ„๋กœ ์—†์–ด์„œ Orin-nano๋กœ ์ง„ํ–‰

์ถ”๊ฐ€๋กœ NVIDIA A100-SXM4-80GB๋กœ๋„ ์ง„ํ–‰ํ•ด latency ๋น„๊ต

</aside>
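The latency comparison described above can be sketched as a simple timing harness. This is a minimal sketch, not the project's actual benchmarking code; `generate_fn` stands in for whatever zero-argument callable wraps a model's generation step (e.g. a lambda around `model.generate`).

```python
import statistics
import time

def measure_latency(generate_fn, n_warmup=3, n_runs=10):
    """Time a generation callable: a few warmup calls, then timed runs.

    generate_fn: any zero-argument callable (hypothetical stand-in for a
    model's generation step). Returns (mean, stdev) latency in seconds.
    """
    for _ in range(n_warmup):  # warm up caches / lazy init / GPU kernels
        generate_fn()
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        generate_fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)
```

On a GPU device (Orin Nano or A100) you would also synchronize the device (e.g. `torch.cuda.synchronize()`) before reading the timer, so asynchronous kernels are actually included in the measurement.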

Result

Models

| Model Name | Affiliation | Model Size | Release Date | 🔗 Link |
| --- | --- | --- | --- | --- |
| Bloom | BigScience | 560M | 2022.11 | Bloom |
| Bloomz | BigScience | 560M | 2022.11 | Bloomz |
| Cerebras-GPT | Cerebras | 590M | 2023.03 | Cerebras-GPT |
| Cerebras-GPT | Cerebras | 256M | 2023.03 | Cerebras-GPT |
| Cerebras-GPT | Cerebras | 111M | 2023.03 | Cerebras-GPT |
| Danube3 | H2O | 500M | 2024.07 | Danube3 |
| Flan-T5 | Google | Base | 2023.01 | Flan-T5 |
| LaMini-GPT | MBZUAI | 774M | 2023.04 | LaMini-GPT |
| LaMini-GPT | MBZUAI | 124M | 2023.04 | LaMini-GPT |
| LiteLlama | ahxt | 460M | N/A | LiteLlama |
| OPT | Meta | 350M | 2022.05 | OPT |
| OPT | Meta | 125M | 2022.05 | OPT |
| Pythia | EleutherAI | 410M | 2023.03 | Pythia |
| Pythia | EleutherAI | 160M | 2023.03 | Pythia |
| PhoneLM | mllmTeam | 0.5B | 2024.11 | PhoneLM |
| Qwen1.5 | Alibaba | 0.5B | 2024.02 | Qwen1.5 |
| Qwen2.5 | Alibaba | 0.5B | 2024.09 | Qwen2.5 |
| SmolLM | Hugging Face | 360M | 2024.07 | SmolLM |
| SmolLM | Hugging Face | 135M | 2024.07 | SmolLM |
| TinyLlama | TinyLlama | 1.1B | 2023.12 | TinyLlama |

Evaluation Datasets

| Dataset Name | Explanation | 🔗 Link |
| --- | --- | --- |
| ARC | Science question dataset for QA.<br>ARC-e: ARC-easy (the easy subset) | ai2_arc |
| OBQA | QA dataset modeled after open-book exams, designed to test multi-step reasoning, commonsense knowledge, and deep text comprehension | openbookqa |
| BoolQ | QA dataset of yes/no questions | boolq |
| PIQA | QA dataset for physical commonsense reasoning | piqa |
| SIQA | QA dataset designed to evaluate social commonsense reasoning about people's actions and their social implications | social_i_qa |
| WinoGrande | Fill-in-the-blank problems testing commonsense reasoning | winogrande |
| HellaSwag | Commonsense natural language inference (sentence completion) | hellaswag |
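These benchmarks are largely multiple-choice, and a common way to evaluate them is to score each candidate answer with the model (e.g. by summed log-likelihood) and pick the highest-scoring option. A minimal sketch of that accuracy computation, where `score_fn` is an assumed model-scoring callable (not part of this project's code):

```python
def multiple_choice_accuracy(examples, score_fn):
    """Accuracy over multiple-choice examples.

    examples: iterable of (question, options, gold_index) tuples.
    score_fn(question, option) -> float, higher is better
    (hypothetical stand-in for e.g. a log-likelihood scorer).
    """
    correct = 0
    total = 0
    for question, options, gold in examples:
        # Predict the option the scorer ranks highest.
        pred = max(range(len(options)),
                   key=lambda i: score_fn(question, options[i]))
        correct += int(pred == gold)
        total += 1
    return correct / total
```

For yes/no datasets like BoolQ the same scheme applies with the two options "yes" and "no".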

Environment

- Jetson Orin Nano 8GB RAM (Link)
- Python: 3.10.2
- CUDA: (TBD)

Evaluation Result