Alex Ziskind | Zero to Hero LLMs with M3 Max BEAST @AZisk | Uploaded December 2023 | Updated October 2024.
M3 Max is a Machine Learning BEAST. So I took it for a spin with some LLMs running locally.
I also show how to create GGUF quantizations with llama.cpp.
Temperature/fan on your Mac: tunabellysoftware.com/tgpro/index.php?fpr=alex (affiliate link)
Run Windows on a Mac: https://prf.hn/click/camref:1100libNI (affiliate)
Use COUPON: ZISKIND10
Gear Links
* New MacBook Air M1 Deal: amzn.to/3S59ID8
* Renewed MacBook Air M1 Deal: amzn.to/45K1Gmk
* Great 40Gbps T4 enclosure: amzn.to/3JNwBGW
* My NVMe SSD: amzn.to/3YLEySo
* My gear: amazon.com/shop/alexziskind
Related Videos
* RAM torture test on Mac - youtu.be/l3zIwPgan7M
* Set up Conda on Mac - youtu.be/2Acht_5_HTo
* 15" MacBook Air | developer's dream - youtu.be/A1IOZUCTOkM
* INSANE Machine Learning on Neural Engine - youtu.be/Y2FOUg_jo7k
* M2 MacBook Air and temps - youtu.be/R7F-TxEukdY
* This is what spending more on a MacBook Pro gets you - youtu.be/iLHrYuQjKPU
* Developer productivity Playlist - youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
AI for Coding Playlist: youtube.com/playlist?list=PLPwbI_iIX3aSlUmRtYPfbQHt4n0YaX0qw
Timestamps
00:00 Intro
00:40 Build from scratch - manual
09:44 Bonus script - automated
11:21 LM Studio - one handed
Repo
github.com/ggerganov/llama.cpp
Commands
# assuming you already have a conda environment set up and dev tools installed (see videos above for instructions)
*Part 1 - manual*
brew install git-lfs
git lfs install
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
make
git clone https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B openhermes-7b-v2.5
mv openhermes-7b-v2.5 models/
python3 convert.py ./models/openhermes-7b-v2.5 --outfile ./models/openhermes-7b-v2.5/ggml-model-f16.gguf --outtype f16
./quantize ./models/openhermes-7b-v2.5/ggml-model-f16.gguf ./models/openhermes-7b-v2.5/ggml-model-q8_0.gguf q8_0
./quantize ./models/openhermes-7b-v2.5/ggml-model-f16.gguf ./models/openhermes-7b-v2.5/ggml-model-q4_k.gguf q4_k
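As a rough sanity check on disk usage before you quantize, you can estimate each output's size. This sketch assumes ~7.24B parameters for Mistral 7B and approximate average bits per weight (16 for f16, ~8.5 for q8_0, ~4.5 for q4_k, since k-quants mix tensor types and carry scale overhead); real GGUF files will differ slightly.

```python
# Rough GGUF size estimate for a 7B model. PARAMS and the bits-per-weight
# figures below are assumptions, not values read from the actual files.
PARAMS = 7.24e9  # approximate parameter count of Mistral 7B

def est_gb(bits_per_weight: float) -> float:
    """Estimated file size in GiB for a given average bits per weight."""
    return PARAMS * bits_per_weight / 8 / 1024**3

for name, bpw in [("f16", 16.0), ("q8_0", 8.5), ("q4_k", 4.5)]:
    print(f"{name}: ~{est_gb(bpw):.1f} GiB")
```

Handy for checking whether a given quantization will fit in unified memory before you commit to the conversion.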
./batched-bench ./models/openhermes-7b-v2.5/ggml-model-f16.gguf 4096 0 99 0 2048 128,512 1,2,3,4
./server -m models/openhermes-7b-v2.5/ggml-model-q4_k.gguf --port 8888 --host 0.0.0.0 --ctx-size 10240 --parallel 4 -ngl 99 -n 512
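Once the server is up, you can query it from another process. A minimal Python sketch against the `/completion` endpoint that llama.cpp's server exposes; the prompt and `n_predict` values here are just illustrative.

```python
import json
from urllib import request

SERVER = "http://localhost:8888"  # matches --port 8888 in the server command above

def build_payload(prompt: str, n_predict: int = 128) -> bytes:
    """JSON body for the server's /completion endpoint."""
    return json.dumps({"prompt": prompt, "n_predict": n_predict}).encode()

def complete(prompt: str, n_predict: int = 128) -> str:
    """Send a completion request and return the generated text."""
    req = request.Request(
        SERVER + "/completion",
        data=build_payload(prompt, n_predict),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

# Example (with the server running):
# print(complete("Building llama.cpp on an M3 Max is"))
```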
*Part 2 - auto*
bash -c "$(curl -s https://ggml.ai/server-llm.sh)"
MacBooks in this video
M2 Max 16" MacBook Pro 64GB/2TB
- - - - - - - - -
SUBSCRIBE TO MY YOUTUBE CHANNEL
Click here to subscribe: youtube.com/@AZisk?sub_confirmation=1
- - - - - - - - -
Join this channel to get access to perks:
youtube.com/channel/UCajiMK_CY9icRhLepS8_3ug/join
#m3max #macbook #macbookpro
- - - - - - - - -
ALEX ON X: twitter.com/digitalix