IndyDevDan | ZERO Cost AI Agents: Are ELMs ready for your prompts? (Llama3, Ollama, Promptfoo, BUN) @indydevdan | Uploaded April 2024 | Updated October 2024.
Are Efficient Language Models (ELMs) READY for On-Device Use?
How do you know when they are?
Using the ITV Benchmark with Llama 3, Gemma, and Phi 3, you can be confident an ELM is ready for your use case.
Let's make one thing absolutely clear: the cost of the prompt is going to ZERO.
The world of AI is evolving at a BREAKNECK pace, and the latest advancements in efficient language models (ELMs) like Llama 3, Gemma, OpenELM, and Phi 3 are pushing the boundaries of what's possible with on-device AI.
Llama 3 8B and Llama 3 70B hit the top 20 on the LMSYS Chatbot Arena Leaderboard within a week of launch. You can bet the open source LLM community is tweaking and tuning Llama 3 to make it even better. It's likely we'll see the 8K context window extended to 32K and beyond in a matter of days.
But with so many options and rapid developments, how do you know if an ELM (efficient language model, a.k.a. on-device language model) is truly ready for YOUR specific use case?
Enter this video and the ITV Benchmark - a powerful tool that helps you quickly assess the viability of an ELM for your needs.
In this video, we dive deep into the world of ELMs, exploring:
- The key attributes you should consider when evaluating an ELM, including accuracy, speed, memory consumption, and context window
- How to set your personal standards for each metric to ensure the ELM meets your requirements
- A detailed breakdown of the ITV Benchmark and how it can help you determine if an ELM (Llama 3, Phi 3, Gemma, etc.) is ready for prime time
- Real-world examples of running the ITV Benchmark on Llama 3 and Gemma to see how they stack up
- Access to a hyper modern, minimalist prompt testing framework built on top of Bun, Promptfoo, and Ollama
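Setting personal standards per metric can be sketched as a simple pass/fail gate. Here's a minimal TypeScript sketch (the codebase runs on Bun); the threshold numbers and metric names are illustrative assumptions, not the exact standards from the video:

```typescript
// Hypothetical "personal standards" gate for an ELM.
// All thresholds below are illustrative examples, not the video's actual numbers.
interface ElmMetrics {
  accuracy: number;       // fraction of test assertions passed (0..1)
  tokensPerSecond: number; // generation speed
  memoryGb: number;        // peak memory while serving the model
  contextWindow: number;   // max context length in tokens
}

interface Standards {
  minAccuracy: number;
  minTokensPerSecond: number;
  maxMemoryGb: number;
  minContextWindow: number;
}

// An ELM is "ready" only if it clears every bar you set.
function meetsStandards(m: ElmMetrics, s: Standards): boolean {
  return (
    m.accuracy >= s.minAccuracy &&
    m.tokensPerSecond >= s.minTokensPerSecond &&
    m.memoryGb <= s.maxMemoryGb &&
    m.contextWindow >= s.minContextWindow
  );
}

// Example: check a hypothetical Llama 3 8B run against illustrative standards.
const myStandards: Standards = {
  minAccuracy: 0.9,
  minTokensPerSecond: 20,
  maxMemoryGb: 8,
  minContextWindow: 8000,
};

const llama3Run: ElmMetrics = {
  accuracy: 0.95,
  tokensPerSecond: 35,
  memoryGb: 5.5,
  contextWindow: 8192,
};

console.log(meetsStandards(llama3Run, myStandards)); // true
```

The point of the gate is that readiness is relative to YOUR use case: change the thresholds and the same model flips between ready and not ready.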
We'll also discuss the game-changing implications of ELMs for your agentic tools and products. Imagine running prompts directly on your device, reducing the cost of building to ZERO!
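Running prompts against local models with Promptfoo and Ollama can look something like the following minimal config sketch; the prompt, model tags, and assertion are illustrative assumptions (check the repo's actual `promptfooconfig.yaml` for the real tests):

```yaml
# promptfooconfig.yaml — minimal sketch; models and assertions are illustrative
prompts:
  - "Summarize in one sentence: {{text}}"

providers:
  - ollama:chat:llama3   # served locally by Ollama
  - ollama:chat:gemma

tests:
  - vars:
      text: "Bun is a fast all-in-one JavaScript runtime."
    assert:
      - type: contains
        value: "Bun"
```

With Ollama running locally, `npx promptfoo eval` runs every prompt against every provider at zero API cost, which is exactly the on-device economics the video is about.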
By the end of this video, you'll have a clear understanding of how to evaluate ELMs for your specific use case and be well-equipped to take advantage of these incredible advancements for both LLMs and ELMs.
ELMs, standard-setting, and clean prompt testing enable you to stay ahead of the curve and unlock the full potential of on-device AI!
Like and subscribe for more cutting-edge insights into the world of AI, and let's continue pushing the boundaries of what's possible together!
Reduce your agentic costs with the ELM-ITV Codebase:
github.com/disler/elm-itv-benchmark
Links:
Bun https://bun.sh/
Ollama ollama.com
Promptfoo https://promptfoo.dev/
Apple's OpenELM machinelearning.apple.com/research/openelm
Chapters:
00:00 The cost of agentic tools is going to ZERO
00:48 Are ELMs ready for on device use?
02:28 Setting standards for ELMs
04:05 My (IndyDevDan) personal standards for ELMs
06:36 The ITV benchmark
07:05 ELM benchmark codebase
09:30 Bun, Ollama, Promptfoo, Llama 3, Phi 3, Gemma
12:10 Llama 3, Phi 3, Gemma, GPT-3 TEST Results
16:10 New LLM class system
18:45 On Device PREDICTION
19:05 Make this prompt testing codebase your own
19:45 The cost of the prompt is going to ZERO
20:15 How do you know if ELMs are ready for your use case?
#promptengineering #aiagents #llama3