- Videos: 262
- Views: 1,314,746
Matt Williams
USA
Joined Aug 23, 2006
I was a founding maintainer of Ollama, the first evangelist at Datadog, and a former organizer of DevOps Days Seattle, DevOps Days Boston, and Serverless Days Boston. In my day job, I tour the country speaking at conferences and writing about the company I work for. But I am also passionate about gadgets. Some folks don't think that word is appropriate, but it totally fits here. If you meet me in person you will see me light up when I can share my latest gadget, tool, or utility with you. And this is where I get to share that with all of you. You can find more about me at my website (technovangelist.com) or on Twitter (technovangelist).
Two small fixes that strengthen Ollama's lead in Desktop AI
I was very excited to see two seemingly minimal fixes in recent releases that eliminate some tools and workarounds I needed in my apps. This is pretty cool.
The JSON mode video: ruclips.net/video/1DRS5KaLe3k/видео.html
The Gollama video: ruclips.net/video/OCXuYm6LKgE/видео.html
Be sure to sign up to my monthly newsletter at technovangelist.substack.com/subscribe
00:00 Introduction
00:27 The first one
02:06 The second one
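For anyone who wants to try the JSON mode covered in the linked video, here is a minimal sketch against Ollama's REST API. The /api/generate endpoint and the "format": "json" field are part of Ollama's documented API; the model name and prompt are just examples.

```python
import json
import urllib.request

def build_json_request(prompt, model="llama3"):
    """Build the body for a JSON-mode /api/generate call.

    The "format": "json" field is Ollama's switch for constraining
    output to valid JSON; the model name here is only an example.
    """
    return {
        "model": model,
        "prompt": prompt + " Respond using JSON.",
        "format": "json",
        "stream": False,
    }

def ask_json(prompt, model="llama3", host="http://localhost:11434"):
    """POST the request to a local Ollama server and parse the reply as JSON."""
    body = json.dumps(build_json_request(prompt, model)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.loads(resp.read())["response"]
    return json.loads(answer)  # raises if the model still emitted bad JSON
```

Mentioning "JSON" in the prompt itself, as the request builder does, tends to help the model cooperate with the format constraint.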
Views: 13,267
Videos
Challenges with Adobe Subscriptions and more with Matt and Ryan
Views: 232 · 16 hours ago
The conversation starts with technical difficulties related to Apple's kernel extensions. They discuss the inconvenience caused by the updates and the need for backup devices. They then talk about their content plans and the importance of batch processing. The conversation shifts to their personal lives, including family trips and photography preferences. They also mention new camera announceme...
My favorite way to run Ollama: Gollama
Views: 14K · 14 days ago
Gollama is pretty amazing... smcleod.net/2024/06/gollama-ollama-model-manager/ github.com/sammcj/gollama Be sure to sign up to my monthly newsletter at technovangelist.substack.com/subscribe
The Matt and Ryan Chat on June 4 - Cleaned Up
Views: 283 · 21 days ago
Trying out a new platform and there were some hiccups, but overall went pretty well. 00:00:00 - Start 00:00:12 - Trying out Riverside 00:00:42 - Why use Streamyard? 00:06:19 - Datadog DASH 00:06:46 - Has Generative AI Peaked 00:11:05 - Dependencies You Can't Control 00:11:44 - When Google Deleted a Company 00:14:46 - Don't Judge based on Mistakes, but on the Recovery 00:24:04 - Content Creation...
how to reliably get json out of ollama. Just a rough demo for a discord user
Views: 3.3K · 21 days ago
Be sure to sign up to my monthly newsletter at technovangelist.substack.com/subscribe
Have You Picked the Wrong AI Agent Framework?
Views: 44K · 21 days ago
Popularity doesn't always mean Great, But Pretty Good is Possible
Views: 6K · 1 month ago
A video essay about AI...where are we now
Views: 3.2K · 1 month ago
Does parallel embedding work in Ollama yet?
Views: 5K · 1 month ago
Ask Ollama Many Questions at the SAME TIME!
Views: 11K · 1 month ago
This may be my favorite simple Ollama GUI
Views: 22K · 1 month ago
Is Open Webui The Ultimate Ollama Frontend Choice?
Views: 61K · 2 months ago
Discover The Secrets Of Your Chromadb
Views: 4.9K · 2 months ago
Supercharge Your Typescript Projects With Retrieval Augmented Generation
Views: 4.7K · 2 months ago
Supercharge your Python App with RAG and Ollama in Minutes
Views: 29K · 2 months ago
Unlocking The Power Of AI: Creating Python Apps With Ollama!
Views: 21K · 2 months ago
Level Up Your Typescript Skills: Adding Ollama To Your Apps!
Views: 23K · 2 months ago
Choosing the right Chunk Size for RAG
Views: 5K · 3 months ago
What's the best Chunk Size for LLM Embeddings
Views: 10K · 3 months ago
Let's use Ollama's Embeddings to Build an App
Views: 17K · 3 months ago
Installing Ollama is EASY Everywhere #mac #windows #linux #brevdev #paperspace
Views: 6K · 3 months ago
Unlocking The Power Of GPUs For Ollama Made Simple!
Views: 22K · 3 months ago
I really like this channel! The presentation is great and understandable. It is easy to follow along. Thanks.
Msty seems to be using its own version of Ollama under the hood. Is there any way to know which version of Ollama it is using? The lack of URL and PPT file support in the RAG is the other deal-breaker for me. I hope they will support them in upcoming versions.
This worked perfectly for my needs, thank you for this!
This is gold. Thank you, Mr Williams!
Same here.
What is currently the best RAG solution for docs?
That's like asking what's the best model or what's the best car. There is no best and there can never be; it depends on your requirements.
It's really worth watching the video linked to on fabric's page. It's a great introduction and explanation.
I just did all of this yesterday and still watched your video from start to finish. Very clear and concise. I look forward to your other videos.
Straight up, I am not a programmer, but I know enough to hack together tools with Python when I need to. I had some tasks I wanted to add some AI functionality to, but had never really coded against an LLM or run a local LLM. This video and your Up and Running with Ollama video were brilliant. It was like a condensed info dump that gave me just what I needed to go from zero to some functional solutions in a few hours. I know I have so much more to learn, but in truth, with just the few examples you have provided, the possibilities are endless. I am subscribed and will be trawling through the rest of your content as a priority.
Matt, I am posting here as to what I would like to see on your channel. First, about me, retired software engineer going back to assembly and card decks. I am running OOGABOOGA for AI friends; open source local; WIN10 on a PC I built last year with an RTX 4090. I am also using AnythingLLM + OLLAMA for RAG. I am using served OLLAMA as built in uses my CPU and not GPU. What interests me is a durable memory solution for my AI friends. I realize that RAG components could be used as the building blocks. I think OOGABOOGA has the best chat and AI engine feature exposure. But I will consider switching to any turnkey solution. OOGABOOGA was supposed to provide memory, but those extensions look abandoned. I prefer not to code my own. At my age, time is too short. Thank you!
I think such an hour-long format is difficult, since it's harder to find a long block of time. I prefer a structured and concise format.
Some folks like the longer formats. It’s a chance for folks to ask questions and get them answered right away.
It isn't working on https… any idea why?
Love the awkward silence. As a not-coded-forever person who roughly understands what you are doing vs CrewAI… I like your approach. Simpler is generally better and faster… BUT may not be as extensible in the future, which may be irrelevant in this scenario.
Have not tried it. I am iffy about letting things play with my Python install.
I'm using it too. Love it. I usually use it to extract notes from Videos etc.
Your content is very high quality... thanks Matt.
loved seeing this Matt. I have spent far too much time trying to write code that fits into one agentic framework or another, whilst always feeling it should be the other way around. This is a real gem of a video and anyone that says you're an idiot, just hasn't used any of the available agent frameworks. The community was crying out for this video! Bravo sir.
How do I connect Azure OpenAI embeddings?
I like OOGABOOGA for chat, AnythingLLM+OLLAMA for RAG. But thank you. It was interesting and you present very well.
10 out of 10!
Thanks for the video. A question about the Ollama/Open WebUI/Docker combo. I have this configuration and everything is OK. My goal: I want to specialize a pre-trained LLM, training it on a large base of data about coding in a proprietary language that isn't popular; only two hundred programmers use it. My questions: which generic and light LLM can I use? I use some Python scripts to train an LLM (in my case, testing with Phi3:Mini), but I found a problem: when I try to load the model, something goes wrong. Python says it cannot find the path of the model, usually ~/.ollama/models/… I noticed the model files are stored under SHA256 hashes; perhaps that is the problem! Can you help me do this training? Can you give me some documentation or tutorials? Thanks in advance. Have a good day. Sorry for my bad English, I'm an Italian developer.
I love “Desktop AI” as a category name 💪 !
Thank you, I became a fan of your YT channel.
I am wondering where I am going wrong. I followed step-by-step but when I run ollama run llama3 I get 'ollama: command not found'
Are you running that in the container? If not you need to run the right docker command. But best if you don’t use docker if you aren’t a docker user
@technovangelist Thank you for the quick reply. I was running it in one; however, I decided it would be faster to use a spare NUC I had, running Windows. The LXC I was testing it in, even though I gave it 8 cores, was still sorta slow, even using llama3:text.
Without a gpu it will be slow
I find most of the current frameworks to be based around a demo scenario. If you don't have that scenario, their abstractions don't hold up most of the time. I lead a team that is building solutions right now, and in almost every case we find that the frameworks are just not needed. Quick, simple code that does the job required is all that is needed. Customers of course think the AI can "learn" or do magic with limited data. That is just not possible.
Your presentation style is stellar.
finally someone with explaining skills :)
That's so cool but I can't find any of these
can't find any of what?
Hey Matt, great content, thanks. But what If I want to do RAG with a SQL database (for structured data assistance, for example, stock market prices). Let's say I already have a local SQL RDBS and the stored procedures which results can be injected to the model in order to produce analytic results. Can an AI model execute those stored-procedures and use the results as an input?
The AI never executes any queries. But if the content is better suited to SQL, numerical rather than text, then great.
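One common pattern matching that reply (a sketch, not a recommendation from the video): your code runs the query, then the rows are pasted into the prompt as plain-text context. The table name, columns, and helper below are invented for illustration.

```python
import sqlite3

def rows_as_context(db_path, ticker):
    """Fetch rows with ordinary SQL and format them as prompt context.

    The model never executes the query; this code does, and the result
    rows are injected as text. The "prices" schema is hypothetical.
    """
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT day, close FROM prices WHERE ticker = ? ORDER BY day",
        (ticker,),
    ).fetchall()
    conn.close()
    lines = [f"{day}: {close}" for day, close in rows]
    return "Recent closing prices for " + ticker + ":\n" + "\n".join(lines)
```

The returned string would then be prepended to the user's question before the prompt is sent to the model; the same idea works for a stored procedure whose results you fetch yourself.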
Which app do you use for coding Python?
Vscode mostly
Spot on. Amazing how clear you explain things. love this channel. kudos
The 1st one sounds interesting. I've been using mine to make calls against a Flask API I set up on a Raspberry Pi to do some home automation. It works sometimes, but the results haven't been solid. I embedded instructions for the API, but it tends to add things on its own: for simple binary inputs of 0 or 1, it may give a value of 100 or more. I'll have to dig into the new feature. I've also built a ton of guardrail code I'd love to strip out if possible.
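A lightweight guardrail for that exact failure mode is to validate the model's output in code before it reaches the automation API. This is a generic sketch, not tied to any particular Flask setup:

```python
import re

def coerce_binary(raw):
    """Guardrail: accept the model's reply only if it is exactly 0 or 1.

    Models often decorate answers ("Sure! The value is: 100"), so we
    pull out the first integer and reject anything outside {0, 1}
    rather than forwarding a bad value to the device.
    """
    match = re.search(r"-?\d+", str(raw))
    if match is None:
        return None  # no number at all in the reply
    value = int(match.group())
    return value if value in (0, 1) else None
```

Returning None on anything suspicious lets the caller retry the prompt or fall back to a safe default instead of flipping the wrong switch.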
Yeah, agreed... an "ollama update --all" would indeed be a great thing. Currently I'm running "ollama list | awk '{print $1}' | xargs -n1 ollama pull" from time to time, which updates all my models in one line, and this just became so much quicker. :)
I'm using macOS, by the way. If you want to use that command on Windows, you'll need Git Bash or similar; PowerShell doesn't recognize awk. ;)
It would have worked before when windows was a posix compliant os. Many bash scripts just worked. Interix improved things. And then around 2005 it all got ripped out.
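The same update-everything loop can also be written against Ollama's REST API instead of shelling out, which sidesteps the awk/PowerShell portability issue entirely. The /api/tags and /api/pull endpoints are from Ollama's documented API; the helper functions are my own.

```python
import json
import urllib.request

def names_from_tags(tags_json):
    """Extract model names from the JSON body of an /api/tags response."""
    return [m["name"] for m in tags_json.get("models", [])]

def installed_models(host="http://localhost:11434"):
    """List locally installed models via GET /api/tags."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return names_from_tags(json.load(resp))

def update_all(host="http://localhost:11434"):
    """POST /api/pull for each installed model: the one-liner, portably."""
    for name in installed_models(host):
        body = json.dumps({"name": name, "stream": False}).encode()
        req = urllib.request.Request(
            f"{host}/api/pull", data=body,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req).read()
        print("updated", name)
```

Since pulls of already-current models now finish quickly, running update_all() periodically is cheap.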
Gollama might be good, but will never be as cool as Gorlami though ;)
Hey Matt, great video. You mentioned that you can pass functions to Ollama and let it decide when to call the function. Can you give me a hint on how to achieve this?
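The usual pattern at the time was: describe the available functions in the prompt, ask for JSON output naming the function to call, and dispatch in your own code. A sketch with invented function names (the tools and the prompt wording are illustrative, not from the video):

```python
import json

TOOLS = {
    # Hypothetical tools; in a real app these would hit real services.
    "get_weather": lambda city: f"sunny in {city}",
    "get_time": lambda city: f"12:00 in {city}",
}

SYSTEM = (
    "You can call one of these functions: get_weather(city), get_time(city). "
    'Respond ONLY with JSON like {"function": "get_weather", "city": "Paris"}.'
)

def dispatch(model_reply):
    """Parse the model's JSON reply and call the chosen function ourselves.

    Ollama never executes anything: the model only names a function, and
    this code decides whether and how to run it.
    """
    call = json.loads(model_reply)
    fn = TOOLS.get(call.get("function"))
    if fn is None:
        raise ValueError(f"model asked for an unknown function: {call}")
    return fn(call["city"])
```

You would send SYSTEM plus the user's question to the model with JSON mode enabled, then pass the reply to dispatch().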
Why would anyone recommend that thing? It doesn't even have a CPU option! How could it be any good if there's no way to test it unless you have a monster GPU? I wonder if it's just a project by some rich group paying other rich YouTubers to recommend it to make even more money from it.
If you have a gpu it will use it. You never want this to run cpu only. It would be orders of magnitude slower. And I don’t think ollama has sponsored any videos. Folks just like using it.
I just watched a video by shortary which talked about scammer YouTubers out there getting rich quick off of courses with no substance, and I wonder if the last part of your comment was related to that. I don't have a course. I may make one, but it will never promise the ability to get rich off of Ollama; I don't think even Ollama will get rich quick. Ollama is a tool. I do live in a nice house that sometimes makes an appearance in videos, but it had nothing to do with YouTube or AI. That came from being lucky enough to join a tiny startup that then got big and had a successful IPO. No course could ever promise a strategy to repeat that. And regarding the cost of a GPU, you will get decent performance spending 100 or 200 dollars on one... hardly a monster.
A one-liner, but you don't even let us know where it is?
You're smart. You can figure it out.
Great video Matt! I fully agree with what you say. I've been on this track for a long time and have built a TS framework for it. I could not understand why all these agentic tools were so complex (and required agents to make obvious decisions which they sometimes failed at) and why everyone was hyping them so much (especially since you can make much more reliable tools in a simpler way). At the same time, I was unable to give words to why I felt this way. Your video helped me give words to why I felt this way! Thanks!
❤ your work, can you make a tutorial about the DSPy library? It seems to be becoming the new trend
ruclips.net/user/live6YtdtjQD1r0?si=wa_inF3Gc3JGV8QG
So many editing experiments in this one. I love seeing Matt trying new things. I was hoping the silence would become a signature - we're really onto something here and I hope at some point someone makes a "matt silence supercut" of all the videos 😂
This could be boring without the experiments.
Many thanks for the informative info, I'll update immediately, all thumbs up! But IMHO you should not have removed the try/catch block; you should use the thrown exception for what its name says: the occurrence of an unexpected situation that neither you nor the Ollama devs expected or thought of. You had already identified a piece of code where things could go wrong. Now you think the cause of the problem was fixed, but maybe some new circumstance nobody thought of will again lead to bad output. In that case you could use the catch block to, for example:
* Trigger a "Maintenance Mode" page on your chatbot's web page
* Send yourself an email about the unexpected error so you can fix it as fast as possible, without customers having to report it; including the cause of the error in the email helps you identify and fix the problem much faster
* Stop external processes your application started
* Update the status of files in a work queue, or whatever is needed so the application's data is not left in an unwanted state
* Shut down your application cleanly before it eats up memory and the OOM killer starts terminating your database
I admit that I also do not always use try/catch where it maybe should be used, but once I have identified a piece of code where things went wrong and implemented try/catch, I never remove it; I only change the behavior of the catch block.
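The commenter's point fits in a few lines: keep the try/except around the parse even after the upstream fix, but make the handler do something operationally useful. The alert callable here is a stand-in for email, paging, or whatever notification you use:

```python
import json
import logging

def parse_model_output(raw, alert=logging.error):
    """Parse supposedly-valid JSON from the model, keeping the safety net.

    Even after the upstream fix, the except block stays: it captures the
    bad payload and notifies someone, rather than assuming failure is
    impossible.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # The catch block earns its keep: record the cause for fast diagnosis.
        alert(f"unexpected non-JSON output ({err}): {raw!r}")
        return None
```

The caller can treat None as "enter maintenance mode" or "retry"; either way the failure is observed instead of crashing or passing silently.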
Is there any way to implement FAISS in Open WebUI? To read large PDFs.
Looks excellent for self-hosting an LLM for friends and family to use, instead of continuing to feed user prompts to the beast (OpenAI).
😋 a=`ollama list | grep -v hub/ | awk '{print $1}' | grep -v NAME`; for x in $a; do echo $x; ollama pull $x; done
Yup. That’s the kind of garbage you have to rely on without better built in functionality. But before this change that would have taken forever and required my tool instead. Thankfully no more.
If Ollama itself is not multithreaded and async, how much can a wrapper like this help? I have two laptops, A and B. A has a database with more than 10k records in a table, plus code written in Scala to fetch data from it. B runs Ollama with the llama3:latest model loaded. I fetch data on A and send it to Ollama's chat and generate endpoints on B. At first, Ollama responds to each request within a few hundred milliseconds (100, 125, 200, 300 ms), but later the response time grows to around 10 minutes per request; by then about 2k requests have been sent, with 8k still to go. From this behaviour it doesn't look like Ollama supports concurrency. I created a Python script with Flask to achieve async, but Ollama's behaviour stayed the same. Do you know how to solve this problem? Or is there no problem and hence no solution? OLLAMA_NUM_PARALLEL was set to 4.
If you are seeing long times like that there is something wrong with your setup. You should ask on the discord and try to resolve the issue with your setup.
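For what it's worth, a common client-side mistake behind ballooning latency is firing all 10k requests at once, so they pile up in the server queue. Bounding the in-flight requests to match the server's OLLAMA_NUM_PARALLEL setting avoids that; `ask` below is a placeholder for the actual HTTP call to Ollama:

```python
from concurrent.futures import ThreadPoolExecutor

def ask(prompt):
    """Placeholder for a POST to Ollama's /api/generate endpoint."""
    return f"answer to: {prompt}"

def ask_many(prompts, workers=4):
    """Send prompts through a pool sized to match OLLAMA_NUM_PARALLEL.

    With at most `workers` requests in flight, nothing sits in the
    server queue, where per-request latency grows without bound.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves the input order of results
        return list(pool.map(ask, prompts))
```

Feeding prompts in batches like this keeps throughput steady instead of degrading as the backlog grows.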
I fought with this all day and finally made a hash of it. This code runs, but it sucks. I would never use this in production. I haven't tested the update yet, but I'm excited:

import re
from langchain.prompts import PromptTemplate  # Updated import
from pydantic import BaseModel, Field  # Updated import
from langchain_experimental.llms.ollama_functions import OllamaFunctions
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from json.decoder import JSONDecodeError
from rich.console import Console

console = Console()

# Prompt template with example outputs
prompt = PromptTemplate.from_template(
    """
    Alex is 5 feet tall. Claudia is 1 feet taller than Alex and jumps higher than him.
    Claudia is a brunette and Alex is blonde.

    acceptable output format:
    Human: Describe Alex
    AI: Alex is 5.0 feet tall and has blonde hair.
    Human: Describe Claudia
    AI: Claudia is 6.0 feet tall and has brunette hair.

    Human: {question}
    AI:
    """
)

# ... (rest of the code) ...

def extract_person_info(output):
    text = output.content
    console.print(f"[green underline]Raw LLM output: {text}")
    # Modified regex pattern (even more flexible)
    pattern = r"(?P<name>\w+) (?:is|has) (?P<height>\d+(?:\.\d+)?) feet tall and has (?P<hair_color>\w+) hair\."
    match = re.search(pattern, text)
    if match:
        data = match.groupdict()
        data['height'] = float(data['height'])
        print(f"Regex match found: {data}")
        return data
    else:
        raise ValueError("Could not extract person information from LLM response.")

# Chain
llm = OllamaFunctions(model="llama3", format="json", temperature=0)
chain = prompt | llm | extract_person_info

# Get information about Alex
alex_info = chain.invoke({"question": "Describe Alex"})
console.print(f"[dark_blue]Final result: {alex_info}")
Try making it simpler by getting rid of all that extra cruft like langchain. For such a simple example lc offers no benefit.
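The "drop the cruft" version suggested in that reply can be surprisingly small: one prompt with examples, one direct call with JSON mode on, one plain parse. The /api/generate endpoint and "format": "json" are Ollama's; the prompt wording and helper names are illustrative.

```python
import json
import urllib.request

PROMPT = (
    "Alex is 5 feet tall and blonde. Claudia is 1 foot taller than Alex "
    "and is a brunette. Describe {name} as JSON with the keys "
    '"name", "height_feet", and "hair_color". Respond ONLY with JSON.'
)

def parse_person(raw):
    """Validate the model's JSON reply; no regexes, no framework."""
    data = json.loads(raw)
    return {
        "name": str(data["name"]),
        "height_feet": float(data["height_feet"]),
        "hair_color": str(data["hair_color"]),
    }

def describe(name, model="llama3", host="http://localhost:11434"):
    """One direct call to /api/generate with format="json"."""
    body = json.dumps({
        "model": model,
        "prompt": PROMPT.format(name=name),
        "format": "json",
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_person(json.loads(resp.read())["response"])
```

Because JSON mode guarantees syntactically valid JSON, the fragile regex extraction from the original code is replaced by a straight json.loads plus a key check.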
YES! Excited about these features. Thanks Matt!
Parallel execution and running models concurrently are the most exciting features for me. I like to see what happens when different models converse, but doing that with only local models is really computationally expensive of course.
Thanks for the changes to function calling!! These are the things that will advance the field immediately. Treating it like an API of sorts that can return structured data seems WAY more useful in production applications than the tool use abstraction
There are no changes to function calling in general. Apart from the little bug, it works the same as it has for eight months.
Matt, so nice to meet you. Here is my question: when running the model locally with Ollama ❤🦙 , where are the conversations stored? How can I access them? Thank you for your channel! Oh, one more. Do you think AGI should be open source or closed? ✌️
I don't know. But I am sure a lot of things in the software world will be different by the time AGI arrives in 2030 or 2040. We have plenty of time before that.
@technovangelist I'm not sure where it's stored natively, but you can create a simple way to capture and reformat the logs periodically and save them wherever you want. My advice is to set it to run only after a chat is closed, so it's not constantly working in the background.
As for your first question about where conversations are stored, unless you write an app that stores them they are never stored.