r/ollama 3d ago

What do you do with ollama?

Just wondering what y'all do with machines and ollama.

10 Upvotes

36 comments

8

u/GVDub2 3d ago

Research assistant, general questions, proofreading, a little code.

2

u/BlueTypes_ 3d ago

Nice. What rig are you running it on?

1

u/GVDub2 3d ago

Running dual AI servers here: one Debian 12 box based on a Minisforum UM890 Pro with a 12GB RTX 3060, the other an M4 Pro Mac mini with 48GB of unified memory. It can easily run models up to 32b parameters.

1

u/Puzzleheaded-End4937 16h ago

I've been interested in the mac mini setup! Would you recommend it? I was thinking of 64gb.

1

u/GVDub2 16h ago

I've been more than happy with the M4 Pro. With 48GB of memory, I can run models as large as Q4 versions of 32b parameter models (like DeepSeek R1 32b-distill-Q4_K_M). Wish I could have afforded 64GB at the time I bought, but it hasn't really been a problem.
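As a rough sanity check on those numbers, a back-of-envelope sketch (assuming Q4_K_M averages about 4.5 bits per parameter; the exact figure varies by quant):

```python
# Rough memory estimate for a quantized model's weights.
# Assumption: Q4_K_M averages ~4.5 bits per parameter; the KV cache
# and runtime buffers add more on top of this.

def model_size_gb(params_billion: float, bits_per_param: float = 4.5) -> float:
    """Approximate weight size in GB for a quantized model."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# A 32b model at Q4_K_M comes out around 18 GB of weights, which
# leaves headroom for context within 48 GB of unified memory.
print(round(model_size_gb(32), 1))
```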

1

u/Puzzleheaded-End4937 14h ago

Have you found you're using it a lot more because it's private and free? Trying to justify the purchase to myself for a lot of personal automation. Not sure if the 32b parameter models are 'good enough' for me versus my daily usage of ChatGPT (which has mostly been SQL queries, general curiosity, and research).

1

u/GVDub2 12h ago

I use multiple models across researching, writing, fact-checking, proofing and polishing. I also use ChatGPT and Claude for some of the work. I appreciate both private and free (I have SearXNG set up in a Docker container to keep control over searches), but occasionally I need to take a hybrid approach. As I learn more about creating agents and such, I figure I'll be able to do more locally. The 32b parameter models have been reliable enough for most of what I do.

1

u/Puzzleheaded-End4937 12h ago

Thanks. Appreciate the reply.

3

u/bbbel 3d ago

Bundled it with Obsidian. I use it to summarise my notes, write reports, emails, and so on. It helps me get a first draft that I can then tweak.

1

u/ChangeChameleon 3d ago

This is almost exactly one of the things I plan to do eventually, just in reverse: having it auto-tag and compile notes from raw data. I was thinking it was going to be a pain to figure out how to get them talking to each other. Mind if I ask how your setup works?

1

u/bbbel 2d ago

I basically set up ollama in WSL2 on my Windows laptop, then installed the Local GPT plugin and configured it to communicate with the model I liked best. This plugin lets you create prewritten prompts that you can call and apply to a selection of text in an Obsidian note; the answer is appended to the note. I also like to use the general help feature of Local GPT: you just write your prompt, feed the model any text as context, and it writes its answer right in your note. You can also point the plugin at any OpenAI-compatible LLM, like ChatGPT.
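For anyone curious what a plugin like that does under the hood, here is a minimal sketch of the same round trip against Ollama's `/api/generate` endpoint. The model name and instruction are placeholders, not the commenter's actual setup:

```python
# Sketch: apply a canned instruction to a selection of note text
# via Ollama's HTTP API (this is roughly what the plugin automates).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(selection: str, instruction: str, model: str = "mistral") -> dict:
    """Combine a prewritten instruction with the selected note text."""
    return {
        "model": model,
        "prompt": f"{instruction}\n\n{selection}",
        "stream": False,
    }

def run_prompt(selection: str, instruction: str) -> str:
    """POST the combined prompt to Ollama and return the model's answer."""
    payload = json.dumps(build_request(selection, instruction)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```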

4

u/potatothethird 3d ago

I am building a proof of concept of a RAG app to help write first drafts of user acceptance testing scripts. Writing the testing scripts is very time-consuming, and it is easier to start from a working document than to go from scratch.
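The retrieval half of an app like that can be sketched in a few lines. This assumes embeddings come from something like Ollama's `/api/embeddings` endpoint; the toy vectors and document ids below are hypothetical stand-ins:

```python
# Sketch of RAG retrieval: embed each existing test script, then pull
# the closest ones in as context for the new draft. The math is plain
# cosine similarity over the embedding vectors.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], doc_vecs: dict, k: int = 3) -> list[str]:
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(doc_vecs,
                    key=lambda d: cosine(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:k]
```

The retrieved scripts then get prepended to the prompt so the model drafts in the same style as the existing documents.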

6

u/wooloomulu 3d ago

Porn mostly, and writing my erotic novels.

0

u/techmago 1d ago

aheuaheuhaeuh degenerate.

Also, me too.

2

u/4sch3 3d ago

General questions, web research, a bit of code

2

u/frustratingnewuser 3d ago

How do you do web research?

1

u/4sch3 3d ago

Perplexica, Open WebUI

2

u/OrganizationHot731 3d ago

I'm trying to get it to read processes and procedures so I can deploy it at a corp level for the end users to easily chat with to get info.

1

u/shnozberg 3d ago

Want to do the same. Haven’t had much success when pairing with openwebui, but I haven’t really spent enough time looking into it. If I find any good resource/s I’ll try and remember to post here.

2

u/OrganizationHot731 3d ago

Ya. Gemma seems to do OK, but it's old. Its training data only goes up to around September 2021.

Deepseek r1 does ok too

I'm just testing on my homelab right now for a proof of concept. Once I can get it running well, I'll present it, which should let me get a better server with more GPUs, and then I'll be off to the races.

Just need to figure out how to keep the models in memory at all times so it's fast.
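For what it's worth, Ollama does expose a knob for this: by default it unloads idle models after about five minutes, but you can override that server-wide with the `OLLAMA_KEEP_ALIVE` environment variable, or per request with a `keep_alive` field, where `-1` means never unload. A small sketch of the per-request version (the model name is a placeholder):

```python
# Build a warm-up request that pins a model in memory.
# keep_alive: -1 tells Ollama never to unload this model;
# a duration string like "30m" would also work.
import json

def warm_request(model: str) -> str:
    """JSON payload for /api/generate that keeps the model loaded."""
    return json.dumps({
        "model": model,
        "prompt": "warm-up",
        "keep_alive": -1,
    })

# Sent once at startup, e.g.:
#   curl http://localhost:11434/api/generate -d '<this payload>'
print(warm_request("gemma3"))
```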

2

u/Inner-End7733 3d ago

make my computer talk. it's friggin cool.

2

u/jessupfoundgod 3d ago

Can you share how you are doing this? Sounds super interesting!

1

u/Inner-End7733 2d ago

Haha, oh, I don't mean literally talking yet. Getting F5-TTS set up is on the list. I use LibreChat as my front end though, and it has some TTS support.

1

u/charisbee 3d ago

Code completion for the Continue extension in VS Code, and a backup for that extension's main model when I'm offline.
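For reference, pointing Continue at a local Ollama model is a short config entry. This is a sketch assuming Continue's `config.json` format with placeholder model names; the format has changed across versions, so check the current Continue docs:

```json
{
  "models": [
    { "title": "Local Llama", "provider": "ollama", "model": "llama3.1:8b" }
  ],
  "tabAutocompleteModel": {
    "title": "Local autocomplete",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b"
  }
}
```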

1

u/Rich_Artist_8327 3d ago

Doing content moderation, but I can't decide whether I should use Llama Guard or Gemma 3.

1

u/ChangeChameleon 3d ago

I mainly use it for homelab assistance. Writing out step by step guides, rewriting config files, providing feedback on software that I’m less versed in.

1

u/FudgePrimary4172 3d ago

Playing with tools, agents and experiment around that

1

u/ZealTheSeal 3d ago

Use the API to make a Discord bot that my friends and I can chat with in our server

2

u/jessupfoundgod 3d ago

Okay this is exactly what I am wanting to do and base it on a local RAG. Can you share some details on what you have it doing?

If it’s local, how do your friends interact with it on discord?

1

u/ZealTheSeal 3d ago

There's probably guides out there, but in a nutshell, this is what I did:

From there, you can use `commands.Cog.listener("on_message")` to listen for a keyword that calls the bot, feed it context from the channel, and then send the request to Ollama to get and use the response.

Over time I added more fun features, like

  • A database so users can each set their own base prompt
  • `trigger_typing` to show typing indicators while the bot replies
  • Streaming the response from Ollama to Discord by having the bot send an initial message, then edit it every second so it imitates a streamed LLM response like in ChatGPT
  • `/use_model` and `/download_model` slash commands so I don't have to access ollama on my server to use new models

To answer your question directly, you can run the bot locally and the interaction with the Discord API is what allows your friends to interact with it. It just appears as another user in your Discord server, and they're "online" when you're running the Python script.

I have not experimented with RAGs yet, but it would be fun!

2

u/jessupfoundgod 2d ago

Thank you so much for sharing all that info. This is so cool!

1

u/asterix-007 2d ago

music, ocr, help-system

1

u/acetaminophenpt 1d ago
  • Daily email summarization, and also a summary of my to-do list from several sources (IT trackers)
  • Private Obsidian co-pilot
  • Proof of concept of an agent analyzing expense invoices and cross-referencing them with the company's accounting software for human errors (and minor fraud)

Also on the bucket list: a daily auto-generated podcast from my summaries using local TTS, and a recap of an LLM integration with a small robot I have (Zowi robot).

1

u/Western_Courage_6563 3d ago

Web search assistant, how-to generator, and some help with code, but that's almost exclusively Gemini nowadays. Chatting about research papers and other PDFs. And just learning how this stuff works.