

Sounds like ollama was loaded up with either an overly censored or plain brain-dead language model. Do you know which model it was? Maybe try Mistral if it fits on your computer.
I run kobold.cpp, which is a cutting-edge local model engine, on my local gaming rig turned server. I like to play around with the latest models to see how they improve/change over time. The current chain-of-thought thinking models like the DeepSeek R1 distills and Qwen QwQ are fun to poke at with advanced open-ended STEM questions.
STEM questions like "What does Gödel's incompleteness theorem imply about scientific theories of everything?" or "Could the speed of light be more accurately referred to as 'the speed of causality'?"
As for actual daily use, I prefer mistral small 24b, treating it like a local search engine with the legitimacy of Wikipedia. It's a starting point to ask questions about general things I don't know about or want advice on, then I do further research through more legitimate sources.
It's important not to take the LLM too seriously, as there's always a small statistical chance it hallucinates some bullshit, but most of the time it's fairly accurate and a pretty good jumping-off point for further research.
Let's say I want an overview of how to repair small holes forming in concrete, general ideas on how to invest financially, how to change fluids in a car, how much fat and protein is in an egg, etc.
If the LLM mentions a word or related concept I don't recognize, I grill it for clarifying info and follow it through the infinite branching garden of related information.
I've used an LLM to help me go through old declassified documents and speculate on internal government terminology I was unfamiliar with.
I've used a text-to-speech model to get it to speak, just for fun. I've used a multimodal model to get it to see/scan documents for info.
I've used web search to get the model to retrieve information it didn't know via a DDG search, again mostly for fun.
Feel free to ask me anything, I’m glad to help get newbies started.
@CubitOom@infosec.pub there's no need to find a replacement. The benefit of open source projects is that there's usually someone who forks the project and continues the legacy, if it's popular enough.
You are very fortunate: GitHub user agrhan forked your pass-based Android password manager. Here's a direct link to the APK snapshot while I'm at it.
This fork is maintenance-only, meaning it only updates dependencies, so don't expect new features. There might be others that are more actively worked on, but this is the most popular and stable one.
In the future, maybe take a moment to browse the forks on the project's git page instead of just assuming the project is dead and running to a replacement.
I just spent a good few hours optimizing my LLM rig: disabling the graphical interface to squeeze 150MB of VRAM back from Xorg, setting the program's CPU niceness to the highest priority, and tweaking settings to find memory limits.
I was able to increase the token speed by half a token per second while doubling context size. I don't have the budget for any big VRAM upgrade, so I'm trying to make the most of what I've got.
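For anyone curious, the tweaks above boil down to a few commands. This is a rough sketch assuming a systemd distro, an NVIDIA card, and a kobold.cpp install; the exact paths, model name, and flags here are placeholders for whatever your setup uses:

```shell
# Boot to a text console so Xorg stops holding VRAM
# (undo later with: sudo systemctl set-default graphical.target)
sudo systemctl set-default multi-user.target

# After rebooting, check how much VRAM was freed (NVIDIA cards)
nvidia-smi --query-gpu=memory.used --format=csv

# Launch the engine at maximum CPU priority
# (negative niceness values require root)
sudo nice -n -20 python koboldcpp.py --model model.gguf --contextsize 8192
```

Then it's just trial and error: bump the context size or GPU layer count until you hit out-of-memory errors, and back off a notch.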
I have two desktop computers. One has better RAM + CPU + overclocking but a worse GPU. The other has a better GPU but worse RAM, a worse CPU, and no overclocking. I'm contemplating whether it's worth swapping GPUs to really make the most of the available hardware. It's been years since I took apart a PC and I'm scared of doing something wrong and damaging everything. I dunno if it's worth the time, effort, and risk for the squeeze.
Otherwise I'm loving my self-hosted LLM hobby. I've been very into learning computers and ML for the past year. Crazy advancements, exciting stuff.