Hacker News | petercooper's comments

As a few people have asked for screenshots, I spun it up. Here's a video of the basic gameplay: https://peterc.org/misc/fpscob.mp4 .. it's clunky, but it does play.

I'm not getting anywhere near the speeds advertised on my 3090 Ti, alas, but it's fun watching it "fill out" its answers. I did Simon's "SVG pelican on a bicycle" test on it and the result was quite minimalistic but fit the brief: https://gist.github.com/peterc/7672e74ec1437945e5fca5ce2c1c9... -- this was on the Q4 quant running on patched llama.cpp. I will be interested to see if Simon's looks much different.

Hi! What implementation are you using? Right now vLLM is the recommended one. llama.cpp support is still an early draft.

Yeah, the patched llama.cpp. The reason is I saw that using the Q4 quant on vLLM is discouraged and the int8 won't fit on my 3090 Ti, but I could certainly give it a go. I also skipped Transformers as it needs to download the full weights and quantize them locally and I didn't fancy waiting for a 50GB download.

That final paragraph is not good (where an LLM has enumerated the ways it has improved the article).

Stuttering John used to do this back on Howard Stern by asking celebrities questions that were far out of the expected gamut at red carpet events. This was all for shock/comedy value, but "who are you and what makes you famous" type questions can really throw celebs off script: https://www.youtube.com/watch?v=8P0hENpnMXk

Nardwuar the Human Serviette is the reference you were really looking for.

https://youtu.be/ZOZYl0aLxDY


I was expecting legit questions. These are just rude

I'm not the OP and I imagine all cases are different, but my dad was a software developer who had early cognitive decline in his 60s (he died of vascular dementia recently) and he used to talk about it a lot. He said it was like his tolerance for complexity kept closing in.

Where he could once hold an entire system and its details in his head (almost an essential skill in the 80s/90s), he could only instead focus on smaller pieces at a time. Any new tooling or approaches that came along, he was fascinated to hear about them, but no longer felt able to pick them up. He could still solve algorithmic problems and debug "in the small", but it was like he had to do math on a Post-it note where once he had a huge sheet of paper.


Its image processing is terrible. I ran several tests comparing it with Qwen 3.5 0.8b (yes, 7% the size) and Qwen beat it every time, with Gemma often getting things entirely wrong. I even gave it a plain image saying "This is a test" and it thought for 6 minutes trying to analyze it and failed. Qwen 3.5 0.8b confidently got it in under a second.

It may be that the Q6 quant I got is borked (or my LM Studio is), but either way, the 0.8b's performance is mind boggling in comparison.


For Qwen 3.5 0.8B presumably you're running it unquantized, because it's so small. Get at least the Q8 of Gemma 4 12B with the F32 mmproj and use an f16 kv cache.

Then run it with the latest llama.cpp that contains the Gemma 4 12B unified bug fixes, using --image-min-tokens 560 --image-max-tokens 2240 --batch-size 4096 --ubatch-size 4096 --temp 1.0 --top-p 0.95 --top-k 64 --jinja

It's understanding far more complex things for me and can reliably handle tiny text, so it should be easily understanding an image that only contains the text "This is a test".


That sounds like a bug. They're very common for open model releases on the first day. If I wasn't on mobile I'd try it on Google's own app.

Sounds like you're doing it wrong, to be honest.

I guess Google implements more / stronger guard rails than Alibaba and thus confuses these small models. At least this was my impression with Gemma3 models where it often said that the image contains some nudity / sex scenes and therefore it cannot give a description of the image. Never understood the point of this behavior....

The biggest problem with all the Google models has always been RLHF, particularly safety training. They take a good, smart model and make it behave like a corporate person that has been to far too many forced anti-{sexism, racism...} seminars so that it is now living in fear of saying something that could be construed as wrong by some moral standard.

This is almost certainly not true.

If it was, they wouldn't need to be using the classifiers they are using to warn Gemini about problematic prompts.


I've always found the Gemma models to vastly under-perform on vision tasks compared to Qwen so that's nothing new.

The Qwen series adopted vision wayyy earlier than anyone else. No idea why the other labs were sleeping on it but they had about 2 years of experimentation without any competition.

Test it on a professional inference provider to rule out trouble on your end.

"It’s not just smarter; it’s leaner"

Can't speak for browser demos, but I just got the ternary model working on my M5 generating images. The 1 bit didn't work, as it has a known bug with Xcode 24.5 and I wasn't in the mood for installing 24.4 alongside.

Here's a generation in your honor: https://peterc.org/img/johndoe.png


This year seems to be turning a bit of a corner. Of the top box office movies so far this year there's Michael, Project Hail Mary, Hoppers, Wuthering Heights, GOAT.. with Obsession and Backrooms rapidly rising.

Last year it was basically F1 and Minecraft (and while not sequels, both are arguably well known "franchises" outside of movies - but I guess MJ and Wuthering Heights are too ;-)).


Wuthering Heights was a remake, and Hail Mary was also a safeish bet since it's a novel by the same guy as The Martian.

Not to say that it isn't an improvement, but we're still pretty far from seeing American cinema catching up to the world stage in originality, let alone to the golden Hollywood era.


I remember reading Project Hail Mary years before the movie was announced, and thinking "this is written, if not exactly as a screenplay, in such a way to make it SO easy to adapt to a screenplay that given this is from the Martian author there is no way this will not be made as a movie"

I enjoyed both the book and the movie btw


Yeah, a lot of authors nowadays just write screenplays, either thinking about licensing or just from being influenced by TV. Sanderson and Abercrombie come to mind as other authors that basically have action scenes and movie cuts baked into the books.

At least Hail Mary was an original IP with no built-in sequel opportunity. These days, I'd be happy if more major studio, big budget releases were adapted from original IP books.

Sadly, I heard that the studio is apparently trying to figure out how to make a Hail Mary sequel (sigh).


> Sadly, I heard that the studio is apparently trying to figure out how to make a Hail Mary sequel

They put a lot of subtle hooks in the ending montage. I just hope they don't try to force it and make a direct sequel with Ryan as the lead.


And Michael was based on some of the most expensive & beloved IP in the world (extremely popular with Gen X despite everything)

I don't think it's a hot take to say: give Kane Parsons the keys to the kingdom.

Agreed, I've already been to the movies twice this year (PHM and Backrooms); usually it's maybe once every other year for some one-day anime movie airing. I really enjoyed both of them and, just as with PHM, I think Backrooms is best viewed on a big screen.

There are some older studies that showed nicotine (not via smoking, which makes sleep apnea worse) had a positive effect on sleep apnea as well due to the stimulation increasing muscular activity in the airways. Obviously lots of downsides too so it never caught on, but this seems to have a similar mechanism.

