• 0 Posts
  • 6 Comments
Joined 1 month ago
Cake day: June 7th, 2025

  • You forget that there are (possible) steps between data crawling and LLM training. I expect they will at least grade the quality (not by hand or anything), and any reasonable company will not want to feed their LLM just about anything. I sure hope competitors actually filter this garbage out before training their AI (a rough sketch of that filtering step is below).

    Couldn’t the above output have easily been generated with a prompt like “what arguments do holocaust deniers use?”?

    I think that is much more likely than it just straight-out denying the Holocaust, but my point is that it should be able to churn that out, just like it should be able to teach people how to make bombs. An LLM can only give accurate information if it ingested that information during its training. I remember, for example, an early image generator not showing nipples on a woman breastfeeding, likely because it only had access to censored images. Basically the digital variant of the Ken doll.
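    For the filtering step mentioned above, here is a minimal sketch, assuming made-up heuristics and hypothetical names (BLOCKLIST, quality_score, filter_crawl are illustrative, not any real pipeline): grade each crawled document and drop the garbage before it ever reaches the training set.

    ```python
    # Hypothetical quality grading between crawling and training.
    BLOCKLIST = {"the holocaust never happened"}  # hypothetical known-bad phrases

    def quality_score(doc: str) -> float:
        """Crude quality heuristic: penalise very short docs, reject blocklisted claims."""
        score = 1.0
        if len(doc.split()) < 20:        # too short to be useful training text
            score -= 0.6
        if any(phrase in doc.lower() for phrase in BLOCKLIST):
            score = 0.0                  # known-bad content is rejected outright
        return score

    def filter_crawl(docs: list[str], threshold: float = 0.5) -> list[str]:
        """Keep only documents whose score clears the threshold."""
        return [d for d in docs if quality_score(d) >= threshold]

    if __name__ == "__main__":
        crawled = [
            "A long, sourced article about the history of the Second World War. " * 5,
            "the Holocaust never happened, trust me",
            "lol",
        ]
        print(len(filter_crawl(crawled)), "of", len(crawled), "documents kept")
    ```

    Real pipelines use classifiers and deduplication rather than a phrase list, but the point stands: this step is cheap and automatable, so there is little excuse for skipping it.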


  • huppakee@feddit.nl to BoycottUnitedStates@europe.pub · Grok got a Nazi patch
    English · 15 up / 3 down · 1 day ago

    An LLM isn't necessarily consistent in telling you facts, since it's just software that knows which words fit well together and itself has no beliefs about what is and isn't true (or even what is and isn't likely). It can easily tell you one thing on Monday and another thing on Tuesday if it isn't trained well. If the original tweet is actual output and not Photoshop, it would show it was trained on Holocaust-denying material. The conversation you link to suggests it was trained on factual historical information. But those two things aren't exclusive: it can have had both as training data. That is, I think, the worrying part: it might (accidentally or intentionally) have been trained on inaccurate historical 'information'.
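
    To make the "no beliefs" point concrete, here is a toy sketch (a hypothetical NEXT_WORD table with made-up probabilities, not how any real model is implemented) of a next-word sampler that can answer the same prompt differently on Monday and Tuesday:

    ```python
    import random

    # Toy "language model": it only knows which continuations are likely,
    # and sampling can pick a less likely (and false) one on any given run.
    NEXT_WORD = {
        "the moon is made of": [("rock", 0.8), ("cheese", 0.2)],
    }

    def continue_prompt(prompt: str) -> str:
        """Sample one plausible next word for the prompt."""
        words, weights = zip(*NEXT_WORD[prompt])
        return random.choices(words, weights=weights, k=1)[0]

    if __name__ == "__main__":
        prompt = "the moon is made of"
        for day in ("Monday", "Tuesday"):
            # Same prompt on different days/runs: the sampled answer can differ,
            # because the model picks a plausible word, not a true one.
            print(day, "->", prompt, continue_prompt(prompt))
    ```

    If denial material is in the training data, the "wrong" continuation simply becomes one of the plausible options it can sample.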