Researchers have described a new method, dubbed semantic chaining, that can coax AI chatbots into generating malicious outputs by splitting prompts into discrete chunks. According to Dark Reading, NeuralTrust says the attack fools image-generating models and has already proven effective against Grok 4, ByteDance's Seedream 4.5, and Google's Gemini Nano Banana Pro.
The technique relies on a three-step process: first, the model is given a normal image prompt; then it is asked to modify one element; finally, it is instructed to apply a second transformation that bypasses safety checks. The approach is said to be limited by the fact that the final output is rendered as an image, which constrains some safety layers.
Dark Reading notes that the researchers also explored how other chatbots resist the technique, including whether it could be used against ChatGPT, but that question remains unresolved. The piece was written by Nate Nelson and published on 29 January 2026.