Why AI Prompts Fail — And How to Write Better Ones
The Pain Point: Frustrated by Poor Outputs
Imagine asking an AI assistant a simple question and getting a confusing or incorrect answer in return. Many users have experienced this: you pose a question to a system like ChatGPT or Claude, expecting a clear answer, but the response is wrong, vague, or full of “mumbo jumbo.” You try rephrasing or asking follow-up questions, yet the AI’s answers still miss the mark.
This can quickly lead to frustration and erode your trust in the system. In fact, researchers at Google DeepMind note that while large language models (LLMs) are powerful, their “grip on factual accuracy remains imperfect”: they can even “hallucinate” false information, which understandably “can erode trust in LLMs”.
AI experts use the term “hallucination” to describe instances when the model produces an answer that is not grounded in reality, a problem explored further in mitigating LLM hallucinations in legal systems. IBM gives a relatable description: generally, a user expects a correct answer, but “sometimes AI algorithms produce outputs that are not based on [the] training data… or do not follow any identifiable pattern. In other words, it ‘hallucinates’ the response.”
In one OpenAI example, a user asked a chatbot for the title of a particular PhD dissertation; the bot confidently produced an answer, but it was completely wrong. Asked again, it gave a different title, also wrong, and a third try produced yet another incorrect answer.
This kind of confident misinformation is jarring. It feels as if the AI is spewing authoritative-sounding nonsense, leaving users puzzled or misled. The end result is often disappointment (“Why can’t it just tell me the right answer?!”) and skepticism about the AI’s reliability.

How LLMs Actually Work (in Simple Terms)
To understand why these failures happen, it helps to know how LLMs like ChatGPT work in plain language. These AI models don’t have a database of verified facts or a true “understanding” of the world. Instead, an LLM is essentially a giant statistical engine that predicts words.
OpenAI explains that ChatGPT “learns patterns from large amounts of information” and uses those patterns to “predict the next most likely word when generating a response, one word at a time.” In other words, the AI is doing something akin to an advanced version of autocomplete. It has been trained on massive datasets (for example, huge portions of the internet, books, articles, etc.), and through that training it has picked up on how language is used and which words tend to follow which other words in various contexts.
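To make the autocomplete analogy concrete, here is a toy sketch in Python. The probability table below is invented purely for illustration; a real LLM learns billions of such statistics (over word fragments called tokens, not whole words), but the core loop is the same idea: look at the context, pick a likely next word, repeat.

```python
import random

# Invented next-word statistics, standing in for what a real model
# learns from massive amounts of training text.
next_word_probs = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "is": 0.1},
    ("cat", "sat"): {"on": 0.8, "down": 0.2},
    ("sat", "on"): {"the": 0.9, "a": 0.1},
    ("on", "the"): {"mat": 0.7, "roof": 0.3},
}

def generate(prompt_words, max_new_words=4):
    words = list(prompt_words)
    for _ in range(max_new_words):
        context = tuple(words[-2:])           # look at the last two words
        candidates = next_word_probs.get(context)
        if not candidates:
            break                             # no pattern learned for this context
        # Choose the next word weighted by how often it followed this context.
        options, weights = zip(*candidates.items())
        words.append(random.choices(options, weights=weights)[0])
    return " ".join(words)

print(generate(["the", "cat"]))  # e.g. "the cat sat on the mat"
```

Notice that nothing in this loop checks whether the output is true; it only checks whether it is likely.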
Because it generates answers by looking for plausible or likely word sequences, a language model can sound very confident and fluent; after all, it is imitating the patterns of human-written text. However, it doesn’t truly know if a statement is correct. It has no built-in fact-checker or conscious understanding; it’s only as good as the patterns it has seen.
If a prompt asks for something that wasn’t clearly covered in its training data, the model may try to compose an answer from whatever bits and pieces seem to fit, even if the result is inaccurate. The model isn’t trying to lie; it’s trying to be helpful by giving something that looks like an answer. Unfortunately, this means it might output what AI researchers call a hallucination: a response that sounds reasonable but is totally made-up or incorrect.
OpenAI’s researchers define these hallucinations as the model “confidently [generating] an answer that isn’t true.” It’s a direct result of the AI’s training process. During training, the model sees lots of correct text but isn’t explicitly told which statements are true or false. It learns to mimic text, not to verify facts.
IBM draws a clever analogy: AI hallucinations are like humans seeing shapes in clouds; our brains perceive a pattern that isn’t really there. In the same way, an LLM might “see” a plausible answer in the patterns it knows, even if that answer isn’t real. Combined with the model’s lack of genuine understanding, this is why you sometimes get responses that are off-base. It’s not malice or stupidity; it’s a consequence of how the AI was built and trained: it provides statistically likely answers, not guaranteed correct ones.

The Role of Prompts: Garbage In, Garbage Out
Now, what does the user’s prompt have to do with all this? A great deal, it turns out. There’s an old saying in computing: “garbage in, garbage out.” If you give a system poor input, you’ll get poor output. LLMs follow only the instructions and information in your prompt; they are not mind-readers. A vague or poorly phrased question can easily lead the model astray or force it to guess what you really want. Conversely, a well-crafted prompt can steer the model toward a much better, more accurate answer. The model responds to exactly what was asked (to the best of its ability), so the quality of the question shapes the quality of the answer.
Consider a simple example. If you ask “Nutrition” and nothing else, the AI has almost no idea what you’re looking for. Do you want a definition of nutrition, advice on a diet, or an overview of nutrition science? The response you get might be extremely broad, generic, or even irrelevant.
The team behind the Perplexity answer engine notes that it works best when you ask a specific question, not something overly broad. For instance, instead of typing a one-word prompt like “Nutrition”, you could ask, “What are the health benefits of a Mediterranean diet?”. This question is focused and clear about the information you seek. As the Perplexity guide explains, the more specific query will yield a “more direct and useful answer.” In general, if your prompt is ambiguous or lacking detail, the model might fill in blanks with its own assumptions and those assumptions can be wrong.
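If you talk to a model through code rather than a chat window, the same principle applies. Here is a minimal sketch using the OpenAI Python SDK (the model name is an assumption; substitute whichever chat model you have access to):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

def ask(prompt: str) -> str:
    """Send one prompt and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Nutrition"))  # vague: the model must guess what you want
print(ask("What are the health benefits of a Mediterranean diet?"))  # specific
```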
LLMs also don’t have context beyond what you provide (aside from any conversation history in the chat). Clarity is key: you need to spell out what you’re talking about. Anthropic, the company behind Claude, gives a useful mental model for this: think of the AI as a brilliant but very new employee with amnesia; you must give it explicit instructions and all relevant context every time. It doesn’t remember your project’s details unless you include them, and it won’t assume anything you haven’t told it. The more precisely you explain what you want, the better the AI’s response will be.

How to Write Better Prompts (Actionable Tips)
The good news is that users can dramatically improve an AI’s answers by learning a bit of prompt-crafting. For legal professionals, we also cover tips for effective prompt design in more depth. You don’t need technical expertise, just a few practical strategies. Here are some actionable tips for writing prompts that work well:
Be specific and clear about what you want
Don’t be afraid to direct the AI. Tell it exactly what you’re looking for, including any particular detail or format you need.
– Vague: “Tell me about OpenAI.”
– Better: “Give me a 3-sentence summary of what OpenAI does, in simple terms.”
The more you define the task, the less the model has to guess. OpenAI’s own guidelines emphasize being “specific, descriptive and as detailed as possible about the desired context, outcome, length, format, style, etc.” in your prompt. In short, spell out your expectations.
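As a quick sketch of what spelling out expectations can look like, here are the two prompts side by side with the constraints made explicit (the exact wording is illustrative, not an official template):

```python
vague_prompt = "Tell me about OpenAI."

# State the desired outcome, length, style, and audience explicitly.
better_prompt = (
    "Give me a 3-sentence summary of what OpenAI does. "
    "Write in simple terms for a non-technical reader, "
    "and avoid marketing language."
)
```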
Provide context or background information
If your question is about a specific scenario or you have certain requirements, include that in the prompt. Don’t assume the AI knows why you’re asking or what situation you’re dealing with. For example, asking “How should I set up my router?” is fine, but “How should I set up my home Wi-Fi router for a two-story house with many devices?” is even better because it gives context.
Anthropic’s Claude guide suggests giving the model as much context as possible: explain the purpose of the task, the target audience, or any relevant details about what you’re doing. This extra information helps the AI tailor its answer to your needs rather than spitting out a one-size-fits-all response.
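One way to make this habit stick is a small helper that always packages the background with the request. A sketch (the field names here are made up for illustration):

```python
def build_prompt(task: str, audience: str, background: str, output_format: str) -> str:
    """Assemble a prompt that pairs the request with the context the
    model cannot otherwise know about your situation."""
    return (
        f"Task: {task}\n"
        f"Audience: {audience}\n"
        f"Background: {background}\n"
        f"Desired format: {output_format}"
    )

prompt = build_prompt(
    task="Recommend how to set up my home Wi-Fi router.",
    audience="A non-technical homeowner.",
    background="Two-story house, roughly 20 connected devices, "
               "router currently sits in the basement.",
    output_format="A short numbered list of steps.",
)
print(prompt)
```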
Break large or complex tasks into smaller pieces
If you ask an extremely broad question (“Explain everything about climate change”), the model will either give a very high-level answer or get tangled trying to cover too much. It can help to split complex inquiries into multiple, simpler prompts.
You might first ask for a summary of one aspect, then ask a follow-up about another aspect. This step-by-step approach often yields clearer information.
In fact, Perplexity’s own advanced search feature will break a complex query into smaller sub-questions to find better answers. You can mimic this by guiding the AI through a topic step by step. For example, start with “What are the main causes of climate change?” then follow up with “What are the effects on weather patterns?” and so on. By chaining your prompts, you help the model focus and deliver more thorough answers.
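Here is a sketch of that chaining pattern in code, keeping the running conversation so each answer can build on the previous ones (OpenAI Python SDK again; the model name is an assumption):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# One broad topic, broken into focused sub-questions asked in sequence.
sub_questions = [
    "What are the main causes of climate change?",
    "What are the effects on weather patterns?",
    "Which mitigation strategies address those causes?",
]

messages = []  # the running conversation; each turn sees all earlier turns
for question in sub_questions:
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model works here
        messages=messages,
    )
    answer = response.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(f"Q: {question}\nA: {answer}\n")
```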
Ask the model to show its reasoning or provide sources
Another way to improve answer quality is to explicitly request the model to explain how it arrived at an answer, or to ask for citations. For instance, you might prompt: “Explain your reasoning step by step” or “Give me the source of that information.” This can sometimes expose mistakes (if the reasoning is flawed) or increase accuracy (the model might double-check itself before providing a rationale).
Some AI systems are designed to cite sources automatically. The Perplexity answer engine, for example, includes clickable citations with every answer so you can verify the details. If you’re using an AI that doesn’t cite by default (like vanilla ChatGPT), you can still prompt it to provide references or at least mention where it got its info.
Keep in mind that not all models can retrieve real sources on the fly, but asking for a rationale can help ensure the answer isn’t just pulled from thin air. It also encourages you, the user, to critically evaluate the response.
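A sketch of how such a request might be phrased (the wording is illustrative, and remember that a model without live retrieval can only describe sources it saw during training, so verify anything it cites):

```python
question = "Why does ice float on water?"

# Append an explicit request for reasoning and sources to the question.
prompt = (
    f"{question}\n\n"
    "Explain your reasoning step by step, and if you rely on any "
    "specific source, name it so I can verify it myself."
)
```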
Use follow-up questions to refine the output
Don’t give up if the first answer isn’t perfect. One of the advantages of conversational AI is that you can ask another question to clarify or dig deeper. Instead of restarting from scratch with a brand new prompt, continue the conversation.
For example, if the answer was too general, you can say, “Can you give more details about X?” or “What about situation Y?” This iterative approach often leads to a better result than one-and-done queries.
The context from your initial question stays in the conversation, so the model knows what it has already told you. The Perplexity team points out that you can build on previous questions naturally, allowing a richer understanding without needing to start all over.
In practice, this means you can correct the AI or steer it: “Actually, I meant this specific case. Could you elaborate on that?” By refining the prompt or requesting more detail in follow-ups, you guide the AI toward the information you really want. It feels less like talking to a clueless machine and more like collaborating with an assistant to zoom in on the answer.
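In API terms, a follow-up is simply another user message appended to the same history, so the model keeps everything it has already told you. A sketch (the model name is an assumption):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

messages = [{"role": "user", "content": "How should I set up my Wi-Fi router?"}]
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# The first answer was too general, so refine it instead of starting over.
messages.append({
    "role": "user",
    "content": "Actually, I meant for a two-story house with many devices. "
               "Could you elaborate on placement and channel settings?",
})
second = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(second.choices[0].message.content)
```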
Conclusion
The bottom line is that when an AI gives a poor response, it’s often not a random failure; rather, the model is following the cues (or lack of cues) in the prompt. If the prompt is poorly framed, you’re likely to get a poor answer. The flipside is empowering: by learning how to communicate clearly with these models, we can get significantly better results. It’s a bit like giving instructions to a person.
A vague request gets an uncertain outcome, but a clear request gets the job done. Yes, LLMs have inherent limitations: they don’t truly understand, and even the most advanced models today still occasionally make things up with complete confidence.
AI developers at OpenAI, Anthropic, DeepMind, IBM and others are actively working to make these systems more reliable and factual. But as end users, we don’t have to wait passively. We can improve our interactions right now by crafting better prompts, exploring AI integration tools, and being aware of the AI’s mindset (predictive text, not all-knowing oracle). When you give a thoughtful, precise prompt and guide the conversation, you’re essentially helping the AI “think” in the right direction. The result is often night-and-day in quality.
In summary, bad prompts often lead to bad outputs, not because the AI is hopeless, but because it’s following misguided or insufficient input. Conversely, good prompts lead to good outputs, showcasing the AI’s capabilities in the best light. The more we understand how LLMs “think” and how to speak their language, the better they can think with us and assist us. Instead of feeling frustrated, we can feel empowered to get the answers or content we need. In the new world of AI assistants, a little prompt crafting know-how goes a long way towards turning confusion into clarity.