Why prompt engineering?
A prompt guides the model’s behavior without changing its underlying capabilities. By providing instructions, examples, and context, prompts shape how the model responds to inputs. Prompt engineering is important because it allows you to modify model behavior. While other approaches exist (such as fine-tuning), prompt engineering typically offers the lowest barrier to entry and often delivers the highest return on investment. Prompt engineering is often a multi-disciplinary effort: the most effective prompt engineer may be a product manager, domain expert, or other non-technical team member rather than the software engineer building the application. Proper tooling and infrastructure are essential to support this cross-functional collaboration.
Prompt types
There are two types of prompt formats: chat-style prompts and completion-style prompts.
Chat prompts are a list of messages, each with a role (such as system, user, or assistant). This is the prompting style supported by most current model APIs and is the recommended format.
Completion prompts are a single string. This is an older prompting style maintained primarily for backward compatibility.
Unless you have a specific reason to use completion prompts, use chat prompts for new projects. Chat prompts provide better structure for multi-turn conversations and are better supported by modern LLMs.
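The two styles can be sketched side by side. This is an illustrative sketch: the role names follow the common chat-API convention, and the message content is made up.

```python
# A chat-style prompt: a list of messages, each with a role.
chat_prompt = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the attached article."},
]

# The equivalent completion-style prompt: a single flat string.
completion_prompt = (
    "You are a helpful assistant.\n"
    "User: Summarize the attached article.\n"
    "Assistant:"
)
```

The chat form makes roles and turn boundaries explicit, which is why it maps more naturally onto multi-turn conversations.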
Prompts vs. prompt templates
While prompt and prompt template are often used interchangeably, understanding the distinction helps clarify how LangSmith manages and evaluates your AI application.
- Prompts refer to the messages that are passed into the language model.
- Prompt templates allow you to create reusable prompts with dynamic placeholders that get filled in at runtime. Instead of hardcoding values, you define variables that LangSmith replaces with different inputs each time you run your prompt. This makes prompts flexible, testable, and easier to iterate on.
Using a prompt template involves three steps:
- Define the template: Create a prompt with variables (marked with curly braces) that will be replaced at runtime.
- Provide input values: Supply the actual values for each variable.
- Get the final prompt: LangSmith replaces the variables with your inputs to create the prompt sent to the model.
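The three steps above can be sketched with plain Python string formatting (the template and input values are illustrative; f-string-format templates behave like Python's str.format):

```python
# Step 1 -- define a template with curly-brace variables.
template = "Translate the following text to {language}: {text}"

# Step 2 -- provide input values for each variable.
inputs = {"language": "French", "text": "Hello, world!"}

# Step 3 -- fill the variables to produce the final prompt.
final_prompt = template.format(**inputs)
# final_prompt == "Translate the following text to French: Hello, world!"
```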
Prompts in LangSmith
You can store and version prompt templates in LangSmith. These templates can be tested in the playground, versioned with commits and tags, and pulled into your application code. Open the playground to create and test your first prompt template. For a step-by-step guide, refer to Create a prompt.
F-string vs. mustache
You can format your prompt template with input variables using either f-string or mustache format. For details on how to use these formats in the playground, see Template format. The playground uses f-string as the default template format, but you can switch to mustache format in the prompt settings under Template format. Mustache gives you more flexibility with conditional variables, loops, and nested keys. For conditional variables, you’ll need to manually add JSON variables in the Inputs section. Read the documentation for more details.
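A quick comparison of the two syntaxes (the variable names are illustrative; only the f-string template is rendered here, using Python's str.format as a stand-in, since rendering mustache requires a mustache library):

```python
# The same template written in the two supported syntaxes.
fstring_template = "Hello, {name}! You have {count} new messages."
mustache_template = "Hello, {{name}}! You have {{count}} new messages."

# f-string templates fill variables with str.format-style substitution.
greeting = fstring_template.format(name="Ada", count=3)
# greeting == "Hello, Ada! You have 3 new messages."

# Mustache additionally supports conditional sections and loops, e.g.:
mustache_section = "{{#unread}}You have unread messages.{{/unread}}"
```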
Tools
Tools are interfaces the LLM can use to interact with the outside world. A tool consists of a name, a description, and a JSON schema of the arguments used to call it.
Structured output
Structured output is a feature of most state-of-the-art LLMs: instead of producing raw text, the model adheres to a specified schema. This may or may not use tools under the hood. Structured output is similar to tools but differs in a few key ways. With tools, the LLM chooses which tool to call (or may choose not to call any); with structured output, the LLM always responds in the specified format. With tools, the LLM may select multiple tools; with structured output, only one response is generated.
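A tool definition can be sketched as a name, a description, and a JSON schema of arguments. The get_weather tool below is hypothetical, and the exact wire format varies by model provider:

```python
# Hypothetical tool definition: name, description, and argument schema.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. Paris"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}
```

The schema tells the model which arguments exist, their types, and which are required, so it can emit a well-formed call.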
Model
Optionally, you can store a model configuration alongside a prompt template. This includes the name of the model and any other parameters (temperature, etc.).
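For instance, a stored model configuration might look like the following sketch (the field names and values are illustrative, not a fixed LangSmith schema):

```python
# Illustrative model configuration stored alongside a prompt template.
model_config = {
    "model": "gpt-4o",      # model name (example value)
    "temperature": 0.2,     # sampling temperature
    "max_tokens": 512,      # cap on generated tokens
}
```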
Prompt versioning
Versioning is a key component of iterating on and collaborating on prompts.
Commits
Every saved update to a prompt creates a new commit with a unique commit hash. This allows you to:
- View the full history of changes to a prompt.
- Review earlier versions.
- Revert to a previous state if needed.
- Reference specific versions in your code using the commit hash (e.g., client.pull_prompt("prompt_name:commit_hash")).
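As an illustration, a content-derived hash behaves like a commit identifier: every change to the prompt yields a new hash, so old versions stay addressable. This is only a sketch; LangSmith's actual commit-hashing scheme is not specified here.

```python
import hashlib

def commit_hash(prompt_text: str) -> str:
    """Derive a short, stable identifier from prompt content (illustrative)."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:8]

v1 = commit_hash("You are a helpful assistant.")
v2 = commit_hash("You are a concise, helpful assistant.")
# v1 != v2: any edit produces a new hash, leaving the old version intact.
```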

Tags
Commit tags are human-readable labels that point to specific commits in your prompt’s history. Unlike commit hashes, tags can be moved to point to different commits, allowing you to update which version your code references without changing the code itself. Use cases for commit tags include:
- Environment-specific tags: Mark commits for production or staging environments, which allows you to switch between versions without changing your code.
- Version control: Mark stable versions of your prompts (for example, v1 and v2), which lets you reference specific versions in your code and track changes over time.
- Collaboration: Mark versions ready for review, which enables you to share specific versions with collaborators and get feedback.
Not to be confused with resource tags: Commit tags reference specific prompt versions. Resource tags are key-value pairs used to organize workspace resources.
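The difference between immutable commit hashes and movable tags can be sketched in plain Python. The hashes and tag names below are hypothetical placeholders:

```python
# Hypothetical commit history, oldest to newest (immutable hashes).
commit_history = ["0abc1234", "1def5678", "2ghi9012"]

# Tags are movable pointers onto that history.
tags = {"production": "0abc1234", "staging": "1def5678"}

# Promote the staging version to production by moving the tag --
# code that resolves "production" needs no change.
tags["production"] = tags["staging"]

def resolve(ref: str, tags: dict) -> str:
    """Resolve a tag to its commit hash, or pass through a raw hash."""
    return tags.get(ref, ref)
```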
Prompt playground
The prompt playground provides an interface for iterating on and testing prompts. You can access the playground from the sidebar or directly from a saved prompt. In the playground you can:
- Change the model being used
- Change the prompt template being used
- Change the output schema
- Change the tools available
- Enter the input variables to run through the prompt template
- Run the prompt through the model
- Observe the outputs
Use Polly in the Playground to optimize prompts, generate tools, and create output schemas with AI assistance.
Testing multiple prompts
You can add multiple prompts to your playground to compare outputs and evaluate performance:
Testing over a dataset
To test over a dataset, select the dataset from the top right and click Start. You can configure whether results are streamed and the number of repetitions for the test.