NLP (7): Prompt Engineering and In-Context Learning
Chen Kai

In the era of large language models, how to "converse" with models has become an art form. The same model can produce dramatically different results depending on the prompt used. Prompt engineering is the discipline of designing effective inputs to unlock the best performance from models. From simple zero-shot prompts to complex chain-of-thought reasoning, from role assignment to template design, prompt engineering has become a core skill for working with large models.

In-Context Learning (ICL) is the theoretical foundation of prompt engineering. It reveals how models learn from examples, how they dynamically adjust behavior during inference, and why few-shot prompts often outperform zero-shot prompts. Understanding these mechanisms not only helps us write better prompts but also deepens our understanding of how large language models work.

This article systematically introduces the core concepts and practical techniques of prompt engineering, including zero-shot, few-shot, and chain-of-thought prompting, role assignment and formatting techniques, prompt template design, advanced techniques like Self-Consistency and ReAct, and demonstrates how to build efficient prompt systems through practical examples.

Fundamentals and Principles of Prompt Engineering

The core idea of prompt engineering is to guide models toward desired outputs through carefully designed input text. A good prompt should be clear, specific, and contain the necessary contextual information.

Core Principles of Prompt Engineering

1. Clarity

Prompts should clearly express task requirements and avoid ambiguity. Compare these two prompts:

  • ❌ Poor: "Analyze this text"
  • ✅ Good: "Please analyze the sentiment of the following text and provide a judgment of positive, negative, or neutral"

2. Specificity

Provide specific output format requirements so the model knows how to structure the answer:

prompt = """
Please analyze the pros and cons of the following product and output in the following format:
Pros:
1. ...
2. ...

Cons:
1. ...
2. ...
"""

3. Context Completeness

Ensure prompts contain all information needed to complete the task:

# Missing context
prompt1 = "Translate this sentence"

# Complete context
prompt2 = """
Please translate the following English sentence into Chinese, maintaining the original meaning and tone:
English: The quick brown fox jumps over the lazy dog.
Chinese:
"""

4. Role Assignment

Assigning a clear role to the model can significantly improve output quality:

prompt = """
You are an experienced Python programming expert with expertise in writing clear and efficient code.
Please write a Python function for the following requirement:
Requirement: Implement a quicksort algorithm
"""

Components of a Prompt

A complete prompt typically includes:

  1. Instruction: Clearly tells the model what to do
  2. Context: Provides background information
  3. Input Data: Specific content to process
  4. Output Format: Expected output structure
  5. Examples: Optional examples demonstrating expected behavior
def build_prompt(instruction, context, input_data, output_format=None, examples=None):
    prompt_parts = []

    # Role assignment (optional)
    prompt_parts.append("You are a professional data analyst.")

    # Instruction
    prompt_parts.append(f"Task: {instruction}")

    # Context
    if context:
        prompt_parts.append(f"Background: {context}")

    # Examples
    if examples:
        prompt_parts.append("Examples:")
        for ex in examples:
            prompt_parts.append(f"Input: {ex['input']}")
            prompt_parts.append(f"Output: {ex['output']}")

    # Output format
    if output_format:
        prompt_parts.append(f"Output format: {output_format}")

    # Input data
    prompt_parts.append(f"Please process the following data:\n{input_data}")

    return "\n\n".join(prompt_parts)

Zero-shot, Few-shot, and Chain-of-Thought Prompting

Zero-shot Prompting

Zero-shot prompting is the simplest approach: directly provide a task description without any examples.

zero_shot_prompt = """
Please determine the sentiment of the following sentence:
Sentence: This movie has an excellent plot and outstanding acting.
Sentiment:
"""

Use cases:

  • The model has good prior knowledge of the task
  • The task is relatively simple and straightforward
  • You need to quickly test model capabilities

Limitations:

  • Models may misunderstand complex tasks
  • Output format may not meet expectations
  • Difficult to handle tasks requiring multi-step reasoning

Few-shot Prompting

Few-shot prompting provides a small number of examples, allowing the model to learn task patterns. This is the core manifestation of In-Context Learning.

few_shot_prompt = """
Please determine the sentiment of the following sentences (positive/negative/neutral):

Example 1:
Sentence: The weather is beautiful today, sunny and bright.
Sentiment: positive

Example 2:
Sentence: This product is of poor quality and not worth the price.
Sentiment: negative

Example 3:
Sentence: It will rain tomorrow.
Sentiment: neutral

Now please determine:
Sentence: The service attitude at this restaurant is impressive.
Sentiment:
"""

Advantages of Few-shot Prompting:

  1. Pattern Learning: Models learn task patterns through examples
  2. Format Alignment: Examples demonstrate expected output format
  3. Task Adaptation: Models can quickly adapt to new tasks even without specific training

Principles for Example Selection:

  • Diversity: Cover different types of inputs
  • Representativeness: Choose typical, clearly-bounded examples
  • Relevance: Examples should be highly relevant to the target task
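One simple way to apply the relevance principle is to rank stored examples by similarity to the query. The sketch below uses plain word overlap for illustration; production systems typically use embedding similarity. The `select_examples` helper and the example pool are hypothetical, not part of any library.

```python
def select_examples(query, pool, k=2):
    """Rank candidate examples by word overlap with the query and keep the top k."""
    def overlap(a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb)
    return sorted(pool, key=lambda ex: overlap(query, ex["input"]), reverse=True)[:k]

pool = [
    {"input": "The weather is beautiful today", "output": "positive"},
    {"input": "This product is of poor quality", "output": "negative"},
    {"input": "It will rain tomorrow", "output": "neutral"},
]
chosen = select_examples("The product quality is poor", pool, k=2)
print(chosen[0]["output"])  # negative
```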
def create_few_shot_prompt(task_description, examples, query):
    prompt = f"{task_description}\n\n"
    prompt += "Examples:\n"

    for i, ex in enumerate(examples, 1):
        prompt += f"Example {i}:\n"
        prompt += f"Input: {ex['input']}\n"
        prompt += f"Output: {ex['output']}\n\n"

    prompt += f"Now please process:\nInput: {query}\nOutput:"
    return prompt

# Usage example
examples = [
    {"input": "The weather is beautiful today", "output": "positive"},
    {"input": "This product is of poor quality", "output": "negative"},
    {"input": "It will rain tomorrow", "output": "neutral"}
]

prompt = create_few_shot_prompt(
    "Please determine the sentiment of sentences",
    examples,
    "The service attitude at this restaurant is impressive"
)

Chain-of-Thought (CoT) Prompting

Chain-of-Thought prompting requires models to show their reasoning process, thinking step by step. This is particularly effective for complex reasoning tasks.

Standard CoT Prompt:

cot_prompt = """
Please solve the following math problem and show your reasoning process:

Problem: Xiao Ming has 15 apples. He gave 3 to Xiao Hong and 5 to Xiao Hua, then ate 2 himself. How many apples does Xiao Ming have left?

Reasoning:
1. Xiao Ming initially has 15 apples
2. After giving 3 to Xiao Hong: 15 - 3 = 12 apples
3. After giving 5 to Xiao Hua: 12 - 5 = 7 apples
4. After eating 2: 7 - 2 = 5 apples

Answer: 5 apples

Now please solve:
Problem: A book has 120 pages. On day 1, 30 pages were read. On day 2, twice as many as day 1 were read. On day 3, half of the remaining pages were read. How many pages were read on day 3?
"""

Few-shot CoT:

few_shot_cot = """
Please solve the following problems, showing reasoning steps:

Problem 1: A basket has 8 apples. 3 were taken away, then 2 were put back. How many are there now?
Reasoning: Initially 8, after taking 3 there are 5, after putting back 2 there are 7.
Answer: 7

Problem 2: A store has 20 items. 12 were sold, then 8 were restocked. How many are there now?
Reasoning: Initially 20, after selling 12 there are 8, after restocking 8 there are 16.
Answer: 16

Problem 3: A class has 30 students. 5 transferred out, then 7 transferred in. How many students are there now?
"""

Mathematical Expression of CoT:

For a problem $x$, CoT prompting guides the model to generate a sequence of intermediate reasoning steps $z_1, z_2, \ldots, z_n$ before the answer, so the output distribution factorizes as $p(y \mid x) = p(z_1, \ldots, z_n \mid x)\, p(y \mid x, z_1, \ldots, z_n)$, where $y$ is the final answer.

Advantages of CoT:

  1. Improved Accuracy: Showing reasoning helps models avoid errors
  2. Interpretability: Users can understand the model's thinking process
  3. Complex Reasoning: Particularly effective for multi-step reasoning tasks

Role Assignment and Formatting Techniques

Role Assignment

Assigning clear roles to models can significantly change output style and quality.

role_prompts = {
    "expert": """
You are a senior Python engineer with 20 years of experience, skilled in writing efficient and maintainable code.
Please write code for the following requirement:
""",

    "teacher": """
You are a patient programming teacher, skilled at explaining complex concepts in simple terms.
Please explain the following concept:
""",

    "analyst": """
You are a senior data analyst, skilled at discovering insights from data.
Please analyze the following data:
""",

    "creative": """
You are a creative copywriter, skilled at writing engaging copy.
Please write copy for the following product:
"""
}

Effects of Role Assignment:

  • Professionalism: Expert roles produce more professional, technical outputs
  • Style Consistency: Role assignment ensures output style meets expectations
  • Domain Adaptation: Different roles adapt to different domain tasks

Formatting Techniques

1. Structured Output

Use clear format markers to guide models to produce structured content:

structured_prompt = """
Please analyze the following article and output in the following format:

## Article Summary
[Summary content]

## Key Points
1. [Point 1]
2. [Point 2]
3. [Point 3]

## Sentiment Analysis
- Overall sentiment: [positive/negative/neutral]
- Sentiment intensity: [1-10 score]

Article content:
[Article content]
"""

2. JSON Format Output

For programmatic processing, require JSON format:

json_prompt = """
Please analyze the following text and output results in JSON format:

Requirements:
- Field names: title, summary, keywords, sentiment
- sentiment values: positive, negative, neutral

Text: [Text content]

Output format:
{
    "title": "...",
    "summary": "...",
    "keywords": ["...", "..."],
    "sentiment": "..."
}
"""

3. Step-by-step Output

For complex tasks, require step-by-step display:

step_by_step_prompt = """
Please solve the following problem, showing steps:

Step 1: [First step]
Step 2: [Second step]
Step 3: [Third step]
...
Final Answer: [Answer]

Problem: [Problem content]
"""

4. Using Delimiters

Use clear delimiters to distinguish different sections:

delimiter_prompt = """
=== Task Description ===
[Task description]

=== Input Data ===
[Data content]

=== Output Requirements ===
[Format requirements]

=== Start Processing ===
"""

Prompt Template Design

Prompt templates are reusable prompt frameworks suitable for specific types of tasks.

Classification Task Template

classification_template = """
You are a professional text classification expert.

Task: Classify text into one of the following categories: {categories}

Classification rules:
{rules}

Examples:
{examples}

Text to classify:
{text}

Please output:
Category: [Category name]
Confidence: [Value between 0-1]
Reason: [Brief explanation]
"""

Generation Task Template

generation_template = """
You are a professional {domain} content creator.

Task: {task_description}

Requirements:
- Length: Approximately {length} words
- Style: {style}
- Target audience: {audience}

Reference information:
{reference}

Please generate content:
"""

Q&A Task Template

qa_template = """
Answer questions based on the following document.

Document:
{document}

Question: {question}

Requirements:
1. Answers must be based on document content
2. If no relevant information is found in the document, answer "No relevant information found in the document"
3. Cite specific content from the document to support your answer

Answer:
"""

Code Generation Template

code_generation_template = """
You are an experienced {language} programmer.

Requirement: {requirement}

Technical requirements:
- Programming language: {language}
- Code style: {style}
- Must include: {features}

Example input/output:
{examples}

Please write code:
```{language}
[Code]
```

Code explanation: [Brief explanation of code logic]
"""


Template Parameterization Implementation

class PromptTemplate:
    def __init__(self, template: str):
        self.template = template

    def format(self, **kwargs) -> str:
        """Format template, filling in parameters"""
        return self.template.format(**kwargs)

    def format_safe(self, **kwargs) -> str:
        """Format template, raising a clear error when a parameter is missing"""
        try:
            return self.template.format(**kwargs)
        except KeyError as e:
            raise ValueError(f"Missing required template parameter: {e}")

# Usage example
template = PromptTemplate(classification_template)
prompt = template.format(
    categories="positive, negative, neutral",
    rules="Classify based on text sentiment",
    examples="Example 1: The weather is beautiful today → positive",
    text="This movie is really great"
)

Self-Consistency: Consistency Sampling

Self-Consistency is a technique to improve CoT reasoning accuracy. The core idea is: generate multiple reasoning paths for the same problem, then select the most consistent answer.

Self-Consistency Principle

Traditional CoT generates only one reasoning path, while Self-Consistency generates $k$ paths and selects the final answer through voting.

Algorithm Flow:

  1. Use CoT prompting to generate $k$ different reasoning paths
  2. Extract answers from each path
  3. Select the most frequently occurring answer (majority voting)
from collections import Counter

def self_consistency_sample(model, prompt, k=5, temperature=0.7):
    """
    Sample multiple reasoning paths using Self-Consistency

    Args:
        model: Language model
        prompt: CoT prompt
        k: Number of samples
        temperature: Sampling temperature (higher temperature increases diversity)

    Returns:
        Final answer and the answer distribution
    """
    answers = []

    for _ in range(k):
        # Use higher temperature sampling to increase diversity
        response = model.generate(
            prompt,
            temperature=temperature,
            max_tokens=500
        )
        answer = extract_answer(response)
        answers.append(answer)

    # Majority voting
    answer_counts = Counter(answers)
    final_answer = answer_counts.most_common(1)[0][0]

    return final_answer, answer_counts

def extract_answer(response):
    """Extract answer from response"""
    # Simple implementation: find content after "Answer:"
    if "Answer:" in response:
        return response.split("Answer:")[-1].strip()
    # Or find the last number
    import re
    numbers = re.findall(r'\d+', response)
    return numbers[-1] if numbers else response
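The voting step in isolation behaves like this (a toy run over hypothetical answers extracted from k=5 sampled paths):

```python
from collections import Counter

# Hypothetical answers extracted from five sampled reasoning paths
answers = ["5", "5", "4", "5", "7"]
counts = Counter(answers)
final_answer, votes = counts.most_common(1)[0]
confidence = votes / len(answers)
print(final_answer, confidence)  # 5 0.6
```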

Mathematical Expression of Self-Consistency

For a problem $x$, generate $k$ reasoning paths, the $i$-th path yielding an answer $a_i$.

The final answer is selected through majority voting: $a^{*} = \arg\max_{a} \sum_{i=1}^{k} \mathbb{1}[a_i = a]$, where $\mathbb{1}[\cdot]$ is the indicator function.

Implementation Example

class SelfConsistency:
    def __init__(self, model, k=5, temperature=0.7):
        self.model = model
        self.k = k
        self.temperature = temperature

    def solve(self, question, cot_prompt_template):
        """Solve problem using Self-Consistency"""
        prompt = cot_prompt_template.format(question=question)

        answers = []
        reasoning_paths = []

        for i in range(self.k):
            response = self.model.generate(
                prompt,
                temperature=self.temperature,
                max_tokens=500
            )

            answer = self._extract_answer(response)
            answers.append(answer)
            reasoning_paths.append(response)

        # Count answer distribution
        answer_counts = Counter(answers)
        final_answer = answer_counts.most_common(1)[0][0]
        confidence = answer_counts[final_answer] / self.k

        return {
            "answer": final_answer,
            "confidence": confidence,
            "answer_distribution": dict(answer_counts),
            "reasoning_paths": reasoning_paths
        }

    def _extract_answer(self, response):
        """Extract answer (needs customization for specific tasks)"""
        # Example: extract last number
        import re
        numbers = re.findall(r'\d+', response)
        return numbers[-1] if numbers else response.split()[-1]

# Usage example
sc = SelfConsistency(model, k=5)
result = sc.solve(
    "Xiao Ming has 15 apples, gave 3 to Xiao Hong and 5 to Xiao Hua. How many are left?",
    cot_prompt_template
)
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']:.2%}")

Advantages of Self-Consistency

  1. Improved Accuracy: Reduces errors through majority voting
  2. Robustness: Insensitive to single sampling errors
  3. Confidence Estimation: Answer distribution provides confidence information

ReAct: Combining Reasoning and Acting

ReAct (Reasoning + Acting) is a method that combines reasoning and action, particularly suitable for tasks requiring tool usage.

ReAct Framework

The core idea of ReAct is: models can call external tools during reasoning, guiding next-step reasoning through tool observations.

Basic Loop:

  1. Thought: Analyze current situation, decide next action
  2. Action: Call tool or execute operation
  3. Observation: Get action result
  4. Repeat until final answer is obtained
react_prompt = """
You can use the following tools:
- search(query): Search for information
- calculator(expression): Calculate mathematical expressions
- get_weather(city): Get weather information

Please answer in the following format:

Thought: [Your thought process]
Action: [tool_name(parameters)]
Observation: [Tool return result]
... (repeat thought-action-observation loop)
Thought: [Final thought]
Answer: [Final answer]

Question: {question}
"""

ReAct Implementation Example

class ReActAgent:
    def __init__(self, model, tools):
        self.model = model
        self.tools = tools  # Tool dictionary: name -> callable

    def solve(self, question):
        """Solve problem using ReAct method"""
        prompt = self._build_initial_prompt(question)
        max_iterations = 10

        for i in range(max_iterations):
            # Generate thought
            response = self.model.generate(prompt, max_tokens=200)

            # Parse response
            thought, action = self._parse_response(response)

            if action is None:
                # No action, might be final answer
                answer = self._extract_answer(response)
                return answer

            # Execute action
            observation = self._execute_action(action)

            # Update prompt
            tool_name, tool_args = action
            prompt += f"\nThought: {thought}\nAction: {tool_name}({tool_args})\nObservation: {observation}\n"

        return "Reached maximum iterations"

    def _build_initial_prompt(self, question):
        """Build initial prompt"""
        tool_descriptions = "\n".join([
            f"- {name}: {func.__doc__}"
            for name, func in self.tools.items()
        ])

        return f"""
You can use the following tools:
{tool_descriptions}

Please answer in the following format:
Thought: [Your thought process]
Action: [tool_name(parameters)]
Observation: [Tool return result]
... (repeat thought-action-observation loop)
Thought: [Final thought]
Answer: [Final answer]

Question: {question}
"""

    def _parse_response(self, response):
        """Parse response, extract thought and action"""
        import re

        thought_match = re.search(r'Thought: (.*?)(?=Action: |Answer: |$)', response, re.DOTALL)
        action_match = re.search(r'Action: (\w+)\((.*?)\)', response)

        thought = thought_match.group(1).strip() if thought_match else ""
        action = None

        if action_match:
            tool_name = action_match.group(1)
            tool_args = action_match.group(2).strip()
            action = (tool_name, tool_args)

        return thought, action

    def _execute_action(self, action):
        """Execute action"""
        tool_name, tool_args = action

        if tool_name not in self.tools:
            return f"Error: Tool {tool_name} does not exist"

        tool_func = self.tools[tool_name]
        try:
            result = tool_func(tool_args)
            return str(result)
        except Exception as e:
            return f"Error: {str(e)}"

    def _extract_answer(self, response):
        """Extract final answer"""
        import re
        answer_match = re.search(r'Answer: (.*?)$', response, re.DOTALL)
        return answer_match.group(1).strip() if answer_match else response

# Tool definitions
def search_tool(query):
    """Simulate search tool"""
    # In actual implementation, would call real search API
    return f"Search results: Information about '{query}'..."

def calculator_tool(expression):
    """Calculator tool"""
    # Note: eval is unsafe for untrusted input; use a proper expression parser in production
    try:
        result = eval(expression)
        return result
    except Exception:
        return "Calculation error"

# Usage example
tools = {
    "search": search_tool,
    "calculator": calculator_tool
}

agent = ReActAgent(model, tools)
answer = agent.solve("What's the weather in Beijing today? If the temperature is above 20 degrees, calculate 15 * 3")

Advantages of ReAct

  1. Tool Integration: Can call external tools to extend capabilities
  2. Dynamic Adaptation: Adjust strategy based on observation results
  3. Interpretability: Thought process is clearly visible
  4. Complex Tasks: Suitable for tasks requiring multi-step reasoning and tool usage

Building Efficient Prompt Systems

Prompt System Architecture

A complete prompt system should include:

class PromptSystem:
    def __init__(self, model):
        self.model = model       # Language model used for generation and evaluation
        self.templates = {}      # Prompt template library
        self.examples = {}       # Example library
        self.metrics = {}        # Evaluation metrics

    def register_template(self, name, template):
        """Register prompt template"""
        self.templates[name] = template

    def build_prompt(self, template_name, **kwargs):
        """Build prompt"""
        if template_name not in self.templates:
            raise ValueError(f"Template {template_name} does not exist")

        template = self.templates[template_name]
        return template.format(**kwargs)

    def evaluate(self, prompt, test_cases):
        """Evaluate prompt effectiveness"""
        results = []
        for case in test_cases:
            full_prompt = prompt + "\n\n" + case['input']
            output = self.model.generate(full_prompt)
            results.append({
                'input': case['input'],
                'expected': case['expected'],
                'actual': output,
                'match': output.strip() == case['expected'].strip()
            })

        accuracy = sum(r['match'] for r in results) / len(results)
        return {
            'accuracy': accuracy,
            'results': results
        }

Prompt Optimization Strategies

1. A/B Testing

Compare effectiveness of different prompt versions:

import numpy as np

def ab_test_prompts(prompt_a, prompt_b, test_cases, model):
    """A/B test two prompts"""
    results_a = evaluate_prompt(prompt_a, test_cases, model)
    results_b = evaluate_prompt(prompt_b, test_cases, model)

    return {
        'prompt_a': {
            'accuracy': results_a['accuracy'],
            'avg_length': np.mean([len(r['actual']) for r in results_a['results']])
        },
        'prompt_b': {
            'accuracy': results_b['accuracy'],
            'avg_length': np.mean([len(r['actual']) for r in results_b['results']])
        }
    }

2. Iterative Optimization

Continuously improve prompts based on feedback:

def iterative_optimization(initial_prompt, test_cases, model, max_iterations=5):
    """Iteratively optimize prompt"""
    current_prompt = initial_prompt
    best_accuracy = 0
    best_prompt = initial_prompt

    for iteration in range(max_iterations):
        # Evaluate current prompt
        results = evaluate_prompt(current_prompt, test_cases, model)
        accuracy = results['accuracy']

        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_prompt = current_prompt

        # Analyze error cases
        errors = [r for r in results['results'] if not r['match']]

        if not errors:
            break

        # Improve prompt based on errors (refine_prompt is a task-specific helper, not shown here)
        current_prompt = refine_prompt(current_prompt, errors)

    return best_prompt, best_accuracy

3. Prompt Combination

Combine multiple prompt techniques:

def build_advanced_prompt(question, examples, use_cot=True, use_role=True):
    """Build advanced prompt, combining multiple techniques"""
    prompt_parts = []

    # Role assignment
    if use_role:
        prompt_parts.append("You are a professional problem-solving expert.")

    # Task description
    prompt_parts.append("Please solve the following problem:")

    # Few-shot examples
    if examples:
        prompt_parts.append("Examples:")
        for ex in examples:
            prompt_parts.append(f"Problem: {ex['question']}")
            if use_cot:
                prompt_parts.append(f"Reasoning: {ex['reasoning']}")
            prompt_parts.append(f"Answer: {ex['answer']}")

    # CoT requirement
    if use_cot:
        prompt_parts.append("\nPlease show your reasoning process, then provide the answer.")

    # Current problem
    prompt_parts.append(f"\nProblem: {question}")

    return "\n".join(prompt_parts)

Practical Examples

Example 1: Sentiment Analysis System

class SentimentAnalyzer:
    def __init__(self, model):
        self.model = model
        self.prompt_template = """
You are a professional sentiment analysis expert.

Task: Analyze text sentiment

Classification criteria:
- Positive: Expresses positive emotions, satisfaction, praise, etc.
- Negative: Expresses negative emotions, dissatisfaction, criticism, etc.
- Neutral: No clear emotional tendency

Examples:
Text 1: This movie is really great, with outstanding acting and a tight plot.
Sentiment: positive
Reason: Uses positive words like "great" and "outstanding"

Text 2: The product quality is too poor and not worth the price.
Sentiment: negative
Reason: Uses negative words like "too poor" and "not worth"

Text 3: It will rain tomorrow.
Sentiment: neutral
Reason: Just states a fact, no emotional color

Please analyze the following text:
Text: {text}

Output format:
Sentiment: [positive/negative/neutral]
Confidence: [Value between 0-1]
Reason: [Brief explanation]
"""

    def analyze(self, text):
        """Analyze text sentiment"""
        prompt = self.prompt_template.format(text=text)
        response = self.model.generate(prompt)
        return self._parse_response(response)

    def _parse_response(self, response):
        """Parse response"""
        import re

        sentiment_match = re.search(r'Sentiment: (\w+)', response)
        confidence_match = re.search(r'Confidence: ([\d.]+)', response)
        reason_match = re.search(r'Reason: (.*?)(?=\n|$)', response, re.DOTALL)

        return {
            'sentiment': sentiment_match.group(1) if sentiment_match else None,
            'confidence': float(confidence_match.group(1)) if confidence_match else None,
            'reason': reason_match.group(1).strip() if reason_match else None
        }
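The parsing step can be exercised on a canned reply; the `response` string below is a hypothetical model output in the requested format, with the same regexes applied standalone:

```python
import re

# Hypothetical model reply following the requested output format
response = "Sentiment: positive\nConfidence: 0.92\nReason: Uses clearly positive wording"

sentiment_match = re.search(r'Sentiment: (\w+)', response)
confidence_match = re.search(r'Confidence: ([\d.]+)', response)
reason_match = re.search(r'Reason: (.*)', response)

parsed = {
    'sentiment': sentiment_match.group(1) if sentiment_match else None,
    'confidence': float(confidence_match.group(1)) if confidence_match else None,
    'reason': reason_match.group(1).strip() if reason_match else None,
}
print(parsed['sentiment'], parsed['confidence'])  # positive 0.92
```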

Example 2: Code Generation Assistant

class CodeGenerator:
    def __init__(self, model):
        self.model = model
        self.prompt_template = """
You are an experienced Python programmer, skilled in writing clear, efficient, and maintainable code.

Task: Generate Python code based on requirements

Requirements:
1. Code must be directly runnable
2. Include necessary comments
3. Follow PEP 8 code standards
4. Handle edge cases

Example:
Requirement: Implement a function to calculate the sum of all even numbers in a list
Code:
```python
def sum_even_numbers(numbers):
    \"\"\"
    Calculate the sum of all even numbers in a list

    Args:
        numbers: List of numbers

    Returns:
        Sum of even numbers
    \"\"\"
    return sum(num for num in numbers if num % 2 == 0)
```

Please generate code for the following requirement:
Requirement: {requirement}

Code:
"""

    def generate(self, requirement):
        """Generate code"""
        prompt = self.prompt_template.format(requirement=requirement)
        response = self.model.generate(prompt)
        return self._extract_code(response)

    def _extract_code(self, response):
        """Extract code block"""
        import re
        code_match = re.search(r'```python\n(.*?)```', response, re.DOTALL)
        if code_match:
            return code_match.group(1).strip()
        return response

Example 3: Intelligent Q&A System

class QASystem:
    def __init__(self, model):
        self.model = model
        self.knowledge_base = {}  # Knowledge base

    def add_knowledge(self, doc_id, content):
        """Add knowledge"""
        self.knowledge_base[doc_id] = content

    def answer(self, question, doc_ids=None):
        """Answer question"""
        if doc_ids is None:
            doc_ids = list(self.knowledge_base.keys())

        # Retrieve relevant documents
        relevant_docs = self._retrieve_docs(question, doc_ids)

        # Build prompt
        prompt = self._build_qa_prompt(question, relevant_docs)

        # Generate answer
        response = self.model.generate(prompt)
        return self._extract_answer(response)

    def _retrieve_docs(self, question, doc_ids, top_k=3):
        """Retrieve relevant documents (simplified implementation)"""
        # In actual implementation, could use vector retrieval
        return [self.knowledge_base[did] for did in doc_ids[:top_k]]

    def _build_qa_prompt(self, question, docs):
        """Build QA prompt"""
        doc_text = "\n\n".join([
            f"Document {i+1}: {doc}"
            for i, doc in enumerate(docs)
        ])

        return f"""
Answer questions based on the following documents.

Documents:
{doc_text}

Question: {question}

Requirements:
1. Answers must be based on document content
2. If no relevant information is found in the documents, answer "No relevant information found in the documents"
3. Cite specific content from the documents to support your answer
4. If there are conflicts between documents, please explain

Answer:
"""

    def _extract_answer(self, response):
        """Extract answer"""
        if "Answer:" in response:
            return response.split("Answer:")[-1].strip()
        return response.strip()

❓ Q&A: Common Questions About Prompt Engineering

Q1: Is a longer prompt always better?

Not necessarily. Prompts should contain all information needed to complete the task, but excessive length may:

  • Increase computational cost
  • Distract model attention
  • Exceed context window limits

Best practice: Start with a concise version, then gradually add details if results are unsatisfactory.

Q2: How many examples are needed for few-shot prompting?

Usually 2-5 examples work well. Too few may fail to convey the task pattern; too many may:

  • Waste tokens
  • Increase confusion
  • Reduce efficiency

Recommendation: Start with 2-3 examples, adjust based on results.

Q3: Is CoT effective for all tasks?

No. CoT is mainly suitable for:

  • Tasks requiring multi-step reasoning (math problems, logical reasoning)
  • Complex decision-making tasks

For simple classification or direct Q&A tasks, CoT may add noise instead.

Q4: How to choose appropriate temperature parameter?

  • Low temperature (0.1-0.3): Deterministic output, suitable for tasks requiring accuracy
  • Medium temperature (0.5-0.7): Balance creativity and accuracy
  • High temperature (0.8-1.0): Creative output, suitable for generation tasks

For Self-Consistency, usually use higher temperature (0.7-0.9) to increase diversity.

Q5: Can prompt engineering replace fine-tuning?

In some scenarios yes, but each has advantages:

  • Prompt Engineering: Fast, flexible, no training data needed
  • Fine-tuning: More professional, more stable, suitable for specific domains

Best practice: Use prompt engineering for quick validation first, consider fine-tuning when higher performance is needed.

Q6: How to handle model hallucination?

  • Require models to cite sources
  • Use few-shot examples to demonstrate expected behavior
  • Require models to mark uncertain parts
  • Combine with external knowledge bases for verification

Q7: Does example order matter in prompts?

Yes. Models may:

  • Give more weight to the first example
  • Weight the most recent examples more heavily

Recommendation: Place the most typical, clearest examples first.

Q8: How to evaluate prompt effectiveness?

  • Accuracy: Evaluate correctness on test set
  • Consistency: Consistency of results across multiple runs
  • Relevance: Match between output and task
  • Efficiency: Token usage and response time

Q9: Can multiple prompt techniques be combined?

Yes. Common combinations include:

  • Role assignment + few-shot + CoT
  • Few-shot + Self-Consistency
  • ReAct + tool calling

Note: Combinations may increase complexity, need to balance effectiveness and cost.

Q10: What are the limitations of prompt engineering?

  • Depends on model capability: Base models have limited capabilities
  • Context window limits: Cannot handle overly long inputs
  • Instability: Same prompt may produce different results
  • Requires manual debugging: Optimization process is time-consuming

Future directions: Automated prompt optimization, better evaluation metrics, combination with fine-tuning.

  • Post title:NLP (7): Prompt Engineering and In-Context Learning
  • Post author:Chen Kai
  • Create time:2024-03-09 10:45:00
  • Post link:https://www.chenk.top/en/nlp-prompt-engineering-icl/
  • Copyright Notice:All articles in this blog are licensed under BY-NC-SA unless stating additionally.