Zero Shot and Prompt Engineering#

Zero Shot and Few Shot Learners#

Zero-shot and few-shot learners are machine learning techniques that allow models to generalize and adapt to new tasks with little or no training data. These approaches are particularly useful in situations where obtaining labeled data is difficult or expensive. They are inspired by the human ability to learn new concepts and skills quickly, even from a few examples.

Zero-shot learning (ZSL) refers to the ability of a model to perform a task without having seen any examples from that specific task during training. It relies on the model’s capacity to generalize from related tasks or to leverage auxiliary information such as relationships between classes, attributes, or semantic information. For example, a zero-shot image classification model might recognize a never-seen-before object by leveraging its understanding of similar objects or associated attributes.
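
To make the attribute-based idea concrete, here is a minimal sketch (with invented class names and attribute values) in which each class is described by a hand-crafted attribute vector, and an unseen example is assigned to the class whose attributes best match the attributes predicted for it:

import numpy as np

# Hand-crafted attribute descriptions: [has_stripes, has_long_neck, is_domesticated]
class_attributes = {
    "zebra": np.array([1.0, 0.0, 0.0]),
    "giraffe": np.array([0.0, 1.0, 0.0]),
    "horse": np.array([0.0, 0.0, 1.0]),
}

# Pretend an upstream model predicted these attributes for an image of an unseen class.
predicted = np.array([0.9, 0.1, 0.2])


def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Assign the class whose attribute description is closest to the predicted attributes.
print(max(class_attributes, key=lambda c: cosine(class_attributes[c], predicted)))  # "zebra"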

Few-shot learning (FSL), on the other hand, involves adapting a model to a task using only a small number of examples (typically fewer than 10) from the target task, in contrast to traditional machine learning methods that often require large amounts of labeled data. Few-shot learning aims to adapt quickly to new tasks by leveraging prior knowledge and transfer learning; meta-learning and memory-augmented neural networks are common approaches, both focused on learning models that adapt rapidly from limited data.
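
In the same spirit, the sketch below shows a simple nearest-class-mean ("prototype") classifier of the kind used in metric-based meta-learning: each class is summarized by the mean of its few support examples, and a query is labeled with the closest prototype. The embeddings here are random stand-ins for features produced by a pre-trained encoder.

import numpy as np

rng = np.random.default_rng(0)

# Three support examples ("shots") per class, standing in for encoder embeddings.
support = {
    "cat": rng.normal(loc=0.0, size=(3, 8)),
    "dog": rng.normal(loc=3.0, size=(3, 8)),
}

# Each class is represented by the mean of its few examples (its prototype).
prototypes = {label: shots.mean(axis=0) for label, shots in support.items()}

# A new query embedding gets the label of the nearest prototype.
query = rng.normal(loc=3.0, size=8)  # drawn near the "dog" cluster
print(min(prototypes, key=lambda label: np.linalg.norm(query - prototypes[label])))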

Both zero-shot and few-shot learners are essential components in the pursuit of artificial general intelligence (AGI), as they enable models to learn efficiently and generalize effectively across various tasks and domains with minimal data.

Prompt Engineering#

Prompt engineering is the process of designing and refining inputs (prompts) for language models like GPT-3 or GPT-4 to obtain better, more accurate, and contextually relevant outputs. Since large-scale language models are trained to generate text based on the input they receive, crafting effective prompts can significantly influence the quality of the generated responses.

The main goal of prompt engineering is to maximize the usefulness and relevance of a language model’s output by considering factors such as clarity, specificity, context, and constraints. It often involves an iterative process of experimentation and fine-tuning to arrive at the optimal prompt. Some strategies used in prompt engineering include:

  • Clarity: Ensure that the prompt is clear and unambiguous, which helps the language model understand the question or context better.

  • Specificity: Make the prompt more specific by including relevant details or asking for specific information. This can help narrow down the potential responses and avoid generic answers.

  • Context: Provide sufficient context to guide the model in generating responses that are relevant to the given situation or domain.

  • Constraints: Apply constraints on the response format or content, such as specifying the desired answer type (e.g., a list or a single word) or limiting the length of the response.

  • Redundancy: Ask the same question in multiple ways or incorporate different perspectives to increase the likelihood of obtaining accurate and comprehensive answers.

  • Instructiveness: Use explicit instructions or guiding questions to direct the model’s attention to the relevant aspects of the problem or task.

Prompt engineering is an essential skill in working with language models as it helps to bridge the gap between the model’s training data and the desired output for specific use cases. By carefully crafting and refining prompts, users can improve the reliability and usefulness of generated responses, making the models more effective across a wide range of applications.
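
As a rough illustration (the helper function and wording here are made up for this example), several of these strategies, namely context, specificity, and explicit constraints, can be combined into a single prompt template:

# Illustrative prompt template combining context, specificity, and constraints.
def build_prompt(context, question, answer_format, max_words):
    return (
        f"Context: {context}\n"  # provide relevant background
        f"Question: {question}\n"  # ask for something specific
        f"Answer as {answer_format}, in at most {max_words} words."  # constrain the output
    )


prompt = build_prompt(
    context="You are reviewing customer feedback for an online bookstore.",
    question="What are the three most common complaints in the reviews below?",
    answer_format="a numbered list",
    max_words=60,
)
print(prompt)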

Prompting on LLMs#

Large Language Models (LLMs) like GPT-3 or GPT-4 can be used for various tasks, including zero-shot, one-shot, and few-shot learning. Here are some examples using text prompts for each case:

Zero-shot learning:#

Task: Sentiment analysis (positive or negative) for a movie review

Prompt:

Determine the sentiment of the following movie review: "I absolutely loved the movie! The storyline was captivating, and the acting was superb. I can't wait to watch it again!"

Since the LLM has been pre-trained on a diverse range of texts, it should be able to classify the sentiment of the review without any additional examples.

One-shot learning:#

Task: Animal classification based on a brief description

Prompt:

Based on the given description, classify the animal:
Example: "This animal has a long neck and long legs. It mainly eats leaves from trees.": Giraffe

Description: "This small creature has a bushy tail, sharp claws, and climbs trees to collect nuts."

By providing an example in the prompt, the LLM can use this context to generate an appropriate classification for the given description.

Few-shot learning:#

Task: Convert a sentence from active to passive voice

Prompt:

Transform the following sentences from active to passive voice:
Example 1: "John painted the house." -> "The house was painted by John."
Example 2: "She baked a cake." -> "A cake was baked by her."

Sentence: "The cat chased the mouse."

By providing multiple examples, the LLM can better understand the desired transformation and apply it to the input sentence.

In each case, the prompt is designed to guide the LLM to perform the desired task with varying amounts of examples. The specific syntax for providing these prompts to an LLM like GPT-3 or GPT-4 depends on the API or library you are using, but the core idea remains the same: designing effective prompts to maximize the usefulness and relevance of the generated outputs.

# Zero-shot Learning with Hugging Face's Transformers
from transformers import pipeline

review = "I absolutely loved the movie! The storyline was captivating, and the acting was superb. I can't wait to watch it again!"

# Initialize the zero-shot classification pipeline
classifier = pipeline("zero-shot-classification")

# Classify sentiment using the pipeline
result = classifier(review, candidate_labels=["positive", "negative"])
print(result["labels"][0])
No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
positive
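
As the warning above indicates, the pipeline silently falls back to facebook/bart-large-mnli when no model is given. Pinning the checkpoint explicitly, as in this short variant, makes the result reproducible:

from transformers import pipeline

review = "I absolutely loved the movie! The storyline was captivating, and the acting was superb."

# Specify the checkpoint explicitly instead of relying on the pipeline default.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(review, candidate_labels=["positive", "negative"])
print(result["labels"][0])  # expected: "positive"
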
# One-shot Learning
from transformers import GPT2LMHeadModel, GPT2Tokenizer

text = """
Answer the following geography-related question:
Question: "What is the capital city of France?": 
Answer: Paris

Question: "What is the highest mountain in the world?"
"""

# Initialize the GPT-2 model and tokenizer
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Generate a response using the model
input_ids = tokenizer.encode(text, return_tensors="pt")
output = model.generate(input_ids, max_length=50)
response = tokenizer.decode(output[0])

print(response)
Answer the following geography-related question:
Question: "What is the capital city of France?": 
Answer: Paris

Question: "What is the highest mountain in the world?"

Answer: Mount Everest

Question
# Few-shot Learning
from transformers import GPT2LMHeadModel, GPT2Tokenizer

text = """
Transform the following sentences from active to passive voice:
Example 1: "John painted the house." -> "The house was painted by John."
Example 2: "She baked a cake." -> "A cake was baked by her."

Example 3: "The cat chased the mouse."
"""

# Initialize the GPT-2 model and tokenizer
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Generate a response using the model
input_ids = tokenizer.encode(text, return_tensors="pt")
output = model.generate(input_ids, max_length=100)
response = tokenizer.decode(output[0])

print(response)
Transform the following sentences from active to passive voice:
Example 1: "John painted the house." -> "The house was painted by John."
Example 2: "She baked a cake." -> "A cake was baked by her."

Example 3: "The cat chased the mouse."

Example 4: "The cat was a cat." -> "The cat was a cat."

Example 5: "The cat was a cat." -> "The cat was a cat
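
As the generated text shows, the small base GPT-2 model drifts off the pattern and starts repeating itself instead of producing the passive-voice transformation; few-shot prompts of this kind work far more reliably with larger models such as GPT-3 or GPT-4.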

Zero-Shot Reasoners and Chain-of-Thought Prompting#

Researchers from the University of Tokyo and Google Brain found that large language models (LLMs) possess inherent zero-shot abilities on high-level cognitive tasks, and that these abilities can be elicited with a method called Chain-of-Thought (CoT) prompting.

Further research by the Google Brain team delved into CoT prompting and found that generating a series of intermediate reasoning steps, or a “chain-of-thought,” significantly improves LLMs’ complex reasoning capabilities. Experiments conducted on three LLMs showed that CoT prompting enhances performance in arithmetic, common sense, and symbolic reasoning tasks.

Here’s an example to illustrate CoT prompting:

  • Q: Jane has 7 books on her shelf. She borrowed 4 books from the library. How many books does she have now?

  • A: Jane had 7 books initially. She borrowed 4 books from the library. So, 7 + 4 = 11. The answer is 11.

And another example:

  • Q: A shop had 15 umbrellas. 8 umbrellas were sold, and they restocked 5 more. How many umbrellas do they have now?

  • A: The shop had 15 umbrellas originally. They sold 8 umbrellas, leaving 15 - 8 = 7. They restocked 5 more umbrellas, so they have 7 + 5 = 12. The answer is 12.

In summary, chain-of-thought reasoning enables models to break down complex problems into smaller, manageable steps that can be solved individually. The language-based nature of CoT prompting makes it applicable to any task that can be solved through language. Empirical experiments have shown that CoT prompting improves performance across various reasoning tasks, and successful chain-of-thought reasoning emerges as models scale up.

Here’s a Python example using Hugging Face Transformers to demonstrate Zero-Shot Reasoners and Chain-of-Thought Prompting with the GPT-2 model:

Import libraries and prepare the model

The following code imports necessary libraries, TensorFlow and Hugging Face Transformers, to work with the GPT-2 model. It initializes a GPT-2 tokenizer and loads the pre-trained GPT-2 model from Hugging Face’s model hub. The EOS token is set as the PAD token to avoid warnings during tokenization and padding.

import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer


def generate_chain_of_thought(prompt, model, tokenizer, max_length=100):
    input_ids = tokenizer.encode(prompt, return_tensors="tf")
    generated_text = model.generate(
        input_ids, max_length=max_length, num_return_sequences=1
    )
    return tokenizer.decode(generated_text[0], skip_special_tokens=True)


tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)

# Example problem
problem = "Jane has 7 books on her shelf. She borrowed 4 books from the library. How many books does she have now?"

# Chain-of-Thought Prompt
cot_prompt = "To find out the total number of books Jane has now, we can first count the number of books on her shelf, which is 7. Then we count the number of books she borrowed from the library, which is 4. Now, we add these two quantities together. What is 7 + 4?"

# Generate the answer
answer = generate_chain_of_thought(cot_prompt, model, tokenizer)
print(answer)
To find out the total number of books Jane has now, we can first count the number of books on her shelf, which is 7. Then we count the number of books she borrowed from the library, which is 4. Now, we add these two quantities together. What is 7 + 4? Well, it's the number of books she borrowed from the library. So, if you have a library of about 500 books, you have about 7 books on your shelf. So, if you have
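
The small GPT-2 model echoes the wording of the prompt but never actually completes the arithmetic. The next snippet sends the same kind of chain-of-thought prompt to a larger hosted model through the (legacy) OpenAI completions API with the text-davinci-003 engine, and then extracts the first number from the generated answer with a regular expression.
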
# You have to install the OpenAI Python library by running the following command:
# pip install openai
import openai
import re

openai.api_key = "API_KEY"


def generate_chain_of_thought(prompt, model_engine="text-davinci-003", max_tokens=100):
    response = openai.Completion.create(
        engine=model_engine,
        prompt=prompt,
        max_tokens=max_tokens,
        n=1,
        stop=None,
        temperature=0.5,
    )
    return response.choices[0].text.strip()


# Example problem
problem = "Jane has 7 books on her shelf. She borrowed 4 books from the library. How many books does she have now?"

# Chain-of-Thought Prompt
cot_prompt = f"Jane had 7 books initially. She borrowed 4 books from the library. What is the total number of books Jane has now? 7 + 4 ="

# Generate the answer
answer_text = generate_chain_of_thought(cot_prompt)
matches = re.findall(r"\d+", answer_text)
answer = matches[0] if matches else "unknown"
print(answer)
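
Because the prompt already lays out the reasoning and ends with "7 + 4 =", the completion should contain the number 11, which the regular expression extracts; if no digit is found, the script falls back to printing "unknown".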