Text Generation using Falcon 7B

Let us see how we can use Falcon 7B for text generation.

1. Install necessary libraries

We install the accelerate package, a Hugging Face library that lets PyTorch code run on different hardware setups (CPU, a single GPU, or multiple GPUs) with minimal changes. The transformers library relies on it when a model is loaded with device_map="auto", as is done in step 3, to place the model weights on the available devices. The torch and transformers packages are also required; install them as well if they are not already present in your environment.

!pip install accelerate
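
Before moving on, it can help to confirm that the packages this walkthrough depends on are actually importable. The following is a minimal sketch using only the standard library; the package list and the helper name find_missing are illustrative assumptions, not part of the original tutorial.

```python
import importlib.util

# Packages the walkthrough relies on (assumed list; adjust to your setup).
REQUIRED_PACKAGES = ["torch", "transformers", "accelerate"]


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, install it with pip in the same way as accelerate above.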


2. Import the libraries

For text generation, we need torch (PyTorch) for the tensor data types and the transformers library, which handles loading of the pre-trained model, tokenization, and the generation pipeline.

Python3

# PyTorch supplies the bfloat16 data type used when loading the model
import torch
# transformers supplies the tokenizer and the text generation pipeline
import transformers


3. Initiate the Model

This code initializes a text generation pipeline using the “tiiuae/falcon-7b-instruct” model from the Hugging Face Hub. It first loads the tokenizer for the model and then creates the pipeline with three notable parameters: torch_dtype=torch.bfloat16 loads the weights in 16-bit bfloat16 precision, which roughly halves memory use compared with 32-bit floats; trust_remote_code=True allows custom model code hosted in the model repository to be executed; and device_map="auto" places the model on the available devices automatically. The pipeline is then ready to generate text from input prompts.

Python3

# Load the tokenizer that matches the Falcon 7B instruct model
tokenizer = transformers.AutoTokenizer.from_pretrained("tiiuae/falcon-7b-instruct")
 
# Create a text generation pipeline
generator = transformers.pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
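
To see why loading in bfloat16 matters, here is a rough back-of-the-envelope estimate of the memory needed just to hold the model weights. It is a sketch under stated assumptions: a nominal 7 billion parameters, weights only (no activations or generation cache), and a hypothetical helper named weight_memory_gb.

```python
# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Approximate memory (in GB) needed to store the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal parameter count for a "7B" model (assumption, not an exact figure).
NUM_PARAMS = 7e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.0f} GB")
```

Under these assumptions the weights need about 28 GB in float32 but about 14 GB in bfloat16, which is why the half-precision setting makes the model far easier to fit on a single GPU.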


4. Use the model for text generation

This code snippet generates text with the pipeline (generator) created in the previous step. It prompts the model with the input text “What is the purpose of life” and sets the following parameters: max_length=200 caps the total sequence length in tokens, counting the prompt as well as the generated text; do_sample=True makes the model sample the next token instead of always picking the most likely one; top_k=10 restricts sampling to the 10 most likely next tokens; num_return_sequences=1 asks for a single generated sequence; and eos_token_id passes the tokenizer's end-of-sequence token ID so that generation can stop when that token is produced. The results are stored in the text_sequences variable, and a loop then prints each generated sequence with the label “Result”.

Python3

# Generate text sequences
text_sequences = generator(
    "What is the purpose of life",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
 
# Print the generated text sequences
for i in text_sequences:
    print(f"Result: {i['generated_text']}")

Output

The purpose of life is a highly debated topic, with different individuals or cultures providing unique perspectives on its meaning. From a philosophical standpoint, it is often perceived as a subjective question as it depends on an individual's understanding and beliefs. However, some argue that the purpose of life is to find meaning and fulfillment, cultivate relationships, make a positive impact on the world, or simply to live a content and happy life. Ultimately, it is up to each individual to provide their own answer and purpose.
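
To make the do_sample and top_k settings more concrete, the following is a small, self-contained sketch of top-k sampling over a toy list of scores. It illustrates the general idea only and is not the transformers implementation; the function name top_k_sample and the toy logits are assumptions made for this example.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample one index from the k highest-scoring entries of `logits`."""
    # Keep only the indices of the k largest logits.
    top_indices = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits (subtracting the max for numerical stability).
    max_logit = max(logits[i] for i in top_indices)
    weights = [math.exp(logits[i] - max_logit) for i in top_indices]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top_indices, weights=weights, k=1)[0]


# Toy scores for a five-token vocabulary (hypothetical values).
toy_logits = [2.0, 0.5, 1.0, -1.0, 3.0]
rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
samples = [top_k_sample(toy_logits, k=2, rng=rng) for _ in range(1000)]
print(sorted(set(samples)))
```

With k=2, only the two highest-scoring positions (indices 0 and 4 here) can ever be drawn, no matter how many samples are taken. A larger top_k admits more candidates and tends to make the output more varied.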


Falcon LLM: Comprehensive Guide

Falcon LLM is a large language model engineered to comprehend and generate human-like text, with notable improvements in natural language understanding and generation capabilities. This article covers the fundamentals of Falcon LLM and demonstrates how to perform text generation with it.

Table of Contents

  • What is Falcon LLM?
  • Key Features of Falcon LLM
  • Design Philosophy of Falcon LLM
  • Key Model components of Falcon LLM
  • Limitation
  • Text Generation using Falcon 7B

Falcon LLM aims to set new benchmarks in AI’s ability to interact, reason, and assist in a variety of complex tasks, promising transformative impacts across industries and research domains.

A Large Language Model (LLM) is a very large model (in terms of parameter count), generally based on the transformer architecture (a type of neural network capable of parallel processing through the self-attention mechanism), that is trained on massive amounts of text data, which helps it understand and generate human-like text. Some well-known examples are GPT-3, Google Bard, and PaLM. Although such models are available to the public for inference, how they were trained is not documented in detail. Open-source LLMs have traditionally lagged behind these private/commercial models in both performance and size. The lack of detailed documentation about the training process of successful large-scale models limits the research and progress of open-source models.

Let us get an understanding of the key components of the Falcon Model.

What is Falcon LLM?

Falcon is an open-source family of models released by the Technology Innovation Institute (TII) of the UAE. The family currently comprises models of four sizes: 1.8B, 7B, 40B, and 180B. Unlike many other popular LLMs, the Falcon models are freely available under open licenses for further development. The dataset used for training, the design principles behind the model, and the training process are documented in detail....

Key Features of Falcon LLM

  • Falcon models are causal decoders based on the transformer's decoder architecture, trained on a diverse, high-quality dataset collected from web data.
  • The Falcon 7B and 40B models are released under the Apache 2.0 license, making them freely accessible for both research and commercial use; Falcon-180B is distributed under its own TII license, which attaches some conditions to its use.
  • Falcon models are reported to perform competitively with recent state-of-the-art models such as GPT-4 and LLaMA 2 on tasks such as text generation, translation, question answering, and code generation.
  • The Falcon-180B model achieves near-PaLM-2-Large performance at a reduced pretraining and inference cost, placing it among the top language models globally.
  • Falcon models have limited multilingual capabilities, as they are trained primarily on English and on datasets in European languages such as German, Spanish, and French.
  • The Falcon team claims that their models require less memory than other models of similar size, making them more accessible.
  • Falcon-180B, the largest model, has been trained on over 3.5 trillion tokens of text, representing the largest openly documented pretraining run....
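
As a brief illustration of what "causal decoder" means, the sketch below builds the attention mask such a model uses: each position may attend only to itself and to earlier positions, never to later ones. This is a generic illustration in plain Python, not Falcon's actual code, and the function name causal_mask is an assumption made for this example.

```python
def causal_mask(seq_len):
    """Return a seq_len x seq_len grid where [i][j] is True if position i may attend to position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


# Print the mask for a 4-token sequence: "x" = allowed, "." = blocked.
for row in causal_mask(4):
    print(" ".join("x" if allowed else "." for allowed in row))
```

The printed grid is lower-triangular. Because no token can see the tokens that follow it, the model can be trained to predict the next token and then generate text one token at a time.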

Design Philosophy of Falcon LLM

The designers of the Falcon models focused on scalability along the following three axes, which became their design philosophy....

Key Model components of Falcon LLM

Let us look at the key model design choices that worked for the Falcon team. Note that the architectural designs below are not unique to Falcon and were not invented by the Falcon team; they were already in the public domain. The Falcon team tried various combinations and found that the ones below worked best for them. The evaluation criteria followed their design philosophy: a design choice had to improve model performance while keeping the model scalable and cost/memory efficient....

Limitation

The key limitation of the Falcon models is their limited language support, as they are proficient mainly in English, German, Spanish, and French. Support for other languages is less robust, which limits their global accessibility....

Conclusion

...
