Pragna-1B Architecture Overview
1. Transformer-based model (inspired by TinyLlama; a full configuration sketch follows this list):
- Layers: 22
- Attention Heads: 32
- Context Length: 2048 tokens
- Hidden Dimension: 2048
- Expansion Dimension: 5632
- Vocabulary Size: 69632
2. Rotary Positional Embedding (RoPE): encodes positional information with a frequency base of 10,000 (sketched after this list).
3. Normalization: RMSNorm (Root Mean Square Layer Normalization) with epsilon 1e-5.
4. Activation Function: SiLU (Sigmoid Linear Unit), sketched together with RMSNorm after this list.
5. Grouped Query Attention: shares each key/value head across a group of query heads, improving training speed and memory efficiency and allowing inference on lower-compute devices (sketched after this list).
6. Trained on GenAI Studio: Proprietary platform for scaling models across GPUs/accelerators with fault tolerance.
7. Development Tools:
- Triton (OpenAI): a language and compiler for writing high-performance GPU kernels.
- Fully Sharded Data Parallel (FSDP): shards parameters, gradients, and optimizer state across devices for distributed training (a minimal wrapping sketch follows this list).
- FlashAttention-2: speeds up attention computation during training and inference.
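The hyperparameters above map directly onto the standard Llama-family configuration that TinyLlama uses. The following is a minimal sketch with Hugging Face's LlamaConfig, under the assumption that Pragna-1B keeps that layout; the key/value-head count for Grouped Query Attention is not published above, so it is left as a labeled placeholder.

```python
from transformers import LlamaConfig

# Pragna-1B-style configuration assembled from the spec above.
# Assumes the standard Llama/TinyLlama architecture family.
config = LlamaConfig(
    num_hidden_layers=22,          # Layers
    num_attention_heads=32,        # Attention heads
    max_position_embeddings=2048,  # Context length
    hidden_size=2048,              # Hidden dimension
    intermediate_size=5632,        # Expansion (feed-forward) dimension
    vocab_size=69632,              # Vocabulary size
    rope_theta=10000.0,            # Rotary positional encoding base
    rms_norm_eps=1e-5,             # RMSNorm epsilon
    hidden_act="silu",             # SiLU activation
    # num_key_value_heads=...,     # GQA is confirmed, but the KV-head
    #                              # count is not public; set if known.
)
```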
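Rotary positional embeddings encode position by rotating each pair of channels in the query and key vectors through a position-dependent angle, with per-pair frequencies derived from the base of 10,000. A minimal PyTorch sketch of the technique (Pragna-1B's own implementation is not public):

```python
import torch

def rope_cos_sin(head_dim: int, seq_len: int, base: float = 10000.0):
    """Precompute per-position cos/sin tables for rotary embeddings."""
    # One frequency per channel pair; later pairs rotate more slowly.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq, head_dim/2)
    return angles.cos(), angles.sin()

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate channel pairs of x, shaped (batch, seq, heads, head_dim)."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = cos[None, :, None, :], sin[None, :, None, :]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```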
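RMSNorm rescales each hidden vector by its root mean square (no mean subtraction or bias term), and SiLU computes x * sigmoid(x). In the Llama/TinyLlama family, the activation and the 5,632-wide expansion combine in a gated (SwiGLU-style) feed-forward block; the sketch below assumes Pragna-1B does the same.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal root mean square of the features.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class GatedFeedForward(nn.Module):
    """SwiGLU-style FFN as in Llama-family models (assumed for Pragna-1B)."""
    def __init__(self, hidden: int = 2048, expansion: int = 5632):
        super().__init__()
        self.gate = nn.Linear(hidden, expansion, bias=False)
        self.up = nn.Linear(hidden, expansion, bias=False)
        self.down = nn.Linear(expansion, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SiLU(x) = x * sigmoid(x) gates the expanded representation.
        return self.down(F.silu(self.gate(x)) * self.up(x))
```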
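Grouped Query Attention computes fewer key/value heads than query heads and shares each K/V head across a group of query heads, which cuts KV-cache memory during inference. A minimal sketch follows; the 32-to-4 head ratio is illustrative, since Pragna-1B's group count is not published.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim).
    n_q_heads must be a multiple of n_kv_heads."""
    groups = q.shape[1] // k.shape[1]
    # Replicate each K/V head across its group of query heads.
    k = k.repeat_interleave(groups, dim=1)
    v = v.repeat_interleave(groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Illustrative shapes: 32 query heads sharing 4 KV heads (hypothetical count).
q = torch.randn(1, 32, 128, 64)
k = torch.randn(1, 4, 128, 64)
v = torch.randn(1, 4, 128, 64)
out = grouped_query_attention(q, k, v)  # -> (1, 32, 128, 64)
```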
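For scale-out training, FSDP shards a model's parameters, gradients, and optimizer state across workers. A minimal wrapping sketch (not Soket's actual training code), assuming it is launched with torchrun on GPU machines:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launch with `torchrun` so rank/world-size environment variables are set.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Stand-in module; the real model would be the full transformer.
model = torch.nn.Linear(2048, 2048).cuda()
model = FSDP(model)  # parameters/gradients/optimizer state are sharded
```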
Soket AI Partners With Google Cloud To Launch Multilingual AI Model
Indian AI is taking a big step forward with the introduction of Pragna-1B, a collaboration between Soket AI Labs, a leading Indian AI research firm, and global tech giant Google Cloud. Designed specifically to bridge the language gap in India, Pragna-1B is billed as India’s first open-source multilingual AI model, giving developers cutting-edge Machine Learning (ML) and Natural Language Processing (NLP) capabilities.
Read In Short:
- Soket AI Labs partners with Google Cloud to unveil Pragna-1B, India’s first open-source multilingual AI model.
- Pragna-1B provides developers with advanced multilingual natural language processing capabilities, catering to Hindi, English, Bengali, and Gujarati.
- The open-source nature of Pragna-1B fosters collaboration and accelerates the development of vernacular-language AI solutions in India.