Pragna-1B Architecture Overview
1. Transformer-based model (inspired by TinyLlama; a full configuration sketch follows this list):
- Layers: 22
- Attention Heads: 32
- Context Length: 2048 tokens
- Hidden Dimension: 2048
- Expansion Dimension: 5632
- Vocabulary Size: 69632
2. Rotary Positional Embedding (RoPE): encodes positional information with a frequency base of 10,000 (sketched after this list).
3. Normalization: RMSNorm (Root Mean Square Layer Normalization) with epsilon 1e-5.
4. Activation Function: SiLU (Sigmoid Linear Unit), sketched together with RMSNorm after this list.
5. Grouped Query Attention: shares each key/value head across a group of query heads, improving training speed and memory efficiency and allowing inference on lower-compute devices (sketched after this list).
6. Trained on GenAI Studio: Proprietary platform for scaling models across GPUs/accelerators with fault tolerance.
7. Development Tools:
- Triton (OpenAI): a language and compiler for writing high-performance GPU kernels.
- Fully Sharded Data Parallel (FSDP): shards parameters, gradients, and optimizer state across devices for distributed training (a minimal wrapping sketch follows this list).
- FlashAttention-2: speeds up attention computation during training and inference.
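The hyperparameters above map directly onto the standard Llama-family configuration that TinyLlama uses. The following is a minimal sketch with Hugging Face's LlamaConfig, under the assumption that Pragna-1B keeps that layout; the key/value-head count for Grouped Query Attention is not published above, so it is left as a labeled placeholder.

```python
from transformers import LlamaConfig

# Pragna-1B-style configuration assembled from the spec above.
# Assumes the standard Llama/TinyLlama architecture family.
config = LlamaConfig(
    num_hidden_layers=22,          # Layers
    num_attention_heads=32,        # Attention heads
    max_position_embeddings=2048,  # Context length
    hidden_size=2048,              # Hidden dimension
    intermediate_size=5632,        # Expansion (feed-forward) dimension
    vocab_size=69632,              # Vocabulary size
    rope_theta=10000.0,            # Rotary positional encoding base
    rms_norm_eps=1e-5,             # RMSNorm epsilon
    hidden_act="silu",             # SiLU activation
    # num_key_value_heads=...,     # GQA is confirmed, but the KV-head
    #                              # count is not public; set if known.
)
```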
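Rotary positional embeddings encode position by rotating each pair of channels in the query and key vectors through a position-dependent angle, with per-pair frequencies derived from the base of 10,000. A minimal PyTorch sketch of the technique (Pragna-1B's own implementation is not public):

```python
import torch

def rope_cos_sin(head_dim: int, seq_len: int, base: float = 10000.0):
    """Precompute per-position cos/sin tables for rotary embeddings."""
    # One frequency per channel pair; later pairs rotate more slowly.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq, head_dim/2)
    return angles.cos(), angles.sin()

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate channel pairs of x, shaped (batch, seq, heads, head_dim)."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = cos[None, :, None, :], sin[None, :, None, :]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```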
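RMSNorm rescales each hidden vector by its root mean square (no mean subtraction or bias term), and SiLU computes x * sigmoid(x). In the Llama/TinyLlama family, the activation and the 5,632-wide expansion combine in a gated (SwiGLU-style) feed-forward block; the sketch below assumes Pragna-1B does the same.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale by the reciprocal root mean square of the features.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class GatedFeedForward(nn.Module):
    """SwiGLU-style FFN as in Llama-family models (assumed for Pragna-1B)."""
    def __init__(self, hidden: int = 2048, expansion: int = 5632):
        super().__init__()
        self.gate = nn.Linear(hidden, expansion, bias=False)
        self.up = nn.Linear(hidden, expansion, bias=False)
        self.down = nn.Linear(expansion, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SiLU(x) = x * sigmoid(x) gates the expanded representation.
        return self.down(F.silu(self.gate(x)) * self.up(x))
```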
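Grouped Query Attention computes fewer key/value heads than query heads and shares each K/V head across a group of query heads, which cuts KV-cache memory during inference. A minimal sketch follows; the 32-to-4 head ratio is illustrative, since Pragna-1B's group count is not published.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor):
    """q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim).
    n_q_heads must be a multiple of n_kv_heads."""
    groups = q.shape[1] // k.shape[1]
    # Replicate each K/V head across its group of query heads.
    k = k.repeat_interleave(groups, dim=1)
    v = v.repeat_interleave(groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Illustrative shapes: 32 query heads sharing 4 KV heads (hypothetical count).
q = torch.randn(1, 32, 128, 64)
k = torch.randn(1, 4, 128, 64)
v = torch.randn(1, 4, 128, 64)
out = grouped_query_attention(q, k, v)  # -> (1, 32, 128, 64)
```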
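For scale-out training, FSDP shards a model's parameters, gradients, and optimizer state across workers. A minimal wrapping sketch (not Soket's actual training code), assuming it is launched with torchrun on GPU machines:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launch with `torchrun` so rank/world-size environment variables are set.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Stand-in module; the real model would be the full transformer.
model = torch.nn.Linear(2048, 2048).cuda()
model = FSDP(model)  # parameters/gradients/optimizer state are sharded
```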
Soket AI Partners With Google Cloud To Launch Multilingual AI Model
Indian AI is taking a big step forward with the introduction of Pragna-1B, a collaboration between Soket AI Labs, a leading Indian AI research firm, and global tech giant Google Cloud. Designed specifically to bridge the language gap in India, Pragna-1B is billed as India’s first open-source multilingual AI model, giving developers cutting-edge Machine Learning (ML) and Natural Language Processing (NLP) capabilities.
Read In Short:
- Soket AI Labs partners with Google Cloud to unveil Pragna-1B, India’s first open-source multilingual AI model.
- Pragna-1B provides developers with advanced multilingual natural language processing capabilities, catering to Hindi, English, Bengali, and Gujarati.
- The open-source nature of Pragna-1B fosters collaboration and accelerates the development of vernacular-language AI solutions in India.