The Ministry of Electronics and Information Technology (MeitY) has accelerated the deployment of ‘AI Kosh’, a centralized national library designed to consolidate non-personal datasets for large language model (LLM) training. As of early 2026, the platform has scaled to host over 5,500 curated datasets and 251 pre-trained models, addressing the ‘data poverty’ that previously hindered domestic AI development. By providing access to high-fidelity metadata from sectors like agriculture and healthcare, along with data in 12 Indic languages via the Bhashini division, the government aims to facilitate the creation of indigenous foundational models that are culturally and linguistically nuanced.
Strategic infrastructure and economic projection
To complement the data repository, the IndiaAI Mission has expanded its compute capacity to 38,000 GPUs, offering them to researchers and startups at a subsidized rate of approximately Rs 65 per hour. This integrated approach is vital for the 22 Indian firms currently developing LLMs, including BharatGen and Sarvam AI. “Sovereign AI isn't just about software; it's about owning the data-to-compute pipeline,” noted a senior MeitY official. With the AI sector projected to contribute nearly $500 billion to India’s GDP by 2027, the success of AI Kosh is viewed as a prerequisite for achieving global leadership under the ‘AI for All’ initiative.
Overcoming data poisoning and governance hurdles
Despite the infrastructure gains, the project faces critical challenges regarding data integrity and cybersecurity. Experts warn of ‘model poisoning’ risks, where manipulated inputs could degrade the accuracy of public LLMs. Furthermore, while the Digital Personal Data Protection (DPDP) Act 2023 provides a legal framework, the technical implementation of ‘machine unlearning’ (the ability to remove specific data points from a trained model) remains a complex hurdle for developers using these public datasets to build commercial-grade applications.
MeitY serves as the primary architect of India’s Digital Public Infrastructure, managing large-scale platforms like Aadhaar, UPI, and now AI Kosh. The ministry focuses on democratizing technology through open-source stacks and sovereign data repositories. Current growth plans involve scaling AI compute portals to 50,000 GPUs by 2027 to support a $1 trillion digital economy.