Microsoft launches Phi-2, its next small language model for researchers


Microsoft recently unveiled Phi-2, a small language model with 2.7 billion parameters that improves on Phi-1.5, entering a field dominated by large language models such as GPT-4 and Bard. Microsoft claims that Phi-2 can outperform larger models such as Llama-2, Mistral, and Gemini Nano 2 in various generative AI benchmark tests. Phi-2 is currently available via the Azure AI Studio model catalog.

Phi-2, which was developed by Microsoft’s research team and first presented by Satya Nadella at Ignite 2023, was made available earlier this week. The generative AI model is claimed to possess “common sense,” “language understanding,” and “logical reasoning.” The company claims that Phi-2 can even perform better on some tasks than models that are 25 times larger.

According to Microsoft, the 2.7 billion-parameter model can also solve challenging physics and math problems, and it can detect a calculation error made by a student.

Phi-2 is the next in Microsoft’s line of more compact and agile artificial intelligence (AI) models, which are aimed at more specialized use cases. The “textbook-quality” data used to train the Phi-2 SLM includes synthetic datasets, general knowledge, theory of mind, everyday activities, and more. The model is transformer-based and is trained with a next-word prediction objective.

The Earlier Launch Of Phi-1

Microsoft released Phi-1, the first of what it refers to as small language models (SLMs), earlier this year. Compared with their large language model (LLM) counterparts, SLMs have far fewer parameters. For instance, ChatGPT’s foundation, the GPT-3 LLM, contains 175 billion parameters. The most recent LLM from OpenAI, GPT-4, is reported to have roughly 1.7 trillion parameters. Phi-1 was followed by Phi-1.5, which by comparison has 1.3 billion parameters.

Microsoft is a significant investor in and collaborator with OpenAI, the company behind ChatGPT, which was introduced slightly over a year ago. Microsoft’s Copilot generative AI assistant is built on OpenAI’s GPT models.

Microsoft says Phi-2 took 14 days to train on 96 A100 GPUs, which suggests it can be trained on specific data more easily and affordably than GPT-4. By contrast, GPT-4 is said to have required 90–100 days of training on tens of thousands of A100 Tensor Core GPUs.
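As a rough back-of-the-envelope comparison, the training figures above can be expressed in GPU-days. The sketch below uses the Phi-2 numbers quoted by Microsoft. The exact GPU count for GPT-4 is not public, so the 20,000 used here is purely an illustrative assumption standing in for "tens of thousands," and 90 days is the low end of the reported range. The function name gpu_days is hypothetical.

```python
def gpu_days(num_gpus: int, days: float) -> float:
    """Total compute expressed as number of GPUs multiplied by training days."""
    return num_gpus * days

# Figures quoted by Microsoft for Phi-2.
phi2 = gpu_days(num_gpus=96, days=14)

# Assumption for illustration only: 20,000 GPUs stands in for
# "tens of thousands"; 90 days is the low end of the reported range.
gpt4_assumed = gpu_days(num_gpus=20_000, days=90)

print(f"Phi-2: {phi2:,.0f} GPU-days")
print(f"GPT-4 (assumed): {gpt4_assumed:,.0f} GPU-days")
print(f"Ratio: roughly {gpt4_assumed / phi2:,.0f}x")
```

Changing the assumed GPU count shifts the ratio, but under any reading of "tens of thousands" the gap remains several orders of magnitude.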

On common sense reasoning, language comprehension, math, and coding benchmarks, Phi-2 performs better than the 13B Llama-2 and 7B Mistral models. According to Microsoft, it also performs better than the 70B Llama-2 LLM on multi-step reasoning tasks such as coding and math. Furthermore, it outperforms the 3.25B Google Gemini Nano 2 model, which runs on the Google Pixel 8 Pro.

Prompt engineering is a technique used with LLMs of all sizes. It involves feeding queries and the correct responses into the models so that the algorithm can respond more accurately. These days, you can even find marketplaces selling lists of prompts, like the top 100 ChatGPT prompts.
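One common form of this, often called few-shot prompting, places a handful of example questions and their correct answers ahead of the new question. The sketch below only builds the prompt text and does not call any model. The function name build_few_shot_prompt and the "Question:/Answer:" layout are assumptions made for illustration, not a format prescribed by Microsoft or OpenAI.

```python
def build_few_shot_prompt(examples, question):
    """Join worked question/answer pairs, then append the new question."""
    parts = []
    for q, a in examples:
        parts.append(f"Question: {q}\nAnswer: {a}")
    # Leave the final answer blank so the model is asked to complete it.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)

# Hypothetical worked examples supplied by the prompt author.
examples = [
    ("What is 2 + 2?", "4"),
    ("What is 10 - 3?", "7"),
]

prompt = build_few_shot_prompt(examples, "What is 6 + 5?")
print(prompt)
```

The resulting string would then be sent to whichever model is in use; the examples steer the model toward the expected answer style.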

However, the likelihood of poor and inaccurate outputs increases with the amount of data fed into LLMs. Because GenAI tools are next-word predictors, inaccurate data supplied to them may produce inaccurate outcomes.
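The effect of inaccurate data on next-word prediction can be illustrated with a toy example. The sketch below is a simple word-pair (bigram) counter, not a transformer and not how Phi-2 or any production LLM is implemented; the function names and the training sentences are made up for illustration. It shows only that a predictor repeats whatever pattern dominates its training text.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Three accurate sentences, then the same text with four inaccurate ones added.
clean = "the sky is blue . " * 3
noisy = clean + "the sky is green . " * 4

print(predict_next(train_bigram(clean), "is"))  # blue
print(predict_next(train_bigram(noisy), "is"))  # green
```

Once the inaccurate sentences outnumber the accurate ones, the prediction flips, even though nothing about the prediction logic has changed.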

A smaller model that outperforms a larger language model such as Llama-2 has a significant advantage because it requires less power and fewer computing resources to run. Such models can run directly on a device and can be tailored to particular tasks, which reduces output latency. The Phi-2 model is available to developers via Azure AI Studio.
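To give a sense of why model size matters for on-device use, the memory needed just to hold a model's weights can be estimated as the parameter count multiplied by the bytes used per parameter. The sketch below applies that estimate to the parameter counts quoted in this article. The precisions listed are common choices picked for illustration, and the estimate ignores activations, caches, and runtime overhead, so real requirements would be higher.

```python
# Bytes needed to store one parameter at a few common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# Parameter counts as quoted in the article.
models = {"Phi-2": 2.7e9, "Llama-2 70B": 70e9}

for name, params in models.items():
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gb(params, precision)
        print(f"{name} {precision}: {size:.1f} GB")
```

At 16-bit precision this works out to about 5.4 GB for Phi-2 versus about 140 GB for the 70B Llama-2 model, which is why only the former is a plausible candidate for a laptop or phone.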

