5 Small Language Model Examples Boosting Business Efficiency

Updated Sep 6, 2024 • 13 min read

For a long time, everyone talked about the capabilities of large language models. And while they’re truly powerful, some use cases call for a more domain-specific alternative.

That’s where businesses can turn to small language models. They’re currently gaining traction, and for good reason. Not only are they more efficient, but they’re also less costly to implement. In this article, I share some of the most promising examples of small language models on the market. I also explain what makes them unique and what scenarios you could use them for.

What are Small Language Models (SLMs)?

Language models are tools based on artificial intelligence and natural language processing.

They can comprehend and mimic human language based on the patterns and training data they’re given.

Small language models have fewer parameters than large ones, so they are less capable at general-purpose text processing and generation. They are better at handling less complex, more specific tasks, like text classification, sentiment analysis, and basic text generation. These models are ideal for business use cases that don't require complex analysis, such as clustering, tagging, or extracting necessary information.

Small language models vs large language models

Apart from pure size, how do large and small language models differ?

Small models are great for niche, domain-specific tasks, and can provide more expert, granular information. For example, if you’re in an industry like banking, you could train a small model on specialist terminology and turn it into a financial model.

LLMs, on the other hand, are like generalists; they have a wider dataset. They’ll operate on data from various disciplines. The more detailed or industry-specific your need, the harder it may be to get a precise output. In such cases, a small language model, being the domain expert, would likely outperform a large language model.

Also, due to their compact nature, it’s easy and fast to set up an SLM not only on smartphones and tablets but also on edge computing devices. This can’t be said about LLMs, which require large computational resources to be deployed.

Finally, in terms of security and privacy, small language models tend to be safer, too. How so?

Since they can operate locally, you don’t have to exchange data with external servers, which reduces the risk of a sensitive data breach. As I’ve covered in a post on local language data security, large language models are more susceptible to hacks, as they often process data in the cloud.

Aspect	Small Language Models (SLMs)	Large Language Models (LLMs)
Definition	Fewer parameters (millions to a few billion)	Vast number of parameters (dozens or hundreds of billions)
Performance	Good for simpler tasks	Excels in complex language understanding and generation
Training and resources	Less computational power, memory, and storage required	Significant computational resources, memory, and storage required
Training time	Shorter and less costly	Time-consuming and expensive
Use cases	Embedded systems, mobile applications, local generation	Advanced conversational agents, content creation, complex translation systems
Deployment	Easier to deploy on various platforms	Often requires specialized hardware and powerful cloud servers
Adaptability	Quicker to fine-tune or adjust for specific tasks or domains	Greater flexibility and adaptability but requires significant effort to fine-tune
Environmental impact	Lower energy consumption and carbon footprint	Higher energy consumption and larger carbon footprint
Security	Lower risk of exposure due to smaller deployment footprint	Higher risk due to larger attack surface and dependency on cloud infrastructure
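To make the resource rows in the table more concrete, here is a rough back-of-the-envelope sketch of how much memory a model's weights alone take up. It assumes memory is simply the parameter count times the bits per parameter, and it ignores activations, caches, and runtime overhead, so real requirements will be higher. The 3.8 billion figure is the size of Phi-3-mini, covered later in this article; the 70 billion model is a hypothetical large model used only for contrast, and the function name is my own.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold the model weights, in gigabytes."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# 3.8 billion parameters matches Phi-3-mini; the 70B entry is hypothetical.
models = {
    "3.8B model": 3.8e9,
    "70B model (hypothetical)": 70e9,
}

for name, params in models.items():
    for bits in (16, 4):  # 16-bit weights vs. 4-bit quantized weights
        size = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{size:.1f} GB")
```

Under these assumptions, a quantized model with a few billion parameters needs only a couple of gigabytes for its weights, which is why it can fit on a phone, while the larger model needs far more than a typical consumer device offers.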

Small Language Model Examples

1. Phi-3 – tiny but mighty

In April 2024, Microsoft announced Phi-3, a family of open models that the company called “the most capable and cost-effective small language models available.”

The first model in this Microsoft family is Phi-3-mini, which has 3.8 billion parameters. What piqued my interest is that the company said it can perform better than models twice its size.

This tiny language model is said to have great logic and reasoning abilities. This means that it could work well for scenarios like:

Summarizing long, domain-specific documents, like new regulations, and extracting the key points from them.
Setting up a chatbot that can accurately answer customer questions and dive into CRM records to suggest relevant upgrades.
Generating marketing collateral, like social media posts or product/service descriptions.

Phi-3 models are built with a safety-first approach, following Microsoft’s Responsible AI standards. These cover areas like privacy, security, reliability, and inclusiveness (thanks to training on high-quality, inclusive data).

If the company lives up to its promise, we can expect the Phi-3 family to be among the best small language models on the market.

2. Mixtral of experts – advanced mix of experts for better reasoning

Mixtral is among the best small language models out there. It operates as a decoder-only model in which each token is processed by parameters selected from 8 different sets, known as 'experts'. Designed with efficiency and capability in mind, it uses a small neural network, called a router, to pick the two best experts for each token at every layer.

In total, Mixtral has around 46.7 billion parameters but uses only 12.9 billion to analyze any given token. The beauty of it is that while it can handle complicated tasks, just like LLMs do, it’s much more efficient and cheaper to run. It’s pre-trained on data from the open web, and its experts and router are trained at the same time.
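The routing idea can be sketched in a few lines. This is a toy illustration, not Mixtral's actual implementation: the experts are simple made-up functions, the router scores are invented numbers for a single token, and all names (route, moe_forward, TOP_K) are my own. Picking two experts per token follows Mistral's published description of Mixtral; real models do this with learned weights at every layer.

```python
import math

NUM_EXPERTS = 8  # number of experts per layer
TOP_K = 2        # experts used per token, per Mistral's description of Mixtral


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(token_value, experts, router_logits):
    """Run only the selected experts and blend their outputs by router weight."""
    return sum(w * experts[i](token_value) for i, w in route(router_logits))


# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, k=k: x * (k + 1) for k in range(NUM_EXPERTS)]
# Made-up router scores for one token.
logits = [0.1, 2.0, -1.0, 0.5, 1.0, 0.0, -0.5, 0.3]

selected = route(logits)
print("selected experts:", [i for i, _ in selected])
print("blended output:", moe_forward(1.0, experts, logits))
print(f"active share of parameters: {12.9 / 46.7:.0%}")
```

The last line uses the two parameter counts quoted above: only about 28% of the model's parameters do any work for a given token, which is where the efficiency comes from.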

Its performance is comparable to that of GPT-3.5. Here is a summary:

It’s able to leverage a wide spectrum of knowledge through a blend of various domains.
It features a unique architecture: it can act as an LLM while using only a fraction of its parameters for each token.
It can run on local machines while still achieving power comparable to that of full-scale LLMs.

3. Llama 3 – one of the most capable small language models on your computer

Llama 3 is an advanced language model from Meta, which is much more powerful than its predecessor. The dataset it’s been trained on is seven times as big as that of Llama 2 and features four times more code.

This AI model supports a context length of about 8,000 tokens, twice that of its predecessor, which makes it capable of comprehending and generating longer and more complex pieces of text.

Llama 3 has enhanced reasoning capabilities and displays top-tier performance on various industry benchmarks. No wonder the Llama 3 models are viewed as the best open-source models in their category. Meta made the model available to all its users, intending to promote “the next wave of AI innovation impacting everything from applications and developer tools to evaluation methods and inference optimizations”.

Llama 3 also powers Meta AI, the assistant built into Meta’s products. Meta AI is accessible through search, feeds, and chats on Facebook, Instagram, WhatsApp, and Messenger. Users can retrieve real-time information from the Internet without switching between apps.

4. DeepSeek-Coder-V2 – additional developer on your machine

Calling it an “additional developer” isn’t an overstatement. It’s a resourceful AI development tool and is among the best small language models for code generation. Benchmarks suggest that it has strong coding and mathematical reasoning capabilities. So much so that it could replace Gemini Code or Copilot when used on your machine.

DeepSeek-Coder-V2 is an open-source model built with the Mixture-of-Experts (MoE) machine learning technique. According to its README on GitHub, it was pre-trained on 6 trillion tokens, supports 338 programming languages, and has a context length of 128k tokens. Comparisons show that, when handling coding tasks, it can reach performance similar to that of GPT-4 Turbo.

In the published comparison, DeepSeek-Coder-V2 has the highest accuracy on HumanEval, with an impressive 90.2%.

Which businesses could benefit most from this model? In my opinion, it’s best suited for those who need their SLM to have top-level analytical capabilities. It’s also a good fit when you can’t share code from your critical systems with tools that operate in the cloud.

5. MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

MiniCPM-Llama3-V 2.5 is the newest model in the MiniCPM-V series. It has 8 billion parameters and is built on SigLip-400M and Llama3-8B-Instruct. It’s a lot more capable than its predecessor, MiniCPM-V 2.0. It received 65.1 points on OpenCompass, scoring better than GPT-4V-1106, Gemini Pro, Claude 3, and Qwen-VL-Max, despite having fewer parameters.

This small language model has a very low hallucination rate of just 10.3% on Object HalBench, making it more trustworthy than GPT-4V-1106 (with 13.6%).

It’s also worth mentioning that you can use it in over 30 languages, such as English, German, French, Korean, and Japanese. This relates to what I believe is the single most powerful capability of this model, i.e., that it excels in optical character recognition (OCR).

It can process images with up to 1.8 million (!) pixels, with any aspect ratio. On OCRBench, an OCR-specific performance test, it achieved an impressive score of over 700, outranking GPT-4o and Gemini Pro. You can ask questions about any uploaded image and receive specific, accurate answers.

This makes it perfectly suitable for:

Digitization of images, as well as digital and handwritten text
Data extraction for both structured and unstructured data, for example, converting tables to Markdown.

You can use it on your mobile, making sure the images stay on your phone, which improves data security and privacy.

General business use cases for small language models

Small Language Models (SLMs) like Phi-3, Mixtral, Llama 3, DeepSeek-Coder-V2, and MiniCPM-Llama3-V 2.5 enhance various operations with their advanced capabilities.

Use cases:

Retrieving Information: Leverage searching and summarizing capabilities to find relevant information within large volumes of text.
Tagging and Clustering: Quickly convert any text into the required metadata. For example, tag comments on your webpage or product opinions and cluster them to gain insights.
Creating JSON Data: Convert any text into relevant information for API systems.
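As an illustration of the JSON use case, here is a minimal sketch of how you might wrap a model call so that only valid, complete JSON reaches your other systems. Everything named here is an assumption for the example: the schema (summary, tags, sentiment), the prompt wording, and the generate callable, which stands in for whichever local model you run. The fake_model function returns a canned reply so the sketch runs without any model installed.

```python
import json
from typing import Callable

# Illustrative schema; adapt the keys to what your API expects.
REQUIRED_KEYS = {"summary", "tags", "sentiment"}


def build_prompt(text: str) -> str:
    """Ask the model to answer with a JSON object only."""
    return (
        "Return only a JSON object with the keys "
        '"summary" (string), "tags" (list of strings) and '
        '"sentiment" ("positive", "neutral" or "negative").\n\n'
        f"Text: {text}"
    )


def text_to_record(text: str, generate: Callable[[str], str]) -> dict:
    """Call the model, then parse and validate its reply before using it."""
    raw = generate(build_prompt(text))
    record = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(record, dict):
        raise ValueError("model output is not a JSON object")
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"model output is missing keys: {sorted(missing)}")
    return record


def fake_model(prompt: str) -> str:
    # Stand-in for a real local model call; always returns the same reply.
    return ('{"summary": "Battery drains quickly.", '
            '"tags": ["battery"], "sentiment": "negative"}')


record = text_to_record("The battery is empty by lunchtime.", fake_model)
print(json.dumps(record, indent=2))
```

Validating the reply matters because small models occasionally return malformed or incomplete JSON; failing early is safer than passing a broken record to an API.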

How to use small language models in your business?

Convert media notes into proper API calls with tags, allowing your systems to automatically gather and analyze information. This is especially useful for staying up to date with industry news and analyzing financial information.
Tag and cluster all opinions and comments about your products to improve based on user feedback.
Analyze the competition by examining their pages, extracting all the features they offer, clustering this data, and using the insights to enhance your products.
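Once comments carry tags, turning them into insights can be very simple. The sketch below groups comments by shared tag and ranks the tags by volume, which is a plain stand-in for clustering rather than a real clustering algorithm. The sample comments and the group_by_tag helper are hypothetical; in practice the tags would come from your small language model.

```python
from collections import defaultdict

# Hypothetical tagged comments, e.g. produced by a small language model.
tagged_comments = [
    {"text": "Battery is empty by lunchtime.", "tags": ["battery"]},
    {"text": "Love the screen, but it charges slowly.", "tags": ["screen", "battery"]},
    {"text": "Delivery took three weeks.", "tags": ["shipping"]},
]


def group_by_tag(comments):
    """Collect the text of every comment under each of its tags."""
    groups = defaultdict(list)
    for comment in comments:
        for tag in comment["tags"]:
            groups[tag].append(comment["text"])
    return dict(groups)


groups = group_by_tag(tagged_comments)

# Show the most frequently mentioned topics first.
for tag, texts in sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True):
    print(f"{tag}: {len(texts)} comment(s)")
```

A comment can land in more than one group, which is usually what you want for feedback that touches several product areas.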

Small language models have fewer parameters but are great for domain-specific tasks

Small language models offer significant benefits in terms of cost savings, efficiency, and versatility. They are less expensive to train and deploy than large language models, making them accessible for a wider range of applications.

Since they use computational resources efficiently, they can offer good performance and run on various devices, including smartphones and edge devices. Additionally, since you can train them on specialized data, they can be extremely helpful when handling niche tasks.

Overall, domain-specific language models provide a practical, cost-effective solution for businesses, without sacrificing performance and output accuracy.