Model and Data Protection
In this chapter we look at solutions that have arisen to protect the process of creating, training, and operating LLMs. Because the potential for abuse is so great, many founders recognized early how important it would be to prevent those models from being poisoned during their training phase, or attacked after they were deployed. We start with Data Masking, move to Model Protection, and then to Federated Learning solutions.
Data Masking
Well before the rise of AI models there were products for data masking. When developing products and database schemas, DBAs (database analysts) preferred to work with real datasets. Invariably those datasets contained personally identifiable information (PII) that would present a problem if the DBAs could see it. Things like credit cards, SSNs, health codes, names and emails would need to be obfuscated in some way. The masking could, for instance, replace all the national identity numbers with meaningless numbers of the same format to preserve the functionality of the application.
In the machine age we find ourselves in today, there is another use case. When we first started using LLMs to enrich the descriptions of vendors in the Dashboard, we realized that our biggest “moat” was the time and effort we put into categorizing 4,000+ vendors. Even though Anthropic and OpenAI say they do not train on user prompts, there was always the risk of a leak. So when passing a vendor’s data in a prompt, we masked the vendor name. After receiving the answer to our prompt via API, we would swap back the vendor name. (I suspect that the latest models (GPT 5.1, for instance) could easily guess the vendor based on the size, funding, location, and category data that we provide.)
Data masking solutions have taken on a new life with all of the use cases introduced by LLMs. We cover the vendors here because they are closely associated with the larger Model Protection category covered after. Note that they predate the November 30, 2022 introduction of ChatGPT.
Model Protection
As more enterprise teams deploy their own LLMs internally, a new category of solutions has arisen to help protect those models from various attacks and abuse.
There are solutions to impose access controls for a model. Inbound payload inspection prevents various prompt injection attempts. Monitoring and observability solutions monitor a model to detect drift or if it has been tampered with. Poisoning detection identifies and prevents attempts to corrupt, bias, or manipulate an AI model — either during training, fine-tuning, or through its runtime inputs — to make it behave incorrectly or maliciously.
Here are 26 vendors of Model Protection:
| Company | Country | Investment ($M) | Employees |
|---|---|---|---|
| HiddenLayer | USA | $56.25M | 167 |
| Noma Security | Israel | $132M | 96 |
| Lakera | Switzerland | $30M | 81 |
| Lasso Security | Israel | $12.5M | 64 |
| Acompany | Japan | $13.29M | 57 |
| Straiker | USA | $21M | 40 |
| Gray Swan | USA | - | 39 |
| Deepkeep | Israel | $23.79M | 38 |
| Protect AI | USA | $108.5M | 37 |
| Irregular | Israel | $80M | 36 |
| Virtue AI | USA | $30M | 31 |
| PointGuard AI | USA | - | 28 |
| TrojAI | Canada | $8.58M | 25 |
| Giskard | France | $7.97M | 24 |
| Promptfoo | USA | $23.4M | 24 |
| Airrived.AI | USA | $5.5M | 22 |
| Mirror Security | Ireland | $2.5M | 20 |
| Wald.ai | USA | $4M | 20 |
| Citadel AI | Japan | - | 19 |
| Prediction Guard | USA | $5.72M | 18 |
| Haize Labs | USA | - | 18 |
| Bosch AIShield | India | - | 6 |
| TestSavantAI | USA | - | 6 |
| OpenShield | USA | - | 4 |
| Drift | USA | - | 0 |
Federated Learning
Federated Learning is closely associated with Model Protection. Federated Learning is a category of AI development and training that allows machine learning models to learn from data without that data ever leaving its original location. Instead of collecting and centralizing sensitive information on a single server, Federated Learning trains models locally — on phones, edge devices or private servers — and then sends only the model updates (not the raw data) back to a central aggregator. The aggregator combines these updates to produce an improved global model.
Closely related to Federated Learning is Fully Homomorphic Encryption. Vendors of these products make it possible for models to work with highly sensitive information which is encrypted before being sent to the model.
| Company | Subcategory | Country | Investment ($M) | Employees |
|---|---|---|---|---|
| Flower Labs | Federated Learning | Germany | $23.6M | 81 |
| CryptoLab | FHE | South Korea | $20.9M | 51 |
| integrate.ai | Federated Learning | Canada | $49.23M | 31 |
| Allonia | Federated Learning | France | $12.52M | 29 |
| Devtron | Federated Learning | USA | $28.81M | 10 |
| Lattica | FHE | Israel | $3.25M | 10 |
| Wodan | FHE | Belgium | $3.04M | 9 |
