Model and Data Protection

In this chapter we look at solutions that have arisen to protect the process of creating, training, and operating LLMs. Because the potential for abuse is so great, many founders recognized early how important it would be to prevent those models from being poisoned during their training phase, or attacked after they were deployed. We start with Data Masking, move to Model Protection, and then to Federated Learning solutions.

Data Masking

Well before the rise of AI models there were products for data masking. When developing products and database schemas, DBAs (database analysts) preferred to work with real datasets. Invariably those datasets contained personally identifiable information (PII) that would present a problem if the DBAs could see it. Things like credit cards, SSNs, health codes, names and emails would need to be obfuscated in some way. The masking could, for instance, replace all the national identity numbers with meaningless numbers of the same format to preserve the functionality of the application.

In the machine age we find ourselves in today, there is another use case. When we first started using LLMs to enrich the descriptions of vendors in the Dashboard, we realized that our biggest “moat” was the time and effort we put into categorizing 4,000+ vendors. Even though Anthropic and OpenAI say they do not train on user prompts, there was always the risk of a leak. So when passing a vendor’s data in a prompt, we masked the vendor name. After receiving the answer to our prompt via API, we would swap back the vendor name. (I suspect that the latest models (GPT 5.1, for instance) could easily guess the vendor based on the size, funding, location, and category data that we provide.)

Data masking solutions have taken on a new life with all of the use cases introduced by LLMs. We cover the vendors here because they are closely associated with the larger Model Protection category covered after. Note that they predate the November 30, 2022 introduction of ChatGPT.

Model Protection

As more enterprise teams deploy their own LLMs internally, a new category of solutions has arisen to help protect those models from various attacks and abuse.

There are solutions to impose access controls for a model. Inbound payload inspection prevents various prompt injection attempts. Monitoring and observability solutions monitor a model to detect drift or if it has been tampered with. Poisoning detection identifies and prevents attempts to corrupt, bias, or manipulate an AI model — either during training, fine-tuning, or through its runtime inputs — to make it behave incorrectly or maliciously.

Here are 26 vendors of Model Protection:

Company	Country	Investment ($M)	Employees
HiddenLayer	USA	$56.25M	167
Noma Security	Israel	$132M	96
Lakera	Switzerland	$30M	81
Lasso Security	Israel	$12.5M	64
Acompany	Japan	$13.29M	57
Straiker	USA	$21M	40
Gray Swan	USA	-	39
Deepkeep	Israel	$23.79M	38
Protect AI	USA	$108.5M	37
Irregular	Israel	$80M	36
Virtue AI	USA	$30M	31
PointGuard AI	USA	-	28
TrojAI	Canada	$8.58M	25
Giskard	France	$7.97M	24
Promptfoo	USA	$23.4M	24
Airrived.AI	USA	$5.5M	22
Mirror Security	Ireland	$2.5M	20
Wald.ai	USA	$4M	20
Citadel AI	Japan	-	19
Prediction Guard	USA	$5.72M	18
Haize Labs	USA	-	18
Bosch AIShield	India	-	6
TestSavantAI	USA	-	6
OpenShield	USA	-	4
Drift	USA	-	0

Federated Learning

Federated Learning is closely associated with Model Protection. Federated Learning is a category of AI development and training that allows machine learning models to learn from data without that data ever leaving its original location. Instead of collecting and centralizing sensitive information on a single server, Federated Learning trains models locally — on phones, edge devices or private servers — and then sends only the model updates (not the raw data) back to a central aggregator. The aggregator combines these updates to produce an improved global model.

Closely related to Federated Learning is Fully Homomorphic Encryption. Vendors of these products make it possible for models to work with highly sensitive information which is encrypted before being sent to the model.

Company	Subcategory	Country	Investment ($M)	Employees
Flower Labs	Federated Learning	Germany	$23.6M	81
CryptoLab	FHE	South Korea	$20.9M	51
integrate.ai	Federated Learning	Canada	$49.23M	31
Allonia	Federated Learning	France	$12.52M	29
Devtron	Federated Learning	USA	$28.81M	10
Lattica	FHE	Israel	$3.25M	10
Wodan	FHE	Belgium	$3.04M	9