HIPAA-Compliant AI Agents: What Your Vendor Won't Tell You | Skylink Developers

84% of healthcare AI vendors claim HIPAA compliance. Only 31% have signed Business Associate Agreements with their AI model providers. Here are the four gaps in most healthcare AI deployments, and how to audit your vendor in 15 minutes.

The pitch is always the same: 'our platform is HIPAA compliant.' What that phrase means in practice varies widely. An internal analysis of 50 healthcare AI vendors found that 84% use 'HIPAA compliant' in their marketing, but only 31% had executed Business Associate Agreements with their downstream AI providers — OpenAI, Anthropic, Google. Without a BAA covering the model that processes patient data, the entire compliance claim collapses at the point where patient information hits an AI API.

What HIPAA Actually Requires for AI Systems

HIPAA's Security Rule requires covered entities and their business associates to implement technical, administrative, and physical safeguards for Protected Health Information. For AI systems this means: any third-party system that processes, stores, or transmits PHI on your behalf must be covered by a Business Associate Agreement. The AI model provider, not just your SaaS vendor, must be covered by a BAA somewhere in the chain. Access to PHI must be logged and auditable. PHI cannot be used to train models without explicit patient authorization. HIPAA itself does not name a geographic region where data must reside, but most healthcare organizations impose one by policy or contract, so treat data residency as a requirement of your deployment.

The word 'compliant' in a vendor's marketing means their platform can be configured to be HIPAA compliant, not that your specific deployment is. That distinction is where most gaps live.

Gap 1: No BAA With the LLM Provider

The most common gap. Your SaaS vendor is HIPAA compliant. They sign a BAA with you. But their product sends patient data to OpenAI's API, and OpenAI hasn't signed a BAA with your SaaS vendor — or if they have, you haven't verified it. OpenAI offers a HIPAA BAA only for certain tiers (API access at Enterprise, not the standard tier most SaaS vendors use). Anthropic offers a BAA for Claude API customers. AWS, Azure, and GCP all offer BAAs as part of standard enterprise agreements.

The question to ask your vendor: 'Which AI model provider processes our patients' data, and can you provide a copy of your BAA with that provider?' If they can't answer in 24 hours, that's your answer. If your vendor uses multiple AI providers, each one needs a BAA in the chain.
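The chain check described above can be sketched in a few lines. This is a minimal illustration, not a compliance tool: the `ProviderLink` record, its field names, and the sample providers and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ProviderLink:
    """One party in the chain that handles PHI (hypothetical record shape)."""
    name: str
    baa_signed_on: Optional[date]  # None means no executed BAA on file
    baa_verified: bool = False     # True only if you have seen a copy yourself


def broken_links(chain: list[ProviderLink]) -> list[str]:
    """Return the names of providers whose BAA is missing or unverified."""
    return [p.name for p in chain
            if p.baa_signed_on is None or not p.baa_verified]


# Illustrative data only: names and dates are made up.
chain = [
    ProviderLink("saas-vendor", date(2024, 1, 15), baa_verified=True),
    ProviderLink("llm-provider", None),
    ProviderLink("cloud-host", date(2023, 6, 1), baa_verified=False),
]
print(broken_links(chain))  # ['llm-provider', 'cloud-host']
```

A signed but unverified BAA is treated as a broken link on purpose, since the article's point is that you need to see the document, not take the vendor's word for it.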

Gap 2: PHI in Prompt History and Fine-Tuning Data

Many AI APIs retain prompts by default, typically for debugging or abuse monitoring and in some cases for model improvement. If your prompts contain patient names, diagnoses, dates of birth, or any other PHI, those logs may be stored in systems not covered by your BAA. Most enterprise AI API agreements let you opt out of prompt logging, but only if you explicitly request it. Verify the setting in your API configuration, not only in your contract.

Fine-tuning is a higher-risk surface. If your vendor fine-tunes their model on your data, that data persists in the model weights in a form that may be difficult to fully purge. For most healthcare AI use cases, fine-tuning is unnecessary and introduces unacceptable data governance complexity. Use RAG on your data, not fine-tuning.
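To make the RAG recommendation concrete, here is a minimal sketch of request-time retrieval. It assumes a hypothetical in-memory `RECORDS` store with synthetic notes; a real deployment would query a HIPAA-compliant datastore instead. The point it illustrates is that patient data lives only in the prompt for one request and never enters the model weights.

```python
# Synthetic, non-patient data standing in for a real record store (assumption).
RECORDS = {
    "patient-001": [
        "2024-03-02: reports mild headache",
        "2024-03-09: headache resolved",
    ],
}


def retrieve(patient_id: str, store: dict[str, list[str]]) -> list[str]:
    """Fetch the notes for one patient at request time."""
    return store.get(patient_id, [])


def build_prompt(question: str, patient_id: str,
                 store: dict[str, list[str]]) -> str:
    """Inject retrieved notes into the prompt; nothing is persisted here."""
    notes = retrieve(patient_id, store)
    context = "\n".join(f"- {note}" for note in notes) or "- (no records found)"
    return f"Context:\n{context}\n\nQuestion: {question}"


print(build_prompt("Summarize recent visits.", "patient-001", RECORDS))
```

Deleting a record from the store removes it from every future prompt, which is the purge property fine-tuned weights cannot offer.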

Gap 3: No Audit Trail for AI-Generated Decisions

HIPAA's Audit Controls standard (§ 164.312(b)) requires systems containing or using electronic PHI to implement mechanisms to record and examine activity. For AI systems that influence clinical decisions — triage recommendations, diagnosis support, treatment suggestions — the audit trail must include: what information the AI was given, what recommendation it made, and what action was taken. Most healthcare AI vendors provide application-level logs (user logins, page views) but not model-level logs (what went into the prompt, what came out). Before deploying any AI that touches clinical workflows, confirm you can produce a complete audit trail for each AI-assisted decision.
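A model-level audit record covering the three elements above (input, recommendation, action) might look like the following sketch. The `AIDecisionRecord` schema and field names are assumptions for illustration; HIPAA does not prescribe a record format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIDecisionRecord:
    """One AI-assisted decision (hypothetical schema)."""
    timestamp: str
    user_id: str
    model_version: str
    model_input: str      # what information the AI was given
    recommendation: str   # what the AI recommended
    action_taken: str     # what the clinician or system actually did


def make_record(user_id: str, model_version: str, model_input: str,
                recommendation: str, action_taken: str) -> AIDecisionRecord:
    """Build a record, refusing to create one with any element missing."""
    fields = {
        "user_id": user_id,
        "model_version": model_version,
        "model_input": model_input,
        "recommendation": recommendation,
        "action_taken": action_taken,
    }
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError(f"incomplete audit record, missing: {missing}")
    return AIDecisionRecord(
        timestamp=datetime.now(timezone.utc).isoformat(), **fields
    )
```

Because `model_input` can itself contain PHI, the audit store needs the same safeguards as any other PHI-touching system.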

Gap 4: Data Residency Not Enforced

Most healthcare organizations require patient data to remain in the United States. 'Hosted in the US' doesn't mean all processing happens in the US. Global AI model APIs may route requests through international inference infrastructure depending on load. Your vendor may use a CDN that caches responses internationally. The right question isn't 'where are your servers?' It's 'can you guarantee, contractually, that no PHI will be processed or temporarily stored outside US-based infrastructure?' Some vendors can. Many cannot.
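On your own side of the deployment, a simple allow-list check can catch components configured outside the US. The sketch below uses AWS US region names as the example allow-list; the component names are hypothetical. It checks configuration only and cannot see how a vendor routes inference internally, so it complements the contractual guarantee rather than replacing it.

```python
# Example allow-list using AWS US region names (adjust for your cloud).
ALLOWED_REGIONS = {"us-east-1", "us-east-2", "us-west-1", "us-west-2"}


def residency_violations(components: dict[str, str]) -> dict[str, str]:
    """Return every component whose configured region is not allow-listed."""
    return {name: region for name, region in components.items()
            if region not in ALLOWED_REGIONS}


# Hypothetical deployment map: component name -> configured region.
deployment = {
    "inference-endpoint": "us-east-1",
    "vector-store": "us-west-2",
    "response-cache": "eu-west-1",
}
print(residency_violations(deployment))  # {'response-cache': 'eu-west-1'}
```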

What OCR Asks For During an Investigation

When HHS's Office for Civil Rights investigates a HIPAA breach involving an AI system, they ask for: (1) the BAA chain covering every system that touched the PHI, (2) access logs for the 90 days before the incident, (3) the data flow diagram showing where PHI moved. If you can't produce all three in 48 hours, your compliance posture is weaker than you think.

How to Audit Your Current Vendor in 15 Minutes

Ask these six questions. A HIPAA-compliant vendor answers all of them in under 24 hours:
(1) Which AI model providers process patient data through your platform? Get a specific list.
(2) Can you provide copies of BAAs with each of those providers? Delay means they don't exist.
(3) Is prompt logging disabled for our account? Log into the dashboard and verify.
(4) Where is PHI processed at the infrastructure level, not just stored? Look for guarantees about inference, not just storage.
(5) What does your audit trail include for AI-assisted decisions? Request a sample log.
(6) Have you completed a HIPAA Security Risk Assessment in the last 12 months? Ask for the date and who conducted it.
Failing question 1 or 3 is disqualifying. Shortfalls on questions 2, 4, 5, and 6 are negotiable.
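The scoring rule at the end of the checklist can be written down as a small function. This sketch reads the numbers in that rule as question numbers (1 and 3 disqualifying, the rest negotiable), which is an interpretation; the short question labels and the three outcome strings are made up for illustration.

```python
# Short labels for the six questions (paraphrased, hypothetical keys).
QUESTIONS = {
    1: "AI model providers listed",
    2: "BAA copies provided",
    3: "prompt logging disabled",
    4: "US-only processing guaranteed",
    5: "sample audit log provided",
    6: "risk assessment within 12 months",
}
DISQUALIFYING = {1, 3}  # assumption: numbers refer to questions, not gaps


def assess(answers: dict[int, bool]) -> str:
    """Classify a vendor; an unanswered question counts as a failure."""
    failed = {q for q in QUESTIONS if not answers.get(q, False)}
    if failed & DISQUALIFYING:
        return "disqualified"
    return "negotiate" if failed else "pass"


print(assess({1: True, 2: True, 3: True, 4: False, 5: True, 6: True}))
# negotiate
```

Treating a missing answer as a failure mirrors the article's stance that silence from a vendor is itself an answer.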

What a Compliant Healthcare AI Stack Looks Like

A compliant deployment uses an LLM provider with an active BAA — Anthropic, OpenAI Enterprise, AWS Bedrock, or Azure OpenAI — with prompt logging disabled. Patient data is processed in a US-only cloud region. PHI is handled via a RAG architecture where patient records are retrieved from a HIPAA-compliant datastore (AWS HealthLake, Azure Health Data Services) and injected into prompts — never stored in the model or in AI provider logs. Every AI-assisted decision writes an audit record to an immutable log with: timestamp, user, prompt summary, model version, and response summary. Access to PHI is role-based, logged, and reviewed quarterly.
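One common way to approximate an immutable log in application code is a hash chain, where each entry commits to the one before it. The sketch below uses the five fields listed above; the function names and the chaining scheme are illustrative assumptions, not a description of any vendor's product. A hash chain makes tampering detectable rather than impossible, so production systems usually pair it with write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()


def append_entry(log: list, user: str, prompt_summary: str,
                 model_version: str, response_summary: str) -> dict:
    """Append an audit entry that commits to the previous entry's hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "prompt_summary": prompt_summary,
        "model_version": model_version,
        "response_summary": response_summary,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True
```

Running `verify_chain` as part of the quarterly access review gives that review something concrete to check.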

Frequently Asked Questions

OpenAI offers a HIPAA BAA for ChatGPT Enterprise and certain API arrangements, but not for the standard API tier most developers use. If you're sending PHI through the standard OpenAI API without confirming your tier's BAA coverage, you have a Gap 1 issue. Contact OpenAI's enterprise team or switch to Anthropic or AWS Bedrock, both of which have documented enterprise BAA processes.

Self-hosted open-source models (Llama, Mistral) remove the need for a BAA with a third-party model provider because you control the infrastructure. If you run them on a cloud platform, that cloud provider still needs a BAA. The tradeoff: you're now responsible for the security of the model infrastructure itself, which requires the same HIPAA safeguards as any other PHI-touching system. For organizations with strong infrastructure teams, self-hosting is a legitimate HIPAA-friendly path.

HIPAA's base statutory penalties range from $100 to $50,000 per violation, and HHS adjusts the dollar amounts for inflation, with an annual cap of roughly $1.9M per violation category. The higher penalties apply to willful neglect. A security audit showing that you knew about the BAA gap and didn't fix it is the kind of evidence that supports a willful-neglect finding. More practically: a breach without a compliant vendor chain means your cyber insurance claim may be denied.

HIPAA compliance for AI isn't a checkbox. It's a chain — every link from the patient record to the AI model to the response must be covered. Most healthcare AI deployments have at least one broken link. The four gaps above are where to look first.