Mike Berardi·9 min read·2026-04-24

Multi-state operator data architecture: what we'd build today

Multi-state operators are the hardest customers in cannabis tech. They run dispensaries in California, cultivation in Colorado, and delivery in Florida, each under a different regulator, each with its own traceability system or its own configuration of a shared one, and each with its own reporting cadence and data schema. We have worked with enough MSOs to know that the usual single-state POS architecture does not scale. Here is the data model we would build if we were designing an MSO platform from scratch today.

The three real problems

Before designing the model, we need to name the problems precisely. In our experience, MSOs face three systemic data challenges that compound as they add states.

1. Per-state schema fragmentation

California uses METRC with DCC-specific fields. Colorado uses METRC with MED audit-log requirements. Florida uses BioTrack with a SOAP API. Washington replaced Leaf Data Systems with CCRS, which takes CSV file uploads. Each system has its own entity names, field types, and validation rules. A package in METRC is not the same shape as a package in BioTrack. A transfer in California has different required fields than a transfer in Colorado.

2. Inventory unit conversion

California tracks inventory in grams. Colorado allows ounces for flower. Washington uses grams for everything. Product itself cannot legally cross state lines, but the numbers do: when an MSO rolls inventory up across states, or moves product between facilities in the same state, the unit of measure must convert correctly or the compliance report will be wrong. We have seen MSOs write conversion spreadsheets that break every time a new SKU is introduced.

3. Cross-state compliance reporting

Every state wants its own report format, its own cadence, and its own audit trail. An MSO operating in four states might need to file daily reports in two, weekly in one, and monthly in another. Each report must be immutable, timestamped, and reproducible. If a regulator asks for a report from six months ago, the MSO must generate the exact same output, not a best-effort reconstruction.

The ideal data model

We would build around three core principles: events as the source of truth, state-specific projections, and immutable audit logs per state.

Events as source of truth

Instead of storing inventory as rows in a relational table that get updated in place, we would store every business event as an append-only log. An event might be "inventory received," "sale completed," "transfer dispatched," or "package adjusted." Each event contains the raw business facts: product ID, quantity, unit of measure, timestamp, user, facility, and license.
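To make the shape of such a log concrete, here is a minimal in-memory sketch. The class names, field names, and sample values (Event, EventLog, "SKU-001", "fac-ca-01", and so on) are our own illustrative assumptions, not a real schema, and a production log would live in durable storage rather than a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)  # frozen: an event cannot be mutated after creation
class Event:
    sequence: int          # position in the log, assigned on append
    event_type: str        # e.g. "inventory_received", "sale_completed"
    product_id: str
    quantity: Decimal
    unit: str              # unit of measure, e.g. "g"
    occurred_at: datetime
    user_id: str
    facility_id: str
    license_id: str


class EventLog:
    """Append-only log: it exposes append and read, but no update or delete."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event_type: str, **facts) -> Event:
        event = Event(sequence=len(self._events) + 1, event_type=event_type, **facts)
        self._events.append(event)
        return event

    def read(self, start: int = 1, end: Optional[int] = None) -> List[Event]:
        """Return events with start <= sequence <= end (end defaults to the latest)."""
        end = len(self._events) if end is None else end
        return self._events[start - 1:end]


log = EventLog()
log.append(
    "inventory_received",
    product_id="SKU-001", quantity=Decimal("453.6"), unit="g",
    occurred_at=datetime(2026, 3, 15, 9, 0, tzinfo=timezone.utc),
    user_id="u-17", facility_id="fac-ca-01", license_id="LIC-EXAMPLE-1",
)
log.append(
    "sale_completed",
    product_id="SKU-001", quantity=Decimal("3.5"), unit="g",
    occurred_at=datetime(2026, 3, 15, 14, 30, tzinfo=timezone.utc),
    user_id="u-22", facility_id="fac-ca-01", license_id="LIC-EXAMPLE-1",
)
```

Corrections in this style are new events (a "package adjusted" event, say), never edits to old ones, which is what lets everything downstream be rebuilt from the log.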

The key insight is that events are state-agnostic. A "sale completed" event in California and a "sale completed" event in Colorado contain the same business facts. The difference is in how each state's projection layer translates those facts into the regulator's required format. By keeping events canonical, we avoid the N-times-M mapping problem of storing state-specific records for every transaction. When we add a new state, we do not rewrite our inventory model. We write a new projection and backfill from the event log. This decouples compliance logic from business logic, which in our experience is among the hardest things to get right in cannabis software.

State-specific projections

A projection is a read model derived from the event log. Each state gets its own projection layer that knows the local schema, validation rules, and reporting cadence. The California projection knows DCC transfer types. The Colorado projection knows MED reason codes. The Florida projection knows BioTrack SOAP envelope format.

Projections are derived entirely from the event log, so they can be rebuilt at any time and cannot drift from it. If the California DCC changes a field requirement, we update the California projection and rebuild from the event log. We do not need to migrate historical records because the events never change; only the projection logic does.
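A sketch of the idea, under loud assumptions: the two "states" below are hypothetical, and the output field names are invented for illustration. They are not the actual METRC or BioTrack schemas. The point is only that a projection is a pure function over canonical events, so rebuilding means rerunning it.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict, Iterable, List

GRAMS_PER_OUNCE = Decimal("28.349523125")  # avoirdupois ounce, exact by definition


@dataclass(frozen=True)
class SaleCompleted:
    sequence: int
    product_id: str
    grams: Decimal  # canonical unit stored on the event


def project_state_a(events: Iterable[SaleCompleted]) -> List[dict]:
    # Hypothetical state that reports weight in grams.
    return [
        {"PackageRef": e.product_id, "Quantity": str(e.grams), "UnitOfMeasure": "Grams"}
        for e in events
    ]


def project_state_b(events: Iterable[SaleCompleted]) -> List[dict]:
    # Hypothetical state that reports weight in ounces, rounded to 4 places.
    return [
        {
            "item": e.product_id,
            "qty_oz": str((e.grams / GRAMS_PER_OUNCE).quantize(Decimal("0.0001"))),
        }
        for e in events
    ]


# Adding a state means registering one more function here.
PROJECTIONS: Dict[str, Callable[[Iterable[SaleCompleted]], List[dict]]] = {
    "state_a": project_state_a,
    "state_b": project_state_b,
}


def rebuild(state: str, events: List[SaleCompleted]) -> List[dict]:
    """Recompute a state's read model from scratch; the events are never modified."""
    return PROJECTIONS[state](events)


events = [
    SaleCompleted(sequence=1, product_id="SKU-001", grams=Decimal("28.349523125")),
    SaleCompleted(sequence=2, product_id="SKU-002", grams=Decimal("3.5")),
]
rows_a = rebuild("state_a", events)
rows_b = rebuild("state_b", events)
```

The same two events yield gram-denominated rows for one state and ounce-denominated rows for the other. A rule change in one state touches one function and nothing else.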

Immutable audit logs per state

Every time a projection generates a compliance report, we store the report payload, the projection version, and the event-sequence range it was built from. This gives us an immutable audit trail. If a regulator asks for the report from March 15, 2026, we can return the exact stored payload, together with a content hash that proves it has not changed since submission, without rerunning the projection.

We would store these audit logs in append-only, tamper-evident storage — either a write-once database or a Merkle-tree-backed log — so that even an administrator with full database access cannot alter a submitted report. This is overkill for a single-state operator, but for an MSO facing multi-state audits, it is insurance. The append-only constraint also simplifies compliance reviews. When a regulator asks how a report was generated, we can point to the exact projection version and event range, with no ambiguity about whether the data changed after submission.
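As a rough illustration of tamper evidence, the sketch below uses a simple SHA-256 hash chain rather than a full Merkle tree: each record's hash covers the previous record's hash, so changing any stored report breaks verification from that point on. All names here (AuditRecord, AuditLog, the version string "ca-v3") are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the first record


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AuditRecord:
    state: str
    projection_version: str
    first_sequence: int  # event-sequence range the report was built from
    last_sequence: int
    payload: str         # the report exactly as submitted
    payload_hash: str
    prev_hash: str       # record_hash of the previous record in the chain
    record_hash: str


def compute_record_hash(state, version, first_seq, last_seq, payload_hash, prev_hash) -> str:
    # JSON-encode the fields so their boundaries are unambiguous before hashing.
    return sha256_hex(json.dumps([state, version, first_seq, last_seq, payload_hash, prev_hash]))


class AuditLog:
    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def append(self, state, version, first_seq, last_seq, payload) -> AuditRecord:
        prev_hash = self._records[-1].record_hash if self._records else GENESIS_HASH
        payload_hash = sha256_hex(payload)
        record = AuditRecord(
            state, version, first_seq, last_seq, payload, payload_hash, prev_hash,
            compute_record_hash(state, version, first_seq, last_seq, payload_hash, prev_hash),
        )
        self._records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash in order; any altered record makes this False."""
        prev_hash = GENESIS_HASH
        for r in self._records:
            expected = compute_record_hash(
                r.state, r.projection_version, r.first_sequence,
                r.last_sequence, sha256_hex(r.payload), prev_hash,
            )
            if r.prev_hash != prev_hash or r.record_hash != expected:
                return False
            prev_hash = r.record_hash
        return True


audit = AuditLog()
r1 = audit.append("CA", "ca-v3", 1, 120, '{"report": "2026-03-15"}')
r2 = audit.append("CA", "ca-v3", 121, 260, '{"report": "2026-03-16"}')
```

A chain like this only detects tampering if the latest hash is also recorded somewhere the database administrator cannot rewrite. Otherwise the whole chain could be recomputed. That external anchor is the part a write-once store or a Merkle-tree-backed log is meant to provide.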

Handling unit conversion

Events store quantity in a canonical unit: we would use grams as the base unit for weight and milligrams for THC content. The projection layer converts to the state's preferred unit at report time. California gets grams. Colorado gets ounces where allowed. Washington gets grams. The conversion logic lives in the projection, not in the event, so changing a state's unit preference does not require a data migration.

We would also store the original unit from the source system — METRC, BioTrack, or manual entry — as metadata on the event. This lets us reconcile discrepancies without guessing whether a "1" means one gram or one ounce.
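The canonical-unit rule and the source-unit metadata can be sketched in a few lines. This is a minimal example, assuming decimal arithmetic to avoid floating-point drift; the Quantity type, the function names, and the four-decimal rounding are our illustrative choices, not a regulator's requirement.

```python
from dataclasses import dataclass
from decimal import Decimal

# Exact gram equivalents for the weight units we accept on input.
GRAMS_PER_UNIT = {
    "g": Decimal("1"),
    "kg": Decimal("1000"),
    "oz": Decimal("28.349523125"),
    "lb": Decimal("453.59237"),
}


@dataclass(frozen=True)
class Quantity:
    grams: Decimal            # canonical value stored on the event
    source_quantity: Decimal  # what the source system actually sent
    source_unit: str
    source_system: str        # e.g. "METRC", "BioTrack", "manual"


def to_canonical(quantity: Decimal, unit: str, source_system: str) -> Quantity:
    """Convert to grams at ingest, keeping the original figures as metadata."""
    if unit not in GRAMS_PER_UNIT:
        # Refuse to guess: an unknown unit is an error, not a default.
        raise ValueError(f"unknown unit of measure: {unit!r}")
    return Quantity(quantity * GRAMS_PER_UNIT[unit], quantity, unit, source_system)


def to_report_unit(q: Quantity, unit: str, places: str = "0.0001") -> Decimal:
    """Convert canonical grams to a state's reporting unit at report time."""
    return (q.grams / GRAMS_PER_UNIT[unit]).quantize(Decimal(places))


received = to_canonical(Decimal("1"), "oz", "manual")
in_grams = to_report_unit(received, "g")
in_ounces = to_report_unit(received, "oz")
```

Because the source quantity and unit travel with the event, a reconciliation job can always compare what the source system sent against what we stored, instead of inferring the unit from the magnitude.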

Cross-state analytics without breaking compliance walls

MSOs want cross-state dashboards: total revenue, top SKUs, customer retention. But regulators do not want California data mixed with Colorado data. We would solve this with a two-tier analytics model that keeps compliance data separated while still giving leadership the consolidated view they need to run the business.

Tier one is per-state analytics, built from the state-specific projection. These dashboards are compliant by construction because they only use data that has already been reported to that state's regulator. Tier two is cross-state analytics, built from a sanitized event stream where facility IDs, customer IDs, and transaction IDs are hashed or aggregated so that individual records are not exposed. This gives MSO leadership the bird's-eye view without creating a compliance risk. A plain one-way hash is not enough for this, because identifiers drawn from a small, guessable set can be recovered by hashing every candidate. We would use a keyed hash with the key held outside the analytics tier, and aggregate wherever a dashboard does not need row-level data, so that a tier-two dataset obtained on its own, for example under subpoena, is far harder to use to reconstruct individual transactions. This separation of concerns is what makes the two-tier model viable for regulated industries.
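A sketch of what the tier-two sanitizing step might look like. We assume a keyed hash (HMAC-SHA256 from the Python standard library) instead of a bare hash, since short identifiers can otherwise be recovered by hashing every candidate value. The event fields, the function names, and the inline key are hypothetical; in practice the key would come from a secrets manager that the analytics tier cannot read.

```python
import hashlib
import hmac
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable


def pseudonymize(identifier: str, key: bytes) -> str:
    """Keyed one-way mapping: stable for joins, not reversible without the key."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitize(event: dict, key: bytes) -> dict:
    """Copy only the fields tier two needs, replacing raw IDs with pseudonyms."""
    return {
        "state": event["state"],
        "sku": event["sku"],
        "revenue": event["revenue"],
        "facility": pseudonymize(event["facility_id"], key),
        "customer": pseudonymize(event["customer_id"], key),
        "transaction": pseudonymize(event["transaction_id"], key),
    }


def revenue_by_state(sanitized: Iterable[dict]) -> Dict[str, Decimal]:
    """Aggregate view for leadership dashboards; no row-level data leaves here."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for row in sanitized:
        totals[row["state"]] += row["revenue"]
    return dict(totals)


KEY = b"example-key-do-not-use"  # assumption: real key is managed outside this tier

raw_events = [
    {"state": "CA", "sku": "SKU-001", "revenue": Decimal("42.00"),
     "facility_id": "fac-ca-01", "customer_id": "cust-9", "transaction_id": "tx-1"},
    {"state": "CA", "sku": "SKU-002", "revenue": Decimal("18.50"),
     "facility_id": "fac-ca-01", "customer_id": "cust-9", "transaction_id": "tx-2"},
    {"state": "CO", "sku": "SKU-001", "revenue": Decimal("30.00"),
     "facility_id": "fac-co-02", "customer_id": "cust-4", "transaction_id": "tx-3"},
]
sanitized_events = [sanitize(e, KEY) for e in raw_events]
totals = revenue_by_state(sanitized_events)
```

The same customer maps to the same pseudonym across events, so retention metrics still work, while the raw identifiers never enter the tier-two store.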

What we built in DubLedger

DubLedger is architected around these principles at the tenant level. Each dispensary tenant has its own event log, projection layer, and audit trail. As we expand to support MSO multi-tenant rollups, we are extending this model with shared event streams and cross-tenant analytics. The goal is to give MSOs the same real-time visibility they get from a generic ERP, but with the compliance rigor that cannabis regulators demand.

If you are an MSO evaluating platforms, ask whether the vendor can show you an immutable audit log, a state-specific projection, and a cross-state dashboard. If they look at you blankly, they have not solved the hard problems yet.
