De Novo Decisions Start in the Data Mine

Most ABA providers open new clinics on gut feel. They know where families are asking for services, where clinicians live, or where a landlord offers space at the right rate. But gut feel doesn’t scale.
Opening a new clinic is one of the highest-stakes bets a provider can make. Pick the right region and you unlock years of sustainable growth. Pick the wrong one and you inherit staffing headaches, payor friction, and a facility that never turns a profit.
The difference isn’t instinct. It’s data.
But here’s the reality: even providers who manage their own data well, and hire consultants to fill in the rest, still struggle. The data that drives a de novo decision sits in silos, scattered across public sources, commercial filings, and internal systems.
And the same data that informs whether you should open a new clinic is the data that later measures whether it worked.
The Data That Shapes a De Novo
Workforce Supply & Turnover
- What matters: How many BCBAs and RBTs are in the region, and how often they turn over. High churn can quietly double staffing costs.
- Where it lives:
  - BACB certificant registry → number of certified clinicians in a geography.
  - Bureau of Labor Statistics (BLS) → wages, employment levels, turnover (at the “behavioral health” or “technician” job category level).
  - Job postings & LinkedIn → directional signal of workforce demand and competition.
  - Internal HR/payroll → provider-specific turnover and wage trends.
- Why it’s hard: BLS is lagging and broad, BACB doesn’t show churn, job boards are noisy, and internal HR data rarely gets benchmarked against the market.
- Who could bring it together across ABA: ABA-focused Practice Management and HR platforms
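To make the churn point concrete, here is a minimal sketch of how a provider might compute annual turnover by role from its own HR records before comparing it to any market figure. The `Employment` record, the `annual_turnover` helper, and the sample staff are all hypothetical, and the formula (separations divided by average headcount) is one common convention, not the only one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Employment:
    role: str                     # e.g. "RBT" or "BCBA"
    start: date
    end: Optional[date] = None    # None means still employed


def annual_turnover(records, role, year):
    """Separations during `year` divided by average headcount for `role`."""
    jan1, dec31 = date(year, 1, 1), date(year, 12, 31)
    staff = [r for r in records if r.role == role]

    def active_on(r, day):
        return r.start <= day and (r.end is None or r.end >= day)

    separations = sum(
        1 for r in staff if r.end is not None and jan1 <= r.end <= dec31
    )
    # Average of start-of-year and end-of-year headcount (a simplification).
    headcount_start = sum(active_on(r, jan1) for r in staff)
    headcount_end = sum(active_on(r, dec31) for r in staff)
    avg_headcount = (headcount_start + headcount_end) / 2
    return separations / avg_headcount if avg_headcount else 0.0


# Made-up records for illustration only.
records = [
    Employment("RBT", date(2022, 6, 1)),
    Employment("RBT", date(2022, 9, 1), date(2023, 4, 15)),
    Employment("RBT", date(2023, 2, 1), date(2023, 11, 30)),
    Employment("RBT", date(2023, 5, 1)),
    Employment("BCBA", date(2021, 1, 4)),
]

rbt_turnover = annual_turnover(records, "RBT", 2023)
bcba_turnover = annual_turnover(records, "BCBA", 2023)
```

With these invented records, RBT turnover for 2023 works out to 1.0: two separations against an average headcount of two. The number only becomes meaningful once it sits next to a regional benchmark, which is exactly the piece most providers are missing.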
Payor Landscape, Denials & the “Missing Metric”
- What matters: Who the dominant payors are, how much they reimburse, how often they deny claims — and how much care they’re actually approving.
- Where it lives: Practice management claims data, Medicaid fee schedules, commercial payor websites.
- Why it’s hard:
  - Commercial reimbursement data is technically public under federal Transparency in Coverage rules, but the files are massive (often terabytes), poorly structured, and inconsistent across payors. Extracting ABA-specific rates requires data gymnastics most providers can’t perform on their own.
  - Medicaid fee schedules are easier to access but vary widely in format and update cadence.
  - Denial data exists in your own claims, but you can’t see beyond your own organization.
- Prior auth data inside PM systems: Most practice management platforms let staff record authorization numbers, expiration dates, and approved units. Technically, this is PA data — but it’s entered manually, varies by payor, and isn’t built for analysis. At best, it helps an individual provider stay compliant. At worst, it’s incomplete or inconsistent.
- The missing metric: In pharmacy, prescription fill data is a near real-time utilization measure. ABA has no equivalent. Generalized prior authorization data could serve that role — a leading indicator of care volume — but it isn’t standardized or accessible beyond individual providers.
- Who could bring it together across ABA: ABA-focused Practice Management and RCM platforms
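As a rough illustration of the "data gymnastics" involved, the sketch below pulls negotiated rates for the ABA CPT codes (97151 through 97158) out of a payor's in-network rate file. The field names (`in_network`, `billing_code`, `negotiated_rates`, `negotiated_prices`, `negotiated_rate`) reflect my reading of the published Transparency in Coverage schema and should be checked against the current version. The `aba_rates` helper and the sample file are hypothetical, and the dollar figures are invented.

```python
import json

# CPT codes for adaptive behavior (ABA) services.
ABA_CODES = {
    "97151", "97152", "97153", "97154",
    "97155", "97156", "97157", "97158",
}


def aba_rates(in_network_items):
    """Yield (billing_code, negotiated_rate) pairs for ABA CPT codes only."""
    for item in in_network_items:
        if item.get("billing_code_type") != "CPT":
            continue
        if item.get("billing_code") not in ABA_CODES:
            continue
        for rate_group in item.get("negotiated_rates", []):
            for price in rate_group.get("negotiated_prices", []):
                yield item["billing_code"], price["negotiated_rate"]


# Tiny invented sample shaped like an in-network file (assumed layout).
sample = json.loads("""
{
  "reporting_entity_name": "Example Payor",
  "in_network": [
    {"billing_code_type": "CPT", "billing_code": "97153",
     "negotiated_rates": [{"negotiated_prices": [
        {"negotiated_rate": 15.0}, {"negotiated_rate": 17.5}]}]},
    {"billing_code_type": "CPT", "billing_code": "99213",
     "negotiated_rates": [{"negotiated_prices": [
        {"negotiated_rate": 90.0}]}]},
    {"billing_code_type": "CPT", "billing_code": "97155",
     "negotiated_rates": [{"negotiated_prices": [
        {"negotiated_rate": 25.0}]}]}
  ]
}
""")

rows = list(aba_rates(sample["in_network"]))
```

The filter itself is simple. The hard part is everything around it: real files are far too large to load with `json.loads`, so a production version would need a streaming JSON parser, plus logic to map provider references to actual groups and regions.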
Demographics & Demand
- What matters: Autism prevalence, Medicaid penetration, socioeconomic mix, and — critically — the rate and timing of diagnosis in the region. Earlier identification expands demand sooner, while under-diagnosed communities may mask latent demand.
- Where it lives: U.S. Census (population, SES), CDC prevalence studies, Medicaid enrollment data.
- Who could bring it together across ABA: ABA Diagnostics platforms
Competitive Density
- What matters: Which providers already operate in the region, their size, and growth pace.
- Where it lives: Provider websites, LinkedIn headcounts, state licensure lists.
- Why it’s hard: No unified view exists. Headcounts lag reality, licensure data can be outdated, and competitors rarely advertise their true growth pace.
- Who could bring it together across ABA: ABA-focused Practice Management and HR platforms
Referral Ecosystem
- What matters: Whether pediatricians, schools, and specialists will actually send families your way.
- Where it lives: NPI registry (pediatricians, developmental specialists), state education departments, your own CRM/intake system.
- Why it’s hard: Knowing who could refer is easy. Knowing who will refer requires relational context that doesn’t show up in the data.
- Who could bring it together across ABA: ABA-focused CRM and Diagnostic platforms
Real Estate & Facilities
- What matters: Availability and affordability of suitable clinic space, licensing/zoning requirements, parking and transit access.
- Where it lives: County licensing boards, municipal zoning websites, commercial real estate listings.
- Why it’s hard: Public data shows space exists, but not whether it’s actually fit for ABA care. Suitability requires on-the-ground validation.
- Who could bring it together across ABA: No clue!
Compliance & Regulation
- What matters: Regional supervision rules, telehealth allowances, and labor law compliance.
- Where it lives: State Medicaid manuals, OIG reports, state labor law resources.
- Why it’s hard: Regulations shift frequently, vary by interpretation, and are enforced unevenly across regions.
- Who could bring it together across ABA: ABA-focused Compliance platforms
Macro-Economics & Community Dynamics
- What matters: Wage competition from other industries, housing affordability for staff, cultural dynamics around ABA acceptance.
- Where it lives: BLS (wages, unemployment), U.S. Census (housing costs, demographics), local advocacy orgs.
- Why it’s hard: Economic data is available but not ABA-specific, and cultural acceptance is qualitative, often anecdotal.
- Who could bring it together across ABA: ABA-focused Applicant Tracking Systems and HR platforms
The Data You Already Own — But May Not Be Able to Use
Every provider already holds some of the most important data for de novo decisions:
- Claims data → reimbursement, denials, payor mix.
- HR/payroll → wages, churn, hiring velocity.
- CRM/intake → referral patterns, lost leads, demand signals.
- Scheduling → utilization, waitlists, unmet demand.
The challenge isn’t whether the data exists. It’s whether you can bring it together in a way that allows you to benchmark against the regional market.
That requires sophisticated data management, not just pulling a few reports. It means integrating HR, payroll, CRM, intake, and practice management data — and layering it against external benchmarks. Without that, you don’t know whether your turnover is unusually high, your denial rates are normal, or your payor mix is sustainable.
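Once internal metrics and external benchmarks sit side by side, the comparison itself is not complicated. The sketch below is one hypothetical way to flag which internal numbers fall outside a tolerance band around the market figure. The `Benchmark` class, the `compare_to_market` helper, the metric names, and every value are assumptions made up for illustration, not real market data.

```python
from dataclasses import dataclass


@dataclass
class Benchmark:
    market_value: float
    tolerance: float  # relative deviation treated as normal, e.g. 0.20 = 20%


def compare_to_market(internal, benchmarks):
    """Return {metric: (relative_gap, status)} for metrics with a benchmark."""
    report = {}
    for metric, value in internal.items():
        bench = benchmarks.get(metric)
        if bench is None or bench.market_value == 0:
            continue  # no usable external benchmark for this metric
        gap = (value - bench.market_value) / bench.market_value
        if abs(gap) <= bench.tolerance:
            status = "within range"
        elif gap > 0:
            status = "above market"
        else:
            status = "below market"
        report[metric] = (round(gap, 3), status)
    return report


# Invented internal metrics and benchmarks, for illustration only.
internal = {"rbt_turnover": 0.90, "denial_rate": 0.08, "medicaid_share": 0.70}
benchmarks = {
    "rbt_turnover": Benchmark(market_value=0.60, tolerance=0.20),
    "denial_rate": Benchmark(market_value=0.075, tolerance=0.20),
}

report = compare_to_market(internal, benchmarks)
```

In this made-up example, RBT turnover comes back as "above market" and the denial rate as "within range", while `medicaid_share` is left out entirely because no benchmark exists for it. That last case is the common one today: the internal number is easy to compute, but there is nothing external to hold it against.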
Why Data Is the Core of De Novo Success
- Before launch: Data shows whether the region has the demand, workforce, and payor stability to support a clinic and, most importantly, whether it can sustain a healthy operating margin.
- After launch: Data measures whether the clinic is actually working. Are outcomes better than market averages? Are costs in line with what’s sustainable?
And here’s the hook: once this data is aggregated and accessible, it becomes fuel for machine learning. AI models can detect patterns, benchmark your results against the market, and even make proactive recommendations on where to expand, which payors to prioritize, and how to staff sustainably.
Data isn’t just the input to a de novo decision. It’s also the metric for success once the doors open — and the training set for the AI that could make the next decision smarter than the last.
Bottom line: The ABA ecosystem doesn’t lack operational data. It lacks integration. De novo success depends on connecting internal insights with external benchmarks — and preparing for a future where AI can turn those integrated datasets into real, actionable guidance.
Next time: I’ll explore what it would take to actually connect these datasets — and why providers who invest in a data-first strategy now will be the only ones positioned to fully leverage AI later.