Databricks has introduced CustomerLake, a new “agentic” customer data platform that unifies customer data, identity resolution, audience building, and activation inside its lakehouse environment. The product is available in Private Preview, with Databricks naming HP, Circle K, AB InBev, and Getnet by Santander as early customers.
CustomerLake positions marketing execution closer to where enterprise data and AI models already live, using governed data access via Unity Catalog and integrations across the advertising and marketing stack. Databricks also says the system is designed to support 1:1 personalized experiences “a billion times a day,” framing the product around always-on, automated decision loops rather than batch campaign cycles.
Table of contents
Jump to each section:
- What CustomerLake changes in Databricks’ platform strategy
- How an “agentic CDP” differs from legacy CDP workflows
- Competitive pressure in the enterprise CDP market
- Implications for marketing ops, data teams, and measurement
What CustomerLake changes in Databricks’ platform strategy
CustomerLake signals a deeper push by Databricks into the marketing and advertising software budget, not just the data platform budget. Rather than positioning itself as infrastructure that feeds downstream martech tools, Databricks is presenting CustomerLake as a layer where identity, segmentation, and activation can happen without moving customer data into a separate CDP environment.
For enterprises already standardizing analytics and AI development on Databricks, that proposition targets a common pain point: duplicative customer datasets, lag from pipelines, and governance gaps created when marketing stacks rely on copies and exports. If CustomerLake works as described, it becomes an argument for consolidating parts of the martech stack around the same governed data foundation used by finance, product, and operations.
How an “agentic CDP” differs from legacy CDP workflows
Databricks frames legacy CDPs as campaign-centric systems built around a waterfall process: plan, build audiences, ship campaigns, then measure. CustomerLake is presented as an alternative built around continuous loops, where agents analyze behavior, decide what to do, and execute actions in near real time.
Functionally, the difference for teams is where decisions are made and how quickly they can be acted on:
- Identity and profiles: CustomerLake includes “profile agents” and “agentic identity resolution,” which suggests a mix of rule-based matching and AI-driven reconciliation for messy identifiers.
- Audience and activation: “Campaign agents” are positioned to build audiences and drive activation from the same environment where data resides, supported by native integrations and reverse ETL.
- Partner ecosystem: Databricks emphasizes interoperability, listing integrations and partners such as Adobe, Meta (audiences and Conversions API), Braze, Bloomreach, Iterable, LiveRamp, Acxiom, Epsilon, The Trade Desk, Twilio, Unity, and others.
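To make the identity bullet above concrete, here is a minimal sketch of the rule-based half of identity resolution: records that share any normalized identifier are merged into one profile using a union-find structure. This is an illustration only, not CustomerLake's method. The field names (`record_id`, `email`, `phone`, `device_id`) and the function `resolve_identities` are hypothetical, and an AI-driven reconciliation step would sit on top of something like this to handle fuzzy matches.

```python
from collections import defaultdict

# Hypothetical identifier fields; a real schema will differ.
IDENTIFIER_KEYS = ("email", "phone", "device_id")


def resolve_identities(records):
    """Group record IDs that share any identifier value (deterministic matching)."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for rec in records:
        rec_node = ("record", rec["record_id"])
        find(rec_node)  # make sure records without identifiers still appear
        for key in IDENTIFIER_KEYS:
            value = rec.get(key)
            if value:
                # Normalize so "A@X.com " and "a@x.com" count as the same identifier.
                union(rec_node, (key, value.strip().lower()))

    profiles = defaultdict(list)
    for rec in records:
        root = find(("record", rec["record_id"]))
        profiles[root].append(rec["record_id"])
    return sorted(sorted(ids) for ids in profiles.values())


# Made-up sample data for illustration.
sample = [
    {"record_id": "r1", "email": "a@x.com", "device_id": "d1"},
    {"record_id": "r2", "email": "A@X.com ", "phone": "555"},
    {"record_id": "r3", "phone": "555"},
    {"record_id": "r4", "email": "b@y.com"},
]
profiles = resolve_identities(sample)
print(profiles)
```

In this sample, `r1` and `r2` link through a shared email, and `r2` and `r3` through a shared phone number, so all three collapse into one profile while `r4` stays separate. Transitive merges of this kind are exactly where "messy identifiers" can over-merge profiles, which is one reason the governance question below matters.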
The practical question is not whether agents can generate segments, but whether marketing governance, approval workflows, and experimentation discipline can keep up with always-on automation without creating brand or compliance risk.
Competitive pressure in the enterprise CDP market
CustomerLake enters a mature and highly competitive CDP landscape that includes Adobe Experience Platform, Salesforce Data Cloud, Treasure Data, and Twilio Segment. Most of these systems already pitch identity resolution, real-time segmentation, and activation, with varying degrees of embedded AI.
Databricks’ differentiator is the “inside the data platform” premise: reducing data duplication by making the CDP native to where enterprises store data and build models. That could appeal to data leaders who want fewer systems moving sensitive customer data around. At the same time, incumbents can counter with end-to-end marketer UX, packaged connectors, and bundled execution across email, web personalization, and ad platforms. The adoption hinge may be whether CustomerLake can satisfy both audiences: data teams that care about governance and performance, and marketers that care about usability and speed.
Implications for marketing ops, data teams, and measurement
If CDP capabilities shift into the lakehouse, marketing operations becomes more intertwined with data engineering choices and governance models. That can improve consistency, but it also raises operational requirements:
- Data readiness: Always-on personalization depends on clean event schemas, reliable identity stitching, and clear definitions for lifecycle states.
- Governance and permissions: Unity Catalog governance may help, but teams still need policies for who can activate what audiences, where, and with what constraints.
- Measurement loops: The promise of “infinity campaigns” depends on feedback: downstream conversion and exposure data needs to flow back quickly and be usable for agent decisions.
- Cost model scrutiny: Databricks notes pricing will be consumption-based rather than a traditional license. Enterprises will need to model unit economics for segmentation and activation workloads, especially if continuous decisioning increases how often those workloads run.
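The cost point above can be sketched with simple arithmetic. The example below compares a once-a-day batch segmentation job with a lighter job that runs every 15 minutes. Every number is a placeholder: Databricks has not published CustomerLake pricing in this announcement, so the price per unit, the unit counts, and the `Workload` and `monthly_cost` names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    runs_per_day: float
    units_per_run: float  # hypothetical consumption units billed per run


def monthly_cost(workload, price_per_unit, days=30):
    """Estimated monthly spend: runs/day x units/run x price/unit x days."""
    return workload.runs_per_day * workload.units_per_run * price_per_unit * days


# Placeholder figures, not real pricing.
PRICE_PER_UNIT = 0.5

batch = Workload("daily_batch_segmentation", runs_per_day=1, units_per_run=40.0)
continuous = Workload("continuous_decisioning", runs_per_day=96, units_per_run=2.0)

batch_cost = monthly_cost(batch, PRICE_PER_UNIT)
continuous_cost = monthly_cost(continuous, PRICE_PER_UNIT)

for workload, cost in ((batch, batch_cost), (continuous, continuous_cost)):
    print(f"{workload.name}: {cost:.2f} per month")
```

Under these made-up inputs, each continuous run is 20 times cheaper than the batch run, yet the monthly total is 2,880 against 600 because the job runs 96 times a day. The specific figures do not matter; the point is that run frequency can dominate per-run efficiency under consumption pricing.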
For marketers, the strategic implication is that the boundary between “customer data platform” and “data platform” may blur further. The organizational implication is that teams will likely need shared operating rhythms, not just shared data, to avoid turning real-time activation into uncontrolled experimentation.

