RevOps Brief publishes practitioner-written deep dives for revenue operations professionals. No vendor sponsorship influences our editorial content. Subscribe free at www.revopsbrief.com
Israel has spent several years managing marketing systems, building lead generation engines, and owning the MarTech stack at SaaS and non-SaaS companies alike. His work sits at the intersection of marketing strategy, marketing automation, and the data infrastructure that makes both measurable. He has built and rebuilt HubSpot, Marketo, and custom marketing data environments across multiple growth stages, and has a particular obsession with the gap between what marketing spends and what leadership can actually see. This case study draws on a project he led to make a $500,000 marketing budget legible for the first time.
Company name withheld at the organisation's request. Outcomes verified by the RevOps Brief editorial team.
I have spent most of my career on the marketing side of the revenue house. Not in RevOps proper, but in the systems and operations that sit underneath marketing: the automation platforms, the lead gen infrastructure, the attribution tooling, the campaign tracking taxonomy, and the increasingly complex question of what, exactly, all of that spend is producing. This case study is written from that angle. It is not an account of how RevOps fixed marketing. It is an account of how someone who lives inside marketing systems built the infrastructure to make a $500,000 annual marketing budget legible to the people running the company.
The four lenses I use when approaching any problem like this:
| Lens | The Question It Answers | Why It Matters |
|---|---|---|
| The Operational Problem | What was breaking, and how was it showing up in decisions and conversations? | You cannot scope a fix until you can describe the dysfunction precisely. |
| The Systems Challenge | What was wrong with the infrastructure underneath the problem? | Most operational problems have a systems cause. Fix the symptom and it comes back. |
| The MarTech & Ops Approach | What did we actually build, decide, and change? | The choices made here determine whether the outcome is durable or fragile. |
| The Infrastructure Outcome | What does the organisation have now that it did not have before? | Results are important. So is the capability that produces them. |
Run every section through these four lenses and you get a complete picture of a revenue intelligence problem: not just what happened, but why it happened, how it was fixed, and what was actually built. That is the standard I am holding this write-up to.
What Was Breaking, and How It Showed Up
The Situation in Plain Terms
The marketing budget was approximately $500,000 per year. That is a straightforward number. What was not straightforward was how different people in the building interpreted it. Finance had one figure, derived from their cost centre accounting. Marketing had another, based on what they were actively tracking in campaign tools. RevOps, when asked to reconcile the two for a board pack, produced a third. The range between the highest and lowest figure was not trivial. Depending on whether you included agency retainers, SaaS tool subscriptions, and freelance content spend, the number moved by a material amount. And when your numerator is uncertain, your CAC is uncertain, and when your CAC is uncertain, every budget conversation is built on sand.
Against that spend, nobody could tell you with confidence what was being produced. Not pipeline. Not opportunities. Not customers. The attribution coverage, the percentage of closed-won deals with any traceable marketing touchpoint in their history, was 23%. That means for more than three quarters of the business we had signed, we had no documented evidence of what had caused a prospect to engage in the first place.
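To make the coverage metric concrete, here is a minimal sketch of how it could be computed. The `Deal` structure and its field names are hypothetical and do not reflect HubSpot's schema; the sample data is made up to reproduce the 23% figure.

```python
from dataclasses import dataclass, field


@dataclass
class Deal:
    """Hypothetical closed-won deal record; field names are illustrative only."""
    deal_id: str
    touchpoints: list[str] = field(default_factory=list)  # marketing touches in the contact's history


def attribution_coverage(closed_won: list[Deal]) -> float:
    """Share of closed-won deals with at least one traceable marketing touchpoint."""
    if not closed_won:
        return 0.0
    attributed = sum(1 for deal in closed_won if deal.touchpoints)
    return attributed / len(closed_won)


# Made-up sample: 23 of 100 deals carry any touchpoint.
deals = [Deal(f"d{i}", ["paid_search"] if i < 23 else []) for i in range(100)]
print(f"{attribution_coverage(deals):.0%}")
```

Note that this only measures whether a touchpoint exists, not whether it is correctly labelled, which is a separate problem.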
This is not a minor analytical gap. When you do not know what is generating your customers, you are making budget decisions by intuition. Some channels continue to receive investment because they always have. Others get cut because they feel expensive relative to a denominator nobody agrees on. The marketing team presents results in a monthly meeting and leadership nods, not because they believe the numbers, but because they do not have better ones to challenge them with.
In the twelve months before this project started, CAC was presented to the board three times using a $500,000 annual marketing budget as the starting point. Each time, the numerator was calculated differently. The range across those three presentations was 34%. On a $500,000 budget, a 34% swing in reported efficiency is not a rounding error. It is the difference between a channel allocation strategy that makes sense and one that is built on a number nobody agrees is real.
Where the Dysfunction Surfaced
Revenue intelligence problems do not stay contained in data systems. They surface in conversations. These are the four places this one showed up most frequently:
| Conversation | What Should Have Happened | What Actually Happened |
|---|---|---|
| Monthly marketing review | Channel-level ROI discussion based on pipeline attribution | Presentation of spend and vanity metrics (MQLs, impressions) with no pipeline linkage |
| Quarterly board pack | Single, agreed CAC figure with clear methodology | Three to four days of manual reconciliation producing a number everyone privately distrusted |
| Budget planning | Evidence-based channel allocation driven by cost per pipeline dollar | Prior year allocations with marginal adjustments based on gut feel and vendor relationships |
| CRO and CMO alignment | Shared view of marketing contribution to pipeline | Quarterly disagreement about whose numbers were right and whose methodology was broken |
The CRO-CMO dynamic is worth unpacking. When two senior leaders are regularly in disagreement about what a single function is contributing, it is almost never a personality conflict. It is an information architecture problem. They are each looking at a different slice of a fragmented system and drawing different conclusions from the slice they can see. Give them both the same complete picture and the disagreement usually resolves itself. The project goal, stated simply, was to build that picture.
What Triggered the Project
The trigger was a board question in Q4 2024. A board member, reviewing the quarterly deck, asked a straightforward question: what is our CAC by channel, and how has it trended over the last four quarters? It is exactly the kind of question that should take thirty seconds to answer from a live dashboard.
It took eleven days. Eleven days of RevOps analyst time, Finance team involvement, and a lot of email threads about methodology disagreements. The answer we eventually produced came with three footnotes explaining the assumptions behind it. The board member accepted it, noted the footnotes, and said something I have thought about since: "If it takes this long to answer a question about where we are spending money, I am not sure we actually know the answer."
He was right. We did not. And after that meeting, we had executive sponsorship to fix it.
What Was Wrong With the Infrastructure
The Stack Audit
The first task was mapping what existed. Not what was supposed to exist according to the vendor contracts and integration documentation, but what was actually happening with data in practice. Those two pictures are frequently different, and in this case they were significantly different.
| System | What It Was Used For | The Actual Data Problem |
|---|---|---|
| HubSpot (CRM + Marketing Hub) | Contact records, marketing email, lead scoring, deal pipeline | Lead source field had 23 distinct values with no enforced taxonomy. UTM data captured inconsistently. Revenue figures pulled from deals which were regularly not updated by AEs. |
| Google Ads + LinkedIn Ads | Paid acquisition campaigns | Conversion tracking misconfigured. Post-view attribution window set differently on each platform, producing non-comparable ROAS figures. Neither platform connected to CRM at the deal level. |
| Google Analytics 4 | Web traffic and conversion tracking | GA4 and HubSpot form data not reconciled. Session-level conversions in GA4 did not match contact creation events in HubSpot for the same periods by a margin of 18 to 31% depending on the month. |
| Finance ERP | Actual marketing spend by vendor and category | Spend categorised by accounting codes that did not map to marketing channel taxonomy. Paid search spend was split across three accounting codes. Agency fees were in overhead, not marketing. |
| Snowflake (Data Warehouse) | Central analytics layer | Only HubSpot and Finance ERP were connected via Fivetran. Google Ads, LinkedIn, and GA4 were not. The warehouse was 40% populated and not usable for cross-system attribution. |
The CAC Calculation Problem in Detail
CAC should be straightforward: total sales and marketing spend divided by new customers acquired in a period. The complexity is entirely in the definitions. What counts as marketing spend? What counts as a new customer? Which period do you use for spend versus which period for customers, given that spend in one quarter produces customers in the next?
In this organisation, those three definitional questions had three different answers depending on who was doing the calculation:
| Input | Finance Definition | Marketing Definition | RevOps Definition |
|---|---|---|---|
| Marketing spend numerator | All spend coded to marketing cost centres, including agency retainers and software | Paid media spend only (direct campaign spend) | Paid media plus SDR tooling and data enrichment costs |
| Sales spend numerator | All sales headcount costs including base salary | Sales headcount fully excluded | SDR headcount only, 50% weighted |
| New customer denominator | Customers with a first invoice in the period | Deals marked Closed Won in CRM in the period | Customers with signed contract date in the period |
| Attribution period | Same quarter spend vs same quarter customers | Same quarter spend vs customers sourced by campaigns running that quarter | Trailing 90-day spend vs customers closed in period |
None of these definitions is wrong in the abstract. The problem is that they produce different numbers from the same underlying data, and when different stakeholders use different definitions without being explicit about it, every CAC conversation becomes a methodology debate rather than a decision-making conversation. The fix is not to find the "right" definition. It is to pick one, document it, enforce it across all systems, and make it the only number anyone ever reports.
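The effect of the three definitions can be sketched in a few lines. Everything below is illustrative: the `QuarterInputs` fields and all figures are invented, the attribution period differences are ignored, and the Finance version assumes SDR tooling sits outside the marketing cost centres.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuarterInputs:
    """Hypothetical quarterly figures; none of these numbers come from the case study."""
    paid_media: float
    agency_and_software: float
    sdr_tooling_and_enrichment: float
    sdr_headcount: float
    other_sales_headcount: float
    customers_first_invoice: int
    customers_closed_won: int
    customers_contract_signed: int


def cac_finance(q: QuarterInputs) -> float:
    # All marketing cost-centre spend plus all sales headcount, per first-invoice customer.
    spend = q.paid_media + q.agency_and_software + q.sdr_headcount + q.other_sales_headcount
    return spend / q.customers_first_invoice


def cac_marketing(q: QuarterInputs) -> float:
    # Paid media only, sales headcount excluded, per Closed Won deal.
    return q.paid_media / q.customers_closed_won


def cac_revops(q: QuarterInputs) -> float:
    # Paid media plus SDR tooling and enrichment, SDR headcount at 50%, per signed contract.
    spend = q.paid_media + q.sdr_tooling_and_enrichment + 0.5 * q.sdr_headcount
    return spend / q.customers_contract_signed


quarter = QuarterInputs(
    paid_media=80_000,
    agency_and_software=30_000,
    sdr_tooling_and_enrichment=10_000,
    sdr_headcount=60_000,
    other_sales_headcount=140_000,
    customers_first_invoice=20,
    customers_closed_won=25,
    customers_contract_signed=22,
)

for label, fn in [("Finance", cac_finance), ("Marketing", cac_marketing), ("RevOps", cac_revops)]:
    print(f"{label:<10} CAC = ${fn(quarter):,.0f}")
```

Same quarter, same underlying records, three very different figures. The spread comes entirely from the definitions.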
The Attribution Gap: Why 23% Is Worse Than It Sounds
Attribution coverage of 23% means that for 77% of closed-won deals, we had no documented marketing touchpoint in the contact's history. But it is even messier than that number suggests, because the 23% that did have attribution was itself unreliable. The lead source field in HubSpot had 23 distinct values, many of them overlapping: "Website", "Web", "Organic Web", "Inbound - Web", and "Website Contact Form" were all active values representing what was functionally the same source. That is before you get to the values that had been created by individual team members for one-off campaigns and never cleaned up.
The practical consequence was that "organic search" as a channel was being systematically understated because its leads were being split across five different labels, while "direct" was being overstated because any lead where the source field was blank defaulted to direct in the reporting logic. Every channel-level performance conversation was based on miscategorised inputs. We were not just missing data. We were working with corrupted data and treating it as reliable.
A lead source taxonomy with 23 values and no enforcement is not a taxonomy. It is a free-text field with a dropdown. Before you can build attribution reporting, you need fewer categories, strict definitions for each one, and a system that enforces the definition at the point of data entry, not retrospectively in a report.
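A minimal sketch of what enforcement looks like in logic terms, assuming a simple lookup from legacy labels to canonical categories. The five legacy labels are the ones named above; the canonical name "Organic Web" and the "Needs Review" holding value are assumptions for illustration, not the actual taxonomy.

```python
from collections import Counter

REVIEW_QUEUE = "Needs Review"  # hypothetical holding value, deliberately not "Direct"

# The five overlapping legacy labels named in the text; the canonical name is an assumption.
LEGACY_TO_CANONICAL = {
    "website": "Organic Web",
    "web": "Organic Web",
    "organic web": "Organic Web",
    "inbound - web": "Organic Web",
    "website contact form": "Organic Web",
}


def normalise_lead_source(raw):
    """Map a raw lead source to its canonical category, or route it to review."""
    key = (raw or "").strip().lower()
    return LEGACY_TO_CANONICAL.get(key, REVIEW_QUEUE)


raw_values = ["Website", "Web", "Inbound - Web", "", None, "Organic Web"]
counts = Counter(normalise_lead_source(value) for value in raw_values)
print(counts)
```

The important design choice is the fallback: blank or unrecognised values land in a review bucket rather than inflating "Direct".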
What We Built and the Order We Built It In
Principle One: Definitions Before Data
Before connecting a single data source or writing a single dbt model, we spent four weeks on definitions. This is the step that most teams skip because it feels slow and unproductive compared to building things. It is not slow. It is the thing that determines whether the things you build are durable.
Three documents came out of those four weeks. First, a CAC Methodology Document: a single-page specification of exactly how CAC would be calculated, what would and would not be included in the numerator, how the denominator would be defined, and what the attribution period logic would be. Signed off by Finance, Marketing, and RevOps. Second, a Lead Source Taxonomy: eight source categories with explicit definitions, example URLs or campaign types for each, and a mapping from every existing HubSpot value to one of the eight. Third, a Marketing Spend Classification Framework: a mapping from every Finance ERP cost code to a marketing channel category, agreed with the Finance controller.
These three documents are unglamorous. They will not impress anyone at a conference. They are also load-bearing. Every dashboard, every model, and every executive conversation that followed was built on top of them. Get the definitions wrong and the intelligence layer you build on top will produce confident-looking numbers that are wrong in systematic ways.
The CAC Methodology Document is the most important artefact from this entire project. Not the Snowflake model. Not the BI dashboard. The one-page document that says exactly what CAC means in this organisation, with the signature of the CFO, CMO, and CRO at the bottom. That document ends every methodology debate before it starts.
The Technical Architecture
With definitions in place, the technical build had a clear specification. We were not building a generic analytics platform. We were building the specific infrastructure needed to produce reliable answers to five executive questions: What is our CAC? What is our CAC by channel? How is it trending? What is marketing contributing to pipeline? Where should the next marketing dollar go?
What Each Layer Did
Sources (four, all connected). Before this project, Snowflake had HubSpot and the Finance ERP connected via Fivetran. The two missing sources were the paid channels (Google Ads and LinkedIn Ads) and GA4. Adding them was not technically complex. The hard part was what came before the connection: enforcing consistent campaign naming conventions in the ad platforms, aligning GA4 event taxonomy with HubSpot contact creation events, and making sure every source was writing to the warehouse at a consistent date granularity. Coming from a MarTech background, this is the part that took the most time, because it required going back into the platforms themselves and fixing how data was being captured before any pipeline work could produce clean output.
Transform layer (dbt, four core models). The four models supply the inputs for the five executive questions listed above. The spend_by_channel model joins Finance ERP spend against the channel classification framework, producing a single spend figure by channel by month. The attribution_touches model joins HubSpot contact history against the eight-category taxonomy, enforcing the taxonomy at transformation rather than relying on clean upstream data. The cac_by_period_channel model applies the agreed CAC methodology to produce a single CAC figure with all the definitional choices baked in. The pipeline_contribution model calculates marketing's traceable contribution to pipeline and closed-won revenue.
Serve layer (BI dashboards, three audiences). The executive dashboard surfaces the four or five metrics the CEO and board actually ask about. The marketing intelligence dashboard gives the CMO and marketing team channel-level depth. The board pack feeds auto-refresh from the Snowflake models, so the numbers in the quarterly deck are produced in four hours rather than three days, and they are the same numbers that appear in the executive dashboard because they come from the same source.
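The logic of the spend_by_channel model can be sketched outside dbt. The version below is Python for illustration only, not the model itself; the cost codes and the channel names in the mapping are hypothetical stand-ins for the Spend Classification Framework.

```python
from collections import defaultdict

# Hypothetical ERP cost codes; the real mapping lived in the Spend Classification Framework.
COST_CODE_TO_CHANNEL = {
    "6101": "Paid Search",
    "6102": "Paid Search",
    "6107": "Paid Search",  # paid search was split across three accounting codes
    "7300": "Agency Fees",  # previously booked as overhead
}


def spend_by_channel(ledger_rows):
    """Aggregate (month, cost_code, amount) rows into spend per (month, channel).

    Returns the totals plus the set of cost codes that had no mapping.
    """
    totals = defaultdict(float)
    unmapped = set()
    for month, cost_code, amount in ledger_rows:
        channel = COST_CODE_TO_CHANNEL.get(cost_code)
        if channel is None:
            unmapped.add(cost_code)  # surface gaps instead of silently dropping spend
            continue
        totals[(month, channel)] += amount
    return dict(totals), unmapped


rows = [
    ("2025-01", "6101", 1000.0),
    ("2025-01", "6102", 500.0),
    ("2025-01", "7300", 200.0),
    ("2025-01", "9999", 50.0),
]
totals, unmapped = spend_by_channel(rows)
print(totals)
print("Unmapped cost codes:", unmapped)
```

Returning the unmapped codes explicitly matters: a new accounting code that nobody classified should fail loudly rather than quietly shrink the spend total.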
Fixing the Attribution Upstream
The technical build was contingent on fixing data quality problems upstream in the marketing systems themselves. This is the part that often gets handed off to engineering, and should not be. The people who best understand why a lead source field has 23 values are the people who have been inside the marketing automation platform long enough to see how it accumulates technical debt over time. A dbt model built on top of a broken taxonomy produces faster wrong answers, not right ones. Three operational changes happened in parallel with the warehouse build, and all three required marketing systems knowledge to execute correctly:
- Lead source field enforced to eight values. The existing 23 values were mapped to eight categories. HubSpot was configured so that new contacts created without a valid source value were routed to a review queue rather than defaulting to "Direct." The review queue was owned by the RevOps analyst on a daily 15-minute review cadence. This eliminated the systematic overstatement of "Direct" as a source within six weeks.
- UTM parameter discipline made non-negotiable. Every campaign URL, every paid ad, every email link was required to carry a UTM parameter following a strict naming convention. HubSpot workflows were updated to capture and persist UTM values on first touch rather than last touch for campaigns where first touch was the relevant signal. The Marketing team was given a UTM builder tool so that compliance required no more effort than non-compliance.
- Google Ads and LinkedIn conversion tracking reconfigured. The misconfigured conversion windows were standardised to a 30-day post-click, zero post-view model across both platforms, making the ROAS figures comparable for the first time. Both platforms were connected to HubSpot via their native CRM integrations, meaning ad-sourced contacts now had platform data flowing into their HubSpot records within 24 hours of contact creation.
None of these changes are glamorous. Enforcing a UTM naming convention is not the kind of initiative that gets talked about at conferences. But attribution coverage went from 23% to 91% in fourteen weeks, and that improvement came almost entirely from operational discipline rather than technical sophistication.
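As an illustration of the UTM builder idea, here is a minimal sketch that refuses to produce a URL unless the values follow a convention. The convention itself (lowercase letters, digits, underscores, hyphens) and the function name are assumptions; the actual naming rules are not given in this write-up.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed convention: lowercase letters, digits, underscores and hyphens only.
_TOKEN = re.compile(r"[a-z0-9_-]+")


def build_utm_url(base_url, source, medium, campaign):
    """Append UTM parameters to base_url, rejecting values that break the convention."""
    params = {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    for name, value in params.items():
        if not _TOKEN.fullmatch(value):
            raise ValueError(f"{name}={value!r} violates the naming convention")
    parts = urlsplit(base_url)
    # Keep any query parameters already on the URL and add the UTM ones after them.
    query = parse_qsl(parts.query) + list(params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


print(build_utm_url("https://example.com/demo", "linkedin", "paid_social", "q1_launch"))
```

Because the builder is the only way to get a tagged URL, the convention is enforced at creation rather than checked afterwards.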
The Capabilities Built, Not Just the Results Achieved
The Metrics, Clearly Stated
Of the headline metrics, the 22% spend reallocation is the one I want to spend time on. It is not an output of the intelligence system. It is a decision that became possible because of the intelligence system. When attribution coverage is 23%, you cannot make defensible channel allocation decisions. When it is 91% and the spend data is connected to pipeline outcomes at the channel level, the decisions become relatively obvious. The CMO did not need to be persuaded by the data. She looked at the channel ROI table and made the reallocation in the same meeting we presented it.
Channel ROI: What We Could See That We Could Not See Before
The channel ROI view was the output that changed behaviour most visibly. For the first time, the organisation could see cost-per-pipeline-dollar by channel, not just cost-per-lead or cost-per-click. The difference matters enormously. A channel that produces cheap leads that never convert to pipeline looks far better than it is when you measure only at the top of the funnel.
| Channel | Monthly Spend | Pipeline Attributed (trailing 6mo) | Cost per Pipeline $ (monthly spend ÷ trailing 6mo pipeline) | Prior Assumption | Action Taken |
|---|---|---|---|---|---|
| Paid Search (Google) | $41,000 | $1.2M | $0.034 | Assumed high performer | Maintained; budget protected |
| Content / SEO | $18,000 | $2.1M | $0.0086 | Undervalued; hard to measure | Budget increased 40% |
| LinkedIn Ads | $38,000 | $620,000 | $0.061 | Assumed brand building | Budget reduced 35%; targeting overhauled |
| Content Syndication | $24,000 | $180,000 | $0.133 | Assumed pipeline driver | Paused pending vendor review |
| Events and Webinars | $31,000 | $890,000 | $0.035 | Felt expensive, unclear ROI | Continued; attribution model validated the investment |
| Outbound SDR (tooling only) | $12,000 | $1.4M | $0.0086 | Not tracked as marketing spend | Reclassified; now included in CAC model |
Content syndication is the example I find most instructive. It had been running at $24,000 per month for over a year. It had produced many MQLs, which is what had justified the spend. What the new model showed was that those MQLs were not converting to pipeline at any meaningful rate. Cost per pipeline dollar was $0.133, versus $0.034 for paid search and $0.0086 for content. The vendor had been reporting conversion metrics that stopped at the MQL handoff. The pipeline data lived in HubSpot and had never been connected to the MQL data from the vendor. Once it was connected, the decision to pause the program was not controversial. It was obvious.
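The ratios in the channel table can be reproduced directly. The sketch below uses the table's own figures and divides monthly spend by trailing six-month pipeline, which is what the published numbers work out to. Because the two periods differ, the ratio is best read as a relative ranking across channels rather than an absolute return figure.

```python
# Figures copied from the channel table: (monthly spend, pipeline attributed over trailing 6 months).
CHANNELS = {
    "Paid Search (Google)": (41_000, 1_200_000),
    "Content / SEO": (18_000, 2_100_000),
    "LinkedIn Ads": (38_000, 620_000),
    "Content Syndication": (24_000, 180_000),
    "Events and Webinars": (31_000, 890_000),
    "Outbound SDR (tooling only)": (12_000, 1_400_000),
}


def cost_per_pipeline_dollar(spend, pipeline):
    """Spend required to produce one dollar of attributed pipeline."""
    return spend / pipeline


# Sort from most expensive to cheapest per pipeline dollar.
ranked = sorted(
    CHANNELS.items(),
    key=lambda item: cost_per_pipeline_dollar(*item[1]),
    reverse=True,
)

for name, (spend, pipeline) in ranked:
    print(f"{name:<30} ${cost_per_pipeline_dollar(spend, pipeline):.4f}")
```

Sorted this way, content syndication comes out as the most expensive channel per pipeline dollar, followed by LinkedIn Ads.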
The Board Pack: From Manual to Automatic
The quarterly board pack had historically required three days of work across RevOps, Finance, and Marketing. The workflow involved pulling data from multiple systems, reconciling figures in a shared spreadsheet, and manually entering the results into a slide template. The process was documented nowhere, understood by two people, and produced results that carried implicit uncertainty even when nobody said so explicitly.
Post-build, the board pack metrics are driven directly from the Snowflake models. The BI layer produces a standard set of charts and tables on a refresh schedule. The RevOps analyst runs the refresh, validates that the numbers are within expected ranges using an automated sanity check workflow, and exports the outputs into the board template. Total time: four hours, most of which is validation rather than production.
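A range-based sanity check of the kind described here might look like the sketch below. The metric names and the expected ranges are hypothetical; the actual workflow and its thresholds are not specified in this write-up.

```python
def sanity_check(metrics, expected_ranges):
    """Return a list of human-readable failures; an empty list means every check passed."""
    failures = []
    for name, (low, high) in expected_ranges.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif not low <= value <= high:
            failures.append(f"{name}: {value} outside [{low}, {high}]")
    return failures


# Hypothetical refresh output and tolerance bands.
metrics = {"blended_cac": 5200.0, "attribution_coverage": 1.4}
expected_ranges = {
    "blended_cac": (1000.0, 20000.0),
    "attribution_coverage": (0.0, 1.0),
    "pipeline_contribution": (0.0, 1.0),
}

for failure in sanity_check(metrics, expected_ranges):
    print("CHECK FAILED:", failure)
```

A check like this does not prove the numbers are right. It catches the class of error where a number is obviously wrong before it reaches a slide.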
More important than the time saving is the consistency. The same CAC that appears in the board pack appears in the executive dashboard. It is the same number because it comes from the same model. There are no footnotes about methodology. There are no reconciliation emails. There is one number, one methodology, and a documented derivation that any stakeholder can trace if they choose to.
The first board meeting after the new system went live, the board member who had asked the eleven-day question looked at the CAC slide, looked up, and asked: "Is this the same methodology as last quarter?" The answer was yes, because it was the same model. He nodded and moved on. That forty-second exchange was the outcome this project was built for.
The Things Worth Being Honest About
The Definitions Phase Took Longer Than Expected and Should Have Taken Longer Still
Four weeks on definitions felt like a long time when we were in it. In retrospect it was not long enough. The CAC Methodology Document went through six drafts before it was signed. The Lead Source Taxonomy went through four. If we had spent two more weeks in that phase, we would have caught three definitional edge cases that became bugs in the dbt models: how to handle multi-product deals for CAC attribution, what to do with channel data for contacts who had been in the database for more than 18 months before the new taxonomy was applied, and how to treat partner-sourced leads within the channel framework. All three required model revisions after launch that would have been cleaner to handle in the design phase.
UTM Discipline Is a System Design Problem, Not a Training Problem
I have tried to solve UTM compliance through training more times than I want to admit. It does not work at scale. People who are building campaigns are thinking about the campaign, not the tracking string appended to the URL. In the first six weeks after the naming convention policy went live, 23% of new campaign URLs had missing or incorrectly formatted UTM parameters. The solution was not more training. It was removing the need for manual compliance by building a UTM generator directly into the campaign creation workflow in HubSpot. The generator enforced the naming convention at the point of creation, made it no harder to comply than to ignore, and flagged any external campaign URLs added outside the tool in a weekly audit. The violation rate dropped from 23% to under 4%. The lesson is one I apply to every MarTech governance problem: if compliance requires more effort than non-compliance, you will always lose the battle. Design the system so that doing it right is the path of least resistance.
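The weekly audit side of this can be sketched as a compliance check over a list of campaign URLs. The required parameters and the formatting rule (lowercase, no spaces) are assumptions standing in for the real convention.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED = ("utm_source", "utm_medium", "utm_campaign")


def is_compliant(url):
    """True when every required UTM parameter is present, lowercase, and free of spaces."""
    query = parse_qs(urlsplit(url).query)
    for name in REQUIRED:
        values = query.get(name)
        if not values:
            return False  # parameter missing or empty
        value = values[0]
        if value != value.lower() or " " in value:
            return False
    return True


def violation_rate(urls):
    """Fraction of URLs that fail the compliance check."""
    if not urls:
        return 0.0
    return sum(not is_compliant(url) for url in urls) / len(urls)


# Made-up audit sample: one URL is missing utm_campaign.
audit_sample = [
    "https://example.com/a?utm_source=google&utm_medium=cpc&utm_campaign=brand",
    "https://example.com/b?utm_source=linkedin&utm_medium=paid_social&utm_campaign=q1_launch",
    "https://example.com/c?utm_source=newsletter&utm_medium=email&utm_campaign=jan_digest",
    "https://example.com/d?utm_source=google&utm_medium=cpc",
]
print(f"Violation rate: {violation_rate(audit_sample):.0%}")
```

An audit like this is the backstop, not the control. The control is the generator that makes a non-compliant URL hard to create in the first place.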
The Marketing Team Sees Attribution Data Differently Than RevOps Does
This is something I understand from having been on the marketing side of the fence for most of my career. When performance data becomes visible for the first time, the people who built the programs being measured do not experience it neutrally. They experience it as a review. The LinkedIn program had been running for over a year. The person managing it had invested real effort. When the pipeline attribution data showed it was among the worst-performing channels by cost per pipeline dollar, behind only content syndication, the immediate response was not "great, now we can reallocate," it was "this methodology is wrong" and "LinkedIn is brand building and you cannot attribute it linearly." Some of that is defensible. Most of it was a natural human reaction to seeing work evaluated by a metric it had never been held to. We should have built the performance framing into the project launch, not introduced it after the data was already in the room. Specifically: the marketing team should have been involved in defining the measurement methodology before the results appeared, so that the method was already agreed by the time anyone saw the outcome.
The Board Pack Automation Should Have Come Last, Not Third
In the project sequencing, we automated the board pack feeds before we had fully validated the underlying models. The reasoning was that the board pack deadline was fixed and the automation would save time immediately. The consequence was that the first automated board pack contained a figure in the pipeline contribution model that was off by 8% due to a date boundary handling error in the dbt model. We caught it in the sanity check, but it meant a manual correction for that quarter and a delay to the board pack delivery that felt embarrassing given that the stated goal was to make delivery faster. Validate the models completely before building automation on top of them. The automation is only valuable if what it automates is correct.
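The write-up does not say what the date boundary error was. One common form is an inclusive end date that either drops events late on the last day of a period or counts midnight events in two periods. A minimal sketch of the usual defence, a half-open interval, under that assumption:

```python
from datetime import datetime


def quarter_bounds(year, quarter):
    """Return (start, end) for a calendar quarter as a half-open interval [start, end)."""
    start_month = 3 * (quarter - 1) + 1
    start = datetime(year, start_month, 1)
    # The end is the first instant of the next quarter, which is excluded from this one.
    end = datetime(year + 1, 1, 1) if quarter == 4 else datetime(year, start_month + 3, 1)
    return start, end


def in_quarter(timestamp, year, quarter):
    """True when timestamp falls inside the quarter; the end boundary is exclusive."""
    start, end = quarter_bounds(year, quarter)
    return start <= timestamp < end


last_second_of_q4 = datetime(2024, 12, 31, 23, 59, 59)
first_instant_of_q1 = datetime(2025, 1, 1)

print(in_quarter(last_second_of_q4, 2024, 4))    # belongs to Q4 2024
print(in_quarter(first_instant_of_q1, 2024, 4))  # does not belong to Q4 2024
print(in_quarter(first_instant_of_q1, 2025, 1))  # belongs to Q1 2025
```

With half-open intervals every timestamp belongs to exactly one period, which is the property a pipeline contribution total needs.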
The Sequence That Works
The four-lens framework is a diagnostic tool and a sequencing guide. Build in this order and the project builds cleanly. Skip a layer and you will come back to it under pressure.
| Phase | Duration | Deliverables | Sign-Off Required |
|---|---|---|---|
| Phase 0: Definitions | 3 to 5 weeks | CAC Methodology Doc, Lead Source Taxonomy, Spend Classification Framework | CFO, CMO, CRO, RevOps |
| Phase 1: Source Layer | 2 to 4 weeks | All data sources connected to warehouse, UTM discipline enforced, conversion tracking standardised | Engineering, RevOps |
| Phase 2: Transform Layer | 4 to 6 weeks | dbt models for spend, attribution, CAC, pipeline contribution. All models validated against known-correct outputs. | RevOps, Finance |
| Phase 3: Serve Layer | 2 to 3 weeks | Executive dashboard, marketing intelligence dashboard, validated board pack feeds | CMO, CEO, RevOps |
| Phase 4: Governance | Ongoing | Weekly attribution audit, monthly model review, quarterly methodology review, campaign URL compliance automation | RevOps owns |
Phase 0 is the one that gets skipped. Every team I have talked to that has attempted this kind of build and ended up with a system nobody trusts has skipped Phase 0. They built the data models first and tried to resolve the definitional disagreements in the model logic rather than in a document. The model logic cannot resolve a disagreement between Finance and Marketing about what counts as marketing spend. A document with signatures can. Do Phase 0. Take the time it actually requires.
The total engineering effort in this project was roughly six weeks of a data engineer's time. The total RevOps effort was approximately four months of a senior RevOps manager's time, heavily weighted toward Phase 0 and the operational change management work in Phase 1. If your project plan shows the reverse weighting, with more technical time than operational time, review it carefully. The technical work is not what determines whether this succeeds.
Revenue intelligence is not a dashboard. It is not a data warehouse. It is not an attribution tool. It is the organisational capability to answer questions about revenue with data that people in the room agree is reliable. That sounds like a low bar. In most SaaS companies operating at scale, it is not one they clear consistently.
The project described in this case study took approximately five months from kick-off to a working intelligence layer. The technical work occupied maybe a quarter of that time. The rest was definitional alignment, operational change management, and the kind of quiet unglamorous governance work that does not make for conference talks but does make for systems that stay accurate after the initial build team has moved on to other things.
The CMO now runs budget allocation reviews from a dashboard that she trusts. The CFO signs off on a CAC figure without footnotes. The board asks one question and gets an answer in the same meeting. These outcomes sound simple because the capability that produces them, once built, is simple to operate. Getting there is not simple. But the path is clear enough that any RevOps team with executive sponsorship and the discipline to do the definitional work first can build it.
One final thing. The board member's comment, that if it takes this long to answer a question about spend then we do not actually know the answer, should not require eleven days and a manual reconciliation to prove true. But sometimes it does. Sometimes that is the thing that creates the space to build something better.
"Marketing contribution is not hard to measure. It is hard to measure once, agree on, document, and never have to re-argue again. That is the actual problem. And it is a RevOps problem, not an analytics problem."
Israel Akinfenwa, Revenue Operations Leader (company name withheld by request)
Appendix A: The CAC Methodology Document Template
This is the structure of the one-page document that ended the methodology debates. Every organisation will fill it in differently. The structure is what matters. Each field forces a decision that would otherwise remain implicit and disputed.
| Field | Description | Notes |
|---|---|---|
| Marketing Spend: Included | List every spend category that is included in the numerator | Be exhaustive. Agency fees, software subscriptions, paid media, events, and data costs all need an explicit yes or no. |
| Marketing Spend: Excluded | List every spend category explicitly excluded | Exclusions are as important as inclusions. "Overhead" is not a sufficient answer. |
| Sales Spend: Treatment | Is sales headcount or tooling included? If partially, what is the weighting? | There is no universally correct answer. Pick one and write it down. |
| New Customer Definition | Exactly which event defines a new customer: invoice, contract signature, CRM Closed Won, or other? | Must be traceable to a specific system field, not a judgment call. |
| Attribution Period Logic | What period of spend is matched against what period of customers? | Specify the lag assumption explicitly. "Same quarter" is ambiguous. "Trailing 90-day spend matched to customers closed in the calendar quarter" is not. |
| Blended vs Channel CAC | Is this document defining blended CAC, channel-level CAC, or both? If both, are the methodologies consistent? | Channel-level CAC requires attribution, which requires an attribution model. Document that separately if needed. |
| Reporting Cadence | How often is CAC recalculated and reported? Monthly, quarterly, trailing 12 months? | Quarterly reporting on a metric that moves monthly can hide important trends. |
| Sign-Offs | Names and titles of everyone who has agreed to this methodology | This is the part that ends the debate. Without signatures, the methodology is a suggestion. |
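One way to keep the template honest is to mirror it in a small machine-readable structure that can be checked for obvious gaps. This is a sketch only: the class, the field names, and the checks are illustrative and were not part of the project described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacMethodology:
    """Machine-readable mirror of the template fields; all example values are placeholders."""
    marketing_spend_included: tuple[str, ...]
    marketing_spend_excluded: tuple[str, ...]
    sales_spend_treatment: str
    new_customer_definition: str
    attribution_period_logic: str
    scope: str  # "blended", "channel", or "both"
    reporting_cadence: str
    sign_offs: tuple[str, ...]

    def problems(self):
        """Return a list of structural issues; an empty list means none were found."""
        issues = []
        overlap = set(self.marketing_spend_included) & set(self.marketing_spend_excluded)
        if overlap:
            issues.append(f"categories both included and excluded: {sorted(overlap)}")
        if self.scope not in {"blended", "channel", "both"}:
            issues.append(f"unknown scope: {self.scope!r}")
        if not self.sign_offs:
            issues.append("no sign-offs recorded")
        return issues


draft = CacMethodology(
    marketing_spend_included=("paid media", "agency fees"),
    marketing_spend_excluded=("agency fees",),
    sales_spend_treatment="SDR headcount only, 50% weighted",
    new_customer_definition="signed contract date",
    attribution_period_logic="trailing 90-day spend matched to customers closed in the calendar quarter",
    scope="both",
    reporting_cadence="monthly",
    sign_offs=(),
)
print(draft.problems())
```

The checks are deliberately shallow. They cannot tell you whether the methodology is sensible, only whether the document contradicts itself or lacks sign-off.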
Appendix B: Lead Source Taxonomy Design Principles
A lead source taxonomy fails when it has too many categories, when the categories overlap, or when there is no enforcement mechanism. Here are the five principles that governed ours:
- Eight or fewer categories. Above eight, the taxonomy becomes a management problem rather than a clarity tool. If you feel you need more than eight, you are probably confusing channels with sub-channels. Paid Search and Paid Social are channels. Google Ads Brand and Google Ads Non-Brand are sub-channels. Keep the taxonomy at the channel level.
- Mutually exclusive definitions. Every lead source category must have a definition that makes it impossible for a lead to plausibly belong to more than one category. "Organic" and "Referral" are not mutually exclusive if you have SEO-driven content that people share. Write the definitions until they are exclusive.
- Traceable to a system field. Each category must map to something observable in your data: a UTM source value, a form page URL pattern, a domain match, or an explicit contact creation source. If the category relies on human judgment at the point of entry, it will drift.
- Enforced at creation, not corrected in reporting. The taxonomy must be enforced in the CRM at the moment a contact is created, not applied retrospectively in a dbt model. Retrospective application works for historical data migration but cannot scale as an ongoing process.
- Reviewed annually, not continuously. Taxonomy changes should be batched and deliberate, not incremental. Every change to the taxonomy requires a historical remapping of existing data. Make changes once a year at most, with a clear migration plan for historical records.
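The second and third principles can be combined into a simple test: define each category as a rule over observable fields, then require that every lead matches exactly one rule. The categories and `utm_medium` values below are hypothetical examples, not the eight categories used in the project.

```python
def classify(lead, rules):
    """Return the single category whose rule matches; raise if zero or several match."""
    matches = [name for name, rule in rules.items() if rule(lead)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one category, got {matches}")
    return matches[0]


# Hypothetical rules, each tied to an observable field rather than human judgment.
RULES = {
    "Paid Search": lambda lead: lead.get("utm_medium") == "cpc",
    "Paid Social": lambda lead: lead.get("utm_medium") == "paid_social",
    "Email": lambda lead: lead.get("utm_medium") == "email",
}

print(classify({"utm_medium": "cpc"}, RULES))
```

Running a check like this over historical leads is a quick way to find definitions that overlap or leave gaps before the taxonomy is enforced in the CRM.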