Differentially Private Ad Conversion Measurement

In this work, we study ad conversion measurement, a central functionality in digital advertising, where an advertiser seeks to estimate advertiser website (or mobile app) conversions attributed to ad impressions that users have interacted with on various publisher websites (or mobile apps). Using differential privacy (DP), a notion that has gained in popularity due to its strong mathematical guarantees, we develop a formal framework for private ad conversion measurement. In particular, we define the notion of an operationally valid configuration of the attribution rule, DP adjacency relation, contribution bounding scope and enforcement point. We then provide, for the set of configurations that most commonly arises in practice, a complete characterization, which uncovers a delicate interplay between attribution and privacy.


INTRODUCTION
Over the last two decades, numerous attacks have illustrated the privacy risks associated with the release of (aggregated, de-identified) information, across various domains [24,30,36]. This has led to the introduction of differential privacy (DP) [17,18], a rigorous mathematical notion that quantifies the privacy loss of users against arbitrary adversaries observing the algorithm's output (and regardless of the data contributed by other users). While DP has been deployed in several fields including census data release [3], learning frequently typed words [8], and collection of telemetry data [16], there has not been any formal study of it for ad conversion measurement, a core functionality in the digital advertising space. We undertake such a study in this work.
We now provide a quick overview of the problem space. Let the term conversion refer to a valuable action (e.g., a purchase, add-to-cart, newsletter sign up, etc.) on the advertiser website and the term impression refer to an ad engagement (e.g., a click on an ad or a view of an ad) by the user on the publisher website (and not merely an ad fetch). In (ad) conversion measurement, an advertiser running campaigns on multiple publishers aims to measure the performance of various campaign slices. This includes estimating conversion counts and values (the latter can be numerical, e.g., in US dollars) for different settings of impression and conversion features (e.g., Example 1 below). This information can then be used by the advertiser to estimate the conversion rate and the return on ad spend, to optimize the purchase of ad impressions on guaranteed selling channels, and/or to power real-time bidding systems [12]. The ease of measuring conversions on advertiser websites (or apps) and attributing them to impressions that the same user had interacted with on publisher websites (or apps) is an important reason why online advertising is significantly more efficient than more traditional forms of advertising (e.g., print media, radio and television).
Privacy is a crucial consideration in conversion measurement. Ad measurement is indeed fundamentally a cross-site functionality, involving multiple publishers and advertisers. For the last two decades, non-private approaches to the problem have allowed third parties to track users across websites; this has been enabled on the Web by third-party cookie technology. However, in recent years, a consensus has emerged that these approaches are too invasive for users, and that new privacy-preserving methods for supporting various ad use cases are critically needed. This has led to the decision to deprecate third-party cookies by several browsers including Apple Safari [40], Mozilla Firefox [43], and Google Chrome [34]. Consequently, multiple efforts are currently underway by many browsers, platforms and industry groups to design privacy-preserving APIs that aim to support ad conversion measurement functionalities. These APIs and proposals include the Interoperable Private Attribution (IPA) proposed by Mozilla and Meta [37], Masked LARk from Microsoft [32], the Privacy Sandbox Attribution Reporting API (ARA) on Chrome [29] and Android [7], Private Click Measurement (PCM) on Safari [41], SKAdNetwork on iOS [1], as well as the recent proposal [42] from Apple. While most of these efforts seek to guarantee DP in order to ensure that sensitive user information cannot be recovered from the output of the API, they still lack an end-to-end formal DP framework. The goal of our work is to develop such a framework, so that proposals in this space are built on a solid mathematical foundation.

MOTIVATION, SETUP & CONTRIBUTIONS

Ad Conversion Measurement System
We next define the main components of an ad conversion measurement system, and discuss the central concept of attribution.

Basic Definitions.
An impression event consists of (i) a timestamp, (ii) a publisher id, (iii) an advertiser id, (iv) a user id, and (v) metadata associated with the impression (e.g., a type indicating if it is a click or view, a format indicating the size of the ad, etc.). A conversion event consists of (i) a timestamp, (ii) an advertiser id, (iii) a user id, and (iv) metadata associated with the conversion (e.g., a type indicating if it is a purchase or a sign up, a value if it is a purchase, etc.). The entities that are involved in ad conversion measurement are:
• Publisher: a website on which ad impressions take place.
• Advertiser: a website where conversion events take place.
• Ad tech: entity (often a third party) used by an advertiser to buy, manage, and measure their digital advertising.
The most common type of functionality in ad conversion measurement can be thought of as a two-step process: first, conversions are assigned to one or more impressions using a fixed attribution rule, and then queries are executed on the resulting attributed dataset. On a high level, the attribution rule determines how the credit from a conversion is to be divided over the different ad impressions corresponding to the same advertiser and the same user as the conversion. The attributed dataset consists of (impression, conversion) pairs, each corresponding to an attribution; optionally the pair is also associated with a (fractional) credit. (In the cases where all credits are equal, we omit them from the attributed dataset notation for simplicity.) The queries used in the second step include counting the number of conversions attributed to a subset of impressions (and the corresponding conversion rate), as well as the total return on ad spend for that subset.
Example 1. An example of the set of impression features can include the publisher website, the advertiser website, the ad engagement type (click or view), as well as its time and geographical location. An example set of conversion features can include the advertiser website, the conversion time, and the conversion type (e.g., add-to-cart, purchase, email sign-up). Thus, examples of ad conversion measurement queries include asking for:
• The number of conversions on advertiser1.com attributed to ad impressions on publisher1.com which occurred in the UK.
• The total value (in US dollars) of purchases on advertiser1.com taking place on August 31st and attributed to ad impressions on publisher1.com.
We next discuss the attribution step in more detail.
Figure 1: Example Attribution Path. In this case, a user interacts with four ad impressions from the same advertiser, but on four different publishers. The third of these interactions is a view, whereas the others are clicks. For simplicity, we assume that the four impressions and the subsequent conversion occur at equally spaced times.

Attribution Rule.
A key component of any conversion measurement system is the attribution rule. Specifically, consider the setting, illustrated in Figure 1, where a user is exposed to several impressions on multiple publishers before converting on the advertiser website. Which of these impressions should get the credit for driving the conversion? This attribution question has been the subject of numerous studies (see, e.g., [25] and the references therein).
In practice, the most popular attribution rules include last-touch attribution (LTA), first-touch attribution (FTA), uniform attribution (UNI), and exponential time decayed attribution (EXP). As the names suggest, first- and last-touch attribution assign all the credit to the first and last impressions on the attribution path, respectively, whereas uniform attribution divides the credit evenly among all the impressions, and exponential time decayed attribution assigns to each impression a credit that decays exponentially as a function of the time gap to the conversion. For an illustration of these different attribution rules for the example in Figure 1, we refer the reader to Table 1. We next explain how the values in Table 1 were computed. For LTA, all the credit goes to the "last touch", which is the user click that took place on publisher4.com. For FTA, all the credit goes to the "first touch", which is the user click on publisher1.com. For UNI, the credit is split equally among the 4 ad impressions (3 clicks and 1 view). For the EXP attribution rule, we assume that the half-life parameter is 1, and that the times of the successive impressions and conversion are all separated by 1 time unit, as shown in Figure 1; thus, the credit assigned to ad impression $i$ is $0.5^{5-i} / (0.5 + 0.5^2 + 0.5^3 + 0.5^4)$ for each $i \in \{1, 2, 3, 4\}$.
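The four rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formal definitions given later in the paper: each rule takes the list of time gaps between every impression and the conversion (ordered from least to most recent impression), and the function names and the `half_life` parameter are hypothetical. The EXP weight is assumed to be 0.5 raised to the gap divided by the half-life, which reproduces the expression above when the half-life is 1.

```python
from typing import Callable, Dict, List


def last_touch(gaps: List[float]) -> List[float]:
    """LTA: all credit goes to the most recent impression."""
    return [0.0] * (len(gaps) - 1) + [1.0]


def first_touch(gaps: List[float]) -> List[float]:
    """FTA: all credit goes to the earliest impression."""
    return [1.0] + [0.0] * (len(gaps) - 1)


def uniform(gaps: List[float]) -> List[float]:
    """UNI: credit is split evenly among all impressions."""
    return [1.0 / len(gaps)] * len(gaps)


def exp_decay(gaps: List[float], half_life: float = 1.0) -> List[float]:
    """EXP: weight 0.5 ** (gap / half_life), normalized to sum to 1 (assumed form)."""
    weights = [0.5 ** (gap / half_life) for gap in gaps]
    total = sum(weights)
    return [w / total for w in weights]


RULES: Dict[str, Callable[[List[float]], List[float]]] = {
    "LTA": last_touch,
    "FTA": first_touch,
    "UNI": uniform,
    "EXP": exp_decay,
}

# Setting of Figure 1: impressions at times 1..4 and the conversion at time 5,
# so the gaps to the conversion are 4, 3, 2, 1.
figure1_gaps = [4.0, 3.0, 2.0, 1.0]
credits = {name: rule(figure1_gaps) for name, rule in RULES.items()}

for name, values in credits.items():
    print(name, [round(v, 4) for v in values])
```

For the EXP rule, the normalizing constant is 0.9375, so the four credits are 1/15, 2/15, 4/15 and 8/15.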
In practice, different ad techs might offer advertisers support for different attribution rules to be used when measuring conversions. For instance, while FTA might be desirable for some advertised products, LTA might be natural in other settings. In this work, we consider all of the basic and most popular attribution rules that were mentioned above.

Privacy Model
We start by describing the threat model for privacy-preserving ad conversion measurement.

Threat Model.
To illustrate the threat model, we consider the setting where an untrusted third party (e.g., an ad tech as defined in Section 2.1.1) would like to run queries on a conversion measurement dataset pertaining to multiple publishers and advertisers. We assume that a central trusted curator is in charge of executing the query on the dataset and sending the output to the ad tech; the trusted curator could, e.g., be a web browser or mobile platform, a trusted third party, or a secure multi-party computation protocol. The goal is to protect sensitive information (e.g., related to a single impression, conversion, or user) from being leaked to the ad tech through the output of the protocol. Examples of sensitive information in this setting include app or web browsing history (e.g., did the user recently visit a sensitive website), or the shopping history of the user. To protect the data of individual users from leaking through an output released to an untrusted third party, DP has become the gold standard. It has been suggested as a primary privacy guardrail in multiple industry proposals for privacy-preserving ad measurement systems. Therefore, in this work, we study DP as the desired privacy guarantee on the output of the ad measurement system.
Remark 1 (Robustness to Side Information). We note that in some cases, depending on browser constraints, user sign-ins, and business arrangements, the third party could have a prior partial view of the dataset. E.g., it could know the set of all impressions (across all users but without knowing the associated user id for each impression), or the set of all conversions, or both. Nevertheless, the protection offered by DP would still be meaningful in this setting, since DP is robust to the presence of side information.

Differential Privacy Ingredients.
Adjacency Relation. On a high level, DP dictates that the distributions of the output of a (randomized) algorithm on two adjacent conversion measurement datasets are statistically indistinguishable (see Section 3.1 for a formal definition). It is thus necessary to specify the adjacency relation (a.k.a. privacy unit) to which the DP definition applies. Due to the highly fragmented nature of conversion measurement datasets (with the conversion taking place on one of the advertisers and the impressions taking place on different publishers), it turns out there are multiple natural alternatives for defining the adjacency relation, with subtle implications on the privacy-utility trade-offs. The different possibilities are listed in Table 2. For each adjacency relation, the allowed difference between two adjacent datasets consists of all user engagements that only belong to a single tuple in the adjacency relation. For instance, for the user × advertiser relation, two adjacent datasets can differ on the set of all impressions of Alice associated with advertiser1.com (and shown on any publisher), along with all conversions of Alice on advertiser1.com; this is because all of these engagements are only associated with a single (user, advertiser) tuple, namely, (Alice, advertiser1.com). On the other hand, for the user × publisher × advertiser relation, the difference consists of all impressions associated with a fixed publisher and a fixed advertiser (e.g., all impressions of Alice shown on publisher1.com, and that are associated with advertiser1.com); note that the conversions associated with the advertiser (e.g., advertiser1.com) are not included here, as they can be associated with multiple other publishers (e.g., publisher2.com), and are thus also related to other (user, publisher, advertiser) tuples in the adjacency relation, e.g., (Alice, publisher2.com, advertiser1.com) is such a tuple.
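One way to make the relations concrete is to map each event to the single tuple it belongs to, or to nothing when the event is shared across several tuples and is therefore outside the allowed difference. The sketch below is only an illustration of the prose description: the `Event` class, the relation names and the `privacy_unit` helper are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; conversions carry no publisher."""
    event_id: str
    kind: str  # "impression" or "conversion"
    user: str
    advertiser: str
    publisher: Optional[str] = None


def privacy_unit(event: Event, relation: str) -> Optional[Tuple[str, ...]]:
    """Return the tuple an event belongs to under a relation, or None if not covered."""
    is_impression = event.kind == "impression"
    if relation == "impression":
        return (event.event_id,) if is_impression else None
    if relation == "conversion":
        return None if is_impression else (event.event_id,)
    if relation == "user_x_publisher":
        # Only impressions: a conversion is not tied to one publisher.
        return (event.user, event.publisher) if is_impression else None
    if relation == "user_x_advertiser":
        # Impressions and conversions both carry a (user, advertiser) pair.
        return (event.user, event.advertiser)
    if relation == "user_x_publisher_x_advertiser":
        if is_impression:
            return (event.user, event.publisher, event.advertiser)
        return None
    if relation == "user":
        return (event.user,)
    raise ValueError(f"unknown relation: {relation}")


impression = Event("i1", "impression", "Alice", "advertiser1.com", "publisher1.com")
conversion = Event("c1", "conversion", "Alice", "advertiser1.com")

print(privacy_unit(impression, "user_x_advertiser"))
print(privacy_unit(conversion, "user_x_advertiser"))
print(privacy_unit(conversion, "user_x_publisher_x_advertiser"))
```

In this sketch the impression and the conversion of Alice fall in the same (Alice, advertiser1.com) unit, while the conversion has no unit under the user × publisher × advertiser relation, matching the discussion above.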
Remark 2 (Intuitive Interpretation of the Different Adjacency Relations). The different adjacency relations described in Table 2 offer a spectrum of possible privacy guarantees. On one end, the conversion (respectively, impression) adjacency relation seeks to protect a single conversion (respectively, impression) from being leaked by the output of the algorithm. On the other end, the user relation protects any information related to all the impressions and conversions of any user from being leaked. The other notions can be seen as interpolating between these two ends. E.g., the user × advertiser relation protects all the user's impressions and conversions pertaining to a single advertiser, but it does not necessarily protect information that can be deduced by observing the user's impressions and conversions across multiple advertisers. In particular, under this user × advertiser relation, observing the output of the privacy-preserving ad conversion measurement system would not substantially increase an attacker's ability to distinguish whether Alice had any attributed conversions associated with advertiser1.com. The choice of the adjacency relation depends on the privacy protection that the system designer seeks to guarantee, and the associated utility trade-offs that they are willing to accept. E.g., the user × advertiser adjacency can be natural in some settings as it would prevent the ad-tech associated with a publisher from learning that a visitor to the publisher later visited a (sensitive) advertiser site. On the other hand, a user × publisher adjacency relation can prevent an ad-tech associated with the advertiser from learning about the actions of a converting user on publisher sites.
Remark 3 (Time Dimension in Adjacency Relations). In practice, it is common to include the time dimension in some adjacency relations, in particular, the user, user × publisher and user × advertiser ones. Moreover, DP composition can be used to handle the case where the time dimension in the adjacency relation can cover multiple releases of the output of the system on different subsets of the dataset.
Contribution Bounding Scope. To ensure privacy for a given adjacency relation, a core component of DP systems is the notion of a contribution bound. Note that, in the case of ad conversion measurement, the number of interactions can naturally be unbounded. For example:
• An impression could lead to multiple conversions.
• Different impressions can be on the attribution path of the same conversion.
• The same user can be shown many impressions on the same or on different publishers, and could convert multiple times on the advertiser.
See Figure 2 for an example. To be able to guarantee a bounded privacy leakage, the contributions of the interactions to the computed function should be restricted by a certain contribution bound.
The contribution bounding scope is the set of user interactions that share the same contribution bound. The scope can be any one of the options listed in Table 2 for the adjacency relation. We will consider in this work the natural setting where the contribution bounding scope is the same as the adjacency relation. It turns out that each adjacency relation / contribution bounding scope can have a different interplay with the attribution rule, and with the resulting DP guarantee. Moreover, as we will see shortly, some will be easier to operationalize than others.
Contribution Bound Enforcement. One particular choice that turns out to be important is whether the contribution bound should be enforced before (pre-attribution contribution capping) or after (post-attribution contribution capping) the attribution rule is applied to the dataset. In the former case, only impressions belonging to a scope with a non-zero remaining contribution bound can be considered on the attribution path of a conversion. In the latter case, a conversion could get attributed to an impression with a remaining contribution bound of 0, only to get discarded at the contribution bound enforcement step (without the attribution falling back to any other impression on the path). In other words, pre-attribution contribution bound enforcement limits the number of impressions or conversions that belong to the contribution bounding scope and that can enter the attribution rule. By contrast, post-attribution enforcement limits the number of post-attribution (impression, conversion) pairs that belong to the contribution bounding scope. See Table 3 for the attributed dataset resulting from applying post-attribution contribution bounding with a bound of 2 to the dataset given in Figure 2 and with the LTA rule.
Table 3: Post-Attribution Contribution Bounding. The input dataset is shown in Figure 2. The LTA rule was applied.
Table 2: DP Adjacency Relations.
Adjacency Relation | Difference between Adjacent Datasets
Impression | A single impression
Conversion | A single conversion
User × Publisher | All impressions shown to a user on a given publisher
User × Advertiser | All impressions shown to a user and for a given advertiser, and all conversions by the same user on the same advertiser
User × Publisher × Advertiser | All impressions shown to a user on a given publisher and corresponding to a given advertiser
User | All impressions shown on all publishers, and all conversions occurring on all advertisers, for a given user

We now explain how the attributed datasets were generated in Table 3. First, note that in the absence of any contribution bounding, each conversion should simply be attributed to the last impression occurring prior to it. So in the dataset given in Figure 2, conversions $c_1$, $c_2$ and $c_3$ should be attributed to impression $i_2$, conversion $c_4$ should be attributed to impression $i_4$, and conversion $c_5$ should be attributed to impression $i_5$. This explains the first row of Table 3. We note that post-attribution contribution bounding with a per-conversion contribution bounding scope is a no-op, so it would result in the same attributed dataset as the first (i.e., "None") row in Table 3. In the second row of Table 3, a post-attribution contribution bound of 2 is applied for each impression. This leads to dropping the pair $(i_2, c_3)$ from the attributed dataset because conversions $c_1$ and $c_2$ are already attributed to impression $i_2$. In the third row of Table 3, a contribution bound of 2 is applied post-attribution to each (user, advertiser) pair. Since Figure 2 specifies a dataset for a single user and for two advertisers (namely, advertiser1.com and advertiser2.com), this entails capping the number of attributed conversions for each of the two advertisers to 2. Compared to the second row, this has the additional effect of dropping the pair $(i_4, c_4)$ from the attributed dataset, since conversions $c_1$ and $c_2$ occur on the same advertiser and already appear in the attributed dataset. Finally, in the last row of Table 3, a contribution bound of 2 is applied post-attribution for each user. For the user whose dataset is shown in Figure 2, this implies that the total number of conversions (across all advertisers) appearing in the attributed dataset should be bounded to at most 2. Compared to the third row of Table 3, this has the further effect of dropping the pair $(i_5, c_5)$ from the attributed dataset, because the same user already has two conversions, $c_1$ and $c_2$, appearing in the attributed dataset (despite the fact that these conversions occur on a different advertiser).
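The post-attribution walkthrough can be replayed with a short Python sketch. Figure 2 is not reproduced in this text, so the `EVENTS` list below is a hypothetical stand-in chosen to be consistent with the description (one user, two advertisers, and timestamps that are assumptions); the function names are also illustrative, and the capping loop is a simplified reading of the prose rather than the algorithm of Section 5.

```python
from collections import Counter

# Hypothetical stand-in for Figure 2: (event_id, kind, time, advertiser), single user "u".
EVENTS = [
    ("i1", "imp", 1, "advertiser1.com"),
    ("i2", "imp", 2, "advertiser1.com"),
    ("c1", "conv", 3, "advertiser1.com"),
    ("c2", "conv", 4, "advertiser1.com"),
    ("c3", "conv", 5, "advertiser1.com"),
    ("i3", "imp", 6, "advertiser1.com"),
    ("i4", "imp", 7, "advertiser1.com"),
    ("c4", "conv", 8, "advertiser1.com"),
    ("i5", "imp", 9, "advertiser2.com"),
    ("c5", "conv", 10, "advertiser2.com"),
]


def lta_pairs(events):
    """LTA: attribute each conversion to the last earlier impression of the same advertiser."""
    pairs = []
    for conv_id, kind, conv_time, adv in sorted(events, key=lambda e: e[2]):
        if kind != "conv":
            continue
        prior = [e for e in events if e[1] == "imp" and e[3] == adv and e[2] < conv_time]
        if prior:
            last = max(prior, key=lambda e: e[2])
            pairs.append((last[0], conv_id, adv))
    return pairs


def scope_key(pair, scope):
    """Key of the contribution bounding scope that an attributed pair counts against."""
    imp_id, _conv_id, adv = pair
    if scope == "impression":
        return imp_id
    if scope == "user_x_advertiser":
        return ("u", adv)
    if scope == "user":
        return "u"
    raise ValueError(scope)


def cap_post_attribution(pairs, scope, bound):
    """Keep at most `bound` attributed pairs per scope; extra pairs are dropped, no fallback."""
    if scope is None:
        return list(pairs)
    used = Counter()
    kept = []
    for pair in pairs:
        key = scope_key(pair, scope)
        if used[key] < bound:
            used[key] += 1
            kept.append(pair)
    return kept


attributed = lta_pairs(EVENTS)
for scope in (None, "impression", "user_x_advertiser", "user"):
    kept = cap_post_attribution(attributed, scope, bound=2)
    print(scope, [(i, c) for i, c, _ in kept])
```

On this stand-in dataset the four printed rows agree with the description above: five pairs without capping, then the successive loss of $(i_2, c_3)$, $(i_4, c_4)$ and $(i_5, c_5)$.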
For the results for pre-attribution contribution bounding, see Table 4.
We next explain how the attributed datasets were generated in Table 4. In the first row, a contribution bound of 2 is applied pre-attribution for each (user, advertiser) pair. For the single user corresponding to Figure 2, the first advertiser will have its contribution bound exhausted after impressions $i_1$ and $i_2$ are processed; hence, no conversion occurring on the first advertiser will appear in the attributed dataset. By contrast, for the second advertiser, impression $i_5$ and conversion $c_5$ will be processed, and the resulting pair $(i_5, c_5)$ will be added to the attributed dataset (and at that point the contribution bound for the second advertiser would be exhausted). For the last row of Table 4, the contribution bound of 2 is enforced pre-attribution at the user level. For the user corresponding to Figure 2, the contribution bound would be exhausted after impressions $i_1$ and $i_2$ are processed, and thus the attributed dataset would be empty. We point out that pre-attribution contribution bounding with a per-impression contribution bounding scope is a no-op; hence it results in the same attributed dataset as the first (i.e., "None") row in Table 3.
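The pre-attribution case can be sketched in the same spirit. The code below assumes a simplified model in which events are processed in time order and every impression or conversion inside a scope consumes one unit of that scope's bound before the LTA rule runs; this is one reading of the walkthrough above, not the algorithm of Section 5. The `EVENTS` list is again a hypothetical stand-in for Figure 2, and all names are illustrative.

```python
from collections import Counter

# Hypothetical stand-in for Figure 2: (event_id, kind, time, advertiser), single user "u".
EVENTS = [
    ("i1", "imp", 1, "advertiser1.com"),
    ("i2", "imp", 2, "advertiser1.com"),
    ("c1", "conv", 3, "advertiser1.com"),
    ("c2", "conv", 4, "advertiser1.com"),
    ("c3", "conv", 5, "advertiser1.com"),
    ("i3", "imp", 6, "advertiser1.com"),
    ("i4", "imp", 7, "advertiser1.com"),
    ("c4", "conv", 8, "advertiser1.com"),
    ("i5", "imp", 9, "advertiser2.com"),
    ("c5", "conv", 10, "advertiser2.com"),
]


def scope_key(event, scope):
    """Scope an event counts against, or None if the event lies outside every scope."""
    event_id, kind, _time, adv = event
    if scope == "impression":
        return event_id if kind == "imp" else None
    if scope == "user_x_advertiser":
        return ("u", adv)
    if scope == "user":
        return "u"
    raise ValueError(scope)


def admit_pre_attribution(events, scope, bound):
    """Admit events in time order until the bound of their scope is exhausted."""
    used = Counter()
    admitted = []
    for event in sorted(events, key=lambda e: e[2]):
        key = scope_key(event, scope)
        if key is None:
            admitted.append(event)  # not covered by the scope, passes through
        elif used[key] < bound:
            used[key] += 1
            admitted.append(event)
    return admitted


def lta_pairs(events):
    """LTA restricted to the admitted events."""
    pairs = []
    for conv_id, kind, conv_time, adv in sorted(events, key=lambda e: e[2]):
        if kind != "conv":
            continue
        prior = [e for e in events if e[1] == "imp" and e[3] == adv and e[2] < conv_time]
        if prior:
            pairs.append((max(prior, key=lambda e: e[2])[0], conv_id))
    return pairs


for scope in ("user_x_advertiser", "user", "impression"):
    print(scope, lta_pairs(admit_pre_attribution(EVENTS, scope, bound=2)))
```

Under these assumptions the user × advertiser scope leaves only $(i_5, c_5)$, the user scope leaves nothing, and the per-impression scope changes nothing, in line with the text.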
As is the case in Tables 3 and 4, pre-attribution contribution bounding in general results in a larger signal loss in the attributed dataset compared to post-attribution contribution bounding. As we will discuss shortly, the design choice of whether to enforce the contribution bound pre- or post-attribution significantly affects the end-to-end privacy of the ad conversion measurement system.

Valid Configurations
The DP aspects of a private conversion measurement system are mostly captured by the choice of (i) the attribution rule, (ii) the adjacency relation, (iii) the contribution bounding scope, and (iv) the contribution bound enforcement point. We refer to a setting of each of these choices as a configuration. It is natural to consider a configuration to be operationally valid if for every positive integer $k$, enforcing a contribution bound of $k$ at the required point ensures that any two adjacent datasets always result in two post-attribution, post-enforcement datasets that differ on at most $C_0 \cdot k$ many (impression, conversion) pairs, where $C_0$ is an absolute constant independent of the numbers of publishers and advertisers. If this property does not hold, then a change of, e.g., a single impression's contributions, within the contribution bound of $k$, could result in a change in the attributed dataset of magnitude growing with the (unbounded and potentially very large) number of publishers and advertisers showing ads to a given user. It turns out that this condition is not only sufficient to ensure the DP of the ad measurement system, but also in a sense necessary, unless the noise is increased with the total number of publishers or advertisers, which is practically unwieldy as this number is not fixed and can vary from user to user (note that the subset of publishers and advertisers showing ads to a given user as they browse the Web cannot be fixed ahead of time). In other words, a configuration is deemed invalid if the sensitivity increases as the number of advertisers or publishers increases. For more details, we refer the reader to Lemma 1 and the paragraph following it.
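To see why a bound that is independent of the number of publishers and advertisers matters, consider a count query over the attributed dataset. If adjacent datasets can change the attributed dataset by at most a fixed multiple of the contribution bound, a count changes by at most that amount, and the standard Laplace mechanism can be calibrated to it. The sketch below is an illustration under that assumption, not the statement of Lemma 1; the names `noise_scale`, `laplace_noise` and `noisy_count`, and the parameters `bound`, `c0` and `epsilon`, are hypothetical.

```python
import random


def noise_scale(bound: int, c0: float, epsilon: float) -> float:
    """Laplace scale for a count whose sensitivity is assumed to be at most c0 * bound."""
    return (c0 * bound) / epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace sample, as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, bound: int, c0: float, epsilon: float,
                rng: random.Random) -> float:
    """Release a count of attributed pairs with noise calibrated to c0 * bound / epsilon."""
    return true_count + laplace_noise(noise_scale(bound, c0, epsilon), rng)


rng = random.Random(0)
print(noise_scale(bound=2, c0=1.0, epsilon=0.5))
print(noisy_count(true_count=1000, bound=2, c0=1.0, epsilon=0.5, rng=rng))
```

The noise scale depends only on the contribution bound, the constant and the privacy parameter. In an invalid configuration no such constant exists, so the scale would have to grow with the number of publishers or advertisers seen by a user.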

Our Contributions
In addition to formally defining the framework for ad conversion measurement and defining the notion of an operationally valid configuration, we provide a complete characterization of the validity of the configurations that most commonly arise in practice.
Classification. We provide a complete classification of all the configurations of attribution rule, adjacency relation and contribution bound enforcement point that are operationally valid; see Table 5. We discuss the obtained classification next.
We first note that pre-attribution contribution bound enforcement results in valid configurations for all considered attribution rules and adjacency relations. A possible challenge to such an enforcement point is that an impression or publisher can incur a deduction from their contribution bound whenever they are part of the input to the attribution rule, even if they are not selected for attribution. This can result in situations where a publisher can see their contribution bound totally exhausted due to conversions that got attributed to other publishers. This is in fact the main motivation for considering post-attribution contribution bound enforcement, which we discuss next.
It turns out that the adjacency relations that are valid for all attribution rules in the case of post-attribution enforcement are the conversion, user × advertiser, and user options. A limitation of the conversion adjacency relation is that the privacy would degrade as a user converts more than once. Given that conversion events are in practice not restricted to purchases (e.g., page views, email signups, and add-to-carts can also qualify as conversions), the privacy leakage could increase noticeably. On the other hand, the user adjacency relation could be operationally challenging to enforce as all the publishers and advertisers would have to share the same contribution bound, which could end up being dominated by certain publishers and/or advertisers. For the other relations of impression, user × publisher, and user × publisher × advertiser, we demonstrate in Table 5 that the situation is much more delicate, as the validity of the different attribution rules turns out to depend on the adjacency relation. For instance, we show that, surprisingly, while the impression and user × publisher × advertiser adjacency relations and contribution bounding scopes admit valid configurations for post-attribution contribution bound enforcement, the user × publisher adjacency relation does not.
Our results suggest that if all the considered attribution rules are to be supported, then either pre-attribution enforcement, or a user, user × advertiser, or conversion adjacency relation should be used.
If, however, post-attribution enforcement is desired and a middle ground is sought between the conversion and user contribution bounding scopes, then only a subset of the attribution rules can be supported (as in Table 5).
We emphasize that the dimensions considered in our classification are fundamental to any conversion measurement system. Specifically, any such system has to select an attribution rule. Moreover, any DP implementation has to choose a privacy unit. It also has to bound contributions, and keep track of a remaining contribution bound.
For a high-level overview of the proofs, we refer the reader to Section 6.4. We next give an example illustrating the idea captured by the notion of invalid configurations.

Example of an Invalid Configuration.
Consider the configuration where LTA is selected as the attribution rule, the adjacency relation (and the contribution bounding scope) is set to user × publisher, and contribution bounding is performed post-attribution. Moreover, consider the typical setting where an advertiser runs a campaign displaying ads on multiple publishers. The advertiser's goal is to estimate the number of conversions attributed to ad impressions shown on each publisher. Since the adjacency relation is user × publisher, and since summation has sensitivity $O(1)$, one would hope for an $\varepsilon$-DP algorithm with error $O(1/\varepsilon)$. On a high level, our results suggest that, surprisingly, this is not possible to achieve, since adding last-touch interactions on a publisher can remove attributed conversions for all other publishers. Hence, the sensitivity in fact grows with the (practically unbounded) number of publishers, which would result in poor measurements even if a publisher has thousands of attributed conversions.
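The effect can be reproduced on a small synthetic instance. The construction below is a hypothetical illustration, not the one used in the paper's proofs: a single user converts once after an impression on each of m publishers, and the adjacent dataset adds, on one extra publisher, an impression just before every conversion. The two datasets differ only in that user's impressions on the extra publisher, so they are adjacent under user × publisher. All names and timestamps are assumptions.

```python
from collections import Counter


def lta_post_capped(impressions, conversions, bound):
    """LTA, then a post-attribution cap of `bound` pairs per publisher (single user).

    impressions: list of (time, publisher); conversions: list of times.
    Returns the set of kept (publisher, impression_time, conversion_time) pairs.
    """
    used = Counter()
    kept = set()
    for conv_time in sorted(conversions):
        prior = [imp for imp in impressions if imp[0] < conv_time]
        if not prior:
            continue
        imp_time, publisher = max(prior)  # latest earlier impression
        if used[publisher] < bound:
            used[publisher] += 1
            kept.add((publisher, imp_time, conv_time))
        # otherwise the conversion is dropped, with no fallback
    return kept


def attributed_difference(m, bound):
    """Size of the symmetric difference between the two attributed datasets."""
    base = [(3 * j, f"publisher{j}.com") for j in range(1, m + 1)]
    conversions = [3 * j + 2 for j in range(1, m + 1)]
    # Adjacent dataset: extra impressions on one publisher, right before each conversion.
    extra = [(3 * j + 1, "publisher0.com") for j in range(1, m + 1)]
    d = lta_post_capped(base, conversions, bound)
    d_prime = lta_post_capped(base + extra, conversions, bound)
    return len(d ^ d_prime)


for m in (10, 100, 1000):
    print(m, attributed_difference(m, bound=2))
```

With a bound of 2, the difference between the two attributed datasets is m + 2 pairs in this construction: every other publisher loses its attributed conversion, so the change grows with the number of publishers rather than staying proportional to the bound.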

Additional Related Work
There have been several previous works on (non-private) conversion measurement, e.g., [5,27,28,33]. We point out that it is common in the literature on DP to consider notions between protecting a single contribution and protecting all the user's contributions; see, e.g., [31] and the references therein. We also note that some recent ad conversion measurement systems rely on ad-hoc privacy notions; see, e.g., [9]. To the best of our knowledge, our work is the first study of the end-to-end (differential) privacy of ad conversion measurement systems.
The very recent works of [13] and [6] provide an empirical evaluation of a differentially private ad conversion measurement system similar to the one studied in this work. Their focus is the Privacy Sandbox Attribution Reporting API (ARA) on Chrome and Android. They consider last-touch attribution and an impression privacy unit. Both of these works consider the linear queries functionality (e.g., conversion counts and values). The former focuses on hierarchical queries whereas the latter studies the non-hierarchical setting. These works provide a concrete instantiation of a DP ad conversion measurement system similar to the one studied in this work, and they empirically evaluate the error, on real ad conversion datasets, for different values of the differential privacy parameter $\varepsilon$. We refer the reader to these two papers for more details. Note that, in our terminology, ARA assumes that the attributed dataset is the input, and that post-attribution capping and noising is performed. Thus, our work complements these previous works: the valid configurations (for the impression adjacency relation) in our work imply that ARA satisfies an end-to-end DP guarantee for the corresponding attribution rules. Meanwhile, the invalid configurations imply that the end-to-end DP guarantee may not hold for those attribution rules.
Organization of Rest of the Paper. We start Section 3 with some notation that will be used in the rest of the paper. We recall the formal definition of DP in Section 3.1. In Section 3.2, we formally define the various attribution rules that will be studied in this paper. In Section 4, we present the notion of an operationally valid configuration, along with its connection to the design of a DP ad conversion measurement system. The pre- and post-attribution contribution bound enforcement algorithms are described in Section 5. We present our main validity and invalidity results in Section 6. Some of the proofs are given in Section 6.5 (with the rest deferred to the Appendix). Our work opens up several interesting areas of exploration; we describe some of these in Section 7, where we also discuss our results in the context of related practical applications.
Table 5 (caption, partially recovered): [...] and a (±) cell means that some attribution rules in that family result in a valid configuration and others in an invalid configuration. Specifically, both the IPA class and the POS class contain FTA and UNI; in the impression and user × publisher × advertiser adjacency relations, the former is valid but the latter is invalid. (*) For the conversion adjacency relation, no contribution bound enforcement is applied as the conversion is already only used once in the attribution rule.

PRELIMINARIES
Notation. For any positive integer $n$, we denote by $[n]$ the set $\{1, \dots, n\}$. For any finite set $S$, we denote by $S^*$ the set of all finite-length non-empty sequences of elements of $S$. For any positive integer $d$, the $d$-dimensional probability simplex, denoted by $\Delta_d$, is defined as the set of all vectors in $[0, 1]^{d+1}$ whose coordinates add up to 1. The $\ell_1$-norm of a vector $v \in \mathbb{R}^d$ is defined as $\|v\|_1 = \sum_{i=1}^{d} |v_i|$. The Laplace distribution with zero mean and scale parameter $b > 0$ is the continuous probability distribution whose probability density function is given by $\mathrm{Lap}(x; b) = \frac{1}{2b} e^{-|x|/b}$ for any real number $x$.

Differential Privacy
We denote two adjacent datasets D and D ′ by D ∼ D ′ .The adjacency notions considered in this work are listed in Table 2, and will be further discussed in Section 4.1, but DP can be defined generally for any such relation.
Definition 1 (Differential Privacy [18]). Let $\varepsilon \geq 0$. A randomized mechanism $M$ is $\varepsilon$-differentially private (denoted by $\varepsilon$-DP) if for each pair $D \sim D'$ of adjacent datasets and each subset $S$ of outputs of $M$, it holds that

$$\Pr[M(D) \in S] \leq e^{\varepsilon} \cdot \Pr[M(D') \in S],$$

where the probabilities are over the randomness in $M$.
Intuitively, the DP definition guarantees that the output distributions on two adjacent datasets are approximately statistically indistinguishable. The degree of indistinguishability is dictated by the parameter $\varepsilon$: the smaller $\varepsilon$ is, the more private the algorithm is. In our setting, the dataset $D$ consists of the impressions and conversions (pre-attribution), across all users, advertisers and publishers. The output $M(D)$ is the output of the privacy-preserving ad conversion measurement system.
DP satisfies several useful mathematical properties that have made it an appealing measure of privacy. These include robustness to post-processing, composition, and group privacy. For a comprehensive overview of the area, we refer the reader to the monographs [19,39].

Attribution Rule
The attribution rule function (see, e.g., [15] for background) takes as input a sequence of $n$ impressions and a conversion, all corresponding to the same user and advertiser, and returns a fraction in $[0, 1]$ for each of the $n$ impressions. We denote this function by $a : I^* \times C \to [0, 1]^*$, where $I$ is the set of all possible impressions and $C$ is the set of all possible conversions. We assume that for any input $((i_1, \dots, i_n), c)$ to the attribution function $a$, the impressions $i_1, \dots, i_n$ have been sorted from least to most recent according to their timestamps and $c$ occurs later than $i_n$. Moreover, it is assumed that $a((i_1, \dots, i_n), c) \in \Delta_{n-1}$, i.e., the credit assigned across the $n$ impressions sums to 1.

Single-Touch.
In single-touch attribution, only a single coordinate in the output $a((i_1, \dots, i_n), c)$ is equal to 1 and all the other $n - 1$ coordinates are equal to 0. We next describe some notable special cases of single-touch attribution.

Multi-Touch.
While the single-touch attribution rules assign all the credit to a single impression, multi-touch attribution allows spreading the credit over more than one impression. The simplest multi-touch attribution rule is uniform (aka linear).
U-Shaped (U-S). If there are at least three impressions, 40% of the credit goes to the first touch, 40% of the credit goes to the last touch, and the remaining 20% is divided uniformly over all the intermediate impressions (i.e., those that are neither first nor last). If there are two impressions, we assume that the credit is split equally between them.

Positional (POS).
In positional (aka position-based) attribution, the credit assigned to each impression is based on the total number of impressions and the order in which this impression occurs, i.e., the credit does not depend on the user, publisher, or advertiser IDs, or on the metadata. More precisely, a positional attribution is parameterized by a class $F = \{f_n\}_{n \in \mathbb{N}}$ of vectors, where $f_n \in \Delta_{n-1}$. The attribution function is defined as $a((i_1, \dots, i_n), c) = f_n$.
Similar to POS, IPA is a class of attribution rules; it contains FTA, LTA, UNI, EXP, and U-S.
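As an illustration of the positional view, the rules named above can be sketched as functions that map the number of impressions $n$ to a weight vector in $\Delta_{n-1}$. This is a minimal sketch, not the paper's formal definitions: the function names are hypothetical, FTA and LTA are read as "all credit to the first (resp. last) impression", the single-impression case of U-S is an assumption (the text only covers two or more impressions), and EXP is omitted because its definition does not appear in this section.

```python
from typing import Callable, List


def first_touch(n: int) -> List[float]:
    """FTA: all credit goes to the least recent impression."""
    return [1.0] + [0.0] * (n - 1)


def last_touch(n: int) -> List[float]:
    """LTA: all credit goes to the most recent impression."""
    return [0.0] * (n - 1) + [1.0]


def uniform(n: int) -> List[float]:
    """UNI: credit is split equally over all impressions."""
    return [1.0 / n] * n


def u_shaped(n: int) -> List[float]:
    """U-S: 40% first, 40% last, remaining 20% spread over the middle."""
    if n == 1:
        return [1.0]  # assumption: not specified in the text
    if n == 2:
        return [0.5, 0.5]
    middle = 0.2 / (n - 2)
    return [0.4] + [middle] * (n - 2) + [0.4]


def attribute(rule: Callable[[int], List[float]], impressions: list) -> List[float]:
    """Apply a positional rule to impressions sorted from least to most recent."""
    n = len(impressions)
    if n == 0:
        raise ValueError("need at least one impression")
    weights = rule(n)
    # The output must lie in the probability simplex (weights sum to 1).
    if len(weights) != n or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("rule must return n weights summing to 1")
    return weights
```

Because each rule depends only on $n$ and the position, all four are members of the POS class in the sense described above.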

DIFFERENTIALLY PRIVATE CONVERSION MEASUREMENT SYSTEMS
To describe our framework for DP conversion measurement, we first discuss in Section 4.1 the adjacency relations and contribution bounding scopes that we consider, and then describe attribution systems and how to privatize their outputs in Section 4.2.

Adjacency Relations and Contribution Bounding Scopes
As we saw in Definition 1, any application of DP should specify a notion of when two datasets are considered adjacent. In the conversion measurement setting, there are several options; the most natural of them are summarized in Table 2.
For any relation in the first column of Table 2, we can then define two datasets to be adjacent if one can be obtained from the other by adding or removing impressions and/or conversions as listed in the second column of the table.
Any application of DP should, at some level, limit the individual contributions; otherwise, the finite amount of noise that is injected would not be sufficient to ensure DP when the individual contributions become too large. In the conversion measurement use case, there are multiple contribution bounding scopes in which the contributions could be limited. These include the same choices listed in Table 2 for the adjacency relation. We consider henceforth the most natural setting where the contribution bounding scope matches the adjacency relation. E.g., for the user × publisher adjacency relation, all the contributions of a given user on a given publisher share the same contribution bound.

Attribution Systems
An attribution system is an algorithm that takes as input impression and conversion sequences and outputs the attributions, represented by a set of weighted pairs of impressions and conversions (defined formally as an attributed dataset below).
Definition 2 (Attributed Dataset). An attributed dataset $D_{\mathrm{attr}}$ is a set of triplets $(i, c, w) \in I \times C \times \mathbb{R}_{\geq 0}$, where for each $(i, c)$ there is a unique $w$. We may represent an attributed dataset as a function $w_{D_{\mathrm{attr}}} : I \times C \to \mathbb{R}_{\geq 0}$, where $w_{D_{\mathrm{attr}}}(i, c)$ represents the total weight attributed to impression $i$ by conversion $c$. The $\ell_1$-distance between two attributed datasets $D_{\mathrm{attr}}, D'_{\mathrm{attr}}$ is given by

$$\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 = \sum_{(i, c) \in I \times C} \left| w_{D_{\mathrm{attr}}}(i, c) - w_{D'_{\mathrm{attr}}}(i, c) \right|.$$

Given an attribution system, we can build a conversion measurement system by applying a function $f$ that maps the attributed dataset to a vector in $\mathbb{R}^d$; the vector measures the statistics that an ad tech would like to estimate. For example, if the ad tech wants to know the total attributions for each slice of (campaign × time-of-day), then each of the $d$ dimensions can represent a valid (campaign ID, time-of-day) pair, and the value that $f$ assigns to that dimension would be the total attribution for that campaign ID and time-of-day.
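To make Definition 2 and the "sum by slices" example concrete, the following sketch represents an attributed dataset as a dictionary keyed by (impression ID, conversion ID). The type alias, the function names, and the convention that a pair absent from the dictionary has weight 0 are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Tuple

# (impression_id, conversion_id) -> attributed weight; absent pairs count as 0.
AttributedDataset = Dict[Tuple[str, str], float]


def l1_distance(d1: AttributedDataset, d2: AttributedDataset) -> float:
    """l1-distance between two attributed datasets (sum of absolute weight gaps)."""
    keys = set(d1) | set(d2)
    return sum(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) for k in keys)


def sum_by_slice(
    d: AttributedDataset,
    slice_of: Callable[[str, str], Hashable],
) -> Dict[Hashable, float]:
    """Query f: total attributed weight per slice (e.g., per campaign)."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for (impression_id, conversion_id), weight in d.items():
        totals[slice_of(impression_id, conversion_id)] += weight
    return dict(totals)
```

For instance, `slice_of` could look up a (campaign ID, time-of-day) pair from the impression's metadata; each attributed pair then contributes its weight to exactly one coordinate of the output.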
Of course, as described above, the system is not (differentially) private: the ad tech can allocate a dimension for a particular user and then count exactly, e.g., the number of impressions that user sees. Since this is not desirable, we employ two methods to ensure privacy. First, we apply contribution bound enforcement within the attribution system, which will be discussed below. Second, we add (appropriately scaled) Laplace noise to each of the $d$ coordinates of the value of $f$; these noisy estimates are then sent to the ad tech. See Figure 3 for an illustration of such a conversion measurement system. Note that we consider the central DP setting, where a (trusted) curator runs the attribution rule, computes the function $f$, and adds the noise; the output of this curator is required to be DP.
It turns out that there are two important properties needed to ensure DP of the output. The first is that the sensitivity of $f$ is small, i.e., that a small change (in the $\ell_1$-distance) to the attributed dataset does not change the value of $f$ (again, in the $\ell_1$-distance) by much. This is formalized below.

Definition 3 (Sensitivity of $f$). For a function $f$ that maps an attributed dataset to a vector of real numbers in $\mathbb{R}^d$, we define its ($\ell_1$-)sensitivity to be

$$\Delta(f) = \sup_{D_{\mathrm{attr}} \neq D'_{\mathrm{attr}}} \frac{\|f(D_{\mathrm{attr}}) - f(D'_{\mathrm{attr}})\|_1}{\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1}.$$

For many natural functions, such as the "sum by slices" example above, the sensitivity is bounded (e.g., by 1 in the example).

Each coordinate of the noise $Z$ is drawn from the Laplace distribution with an appropriate scale (see Lemma 1). We note that the attribution system can include a contribution bound enforcement component (this is the case in Algorithms 1 and 2).

The second property we need concerns the attribution system itself. Although we have not defined the contribution bound enforcement yet, it takes in a positive integer parameter $r$, considered as the "contribution bound". To ensure DP, we need this parameter $r$ to be an upper bound on the possible change (in the $\ell_1$ sense) in the resulting attributed dataset.
More specifically, an attribution system, which can be specified by a "configuration" of adjacency relation, contribution bound enforcement point, and attribution rule, is "valid" if two adjacent datasets get mapped to attributed datasets that are at most $O(r)$ apart, as stated more precisely below.

Definition 4 (Valid Configurations). An adjacency relation along with a contribution bound enforcement point is said to be $\lambda_0$-valid for a given attribution rule if, for every positive integer $r$, applying a contribution bound of $r$ at the required enforcement point ensures that any two adjacent datasets always result in two attributed datasets that are at an $\ell_1$-distance of at most $\lambda_0 \cdot r$, where $\lambda_0$ is an absolute constant independent of the numbers of publishers and advertisers. We call a configuration valid if it is $\lambda_0$-valid for some absolute constant $\lambda_0 > 0$.
Assuming the above two properties, if we appropriately scale the Laplace-distributed noise injected in Figure 3, then we can guarantee that the system is DP:

Lemma 1. If the attribution system is instantiated with a $\lambda_0$-valid configuration and each coordinate of the noise $Z$ is sampled according to the Laplace distribution with scale parameter $\lambda_0 \cdot r \cdot \Delta(f) / \varepsilon$, then the conversion measurement system is $\varepsilon$-DP.

We remark that the "converse" of Lemma 1 is also true in the following sense: if we let $f$ be the identity function, i.e., the range of $f$ is associated with $\mathbb{R}^{I \times C}$ and $f(D_{\mathrm{attr}})_{(i,c)} = w_{D_{\mathrm{attr}}}(i, c)$, then the conversion measurement system with $Z$ sampled according to the Laplace distribution with scale $\lambda_0 \cdot r / \varepsilon$ is $\varepsilon$-DP iff the attribution system is instantiated with a $\lambda_0$-valid configuration. In other words, the validity of the configuration characterizes the DP property of the conversion measurement system in this sense.

Proof of Lemma 1. Consider any two adjacent datasets $D, D'$; let $D_{\mathrm{attr}}$ and $D'_{\mathrm{attr}}$ be the results of the attribution system on these two datasets, respectively. Then, since the configuration is $\lambda_0$-valid, we have $\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 \leq \lambda_0 \cdot r$. Therefore, from the definition of $\Delta(f)$, we can conclude that $\|f(D_{\mathrm{attr}}) - f(D'_{\mathrm{attr}})\|_1 \leq \lambda_0 \cdot r \cdot \Delta(f)$. In other words, the entire measurement system before adding noise is simply a function with $\ell_1$-sensitivity at most $\lambda_0 \cdot r \cdot \Delta(f)$. Thus, the standard DP guarantee of the Laplace mechanism [18] implies that the system is $\varepsilon$-DP as desired. □

Remark 4. While the treatment above considers the evaluation of a single function on the input dataset, it can be readily extended to the setting where one would like to compute multiple (possibly adaptively chosen) functions on the same dataset; this can be done using the standard composition properties of DP [19].
Remark 5. For typical functions $f$ that are of interest in ad conversion measurement, the sensitivity $\Delta(f)$ can be computed explicitly. For example, if $f$ computes the total (attributed) conversion count, then $\Delta(f) = 1$. Similarly, if $f$ computes the number of distinct users with attributed conversions, then $\Delta(f) = 1$ too. On the other hand, if $f$ computes the sum of capped conversion values, where each value is capped to a positive real number $B$, then the sensitivity of $f$ is given by $\Delta(f) = B$. Queries of interest are also often "sliced" by certain attributes. For example, one might be interested in a histogram of the total (attributed) conversion count for each publisher or each geographic location (which is, e.g., determined by the impression's metadata). In this case, the sensitivity $\Delta(f)$ remains 1.
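The noise calibration in Lemma 1 can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, the Laplace sample is generated as the difference of two independent exponential variables (a standard identity), and naive floating-point sampling of this kind is known to be unsafe for production DP systems, so it should be read as a statement of the scale formula rather than as a deployable mechanism.

```python
import random
from typing import List, Optional


def laplace_scale(validity_const: float, contribution_bound: int,
                  sensitivity: float, epsilon: float) -> float:
    """Scale lambda_0 * r * Delta(f) / epsilon from Lemma 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive to calibrate finite noise")
    return validity_const * contribution_bound * sensitivity / epsilon


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Zero-mean Laplace sample: difference of two Exp(rate = 1/scale) draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values: List[float], validity_const: float,
              contribution_bound: int, sensitivity: float, epsilon: float,
              rng: Optional[random.Random] = None) -> List[float]:
    """Add independent Laplace noise to each coordinate of f(D_attr)."""
    rng = rng or random.Random()
    scale = laplace_scale(validity_const, contribution_bound, sensitivity, epsilon)
    return [v + sample_laplace(scale, rng) for v in values]
```

For example, a 2-valid configuration with contribution bound 3, a count query of sensitivity 1, and $\varepsilon = 0.5$ gives a Laplace scale of $2 \cdot 3 \cdot 1 / 0.5 = 12$ per coordinate.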

CONTRIBUTION BOUND ENFORCEMENT
As stated earlier, a major part of the attribution system is the contribution bound enforcement algorithm. This algorithm takes a positive integer $r$, the "contribution bound", and tries to "enforce" this contribution bound. We consider two types of enforcement in this paper, pre-attribution and post-attribution, which we will explain next. (It is an interesting future direction to understand if there are other contribution bound enforcement strategies that may be more privacy safe and/or practical than the ones considered here.)

First, let us note that for the conversion adjacency relation (Table 2), no contribution bound enforcement is applied. The reason is that each conversion is used only once in the attribution rule. Therefore, the enforcement strategies described below would not affect it anyway.
To describe the enforcement strategies for the other adjacency relations, it is key to define the scope of a contribution bound.
Definition 5 (Contribution Bounding Scope).The contribution bounding scope for an adjacency relation is the unit of that relation.
For example, for the user adjacency relation, a contribution bounding scope would be each user.

Post-Attribution Contribution Bound Enforcement
For post-attribution contribution bound enforcement, we simply run the attribution algorithm as usual. However, we only add each weighted (impression, conversion) pair to the attributed dataset if it does not exceed the (remaining) contribution bound of that scope. The pseudo-code is given in Algorithm 1.
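The description above can be sketched roughly as follows. This is one reading of the prose, not a transcription of Algorithm 1 (which is not reproduced here): the record types, the `scope_of` callback that maps an impression to its contribution bounding scope, and the choice to process conversions in time order are all assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, NamedTuple, Tuple


class Impression(NamedTuple):
    id: str
    user: str
    publisher: str
    advertiser: str
    time: int


class Conversion(NamedTuple):
    id: str
    user: str
    advertiser: str
    time: int


def post_attribution(
    impressions: List[Impression],
    conversions: List[Conversion],
    rule: Callable[[int], List[float]],
    scope_of: Callable[[Impression], Hashable],
    bound: float,
) -> Dict[Tuple[str, str], float]:
    """Attribute first, then keep a weighted pair only if its scope has budget left."""
    used: Dict[Hashable, float] = defaultdict(float)
    attributed: Dict[Tuple[str, str], float] = {}
    for conv in sorted(conversions, key=lambda c: c.time):
        # Earlier impressions of the same user and advertiser, oldest first.
        touches = sorted(
            (i for i in impressions
             if i.user == conv.user and i.advertiser == conv.advertiser
             and i.time < conv.time),
            key=lambda i: i.time,
        )
        if not touches:
            continue  # the conversion stays unattributed
        for imp, weight in zip(touches, rule(len(touches))):
            scope = scope_of(imp)
            # Drop the pair if it would exceed the scope's remaining bound.
            if weight > 0 and used[scope] + weight <= bound:
                used[scope] += weight
                attributed[(imp.id, conv.id)] = weight
    return attributed
```

Choosing `scope_of = lambda i: i.id` corresponds to the impression scope, while `lambda i: (i.user, i.publisher, i.advertiser)` corresponds to the user × publisher × advertiser scope.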

Pre-Attribution Contribution Bound Enforcement
Pre-attribution enforcement is much more pessimistic than the post-attribution approach. Specifically, we charge one unit from the contribution bound of every scope involved with the input impressions to the attribution rule. If any scope does not have enough contribution bound left, we remove all impressions associated with that scope. As we will show below, such a pessimistic approach is (perhaps not too surprisingly) more privacy-safe than the post-attribution approach in certain settings. The full pseudo-code of the pre-attribution contribution bound enforcement is given in Algorithm 2.

Algorithm listing (only two steps survive in this copy):
$I \leftarrow$ set of impressions that come before the conversion $c$ in time and are associated with the same advertiser and the same user as $c$
8: $S \leftarrow$ set of contribution bounding scopes corresponding to at least one impression in $I$.
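A rough sketch of pre-attribution enforcement, following the prose description and the two surviving pseudo-code steps, is given below. It is not a reproduction of Algorithm 2: the record types, the `scope_of` callback, the time-ordered processing of conversions, and the decision to charge only scopes that still have budget are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, NamedTuple, Tuple


class Impression(NamedTuple):
    id: str
    user: str
    publisher: str
    advertiser: str
    time: int


class Conversion(NamedTuple):
    id: str
    user: str
    advertiser: str
    time: int


def pre_attribution(
    impressions: List[Impression],
    conversions: List[Conversion],
    rule: Callable[[int], List[float]],
    scope_of: Callable[[Impression], Hashable],
    bound: int,
) -> Dict[Tuple[str, str], float]:
    """Charge every involved scope one unit before running the attribution rule."""
    charged: Dict[Hashable, int] = defaultdict(int)
    attributed: Dict[Tuple[str, str], float] = {}
    for conv in sorted(conversions, key=lambda c: c.time):
        # I: earlier impressions with the same user and advertiser as conv.
        touches = sorted(
            (i for i in impressions
             if i.user == conv.user and i.advertiser == conv.advertiser
             and i.time < conv.time),
            key=lambda i: i.time,
        )
        # S: scopes corresponding to at least one impression in I.
        scopes = {scope_of(i) for i in touches}
        exhausted = {s for s in scopes if charged[s] >= bound}
        for s in scopes - exhausted:
            charged[s] += 1  # one unit per conversion, whatever weight follows
        # Impressions of exhausted scopes are removed before attribution.
        kept = [i for i in touches if scope_of(i) not in exhausted]
        if not kept:
            continue
        for imp, weight in zip(kept, rule(len(kept))):
            if weight > 0:
                attributed[(imp.id, conv.id)] = weight
    return attributed
```

Note that a scope is charged a full unit even if the rule later assigns its impressions only a small (or zero) weight, which is what makes this approach pessimistic compared to post-attribution enforcement.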

CLASSIFICATION RESULTS
In this section we present our results on the validity of different adjacency relations; they are summarized in Table 5. For ease of presentation, we organize our results into two categories: those where validity holds independently of the attribution rule, and those where it does not.
We only state in this section the theorems that are proved in Section 6.5. For the other theorems, we provide forward pointers to their formal statements (along with their proofs) in the Appendix.

Attribution Rule-Independent Validity
We start by discussing the adjacency relations whose validity turns out to be independent of the attribution rule. The conversion adjacency relation (which, as stated earlier, involves no contribution bound enforcement) turns out to be valid for every attribution rule (Theorem 4).
For pre-attribution enforcement, it also turns out that all adjacency relations and all attribution rules result in valid configurations (Theorem 5).
We next turn our attention to post-attribution enforcement. In this case, we show that the validity of the user × advertiser and user adjacency relations is independent of the attribution rule (Theorems 6 and 7). By contrast, the following theorem (which we will prove in Section 6.5) shows the invalidity of the user × publisher adjacency relation for any attribution rule.
Theorem 1. For any attribution rule, the user × publisher adjacency relation with contribution bound enforced post-attribution constitutes an invalid configuration.

Post-Attribution Enforcement and Impression Adjacency
In this section we consider the impression adjacency relation under post-attribution enforcement of the contribution bound. Our first result proves the validity of FTA, with validity constant 2. Following Lemma 1, this means that these configurations require adding twice as much noise compared to those in previous validity results (under the same contribution bound).
Theorem 2. For the FTA rule, the impression adjacency relation with contribution bound enforced post-attribution constitutes a 2-valid configuration.
We also prove a similar result for LTA (Theorem 8). By contrast, we prove that the UNI, EXP, and U-S attribution rules are all invalid in this case (Theorem 9, Corollary 1, and Theorem 10, respectively).

Post-Attribution Enforcement and User × Publisher × Advertiser Adjacency
We next consider the user × publisher × advertiser adjacency relation. In this case, and under post-attribution enforcement, it turns out that only FTA results in a valid configuration (Theorem 11) whereas the LTA, UNI, EXP, and U-S attribution rules result in invalid configurations (Theorem 3 and Corollary 2).
Theorem 3. For the LTA attribution rule, the user × publisher × advertiser adjacency relation with contribution bound enforced post-attribution constitutes an invalid configuration.
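To see numerically how LTA can fail in this setting, the sketch below builds one dataset with a single user, a single advertiser, and $n$ publishers, and measures how far the attributed dataset moves when all impressions of one publisher are removed. The dataset layout is a hypothetical construction chosen for illustration and is not claimed to be the one used in the paper's proof; the event encoding and function names are likewise assumptions. With one user and one advertiser, the contribution bounding scope reduces to the publisher.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# ("imp", impression_id, publisher) or ("conv", conversion_id, None), in time order.
Event = Tuple[str, str, Optional[str]]


def lta_post_attribution(events: List[Event], bound: float) -> Dict[Tuple[str, str], float]:
    """LTA with post-attribution enforcement, scoped per publisher."""
    used: Dict[str, float] = defaultdict(float)
    last_imp: Optional[Tuple[str, str]] = None
    attributed: Dict[Tuple[str, str], float] = {}
    for kind, ident, publisher in events:
        if kind == "imp":
            last_imp = (ident, publisher)
        elif last_imp is not None:
            imp_id, pub = last_imp
            if used[pub] + 1.0 <= bound:  # keep the pair only if budget remains
                used[pub] += 1.0
                attributed[(imp_id, ident)] = 1.0
    return attributed


def l1_distance(d1: Dict, d2: Dict) -> float:
    keys = set(d1) | set(d2)
    return sum(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) for k in keys)


def build_dataset(n: int) -> List[Event]:
    """For j < n: impression on p_j, then impression on p_n, then a conversion."""
    events: List[Event] = []
    for j in range(1, n):
        events.append(("imp", "i%d" % j, "p%d" % j))
        events.append(("imp", "x%d" % j, "p%d" % n))
        events.append(("conv", "c%d" % j, None))
    return events


def remove_publisher(events: List[Event], publisher: str) -> List[Event]:
    """Adjacent dataset: drop every impression of one (user, publisher, advertiser) unit."""
    return [e for e in events if not (e[0] == "imp" and e[2] == publisher)]


def distance_after_removal(n: int, bound: float = 1.0) -> float:
    d = build_dataset(n)
    d_prime = remove_publisher(d, "p%d" % n)
    return l1_distance(lta_post_attribution(d, bound), lta_post_attribution(d_prime, bound))
```

In this construction, with bound 1, publisher $p_n$ absorbs every conversion before capping and only one pair survives, while after its removal each of the other $n - 1$ publishers receives weight one. The distance therefore grows linearly in $n$ even though the contribution bound is fixed, which is the behavior that Definition 4 rules out.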

Intuition and Proof Overview
Before we proceed to the formal proofs of these classification results, let us briefly (and informally) discuss the high-level ideas behind them. For the invalidity results in the post-attribution case, there are (roughly speaking) two root causes:
• Cascading Effect within Attribution Rule. A single impression can be an input of multiple executions of the attribution algorithm (Line 8 of Algorithm 1). When removing such an impression from the dataset, the attribution rule also changes the weights assigned to other input impressions. Below we construct datasets which make sure that these changes affect different privacy units, implying that the total change can exceed the contribution bound. This is the idea behind our constructions for the impression adjacency relation (Theorem 9, Corollary 1, and Theorem 10).
• Multiple Impressions Affecting Multiple Privacy Units. In some scenarios (see LTA, FTA discussion below), changing a single impression does not result in a cascading effect. In this case, the high-level idea is to construct multiple impressions, corresponding to the same privacy unit, in such a way that removing each one affects some other different privacy unit. When all of these impressions are removed simultaneously, the effect occurs across different privacy units and therefore bypasses the contribution bounding. This is the gist of our constructions for the user × publisher adjacency relation (Theorem 1) and the user × publisher × advertiser adjacency relation (Theorem 3).
We next discuss the validity results. We remark that the attribution rule-independent results (e.g., for the user-level adjacency relation) are relatively straightforward to prove, so we will focus our discussion here on the exceptions: LTA for the impression-level adjacency relation (Theorem 8), and FTA for the impression-level adjacency (Theorem 2) and the user × publisher × advertiser adjacency relations (Theorem 11).
• LTA. When we remove an impression, LTA essentially "routes" all the attributions of this impression to the previous one with the same advertiser. Thus, in the impression-level adjacency relation, contribution bounding will upper-bound the change. On the other hand, this reasoning fails for the user × publisher × advertiser adjacency relation because it is possible to remove multiple impressions that affect different privacy units (i.e., different publishers); this is indeed the second root cause described above for invalidity.
• FTA. When we remove an impression, if this impression is not the first one of this advertiser, then no change occurs in attribution. Otherwise, FTA "routes" all the attributions of this impression to the second impression of this advertiser. Similar to LTA, this implies validity for the impression-level adjacency relation. However, in contrast to LTA, this argument remains true in the user × publisher × advertiser adjacency relation; this is because, even after removing multiple impressions of the same advertiser, all attributions are routed to the same impression, namely the first one of this advertiser after the removal. Therefore, post-attribution enforcement successfully bounds the change.
This concludes our summary of the proof ideas. We will next formalize these by providing the proofs of Theorems 1, 2, and 3. (The remaining proofs are deferred to the Appendix.)

Selected Proofs
FTA, Impression Adjacency. We now prove the validity of the FTA rule for the impression adjacency relation. The proof follows the outline from the previous subsection.
Theorem 2. For the FTA rule, the impression adjacency relation with contribution bound enforced post-attribution constitutes a 2-valid configuration.
Proof of Theorem 2. Consider two adjacent datasets $D, D'$ such that $D'$ results from removing an impression $\tilde{i}$. Let $A$ be $\tilde{i}$'s advertiser, $u$ be $\tilde{i}$'s user, and $C_{A,u}$ denote the set of conversions from the advertiser $A$ and the user $u$. We may assume w.l.o.g. that all conversions in $C_{A,u}$ occur after the first impression w.r.t. the advertiser $A$ and the user $u$, as other conversions remain unattributed in both $D$ and $D'$. We consider two cases, based on whether $\tilde{i}$ is the first impression w.r.t. its advertiser $A$ and its user $u$ (in $D$).
• Case I: $\tilde{i}$ is not the first impression w.r.t. the advertiser $A$ and the user $u$. In this case, the two attributed datasets $D_{\mathrm{attr}}, D'_{\mathrm{attr}}$ are exactly the same, because all conversions in $C_{A,u}$ are attributed to the first impression w.r.t. the advertiser $A$ (which is not $\tilde{i}$).
• Case II: $\tilde{i}$ is the first impression w.r.t. the advertiser $A$ and the user $u$. In this case, all conversions $c \in C_{A,u}$ are attributed to $\tilde{i}$. These are the only conversions whose attributions change between $D$ and $D'$.
To analyze this change, consider two subcases, based on whether $\tilde{i}$ is the only impression in $D$ from $A$ and $u$.
- Case IIa: $\tilde{i}$ is the only impression in $D$ from $A$ and the user $u$. In this case, all conversions in $C_{A,u}$ become unattributed in $D'$. Therefore, $\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 = \sum_{c \in C_{A,u}} w_{D_{\mathrm{attr}}}(\tilde{i}, c) \leq r$, where the inequality follows from the post-attribution contribution bound enforcement for $\tilde{i}$.
- Case IIb: $\tilde{i}$ is not the only impression in $D$ from $A$ and the user $u$. Let $\tilde{i}'$ be the second impression in $D$ from $A$ and $u$. In this case, every conversion in $C_{A,u}$ is either attributed to $\tilde{i}'$ or unattributed in $D'$. Furthermore, no conversions are attributed to $\tilde{i}'$ in $D$ (because $\tilde{i}$ comes before $\tilde{i}'$ for the same advertiser $A$ and the same user $u$). Therefore, we have

$$\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 = \sum_{c \in C_{A,u}} w_{D_{\mathrm{attr}}}(\tilde{i}, c) + \sum_{c \in C_{A,u}} w_{D'_{\mathrm{attr}}}(\tilde{i}', c) \leq r + r = 2r,$$

where the inequality follows from the post-attribution contribution bound enforcement for impressions $\tilde{i}$ and $\tilde{i}'$.
In all cases, we can conclude that $\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 \leq 2r$. □

LTA, User × Publisher × Advertiser Adjacency. Next, we prove the invalidity of the LTA rule under the user × publisher × advertiser adjacency relation. This is due to the fact that we may arrange the impressions/conversions in such a way that a single publisher gets a large amount of attribution weight (before contribution bound enforcement) and that, once this publisher is removed, this weight is re-attributed to multiple publishers. The latter ensures that the change grows with the number of publishers. (This is also the main difference between LTA and FTA, since we cannot ensure such a condition for FTA.) Such an example is given together with a formal argument below.
Theorem 3. For the LTA attribution rule, the user × publisher × advertiser adjacency relation with contribution bound enforced post-attribution constitutes an invalid configuration.
Proof of Theorem 3. Let $r = 1$ and let $n > 1$ be any integer. We construct the dataset $D$ as follows:
• Let there be a single user, a single advertiser, and $n$ publishers $p_1, \dots, p_n$.
In $D$, the impressions of publishers $p_1, \dots, p_{n-1}$ get attributed with zero weight. On the other hand, in $D'$, each of these publishers gets attribution weight of exactly one. Therefore, $\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 \geq n - 1$, invalidating the attribution system for this configuration. □

Any Attribution Rule, User × Publisher Adjacency. Finally, we prove the invalidity of the user × publisher adjacency for any attribution rule. We remark that, if we were looking for an invalidity proof of a specific attribution rule, then the construction could have been simplified. For example, the construction in Theorem 3 above also shows the invalidity of LTA in this setting. However, we would like our proof to generalize to all attribution rules. Our construction below accomplishes this by first creating another "dummy" dataset $\tilde{D}$ (with multiple publishers) to understand how the attribution weights are distributed across different publishers. We then create the datasets $D, D'$ that differ on the highest weighted publisher to ensure that there is a large (unbounded) change between the two attributed datasets.
Theorem 1. For any attribution rule, the user × publisher adjacency relation with contribution bound enforced post-attribution constitutes an invalid configuration.
Proof of Theorem 1. Let $a$ be any attribution rule. To create our datasets $D, D'$, let us start by constructing another dataset $\tilde{D}$ as follows.
• For each advertiser $A_{\{j,k\}}$, let there be impressions $i^{\{j,k\}}_j, i^{\{j,k\}}_k$ and a conversion $c^{\{j,k\}}$ coming after both impressions. Furthermore, let $i^{\{j,k\}}_j$ and $i^{\{j,k\}}_k$ be associated with publishers $p_j$ and $p_k$, respectively.
Now, suppose we run the attribution system, without any contribution bound enforcement, on $\tilde{D}$. Let $p_\ell$ denote the publisher that gets the largest total attribution weight (with ties broken arbitrarily). Note that the total weight it receives must be at least $\binom{n}{2} / n = 0.5(n - 1)$. We now construct $D$ by keeping only the advertisers $A_{\{\ell,j\}}$ for $j \in [n] \setminus \{\ell\}$ in $\tilde{D}$ (and discarding the rest of the advertisers together with all impressions and conversions associated with them). Let the contribution bound $r$ be 1. Furthermore, let $D'$ denote the dataset resulting from removing all impressions corresponding to publisher $p_\ell$ from $D$. We will now show that $\|D_{\mathrm{attr}} - D'_{\mathrm{attr}}\|_1 \geq 0.5(n - 1)$, which implies that the attribution system is an invalid one.
Combining the above two inequalities, we get that

DISCUSSION AND FUTURE DIRECTIONS
In this paper, we presented a formal framework for the DP ad conversion measurement setting. We also demonstrated a delicate interplay between attribution and privacy. We defined the notion of operationally valid configurations, and provided a complete classification of the validity of the configurations based on the most popular attribution rules, adjacency relations, contribution bounding scopes, and contribution bound enforcement points. We hope that our end-to-end differential privacy framework can lead to a solid foundation for practical privacy-preserving ad conversion measurement systems. While we have focused for simplicity on pure-DP (Definition 1), $\ell_1$-sensitivity (Definitions 2, 3, and 4), and the Laplace mechanism (Lemma 1), our formalism extends readily to the case of approximate-DP [17], other types of sensitivity, and other DP mechanisms. For example, if the set of measurements is large, we could replace the Laplace mechanism with the partition selection algorithm (see, e.g., [14] and the references therein) if we relax to approximate DP. Similarly, we can extend our framework to $\ell_2$-sensitivity and the Gaussian mechanism, which could for instance be used to train DP predicted conversion rate models based on DP stochastic gradient descent [2]. (For prior work on non-private conversion models, see, e.g., [5,27,28,33].)

We describe next some interesting future research directions.
Adjacency Relation ≠ Contribution Bounding Scope. We focused in this work on the most natural setting where the contribution bounding scope is the same as the adjacency relation. In principle, this is not necessary: e.g., one might consider a user contribution bounding scope with a user × advertiser adjacency relation. It might be interesting to give a characterization in such cases, as it will lead to an even more fine-grained understanding of the privacy provided by the conversion measurement system.
Contribution Capping: Beyond Pre- and Post-Attribution? While we focus on pre-attribution and post-attribution contribution capping, it remains an interesting open question whether there are other (general) capping procedures that can further improve the utility-privacy trade-off.
To illustrate the challenge, note that an intuitive capping strategy is to do it "at the query evaluation time". Although such a strategy makes sense for certain query functions $f$ and adjacency relations, it is not completely well-defined for all functions $f$. For example, let $f$ be the number of distinct users with attributed conversions from Remark 5 and suppose we are interested in the impression adjacency relation. If two impressions share the same user ID, then it is not clear what their contributions are: on the one hand, removing each of them alone does not cause any change to the value of $f$; on the other hand, removing them both may decrease the value. Such a situation is only exacerbated when we have a more complicated function $f$. We also remark that it is preferable if a single capping procedure is used for all functions $f$ since it allows more flexibility for the measurements that can be made on the platform.
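The two-impression example above can be checked with a few lines of code. This is a minimal sketch under simplifying assumptions that are not part of the original text: a single advertiser, no contribution bound, and an attribution rule whose weights sum to 1, so that a conversion is attributed exactly when the same user has at least one earlier impression. The function name and the tuple encoding are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[str, int]  # (user_id, timestamp)


def users_with_attributed_conversions(
    impressions: List[Record], conversions: List[Record]
) -> int:
    """Query f: number of distinct users with at least one attributed conversion."""
    users = set()
    for conv_user, conv_time in conversions:
        # The conversion is attributed iff the user has an earlier impression.
        if any(u == conv_user and t < conv_time for u, t in impressions):
            users.add(conv_user)
    return len(users)


impressions = [("u", 1), ("u", 2)]
conversions = [("u", 3)]

full = users_with_attributed_conversions(impressions, conversions)
without_first = users_with_attributed_conversions(impressions[1:], conversions)
without_second = users_with_attributed_conversions(impressions[:1], conversions)
without_both = users_with_attributed_conversions([], conversions)
```

Here `full`, `without_first`, and `without_second` all evaluate to 1, whereas `without_both` evaluates to 0, so neither impression has a well-defined individual contribution to $f$ even though jointly they account for the whole value.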
Privacy of the Computation. Our work has focused on guaranteeing privacy against an adversary that has access to "what" is being computed, but not to "how" it is being computed. Concretely, our results capture the setting where a single entity has access to all the raw impression and conversion data, and seeks to release DP estimates to some requested conversion measurement queries. Studying the privacy of "how" this is computed is an important direction for future work. For instance, one could naturally extend this formalism to distributed settings where the trust in a single entity is relaxed by relying on methods such as secure multiparty computing [21], and on-device noise addition as in local DP [18,22,26] or shuffle DP [10,11,20]. Moreover, we studied the case of a static dataset of impressions and conversions; it would be of interest to study the online variant of the problem where privacy needs to be ensured at any time as the impressions and conversions take place.
Enhanced Attribution. Some attribution systems offer a conversion lookback window option (which limits how far back in time from a conversion impressions remain eligible for attribution), and an impression expiry option (which limits how far into the future an impression remains eligible for attribution). It would be interesting to investigate the interplay of these enhancements with privacy and their impact on the validity of the configurations.
In our classification, we considered the simplest and most commonly used attribution rules (listed in Section 3.2), which operate on a single user's data.It would be interesting to investigate the interplay between DP and more advanced alternatives such as those based on the Shapley value (e.g., [35]) as well as data-driven attribution (DDA), which by contrast is a class of attribution rules that operate on the entire dataset (across users).
Incentives.While there has been interesting prior work at the intersection of privacy and economics, e.g., [4,23], understanding privacy and incentives in the conversion measurement setting would greatly benefit from further investigation.For instance, our study captures the case where a single ad tech company would like to query the DP conversion measurement system across multiple publishers.In reality, multiple ad techs, which are often competing but could in principle collude, would want to issue DP queries on overlapping impressions and conversions taking place on the same set of publishers and advertisers.
Finally, while a user contribution bounding scope can admit valid configurations, it is vulnerable to "crowding out attacks", where, e.g., one publisher can exhaust the contribution bound of a user (by showing them a large number of impressions). Incorporating the economic incentives of different entities into the analysis of privacy and utility of conversion measurement systems seems worthwhile.
Privacy-Utility Trade-offs of Various Tasks. The classification in this work is in terms of sensitivity, which is closely related to additive noise mechanisms as these naturally calibrate the noise scale to the sensitivity. While most proposed DP conversion measurement systems follow this sensitivity and additive noise paradigm, it would be valuable to consider other families of mechanisms, and to quantify the privacy-utility trade-offs of various estimation tasks.
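As an illustration of the sensitivity and additive noise paradigm, the following minimal Python sketch adds Laplace noise whose scale is the L1 sensitivity divided by the privacy parameter, which is the standard calibration of the Laplace mechanism. The function names, the `l1_sensitivity` and `epsilon` parameters, and the example campaign keys are assumptions made for this sketch; it is not the system described in this work, and it uses a non-cryptographic random number generator with floating-point arithmetic, so it is unsuitable for production use.

```python
import random
from typing import Dict, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from the Laplace distribution with mean 0 and the given scale."""
    # The difference of two independent Exp(1) variables is a standard Laplace variable.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_counts(
    counts: Dict[str, float],
    l1_sensitivity: float,
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    """Add independent Laplace noise of scale l1_sensitivity / epsilon to every coordinate."""
    if l1_sensitivity <= 0 or epsilon <= 0:
        raise ValueError("l1_sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = l1_sensitivity / epsilon
    # Every coordinate receives noise, including coordinates whose true count is zero;
    # the set of keys is assumed to be fixed independently of the data.
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}


# Hypothetical attributed conversion counts per campaign slice.
attributed = {"campaign_a": 120.0, "campaign_b": 0.0}
print(noisy_counts(attributed, l1_sensitivity=2.0, epsilon=1.0))
```

Because the noise scale grows linearly with the sensitivity, a configuration whose sensitivity cannot be bounded in terms of the contribution bound leaves no finite scale to calibrate to, which is why the classification in terms of sensitivity matters for this family of mechanisms.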
Correlation across Users. There are settings where different users' data can be correlated. In ad measurement, this can arise if multiple users watch the same ad (e.g., on a TV). Then, their impressions are correlated, and extra care is needed when applying DP [38]. We leave the exploration of this interesting setting for future work.
DP Advertising. Applying DP in practical advertising systems has been notoriously difficult for the reasons considered in this work, namely: How should adjacent datasets be defined? Given the correlation between a given user's behavior across (a practically unbounded number of) websites (and/or apps), can DP be applied without adding a disproportionately large amount of noise that would preclude the measurement of simple statistics? We hope that our work provides a stepping stone for tackling these questions in the setting of attribution measurement, the cornerstone of digital advertising, and leads to solid deployments of DP in practical advertising systems.
where the inequality again follows from the post-attribution contribution bound enforcement for the contribution bounding scope ( , , ) and the contribution bounding scope of ĩ′. In all cases, we can conclude that ∥D_attr − D′_attr∥_1 ≤ 2 , which means that this is a valid configuration with  0 = 2 as desired. □
Uniform Multi-Touch, Exponential Time Decay and U-Shaped Attributions. The invalidity of these attribution rules (Corollary 2) follows directly from that of the impression case.
Corollary 2. For uniform multi-touch attribution, exponential time decay attribution and U-shaped attribution, the user × publisher × advertiser adjacency relation with contribution bound enforced post-attribution constitutes an invalid configuration.
Proof of Corollary 2. We may use the same constructions as in the proofs of Theorem 9, Corollary 1, and Theorem 10, respectively, except that we assign each  1 , . . .,   to  different publishers. This way, D and D′ are adjacent under the user × publisher × advertiser relation. The remainder of each proof then shows that ∥D_attr − D′_attr∥_1 is not bounded above by  0 •  for any constant  0 . □

Figure 2: Attribution Path for Multiple Publishers and Advertisers. In this example, the user interacts with ads on two publishers, and converts on the two corresponding advertiser websites.

Figure 3: Illustration of a (DP) Conversion Measurement System. Each coordinate of the noise  is drawn from the Laplace distribution with an appropriate scale (see Lemma 1). We note that the attribution system can include a contribution bound enforcement component (this is the case in Algorithms 1 and 2).

• Case II: ĩ1 is the first impression w.r.t. the advertiser  and the user  . In this case, all conversions   ∈   are attributed to ĩ1. These are the only conversions whose attributions change between D and D′. To analyze this change, consider two subcases, depending on whether ĩ1, ..., ĩ  are the only impressions in D from the advertiser  and the user  .
- Case IIa: ĩ1, ..., ĩ  are the only impressions in D from the advertiser  and the user  . In this case, all conversions in   become unattributed in D′. Thus,
∥D_attr − D′_attr∥_1 =  ∈ []    D_attr( ĩ ,   ) ≤  ,
where the inequality follows from the post-attribution contribution bound enforcement for the contribution bounding scope ( , , ).
- Case IIb: ĩ1, ..., ĩ  are not the only impressions in D from the advertiser  and the user  . Let ĩ′ be the first impression w.r.t. the advertiser  and the user  that is not among ĩ1, ..., ĩ  . In this case, all conversions in   are attributed to ĩ′ in D′. Furthermore, no conversions are attributed to ĩ′ (or other impressions in ĩ′'s contribution bounding scope) in D because ĩ1 comes first for the same advertiser  and the same user  . Therefore, we have
∥D_attr − D′_attr∥_1 =  D_attr( ĩ ,   ) + ∑    D′_attr( ĩ′ ,   ) ≤ 2 ,