Opted Out, Yet Tracked: Are Regulations Enough to Protect Your Privacy?

Data protection regulations, such as GDPR and CCPA, require websites and embedded third parties, especially advertisers, to seek user consent before they can collect and process user data. Only when users opt in can these entities collect, process, and share user data. Websites typically incorporate Consent Management Platforms (CMPs), such as OneTrust and CookieBot, to solicit and convey user consent to the embedded advertisers, with the expectation that the consent will be respected. However, neither the websites nor the regulators currently have any mechanism to audit advertisers' compliance with user consent, i.e., to determine whether advertisers indeed do not collect, process, and share user data when the user opts out. In this paper, we propose an auditing framework that leverages advertisers' bidding behavior to empirically assess violations of data protection regulations. Using our framework, we conduct a measurement study to evaluate four of the most widely deployed CMPs, i.e., Didomi, Quantcast, OneTrust, and CookieBot, as well as advertiser-offered opt-out controls, i.e., National Advertising Initiative's opt-out, under GDPR and CCPA. Our results indicate that in many cases user data is unfortunately still being collected, processed, and shared even when users opt out. We also find that some CMPs are better than others at conveying user consent and that several ad platforms ignore user consent. Our results also indicate that advertiser-offered opt-outs are equally ineffective at protecting user privacy.


INTRODUCTION
There has been a recent increase in the promulgation of data protection regulations, such as the General Data Protection Regulation (GDPR) [47], the California Consumer Privacy Act (CCPA) [48], and the General Personal Data Protection Act (LGPD) [49], across the globe. At a high level, data protection regulations aim to protect user privacy by mandating that online services obtain user consent before collecting, processing, and sharing user data. Because of their mass deployment, automatic enforcement, and legal binding, data protection regulations have the potential to protect user privacy, provided that users do not consent to data collection and processing. In fact, infringement fines have already amounted to billions. For example, in the case of GDPR, arguably the most mature data protection regulation, the fines have accumulated to a total of 1.6 billion [21]. However, despite the legal binding, prior research has found that online services often trick users into giving positive consent [75], do not include controls to opt out of data collection and processing [88], or deploy user interfaces that are unintuitive to navigate when providing consent [51,66]. Even in cases where users are able to exercise their rights, user data is poorly handled. For example, online services often ignore or respond late to data access requests [87] and even leak sensitive user data to unauthorized users because of weak authentication mechanisms [55,63]. In some cases, these issues could be attributed to the complexity of the regulations or to the unpreparedness or oversights of online services. In other cases, they could be attributed to online services' disregard for data protection regulations.
Regulators have mostly focused on auditing the compliance of large well-known corporations, such as Amazon [5] and Google [8], perhaps because of the lack of systematic mechanisms to automatically detect infringements at scale [22]. Prior research [51,66,75,88] has focused on auditing the implementation deficiencies in consent management platforms/tools, but it has largely ignored the instances where consent is correctly conveyed but online services fail to comply. Though negligence in implementation raises doubts about the seriousness of online services in protecting users' privacy, it does not by itself imply non-compliance.
In this paper, we set out to fill this gap in the state-of-the-art research and in the deployed practice of regulatory bodies by assessing whether online services are actually compliant with data protection regulations. To this end, we propose a framework to automatically audit regulatory compliance. We focus on cases where user consent is correctly conveyed but online services may not necessarily comply. We evaluate our auditing framework on the web, where websites typically record user consent using consent management platforms (CMPs), e.g., OneTrust [29], and convey it to advertisers under GDPR and CCPA. Our key idea is to leak user interest data in controlled A/B experiments, opt out of or opt in to processing and selling through CMPs, and leverage advertisers' bidding behavior as a side channel in the advertising ecosystem to infer the processing and selling of user information. Since the bidding behavior of advertisers is shaped by their pre-existing knowledge of the user, we expect to receive higher bids when advertisers process or sell leaked user interest data, i.e., are non-compliant with the law, despite the user choosing to opt out.
We summarize our key contributions as follows:
(1) We propose a framework to automatically audit regulatory compliance of online services. We implement our framework by extending OpenWPM [64]. The framework allows us to imitate a real user, automatically opt out of or opt in to data processing and selling, and capture advertisers' bids.
(2) As a case study, we use our proposed framework to audit regulatory compliance of online services under GDPR and CCPA with four consent management platforms, i.e., Didomi [13], Quantcast [39], OneTrust [29], and CookieBot [9]. Our results indicate that in many cases the advertisers do not necessarily comply with the user's choice to opt out of data processing and selling. Some CMPs perform better than others, though. For example, when consent is conveyed through Didomi, advertisers' bidding behavior changes significantly under CCPA.
(3) We also pursue a comparative analysis between state-enforced regulations and advertiser-offered controls, i.e., National Advertising Initiative's (NAI) central opt-out [28], in reducing the collection and selling of user data. Our results indicate that the advertiser-offered NAI opt-out controls might be equally ineffective at protecting user privacy.
Paper Organization: The rest of the paper is outlined as follows. Section 2 presents an overview of online privacy threats and protection mechanisms. Section 3 describes the design of our framework to audit regulatory compliance of online services. Section 4 presents the results of our auditing. Section 5 presents a discussion and the limitations of our proposed auditing framework. Section 6 offers the main conclusions from our work.

BACKGROUND & RELATED WORK

2.1 Online Tracking
Online trackers capture users' browsing histories and activities across the web to facilitate online behavioral advertising, among other use cases [58]. Online tracking is typically conducted through cookies that are set by third-party resources loaded on websites, with the key idea being that third parties have cross-site access to their cookies. Since most third parties are present on a limited number of websites, they often partner with each other to increase their coverage. Prior research has shown that trackers engage in data sharing partnerships and exchange cookies with as many as 118 other third parties [64], which allows them to increase their coverage by as much as 7 times [82].
Online tracking, and especially tracking-driven advertising, poses a serious threat to users' privacy at both the individual and the societal level. At the individual level, trackers collect sensitive personal information, for example, about health and sexual orientation, which is then used to hyper-target individuals, for instance, through personalized ads [54,79]. At the societal level, tracking-driven advertising has been leveraged to conduct mass surveillance [26], increase political polarization [50], spread misinformation [17], and discriminate [18]. Overall, people are frustrated by the privacy harms facilitated by online tracking.

Protection Mechanisms
2.2.1 Self-Regulations. To tackle user privacy concerns and pressure from regulatory bodies, such as the Federal Trade Commission (FTC), the online advertising industry has responded with self-regulations [19,33]. However, prominent self-regulatory actions, such as the ones facilitated by browsers, for example, Platform for Privacy Preferences (P3P) [35] and Do Not Track (DNT) [42], and the ones offered by the advertisers, for example, Digital Advertising Alliance's (DAA) AdChoices [2] and National Advertising Initiative's (NAI) central opt-out [28], are either not respected by the majority of vendors or are too convoluted to be used or understood by lay users.
Browser-Facilitated Automated Controls. Browsers provide several mechanisms that advertisers can leverage to enforce self-regulatory measures in an automated manner. P3P and DNT request headers are two such mechanisms. P3P, now discontinued, was an automated mechanism for online services (e.g., websites and third-party vendors) to communicate their privacy policies to web browsers. It was implemented by major web browsers, such as Internet Explorer and Firefox [40,72], and supported by thousands of websites [60]. However, P3P was often misrepresented by online services [73,84], likely because it was not enforced under any state regulation. Similarly, DNT was proposed to convey users' privacy preferences to online services in an automated manner. However, it also enjoyed limited adoption and had practically no impact in limiting tracking. Libert et al. [74] reported that only 7% of websites mentioned DNT in their privacy policies, and in the majority of those cases the policies specified that the DNT signal is not respected. Miguel et al. [56] conducted an A/B study and found that the DNT signal had essentially no impact on the ad targeting experienced by users.
Advertiser-Offered Manual Controls. In response to the concerns from the FTC, advertising networks formed the National Advertising Initiative (NAI), which provides a central interface for users to opt out of targeted advertising, i.e., if users opt out through NAI's central opt-out interface, they will (supposedly) no longer be tracked for online advertising [27]. McDonald and Cranor [76] conducted a user study and found that only 11% of respondents understood NAI's opt-out mechanism, which indicates that its adoption is perhaps low. Similarly, taking a step forward in self-regulation, several advertising consortiums created the Digital Advertising Alliance (DAA), with the aim of providing easy-to-access user transparency and control, through the "AdChoices" icon, to opt out of targeted advertisements [2]. Unfortunately, the adoption of AdChoices is also limited: only 9.9% of ads shown on top websites had the AdChoices icon [67].

2.2.2 User-Managed Privacy Protections. Currently, the most effective way for users to self-protect their privacy is to rely on off-the-shelf privacy-enhancing tools, such as AdBlock Plus [1], Privacy Badger [37], and Disconnect [16]. However, privacy-enhancing tools are not available by default in browsers and need to be installed separately, which limits their adoption to mostly tech-savvy users. Further, trackers engage in an arms race with privacy-enhancing tools and come up with evasive tactics, for example, bounce tracking [89] and CNAME cloaking [61], to evade privacy protections.
The other, likely more feasible, alternative is to rely on the default privacy protections offered by mainstream browsers, which are available to a larger population. However, these protections are too weak to completely protect user privacy. For example, some mainstream browsers block third-party cookies, but this still leaves users susceptible to new and sophisticated ways of tracking, such as browser fingerprinting [64,69]. Further, some browsers, such as Google Chrome, are too cautious even in blocking third-party cookies because of website breakage concerns [43].

State-Enforced Regulations: Focus of Our Work. Neither self-regulations nor user-managed privacy protections have any legal binding, and they are thus blatantly bypassed by advertisers and trackers. Only recently have legislators promulgated regulations, such as the General Data Protection Regulation (GDPR) [47] in the EU and the California Consumer Privacy Act (CCPA) [48] in California, that have the potential to rein in the online advertising and tracking ecosystem. These regulations have clearly stated frameworks that define protocols to collect, share, and use personal user information. Most importantly, their infringements can be prosecuted, which can lead to heavy fines [20,48]. For example, Amazon and Google were recently fined 746 million [5,12] and 50 million [8] under GDPR, respectively. Essentially, these regulations have the ability to keep the advertising and tracking ecosystem in check.
Both GDPR and CCPA guarantee a right for individuals to opt out of processing and selling of their data. Under GDPR, online services need to obtain user consent (Article 4 (11)) before they can process user data (Article 6 (1) (a)). GDPR has a broad definition of data processing that includes collection, recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction of user data (Article 4 (2)). Under CCPA, online services need to provide user controls to opt out of the sale of personal user data (Section 1798 (a) (1)). CCPA has a broad definition of personal data selling that includes selling, renting, releasing, disclosing, disseminating, making available, and transferring data to another business or a third party for monetary or other valuable consideration (Section 1798.140 (t) (1)). Both GDPR's and CCPA's definitions of data processing and selling cover routine data exchanges, such as processing user data to serve personalized ads (e.g., through the Real-Time Bidding (RTB) protocol [41]) and sharing data with advertising partners under data sharing partnerships (e.g., with cookie syncing [65]). In fact, the Office of the California Attorney General explicitly lists several such examples as violations of CCPA [7,80]. It is noteworthy that GDPR requires consent to be obtained beforehand (Article 6 (1) (a)): "Processing shall be lawful only if and to the extent that at least one of the following applies: (a) the data subject has given consent to the processing of his or her personal data for one or more specific purposes." In contrast, CCPA requires tools to opt out later (Section 1798.120 (a)): "A consumer shall have the right, at any time, to direct a business that sells or shares personal information about the consumer to third parties not to sell or share the consumer's personal information. This right may be referred to as the right to opt-out of sale or sharing." CCPA does not require consent beforehand because it only restricts the selling/sharing of personal data and not its collection.
Both GDPR and CCPA require websites to provide privacy notices with information and controls to opt in to or out of personal information collection and/or processing. To obtain user consent, websites typically embed off-the-shelf consent management platforms (CMPs), e.g., OneTrust [29] and Cookiebot [9]. CMPs scan websites and identify all cookies set by HTTP headers and scripts, from both first- and third-party resources. In the case of GDPR, CMPs should ensure that only strictly necessary cookies are shared and that consent is obtained before non-essential cookies, such as those for advertising and analytics, are shared. In the case of CCPA, CMPs should ensure that they provide controls for users to opt out of the sale of their personal information. Figure 1a shows an example consent dialog displayed under GDPR, and Figure 1b shows an example consent dialog displayed under CCPA.

Header bidding
Header bidding [24] is a strategy in which websites, known as publishers, offer their advertising slots to multiple advertisers at once. The advertiser with the most competitive bid wins the opportunity to show its ad in the respective ad slot. In client-side header bidding, the auction runs in the user's browser, so all bids can be observed directly from the browser. A prominent implementation of this approach is prebid.js [36]. We crawled the Alexa top-100K websites, including those with advertisements (e.g., cnn.com) and those without (e.g., google.com), to check whether they use prebid.js. We found that 5421 websites use prebid.js with the standard API label pbjs (prebid.js also supports custom API labels, and websites that use them are excluded from our count). This signifies that over 5% of websites continue to employ prebid.js.
In contrast, server-side header bidding conducts the auction on the ad server instead of in the user's browser. Consequently, the bids from participating advertisers remain concealed from the web user. Noteworthy instances of server-side header bidding include Google Open Bidding, Amazon TAM, and Prebid Server. Compared to server-side header bidding, client-side header bidding has the following advantages:
(1) Through header bidding, publishers retain the ability to select buyers via header bidding wrappers. Additionally, publishers have the authority to set the minimum price for each ad unit. Consequently, the entire auction process is visible and clear to both publishers and advertisers. Such transparency is not as pronounced with server-side header bidding: while publishers still determine the floor price, they lack visibility into the buyers participating in the auction, resulting in a more concealed auction process.
(2) The primary rationale for favoring client-side header bidding over server-side header bidding lies in auction management. Header bidding wrappers empower publishers to oversee and manage the auction. Publishers can add buyers, configure timeouts, and ensure that bid requests are sent to all buyers simultaneously using these wrappers. In contrast, in server-side header bidding the server reaches out to buyers and initiates bid requests, hence the auction is primarily managed by the server.
(3) Client-side header bidding enables advertisers to directly access ad units from publishers' web pages using wrappers, thereby gaining access to user cookie data. This data can be further employed for targeted advertising. Conversely, server-side header bidding encounters limitations in cookie matching.
Since server-side header bidding does not expose the auction on the client side, we do not consider it in our measurements.

Statistical Analysis
To evaluate whether there are significant differences in advertisers' bidding behavior when users opt out under GDPR and CCPA, we conduct the Mann-Whitney U test of statistical significance [57]. The Mann-Whitney U test is a nonparametric test that compares two distributions. Since we perform multiple comparisons, i.e., compare bid values for all 16 personas, we also apply the Bonferroni correction. Our null hypothesis is that the bid distributions for opt-in and opt-out are similar to each other. We reject the null hypothesis, i.e., conclude that the distributions are statistically different, when the p-value (after correction, i.e., the original value multiplied by 16) is less than 0.05 (corresponding to a 95% confidence level). We also measure the magnitude of the difference between bid values by calculating the effect size [57]. An effect size less than 0.3, between 0.3 and 0.5, and greater than 0.5 is considered small, medium, and large, respectively. Effect sizes are reported only in cases where statistically significant differences are observed. When no bids are collected under either the opt-out or the opt-in condition, the p-value and effect size cannot be calculated, as both measures require two samples for a meaningful comparison.
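The test procedure described above can be illustrated with a minimal, standard-library sketch. It is not the implementation used in this study: it assumes the normal approximation to the Mann-Whitney U statistic (with tie correction and no continuity correction), and it assumes the effect size is r = |Z| / sqrt(N), a common choice that matches the 0.3 and 0.5 thresholds but is not confirmed by the text. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties share the average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def compare_bids(opt_in, opt_out, n_comparisons=16, alpha=0.05):
    """Two-sided Mann-Whitney U test with Bonferroni correction.

    Returns None when either sample is empty or all values are identical,
    mirroring the cases in which no comparison is possible.
    """
    n1, n2 = len(opt_in), len(opt_out)
    if n1 == 0 or n2 == 0:
        return None
    ranks, ties = average_ranks(list(opt_in) + list(opt_out))
    n = n1 + n2
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in ties) / (n * (n - 1))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term)
    if variance <= 0:
        return None
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_raw = math.erfc(abs(z) / math.sqrt(2))          # two-sided p-value
    p_corrected = min(1.0, p_raw * n_comparisons)     # Bonferroni correction
    significant = p_corrected < alpha
    # Assumption: effect size r = |Z| / sqrt(N), reported only if significant.
    effect = abs(z) / math.sqrt(n) if significant else None
    return {"u": u, "p_corrected": p_corrected,
            "significant": significant, "effect_size": effect}
```

With small samples, even perfectly separated bid distributions may fail to reach significance after the correction by a factor of 16, which is one reason to collect bids over repeated visits.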

Related Work
Prior research has identified that online services design unintuitive and hard-to-navigate data access interfaces [51,66], trick users into giving positive consent [75], and do not include controls to opt out of data selling [88]. Alizadeh et al. [51] conducted a user study to understand data rights under GDPR and identified that participants find data access interfaces unintuitive and hard to navigate. Specifically, users prefer structured and easy-to-navigate data usage reports over data dumps, which are hard to explore. Habib et al. [66] conducted a measurement study of 150 websites and identified that the privacy controls were hard to locate on the majority of websites. Furthermore, in several instances, links to privacy controls did not lead to the stated choices. Matte et al. [75] investigated CMPs and identified that consent is often incorrectly conveyed. Specifically, websites often register consent before the user has made any choice, register positive consent regardless of the user's choice, or nudge users to give pre-selected positive consent. Urban et al. [86] discovered that 93% of the examined websites included third-party elements originating from regions that potentially do not conform to the prevailing legal framework. More recently, Nortwick and Wilson [88] conducted a measurement study of the top 500K English websites and identified that only 2% of the websites provided controls for users to opt out of data selling, i.e., "Do Not Sell My Personal Information" (DNSMPI), under CCPA. Toth et al. [85] investigated 10 consent services from 5 CMPs deployed on different blank websites and found that CMPs themselves may exhibit dark patterns and could track users' data to some extent. They also identified that the default configurations of consent pop-ups often violate regulations and that their configuration options may lead to non-compliance. Recently, Nguyen et al. [78] studied the implementation of consent notices specifically on Android apps and identified that about 20% of these apps violate at least one GDPR consent requirement. Demir et al. [62] examined 114 prior research papers on web measurement and found that the majority (72.6%) of these papers do not provide a comprehensive list of the pages they analyzed. This has substantial implications for reproducibility, as most of these experiments cannot be replicated on the specific sites and pages that were studied.
Though negligence in obtaining consent and not providing easy-to-navigate opt-out controls raises doubts about online services' seriousness in protecting users' data and respecting their consent, it does not automatically imply non-compliance. Prior work, to the best of our knowledge, has not directly measured non-compliance through consent notices on traditional web browsers, especially for the cases where consent is properly conveyed to the online services. To bridge that gap, in our work, we set out to audit the usage and selling of personal user data in cases where the user has directed online services to cease the processing and selling of their data and their consent is properly recorded by the CMPs.

Inferring Non-Compliance
Online services, including publishers, advertisers, and trackers, do not offer much transparency in the usage and sharing of collected data, which makes it challenging to directly assess non-compliance. Though prior work has not directly measured advertisers' and trackers' non-compliance, it has relied on side channel information to infer the usage and sharing of user data [53,59,71,81,83].
A series of studies [59,81,83] leaked user interest data in controlled experiments and leveraged advertisers' bidding behavior as a side channel to infer the usage and sharing of user data. In particular, the study in [81] reports that the prices for "Only category" profiles are about 40% higher than those for "New user" profiles. The main insight of these studies is that advertisers' bidding behavior is shaped by their pre-existing knowledge of the user, which typically results in higher bid values as compared to bid values for users about whom advertisers have no knowledge. Specifically, higher bids made by an advertiser to which the data was leaked indicate the usage of the leaked data for ad targeting, whereas higher bids from an advertiser to which the data was not leaked indicate that the data was shared by advertisers to which it was leaked. Data sharing is an essential component of the online advertising ecosystem and it is baked into ad delivery protocols, such as the RTB [41] and HB [24] protocols. Prior work [64,82] has identified that advertisers and trackers use ad delivery protocols to directly share user data with each other at the client side, e.g., by cookie syncing [46]. Thus, client-side data sharing can be directly inferred by analyzing network requests (e.g., redirects) between advertising and tracking services.
We argue that analyzing advertisers' bidding behavior and network traffic should suffice to establish whether advertisers comply with user consent when users opt out of processing and selling of their data under GDPR and CCPA. Thus, in this study, we leverage advertisers' bidding behavior and network traffic to audit the regulatory compliance of advertisers under GDPR and CCPA.

METHODOLOGY
In this section, we describe our framework to audit advertising and tracking services under GDPR and CCPA. At a high level, we simulate synthetic user personas (profiles) with specific interests, intentionally leak those interests to advertisers and trackers, opt out of processing and selling of user data, and analyze advertisers' bidding behavior and network traffic to infer the usage and selling of leaked data. Figure 2 gives an overview of our approach.

Pre-Conditions: Crawling Under GDPR & CCPA
3.1.1 Web Crawling. We rely on OpenWPM [64] to set up our auditing framework. OpenWPM is a widely used Firefox-based, open-source web measurement tool that has been used in numerous research studies [34]. By default, OpenWPM provides functionality to automatically visit websites and capture network requests, among other things. To adapt it to our needs, we further extend OpenWPM to automatically opt out of processing and selling of user data and to capture advertisers' bids on ad slots. To opt out, we either trigger the JavaScript opt-out event or click the opt-out button provided by the CMP. We describe the details in Section 3.3.1. The JavaScript opt-out function is triggered using a browser extension, while the opt-out button is clicked using Selenium. OpenWPM uses Selenium to drive the browser, so functionality that works with Selenium also works within OpenWPM. To record advertisers' bidding behavior, we use the getBidResponses method provided by prebid.js.
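As an illustration of the bid-capture step, the following is a minimal sketch rather than the extension code used in this study. It assumes a Selenium-style driver object exposing execute_script, assumes prebid.js is exposed under the standard pbjs global, and reads the bidder, cpm, and currency fields of each bid response. The names GET_BIDS_JS and collect_bids are hypothetical.

```python
# JavaScript executed in the page: flatten pbjs.getBidResponses() into a list.
# Assumption: prebid.js is exposed under the standard global name `pbjs`.
GET_BIDS_JS = """
if (typeof window.pbjs === 'undefined' ||
    typeof window.pbjs.getBidResponses !== 'function') {
    return [];
}
var responses = window.pbjs.getBidResponses();
var out = [];
Object.keys(responses).forEach(function (adUnit) {
    (responses[adUnit].bids || []).forEach(function (bid) {
        out.push({adUnit: adUnit, bidder: bid.bidder,
                  cpm: bid.cpm, currency: bid.currency});
    });
});
return out;
"""


def collect_bids(driver):
    """Return the bids visible in the current page as a list of dicts.

    `driver` is any object with a Selenium-style execute_script method.
    Entries without a numeric CPM are dropped.
    """
    rows = driver.execute_script(GET_BIDS_JS) or []
    return [row for row in rows if isinstance(row.get("cpm"), (int, float))]
```

Because the function only depends on execute_script, it can be exercised with a stub driver before being attached to a real browser session.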

Simulating Measurements under GDPR & CCPA.
We conduct our measurements from the EU and California to audit third-party compliance under both GDPR and CCPA. We choose Frankfurt (Germany) and Northern California as representative jurisdictions under GDPR and CCPA, respectively. We rely on Amazon EC2 to run web crawls from the respective locations. We set up a new EC2 node, i.e., with a unique IP address, for each OpenWPM instance.

Simulating User Interest Personas
Since advertisers' bidding behavior differs across user interests, we simulate 16 user interest personas. These personas are shaped by the categories of the top 16 Alexa-listed websites [4]. Notably, our selection approach compensates for Alexa's shift in offerings after September 2020 [3]. To simulate each persona, we initialize a fresh browser profile in an OpenWPM instance, on a fresh EC2 node with a unique IP address, and iteratively visit the top 50 websites in each category, updating the browser profile after each visit. We visit only the main pages of these websites. Our rationale is that the simulated personas will convince advertisers and trackers of each persona's interests, which should incentivize advertisers to bid higher when serving personalized ads to that persona. In addition to the 16 personas, we also include a control persona, i.e., an empty browser profile. The control persona serves as a baseline against which we measure differences in bidding behavior across personas. We also enable OpenWPM's bot mitigation features and add a random delay of 10 to 30 seconds after loading each website to better imitate a real user. We deployed a total of 34 instances: 17 in Germany (16 personas + 1 control) using 17 different IPs, and 17 in California (16 personas + 1 control) using 17 different IPs.
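The persona-building loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the crawler used in this study: the visit callable stands in for loading a page in the persona's browser profile, and build_persona and its parameters are hypothetical names.

```python
import random
import time


def build_persona(visit, category_sites, max_sites=50,
                  min_delay=10.0, max_delay=30.0,
                  sleep=time.sleep, rng=random):
    """Visit the top websites of one category to build an interest persona.

    `visit` is a hypothetical callable that loads a main page in the
    persona's browser profile. After each page load, the loop waits for a
    random delay between min_delay and max_delay seconds.
    """
    visited = []
    for url in category_sites[:max_sites]:
        visit(url)
        visited.append(url)
        sleep(rng.uniform(min_delay, max_delay))
    return visited
```

Passing sleep and rng as parameters keeps the sketch testable without actually waiting between visits.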
We automated the loading of selected web pages and simulated various user interactions, including mouse movements and scrolls, with random time gaps between actions.The mouse cursor is moved

Filtering Opt-Out and Header Bidding Supported Websites
We shortlist websites that support opt-out through CMPs and also implement header bidding through prebid.js. We identify such websites by crawling the Alexa top-100K websites using OpenWPM and probing for the presence of CMPs and prebid.js (as described in Sections 3.3.1 and 3.4.2). Table 1 lists the presence of CMPs and prebid.js on the Alexa top-100K websites. We note that a large number of websites deploy CMPs but not all of them deploy prebid.js. However, scanning the top-100K websites allows us to filter a meaningful number (i.e., 352) of websites that deploy CMPs and prebid.js under both GDPR and CCPA.
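A minimal sketch of such a probe is shown below. It assumes a Selenium-style driver with an execute_script method and uses global-object checks as a stand-in for CMP detection (window.Didomi, window.OneTrust, window.Cookiebot, and the qc-cmp2-summary-buttons class for Quantcast). The exact probing logic of our framework may differ, and probe_site and is_candidate are hypothetical names.

```python
CMP_NAMES = ("Didomi", "Quantcast", "OneTrust", "CookieBot")

# JavaScript executed in the page. Assumption: each CMP can be recognized by
# a global object or, for Quantcast, by the class name of its dialog buttons.
PROBE_JS = """
return {
    prebid: typeof window.pbjs !== 'undefined',
    Didomi: typeof window.Didomi !== 'undefined',
    OneTrust: typeof window.OneTrust !== 'undefined',
    CookieBot: typeof window.Cookiebot !== 'undefined',
    Quantcast:
        document.getElementsByClassName('qc-cmp2-summary-buttons').length > 0
};
"""


def probe_site(driver):
    """Report whether the loaded page exposes prebid.js and which CMPs."""
    result = driver.execute_script(PROBE_JS) or {}
    return {
        "prebid": bool(result.get("prebid")),
        "cmps": [name for name in CMP_NAMES if result.get(name)],
    }


def is_candidate(probe):
    """A website is shortlisted only if it has both prebid.js and a CMP."""
    return probe["prebid"] and len(probe["cmps"]) > 0
```

Note that this probe only detects prebid.js under the standard pbjs label, consistent with how the prebid.js count is obtained.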

Opting-out of Processing & Selling of User Data. We extend OpenWPM to programmatically opt out of processing and selling of user data from Didomi [13], Quantcast [39], OneTrust [29], and CookieBot [9], four of the widely used consent management platforms (CMPs) [44,68]. We also assessed alternative CMPs, namely TrustArc, Crownpeak, LiveRamp, CookieYes, Osano, AdRoll, iubenda, and Usercentrics. Using IP addresses in Frankfurt, Germany and Los Angeles, US, we filtered websites to verify their compatibility with opt-out, opt-in, and prebid.js. Notably, CookieBot, Didomi, OneTrust, and Quantcast emerged as the four most prevalent CMPs among the Alexa top-100K websites. At a high level, we either trigger the JavaScript opt-out event or click the opt-out button of the CMP. Specifically, for Didomi, we check for the presence of the consent dialog with Didomi.notice.isVisible, trigger the Didomi.setUserDisagreeToAll method to opt out, and then hide the consent dialog by setting the display attribute of the consent dialog markup to none [14]. For OneTrust, we check for the presence of the consent dialog with window.OneTrust, trigger the window.OneTrust.RejectAll method to opt out, and hide the consent dialog [30]. For CookieBot, we check for the presence of the consent dialog with window.Cookiebot, traverse the DOM to find the opt-out button with id CybotCookiebotDialogBodyButtonDecline, and click it. For Quantcast, we check for the presence of the consent dialog by traversing the DOM to find the dialog with the qc-cmp2-summary-buttons class name and click the button with Reject or similar text.
2 If the reject button is not present on the first page of consent dialog, we expand the dialog by clicking the button with more options text and then click the Reject All button.Figure 3 shows the Quantcast dialog.The full lists of websites used to test Opt-out/Opt-in are listed on Github [3] (a) Main page of consent dialog.3.3.2Ensuring the Functionality of Selected CMPs.We must ensure a comprehensive understanding of how each selected CMP interacts with advertisers both prior to and following user consent.CookieBot [6], Didomi [11], and OneTrust [32] have official documentation outlining their procedures for managing 3rd party scripts that load prior to user consent.However, there is no available information pertaining to Quantcast.We conducted manual testing on all four CMPs using German IPs, calculating the quantity of HTTP requests generated by 3rd parties and the number of 3rd party cookies.Employing a Macbook device equipped with a Chrome browser, all data was monitored through Chrome Dev-Tools.The exact counts of 3rd party HTTP requests and cookies are presented in Table 2, depicting the figures before consent is given, after consent is granted, or after consent is denied.From the standpoint of request and cookie numbers, it appears that Cookiebot, Didomi, and Quantcast prevent 3rd parties from setting trackers in cookies before user consent.Additionally, the number of requests is lower in the Opt-out scenario compared to Opt-in.Conversely, such a trend is not evident in the case of OneTrust.Notably, cookies from domains "googleads.g.doubleclick.net"and "agead2.googlesyndication.com"consistently appear both before and after user consent or rejection.Each of the four CMPs also establishes consent cookies following user decisions to either grant or deny consent.In the case of Cookiebot, a new cookie named "CookieConsent" is created once the user makes a choice regarding consent.Didomi employs two cookies, namely "didomi_token" and "euconsent-v2, " to store 
consent-related information.OneTrust uses the "OptanonConsent" cookie to retain the user's consent preference.In the context of Quantcast, three cookies are generated upon user consent: "euconsent-v2," "_pbjs_userid_consent_data," and "addtl_consent."If users opt to decline consent, only the "euconsent-v2" cookie is present.It's important to note that all these cookies are designated as 1st party cookies and are not affiliated with the CMPs themselves.
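The consent cookies listed above can serve as a quick check of whether a CMP recorded a decision. The following is a minimal sketch, assuming the cookie names reported in the text; the mapping, the function names, and the Quantcast inference rule are illustrative and are not part of any CMP's API.

```python
# Consent-cookie names as reported in the text. The mapping to CMPs is an
# illustrative assumption, not an official specification.
CONSENT_COOKIES = {
    "CookieConsent": "Cookiebot",
    "didomi_token": "Didomi",
    "OptanonConsent": "OneTrust",
    "euconsent-v2": "Didomi or Quantcast",
    "_pbjs_userid_consent_data": "Quantcast",
    "addtl_consent": "Quantcast",
}

# Cookies observed for Quantcast after opt-in, per the text.
QUANTCAST_OPT_IN = {"euconsent-v2", "_pbjs_userid_consent_data", "addtl_consent"}


def consent_cookies_present(cookie_names):
    """Return {cookie_name: CMP} for every known consent cookie observed."""
    return {n: CONSENT_COOKIES[n] for n in cookie_names if n in CONSENT_COOKIES}


def quantcast_decision(cookie_names):
    """Guess the stored Quantcast decision from first-party cookie names."""
    seen = set(cookie_names) & QUANTCAST_OPT_IN
    if seen == QUANTCAST_OPT_IN:
        return "opt-in"   # all three cookies present
    if seen == {"euconsent-v2"}:
        return "opt-out"  # only the consent string cookie present
    return "unknown"
```

Such a check only confirms that a decision was stored in the browser; it says nothing about whether advertisers honor it.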

Measuring Targeting on User Interest Personas
3.4.1 Measuring Targeting on Personas. Next, we measure targeting received by our personas to infer compliance (or lack thereof) under GDPR and CCPA. As mentioned earlier, we register negative user consent, i.e., opt out of processing and selling of user data, through Didomi, Quantcast, OneTrust, and CookieBot and capture bids through prebid.js.
After filtering the websites, we iteratively visit each website nine times from each of our 16 (+ control) personas under both GDPR and CCPA. We visit each website once to opt out of processing or selling of data and then eight more times to collect bids.
Occasionally, an unstable internet connection causes the browser to miss bids. The number of bids received by the browser can also vary across visits. It is infeasible to ensure that the framework, or a typical browser, captures bids from all potential advertisers within a single visit. Therefore, we configured our framework to visit each website 8 times to collect bids. The number 8 does not follow from a specific criterion; we simply chose a relatively large number of repetitions because additional factors, e.g., day/week and website popularity, may also influence the bids [59,81,83]. In addition, we use identical hardware/software and collect bids at the same time, from the same location, and on the same websites across all personas. Overall, we expect that crawling websites several times and keeping conditions consistent will minimize the variability in bids.
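The visit procedure described above can be summarized as a crawl plan. The sketch below is only an illustration of the counts stated in the text (16 personas plus a control, two regulations, one opt-out visit followed by eight bid-collection visits); the persona names and the `visit_plan` helper are placeholders, not the framework's actual code.

```python
from itertools import product

# Placeholder persona names; the study uses 16 interest personas plus a control.
PERSONAS = [f"persona_{i}" for i in range(1, 17)] + ["control"]
REGULATIONS = ["GDPR", "CCPA"]
VISITS_PER_SITE = 9  # 1 opt-out visit + 8 bid-collection visits


def visit_plan(websites):
    """Yield (regulation, persona, website, visit_no, action) tuples."""
    for reg, persona, site in product(REGULATIONS, PERSONAS, websites):
        for visit_no in range(1, VISITS_PER_SITE + 1):
            # The first visit registers the opt-out; the rest collect bids.
            action = "opt_out" if visit_no == 1 else "collect_bids"
            yield reg, persona, site, visit_no, action
```

For two websites this yields 2 x 17 x 2 x 9 = 612 visits, of which 68 are opt-out visits.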

Capturing Bidding by Advertisers.
We treat advertisers' bidding behavior as an indication of advertisers' and trackers' non-compliance with user consent (as discussed in § 2.6). To this end, we audit advertisers and trackers on websites that support header bidding, more specifically prebid.js, a widely used implementation of the header bidding protocol [25], primarily because header bidding occurs at the client side and allows us to intercept the bidding process [24]. Other header bidding solutions, such as Amazon Transparent Ad Marketplace and Google Ad Manager (GAM) with EBDA (Exchange Bidding Dynamic Allocation), operate through server-side mechanisms, which prevents us from accessing bid data on the client side. Google previously offered a client-side header bidding service; however, the API googletag.pubads().getBidsReceived() is currently unavailable, so this option no longer works. To capture the bids, we first identify websites that support header bidding. We identify such websites by injecting a script on the webpage that probes the prebid.js version; if we receive a response, we consider the website to support prebid.js. Note that we do not consider custom API labels (i.e., other than pbjs) in our measurements. After identification, we capture the bids by calling the getBidResponses method, which returns the CPMs of the bids. If we do not get any bids, we request the bids ourselves by calling the requestBids method.
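Once the bid responses have been serialized out of the page, extracting the CPMs is a matter of flattening the returned object. The sketch below assumes the responses arrive as a JSON string keyed by ad unit, each with a `bids` list whose entries carry `bidder` and `cpm` fields, which is the shape prebid.js documents for getBidResponses; the helper names and the sample ad unit and bidder names used with it are hypothetical.

```python
import json


def extract_cpms(bid_responses_json):
    """Flatten a getBidResponses()-style JSON string into (ad_unit, bidder, cpm) rows."""
    responses = json.loads(bid_responses_json)
    rows = []
    for ad_unit, entry in responses.items():
        for bid in entry.get("bids", []):
            rows.append((ad_unit, bid.get("bidder"), float(bid.get("cpm", 0.0))))
    return rows


def needs_request_bids(rows):
    """True when no bids were captured, i.e., when requestBids should be called."""
    return len(rows) == 0
```

For example, passing `'{"div-1": {"bids": [{"bidder": "bidderA", "cpm": 0.5}]}}'` produces a single row for the hypothetical ad unit `div-1`.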

Capturing Cookie Syncing by Advertisers.
Client-side data sharing is a standard practice in the online advertising ecosystem. Advertisers most commonly share data through cookie syncing [65]. Specifically, advertisers read their cookies (or other identifiers) from the browser and embed them in redirect requests, which force the browser to send the embedded identifiers to the redirected advertiser. Since cookie syncing involves redirects from the browser, network traffic can be analyzed to detect cookie syncing events. To evaluate advertisers' compliance, we measure whether opt-outs, under GDPR and CCPA, reduce cookie syncing. We use the heuristic from prior work [70] to detect cookie syncing in network traffic when users opt out and opt in using CMPs.
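To make the idea concrete, the following is a simplified sketch of an identifier-matching heuristic: a request counts as a sync event when its URL carries a cookie value that belongs to a different site. It is not the heuristic of [70]; the minimum identifier length, the two-label domain grouping, and all names are assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qsl


def _site(host):
    """Crude site grouping: last two DNS labels (assumption; no public suffix list)."""
    return ".".join(host.split(".")[-2:])


def detect_cookie_syncs(cookies, requests, min_len=8):
    """cookies: {cookie_domain: [identifier values]}; requests: list of URLs.

    Returns (owner_site, destination_site, identifier) for every request whose
    query string or path carries an identifier owned by a different site.
    """
    events = []
    for url in requests:
        parsed = urlparse(url)
        dest = _site(parsed.hostname or "")
        values = {v for _, v in parse_qsl(parsed.query)} | set(parsed.path.split("/"))
        for owner, ids in cookies.items():
            if _site(owner) == dest:
                continue  # an advertiser sending its own ID to itself is not a sync
            for ident in ids:
                if len(ident) >= min_len and ident in values:
                    events.append((_site(owner), dest, ident))
    return events
```

Short values are skipped because they are unlikely to be unique identifiers; a production heuristic would also need to handle encoded or hashed identifiers.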

Baseline Comparison with Opt-in.
To understand the impact of regulations, we also establish a baseline by opting in to the processing and selling of data. Our rationale for opting in is to get the upper bound on processing and selling of data, as inferred from advertisers' bidding behavior. To opt in, we enable all cookie controls mentioned in Section 3.3.1. For Didomi we call Didomi.setUserAgreeToAll, for OneTrust we call window.OneTrust.AllowAll, for CookieBot we click the opt-in button with id CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll, and for Quantcast we click the button with Accept or similar text.
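The opt-out and opt-in actions can be organized as a lookup from (CMP, decision) to a JavaScript snippet that a crawler injects into the page. This is a minimal sketch: the method names, button id, and class name are those given in the text, while the guards, the text matching for Quantcast, and the Python helper names are assumptions.

```python
def _click_by_id(element_id):
    """JavaScript that clicks the element with the given id, if present."""
    return "var b = document.getElementById('" + element_id + "'); if (b) { b.click(); }"


def _click_by_text(text):
    """JavaScript that clicks the first Quantcast summary button containing `text`."""
    return (
        "var bs = document.querySelectorAll('.qc-cmp2-summary-buttons button');"
        " for (var i = 0; i < bs.length; i++) {"
        " if (bs[i].textContent.toLowerCase().indexOf('" + text + "') !== -1)"
        " { bs[i].click(); break; } }"
    )


CONSENT_SCRIPTS = {
    ("didomi", "opt_out"): "if (window.Didomi) { Didomi.setUserDisagreeToAll(); }",
    ("didomi", "opt_in"): "if (window.Didomi) { Didomi.setUserAgreeToAll(); }",
    ("onetrust", "opt_out"): "if (window.OneTrust) { window.OneTrust.RejectAll(); }",
    ("onetrust", "opt_in"): "if (window.OneTrust) { window.OneTrust.AllowAll(); }",
    ("cookiebot", "opt_out"): _click_by_id("CybotCookiebotDialogBodyButtonDecline"),
    ("cookiebot", "opt_in"): _click_by_id(
        "CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll"
    ),
    ("quantcast", "opt_out"): _click_by_text("reject"),
    ("quantcast", "opt_in"): _click_by_text("accept"),
}


def consent_script(cmp_name, decision):
    """Return the JavaScript to inject for a CMP and a decision ('opt_out' or 'opt_in')."""
    return CONSENT_SCRIPTS[(cmp_name.lower(), decision)]
```

In a Selenium-based crawler such as OpenWPM, the returned string could be passed to `driver.execute_script`; expanding the Quantcast dialog when no reject button is visible is left out of this sketch.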

Comparison With Advertisers' Self-Regulations.
We also compare state-enforced regulations, i.e., GDPR and CCPA, with advertiser-offered controls, i.e., NAI's central opt-out [27], in curbing the processing and selling of data. We opt out through NAI's controls by automatically navigating to their opt-out webpage [27] and clicking the OPT OUT OF ALL button. To evaluate advertiser-offered controls, we select a different set of websites that support prebid.js but do not support CMPs. Specifically, we filter Alexa top-50 websites and identify 28 websites that support prebid.js but do not support any CMPs under both GDPR and CCPA. It is important to select websites that do not support CMPs because otherwise we cannot distinguish between the impact of advertiser-offered controls and state-enforced regulations.

RESULTS AND COMPREHENSIVE ANALYSIS
In this section, we analyze advertisers' compliance when users opt out of data processing and selling. We compare and contrast targeting by advertisers across personas and configurations, make statistical observations, and draw conclusions from those observations about advertisers' compliance under GDPR and CCPA. We present our findings at the granularity of individual CMPs because some CMPs might handle user consent better than others. We measure advertisers' compliance as follows:
(1) Data usage. Opting out should lead to lower bid values from advertisers for interest personas relative to the control. If advertisers continue to bid higher, they may still be using user data when users opt out of data processing and selling.
(2) Server-side data sharing. Opting out should lead to lower bid values for interest personas relative to the control from advertisers to whom data is not directly leaked. If advertisers to which data is not leaked bid higher, advertisers might still be sharing data when users opt out of data processing and selling.
(3) Client-side data sharing. Opting out should eliminate or significantly reduce cookie syncing events from advertisers for interest personas relative to the control. If advertisers continue to sync cookies with each other, they may be sharing/selling user data when users opt out of data processing and selling.
As the analysis steps are similar for each CMP and for NAI, we present only the Cookiebot analysis in full in this section. For the other CMPs and NAI, we keep only the takeaways here and defer the details to Appendix A.
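The data usage criterion can be illustrated with a small computation: average the bids per persona and report the personas whose mean exceeds that of the control. This is a sketch under the assumption that bids are available as lists of CPM values per persona; the function name and data layout are hypothetical.

```python
from statistics import mean


def personas_above_control(bids_by_persona, control="control"):
    """Return {persona: mean_bid - control_mean} for personas bid higher than the control."""
    control_mean = mean(bids_by_persona[control])
    gaps = {}
    for persona, bids in bids_by_persona.items():
        if persona == control or not bids:
            continue  # skip the control itself and personas without bids
        gap = mean(bids) - control_mean
        if gap > 0:
            gaps[persona] = gap
    return gaps
```

The same comparison, restricted to advertisers to which no data was leaked, corresponds to the server-side data sharing criterion.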

Cookiebot
Data usage. We evaluate reduction in data usage by analyzing advertisers' bidding behavior. Table 3 presents advertisers' bidding on personas when users opt out and opt in through Cookiebot under GDPR and CCPA. We note that all personas receive higher bids than the control when users opt out under both GDPR and CCPA, with the exception of the Shopping persona, whose bid value is the same as the control under CCPA.
Next, we analyze whether there is a statistically significant difference between advertisers' bidding patterns when users opt out or opt in under GDPR and CCPA. Table 3 shows that advertisers' bidding behavior does not change significantly regardless of whether users opt out or opt in under both GDPR and CCPA.
Server-side data sharing. We evaluate reduction in server-side data sharing by analyzing bidding from advertisers to which we do not leak data. Table 4 presents bids from advertisers to which we did not explicitly leak data. All personas, with the exception of Shopping for CCPA, receive higher bids on average than the control persona. Even in the case of the Shopping persona, the bid value is only 0.01 less than the control.
Client-side data sharing. We evaluate reduction in client-side data sharing by measuring advertiser cookie syncing in network traffic. Table 4 presents advertisers' cookie syncing behavior. Under GDPR, we note a substantial difference between advertisers' cookie syncing behavior for opt-out and opt-in. Specifically, we experience cookie syncing events in only one persona (i.e., News) when we opt out, but we experience substantially more cookie syncing when we opt in. Under CCPA, however, advertisers engage in cookie syncing events on 12 personas when we opt out and on all 16 personas when we opt in. On average, the total number of cookie syncing events is the same for opt-out and opt-in. We further investigate the cookie syncing frequency of individual advertisers. Table 5 presents the top 5 most prevalent advertisers that participate in cookie syncing when we opt out under both GDPR and CCPA. Advertisers participate in as many as 3 and 128 cookie syncing events when we opt out under GDPR and CCPA with Cookiebot, respectively.
Takeaway. The effectiveness of opt-out under CCPA appears limited, with average bids comparable between opt-out and opt-in across all 16 persona settings. Additionally, the syncing events in opt-out are not significantly fewer than in opt-in for most persona settings. Under GDPR, the similarity of bids between opt-out and opt-in likewise suggests that opt-out might not be effective. Nonetheless, the number of syncing events is consistently lower in opt-out than in opt-in, indicating that opt-out is effective at curbing client-side sharing. This means that compliance with both GDPR and CCPA may not be ensured on the same websites.
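One way to check whether opt-out and opt-in bids differ is a permutation test on the difference in mean bid values. The sketch below is a generic Monte Carlo permutation test and is not necessarily the statistical test used for Table 3; the function name, the number of permutations, and the seed are assumptions.

```python
import random


def permutation_p_value(opt_out, opt_in, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in mean bids."""
    rng = random.Random(seed)
    observed = abs(sum(opt_out) / len(opt_out) - sum(opt_in) / len(opt_in))
    pooled = list(opt_out) + list(opt_in)
    k = len(opt_out)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of bids as opt-out / opt-in
        diff = abs(sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

A large p-value means the bids give no evidence that opting out changed advertisers' behavior, which is the pattern reported for Cookiebot.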

Didomi, OneTrust, Quantcast and NAI
For Didomi, OneTrust, Quantcast, and NAI, we performed the same client-side and server-side data sharing analyses as for Cookiebot. The detailed analysis can be found in Appendix A.
In Didomi, significant decreases in data usage and sharing are observed when users opt out under GDPR and CCPA. The decline in data usage is more pronounced under CCPA than under GDPR. Conversely, the decrease in client-side data sharing is more prominent under GDPR than under CCPA. While obtaining consent through Didomi noticeably reduces targeting, it does not entirely eliminate it. This is evident in the continued higher bids for certain personas and the involvement of advertisers in cookie syncing. Consequently, achieving GDPR compliance might be feasible through Didomi, but CCPA compliance might not be guaranteed for the same set of websites.
In OneTrust, advertiser behavior differed between GDPR and CCPA when users opted out. Specifically, opting out did not result in a statistically significant difference in data usage under GDPR, but it did under CCPA. The prevalence of both server-side and client-side data sharing was higher under CCPA than under GDPR. Surprisingly, advertisers synced more cookies (i.e., shared more data on the client side) under CCPA than under GDPR. This implies that compliance with GDPR and CCPA might not be guaranteed on the same websites.
In Quantcast, no significant decreases in data usage and sharing are observed when users opt out. Advertisers' bidding behavior changes significantly for 5 personas under GDPR, albeit with a small effect. When users opt in under GDPR, there is a noticeable increase in cookie syncing events. This suggests that compliance with both GDPR and CCPA might not be ensured on the same websites.
In NAI, advertisers' data usage remains relatively stable. Nevertheless, advertisers tend to bid lower under CCPA than under GDPR. Likewise, we observe a marked decrease in data sharing, both on the server side and the client side, under CCPA. As a result, websites might not be concurrently compliant with both GDPR and CCPA.
In the majority of cases across the four CMPs and NAI, compliance with GDPR and CCPA is lacking. There is little difference in bids between opt-out and opt-in scenarios, and the volume of syncing events remains comparable regardless of the opt-out or opt-in choice. It is plausible that some advertisers adhered to GDPR and CCPA and that the CMP or NAI systems functioned effectively, but that these advertisers were not extensively involved in most bidding and cookie syncing events. To validate this presumption, we shift our focus to advertisers and present a more extensive analysis in Section 4.4.

Advertisers Bidding Behavior with pre-opt-out
Under GDPR, processing personal data is prohibited unless the data subject has consented to the processing (Article 6). Under CCPA, however, data selling and sharing should stop immediately once consumers opt out (Section 1798.120 (a), Section 7013 (a)). Thus, to eliminate the impact of data collection and sharing prior to opting out, we conduct additional experiments where we opt out prior to simulating personas. Similar to post opt-out, we note that under both GDPR and CCPA advertisers continue to use data even when we opt out prior to collecting bids. We discuss advertisers' bidding behavior with pre-opt-out in detail in Appendix B.

5). Additionally, we have tabulated the event counts during the opt-in phase in Table 6. A noticeable trend emerges: under GDPR, CookieBot, Didomi, and Quantcast have effectively curbed a substantial number of syncing events. However, in the case of OneTrust, the difference in the number of syncing events between opt-out and opt-in appears minimal. Under CCPA, Didomi and NAI are more successful than Cookiebot, OneTrust, and Quantcast in preventing syncing events. Nevertheless, it is worth noting that the counts of syncing events originating from websites using Didomi and NAI are already quite high during the opt-out phase.
Takeaway. In terms of obstructing cookie syncing, CMPs perform better under GDPR than under CCPA. Notably, cookie syncing events persist after opt-out under CCPA. Among all CMPs, OneTrust shows the weakest performance, as the event counts for opt-out and opt-in remain comparable.

4.4.2 Analysis of Bids from Syncing Advertisers. Four advertisers appear in both Table 5 and Table 6, occupying the top four positions in each. Consequently, we extracted bid data from these four shared advertisers and the two unique advertisers from each table. This data was then used to compile Table 7 and Table 8. Notably, certain scenarios, such as the "Arts" persona under Cookiebot, exhibit significantly higher bid counts in opt-out than in opt-in. To provide a comprehensive overview, we calculated the bid quantities under similar circumstances, as detailed in Table 9 and Table 10. Under GDPR, both CookieBot and Didomi show reduced bid counts in opt-out compared to opt-in, whereas the figures remain relatively consistent for OneTrust, Quantcast, and NAI. Furthermore, the average bids for opt-out and opt-in are similar for OneTrust, Quantcast, and NAI. Under CCPA, more bids emerge in opt-out than under GDPR, accompanied by greater bid values. The trend in bid counts between opt-out and opt-in under CCPA resembles that observed under GDPR.
Takeaway. A comparison between GDPR and CCPA reveals that all four CMPs and NAI block bids from the six selected advertisers more effectively under GDPR. A CMP-to-CMP comparison under GDPR reveals that Cookiebot and Didomi perform better than OneTrust, Quantcast, and NAI. Bid value and bid count patterns remain consistent between opt-out and opt-in for OneTrust, Quantcast, and NAI. Under CCPA, Cookiebot and Didomi still outperform OneTrust and Quantcast. Bid values and counts remain comparable between opt-out and opt-in for OneTrust, Quantcast, and NAI, with substantially higher values observed in opt-out scenarios.

11 and Table 12. Additionally, Table 13 and Table 14 display the bid counts originating from advertisers who either participated or did not participate in cookie syncing under both GDPR and CCPA.
(1) In Cookiebot, bids from syncing advertisers are significantly fewer than those from non-syncing advertisers. Notably, for personas such as Arts, Health, Reference, and Shopping, the average bid from syncing advertisers surpasses that from non-syncing advertisers under CCPA. The bid count remains comparable between GDPR and CCPA.
(2) In Didomi, the number of bids is notably limited under GDPR. Conversely, under CCPA, an overwhelming majority of bids originate from syncing advertisers, with the average bid value on par between syncing and non-syncing advertisers.
(3) In OneTrust, bids predominantly stem from non-syncing advertisers under both GDPR and CCPA, with a notable shift towards syncing advertisers under CCPA compared to GDPR. Additionally, the bid values from syncing advertisers considerably surpass those from non-syncing advertisers under both GDPR and CCPA.
(4) In Quantcast, nearly all bids arise from syncing advertisers under GDPR. Under CCPA, bids come from both syncing and non-syncing advertisers, with syncing advertisers accounting for a larger share. Interestingly, under CCPA, the bid values from non-syncing advertisers are notably greater than those from syncing advertisers.
(5) For NAI, the bid count from non-syncing advertisers surpasses that from syncing advertisers. The bid count is similar under GDPR and CCPA. Under GDPR, syncing advertisers bid lower values than non-syncing ones, whereas the reverse trend is observed under CCPA. Syncing advertisers bid lower values under GDPR than under CCPA, while non-syncing advertisers display the opposite trend.
Takeaway. Under GDPR, all four CMPs, unlike NAI, receive very few bids from advertisers that participated in cookie syncing. There are still bids from non-syncing advertisers in CookieBot and OneTrust under GDPR. There are more bids under CCPA, especially in Didomi (syncing), OneTrust (syncing), and Quantcast (both syncing and non-syncing), whereas NAI has a similar number of bids under GDPR and CCPA. The bid value is higher under CCPA than under GDPR, except for Cookiebot and NAI. Cookiebot has similar average bids from non-syncing advertisers under GDPR and CCPA. For NAI, the relation between bid values from syncing and non-syncing advertisers is reversed between GDPR and CCPA.

Overall, we note that under CMPs most personas receive higher bids than the control when users opt out of data processing and selling under GDPR and CCPA. The variability in bid values, particularly higher bids as compared to the control, indicates that the leaked user interests are used to target ads to users, despite users opting out of data processing as provided by the regulations. We also note that opt-out is not statistically different from opt-in. The similarity in bid values for opt-in and opt-out indicates that user consent in most cases does not have any effect on processing and selling of data. However, some CMPs perform better than others. For example, advertisers' bidding behavior changes significantly under CCPA when the consent is conveyed through Didomi. We note that advertisers participate in data sharing activities both at the server and the client side without user consent. At the server side, we received higher bid values from advertisers to whom we did not explicitly leak user interests, which indicates potential selling and sharing by advertisers to whom we did leak user data. At the client side, we notice that advertisers share unique user identifiers in plain sight and share their data with many other advertisers.
Advertiser-offered opt-out controls are also ineffective in curbing the processing and selling of user data despite users opting out. While advertisers at large do not honor their own opt-out controls, they share slightly less data as compared to the state-enforced regulations. The following is one example set of client-side data sharing.

Consent Handling by CMPs
At a high level, CMPs block or allow cookies to enforce user consent [10,15]. As a first step, CMPs scan the website and identify all first and third-party cookies. After identifying the cookies, CMPs classify them into essential (i.e., necessary for websites to operate) and non-essential (e.g., advertising, tracking, marketing, etc.) cookies. To identify essential cookies, CMPs rely on information from the website developers. To identify non-essential cookies, CMPs do not clearly disclose their techniques, but they might simply rely on information shared by advertising and tracking services about the purpose of their cookies (e.g., Google declares the purpose of their cookies [23]). Many CMPs, such as OneTrust and Cookiebot, consolidate the information across websites and maintain a database of cookies and their purposes [10,31]. Consolidating information allows CMPs to automatically identify essential and non-essential cookies on new websites. CMPs typically take user consent and store it at the client side in first-party cookies. In addition to blocking cookies, CMPs also block execution of elements (e.g., scripts, iframes, videos, images) that might exfiltrate non-essential cookies before user consent is stored. To give website developers more control in order to accurately enforce user consent and avoid breakage by blocking essential cookies, CMPs allow website developers to block or allow cookies.
There are two main ways in which advertisers might be able to process and share user information despite negative consent. One, website developers may inaccurately deploy CMPs. For example, tracking code may execute before CMPs even have a chance to block cookies, or website developers may inaccurately list non-essential cookies as essential. Two, advertisers may rely on side channel information to circumvent enforcement by CMPs. For example, advertisers may routinely change their cookies to avoid detection or they may rely on browser fingerprinting to track users [69]. Recently, Toth et al. [85] found that CMPs themselves may violate regulations and that their configuration options may lead to non-compliance.
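The classification and blocking logic, and the first failure mode above, can be sketched as follows. This is a simplified model and not how any particular CMP is implemented: the purpose labels, the rule that unknown cookies count as non-essential, and all cookie names in the usage note are assumptions.

```python
ESSENTIAL, NON_ESSENTIAL = "essential", "non-essential"


def classify(cookie_name, developer_essential, purpose_db):
    """Developer declarations win; otherwise consult a shared purpose database."""
    if cookie_name in developer_essential:
        return ESSENTIAL
    # Assumption: cookies with an unknown purpose are treated as non-essential.
    purpose = purpose_db.get(cookie_name, "unknown")
    return ESSENTIAL if purpose == "necessary" else NON_ESSENTIAL


def allowed_cookies(cookie_names, consent_given, developer_essential, purpose_db):
    """Cookies that may be set: everything with consent, only essential ones without."""
    return [
        c for c in cookie_names
        if consent_given or classify(c, developer_essential, purpose_db) == ESSENTIAL
    ]
```

With a hypothetical purpose database `{"session_id": "necessary", "ad_id": "advertising"}`, an opt-out blocks `ad_id`. If the developer wrongly declares `ad_id` as essential, the same opt-out lets it through, which is the misconfiguration described above.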

Possible Recommendations
Our findings in general cast serious doubt on the effectiveness of regulations as a sole means of privacy protection. Specifically, even after users opt out through CMPs, their data may still be used and shared by advertisers. Unfortunately, in order to fully protect their privacy, users still need to rely on privacy-enhancing tools, such as ad/tracker blocking browser extensions and privacy-focused browsers (e.g., Brave Browser). However, not all users may use privacy-enhancing tools to protect their privacy.
Website developers have an important role in the enforcement of regulations. Specifically, they could deploy CMPs that are better at conveying and enforcing user consent. For example, research like ours could help inform the effectiveness of consent conveyance by different CMPs. Moving forward, we also recommend that CMPs, advertisers, website developers, and regulators work together to define protocols for conveying and enforcing consent.

Limitations
Main page and subpages: In our experiment, the framework exclusively accessed the main pages of each training website. Visiting only main pages may yield different results than also visiting subpages [52]. Our analysis indicates that opting out on testing websites does not prevent tracking when solely accessing their main pages. Our primary aim was to illustrate that opting out does not guarantee evasion of tracking. Training personas by also exploring subpages of training websites remains a potential avenue for future work. Notably, the purpose of visiting testing websites was to gather bid data rather than to train personas, which is why we limited our visits to main pages.
CCPA applicability criteria: CCPA applies to online services that meet its broad applicability criteria. Specifically, as per Section 1798.140(c) (1), CCPA applies to online services that have an annual revenue of more than $25 million, annually sell data of more than 50K California residents, or earn more than 50% of their revenue from the sale of personal data of California residents. Since most information required to determine applicability is not publicly available, it is challenging to determine applicability at scale [88]. Thus, for our study, we did not strictly follow the CCPA applicability criteria. However, it is noteworthy that the prevalent advertisers (Table 5) in our dataset are mostly large corporations with revenue exceeding hundreds of millions [38,45].
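The three criteria are alternatives, so a service is covered as soon as any one of them holds. A minimal sketch of that rule, using the thresholds stated above and hypothetical parameter names:

```python
def ccpa_applies(annual_revenue_usd, ca_residents_data_sold, revenue_share_from_ca_data_sales):
    """True if any one of the three applicability thresholds stated in the text is exceeded."""
    return (
        annual_revenue_usd > 25_000_000
        or ca_residents_data_sold > 50_000
        or revenue_share_from_ca_data_sales > 0.5
    )
```

In practice, the inputs to such a check are rarely public, which is why applicability could not be verified at scale.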
Sample size: In comparison to prior work that analyzed ad bidding (e.g., Cook et al. [59] analyzed 25 websites), we analyze a substantially larger number of websites (i.e., 352 that support Didomi, Quantcast, OneTrust, and CookieBot). We also repeat our measurements several times (i.e., 8 times) to reduce the chance of the sample size biasing our results. In the future, researchers could further increase the sample size by incorporating websites that support various other CMPs. We leave the non-trivial task of automating opt-outs from different CMPs at scale as future work. Researchers could also rely on alternative methodologies that use ad content, e.g., [77], to eliminate the need to rely on ad bidding altogether for inference of data usage and sharing. Such techniques might allow researchers to audit online services at a much larger scale.
Server-side data sharing: We rely on the insight, also leveraged by prior research [53,59], that advertisers' behavior is shaped by their pre-existing knowledge of the user. Using that insight, we infer that higher bids from advertisers to which data was not leaked indicate the sharing of data by advertisers to which the data was leaked. However, there may be additional uncontrolled factors that impact the bids.
Automated data collection: We rely on OpenWPM to automatically collect bids and use Amazon's EC2 cloud platform to simulate crawls from Germany and California. In order to more accurately simulate real users, we enable bot mitigation in OpenWPM and also wait for a random period of 10-30 seconds after loading each website. We also refrain from using public proxy servers, which may be blacklisted, and instead rely on Amazon EC2.

CONCLUSIONS
In this paper, we proposed a framework to audit regulatory compliance of online services at scale. We used the proposed framework to audit online advertising services on popular websites under GDPR and CCPA. Despite users exercising their rights under GDPR and CCPA to opt out of processing and selling of their data using CMPs, we find that advertisers process user data to possibly target them and also share it with their partners both at the server and the client side. However, we find that some CMPs perform better than others, i.e., advertisers' bidding behavior changes significantly when the consent is conveyed. We also audited advertisers' self-proposed opt-out controls, i.e., NAI's opt-out, and found that they might be equally ineffective at curbing processing and selling of user data. Overall, our measurements sadly indicate that the regulations may not protect user privacy, and advertisers might be in potential violation of GDPR and CCPA. To foster follow-up research, we will also release our code and data set at the time of publication.

changes when users opt in to data processing and sharing. However, we note that the difference in advertisers' behavior is small, i.e., the effect size is less than 0.3, except for the Arts and Regional personas where the effect size is medium.
Server-side data sharing. We evaluate reduction in server-side data sharing by analyzing bidding from advertisers to which we do not leak data. Table 16 presents bids from advertisers to which we did not explicitly leak user data. Under GDPR, 3 personas receive higher bids than the control and 2 personas receive lower bids than the control. However, the difference in bid values is less than 0.02, except for Science where it is 8 times higher than the control. Under CCPA, 6 personas receive higher bids and 8 personas receive lower bids than the control. For two personas, i.e., Arts and Computers, the bid values are 2.5 times higher than the control, and for the Kids persona the bid value is 4.5 times less than the control.
Client-side data sharing. We evaluate reduction in client-side data sharing by measuring cookie syncing by advertisers in network traffic. Table 16 presents the cookie syncing participation of advertisers. Under GDPR, we note a difference between advertisers' cookie syncing behavior for opt-out and opt-in. Specifically, we experience cookie syncing events in 6 personas when we opt out, but we experience substantially more cookie syncing when we opt in. On average there are 3 and 223 cookie syncing events per persona when users opt out and opt in, respectively. Under CCPA, advertisers engage in cookie syncing events on all 16 personas regardless of whether the user opts out or opts in. However, the number of cookie syncing events substantially increases from 42 to 170 when users opt in. We further investigate the cookie syncing frequency of individual advertisers. Table 5 shows that advertisers participate in as many as 31 and 211 cookie syncing events when we opt out under GDPR and CCPA with Didomi, respectively.
Takeaway. Significant decreases in data usage and sharing are observed when users opt out under both GDPR and CCPA. The decline in data usage is more pronounced under CCPA than under GDPR. Conversely, the decline in client-side data sharing is more notable under GDPR than under CCPA. Although obtaining consent through Didomi notably curbs targeting, it does not entirely eliminate it. This is evidenced by the continued higher bids on certain personas and the involvement of advertisers in cookie syncing. Consequently, while achieving GDPR compliance might be possible through Didomi, ensuring CCPA compliance on the same websites could be more challenging.

A.2 OneTrust
Data usage. We evaluate reduction in data usage by analyzing advertisers' bidding behavior. Table 17 presents advertisers' bidding on personas when users opt out and opt in through OneTrust under GDPR and CCPA. We note that under GDPR, 6 personas receive higher bids than the control and 5 personas receive lower bids than the control. Except for the Home and Shopping personas, where bid values are substantially higher when users opt out, the difference between bid values as compared to the control is only 0.01. We also note that advertisers did not return any bids for the Arts persona. In contrast, under CCPA, all personas except 2, i.e., Business and Society, receive bid values that are higher than that of the control (14 personas in total).
Next, we analyze whether there is a statistically significant difference between advertisers' bidding patterns when users opt out or opt in under GDPR and CCPA. Table 17 shows that under GDPR, for all personas with the exception of the Recreation persona, there is no statistically significant difference between advertisers' bidding behavior. Under CCPA, for 8 personas there is no statistically significant difference between advertisers' bidding behavior. For the other 8 personas, however, advertisers exhibit statistically significantly different bidding behavior (with medium effect size for 6 personas).
Server-side data sharing. We evaluate reduction in server-side data sharing by analyzing bidding from advertisers to which we do not leak data. Table 18 presents bids from advertisers to which

Table 22: Ad bidding and cookie syncing under GDPR and CCPA after opt-out (Out) and opt-in (In) with NAI. The Avg. column represents the mean of all bid values from advertisers who did not bid or appear when we simulated personas but appeared and bid after we opted out. Out and In under C-Sync. represent the number of cookie syncing events after opt-out and opt-in, respectively.

decrease in data sharing, both on the server side and the client side, under CCPA. This suggests that achieving compliance with both GDPR and CCPA might not be guaranteed on the same websites.

B ADVERTISERS BIDDING BEHAVIOR WITH PRE-OPT-OUT
Under GDPR, processing personal data is prohibited unless the data subject has consented to the processing (Article 6). However, under CCPA, data selling and sharing should immediately stop once consumers opt out (Section 1798.120 (a), Section 7013 (a)). Thus, to eliminate the impact of data collection and sharing prior to opting out, we conduct additional experiments where we opt out prior to simulating personas. Table 25 and Table 26 present the ad bidding under GDPR and CCPA. Under GDPR, we note that advertisers bid higher for most personas than for the control across all four CMPs. In several instances, the bid values are even higher than the sum of the average and standard deviation of the bid values in the control persona. Under CCPA, however, we note varying trends across CMPs. For Cookiebot, OneTrust, and Quantcast, 16, 7, and 4 personas receive higher bid values from advertisers despite opting out, respectively. In the case of Didomi, only 1 persona receives higher bid values.
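The comparison against the control used above (higher or lower than the control average, and beyond the average plus or minus one standard deviation) can be sketched as follows. The function name, the labels, the use of the sample standard deviation, and the bid values are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean, stdev


def classify_against_control(persona_bids, control_bids):
    """Label a persona's mean bid relative to the control's mean and std.

    Assumption: sample standard deviation (statistics.stdev) is used;
    the text does not say whether sample or population std is meant.
    """
    if not persona_bids:
        return "no bids"  # e.g., a persona for which no advertiser bid

    ctrl_avg = mean(control_bids)
    ctrl_std = stdev(control_bids)
    avg = mean(persona_bids)

    if avg > ctrl_avg + ctrl_std:
        return "higher than avg + std"
    if avg > ctrl_avg:
        return "higher than avg"
    if avg < ctrl_avg - ctrl_std:
        return "lower than avg - std"
    if avg < ctrl_avg:
        return "lower than avg"
    return "equal to avg"


if __name__ == "__main__":
    # Hypothetical bid values (CPM); not data from the paper.
    control = [0.10, 0.20, 0.30]
    personas = {"Shopping": [0.50, 0.55], "Business": [0.15], "Arts": []}
    for name, bids in personas.items():
        print(name, "->", classify_against_control(bids, control))
```

A persona labeled "higher than avg + std" after opting out corresponds to the cases the text highlights as the strongest indication that advertisers still use interest data.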
Table 23 and Table 24 present the cookie syncing events from advertisers under GDPR and CCPA. We note that advertisers participate in cookie syncing events despite users opting out under both GDPR and CCPA.
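To make the notion of a cookie syncing event concrete, the sketch below flags a request as a sync when an identifier stored in one party's cookie appears in the URL of a request sent to a different party. This is a common heuristic and is assumed here for illustration; this appendix does not spell out the detection rule, and the function names, the minimum identifier length, the naive domain extraction, and the example domains are all hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def registrable_domain(host):
    """Naive eTLD+1: keep the last two labels of the host name.

    Assumption: a real analysis would consult the Public Suffix List.
    """
    return ".".join(host.lower().split(".")[-2:])


def find_cookie_syncs(cookies, request_urls, min_len=8):
    """Return sorted (cookie_owner, receiver) pairs that look like syncs.

    cookies: iterable of (cookie_domain, cookie_value) pairs.
    request_urls: iterable of request URL strings.
    min_len: ignore short values that are unlikely to be identifiers.
    """
    # Map each candidate identifier to the domains that stored it.
    owners_by_id = {}
    for domain, value in cookies:
        if len(value) >= min_len:
            owner = registrable_domain(domain.lstrip("."))
            owners_by_id.setdefault(value, set()).add(owner)

    events = set()
    for url in request_urls:
        parts = urlsplit(url)
        receiver = registrable_domain(parts.hostname or "")
        # Look for identifiers in query parameter values and path segments.
        tokens = [v for _, v in parse_qsl(parts.query)] + parts.path.split("/")
        for token in tokens:
            for owner in owners_by_id.get(token, ()):
                if owner != receiver:
                    events.add((owner, receiver))
    return sorted(events)


if __name__ == "__main__":
    # Hypothetical cookie and requests; not data from the paper.
    cookies = [(".tracker-a.example", "abc123def456")]
    urls = [
        "https://sync.tracker-b.example/match?uid=abc123def456",
        "https://cdn.tracker-a.example/pixel?uid=abc123def456",
    ]
    print(find_cookie_syncs(cookies, urls))
```

Counting such pairs per advertiser, once after opting out and once after opting in, gives the kind of per-advertiser event counts that the cookie syncing tables report.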
Takeaway. Similar to post opt-out, we note that under GDPR advertisers continue to use data even when we opt out prior to collecting bids. Under CCPA, as compared to GDPR, fewer personas receive higher bid values than that of the control. However, there are still several personas where advertisers continue to bid higher than the control. In the case of client-side data sharing, we did not notice any reduction in cookie syncing under either GDPR or CCPA.
(a) Consent management dialog for GDPR. (b) Consent management dialog for CCPA.

Figure 2: High-level overview of our framework to audit regulatory compliance. (1) We use OpenWPM [64] to automatically visit top-50 websites from 16 different interest categories to simulate 16 user interest personas. (2) We filter top websites that support opt-outs through Didomi, Quantcast, OneTrust, and CookieBot under GDPR and CCPA and also support header bidding through prebid.js [36]. (3) We then visit the filtered websites with user interest personas, opt out of data processing and selling, and collect bids and network requests from advertisers. (4) We then analyze the collected bids and network requests to infer data processing and selling from advertisers.
(b) Expanded consent dialog with all options.

Table 1: CMP and prebid.js deployment on Alexa top-100K websites under GDPR and CCPA. +PB represents the count of websites for each CMP that also deploy prebid.js. Common websites column represents the count of websites that deploy both CMPs and prebid.js and are common across GDPR and CCPA.
[...]ing amounts in different directions, scrolling occurs at unpredictable intervals down the page, and waiting periods are introduced randomly to create irregular patterns in page visits. This approach is consistent with previous research in web measurement [59,70].

Table 2: Number of 3rd party requests and cookies set during manual test on 4 CMPs: Cookiebot, Didomi, OneTrust, and Quantcast. Column "3rd R" represents the number of 3rd party HTTP requests, and column "3rd C" represents the number of 3rd party cookies.

Table 3: Ad bidding under GDPR and CCPA after opt-out (Out) and opt-in (In) with Cookiebot. Avg. column represents the mean of all bid values. Light red and light blue indicate bid values that are higher and lower than Control's avg., respectively. Dark red and dark blue indicate bid values that are higher than Control's avg. + std. and lower than Control's avg. − std., respectively. Columns p-val. and Eff. represent p-value and effect size, respectively.

Table 4: Ad bidding and cookie syncing under GDPR and CCPA after opt-out (Out) and opt-in (In) with Cookiebot. Avg. column represents the mean of all bid values from advertisers who did not bid or appear when we simulated personas but appeared and bid after we opt out. Out and In under C-Sync. represent the number of cookie syncing events after opt-out and opt-in, respectively.

Table 5: Most prevalent advertisers that participate in cookie syncing when we opt out under GDPR and CCPA. These advertisers appear in all personas across CookieBot, Didomi, OneTrust, Quantcast and NAI configurations. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 6: Most prevalent advertisers that participate in cookie syncing when we opt in under GDPR and CCPA. These advertisers appear in all personas across CookieBot, Didomi, OneTrust, Quantcast and NAI configurations. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 7: Average of bids data from 6 advertisers in cookie syncing, when we opt out and opt in under GDPR. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 8: Average of bids data from 6 advertisers in cookie syncing, when we opt out and opt in under CCPA. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.
4.4.1 Cookie Syncing Advertisers. We have compiled a list of the top five advertisers from a pool of 53, based on the highest number of syncing events during the opt-out phase (Table

Table 9: Number of bids data from 6 advertisers in cookie syncing, when we opt out and opt in under GDPR. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 10: Number of bids data from 6 advertisers in cookie syncing, when we opt out and opt in under CCPA. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 11: Average of bids data from advertisers in cookie syncing and not in cookie syncing under GDPR. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 12: Average of bids data from advertisers in cookie syncing and not in cookie syncing under CCPA. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 13: Number of bids data from advertisers in cookie syncing and not in cookie syncing under GDPR. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 14: Number of bids data from advertisers in cookie syncing and not in cookie syncing under CCPA. CB, DM, OT, QC and NAI columns represent the count of cookie syncing events under CookieBot, Didomi, OneTrust, Quantcast and NAI for each advertiser.

Table 18: Ad bidding and cookie syncing under GDPR and CCPA after opt-out (Out) and opt-in (In) with OneTrust. Avg. column represents the mean of all bid values from advertisers who did not bid or appear when we simulated personas but appeared and bid after we opt out. Out and In under C-Sync. represent the number of cookie syncing events after opt-out and opt-in, respectively.

Table 24: Cookie syncing events by advertisers under CCPA in California after pre-opt-out. Evt. column represents the number of cookie syncing events from advertisers.

Table 23: Cookie syncing events by advertisers under GDPR in Germany after pre-opt-out. Evt. column represents the number of cookie syncing events from advertisers.