Model-driven Privacy

Data protection regulations in many countries require IT systems to implement baseline privacy requirements like purpose limitation and consent as mandated by the GDPR. Such requirements are often specified in the system’s privacy policy and are challenging to implement as system developers must address them consistently and in a cross-cutting manner. Moreover, without a formal connection between a system’s privacy policy and its implementation, the system’s correctness and evolution are extremely difficult to attain. We propose a model-driven development methodology that incorporates privacy policies into the system design. Namely, we define a system’s privacy model, which has precise semantics and is used to specify privacy policies. We provide semantic-preserving model transformations that generate system implementations that enforce the given privacy policies by design. We implement two such model transformations, targeting C# and Python system implementations. We evaluate our methodology on three substantial case studies and show the enforcement of privacy policies related to purpose limitation and consent. Our evaluation also demonstrates our approach’s generality, effectiveness, and modest overhead.


INTRODUCTION
Motivation and problem statement. Modern IT systems implement a wide range of security and privacy requirements. The sources of privacy requirements are manifold and include privacy regulations, end-user concerns, self-imposed constraints, prioritized risk scenarios, best practices, and standards.
In this paper we focus on data protection, a particularly challenging class of privacy requirements. Data protection requires enforcing users' data ownership rights on data beyond the users' actual control or, dually, ensuring that any user's personal data is treated according to the user's data-usage policy [56]. As users rarely explicitly specify their data-usage policies, data protection regulations, like the EU's General Data Protection Regulation (GDPR) [32], mandate the enforcement of particular classes of baseline data-usage policies. For example, personal data may be collected only for specified, explicit, and legitimate purposes (Art. 5 §1 (b) GDPR) and processed for each purpose only when the owner has provided consent (Art. 7 §1 GDPR). In practice, a user's consent is often implicit. For example, by using a website, a user tacitly agrees with the (often hard to find) text of the website's privacy policy [63]. Ideally, a user may consent only to parts of the privacy policy and thereby create their own data-usage policy, which should be enforced.
The question of how to support data protection during system development and evolution naturally arises. While notions like privacy-by-design have been advocated, we still lack effective methodologies and tools to formally specify privacy and data-usage policies, connect these policies with system implementations, and keep them in sync during the systems' evolution. In fact, the implementation of an enforcement mechanism and its synchronization with the data-usage policies still require manual developer intervention in the system code, which is difficult and error-prone.
For security, model-driven development (MDD) [11,20,44] has proven effective in tightly coupling system implementations with security policies via design models that integrate security into the system design process. The model semantics are defined as the allowed system executions, and MDD relies on semantic-preserving model transformations to generate an enforcement mechanism that prevents those system executions disallowed by the modeled policy. Model transformations are parametrized by a specific implementation technology and can target popular technology-specific infrastructure (e.g., authorization frameworks like JAAS [47]). MDD was shown to reduce system development time and improve correctness [31] and security [17] with respect to a system specification.
Prior work. Existing MDD approaches for privacy (Section 2) focus on automatically transforming architectural system designs [8] (e.g., by adding access control mechanisms or integrity-protected logger components), or on reasoning about privacy properties of different designs [7]. However, a large gap remains between designs and the concrete system implementations. In contrast, approaches for enforcing data-usage policies [19,62] focus on the enforcement mechanism, making the policy statement implementation-dependent and hence not a design artifact. Finally, there are approaches that do not propose any technical means for achieving privacy-by-design, but rather report on experiences [39], analyze regulations [6], or exclusively work with system design models [5].
Previously, Basin et al. [16] have specialized MDD to model-driven security, where they additionally model and enforce security requirements (Section 3). In particular, they focus on fine-grained access control policies that combine declarative and programmatic access control policies. In model-driven security, a software engineer first creates a design model in a design modeling language (e.g., a UML class diagram). Next, a security engineer creates a security model in a security modeling language (e.g., SecureUML [51]). Using model transformations, a secure-by-design implementation is then obtained from the two models. The associated threat model is that system designers provide correct system specifications via design models, whereas developers may unintentionally make implementation errors. Relying on model transformations for security therefore reduces the risk of critical implementation errors.
Our approach. In this paper, we extend model-driven security with support for privacy by proposing a new privacy model (Section 4) that captures the GDPR's notions of purpose limitation and consent. As an additional design step in our new methodology (Section 5), privacy engineers define the system's privacy model, which formalizes its privacy policy. We rely on UML as a modeling language for presenting models, but define our models more generally. We define the privacy model's set-theoretic semantics (Section 6), and present two semantic-preserving model transformations that generate ASP.NET C# and Flask Python web applications, respectively (Section 7). An application generated using our approach allows users to explicitly and selectively consent to (parts of) the application's privacy policy, and the generated implementation automatically respects the user-provided consent (i.e., enforces the users' data-usage policies). In particular, whenever a user's personal data would be manipulated for a purpose not declared by the privacy policy or not consented to by the user, the attempted operation is prevented by the generated enforcement mechanism. By leveraging model transformations that systematically enforce policies in a cross-cutting manner, we can prevent policy violations resulting from developers' mistakes under the same threat model as in model-driven security [16]. Finally, we use our approach to implement three case study applications (Section 8). Our models simplify reasoning about the correctness of the generated applications, facilitate policy evolution, and generate enforcement mechanisms that operate efficiently, with low overhead.
In summary, we make the following contributions:
• We propose a privacy model that can express privacy policies incorporating purpose limitation and consent.
• We extend model-driven development to support privacy.
• We propose two model transformations that generate complete, configured enforcement infrastructures for privacy. One generates ASP.NET C# web applications, whereas the other generates Flask Python web applications.
• We evaluate our approach on three case studies, demonstrating its generality, effectiveness, and modest overhead.
To our knowledge, this is the first approach proposing an MDD methodology for privacy with all the technical means for obtaining a configured enforcement infrastructure for typical privacy policies.
Scope and limitations. Privacy can be understood in a very broad and general sense. We focus on a particular privacy aspect: enforcing the purpose-limitation and consent data protection requirements typically specified in privacy policies. Therefore, for the purposes of this paper, we interpret privacy and the legal obligation of data protection as the enforcement of the appropriate data-usage policies.
While we focus on controlling the usage of collected personal data, more work remains to fully address data protection requirements. In particular, systems should also be designed to minimize the personal data they collect (Art. 5 §1 (c) GDPR) and to enforce a combination of access control and information-flow policies; the latter are needed to prevent subtle inference attacks. Finally, our privacy model reflects a simplified interpretation of the GDPR; incorporating a more elaborate formalization of this law [59] is future work (Section 9).

RELATED WORK
We split related work into three parts: privacy specification, privacy enforcement, and model-driven development (MDD) for privacy.
Privacy specification. Privacy policies [63] are security policies [60] that encode the privacy rights of users of software systems. Such policies are now heavily anchored in national and international law. They can be domain-specific, like the Health Insurance Portability and Accountability Act (HIPAA) [41] and the Gramm-Leach-Bliley Act (GLBA) [35]. Alternatively, they can be domain-independent, like the General Data Protection Regulation (GDPR) [32], the California Consumer Privacy Act [24], and the Digital Charter Implementation Act [29]. Privacy policies must be both understandable to end users and precise. Precision is achieved by defining a formal semantics for the policy specification language, which is essential if policies are used to generate an enforcement mechanism.
Lam et al. [48] formalize certain HIPAA regulations that focus on access control rules for the use and disclosure of protected health information, including treatment information, healthcare operations, and payment. They implement their formalization using a fragment of the Datalog logic programming language and demonstrate its usability in a prototype hospital system.
Similarly, May et al. [52] extend the classical Harrison, Ruzzo, and Ullman access control model [40] to formally analyze and audit HIPAA regulations. They formalize the target regulations and automatically translate them into a Promela model for model checking.
DeYoung et al. [30] go beyond the aforementioned HIPAA regulations and formalize regulations on the protection of customer information under the GLBA, which they also automatically audit [25].
Various languages have been proposed [49] to specify privacy policies [63]. The now-obsolete P3P [26,27] and EPAL [10] languages were proposed to declare the purposes of data processing in a machine-readable form. With P3P, users can set their preferences, which the browser then matches against the websites' declared purposes, notifying the user about any mismatch. EPAL goes further by allowing enterprises to declare the internal access control policies needed to enforce the declared privacy policies within the system. As they lack a formal semantics, P3P and EPAL policies only amount to machine-readable documentation.
The Ponder language [28,65] is an access control specification language. It can specify obligations, which is relevant for some common privacy requirements, like data minimization.
The PrimeLife Policy Language (PPL) [9] and its extensions [13] are based on the eXtensible Access Control Markup Language (XACML), with custom notions of purpose and obligation. PPL reuses existing XACML implementations, but also inherits XACML's complexity.
Recently, Pilot [54] and the policy language by Baramashetru et al. [14] have been proposed to capture GDPR-specific privacy requirements. Besides purpose limitation and consent, the former can also specify to whom private data may be transferred, while the latter can specify data retention and location-specific processing.
Our privacy model focuses on the most important aspects of the GDPR: purpose limitation and consent. Unlike existing policy languages, the formal semantics of our privacy model (Section 6) can be used to generate enforcement mechanisms.
Privacy enforcement. Guerriero et al. [38] propose a privacy policy language and an enforcement mechanism tailored for data-intensive applications. Application users can specify privacy policies in a fragment of past-time metric temporal logic. A policy is enforced by removing from the data streams all the user-owned data items that violate the user's policy. While the approach can specify purpose limitation, consent is not accounted for, and the approach is specialized for data-intensive applications.
Byun et al. [21,22] model purpose hierarchies and distinguish between the intended purpose defined for each data item and the access purpose, i.e., the purpose for which the data item is accessed. Assuming that the latter is accurately stated by a user, their enforcement of purpose limitation amounts to checking whether this purpose conforms to the data item's intended purpose, where both may be complex objects. However, this assumption is a strong limitation of their approach and, with the exception of our work and Karami et al. [43], of all other approaches in this section.
Karami et al. [43] propose the Data Protection Language (DPL), a programming language with explicit constructs for purpose and consent. DPL's semantics guarantees the enforcement of purpose, consent, and storage limitation. While having these constructs available in the implementation language improves privacy, it imposes a significant technological restriction: the developer must use DPL when implementing both the privacy policies and the business logic. In contrast, MDD targets multiple implementation technologies, and hence developers can choose a model transformation for the technology best suited for implementing the business logic.
MDD for privacy. Privacy by design [23] posits that privacy must be accounted for throughout the whole engineering process. In particular, software must be developed to ensure privacy policy enforcement. MDD supports this aspect of privacy by design by introducing privacy policies in the software design models.
Antignac et al. [8] propose an automatic transformation of a high-level architectural system design into a privacy-preserving design. Their transformation takes a data-flow diagram (DFD) (i.e., the initial architectural design) and a privacy policy. The privacy policy classifies each of the DFD's data flows based on the data subject whose data flows through it, the purpose of the flow, and the data's retention time. The model transformation enhances the input DFD with additional processes that check privacy policies before data is used (e.g., checking the purpose and the existence of the user's consent before data processing). These added processes are abstract, and a large gap remains between the designs and a concrete system implementation.
Antignac et al. [7] also support the thesis that privacy should be addressed at the architectural level, and they specifically focus on data minimization policies. They propose two languages: a formal language for modeling architectures for data minimization and a logic for reasoning about the correctness of such architectures. The latter, called privacy logic, is a variant of epistemic logic with a proof system to reason about the knowledge and beliefs of different architectural components. Unfortunately, their privacy architectures are modeled at a very high level, and obtaining a concrete implementation requires a highly non-trivial manual refinement.
Privacy policies modeled in ModelSec [57] do not consider purpose or consent. According to the authors, privacy policies ensure that information can only be read by those who are allowed to, which coincides with the standard notion of confidentiality.

PRELIMINARIES
We start by introducing an example application and its (functional, security, and privacy) requirements. We will use it as a running example to illustrate the concepts discussed throughout this paper. We rely on some software engineering and modeling concepts [36], such as UML notation, without introducing them.
Example 1. Consider a simple conference management system used by researchers to publish and review papers. A researcher can be a conference chair, a committee member, or a normal user.
Any researcher can publish papers as well as receive paper recommendations based on their personal data. To publish a paper, they must first create and initialize the object representing it. Then they must associate the object with themselves and their co-authors. Such papers are considered unpublished until the review process is completed. Researchers who are committee members can additionally review papers and possibly delegate reviews to other researchers. Finally, a researcher who is a conference chair can accept papers for publication.
Additionally, a researcher assigned to review a paper must not have any conflicts with the paper's authors, i.e., be in a direct advisor relationship or have co-authored a paper in the last two years.
We now introduce the model-driven security methodology [16], which we extend in this paper. To model the above application using this methodology, we need to recap data and security models. The data model defines the structure of the well-formed states of an application and is designed by software engineers. The security model represents the application's fine-grained access control (FGAC) permissions, which security engineers define based on the actions allowed on the data specified by the data model.
For ease of presentation, we use UML syntax for visualizing data models. Nevertheless, the ideas presented in this paper are independent of UML's particular syntax. Hence, we introduce a straightforward set-theoretic representation of our models.
Let ID be a countable set of names and S ⊂ ID a set of sorts. Sorts can be primitive sorts from PS ⊂ S, consisting of natural numbers (N), integers (Z), floats (F), strings (String), and Booleans (B), or custom (i.e., defined by a data model).
We use ×, +, →, and ⇁ to denote type constructors for product, sum, function, and partial function, respectively. We write S* (S+) for a finite (nonempty) sequence over a sort S and () for the empty sequence. Starting from a set of sorts, we use these operations to build more complex types. The sets dom(f) and rng(f) are the domain and the range of a function f. We write a ↦ b for a pair (a, b) that belongs to a function.
A data model is a formalization of a UML class diagram. For each sort s ∈ CS, attr(s) returns a function from attribute names to their types, and meth(s) returns a function from method names to their types. Every method name has a function type with at least one argument, which is its owning sort. A (binary) association name as ∈ dom(asc) relates an association end name and its sort to another association end name and its sort, as returned by asc(as).
Intuitively, sorts in CS correspond to UML classes and ⪯_D defines the classes' inheritance relationship. The remaining functions in D are a straightforward encoding of the classes' structure and their associations. For simplicity and without loss of generality, we consider only binary associations. We say that a relation is asc(as)-typed with asc(as) = ((a, A), (b, B)) if it has type A × B. An instance of a data model is an object model, which interprets sorts as finite sets of objects and associations as relations on the objects. An object interprets all the attributes of its sort, whereas methods have a fixed interpretation for every object of a sort. Two objects are (structurally) equal if they interpret the attributes equally. To distinguish structurally equal objects, each object has a unique name n ∈ ID. Hence, an object of a sort s is a pair (n ↦ (s, v)), where v is the set of attribute values. In other words, an object binds its name n to an interpretation v of the attributes of the sort s. Since object names are unique, a finite collection of objects yields a (partial) function from object names to their interpretations.
Intuitively, an object model formalizes a UML object diagram used to model a state of a system.
where m1, m2, and m3 are method implementations returning (), (), and Paper*, respectively. For conciseness, we use object names instead of the objects themselves in R.
Given a data model, the set of actions A that a user may take consists of creating or deleting objects, reading or updating attributes or associations, and executing methods.
The object constraint language (OCL) [37] is an order-sorted first-order logic used to query object models or define properties of data models. OCL syntax has a pre-defined part used with primitive sorts (e.g., integer addition or string concatenation) and collections (e.g., forall, exists, select, size, and excludes), and a variable part, which is defined by a specific data model. OCL is strongly typed and every OCL expression evaluates to a value of some sort. We denote by Expr_s all OCL expressions that evaluate to values of sort s. Expressions in Expr_B are called OCL constraints.
OCL's dot-operator is used to access an object's attribute (e.g., r1.name) or an association end, i.e., the collection of objects linked with the object via an association (e.g., r1.advisers). Collection functions are called using an arrow notation (e.g., r1.advisers→size()). When used with collections, the dot-operator maps over them. OCL expressions can be written in the context of an object, and such objects can be referred to using the keyword self.
Example 4. If the OCL constraint self.title.size() > 0 is given as an invariant of the Paper sort, it restricts the possible object models to those whose objects have non-empty paper titles.
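When an object model is represented with plain objects, OCL navigation has direct analogues in a general-purpose language. The following hypothetical Python sketch (the classes Paper and Researcher and both predicate names are ours) shows the invariant from Example 4 and a forall over an association end:

```python
class Paper:
    def __init__(self, title):
        self.title = title

class Researcher:
    def __init__(self, papers):
        self.papers = papers  # association end: linked Paper objects

def title_invariant(paper):
    # OCL: self.title.size() > 0
    return len(paper.title) > 0

def all_titles_nonempty(researcher):
    # OCL: self.papers->forall(p | p.title.size() > 0)
    return all(len(p.title) > 0 for p in researcher.papers)

r1 = Researcher([Paper("Model-driven Privacy"), Paper("MDD")])
```

The dot-operator corresponds to Python attribute access, while collection operations like forall and size map to all(...) and len(...).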
The set of free variables of an OCL expression e is fv(e), e.g., fv(self.papers→forall(p | x.concat(p.title).size() < n)) = {x_String, n_Z}. Here the variable p_Paper is bound by the forall function. Note that we subscript variables with their sorts for clarity.
Given x_s ∈ fv(e) and an OCL expression t ∈ Expr_s, e[x_s/t] is obtained from e by substituting all occurrences of x_s with t.
A security model defines which actions can be carried out on an object model by users in different roles, i.e., fine-grained access control (FGAC) policies. Our definition here corresponds to a simplified version of SecureUML [51] without action and user hierarchies.
Let R be a set of sorts, each representing a role, and ⪯_R a partial order on R; for example, R = {Normal, Committee, Chair}, ordered Normal ⪯_R Committee ⪯_R Chair. Given a data model D, let D′ extend it with a sort User representing the application's users, with attr(User) = {role ↦ R}. Also, ⪯_D′ may extend ⪯_D with (User, s) for any sort s that already represents application users (e.g., Researcher in Figure 1).
Authorization constraints are OCL constraints that express the conditions under which a user (e.g., Alice) in some role (e.g., Chair) may take some action (e.g., read the year attribute) on some resource (e.g., an object of the Paper sort). A constraint is written in the context of the object representing the resource; hence the self OCL expression evaluates to that object. The user object is represented by the free variable caller_User in the constraint. Depending on the action, other factors may need to be accounted for when making the authorization decision. For example, when updating an attribute or an association end, the intended new value of some sort s can influence the decision, which is captured by the free variable value_s. Given a valuation of the variables, the OCL constraint evaluates to a Boolean that encodes whether the user may carry out the action.
Given D′, let Con(V) = {e | e ∈ Expr_B, fv(e) ⊆ V} be the set of OCL constraints with all free variables in V. Given an action a ∈ A, the set of authorization constraints C(a) is defined according to the action's kind; for association actions, the associations are in asc and a is add(x, y) or remove(x, y).
Now we define the security model that models FGAC policies.
Definition 3. Given a data model D, a security model is a 4-tuple (D′, R, ⪯_R, PA), where PA is a set of permission assignments (r, a, e), with a role r ∈ R, an action a ∈ A, and an authorization constraint e ∈ C(a). Intuitively, an action by a user is allowed if it is associated with a permission containing a satisfied authorization constraint, and the user's role is larger than the role in the permission. More precisely, a security model allows a user u to take an action a on an object o and (possibly) update the object with the value v iff there exists an (r, a, e) ∈ PA such that r ⪯_R u.role and e[caller/u][value/v] evaluates to true in the context of o.
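The decision rule of Definition 3 can be sketched in a few lines of Python. This is an illustrative sketch under our own assumptions: permission assignments are (role, action, constraint) triples, authorization constraints are Boolean functions of caller, object, and new value, and the role order is given as a set of (smaller, larger) pairs.

```python
# Reflexive pairs (r1, r2) with r1 <= r2 in the role hierarchy.
ROLE_LEQ = {
    ("Normal", "Normal"), ("Committee", "Committee"), ("Chair", "Chair"),
    ("Normal", "Committee"), ("Normal", "Chair"), ("Committee", "Chair"),
}

def allows(PA, caller, action, obj, value=None):
    """A user may take an action iff some permission assignment matches the
    action, its role is below the caller's role, and its constraint holds
    (the analogue of e[caller/u][value/v] evaluated in the context of obj)."""
    return any(
        a == action
        and (r, caller["role"]) in ROLE_LEQ
        and e(caller, obj, value)
        for (r, a, e) in PA
    )

# Example permissions: a Chair may update a paper's 'accepted' attribute;
# anyone may read the title of a published paper.
PA = [
    ("Chair", "update(Paper,accepted)", lambda u, o, v: True),
    ("Normal", "read(Paper,title)", lambda u, o, v: o["published"]),
]
chair = {"role": "Chair"}
normal = {"role": "Normal"}
```

An enforcement mechanism generated from the security model amounts to calling such a check before every action.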

MODELING PURPOSE AND CONSENT
Purpose limitation is the most prominent privacy requirement mandated by the GDPR. It states that personal data (namely, the data provided by the application's users) can only be used for the purposes that the data owners have consented to. Here one can distinguish between different semantics for data usage. Access control semantics considers only the initial access to the data as usage. Data-flow semantics additionally considers as usage any access to data explicitly derived (e.g., by assignment) from the initially accessed personal data. Finally, information-flow semantics further considers as usage any access to data implicitly derived (e.g., by branching on private data, or via other implicit or explicit flows) from the initially accessed personal data. We opt for an access control semantics, as the other, more draconian semantics substantially complicate the creation of usable applications. We first must distinguish which sorts model personal data, so let CS_P ⊆ CS be the set of such sorts. Only the actions on these sorts are subject to purpose limitation. Hence, let A_P ⊆ A be the set of actions on sorts in CS_P. Note that this is without loss of generality, as one can mark personal data at a more fine-grained level (e.g., a single attribute or an association end) by modeling it as an explicit sort in the data model. Note that, for simplicity and to focus on the enforcement aspects of data protection, we here consider personal data as a subset of the data model sorts fixed at the design phase. Fine-grained and dynamic definition of personal data is future work.
Let P be a set of basic purpose sorts, which induces the lattice (2^P, ⊆) of complex purposes (i.e., sets of basic purposes). For example, any purpose is the set P, no purpose is ∅, and the marketing purpose is {TargetedMarketing, MassMarketing}. As in some previous work [21], we distinguish two general types of purposes: declared purposes and actual purposes.
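The purpose lattice has a one-line rendering in Python: complex purposes are frozensets of basic purposes, and the lattice order is set inclusion. A minimal sketch, with purpose names taken from the running example and the conforms helper being our own illustrative name:

```python
# Basic purposes P; complex purposes are elements of 2^P, i.e., frozensets.
P = frozenset({"PublishPaper", "AssignReviewer", "RecommendPapers",
               "TargetedMarketing", "MassMarketing"})

ANY_PURPOSE = P           # top of the lattice (2^P, subset-of)
NO_PURPOSE = frozenset()  # bottom of the lattice
marketing = frozenset({"TargetedMarketing", "MassMarketing"})

def conforms(actual, declared):
    """One complex purpose conforms to another iff it is below it in the
    lattice, i.e., actual is a subset of declared."""
    return actual <= declared
```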
A declared purpose is a purpose presented to the user in the form of the privacy policy. It is intended to be enforced during the application's execution, i.e., any access to the private data must be checked with respect to the declared purposes. In fact, the declared purposes, the privacy policy, and the enforcement mechanism must always be synchronized. Therefore, in our model-driven approach, it is sufficient to specify the declared purposes, whereby the other two are then automatically generated. Furthermore, whenever the declared purposes change, users can receive an updated version of the correspondingly (re-)generated privacy policy, and the enforcement mechanism consistently changes to enforce the new policy.
To model declared purposes, we emulate the permission assignment relation from FGAC (Section 3): the relation PA_P replaces the role in PA with a purpose.
Example 6. Let the basic purposes of the conference management system be P = {PublishPaper, AssignReviewer, RecommendPapers}. Consider the following privacy policy statement from Example 1: If you are a student, we will use your list of authored papers to recommend papers to you. A declared purpose modeling this policy statement is the tuple (RecommendPapers, read(Researcher, papers), self.student).
A natural question arises: how should an enforcement mechanism decide whether some data access conforms to the declared purpose? In contrast to a declared purpose, an actual purpose describes how data is actually being used. It is difficult to determine the actual purpose automatically by analyzing the design models we have seen so far. It must therefore be specified by annotating some of the methods in the data model with complex purposes. If a data access occurs as part of an execution of a method, the method's annotation is the actual purpose for the data access. Given a data model D, we explicitly define the application's interface as a subset of methods I ⊆ ⋃_{s ∈ CS} dom(meth(s)), and consider the function annotate : I → 2^P annotating the interface methods with (complex) purposes. We can extend the annotate function to all methods by associating no purpose (∅) to methods outside of I.
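In a generated Python implementation, the annotate function could plausibly surface as a method decorator. This is a sketch under our own assumptions (the decorator and helper names are illustrative, not the paper's generated code): interface methods carry their complex purpose, and unannotated methods default to the empty purpose.

```python
def annotate(*purposes):
    """Tag an interface method with its complex (set of basic) purposes."""
    def wrap(method):
        method.purposes = frozenset(purposes)
        return method
    return wrap

def actual_purpose(method):
    """Extend annotate to all methods: no annotation means no purpose (the
    empty set), as for auxiliary methods outside the interface I."""
    return getattr(method, "purposes", frozenset())

@annotate("RecommendPapers")
def recommend_papers(researcher):
    return researcher["papers"]

def sort_titles(titles):  # auxiliary method outside the interface I
    return sorted(titles)
```

When recommend_papers runs, every access to personal data within it would be checked against the actual purpose {RecommendPapers}.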
Even if declared purposes are presented to the user and enforced in the application, the GDPR further requires that personal data is only processed when the owner has provided consent. In practice, a user's consent is implicit: typically, by using a website, the user tacitly agrees with the text of the website's privacy policy. We aim to make this consent collection step explicit.
In general, users can (partially) consent to or even reject all the declared purposes. The lack of consent is treated as a rejection of that part of the privacy policy. Consent can be provided lazily (e.g., when a user performs an action) and can be revoked at any time.
To model consent, we must capture the relationship between a user, their personal data of some sort s, and a purpose p for each tuple (p, a, e) in the PA_P relation, where the action a is taken on an object of sort s. Hence we can model consent as a relation over User × CS_P × P that relates a user and a sort of personal data to a purpose that the user has consented to. For example, (Alice, Researcher, RecommendPapers) means that Alice consents that her information in the Researcher object can be used to recommend papers to her. Since collecting consent happens only once the application is running, the relation above must be maintained dynamically as a part of the data model.
Given a data model D′ (from the security model) with the identified sorts CS_P containing personal data, the data model D′′ extends D′ with a sort Consent such that attr(Consent)(purposes) = 2^P, attr(Consent)(user) = User, and attr(Consent)(data) = CS_P. A Consent object relates a user, a personal data type, and a set of purposes. Such an object exists only if the user has consented that their personal data of the appropriate type can be used for the set of purposes. The extended model also requires that every sort s ∈ CS_P has an owner attribute, attr(s)(owner) = User, relating every instance of sort s to the user who owns it. We only model single data owners, as the treatment of shared personal data is still an open problem in the GDPR [46, Problem 6]. We now combine the above notions into a privacy model.
Definition 4. Given a data model D, the privacy model is the tuple (D′′, P, CS_P, PA_P, annotate).
Intuitively, an enforcement mechanism for a privacy model ensures that the actual purpose always conforms to the declared purpose and that the users who own the personal data have consented to every actual purpose. More precisely, it allows a user u's action a to execute on private data o of sort s ∈ CS_P, (possibly) updating o with the value v, as part of a method m's execution iff annotate(m) is a subset of the declared purposes {p | (p, a, e) ∈ PA_P, e[caller/u, value/v] holds in the context of o} and there exists a Consent object c such that c.user = o.owner, c.data = s, and annotate(m) ⊆ c.purposes.
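The combined purpose-and-consent check described above can be sketched as follows. This is an illustrative sketch under assumed data layouts (PA_P as purpose/action/constraint triples, consents and personal-data objects as dicts, all names ours), not the paper's generated enforcement code:

```python
def declared_purposes(PA_P, action, caller, obj, value=None):
    """Purposes declared for this action whose constraint holds at obj."""
    return {p for (p, a, e) in PA_P if a == action and e(caller, obj, value)}

def consented_purposes(consents, owner, sort):
    """Union of purposes in the owner's Consent objects for this data sort."""
    granted = set()
    for c in consents:
        if c["user"] == owner and c["data"] == sort:
            granted |= c["purposes"]
    return granted

def permits(PA_P, consents, caller, action, obj, sort, actual, value=None):
    """Allow iff the actual purpose is covered by the declared purposes
    AND by the purposes the data owner has consented to."""
    return (actual <= declared_purposes(PA_P, action, caller, obj, value)
            and actual <= consented_purposes(consents, obj["owner"], sort))

# Declared purpose from Example 6, plus Alice's consent to it.
PA_P = [("RecommendPapers", "read(Researcher,papers)",
         lambda u, o, v: o["student"])]
alice_data = {"owner": "alice", "student": True}
consents = [{"user": "alice", "data": "Researcher",
             "purposes": {"RecommendPapers"}}]
```

Note that an access with no actual purpose (the empty set) is always permitted, matching the extension of annotate to auxiliary methods, while revoking consent (removing the Consent object) immediately blocks the corresponding accesses.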

METHODOLOGY
We now put all the above notions together into a model-driven methodology for developing applications that enforce privacy. Overall, an application's design is given by three models: a data model, a security model, and a privacy model. We describe each step of our methodology in detail and explain how some of the design decisions can be made. Then we show what the model transformations can generate from the models. Figure 2 summarizes the methodology's design steps (left) and the outputs of the model transformations (right). Dashed arrows denote model-to-model transformations, solid arrows denote model-to-code transformations, and the dotted arrow denotes a model-to-text transformation [18].
As a first step, software engineers create a data model D focusing solely on the application's business logic. The data model captures the well-formed states of the application. In the next step, security engineers define the application's security model based on the data model D. This consists of multiple sub-steps. They first pave the way for extending the data model by identifying the sort representing users of the system. They then define roles (R) and the role hierarchy (⪯_R), and use them to define the permission assignment (PA), which models FGAC policies for the data in the data model.
Finally, privacy engineers define the privacy model. They first determine the basic purposes (P), which depend on the application's domain. One can, however, consider a taxonomy of commonly used purposes as a starting point [50]. Next, they declare the purposes of the data model's methods (the declared purposes PA_P and the annotate function). The annotated methods have a special role: they are the application's interface providing different business logic services. Choosing the right annotations is a creative process. One can trivially choose one basic purpose for each method in the data model. Alternatively, more precise annotations can be chosen. In general, the interface methods should correspond to the application's different business processes as modeled by a UML activity diagram [15]. In the most general case, they may call each other (in which case their actual purposes are combined into a more complex purpose), or some processes may not be annotated. For example, a general method for sorting lists is typically used as an auxiliary process, and its actual purpose depends on the interface methods whose execution requires sorting. Both declared and actual purposes can be obtained by analyzing business processes, as we exemplify in the following.
Example 8. Suppose we have access to a UML activity diagram (Figure 3) describing the business processes implemented by the methods of the conference management system described in Example 1.
The diagram consists of three activities (shown as vertical lanes), each with input actions, process actions, and output actions (each drawn with its standard UML symbol) connected by directed edges describing the activities' control flow. Objects representing different sorts from the data model are shown outside of the activities' lanes. We abuse notation and mark the sorts from CS_P in grey. Directed edges involving objects describe the data flow within the activities. Note that some objects in the activity diagram do not correspond to sorts in the data model. This can be achieved by modeling associations as explicit sorts and including them in CS_P (Section 4).
Actual purposes can be derived from the diagram by observing which methods are used exclusively by other methods, as opposed to the methods that are additionally used as part of the application's interface. For the conference management system, each method is only used as an interface method and has its own actual purpose. Declared purposes can be derived by observing the data flows from objects to process actions. For instance, the privacy policy from Example 6 is inspired by the edge that connects the Author object to an action in the recommend papers activity.
Identifying the purposes of a piece of code is a difficult task that cannot be readily automated. Hence, additional manual model validation steps (either by developers or by external privacy auditors) help ensure correct privacy enforcement.

Now let us consider what can be generated from the three models. The initial data model is first updated to account for the extensions required by the security and privacy models. It is sufficient for the security engineer to identify the sort representing users by specifying its name in the security model. The data model can be automatically extended based on that information: the selected sort is declared as a subsort of the User sort. The roles and purposes can be encoded as enumerations (i.e., by using the + type constructor). A role is assigned to each user via the role attribute in the User sort, i.e., attr(User)(role) = Role. Purposes are similarly declared (as enumerations) and associated to the Consent sort via an attribute. Distinguishing personal data sorts (CS_P) from the other sorts can be achieved by declaring them as subsorts of the sort PersonalData, which has the owner attribute of type User. Once the extended model is generated, it can be used to further generate code for classes and method stubs, which still must be implemented. Using existing object-relational mappers, each class is also generated with persistence support defining the application's data tier. It is also possible to use existing authentication libraries (like ASP.NET Core Identity [4] for C#, or Flask-User [2] for Python) to generate an authentication mechanism given the identified User sort. We use such a library with default settings for password-based authentication, which can be further configured.
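To make the shape of the extended model concrete, the following is a plain-Python sketch of the structure it induces. All names are illustrative; the actual transformations emit Entity Framework Core (C#) or SQLAlchemy (Python) classes with persistence support:

```python
# Sketch of the extended data model's structure (illustrative only).
from dataclasses import dataclass, field
from enum import Enum

class Role(Enum):            # roles encoded as an enumeration
    REG_USER = "RegUser"
    CHAIR = "Chair"

class Purpose(Enum):         # purposes encoded as an enumeration
    GENERATE_ADS = "GenerateAds"
    DISPLAY_POSTS = "DisplayPosts"

@dataclass
class User:
    name: str
    role: Role               # attr(User)(role) = Role

@dataclass
class PersonalData:          # supersort of all sorts in CS_P
    owner: User              # attr(c)(owner) = User for every c in CS_P

@dataclass
class Consent:               # relates a user, a data sort, and purposes
    user: User
    data: type               # a personal-data sort from CS_P
    purposes: set = field(default_factory=set)
```

A concrete personal-data sort (e.g., Follow) would then simply subclass PersonalData, inheriting the owner attribute.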
The permission assignment (PA) in the security model is used to generate an enforcement mechanism for the specified FGAC policies. Namely, the mechanism is called each time an attempt is made to execute any of the actions from A.
The privacy model is used to generate multiple artifacts. Firstly, the privacy policy text presented to the user can be generated from the declared purposes. For example, the tuple (p, read(c, at), φ) ∈ PA_P can be presented as the policy: "If φ, then we will read at of your personal data c for the purpose p." In our concrete implementation, we provide a description field for every OCL constraint in our models, which is used when generating the privacy policy. If multiple tuples have the same action and constraint, their texts can be combined and the corresponding complex purpose shown.
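A minimal sketch of this text generation, assuming a hypothetical helper `policy_sentence` that consumes the components of one declared-purpose tuple together with the human-readable descriptions of its constraint and purpose:

```python
# Hypothetical sketch: render one declared-purpose tuple
# (p, read(c, at), phi) as a privacy-policy sentence.
def policy_sentence(purpose_desc, action, attribute, sort, constraint_desc):
    # "If phi, then we will read at of your personal data c
    #  for the purpose p."
    return (f"If {constraint_desc}, then we will {action} {attribute} "
            f"of your personal data {sort} for the purpose {purpose_desc}.")
```

In the real transformations, the `desc` fields attached to OCL constraints (see the JSON snippet in the evaluation) would supply the human-readable fragments.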
Similarly, the enforcement mechanism checking purpose limitation and consent is generated based on the specified declared and actual purposes, and it is called whenever any of the actions from A_P is attempted. In addition, the consent collection mechanism that prompts users (either upon a change of declared purposes, or lazily when user data is used upon the user's request) can also be generated automatically based on the declared purposes.
Finally, a purpose tracking mechanism can be generated to track the actual purpose during execution. It simply needs to maintain one (current) complex purpose per execution thread of the application. Whenever an interface method is invoked, the tracking mechanism adds the method's actual purpose to the current complex purpose and removes it once the method returns.
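The tracking mechanism described above can be sketched as follows. This is a simplified model of the behavior, not the generated code itself; the class name and methods are illustrative:

```python
import threading

# Sketch of the purpose tracking mechanism: one stack of actual
# purposes per execution thread; the current complex purpose (the
# CAP relation of Section 6) is the union of the stacked purposes.
class PurposeTracker:
    def __init__(self):
        self._local = threading.local()

    def _stack(self):
        if not hasattr(self._local, "stack"):
            self._local.stack = []
        return self._local.stack

    def push(self, actual_purposes):   # on interface-method entry
        self._stack().append(frozenset(actual_purposes))

    def pop(self):                     # on method return
        self._stack().pop()

    def current(self):                 # the current complex purpose
        return set().union(*self._stack()) if self._stack() else set()
```

Nested interface-method calls naturally accumulate their purposes into a complex purpose, and each return restores the previous state.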
The generation process described here is technology-agnostic. In Section 7 we describe two model transformations that we have implemented: one for ASP.NET web applications and the other for Python/Flask web applications. Clearly, based on the three models alone, only method stubs can be generated for the methods implementing custom business logic. Nevertheless, the other methods, which implement cross-cutting functionality like security and privacy policy enforcement and authentication, are fully implemented.
If the models' semantics allow it, the developers' implementations of the method stubs will execute successfully. Otherwise, they will throw a runtime exception. In fact, the generated enforcement mechanism would consistently throw exceptions whenever actions violate the specified models. This eliminates the risk of developers introducing implementation errors that violate users' privacy.

FORMAL SEMANTICS
We now provide a formal semantics for our models. Conceptually, the models formalize access control decisions of two types [16]: (1) declarative access control decisions, which depend only on static information defined in the models (e.g., roles in the permission assignment); and (2) programmatic access control decisions, which depend on the dynamic information captured by the current system's state (e.g., authorization constraint satisfaction). We extend both types of decisions to be purpose-based, namely, to depend on the privacy model (e.g., purposes in the declared purposes) and on the extended state of the system (e.g., existence of a Consent object).
We use first-order logic (FOL) [53] to formalize the semantics of our security and privacy models. FOL's syntax consists of logical symbols (e.g., ¬, ∧, ∃), as well as non-logical ones (e.g., predicate, function, or constant symbols) defined by a signature. FOL's semantics is defined with respect to a valuation ν of its free variables and a first-order structure S consisting of a carrier set and interpretation functions mapping non-logical symbols to predicates, functions, and constants, respectively. We write S, ν ⊨ φ if the FOL formula φ is satisfied by the structure S and the valuation ν. Instead of having a single homogeneous carrier set, order-sorted FOL extends FOL by considering multiple carrier sets, each containing elements of the same sort.
Static information (e.g., the relations PA and PA_P) is formalized as relations in a first-order structure S_s. The dynamic information is the content of a system state (i.e., an object model of the extended data model D′′), which we also formalize as a first-order structure S_d. The semantics of our security and privacy models is formalized by order-sorted FOL formulas φ_sec and φ_priv. An overall system state combines both static and dynamic information into a composite structure (S_s, S_d) obtained by combining S_s and S_d. Given a composite structure, the decision to allow an action is equivalent to checking whether (S_s, S_d), ν ⊨ φ_sec ∧ φ_priv for some valuation ν.
To define the two structures and the two formulas, we first define the respective signature Σ, partitioned into two parts Σ_s and Σ_d, representing static and dynamic information, respectively.
Signatures. Let the order-sorted signature for static information be Σ_s = (R ∪ P, ⪯_R, ∅, {PA, PA_P}), which contains a sort for each role (in R) and purpose (in P), the role hierarchy (⪯_R) as an order on the sorts, no function symbols, and two relation symbols (PA and PA_P). Intuitively, these sorts and relation symbols will be interpreted by the roles, purposes, and the (permission assignment and declared purposes) relations from our security (Section 3) and privacy (Section 4) models.
The signature for dynamic information Σ_d is built from a data model, which naturally induces an order-sorted signature with typed function and relation symbols [16]. For clarity, we further split this signature into two parts: the primitive signature Σ_p and the data model signature Σ_D. The primitive signature is fixed and independent of any particular data model. Let Σ_p = (PS, ⪯_p, Ω_p, ∅) be an order-sorted signature where the set PS contains the five primitive sorts (Section 3). The relation ⪯_p is a partial order on PS such that N ⪯_p Z ⪯_p F. The set Ω_p contains common function symbols on the primitive sorts, like conjunction (and : B × B → B), addition (+ : F × F → F), string length (size : String → N), etc. The signature has no relation symbols.
As indicated above, the structure with static information S_s maps each role (and purpose) sort in Σ_s to a set of strings such that the sets respect ⪯_R. The relation symbols in Σ_s are interpreted by the permission assignment relation (⟨⟨PA⟩⟩ = PA) and the declared purpose relation (⟨⟨PA_P⟩⟩ = PA_P), respectively.
Regarding the structure with dynamic information S_d, we start by interpreting the primitive sorts, and then the sorts defined by the data model. Conceptually, an object model induces a first-order structure that interprets the sorts defined by the data model. We now formally define the carrier sets for sorts defined by the data model, whose elements are called objects. Recall from Section 3 that an object of a sort c ∈ CS is a pair (id ↦ (c, I)). Since attributes are just function symbols, I is formally a set of functions {⟨⟨at⟩⟩ | at ∈ dom(attr(c))} interpreting the function symbols. Since object names are unique, a finite set of objects forms a (partial) function from object names to their interpretations. Hence the carrier set of a sort c ∈ CS is the set of all partial functions from object names to all possible attribute interpretations.

Model semantics. To capture the formal semantics of our models, we use the standard order-sorted FOL syntax over the signature Σ = Σ_s ∪ Σ_d, additionally extended with the unary relation symbol CAP : P modeling the current actual purposes. Since our semantics depends on authorization constraints expressed in OCL, we abuse notation and use the OCL constraints as subformulas in FOL formulas. This is sound as OCL is itself a FOL, only with a specialized concrete syntax. For clarity, we provide OCL's syntax and semantics in Appendix A.

The structure S on which we evaluate the formulas combines the structures S_s and S_d. Moreover, it is further extended to interpret the unary relation symbol CAP : P with the set of current actual purposes, an abstraction representing the state of our purpose tracking mechanism. The exact content of this relation depends on the execution history of the system and is specified later in this section; the semantics of our models is defined for any interpretation of the CAP relation symbol. Given a valuation ν defining a user u executing an action a on an object o (possibly) with an update value v, the formal semantics of the security model is captured by the order-sorted FOL formula φ_sec, in which the OCL expression self evaluates to the object o.
Given a valuation ν defining a user u executing an action a on a personal data object o of a sort c (possibly) with an update value v, the formal semantics of the privacy model is captured by the order-sorted FOL formula φ_priv, in which the OCL expression self again evaluates to the object o.
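As a sketch, under the assumption that φ_priv directly encodes the enforcement condition stated at the end of Section 4 (the paper's exact formulation may differ, and the names p, a, φ, u, v, o, c, k are the ones used there), the formula could be rendered as:

```latex
% Sketch only: a rendering of \varphi_{priv} consistent with the
% enforcement condition of Section 4; names are illustrative.
\varphi_{priv} \;\equiv\;
  \underbrace{\forall p \in \mathit{CAP}.\;
    \exists (p, a, \phi) \in \mathrm{PA}_P.\;
    \phi[\mathrm{caller}/u, \mathrm{value}/v]}_{\text{purpose limitation}}
  \;\wedge\;
  \underbrace{\exists k \in \mathrm{Consent}.\;
    k.\mathrm{user} = o.\mathrm{owner} \,\wedge\,
    k.\mathrm{data} = c \,\wedge\,
    \mathit{CAP} \subseteq k.\mathrm{purposes}}_{\text{consent}}
```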
The access control decisions can be specified based on the current system state. In contrast, the behavior of the purpose tracking mechanism (modeled by the CAP relation) depends on both the current and previous system states. Therefore, we model any system as a labeled transition system (LTS) Δ = (N, A′, →), where the set of nodes N contains all possible composite structures, the edges are labeled by the actions extended with the actions corresponding to method returns (i.e., A′ = A ∪ {return(c, m) | m ∈ dom(meth(c)), c ∈ CS}), and → ⊆ N × A′ × N is the transition relation. The transition relation allows only authorized actions and relates two structures based on the action taken, as expected. For example, for an authorized update action, the new structure is the same as the old structure except that the newly updated value is assigned; see Appendix B. The transition relation also updates the current actual purpose appropriately. Intuitively, whenever an authorized method is executed, its actual purpose is added to the set of current actual purposes. When the method returns, the actual purpose is removed.

IMPLEMENTATION
In this section, we describe two model transformations that implement our model-driven methodology in two popular programming languages, C# and Python, covering the code-generation process and the design choices for both transformations.
Based on the three models, our model transformations create either an ASP.NET web application in C# or a Flask web application in Python. From the extended data model, they generate a set of Entity Framework Core classes [3] in C# or SQLAlchemy [1] model classes in Python. This set of classes is automatically mapped to an appropriate SQL database schema by the two libraries.
To handle the concepts of users and roles and the authentication process, we rely on the ASP.NET Core Identity [4] and Flask-User [2] libraries. We generate the user class extended with the authentication library's user class (the IdentityUser or UserMixin class, respectively). Since the libraries do not support role hierarchies, we have encoded them manually in the authorization checks.
A class and the corresponding associations are also created for the purpose, personal data, and consent sorts. We configure the application such that the purposes defined in the privacy model are inserted into the database table corresponding to the purpose class whenever the database is initialized. Furthermore, a fully functional web page for collecting consent is generated in both our ASP.NET and Flask applications. The page contains the automatically generated text of the application's privacy policy and allows the user to consent to each declared purpose individually.
Each action in A is mapped to an existing method provided by the generated classes, e.g., for every sort c ∈ CS and attribute at ∈ dom(attr(c)), the getter and setter methods for the property at in the class c correspond to the actions read(c, at) and update(c, at), respectively. To enforce security and privacy policies, these methods need to be instrumented to call the enforcement mechanisms that implement the respective models' semantics. Such instrumentation can be achieved using different technologies. In C#, we annotate these methods with the [Secured] label, whose definition relies on the PostSharp C# aspect-oriented programming (AOP) library [55] to call the enforcement mechanisms. In Python, we instrument methods via a function decorator @secured that calls the enforcement mechanisms directly. If an action is permitted, according to both the security and privacy models, then the system executes it; otherwise, an enforcement mechanism raises an exception. If the developers do not handle the raised exception, a generated default exception handler is called that displays which permission was denied.
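The decorator-based instrumentation can be sketched in Python as follows. Here `secured`, `PermissionDenied`, and the `check` callback are illustrative stand-ins for the generated @secured decorator and enforcement mechanisms, not the paper's actual code:

```python
import functools

# Sketch of decorator-based instrumentation: every getter/setter runs
# the enforcement check before the underlying method body.
class PermissionDenied(Exception):
    pass

def secured(check):
    """Wrap a method so the enforcement check runs first.

    `check(obj, method_name, args)` stands in for the generated
    security/privacy enforcement mechanisms.
    """
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if not check(self, method.__name__, args):
                raise PermissionDenied(method.__name__)
            return method(self, *args, **kwargs)
        return wrapper
    return decorate
```

A getter would then be written as `@secured(enforce)` above `def get_text(self): ...`, so a disallowed access raises PermissionDenied instead of returning data.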
To enable the purpose tracking mechanism to track the current actual purposes, we extend the label (decorator) to include the labeled (decorated) method's actual purpose as a parameter. The parameter is a constant specified during code generation. The definition of the label and decorator additionally includes code that pushes the supplied parameter onto a thread-local stack of purposes when the interface method is called and pops it when the method returns. The stack content corresponds to the content of the CAP relation (Section 6).
Finally, our model transformations generate code for the security and privacy enforcement mechanisms. The code directly implements the set-theoretic semantics presented in Section 6. To support security and privacy policy evolution, our model transformations can also regenerate just the enforcement mechanisms' code, while the application's business logic remains unchanged.
Threat model and assumptions. Our approach makes as few assumptions about the system as possible. In particular, for the designers to specify an application's privacy policies (i.e., to define a privacy model), they must know the structure of the application's state (i.e., the data model) and the application's interface methods (i.e., the domain of the annotate function). Having a data model of an application at the design phase is a realistic assumption.
Our approach relies on application designers (i.e., software, security, and privacy engineers) to correctly specify the application's well-formed states and its security and privacy requirements. The problem of enforcing these requirements becomes extremely challenging if application designers or developers are malicious, and then requires some form of trusted platform technology to ensure that a correct version of the enforcement mechanism code runs [56]. We therefore consider application designers to be trustworthy. Additional design validation techniques, like model checking or manual model inspection, can help justify this assumption.
In contrast, the application developers, while not malicious, may unintentionally deviate from the design by making implementation errors. Our methodology and tools aid well-intentioned developers in systematically and consistently enforcing security and privacy policies. As these policies have a cross-cutting effect on system execution, generating enforcement mechanism code and instrumenting the relevant actions effectively connects privacy-relevant design and implementation artifacts, thereby enabling privacy-by-design.
Overall, our approach helps organizations mitigate the risk of GDPR violations caused by accidental implementation errors.

EVALUATION
We have evaluated our model-driven approach using the generators described in Section 7. In particular, our evaluation aims to answer the following groups of research questions (RQs):
RQ1: Does our approach generalize across different implementation technologies and application domains?
RQ2: How much developer effort is required to use our model-driven approach compared to a manual implementation? What is the ratio between the generated and manually written code?
RQ3: How much runtime overhead does our approach incur? How does it compare to a manual implementation and to state-of-the-art approaches? How does it scale as the size of an application's input workload and state grow?
We answer the above questions by implementing three realistic applications from different business domains as case studies (RQ1): (i) a social networking site, MiniTwit; (ii) a conference management system, ConfMS; and (iii) a health record manager, Hipaa. Each application is implemented multiple times. ConfMS is implemented using both of our model transformations, targeting different implementation technologies (RQ1). To assess the development effort (RQ2), MiniTwit is implemented manually twice, in addition to using our model transformation targeting Python. The first manual implementation is the original MiniTwit implementation [61], which does not enforce any security or privacy policy. This baseline implementation allows us to assess the (worst case) runtime overhead of our generated enforcement mechanisms (RQ3). To the best of our knowledge, no state-of-the-art approach automatically enforces privacy policies. We nevertheless compare the overhead of our approach to Jacqueline [64], a state-of-the-art framework for security policies (RQ3). The Hipaa application was chosen as it has an independent implementation in Jacqueline. Table 1 summarizes the names, descriptions, and developers of all the implementations of the three applications. The code and the deployment instructions for each of them can be found under the respective subdirectory in our publicly available artifact [45].
We now describe each application in detail and afterwards we present our evaluation results.

Case studies
Social networking site. MiniTwit is a Twitter clone with an open-source implementation [61]. In short, registered users can write new messages, follow or unfollow users, and view their own timelines. Moreover, to assess privacy policy enforcement, we extend its functionality to support in-application advertisements displayed on users' timelines.
We denote this implementation as MiniTwit-baseline, as it forgoes security and privacy policy enforcement and implements only MiniTwit's business logic. Next, we introduce MiniTwit's security and privacy policies. We denote by MiniTwit-secured and MiniTwit-flask the two MiniTwit implementations that enforce the security and privacy policies. The former extends the baseline application with a manual implementation that directly enforces the policies, while the latter follows our approach using our Python model transformation (Table 1). Here, the authorization constraint φ is (self.author.oclIsUndefined() and value = caller).
We define the personal data sorts CS_P as {User, Follow}, i.e., the users' basic information (like their gender and age) and the sets of users they follow. The application has two purposes, P = {GenerateAds, DisplayPosts}, and the following privacy policy: (DP1) We will read your basic information, namely your age and gender, to generate and display advertisements that you may find interesting, modeled as the declared purpose tuples (GenerateAds, read(User, age), ⊤) and (GenerateAds, read(User, gender), ⊤). (DP2) We will read your followers to populate your timeline with posts, modeled as (DisplayPosts, read(User, follows), ⊤). The following JSON snippets show the definition of the security policy for updating a message's text (as in AC2) and of the privacy policy for reading the user's age for the purpose of generating advertisements (as in DP1). The concrete syntax used to define all our models is JSON.

% Security policy, permission assignment from AC2
{ "role": "RegUser",
  "action": "update",
  "resource": { "class": "Message", "attribute": "text" },
  "constraint": "self.author = caller" }

% Privacy policy, declared purpose from DP1
{ "purpose": "GenerateRelevantMarketingEntities",
  "action": "read",
  "resources": [ { "class": "User", "attribute": "age" } ],
  "constraint": { "ocl": "true", "desc": "true" } }

Observe that the JSON input to our model transformations is isomorphic to the model definitions presented in Sections 3 and 4.
Conference management system. ConfMS is an extension of our running example in this paper. As described in Example 1, a registered user in this application can have either a normal role, or be on the program committee, or be the chair. Normal users can edit their profile information, search and view accepted papers, and submit new papers. Program committee users can delegate the review of a paper to some users. In addition, users with the chair role can accept papers for publication.
We define our three models for this conference management system. For space reasons, we omit the details. In a nutshell, the data model consists of 2 classes and 3 associations (Figure 1), the security model consists of 3 roles and 14 FGAC policies, and the privacy model has 3 declared and 3 actual purposes, as depicted in Examples 6 and 7 and Figure 3. These models are input to our code generators to generate C# and Python applications with method stubs. We straightforwardly implement the stubs' business logic following the activities specified in Figure 3. We denote these two applications by ConfMS-asp and ConfMS-flask (Table 1).
Health record manager. Hipaa is a web application that allows users with different roles to view personal medical records. We define three models corresponding to the underlying database and implementation of the similar case study conducted by Yang et al. [64] and compare the performance results. Due to space limitations, we only report the sizes of the models, which are available within our artifact [45]. Hipaa's data model contains 10 classes and 13 associations, its security model contains one FGAC policy, and its privacy model declares one declared and one actual purpose.
We denote by Hipaa-jacqueline the HRM application implemented by Yang et al. [64], and by Hipaa-flask the Python application that our model transformation generates.

RQ1: Generality
To answer RQ1, we have shown in Section 7 that our model-driven approach can target multiple programming languages, namely C# and Python. Furthermore, we claim that any language can be targeted. Languages supporting the implementation of cross-cutting concerns, like AOP [33, 34] or function decorators [42], are particularly convenient for implementing the model transformations, and such transformations produce code that is easy to maintain and evolve. Existing support for authentication and authorization also helps in this respect, as the relevant libraries can be directly targeted by the model transformations. Finally, by implementing the three case study applications, we show that our approach can be used in different application domains.

RQ2: Development effort
To answer RQ2, Table 2 provides statistics on the different implementations of the ConfMS application. At the top, the table shows the sizes of our three models of the ConfMS application. A data model's size is the sum of the numbers of defined sorts, attributes, methods, and associations. A security model's size is the size of its permission assignment relation, whereas a privacy model's size is the sum of the numbers of declared and actual purposes. Below the model sizes, the table shows the number of generated lines of code (# LoC) for the ConfMS-flask (Python and HTML code) and ConfMS-asp (C# and HTML code) implementations, as well as the lines of code needed to implement the method stubs manually. Developers need to write only 16% of the overall codebase manually in Python and 15% in C# to obtain functional web applications. The absolute difference in the number of lines of code written manually is only 202, which makes the manual effort across the different technologies comparable. The difference in the ratios of manual to generated code between the two technologies is due to the Model-View-Controller architecture of ASP.NET web applications, which significantly increases the codebase. We do not report numbers for the generated authentication mechanism, as it is handled by the respective libraries, which require only a few configuration parameters. The most significant part of the generated code is for the enforcement mechanisms: 21% of the total Python code and 30% of the total C# code. We also assess the overall effort of developing the MiniTwit application manually (MiniTwit-secured) compared to using our approach (MiniTwit-flask). The MiniTwit-secured application consists of 202 lines of Python and 137 lines of HTML code, whereas MiniTwit-flask is just 172 lines of Python and 119 lines of HTML code. The remaining MiniTwit-flask code is generated from models with a total size of 29 (its data model size is 15, security model size is 12, and privacy model size is 2). This
provides evidence that the combined effort of the designers and developers using our approach is significantly lower than the effort of developers manually implementing the application. Furthermore, when the security or privacy policies change, our model transformation regenerates correct enforcement mechanisms automatically, whereas a manually implemented application may require non-trivial changes.

RQ3: Runtime overhead and scalability
We deploy the three versions of the MiniTwit application and execute them to measure the overhead incurred by enforcing security and privacy requirements using our methodology. We then deploy the Hipaa implementations to compare our approach to state-of-the-art security web frameworks. We measure the time taken between sending a request to the web application and receiving a response back. The execution time is averaged over 10 executions.
In particular, as input to the applications we use a workload where a registered user logs in and then views their own timeline (by invoking the public_timeline method). Each MiniTwit instance is deployed with 4 users and a variable number of messages in its state. Message content is randomly sampled, and its assignment to a user is randomly determined. One of the users is set to follow all the other users, and we generate the workload on behalf of that user. The user's followers are considered personal data. By controlling the number of messages present in the state of the application, we can assess our approach's scalability. Figure 5 shows the execution times of the three MiniTwit implementations on the above workload. All the implementations scale linearly with the number of messages. Our approach clearly adds overhead, as its linear factor is larger than those of both the baseline and the manually secured implementation. The baseline implementation does not implement any policy checks, hence its execution time is a lower bound for the other two implementations. The manually secured implementation is more efficient, as it performs only two checks (one for security and one for privacy policy compliance) before outputting the list of messages to the public timeline. This is prone to errors, as it assumes that the subsequent code correctly queries and outputs the right messages. In contrast, our approach performs both checks whenever a message is accessed within the public_timeline method's implementation, ensuring that the developer only manipulates data in compliance with the policies.
Although the added overhead is comparatively high, it is unnoticeable from a user's perspective. In particular, our implementation's execution times are below one second for any number of messages. This is ensured because MiniTwit implements a 30-item pagination for all lists shown to the user. Therefore, our approach's performance stabilizes below 0.3 seconds for all executions involving more than 30 messages in the application's state.
When running the Hipaa implementations (hipaa-jacqueline and hipaa-flask), we initialize their state with a single doctor and a variable number of randomly sampled patients. We then run a workload where the doctor logs in and opens the application's initial page (by invoking the index method), which shows all their patients and the hospitals where they work. In this case, only the patient information is considered personal data. Figure 6 shows the execution time of the two Hipaa implementations on the above workload. While our approach still scales linearly in the number of patients, Jacqueline's performance is superlinear. Both approaches perform well enough to be used with a 100-item pagination, which the Hipaa application does not implement. The significant difference in performance between the two approaches is caused by the performance-intensive faceted execution [12] employed by Jacqueline, which enforces stronger information-flow security policies.

CONCLUSIONS
In data protection, a privacy policy is an informal legal document describing how a system gathers, uses, discloses, and manages personal data. We have introduced the well-known and widely mandated classes of privacy policies involving purpose limitation and consent into the system design phase. Namely, we propose a privacy model that formalizes these classes of privacy policies. We define the model's semantics and implement model transformations that generate web applications in C# and Python with a complete, configured enforcement infrastructure for their specified privacy policies. By generating this (often critical) enforcement code, our approach reduces the risk of privacy policy violations due to implementation errors. We evaluate our approach and demonstrate its broad scope and applicability, as well as its modest overhead.
In the future, we plan to extend our methodology to incorporate other data protection requirements such as data minimization (Art. 5 §1 (c) GDPR), storage limitation (Art. 5 §1 (e) GDPR), and accountability (Art. 5 §2 GDPR). We also plan to extend our methodology to enforce information-flow security policies on particularly sensitive classes of personal data to prevent inference attacks. Models with formal semantics enable the analysis of both the models themselves and the transformation functions. We plan to carry out automatic property checking of privacy models to detect and correct design errors and to provide feedback to developers during method stub implementation. We also plan to fully verify the correctness of our model transformation functions.

Definition 2. An object model of a data model D is a triple (O, M, R), where O is a finite set of objects, M maps each method name m ∈ dom(meth) to a meth(m)-typed function, and R maps each association name a ∈ dom(asc) to an asc(a)-typed relation over the objects O.
Example 3. Consider the object diagram shown in Figure 1 (right) for the corresponding class diagram in the same figure. It shows three researchers, with names r1, r2, and r3, representing two students and their adviser. They have authored a paper p, which is neither published nor reviewed yet. The corresponding object model (O, M, R) is
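This object model can be written down directly with Python containers. The sketch below is only an illustrative rendering of the triple (O, M, R); the concrete attribute values (the researchers' names, the paper's published flag) are assumptions based on Figure 1.

```python
# O: the objects — three researchers and one paper.
O = {"r1", "r2", "r3", "p"}

# M: method names mapped to functions over objects.
names = {"r1": "r1", "r2": "r2", "r3": "r3"}
M = {
    "name": lambda o: names[o],
    "published": lambda o: False,  # p is neither published nor reviewed
}

# R: association names mapped to relations over O.
R = {
    "authors": {("p", "r1"), ("p", "r2"), ("p", "r3")},  # all three authored p
    "advisers": {("r1", "r3"), ("r2", "r3")},            # r3 advises r1 and r2
}

# Querying the relations: who authored p?
authors_of_p = sorted(b for (a, b) in R["authors"] if a == "p")
print(authors_of_p)
```
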

Figure 1: UML class (left) and UML object (right) diagrams of the conference management system from Example 1.

Example 5.
Consider the requirements from Example 1 and the data model D from Example 2, which we extend to D′ with the User entity such that ⪯D′ is the same as ⪯D except that Researcher ⪯D′ User. There are three roles R = {Normal, Committee, Chair} with Normal ⪯R Committee ⪯R Chair. In addition, the security model expresses the following security requirement from Example 1: a user with role Normal can read an unpublished paper's title only if the user has no conflict with any of the paper's authors. The triple (Normal, read(Paper, title), φ) ∈ PA models this requirement, where φ is the following authorization constraint: self.published or self.authors→forall(a | caller.advisers→excludes(a) and a.papers→forall(p | not p = self and 2024 − p.year < 2 implies p.authors→excludes(caller))).
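Read operationally, the constraint can be sketched as a Python predicate. The dictionary-based object shapes below are assumptions, and in our approach such checks are generated from the model rather than hand-written.

```python
def can_read_title(caller, paper, current_year=2024):
    # An unpublished paper's title is readable only if the caller has
    # no conflict with any author: the author is not one of the
    # caller's advisers, and the two have not co-authored a paper
    # less than 2 years old.
    if paper["published"]:
        return True
    for author in paper["authors"]:
        if author in caller["advisers"]:
            return False
        for other in author["papers"]:
            if (other is not paper
                    and current_year - other["year"] < 2
                    and caller in other["authors"]):
                return False
    return True

alice = {"advisers": []}
bob = {"papers": []}
paper = {"published": False, "authors": [bob], "year": 2024}
bob["papers"].append(paper)
no_conflict = can_read_title(alice, paper)

# A recent co-authored paper creates a conflict of interest.
shared = {"published": True, "authors": [alice, bob], "year": 2023}
bob["papers"].append(shared)
conflict = can_read_title(alice, paper)
print(no_conflict, conflict)
```

The nested loops mirror the two nested forall iterators of the OCL constraint, with an early return wherever the constraint's body would be falsified.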

Figure 3: UML Activity diagram of the conference management system

Figure 4: UML Class diagram corresponding to the extended data model of the conference management system

Example 9.
Suppose that, based on the data model from Figure 1, security and privacy engineers have identified that the sort Researcher both represents the system's users and contains personal data. The extended data model generated by our approach is then shown in Figure 4. Namely, the diagram contains three additional classes. The User class identifies Researcher as a sort representing users and declares additional attributes and methods useful for the subsequent generation of authentication code. The PersonalData class identifies Researcher as the only sort containing personal data and declares the additional attribute owner. Finally, the Consent class captures the consents that users have provided by declaring the attributes user, data, and purposes. In the diagram, attributes that refer to classes in the diagram are shown as unnamed associations.
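A hypothetical Python rendering of these three generated classes could look as follows; the field types and defaults are assumptions, and the attributes referring to other classes mirror the unnamed associations in the diagram.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    # Marks Researcher as the sort representing users and carries
    # extra state for the generated authentication code.
    username: str
    password_hash: str = ""

@dataclass
class PersonalData:
    # Marks Researcher as the only sort containing personal data.
    owner: User = None

@dataclass
class Consent:
    # A consent a user has provided: which data, for which purposes.
    user: User
    data: PersonalData
    purposes: list = field(default_factory=list)

alice = User("alice")
record = PersonalData(owner=alice)
consent = Consent(user=alice, data=record, purposes=["Research"])
print("Research" in consent.purposes)
```
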

An object model (O, M, R) of a data model D induces a first-order structure where O defines the carrier sets for D's sorts, M defines the functions for D's methods, and R defines the relations for D's associations. Otherwise, the interpretation functions are as defined above.

Figure 6: Performance of different Hipaa implementations


Table 1: Summary of the case study implementations.
The MiniTwit application has only one role, the registered user RegUser, and the following FGAC policies: (AC1) Users can create new messages, modeled as the permission assignment tuple (RegUser, create(Message), ⊤). (AC2) Users can initialize themselves as the author of a message and update its publication date and text, modeled as (RegUser, update(Message, author), ), (RegUser, update(Message, pub_date), self.author = caller), and (RegUser, update(Message, text), self.author = caller).
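These permission assignments can be sketched as (role, action, constraint) triples checked at access time. The encoding below is illustrative: TRUE stands for the ⊤ constraint, actions are plain strings, and the author-initialization assignment from AC2 (whose constraint is elided above) is left out.

```python
# Constraints are predicates over the caller and the target object.
TRUE = lambda caller, obj: True
author_is_caller = lambda caller, msg: msg.get("author") is caller

PA = [
    ("RegUser", "create(Message)", TRUE),                        # AC1
    ("RegUser", "update(Message, pub_date)", author_is_caller),  # AC2
    ("RegUser", "update(Message, text)", author_is_caller),      # AC2
]

def allowed(role, action, caller, obj=None):
    # An access is permitted if some matching triple's constraint holds.
    return any(r == role and a == action and c(caller, obj)
               for (r, a, c) in PA)

u1 = {"name": "u1"}
u2 = {"name": "u2"}
msg = {"author": u1, "text": "hi"}
print(allowed("RegUser", "update(Message, text)", u1, msg))  # author
print(allowed("RegUser", "update(Message, text)", u2, msg))  # not author
```
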