Simplified models: a different perspective on models as mediators

We introduce a novel point of view on the “models as mediators” framework in order to emphasize certain important epistemological questions about models in science which have so far been little investigated. To illustrate how this perspective can help answer these kinds of questions, we explore the use of simplified models in high energy physics research beyond the Standard Model. We show in detail how the construction of simplified models is grounded in the need to mitigate pressing epistemic problems concerning the uncertainty inherent in the present theoretical and experimental contexts.


Introduction
The philosophical literature on models in science has drawn attention to a variety of issues of great importance for understanding the practice of science (Bailer-Jones 1999; Frigg and Hartmann 2012), as numerous case studies attest. 1 One of the key insights, emphasized both by Cartwright (1999) and Morrison (1999), is that a distinctive feature of models is their partial independence from both theory and data: in a sense explicated by Morrison and Morgan (1999), they mediate between them. Morrison and Morgan argue that this autonomy is obtained through the construction of models: models are neither derived from theory nor simple representations of data. It is this autonomy, they claim, that affords us the possibility of learning from models, whether about phenomena or theory.
Morrison and Morgan's account emphasizes the construction and representational function of models, for it is these features that they claim ground the autonomy and functions of models. Since "there are no rules for model building" (Morrison and Morgan 1999, 31), and modeling depends to some extent on the creativity of model builders or is simply a tacit skill of some scientists (Morrison and Morgan 1999, 12), philosophical studies of models tend to begin with constructed models "in hand" and proceed to inquire into the practical application of these models or their metaphysics, e.g. their nature, their ability to represent, and how to distinguish them from theories (Frigg and Hartmann 2012). While these topics are clearly of philosophical interest, their predominance has resulted in relatively little attention being paid to equally important epistemological questions about models (Frigg and Hartmann 2012, §3), e.g. how knowledge about models is converted into knowledge about phenomena, why autonomy should afford the possibility of particular kinds of learning, and why individual models are constructed with the particular degrees of independence from theory and experiment that they end up having. These last two questions are the ones taken up in this paper, in the interest of contributing to a better understanding of the epistemological aspects of modeling.
Our starting point for investigating the link between autonomy and learning is the simple observation that models may be autonomous from theory and data to varying degrees and in different respects. This naturally leads one to wonder what accounts for modelers' choices in model construction, since these are the very choices, according to Morrison and Morgan's account, which should result in autonomy of a particular degree and respect. Although some degree of creativity and skill is no doubt required in the construction of models, it is not unreasonable to suppose that there are always salient, justifiable grounds (albeit context-specific ones) which help determine particular choices in model construction. 2 Insight into these grounds can plausibly be obtained by attending to the particular learning aims of the scientists involved. These aims inform the requisite functions of a model, which then determine (to some extent) how the model should be constructed. This perspective on models thus neatly reverses the scheme given by Morrison and Morgan, which moved from the construction of models to their autonomy to their function as tools of learning. In adopting it here we do not see ourselves as offering a competing account to Morrison and Morgan's but rather a complementary point of view on models as mediators.
We illustrate this approach through a particular example drawn from recent work in high energy physics concerning Beyond Standard Model (BSM) searches at the Large Hadron Collider (LHC) at CERN. The models that we draw attention to are known as "simplified models". 3 They have seen wide application in recent years at the LHC's two main experiments, ATLAS and CMS, since their introduction in the late 2000s. 4 They are perfect examples of "models as mediators", demonstrating partial autonomy through their construction and a variety of functions which contribute to their utility in learning about potential new phenomena and about theory. The multiplicity of their relations to theory and experiment, namely to quantum field theory (QFT), the Standard Model (SM) of particle physics, speculative BSM ideas like supersymmetry (SUSY), and collider data from the LHC and Tevatron experiments, also permits a study of the variety and degree of independence from these in concert with simplified models' various intended functions. This variety makes simplified models an attractive and nuanced example for illustrating how modeling choices may be determined and grounded in the larger epistemic context of a particular research area.
Our principal claim concerning simplified models in this paper is that they are constructed in order to alleviate certain important difficulties posed by the current epistemic context in high energy physics. The two primary problems faced by BSM searches presently are the lack of reliable theoretical guidance and the lack of any experimental discrepancies between SM predictions and collider data. One might think that these circumstances together suggest that BSM physics is a bit of a will-o'-the-wisp, but there are many reasons (of varying strength) to expect that there is physics beyond the SM, including the existence of dark matter (Zinkernagel 2002; Smeenk 2013), naturalness and the hierarchy problem (Giudice 2008; Grinbaum 2012; Williams 2015), and unification (Maudlin 1996; Wayne 1996; Morrison 2000; Li 2003). There are various proposals for solving these problems, the most prominent of which is the assumption of supersymmetry (described in Section 3). SUSY is a speculative idea which remains empirically unconfirmed, and the theoretical arguments for it to appear at the relatively low energies testable by the LHC are not especially strong (although many theoretical physicists do find them compelling). The highly speculative nature of BSM physics in general means that theory is presently not a reliable guide for BSM experimental searches at colliders. Thus, where to look and what to look for are quite unconstrained in such searches, due to the epistemic context of current high energy physics.
This present context contrasts with the circumstances surrounding the search for the SM Higgs, where there were plentiful data indicating that the SM (of which the Higgs is a crucial part) was correct and strong theoretical arguments for the Higgs's existence. 5 In other words, where to look and what to look for were strongly constrained in the case of the Higgs search, unlike the present circumstances in BSM searches.
The paper is organized as follows. To begin (Section 2) we briefly motivate our investigation by introducing the models as mediators framework and showing how the novel perspective we introduce leads to the epistemological question we take up in the paper. We then introduce simplified models (Section 3), explaining their construction as "incomplete" effective field theories that abstract from various BSM physics scenarios and providing a concrete example. In Section 4 we describe the functions of simplified models from the perspective of the theoretical physicist and show how simplified models are motivated by epistemic difficulties on the theory side. In Section 5 we describe the functions of simplified models from the perspective of the experimental physicist and show how simplified models are also motivated by epistemic difficulties on the experimental side. Section 6 then briefly remarks on the application of the ideas presented in the previous section to epistemic issues concerning Big Data science. We provide some concluding remarks in Section 7.

A different perspective on models as mediators
The principal concern of Morrison and Morgan (1999) is to articulate an account of the autonomy of models (from both theory and phenomena) in order to show how they may function as instruments of learning. Their account is motivated by the recognition that "there is a significant connection between the autonomy of models and their ability to function as instruments" (Morrison and Morgan 1999, 10). They ground the autonomy of models in their construction and their ability to represent. Accordingly, the account outlines the four main aspects of models mentioned: their construction, their functioning (as instruments), how they represent, and how they are used as tools of learning. In this section we briefly introduce these four components and their connections as presented in Morrison and Morgan (1999), then show how to re-orient their account as a way to highlight how learning drives functioning and construction.
First, a brief overview of how these components are connected in the Morrison and Morgan account. They begin with model construction, for on their view it is because models are constructed partially from theory and partially from data that they are partially independent from each. It is precisely in this sense that they mean models are (partially) autonomous. This autonomy-through-construction allows models to function, as they say, like a tool or instrument. They also emphasize that a model's representative capacity or capability is crucial for it to function autonomously in this way. Finally, they state that learning from or through a model is facilitated by constructing it and using it. Indeed, they suppose that autonomous functioning is required for learning about and mediating between theory and world with models. The following illustration pictures these main components and their direct relations:
[Diagram: construction, representing, functioning, and learning]
In short, the Morrison and Morgan account of models indicates how partial independence from theory and data in the construction of models grounds the autonomous functioning of models, which is necessary in order to learn from models about theory and data. A model's capacity to represent the world is also necessary for its autonomous functioning, but as representation is not particularly relevant to our concerns in this paper, we will mostly set aside its role in the following. 6 Let us therefore prune the previous diagram slightly to the components that will be of primary interest:
[Diagram: construction, functioning, and learning]
Morrison and Morgan do not actually provide an argument per se for the claim that autonomy, obtained by construction and manifested in functioning, is necessary for learning. There is a reasonably straightforward way to understand it, however, and no doubt this is what they had in mind. Consider first a simple model that divides the epistemic products of science into two categories: theory and data.
The mutual autonomy of theory and data grounds the possibility of learning in the following way. To subject a theory to test, it must be possible for the data to be consistent with its predictions and it must be possible for the data to be inconsistent with them. The data must be autonomous with respect to any given theory, in other words, so that we can learn something about the theory by performing experimental tests, e.g. whether it is true or at least empirically adequate. Conversely, given some experimental data, some theories are consistent with it and some are not: empirical data underdetermine theory. The theories are importantly autonomous with respect to any given empirical data, so we can learn something about the data by theorizing, e.g. what the data represent or even what data are possible to obtain. This mutual autonomy, and hence the possibility of learning from it, can be easily extended to the view of scientific products which includes models: models and theories are mutually independent because (a) theory does not determine a particular model and (b) particular models underdetermine theory; models and data are mutually independent because (c) data are not determined by models and (d) particular data underdetermine models of that data.
Given these comments, it is plausible to suppose that models possess different degrees and kinds of autonomy with respect both to theory and to data. What, though, determines the kinds and degrees of a model's independence (and dependence)? While some may insist that the construction of a model is an "art" and nothing can be gained from pursuing an investigation along this line, it is plausible to suppose that there are at least some relevant epistemic considerations which significantly influence the construction of models. As noted in the introduction, we suggest that the natural starting point is to look at what scientists hope to learn from their models. This thought motivates re-orienting our perspective on models as mediators, from the direction highlighted by Morrison and Morgan, to the reverse direction:
[Diagram: learning, functioning, and construction]
More explicitly, the basic suggestion is that valuable insight into the autonomy and construction of models is to be had by investigating first what scientists wish to learn from a model. In other words, we aim to show how scientists' epistemic goals can (partially) determine how a model should function and hence how it should be constructed. This does not mean that construction is necessarily a mere pragmatic matter, however, wholly dependent on the particular aims of individual scientists or groups of scientists. The relevant aims may be grounded in, among other things, objective assessments of the viability of (or confidence in) present theories, models, and data, which assessments suggest what a novel model should depend on and what it should be independent of. To elucidate the foregoing ideas, we turn to our concrete example, simplified models, in the following sections.

Simplified models
Simplified models have emerged in recent years as a useful class of models in high energy physics, particularly because of how they mediate between the data collected at particle colliders and the theoretical scenarios explored in BSM physics (as we will explain in the following). A "simplified model" in this context is an extension of the SM that adds only a couple of new hypothetical BSM particles to the SM, along with decay chains of these particles into BSM and SM particles (Alwall et al. 2009; LHC New Physics Working Group 2012). They are "simplified" because, unlike full BSM models, which typically introduce numerous particles and decay chains, each simplified model only introduces a small handful of new experimental parameters to the SM: the masses of the posited BSM particles and a few cross sections and branching ratios. 7 It is important to stress that probably no physicist thinks that a simplified model is a realistic model of BSM physics. 8 For this reason we will distinguish them from ("full") BSM physics models, which purport to represent actual physics beyond the Standard Model. Despite being unrealistic, simplified models do have a certain utility, which is to be explained by their being embedded in relations of dependence and independence to possible BSM physics and collider experiments.
Physicists expect that BSM particle physics will include both novel phenomena at extremely high energies (presumably described by some theory of quantum gravity) as well as lower energy phenomena that may be detectable by the LHC in the near future. The complete and correct underlying BSM theory (if any such theory indeed exists) presently remains beyond the reach of current physics, both theoretically and experimentally. For the most basic purposes of particle physics experiments, however, all one needs is a low energy effective field theory (EFT) that describes particle phenomena at the energies relevant for collider experiments. The SM, for example, is plausibly a low energy EFT of a higher energy EFT or full BSM theory. Now, if the complete BSM theory (or at least a higher energy EFT than the SM) were in hand, then in principle one could construct an appropriate EFT by "integrating out" the undetectable higher energy particles and modifying the remainder of the theory to account for their effects on the detectable low energy physics (Cao and Schweber 1993, 64). Since no such theory is in hand, theoretical physicists generally proceed by developing plausible EFTs on the usual physical grounds, e.g. symmetry principles and physically reasonable constraints (Bain 2013). This EFT "philosophy" is widespread in contemporary physics because it is "a practical and convenient way of proceeding in describing natural phenomena" (Castellani 2002, 265), without the need for developing a fully fleshed out fundamental theory. 9 Simplified models are EFTs, since they do model low energy particle phenomenology, i.e. experimentally detectable particle behavior, in a quantum field theoretic description. Although they are not constructed on the aforementioned usual physical grounds, they are nevertheless motivated primarily by theoretical (especially SUSY) considerations and expectations.
For this reason they are theory-driven models, echoing the theory-driven methodology of (a lot of) physics. 10 Although experimental considerations drive the construction of simplified models in important ways (described below), they are certainly not phenomenological models, 11 simply because there are presently no recalcitrant experimental data (vis-à-vis the Standard Model) upon which to construct one. Thus, being motivated by SUSY phenomenology is quite different from being a phenomenological model. SUSY phenomenology motivates the construction of simplified models by suggesting possibly detectable particle signatures which can be captured in a simplified model. By contrast, phenomenological modeling in particle physics would have to start from detected particle signatures.
Simplified models are not expected (due to their simplifications) to be candidates to succeed the SM; they are rather taken to be (in some sense) "incomplete" models. Indeed, no plausible physical principle would suggest that the correct EFT describing particle phenomenology at the LHC would include just the particular particles and decay channels of any individual simplified model. It may seem "counter-intuitive" to construct such "deliberately incomplete models" (Alwall et al. 2009, 2) in this way; nevertheless, our discussion of the functioning of simplified models in the following sections, building in particular on the original ideas from Arkani-Hamed et al. (2007) and Alwall et al. (2009), shows why simplified models are methodologically sensible to adopt in this exploratory epistemic context. They are, in short, primarily constructed as heuristic tools.
We emphasize that simplified models are not purely arbitrary constructions. The simplified models originally introduced in Alwall et al. (2009), for example, are motivated by SUSY phenomenology (SUSY models being the most popular among the BSM physics models), as are most of the many simplified models constructed since (LHC New Physics Working Group 2012). Nevertheless, simplified models are to some extent "model-independent" in the sense that they do not presuppose that supersymmetry actually exists, nor do they reflect the particular constraints or principles of any particular theoretical model incorporating SUSY. Indeed, the utility of simplified models does not necessarily depend on the eventual validation of any SUSY model. Thus, even if supersymmetry turned out to be false, simplified models would still be useful for characterizing BSM physics, as the particles and decay channels in simplified models are also relatable to a variety of non-SUSY BSM phenomenology (LHC New Physics Working Group 2012). So long as the products of the modeled decays are detectable as hadronic jets, missing transverse energy, or leptons (all of which are detectable at collider experiments), the SUSY-motivated simplified models can still be used to characterize BSM phenomenology (Alwall et al. 2009, 1). More unusual phenomenology that does not fit this mold, lumped together under the label "exotica" in particle physics, e.g. displaced vertex signatures, lepton jets, "weird" tracks, etc., has also motivated the introduction of other simplified models besides those motivated from SUSY physics (LHC New Physics Working Group 2012, §VII). The guiding idea behind developing the set of simplified models is to cover all the anticipated BSM phenomenology that is both theoretically possible and experimentally detectable at the LHC (and other potential collider experiments as well).
Fig. 1 A leptonic decay simplified model
Before looking at an example of a simplified model, it will help to say a little about SUSY and SUSY terminology. Supersymmetry is popular among theoretical physicists and can be motivated in several ways: as a solution to the naturalness and hierarchy problem (Giudice 2008; Grinbaum 2012; Williams 2015), to help with grand unification (Maudlin 1996; Wayne 1996; Morrison 2000; Li 2003), to explain dark matter (Zinkernagel 2002; Smeenk 2013), etc. In SUSY models, every SM particle has a partner related by supersymmetry (a "superpartner"), i.e. a particle that shares many of its properties but differs in its intrinsic spin by one-half. 12 Thus each boson (integer spin) has a fermion (half-integer spin) as a superpartner, and each fermion has a boson as a superpartner. The superpartner bosons' names are just the names of the SM fermions preceded by an 's'. For example, the electron's superpartner is the "selectron" and quark superpartners are called "squarks". The superpartner fermions' names are the names of the SM bosons (subject to slight variation) with "ino" appended. Thus, the gluon's superpartner is the "gluino" and the W's superpartner is the "wino".
Our first example is a leptonic decay simplified model (Alwall et al. 2009). In this model (Fig. 1) a proton-proton collision (like those at the LHC) leads to the production of a pair of quark superpartners, i.e. squarks (labeled "Q" in the figure). These squarks have relatively high mass and are unstable particles, so they quickly decay into further, lower mass decay products. This particular model has four possible decay channels for the squark. First, the squark may decay into a SM quark (q), which subsequently gives rise to a detectable jet event in the LHC detector, and an LSP (lightest supersymmetric particle), which is both stable and "invisible" (its existence would be inferred from the detection of missing transverse energy E, in the same way that neutrinos, for example, are detected in collider experiments). The squark may also decay into a quark and, via an intermediate state, a W or Z boson and an LSP. The third possibility is a decay (via intermediate states) to quarks, leptons (l), and an LSP; the fourth is a decay to quarks, a lepton, a neutrino (ν), and an LSP. Each of these four decay channels has a calculable cross-section and the model has an associated branching ratio, which determines the likelihood of the squark decaying along each of the four decay chains. There are three (or four) new masses which would be measurable (inferable) in this model: the mass of the squark (Q), the masses of the intermediate particle state(s) (solid unlabeled line segments in the figure), and the mass of the LSP (which would be inferred from the missing energy E).
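To make the small parameter count of such a model concrete, the following minimal Python sketch records the leptonic decay model as a data structure. It is an illustration only: the class names, field names, and every numerical value are assumptions introduced here and are not taken from Alwall et al. (2009). The sketch also assumes that the four channels exhaust the squark's decays within the model, so that the branching ratios sum to one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecayChannel:
    """One squark decay chain (hypothetical representation)."""
    label: str
    products: tuple          # final-state objects, e.g. ("q", "LSP")
    branching_ratio: float   # fraction of squark decays following this chain


@dataclass(frozen=True)
class SimplifiedModel:
    """The handful of parameters a simplified model adds to the SM."""
    squark_mass_gev: float
    intermediate_masses_gev: tuple  # one or two intermediate states
    lsp_mass_gev: float
    channels: tuple

    def n_new_masses(self) -> int:
        # Squark + LSP + intermediate state(s): three or four in the text.
        return 2 + len(self.intermediate_masses_gev)

    def validate(self) -> None:
        # Assumption: the listed channels are exhaustive within the model.
        total = sum(c.branching_ratio for c in self.channels)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"branching ratios sum to {total}, expected 1")
        # The LSP is by definition the lightest of the new particles.
        heavier = (self.squark_mass_gev, *self.intermediate_masses_gev)
        if any(m <= self.lsp_mass_gev for m in heavier):
            raise ValueError("the LSP must be the lightest new particle")


# Invented example values, chosen only so that the checks pass.
model = SimplifiedModel(
    squark_mass_gev=800.0,
    intermediate_masses_gev=(400.0,),
    lsp_mass_gev=100.0,
    channels=(
        DecayChannel("direct", ("q", "LSP"), 0.4),
        DecayChannel("boson", ("q", "W/Z", "LSP"), 0.3),
        DecayChannel("leptons", ("q", "l", "l", "LSP"), 0.2),
        DecayChannel("lepton-neutrino", ("q", "l", "nu", "LSP"), 0.1),
    ),
)
model.validate()
print(model.n_new_masses())  # prints 3
```

The point of the sketch is simply that the whole model fits in a few fields, in contrast to a full BSM model with its many particles and decay chains.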
With this basic introduction to simplified models in mind, it is straightforward to show how simplified models illustrate the four components (construction, functioning, representing, and learning) of Morrison and Morgan's framework and what grounds their construction. 13 To begin, simplified models, like the one described above, are constructed to be autonomous, in that they are both partially independent from collider data and also partially independent of the details of any specific BSM theory. In an obvious sense, they are not independent of QFT, however: they are QFTs, i.e. models of the general QFT framework. Still, there is also a slender sense in which simplified models are independent of QFT, since simplified models can potentially be useful for characterizing BSM phenomena that may well not be describable by QFT, e.g. quantum gravitational phenomena (that is, if they were somehow to have a detectable effect on low energy physics). Simplified models are also plainly independent of experiment in their construction, for no BSM phenomena have yet been observed in particle experiments. Indeed, they are modeling particles which, for all we know, might well prove to be non-existent. Yet there is an important sense in which the construction of simplified models is driven by experimental considerations. Simplified models are built in such a way that their parameters are easily relatable to collider observables, viz. particle masses, cross-sections, and branching ratios. 14 As noted in Section 2, in Morrison and Morgan's framework the autonomy of models derives (in part) from their independence from theory and data by construction. Simplified models are indeed partially autonomous in this way. Morrison and Morgan also claim that the autonomy of models derives (in part) from their representational capacity.
Although representation, again, is not particularly relevant to the argument of this paper, a few remarks on the topic are perhaps called for, as simplified models are of some interest in this respect as well. Simplified models are, as noted previously, deliberately constructed to be incomplete models of physics. That alone makes them "idealized models"; they may, however, be completely wrongheaded in Wimsatt's sense: "not only are the interactions wrong, but a significant number of the entities and/or their properties do not exist" (Wimsatt 1987, 29). This could happen, for example, if some collision products (jets, leptons, missing energy) well-described by a simplified model are actually produced through multiple processes, complicated interactions, etc. and not at all via the simple decay chains of the simplified model. In the case where simplified models are merely incomplete one has a case for denying that they represent; if they are completely wrongheaded, one has an even stronger case. On the other hand, it is arguable that simplified models can represent the observed phenomena indirectly and partially (perhaps along the lines suggested in Bailer-Jones (2003)). Although we cannot pursue this issue further here, since our present aims lie elsewhere, these remarks do suggest that simplified models should be a particularly interesting case to investigate for those interested in debates on scientific representation. 15
13 Cf. also a similar analysis of Higgs models in Borrelli and Stöltzner (2013).
14 Alwall et al. (2009) in particular note that a major advantage of simplified models (especially the four they introduce) is that they are related to important, discriminating collider observables, e.g. mass signatures, and lepton and heavy quark counts.

Theoretical perspectives on simplified models
Much of the current work of theoretical physicists in high energy physics is dedicated either to rethinking the conceptual foundations of the SM or else to the construction of BSM models. 16 Simplified models can and do play a useful role in these activities. Indeed, the independence of simplified models from BSM physics and their connection to experiment allow theoretical physicists to learn about BSM physics in ways that pure theoretical work and empirical data would not necessarily permit. This section demonstrates how the learning aims of theoretical particle physicists, which are substantially grounded in the present epistemic context of high energy physics, guide the particular construction of simplified models to make this kind of learning possible. We start by relating the two main functions of simplified models relevant to theorizing: (1) ruling out individual BSM models and (2) informing BSM model building.
First function first: simplified models can be used to rule out full BSM models (Alwall et al. 2009, 4). Although direct comparison of a full BSM model with collider data is adequate to this task, simplified models can facilitate this procedure: a theorist merely has to determine what predictions her model will make in terms of each simplified model's carefully selected few parameters to see if her model is consistent with the data (we describe this procedure in Section 5), rather than determining which precise experimental predictions the theoretical model makes for any particular experimental apparatus. Thus a theorist compares two models rather than a model and data. This is a relatively simple function, one which is, again, not strictly necessary as theorists can make the latter comparison if they wish.
For this reason the second function of simplified models is more significant. Simplified models can be used to guide the process of BSM model building. This guidance is potentially manifested in two ways, viz. in terms of negative guidance (constraints) and in terms of positive guidance.
Concerning positive guidance, in the event of the detection of new BSM phenomena at the LHC, a best-fit simplified model is heuristically useful for suggesting a corresponding complete BSM model (or set of models) for the new phenomena. As Alwall et al. (2009, 33) point out, "a reasonable hypothesis for the new physics is one that is consistent with the simplified model, except that where the data differs from the simplified model, the hypothesis differs in the same direction." What they mean, roughly, is that residual discrepancies between the data and the simplified model (which one would expect, given that simplified models are not full BSM models) can suggest what additional processes might need to be included in a more complete model.
As negative guidance, the presentation of experimental data in terms of simplified models reveals constraints on possible BSM models (see, e.g., the ATLAS Collaboration 2015). If a particular simplified model is ruled out by experiment in some regime (again, we discuss this procedure in Section 5), then theoreticians have some evidence that those processes are excluded and have grounds for introducing a corresponding constraint in their model building. Of course experimental data can be used to determine such constraints as well. Nevertheless, simplified models do facilitate this function. For one, as they are theory-driven models they have a more direct relation to full BSM physics models than experimental data. 17 The constraints can therefore be stated in terms of the theoretical framework rather than in terms of experimental observables. These inter-model relations can then be exploited to rule out entire classes of BSM models that are not consistent with the experimental data (as presented in the guise of a simplified model).
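One common way to picture this negative guidance is as a comparison between a full model's prediction, re-expressed in a simplified model's few parameters, and an experimental upper limit stated in those same parameters. The Python sketch below assumes a hypothetical table of upper limits on cross section times branching ratio, indexed by squark and LSP mass. The table, the function name, and all numbers are invented for illustration and do not come from any ATLAS or CMS result, nor is this meant as the specific procedure of Section 5.

```python
from typing import Optional

# Hypothetical upper limits (in pb) on cross section x branching ratio,
# indexed by (squark mass, LSP mass) in GeV. All values are invented.
UPPER_LIMITS_PB = {
    (600, 100): 0.50,
    (800, 100): 0.10,
    (1000, 100): 0.02,
}


def is_excluded(m_squark: int, m_lsp: int,
                predicted_sigma_pb: float,
                branching_ratio: float) -> Optional[bool]:
    """Compare a full model's prediction with a simplified-model limit.

    Returns True if the predicted rate exceeds the limit (excluded),
    False if it does not, and None if no limit exists at this mass point.
    """
    limit = UPPER_LIMITS_PB.get((m_squark, m_lsp))
    if limit is None:
        return None  # nothing can be concluded outside the tested regime
    return predicted_sigma_pb * branching_ratio > limit


# A model predicting 0.3 pb with a 50% branching ratio into this topology
# gives 0.15 pb, above the invented 0.10 pb limit at (800, 100).
print(is_excluded(800, 100, predicted_sigma_pb=0.3, branching_ratio=0.5))
```

Note that the theorist here never touches detector-level data: both sides of the comparison are expressed in the simplified model's parameters, which is the sense in which two models, rather than a model and data, are compared.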
It is important to note that "ruling out" is actually somewhat too strong a term to use in this context. The hypothetical BSM model space is enormous. Particular experimental signatures can be realized in subtle, complicated ways in full BSM models (including SUSY models). Properly "ruling out" models thus only applies to BSM models that behave more or less like the simplified models that have been experimentally excluded. For this reason a theoretical physicist who favors her model for theoretical reasons may not necessarily be deterred much by a negative result upon comparison with experimental data expressed in terms of a simplified model; she might in fact rather just see the full experimental data and draw her own conclusions from them.
Given these functions along with the caveats we point out, it may not seem like simplified models are really of much value to theoretical physicists, since they can apparently make direct comparisons between their models and the data and take guidance directly from the data for building new models. This would misunderstand, however, what the autonomy of simplified models affords in this context. As Morrison and Morgan emphasize, the autonomy of simplified models is important for them to be instruments of learning. Simplified models' crucial function (from the point of view of theoretical physicists) is to characterize a certain set of particle phenomenology that may be detected at the LHC independently of any particular BSM scenario and its complicated details. Simplified models are used as tools to inform physicists about the feasibility of various BSM scenarios they entertain in their model building practice (either by suggesting promising directions for model building or by indicating unpromising ones). Thus what one learns from the use of simplified models is something about whole classes of BSM physics-not merely something about individual BSM models.
Why is this important? The space of SUSY models (let alone the space of possible BSM physics models) is huge. Accordingly, there are simply far too many SUSY models to test at the LHC (let alone BSM physics models in general), so the conventional scientific method of deriving predictions from a model and comparing those predictions with the experimental data is inefficient and even wholly impractical in this context. 18 If theoretical physicists had strong reasons to pursue particular models with constrained parameters, then this massive model underdetermination would not be such a problem. But it has become increasingly clear that BSM theorizing is quite unconstrained: particle physicists no longer have a reliable guiding model (apart from the general QFT framework) as they did with the SM, and there are no recalcitrant collider data at present from which to build new phenomenological models.
A natural response to these circumstances is to constrain attention to small parameter spaces. However, the usual approach to theorizing, e.g. by assuming certain simple, intuitive constraints, generally limits possible phenomenology too much. For example, minimal supergravity (mSUGRA) has only four parameters, but its assumptions would force strong constraints between parameters in more general models, strongly limiting possible physical effects in these more complete models. Not only is it unlikely that mSUGRA itself is correct (it is a "toy model" (Hartmann 1995)), but, due to its constraining of the possible phenomenology of full BSM models, the best fit mSUGRA model cannot tell us much about BSM physics. Other examples are similar; strong theoretical constraints limit parameter spaces, but since those constraints are likely not preserved in realistic BSM models, the small parameter spaces cannot tell us much about BSM physics. The same goes for "small", experiment-friendly parameter spaces like the phenomenological minimal supersymmetric standard model (pMSSM). 19 The alternative to the theory-driven approach is simply to wait for novel data from collider experiments, build phenomenological models of this data, and then use these phenomenological models to guide theoretical BSM models. This approach too is rather impractical in this context. There is a massive range of possible phenomenological models one could generate given significant discrepancies between the SM predictions and collider data, as it is not clear what physical processes might occur before the detection event (we revisit this problem in more detail in Section 5). 20 Hence, given the aims of particle physicists and the present epistemic situation (no guidance from experiment and little from theory), "minimal" theoretical models are needed that are closely tied to the experiments which particle physicists can currently perform. 21 In other words, particle physicists require a set of models that mediate between theory and experiment, by both having the descriptive capacity to account for possible experimental signatures while also connecting to possible BSM particle phenomenology and BSM theoretical "cores" like SUSY (Borrelli 2012).
Simplified models fit the bill quite well. They abstract from the details of full BSM models to derive particle phenomenology that is potentially detectable at the LHC. Because they retain physically significant relations to full BSM models, data at the LHC can then be used to make inferences about BSM physics (ruling out, guiding theory development). Key to motivating simplified models is the epistemic context of high energy physics. Since there is limited theoretical and experimental guidance in BSM theorizing at present, there is a significant underdetermination of physically plausible models which undermines the application of familiar approaches to theory testing. Limiting this underdetermination is crucial to make progress, but doing so by imposing constraints severs the epistemic connection between constrained models and more complete models. On the other hand, any forthcoming data from the LHC would be consistent with a vast array of phenomenological models which may have an inscrutable relation to full BSM models. Simplified models solve these problems by limiting the underdetermination on "both sides" while preserving strong links to more complete models and experimental data.
Given these considerations, it is worth remarking that the role played by simplified models can only be understood as part of the dynamical processes of science (Hartmann 1995). Simplified models are (in Hartmann's terms) model "substitutes for theory", since they provide some temporary explanatory, descriptive, and heuristic resources in circumstances where the underlying theory is unknown. Indeed, considered in isolation from their historical and epistemic context, simplified models' relevance and importance would be entirely opaque (because they are deliberately incomplete models). Recalling the discussion of Section 2, attention solely to the details of simplified models' construction (i.e. that simplified models are EFTs with a handful of parameters beyond the SM, etc.) and their use as tools (ruling out full BSM models, etc.) overlooks the importance of how the context in which they are situated elucidates the scientific role that they play.

Experimentalist perspectives on simplified models
The aspects of simplified models relevant for experimental practice have already been hinted at above whilst discussing the theoretical perspective. In this section we discuss these aspects in further detail from the experimental perspective. We begin with a specific example to illustrate this perspective and then distill three important experimental motivations for their development, drawing on ideas in Alwall et al. (2009), Alves et al. (2011), and LHC New Physics Working Group (2012). We focus our attention on a particular simplified model, one of those abstracted from a certain SUSY process, to illustrate the experimental functions of simplified models. This specific BSM scenario is just one among several others that the CMS Collaboration (2016) has been investigating. The relevant process for this model (Fig. 2) involves the products of a proton-proton collision at the LHC decaying into a pair of squarks (a top squark ($\tilde{t}$) and its antiparticle ($\bar{\tilde{t}}$)), which in turn decay into a top quark ($t$) and its antiparticle ($\bar{t}$) and two neutralinos ($\tilde{\chi}^0_1$). This is (one decay chain of) a simplified model where the only relevant BSM phenomenon is the production of the top squark and its antiparticle and their decays into detectable SM particles (the quarks decay into jets of hadrons) and missing energy (the neutralinos). The Feynman diagram of this simplified model (like any simplified model) abstracts from any other kind of BSM particle production that might go on in proton-proton collisions. Indeed, it abstracts from all the other relevant theoretical parameters at work in SUSY models with the exception of masses and cross-sections of the top squark and the neutralino, as described in Section 3.
Experimentalists look for signal regions where they know from the SM what the background of detector events will be and where the simplified model suggests that novel detector events may be distinguishable from the background. So, for example, if 10 background events are expected on the basis of the SM and 20 events are observed in that signal region, then this excess of events might be the sign of new BSM physics (depending of course on statistical considerations). 22 The first step in this procedure is to map the background: this is done both via Monte Carlo simulations and also via control regions. The latter are well-known regions within the Standard Model where no new signals are expected, and they are regularly used to test [...] The signal region where this decay process may be revealed in the data is identified in such a way so as to enhance as much as possible the signal and reduce as much as possible the background. Experimentalists can then check for any deviation in this signal region from the expected background. This is done by translating the Feynman diagram of the simplified model in Fig. 2 into the graph of Fig. 3, which shows the hypothetical production rate for the top squark/anti-squark pair decaying into the neutralino for a range of possible mass parameter values in GeV for both particles.
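The "statistical considerations" in the counting example above can be made concrete with a minimal sketch. It assumes an idealized Poisson counting model with an exactly known background and no systematic uncertainties; this is our simplification for illustration, not the statistical procedure actually used by the collaborations, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_tail_probability(observed: int, expected_background: float) -> float:
    """P(N >= observed) for N ~ Poisson(expected_background)."""
    # Sum P(N = k) for all k below the observed count, then take the complement.
    below = sum(
        math.exp(-expected_background) * expected_background**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


def one_sided_significance(p_value: float) -> float:
    """Convert a one-sided p-value into an equivalent Gaussian significance Z."""
    return NormalDist().inv_cdf(1.0 - p_value)


# Numbers from the text: 10 background events expected, 20 events observed.
p_value = poisson_tail_probability(observed=20, expected_background=10.0)
z_score = one_sided_significance(p_value)
print(f"p-value = {p_value:.4f}, significance = {z_score:.2f} sigma")
```

Under these idealized assumptions the excess falls well short of the five-sigma threshold conventionally required to claim a discovery in particle physics, which is why such an excess only might be a sign of new BSM physics.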
In the example under consideration the mass of the top quark t is known but neither the mass of the hypothetical neutralino nor that of the top squark is known. Thus, the simplified model associated with Fig. 2 is rendered in the diagram as a convenient 2-parameter model, which maps these two unknown masses of putative BSM particles into a region where their production rate could be measured against the SM expectations and collider data. To achieve this goal, it is of course important that the new signatures for top squark and neutralino production differ in relevant ways from the signature of the top quark as well as from other well-known signatures within the Standard Model that might be relevant for this signal region. Figure 3 shows how ranges of possible mass values for the top squark (horizontal axis) and neutralino (vertical axis) in this simplified model fare with respect to observation (solid lines) and the expected Standard Model background (dotted lines). This kind of graph depicts whether a deviation, i.e. an excess of events, which might be the signature of top squarks decaying into neutralinos, has been detected in this parameter region. In this particular case it ultimately tells us that no evidence for top squarks decaying into neutralinos has been found: for this simplified model the parameter region below the dotted lines (SM expectations) and the solid lines (data) is statistically excluded. Experimentalists therefore conclude that to these limits this simplified model has been ruled out.
23 These procedures are in most respects similar to those carried out during the initial runs of the LHC when searching for the Higgs. See Franklin (2017) for an accessible description of the experimental procedures of the LHC during the Higgs discovery. Morrison (2015) and Massimi and Bhimji (2015) discuss the role of simulation in LHC experiments.
The example illustrates in more detail some aspects of the functions discussed in Section 3, viz. to (1) rule out BSM models and (2) provide guidance in model building. It also illustrates how simplified models have two important functions for experimentalists. Simplified models can be used to (3) interpret experimental data and (4) assess experimental search procedures. A third function is also important: simplified models can be used to (5) compare experimental data from different colliders or experiments in an experiment-independent way. We discuss these latter three functions and their associated aims in turn.
The first experimentally relevant aim of developing simplified models is to have tools which can be used to interpret new data at the LHC. 24 In other words, simplified models play a useful role in answering the question of what the data represent. Data models alone cannot provide such an interpretation, for the detector events cannot have an unambiguous interpretation in theoretical terms, i.e. kinds of particles and decay processes. Detector events merely register that something (with certain physical properties) triggered the detector. As noted previously, many conceivable BSM physical processes are compatible with and may result in the same detectable events. Simplified models supply a transparent, easy-to-use interpretation of experimental data in terms of new particles (their cross-sections and branching ratios) or their absence, and hence a means to characterize the data obtained by the detectors at the LHC (albeit a characterization that may significantly distort the nature of the real underlying physical processes). Since simplified models focus on experimental observables (masses, branching ratios, cross-sections), it is easy to compare data with simplified models and check for any excesses over the expected SM events that might be the signature of possible new BSM particles. The data can then be presented in terms of simplified models (alongside more typical presentations such as signature-based results and comparisons of data to benchmark models, e.g. the constrained or phenomenological MSSM), as has been standardly done in numerous reports from the ATLAS and CMS experiments at the LHC since 2011.
This function of simplified models, viz. as interpretations of new experimental signals, allows experimentalists to learn something about the results of their experiments and the phenomena responsible for them. Because simplified models are deliberately incomplete models and hence plausibly may not directly represent the phenomena, experimentalists likely cannot learn precisely what is causing their data. Nevertheless, simplified models allow experimentalists to give some contentful physical characterization of what is causing their data. Moreover, they can give one that connects to plausible complete BSM models (as described in Section 4). Although this characterization may distort the actual facts to some degree (even completely), it does not preclude learning some important physical facts about the underlying phenomena.
The second experimentally relevant aim of developing simplified models is to evaluate and revise experimental search strategies. As simplified models are designed to capture a complete range of expected BSM particle phenomenology, they can be used to reveal potentially overlooked particle phenomenology in standard search strategies (LHC New Physics Working Group 2012, 4). Initial BSM searches at the LHC were based primarily on strategies developed for Standard Model searches (searches for the Higgs in particular) or extended strategies appropriate for particular benchmark models, e.g. the MSSM (Alves et al. 2011). Such strategies may however be inadequate for revealing other (non-MSSM or non-SM-like) BSM physics, giving rise to an important source of experimental risk, namely that the LHC may be producing BSM particles which remain undetected because of the particular search strategies employed. Assessing the experimental signatures of a wide range of simplified models in relation to the sensitivity of LHC search strategies can suggest needed strategy modifications and particular experimental searches worth undertaking. The need for autonomy from experiment and a degree of model-independence for this aim is evident.
The third experimental application of simplified models is to compare experimental data from different experiments. For example, in our Fig. 3, the exclusion region marked by the solid black line can be compared with exclusion limits found in other BSM searches at ATLAS that have been focusing on the top quark using a different methodology. Simplified models are useful for this task because they are independent of any particular experiment (Alwall et al. 2009, 2). Experimental data from different experiments may be difficult to compare directly, due to different experimental methodologies, setups, etc. Combining data sets from different experiments is obviously important, and there are, naturally, various statistical methods employed to do just this in all areas of science. 25 Theory-driven models, unlike such statistical methods, have the advantage, however, of supplying a unified interpretive locus for making direct comparisons. In particular, "simplified models provide a figure of merit for comparing searches at different collider experiments, because the kinematics and cross-sections expected for a simplified model at different colliders can be computed from their fundamental parameters" (LHC New Physics Working Group 2012, 24). This is certainly not to say, however, that statistical methods for combining data are inferior to theory-driven comparisons; it is just to point out that the uncertainties involved in the two approaches are different, so what one can learn from each data-comparison approach is different, and potentially mutually informing.
In concluding this section, we emphasize that, since they are independent of particular collider experiments, simplified models allow experimentalists to learn something about the phenomena in a partially experiment- and data-independent way (while still of course providing an interpretation of the data). This is important because data from a collider experiment may be dependent in various ways on the experimental apparatus and methodology, i.e. ways which do not directly represent the physical phenomena being studied. Comparisons of data are useful precisely because they can uncover the presence of these dependencies. Since simplified models may function as tools of data comparison, they can be used to reveal the underlying phenomena, shorn of irrelevant distortions from the nature of the experiment. Each of these three experimental functions of simplified models (data interpretation, assessment of experimental methodology, and data comparison across different experiments) depends on the relative independence of simplified models from experimental data. As simplified models are constructed to be autonomous in this sense, they function autonomously in the three aforementioned ways, i.e. as tools for learning specific things about the data and the experiment.

Big data and BSM searches
We turn in this final section to connecting our discussion of simplified models with wider epistemological issues raised by so-called Big Data. We do this not to introduce any new considerations beyond those discussed already, but only to illustrate what has been said so far in relation to a topic increasingly of interest to philosophers of science. We intend this brief discussion to present our account in a slightly different light, as well as to situate it in an interesting and relevant context. Indeed, high energy physics is often presented as one of the areas in contemporary science where Big Data and its related epistemological issues arise. Data-intensive science is an extremely broad domain of inquiry arising in informatics, social sciences, physical sciences and life sciences. Only recently have philosophers of science begun to turn their attention to data-intensive science, with a focus primarily on machine learning, biology, and the social sciences (e.g. Floridi 2012; Leonelli 2014; Pietsch 2016). Although a precise definition is elusive, data-intensive science can be and has been characterized in terms of the sheer amount of information handled and the technological challenges it poses for data acquisition, data storage, and data analysis.
Adding to some of the concerns that philosophers of science have already raised about the epistemological novelty of Big Data, in e.g. biology (Leonelli 2012), the epistemic challenges we have been discussing in the search for BSM physics in high energy physics can be characterized as a distinct Big Data issue, one which simplified models are intended to address. The challenges, recall, are both that BSM physics is highly underdetermined by empirical data, and that available theoretical considerations do not presently provide much guidance on what BSM physics we should expect. These challenges become exacerbated by the Big Data environment of the LHC. In the social sciences which involve Big Data the main challenge, e.g. in microtargeting (modeling voters' behavior to make polls), is modeling a great quantity of discrete, well-defined, and easily identifiable demographic data, e.g. age, race, gender of the voters (Pietsch 2016). In biomedical sciences the main challenge consists in collating a large volume of decontextualized yet still well-defined and easily identifiable data, e.g. appearance, behavior, breeding, about given species of organisms (e.g. over a thousand in the case of drosophila), storing the data, and re-using it for various purposes (Leonelli 2008;. But in high energy physics the problems connected to high volumes of data go beyond the classification, organization, storage and re-use of data (although these are certainly present).
It might be tempting just to think that the Big Data challenge is the same here as elsewhere. That is, the challenge in high energy physics is the problem of tracking a high volume of data representing particles impinging on various particle detectors. While this is certainly a challenge, it is a challenge that has been adequately solved at the LHC, at least for the search for the Higgs. The SM could be used to suggest some plausible constraints on where to look for its experimental signature (mass ranges, important decay chains where the Higgs would appear before decaying into detectable events, etc.). This led to the development of particular search strategies, where certain detector events were ignored (either in hardware or through software "triggers" (Karaca 2017)) in response to practical limitations in data storage capacity and processing rates at CERN. Since physicists knew (more or less) what they were looking for, they were able to optimize their search to find it. And indeed, they did find it (the ATLAS Collaboration 2012a; the CMS Collaboration 2012).
Unlike in biomedical science or social science with their identifiable data, in BSM searches, where the aim is to discover new physics, the problem of transient underdetermination (Sklar 1981) of theory by data arises. For every data pattern detected at the CMS or ATLAS experiments at CERN, there is, as mentioned above, a multiplicity of possible BSM particles and decay channels that might have led to exactly the same data pattern (be it the pattern of a signature/excess of events vis-à-vis background). Since decisions have to be made, however, about which data to keep and which data to throw away (due to the quantity of data produced in collisions) and since scientists do not have much guidance in where precisely to look for data signatures that would suggest new physics, the Big Data problem is importantly different in high energy physics (and naturally in other exploratory contexts as well).
Our point has been that simplified models follow a modeling strategy which mitigates this problem to some extent. Simplified models essentially undercut the underdetermination problem by the way they mediate between theory and experiment. Theory first provides some guidance in what to look for by suggesting theoretically-motivated phenomenology. This guidance is turned into model substitutes for theory, viz. simplified models. These models then can be used to optimize searches for these kinds of phenomenology, mitigating the problem from Big Data. Given experimental data, they can next provide a tool for interpreting experimental data. Interpreted in terms of a simplified model, the data can finally be used to rule out full BSM models and suggest guidance on model building. If there is BSM physics, this procedure provides a realistic method for finding it, characterizing it, and eventually settling on a physically plausible BSM model. Thus simplified models solve the "Learner's Paradox" problem in the usual way: one does not have to know precisely what one is looking for; one just has to know what one is looking for well enough in order to progress.

Conclusion
The novel perspective on the "models as mediators" framework introduced in Section 2 usefully emphasizes the importance of certain epistemological questions which have so far been little investigated in the philosophical literature on models in science. In this paper we focused on the question of why models are constructed to have the autonomy from theory and data that they do. We suggested that this question can be helpfully addressed by looking to the learning aims of scientists who make use of them. These aims provide clues as to what determines the degree of independence and dependence that a model has with respect to theory and data. We argued that the epistemic context of a research area plays a crucial role in determining the needed autonomy of a model.
To illustrate these ideas we presented the case of simplified models from high energy physics research. Simplified models address two major epistemic difficulties faced by physicists searching for beyond standard model physics.
The first is a serious underdetermination of theory by evidence (both empirical and otherwise). This underdetermination is not the logical underdetermination familiar in philosophical debates, but the kind relevant for making methodological decisions in science, viz. "transient" (Sklar 1981) or "scientific" (Dawid 2013) underdetermination. There are a variety of theoretical options currently available, but the lack of confidence in any individual model means there is little consistent guidance available for conducting experiments, and testing all the possible models would be massively inefficient and impractical.
The second is an analogous "underdetermination", namely of data patterns by possible BSM physics theories. Of course, insofar as one believes the Standard Model is correct, then there is no underdetermination: the SM tells us exactly what we should expect to see in the future, which is nothing new at all. But there is a variety of reasons to suppose that there should be BSM physics. Without interpretive guidance from theory, experimentalists do not exactly know what they are looking for (or looking at, should they see a new signature in the data). One might think that they can simply run the collider and look for discrepancies between SM predictions and the data, but the amount and complexity of the data make the situation somewhat more challenging than this. Experimentalists need theoretical guidance so that they know where to look in the enormous amounts of data for potential new phenomena.
We have argued that simplified models can help solve these two problems by mediating between theory and data. To mitigate the transient underdetermination faced in high energy physics, it is natural to follow the EFT philosophy and attend to low energy EFTs relevant to the experimental capacities of the LHC. As EFTs, simplified models are dependent on the theoretical framework of QFT, but they abstract from complete BSM physics models, such as SUSY models, because of physicists' uncertainty about BSM physics. BSM physics models do, however, suggest a range of particle phenomenology that the LHC could detect. Each simplified model corresponds to some plausible phenomenology, although no model by itself is physically plausible as a complete BSM physics model. Because of their clear theoretical connections to more complete BSM physics models, simplified models can be used to rule out (or at least disfavor) the latter, as well as suggest plausible constraints and guidance for future model building (of complete models).
Simplified models also mitigate the epistemic difficulties on the experimental side. They provide a ready interpretation of the data in terms of particles and experimentally accessible parameters. As they cover the range of plausible particle phenomenology, they can be used to assess the sensitivity of experimental search strategies to potentially observable discrepancies between the SM and data. (We also noted the value of having a simple tool for comparing data from different experimental apparatuses.) The independence of theory-driven models (like simplified models) from the data (which is characterized merely in terms of detector events) is clearly necessary for these particular interpretive, evaluative, and comparative functions.
We therefore conclude that the construction of simplified models is substantially based in attempting to solve salient epistemic problems faced by physicists in searching for BSM physics. Simplified models are constructed to be independent from theory and data, i.e. autonomous, but also dependent on them in particular ways and to particular extents precisely because of these circumstances. By expanding our attention beyond the details of the model to the larger theoretical and experimental context, we see that the learning goals of physicists are based substantially on these epistemic considerations, which then determine the needed functions of the set of models, and hence what autonomy must be "built into" them. Although we do not expect all cases to mirror our own in their particulars, as what is epistemically relevant in each context surely varies, insight into scientific methodology and practice can likely be gained by investigating the issues we have related here in other cases.