Towards a Philosophy of Science of the Artificial

One common way to start an essay on the philosophy of science is to ask: “Should we be scientific realists?” While this isn’t precisely the question I’m interested in here, it is the entry point of this essay. So bear with me.

Scientific realism, in short, is the view that scientific theories are (approximately) true. Different philosophers have proposed different interpretations of “approximately true”, e.g., as meaning that scientific theories “aim to give us literally true stories about what the world is like” (Van Fraassen, 1980, p. 9), or that “what makes them true or false is something external—that is to say, it is not (in general) our sense data, actual or potential, or the structure of our minds, or our language, etc.” (Putnam, 1975, p. 69f), or that their terms refer (e.g. Boyd, 1983), or that they correspond to reality (e.g. Fine, 1986).

Much has been written about whether or not we should be scientific realists. A lot of this discussion has focused on the history of science as a source of empirical support for or against the conjecture of scientific realism. For example, one of the most commonly raised arguments in support of scientific realism is the so-called no miracles argument (Boyd 1989): what can better explain the striking success of the scientific enterprise than that scientific theories are (approximately) true—that they are “latching on” to reality in some way or form? Conversely, an influential argument against scientific realism is the argument from pessimistic meta-induction (e.g. Laudan 1981), which suggests that, given that most past scientific theories have turned out to be false, we should expect our current theories to suffer the same fate of being proven false (as opposed to approximately true).

In this essay, I consider a different angle on the discussion. Instead of discussing whether scientific realism can adequately explain what we understand empirically about the history of science, I ask whether scientific realism provides us with a satisfying account of the nature and functioning of the future-oriented scientific enterprise—or, what Herbert Simon (1996) called the sciences of the artificial. What I mean by this, in short, is that an epistemology of science needs to be able to account for the fact that the scientist's theorising affects what will come into existence, as well as for how this happens.

I proceed as follows. In part 1, I explain what I mean by the sciences of the artificial, and motivate the premise of this essay—namely that we should aspire for our epistemology of science to provide an adequate account not only of inquiry into the “natural”, but also into the nature and coming-into-being of the “artificial”. In part 2, I analyse whether or not scientific realism provides such an account. I conclude that its application to the sciences of the artificial exposes scientific realism not as false, but as insufficiently expressive. In part 3, I briefly sketch how an alternative to scientific realism—pragmatism—might present a more satisfactory account. I conclude with a summary of the arguments raised and the key takeaways.

Part 1: The need for a philosophy of science of the artificial

The Sciences of the Artificial—a term coined by Herbert Simon in the eponymous book—refers to domains of scientific enterprise that deal not only with what is, but also with what might be. Examples include the many domains of engineering, medicine or architecture, but also fields like psychology, economics or administration. What characterises all of these domains is their descriptive-normative dual nature: the study of what is is both informed and mediated by some normative ideal(s). Medicine seeks to understand the functioning of the body in order to bring about, and relative to the ideal of, a body’s healthy functioning; civil engineering studies materials and applied mechanics in order to build functional and safe infrastructure. In either case, at the end of the day, the central subject of study is not a matter of what is—it does not exist (yet); rather, it is the goal of the study to bring it into existence. In Simon’s words, the sciences of the artificial are “concerned not with the necessary but with the contingent—not with how things are but with how they might be—in short, with design” (p. xii).

Some might doubt that a veritable science of the artificial exists. After all, is science not quintessentially concerned with understanding what is—the laws and regularities of the natural world? However, Simon provides what I think is a convincing case that, not only is there a valid and coherent notion of a “science of the artificial”, but also that one of its most interesting dimensions is precisely its epistemology. In the preface to the second edition of the book, he writes: “The contingency of artificial phenomena has always created doubts as to whether they fall properly within the compass of science. Sometimes these doubts refer to the goal-directed character of artificial systems and the consequent difficulty of disentangling prescription from description. This seems to me not to be the real difficulty. The genuine problem is to show how empirical propositions can be made at all about systems that, given different circumstances, might be quite other than they are.” (p. xi).

As such, one of the things we want from our epistemology of science—a theory of the nature of scientific knowledge and the functioning of scientific inquiry—is to provide an adequate treatment of the science of the artificial. It ought, for example, to allow us to think not only about what is true but also about what might be true, and about how our own theorising affects what comes into being (i.e., what comes to be true). By means of metaphor, insofar as the natural sciences are typically concerned with the making of “maps”, the sciences of the artificial are interested in what goes into the making of “blueprints”. In the philosophy of science, we then get to ask: What is the relationship between maps and blueprints? On the one hand, the quality of our maps (i.e., scientific understanding) shapes what blueprints we are able to draw (i.e., the things we are able to build). At the same time, our blueprints also end up affecting our maps. As Simon puts it: “The world we live in today is much more a man-made, or artificial, world than it is a natural world. Almost every element in our environment shows evidence of human artifice.” (p. 2).

One important aspect of the domain of the artificial is that there is usually more than one low-level implementation (and respective theoretical-technological paradigm) through which a desired function can be achieved. For example, we have found several ways to travel long distances (e.g. by bicycle, by car, by train, by plane). What is more, there exist several different types of trains (e.g. coal-powered, steam-powered or electric trains; high-speed trains using specialised rolling stock to reduce friction, or so-called maglev trains which use magnetic levitation). Most of the time, we need not concern ourselves with the low-level implementation because of this functional equivalence. Precisely because they are designed artefacts, we can expect them to exhibit largely similar high-level functional properties. If I want to travel from London to Paris, I generally don't have much reason to care what specific type of train I end up finding myself in.

However, differences in their respective low-level implementations can start to matter under the ‘right’ circumstances, i.e., given relevant variation in external environments. Simon provides us with useful language to talk about this. He writes: “An artifact can be thought of as a meeting point[—]an ‘interface’ in today's terms[—]between an ‘inner’ environment, [i.e.,] the substance and organization of the artifact itself, and an ‘outer’ environment, [i.e.,] the surroundings in which it operates. If the inner environment is appropriate to the outer environment, or vice versa, the artifact will serve its intended purpose.” (p. 6). As such, the (proper) functioning of the artefact is cast in terms of the relationship between the inner environment (i.e., the artefact’s structure or character, its implementation details) and the outer environment (i.e., the conditions in which the artefact operates). He further provides a simple example to clarify the point: “A bridge, under its usual conditions of service, behaves simply as a relatively smooth level surface on which vehicles can move. Only when it has been overloaded do we learn the physical properties of the materials from which it is built.” (p. 13).

The point is that two artefacts that have been designed with the same purpose in mind, and which behave equivalently in a wide range of environments, will start to show different behaviours if we enter environments to which their design hasn’t been fully adapted. Now their low-level implementation (or ‘inner environment’, in Simon’s terminology) starts to matter.

To further illustrate the relevance of this point to the current discussion, let us consider the field of artificial intelligence (AI)—surely a prime example of a science of the artificial. The aspiration of the field of AI is to find ways to instantiate advanced intelligent behaviour in artificial substrates. As such, it can be understood in terms of its dual nature: it aims both to (descriptively) understand what intelligent behaviour is and how it functions, and to (normatively) implement it. The dominant technological paradigm for building artificial general intelligence (AGI) at the moment is machine learning. However, nothing in principle precludes that other paradigms could (or will) be used for implementing AGI (e.g., some variation on symbolic AI, some cybernetic paradigm, soft robotics, or any number of paradigms that haven’t been discovered yet).

Furthermore, different implementation paradigms for AGI imply different safety- or governance-relevant properties. Imagine, for example, an AGI built in the form of a “singleton” (i.e. a single, monolithic system) compared to one built as a distributed, multi-agent assembly of systems. A singleton AGI, for example, seems more likely to lack interpretability (i.e. to behave in ways and for reasons that, by default, remain largely obscure to humans), while a distributed system might be more likely to fall prey to game-theoretic pitfalls such as collusion (e.g. Christiano, 2015). It is not at this point properly understood what the implications of the different paradigms are, but the specifics need not matter for the argument I am trying to make here. The point is that, if the goal is to make sure that future AI systems will be safe and used to the benefit of humanity, it may matter a great deal which of these paradigms is adopted, and that we understand what different paradigms imply for considerations of safety and governability.

As such, the problem of paradigm choice—choices over which implementation roadmap to adopt and which theories to use to inform said roadmap—comes into focus. As philosophers of science, we must ask: What determines paradigm choice? As well as: How, if at all, can a scientist or scientific community navigate questions of paradigm choice “from within” the history of science?

This is where our discussion of the appropriate epistemology for the sciences of the artificial properly begins. Next, let us evaluate whether we can find satisfying answers in scientific realism.  

Part 2: Scientific realism of the artificial

Faced with the question of paradigm choice, one answer that a scientific realist might give is that what determines the right paradigm choice comes down entirely to how the world is. In other words, what the AI researcher does when trying to figure out how to build AGI is equivalent to uncovering the truth about what AGI, fundamentally, is. We can, of course, at a given point in time be uncertain about the ‘true nature’ of AGI, and thus be exploring different paradigms; but eventually, we will discover which of those paradigms turns out to be the correct one. The notion of paradigm choice is thereby replaced with the notion of paradigm change. In essence, the aspiration of building AGI is rendered equivalent to the question of what AGI, fundamentally, is.

As I will argue in what follows, I consider this answer to be dissatisfying in that it denies the very premise of the science of the artificial we have discussed in the earlier section. Consider the following arguments. 

First, the answer of the scientific realist seems fundamentally confused about the type-signature of the concept of “AGI”. AGI, in the sense I’ve proposed here, is best understood as a functional description—a design requirement or aspiration. As discussed earlier, it is entirely plausible that there exist several micro-level implementations which are functionally equivalent in that they each exhibit generally intelligent behaviour. As such, by treating “the aspiration of building AGI [as equivalent] to the question of what AGI is”, the scientific realist has implicitly departed from—and thus failed to properly engage with—the premise of the question.

Second, note that there are different vantage points from where we could be asking the question. We could take the vantage point of a “forecaster” and ask what artefacts we should expect to exist 100 years from now. Or, we could take the vantage point of a “designer” and ask which artefacts we want to create (or ought to create, given some set of moral, political, aesthetic, or other commitments). While it naturally assumes the vantage point of the forecaster, scientific realism appears inadequate for taking seriously the vantage point of the designer. 

Third, consider the assumption that what will come into existence is a matter of fact. While plausible-sounding at first, further inspection reveals problems with this claim. To show how this is the case, let us consider the following tripartite characterisation proposed by Eric Drexler (2018). We want to distinguish between three notions of “possible”, namely: the (physically) realistic, the (techno-economically) plausible, and the (socio-politically) credible. This is to say, beyond such facts as the fundamental laws of physics (i.e., the physically realistic), there are other factors—less totalising and yet endowed with some degree of causal force—which shape what comes to be in the future (e.g., economic and sociological pressures).

Importantly, the physically realistic does not on its own determine what sorts of artefacts come into existence. For example, paradigm A may (by mere chance, or for reasons of historical contingency) receive differential economic investment compared to paradigm B, resulting in its faster maturation; or, inversely, it might be restrained or banned through political means, resulting in its being blocked, and maybe even eventually forgotten. Examples of political decisions (e.g. regulation, subsidy, taxation, etc.) affecting technological trajectories abound. To name just one, consider how the ban on human cloning has, in fact, stopped human cloning activities, as well as any innovations related to making human cloning ‘better’ (cheaper, more convenient, etc.) in some way.

The scientific realist might react to this by arguing that, while the physically realistic is not the only factor that determines what sorts of artefacts come into existence, there is still a matter of fact to the nature and force of the economic, political and social factors affecting technological trajectories, all of which could, at least in principle, be understood scientifically. While I am happy to grant this view, I argue that the problem lies elsewhere. As we have seen, the interactions between the physically realistic, the techno-economically plausible, and the socio-politically credible are highly complex and, importantly, self-referential. It is exactly this self-referentiality that makes this a case of paradigm choice, rather than paradigm change, when viewed from the position of the scientist. In other words, an adequate answer to the problem of paradigm choice must necessarily consider a “view from inside of the history of science”, as opposed to a “view from nowhere”. After all, the paradigm is being chosen by the scientific community (and the researchers making up that community), and they are making said choice from their own situated perspective.

In summary, it is less that the answers provided by the scientific realist are outright wrong. It rather appears as if the view provided by scientific realism is not expressive enough to deal with the realities of the sciences of the artificial. It cannot usefully guide the scientific enterprise when it comes to the considerations brought to light by the sciences of the artificial. Philosophy of science needs to do better if it wants to avoid confirming the accusation raised by Richard Feynman—that philosophers of science are to scientists what ornithologists are to birds; namely, irrelevant.

Next, we will consider whether a different epistemological framework, one holding onto as much realism as possible, appears more adequate for the needs of the sciences of the artificial.

Part 3: A pragmatic account of the artificial

So far, we have introduced the notion of the science of the artificial, discussed what it demands from the philosophy of science, and observed how scientific realism fails to appropriately respond to those demands. The question is then: Can we do better? 

An alternative account to scientific realism—and the one we will consider in this last section—is pragmatic realism, chiefly originating from the American pragmatists William James, Charles Sanders Peirce, and John Dewey. For the present discussion, I will largely draw on contemporary work trying to revive a pragmatic philosophy of science that is truly able to guide and support scientific inquiry, such as that by Roberto Torretti, Hasok Chang, and Rein Vihalemm.

Such a pragmatist philosophy of science emphasises scientific research as a practical activity, and the role of an epistemology of science as helping to successfully conduct this activity. While sharing with the scientific realist a commitment to an external reality, pragmatism suggests that our ways of getting to know the world are necessarily mediated by the ways knowledge is created and used, i.e., by our epistemic aims and means of “perception”—both the mind and scientific tools, as well as our scientific paradigms.

Note that pragmatism, as I have presented it here, does at no point do away with the notion of an external reality. As Giere (2006) clarifies, not all types of realism must subscribe to a “full-blown objective realism” (or what Putnam called “metaphysical realism”)—roughly speaking, the view that “[t]here is exactly one true and complete description of ‘the way the world is.’” (Putnam, 1981, p. 49). As such, pragmatic realism, while rejecting objective or metaphysical realism, remains squarely committed to realism, and understands scientific inquiry as an activity directed at better understanding reality (Chang, 2022, p. 5; p. 208).

Let us now consider whether pragmatism is better able than scientific realism to deal with the epistemological demands of the sciences of the artificial. Rather than providing a full-fledged account of how the sciences of the artificial can be theorised within the framework of pragmatic realism, what I set out to do here is more humble in its ambition. Namely, I aim to support my claim that scientific realism is insufficiently expressive as an epistemology of the sciences of the artificial by showcasing that there exist alternative frameworks—in this case, pragmatic realism—that do not face the same limitations. In other words, I aim to show that, indeed, we can do better.

First, as we have seen, scientific realism fails to adopt the “viewpoint of the scientist”. As a result, it collapses the question of paradigm choice into a question of paradigm change. This makes scientific realism incapable of addressing the (very real) challenge faced by the scientist; after all, as I have argued, different paradigms might come with different properties we care about (such as when they concern questions of safety or governance). In contrast to scientific realism, pragmatism explicitly rejects the idea that scientific inquiry can ever adopt a “view from nowhere” (or, a “God’s eye view”, as Putnam (1981, p. 49) puts it). Chang (2019) emphasises the “humanistic impulse” in pragmatism: “Humanism in relation to science is a commitment to understand and promote science as something that human agents do, not as a body of knowledge that comes from accessing information about nature that exists completely apart from ourselves and our investigations.” (p. 10). This aligns well with the need of the sciences of the artificial to be able to reason from the point of view of the scientist.

Second, pragmatism, in virtue of focusing on the means through which scientific knowledge is created, recognises the historicity of scientific activity (see also, e.g., Vihalemm, 2012, p. 3; Chang, 2019). This allows pragmatic realism to reflect the historicity that is also present in the sciences of the artificial. Recall that, as we discussed earlier, one central epistemological question of the sciences of the artificial concerns how our theorising affects what comes into existence. As such, our prior beliefs, scientific frameworks and tools affect, by means of ‘differential investment’ in designing artefacts under a given paradigm, what sort of reality comes to be. Moreover, the nature of technological progress itself affects what we become able to understand, discover and build in the future. Pragmatism suggests that, rather than there being a predetermined answer as to which will be the most successful paradigm, the scientist must understand their own scientific activity as part of an iterative and path-dependent epistemic process.

Lastly, consider how the sciences of the artificial entail a ‘strange inversion’ of ‘functional’ and ‘mechanistic’ explanations. In the domain of the natural, the ‘function’ of a system is understood as a viable post-hoc description of the system, resulting from its continuous adaptation to the environment by external pressures. In design, by contrast, the ‘function’ of an artefact becomes that which is antecedent, while the internal environment of the artefact, its low-level implementation, becomes post-hoc. It appears difficult, through the eyes of a scientific realist, to fully accept this inversion. At the same time, accepting it appears useful, if not required, in order to engage with the epistemology of the sciences of the artificial on its own terms. Pragmatic realism, on the other hand, does not face the same trouble. To exemplify this, let us take Chang’s notion of operational coherence, a deeply pragmatist yardstick of scientific inquiry, which he describes as “a harmonious fitting-together of actions that is conducive to a successful achievement of one’s aims” (Chang, 2019, p. 14). As such, insofar as we are able to argue that a given practice in the sciences of the artificial possesses such operational coherence, it is compatible with pragmatic realism. What I have tried to show here is that the sciences of the artificial, including the ‘strange inversion’ of the role of ‘functions’ which they entail, are fully theorisable inside the framework of pragmatic realism. As such, unlike scientific realism, the latter does not fail to engage with the sciences of the artificial on its own terms.

To summarise this section, I have argued, by means of three examples, that pragmatic realism is a promising candidate for a philosophy of science within which it is possible to theorise the sciences of the artificial. In this, pragmatic realism differs from scientific realism. In particular, I have invoked the fact that the sciences of the artificial require us to take the “point of view of the scientist”, to acknowledge the iterative, path-dependent and self-referential nature of scientific inquiry (i.e., its historicity), and, finally, to accept the central role of ‘function’ in understanding designed artefacts.

Conclusion

In section 1, I have laid out the case for why we need a philosophy of science that can encompass questions arising from the sciences of the artificial. One central such question is the problem of paradigm choice, which requires the scientific practitioner to understand the ways in which their own theorising affects what will come into existence.

In section 2, I have considered whether scientific realism provides a sufficient account, and concluded that it doesn’t. I have listed three examples of ways in which scientific realism seems to be insufficiently expressive as an epistemology of the sciences of the artificial. Finally, in section 3, I explored whether we can do better, and have provided three examples of epistemic puzzles, arising from the sciences of the artificial, that pragmatic realism, in contrast with scientific realism, is able to account for. 

While scientific realism seems attractive on the basis of its explaining the success of science (of the natural), it does not in fact present a good explanation of the success of the science of the artificial. How, before things like, say, planes, computers, or democratic institutions existed, could we have learnt to build them if all that was involved in the scientific enterprise was uncovering that which (already) is? As such, I claim that the sciences of the artificial provide an important reason for why we should not be satisfied with the epistemological framework provided by scientific realism with respect to understanding and—importantly—guiding scientific inquiry. 


References

Boyd, R. N. (1983). On the current status of the issue of scientific realism. Methodology, Epistemology, and Philosophy of Science: Essays in Honour of Wolfgang Stegmüller on the Occasion of His 60th Birthday, June 3rd, 1983, 45-90.

Chang, H. (2019). Pragmatism, perspectivism, and the historicity of science. In Understanding Perspectivism (pp. 10–27). Routledge.

Chang, H. (2022). Realism for Realistic People. Cambridge University Press.

Christiano, P. (2015). On heterogeneous objectives. AI Alignment (medium.com). Retrieved from https://ai-alignment.com/on-heterogeneous-objectives-b38d0e003399.

Drexler, E. (2018). Paretotopian goal alignment, Talk at EA Global: London 2018. 

Drexler, E. (2019). Reframing Superintelligence. Future of Humanity Institute.

Fine, A. (1986). Unnatural Attitudes: Realist and Antirealist Attachments to Science. Mind, 95(378): 149–177. 

Fu, W., & Qian, Q. (2023). Artificial Intelligence and Dual Contract. arXiv preprint arXiv:2303.12350.

Laudan, L. (1981). A confutation of convergent realism. Philosophy of Science, 48(1), 19–49.

Normile, D. (2018). CRISPR bombshell: Chinese researcher claims to have created gene-edited twins. Science. doi: 10.1126/science.aaw1839.

Putnam, H. (1975). Mathematics, Matter and Method. Cambridge: Cambridge University Press.

Putnam, H. (1981). Reason, Truth and History. Cambridge: Cambridge University Press.

Simon, H. (1996). The Sciences of the Artificial (3rd ed.). The MIT Press.

Soares, N., & Fallenstein, B. (2017). Agent foundations for aligning machine intelligence with human interests: a technical research agenda. The Technological Singularity: Managing the Journey, 103–125.

Torretti, R. (2000). ‘Scientific Realism’ and Scientific Practice. In Evandro Agazzi and Massimo Pauri (eds.), The Reality of the Unobservable. Dordrecht: Kluwer.

Van Fraassen, B. (1980). The scientific image. Oxford University Press.

Vihalemm, R. (2012). Practical Realism: Against Standard Scientific Realism and Anti-realism. Studia Philosophica Estonica. 5/2: 7–22.

Epistemic justification in (Hu)man and Machine

What does it take for a belief to be epistemically justified? In the hope of providing a novel angle to this long-standing discussion, I will investigate the question of epistemic justification by means of considering not only (what one might call) ‘classical’ cases, but also ‘machine’ cases. Concretely, I will discuss whether—and, if so, on what basis—artificial systems instantiating intelligent behaviour can be said to form epistemically justified ‘beliefs’. This will serve as a sort of thought experiment or case study used to test plausible answers to the problem of epistemic justification and, potentially, derive inspiration for novel ones.

Why do I choose to adopt this methodological approach? Consider, by comparison, the classic question in biology: what is life? Fields such as astrobiology or artificial life allow us to think about this question in a more (and more appropriately) open-minded way—by helping us to uproot unjustified assumptions about what life can and cannot look like based on sampling from Earth-based forms of life alone. The field of artificial intelligence can serve a similar function vis-à-vis philosophical inquiry. Insofar as we aspire for our theories—including our theories of knowledge and epistemic justification—to be valid beyond the contingencies of human intelligence, insights from the study of AI stand in a fruitful intellectual symbiosis with philosophical thought. 

I will start our investigation into epistemic justification with a thought experiment. 

Rome: Alice is having dinner with her friends when the topic of her upcoming trip to Italy comes up. Alice explains that she will be taking a plane to Rome, Italy’s capital city, from where she will start her journey.

It seems uncontroversial to say that Alice is epistemically justified in her belief that Rome is in fact the capital of Italy. The question I want to raise here is: in virtue of what is this the case? Before I delve into examining plausible answers to this question, however, let us compare the former story to a slightly different one. 

Rome’: In this case, Bob is playing around with the latest large language model trained and made available by one of the leading AI labs—let’s call it ChatAI. Bob plays with the model in order to get a handle on what ChatAI is and isn’t able to do. At one point, he submits the following query to the model: “What is the capital of Italy?”, and the model replies: “The capital city of Italy is Rome.” 

By analogy to the first case, should we conclude that the model is epistemically justified in its claim that Rome is the capital of Italy? And if not, how are these two cases different? In what follows, I will investigate these questions in more detail, considering various approaches attempting to clarify what amounts to epistemic justification. To do so, I will toggle between considering the traditional (or human) case and the machine case of epistemic justification and study whether this dialogue can provide insight into the question of epistemic justification. 

Correctness (alone) is not enough—process reliabilism for minds and machines

Thus, let us return to a question raised earlier: in virtue of what can we say Alice is justified in claiming that Rome is the capital of Italy? A first pertinent observation is that Alice is correct in her statement: Rome is in fact the capital of Italy. While this appears relevant, it doesn’t represent a sufficient condition for epistemic justification. To see why, we need only think of cases where someone is correct due to mere chance or accident, or even against their better judgement. You may ask me a question about a topic I have never heard of, and yet I might get the answer right by mere luck. Or, in an even more extreme case, we may play a game where the goal is to not give a correct answer. It is quite conceivable, in virtue of my utter ignorance of the topic, that I end up giving an answer that turns out to be factually correct, despite trying to pick an answer that I believe to be wrong. In the first case, I got lucky, and in the second case, I uttered the correct answer against my better judgement. In neither case would my factually correct answer represent an epistemically justified one.

As such, I have shown that the truth condition (alone) is an insufficient account of epistemic justification. Furthermore, I have identified a particular concern: epistemic justification is not given in cases where a claim is correct for arbitrary or ‘lucky’ reasons. This conclusion seems to be supported when considering the machine case. If, say, we designed a program that, when queried, iterated through a predefined set of answers and picked one of them at random, then, even if this program happened to pick the correct answers, we wouldn’t feel compelled to consider this a case of epistemic justification. Insofar as we are here taking issue with the arbitrariness of the answer-producing process when considering its status of epistemic justification, we may come to wonder what it would look like for a claim to be correct on a non-arbitrary or non-lucky basis. 
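To make the arbitrariness concern concrete, here is a minimal sketch of such a program; the candidate answers and function name are hypothetical illustrations, not part of any real system:

```python
import random

# A toy 'answerer' that ignores the question entirely and simply
# picks one of a predefined set of answers at random.
CANDIDATE_ANSWERS = ["Rome", "Milan", "Naples", "Turin"]

def arbitrary_answerer(question: str) -> str:
    # The content of the question plays no role whatsoever
    # in producing the answer.
    return random.choice(CANDIDATE_ANSWERS)

# Even when this process happens to output "Rome" for the question
# "What is the capital of Italy?", its correctness is accidental:
# nothing about the process tracks the truth of the matter.
print(arbitrary_answerer("What is the capital of Italy?"))
```

On roughly a quarter of runs this program answers the Italy question correctly, yet on no run is its answer epistemically justified; this is precisely the intuition an account of justification needs to capture.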

To that effect, let us consider the proposal of process reliabilism (Goldman, 1979, 1986). At its core, this theory claims that a belief is epistemically justified if it is the product of a belief-formation process that is systematically truth-conducive. In other words, while it is insufficient to observe that a process produces the correct answer on a single and isolated instance, if a process tends to produce the correct answer with a certain reliability, said process acts as a basis for epistemic justification according to the reliabilist thesis. Applied to our Rome case from earlier, the question is thus which processes (e.g., of information gathering and processing) led Alice to claim that Rome is the Italian capital, and whether these same processes have shown sufficient epistemic reliability in other cases. Let’s say that, in Alice’s case, she arrived at her belief that Rome is the capital of Italy as follows. First, her uncle told her that he was about to emigrate to live in the capital city of Italy. A few weeks later, Alice received a letter from said uncle which was sent from, as she can tell by the postmark on the envelope, Rome. From this, Alice infers that Rome must be the capital of Italy. As such, Alice’s belief is justified insofar as it involved the application of perception, rational reflection, and logical reasoning, rather than, say, guessing, wishful thinking, or superstitious reasoning. 

Furthermore, we don’t have to understand reliability here merely in terms of the frequency at which a process produces true answers. Instead, we can interpret it in terms of the propensity with which it does so. In the latter case, we capture a notion of truth-conduciveness that pertains not only to the actual world as observed, but is also cognizant of other possible worlds. As such, it aims to be sensitive to the notion that a suitable causal link is required between the given process and its epistemic domain, i.e., what the process is forming beliefs over. This renders the thesis more robust against unlikely but statistically possible cases where an arbitrary process gets an answer repeatedly correct, cases that would otherwise undermine the extent to which process reliabilism can serve as a suitable basis for epistemic justification. To illustrate this, consider the case of the scientific method, where we rely on empiricism to test hypotheses. This process is epistemically reliable not in virtue of getting true answers at a certain frequency, but in virtue of its procedural properties, which guarantee that the process will, sooner or later, falsify wrong hypotheses. 

To summarise, according to process reliabilism, a belief-formation process is reliable as a function of its propensity to produce true beliefs, and the reliability (so defined) of a belief-formation process serves as the basis of epistemic justification for the resulting belief. How does this apply, or fail to apply, to the machine case from earlier (Rome’)? 

To answer this question, let us imagine that Bob continues to play with the model by asking it more questions about the capital cities of other countries. Assuming capabilities representative of the current state of the art in machine learning, and large language models in particular, let us say that ChatAI’s responses to Bob’s questions are very often correct. We understand enough about how machine learning works that, beyond knowing that the model is merely frequently correct, we can deny that ChatAI (and comparable AI systems) produces correct answers by mere coincidence. In particular, machine learning exploits insights from statistics and optimization theory to implement a form of inference on its training data. To demonstrate this and to test the performance of different models, the machine learning community regularly develops so-called ‘benchmarks’ that measure various performance-relevant features of the models being evaluated, such as accuracy, speed, or (learning) efficiency. As such, AI systems can, given appropriate design and training, produce correct outputs with high reliability and for non-arbitrary reasons. This suggests that, according to process reliabilism, outputs from ChatAI (and comparable AI systems) can qualify as epistemically justified. 
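The role benchmarks play in establishing reliability can be illustrated with a small sketch; the question set and the stand-in ‘model’ below are hypothetical, and real benchmarks are of course far larger and more varied:

```python
# A minimal sketch of benchmark-style evaluation: a model's reliability
# is estimated as its accuracy over a set of question-answer pairs.
# The 'model' here is a hypothetical stand-in for a system like ChatAI.
benchmark = [
    ("What is the capital of Italy?", "Rome"),
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

def lookup_model(question: str) -> str:
    # Stand-in for a trained model, answering from a fixed mapping.
    answers = {
        "What is the capital of Italy?": "Rome",
        "What is the capital of France?": "Paris",
        "What is the capital of Japan?": "Kyoto",  # a deliberate, systematic error
    }
    return answers.get(question, "unknown")

def accuracy(model, benchmark):
    # Fraction of benchmark questions the model answers correctly.
    correct = sum(1 for question, answer in benchmark if model(question) == answer)
    return correct / len(benchmark)

print(accuracy(lookup_model, benchmark))  # 2 of 3 correct
```

The deliberate error in the stand-in model shows what a benchmark is for: it quantifies the degree to which an answer-producing process is truth-conducive over a whole domain, rather than certifying any single answer.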

Challenge 1: “You get out only what you put in”

However, the reliabilist picture as painted so far does not hold up to scrutiny. The first problem I want to discuss concerns the fact that, even if procedurally truth-conducive, a process can produce systematically incorrect outputs if said process operates on wrong initial beliefs or assumptions. If, for example, Alice’s uncle was himself mistaken about what the capital of Italy is, thus moving to a city that he mistakenly thought was the capital, and if he had through his words and actions passed on this mistaken belief to Alice, the same reasoning process she used earlier to arrive at a (seemingly) epistemically justified belief would now have produced an incorrect belief. Differently put, someone’s reasoning might be flawless, but if it is based on wrong premises, its conclusions lack epistemic justification. 

A similar story can be told in the machine case. A machine learning algorithm seeking to identify the underlying statistical patterns of a given data set can only ever be as epistemically valid as the data set it is trained on. As a matter of fact, this is a widely discussed concern in the AI ethics literature, where ML models have been shown to reproduce biases present in their training sets. For example, language models have been shown (before corrective interventions were implemented) to associate certain professions (e.g., ‘CEO’ or ‘nurse’) predominantly with certain genders. Similarly, in the legal context, ML systems used to predict recidivism risk have been criticised for reproducing racial bias.  
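The point that a learner inherits the skew of its training data can be made vivid with a deliberately simple sketch; the data set and the majority-vote ‘learner’ are hypothetical stand-ins for real corpora and models:

```python
from collections import Counter

# A toy illustration of how a statistical learner reproduces bias
# present in its training data. The data set below is deliberately skewed.
training_data = [
    ("CEO", "male"), ("CEO", "male"), ("CEO", "male"), ("CEO", "female"),
    ("nurse", "female"), ("nurse", "female"), ("nurse", "female"), ("nurse", "male"),
]

def fit_majority_model(data):
    # 'Learning' here is just picking the most frequent gender per
    # profession, which is exactly why the model inherits the data's skew.
    counts = {}
    for profession, gender in data:
        counts.setdefault(profession, Counter())[gender] += 1
    return {prof: c.most_common(1)[0][0] for prof, c in counts.items()}

model = fit_majority_model(training_data)
print(model)  # the skew in the data reappears as the model's 'belief'
```

The learner’s procedure is faithfully applied to its data, yet its outputs reproduce whatever bias the data contains: a mechanical analogue of flawless reasoning from wrong premises.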

What this discussion highlights is that the reliabilist thesis as I stated it earlier is insufficient. Thus, let us attempt to vindicate the thesis before I discuss a second source of criticism that can be raised against it. We can reformulate a refined reliabilist thesis as follows: for a belief to be epistemically justified, it needs to a) be the product of a truth-conducive process, and b) the premises on which said process operates to produce the belief in question must themselves be justified. 

As some might notice, however, this approach risks running into a problem of regress. If justified belief requires that the premises on which the epistemic process operates must themselves be justified, how do those premises gain their justification other than by reference to a reliable process operating on justified premises? Without providing, in the context of this essay, a comprehensive account of how one may deal with this regress problem, I will offer a handful of pointers to attempts that have been made. 

A pragmatist, for example, may emphasise their interest in a process that can reliably produce useful beliefs. Since the usefulness of a belief is determined by its usage, this does not fall prey to the regress challenge as stated above: a belief can be tested for its usefulness without making reference to another belief. Klein (1999), on the other hand, denies that the type of regress at hand is vicious in the first place, appealing to a view called infinitism. According to infinitism, justification requires an appropriate chain of reasons, and such chains take the form of non-repeating infinite ones. Finally, Goldman himself (2008) tackles the regress problem by differentiating between basic and non-basic beliefs, where the former are justified without reference to another belief, in virtue of being the product of an unconditionally reliable process. Such basic beliefs then represent a plausible stopping point for the regress. Perception has been proposed as a candidate for such an unconditional process, although one may object to this account by denying that it is possible, or common, for perceptual or empirical data to be entirely atheoretical. In any case, the essence of Goldman’s proposal, and of the proposals of externalist reliabilists in general, is that a belief is justified not with reference to reflectively accessible reasons (which is what internalists propose), but in virtue of the causal process that produced the belief, whether or not that process makes reference to other beliefs. As such, externalists are commonly understood to be able to dodge the regress bullet. 

For now, this shall suffice as a treatment of the problem of regress. I will now discuss another challenge to process reliabilism (including its refined version as stated above). It concerns questions regarding the domain in which the reliability of a process is being evaluated. 

Challenge 2: Generalization and its limits

To understand the issue at hand better, let’s consider the “new evil demon problem”, first raised by Cohen (1984) as a critique of reliabilism. The problem arises from the following thought experiment: Imagine a world WD in which there exists an epistemic counterpart of yours, let’s call her Anna, who is identical to you in every regard except one. She experiences precisely what you experience and believes precisely what you believe. According to process reliabilism, you are epistemically justified in your beliefs about your world—let’s call it WO—on the basis of those beliefs being the product of truth-conducive processes such as perception or rational reasoning. In virtue of the same reasoning, Anna ought to be epistemically justified in her beliefs about her world. However, and this is where the problem arises, the one way in which Anna differs from you is that her experiences and beliefs about WD have been carefully curated by an evil demon with the aim of deceiving her. Anna’s world does not in fact exist in the way she experiences it. On a reliabilist account, or so some would argue, we would have to say that Anna’s beliefs are not justified, since her belief-formation processes do not reliably lead to correct beliefs. However, how can your counterpart, who in every regard relevant to the reliabilist thesis is identical to you, not be justified in her beliefs while you are? The dilemma arises because many would intuitively say that Anna is just as justified in believing what she believes as we are, despite the fact that the processes that produced Anna’s beliefs are unreliable. 

One way to cast the above problem, which also reveals a way to defuse it, is by indexing and then separately evaluating the reliability of the belief-formation processes for the different worlds, WO and WD. From here, as developed by Comesaña (2002), we can make the case that while the belief-formation processes are reliable in the case of WO, they are not in the case of WD. As such, the reliability of a process, and thus epistemic justification, must always be assessed relative to a specific domain of application. 

A similar approach to the same problem has been discussed, for example, by Jarrett Leplin (2007, 2009), who invokes the notion of ‘normal conditions’, a term originally introduced by Ruth Millikan (1984). The idea is that the reliability of a process is evaluated with respect to the normal conditions of its functioning. Leplin defines normal conditions as “conditions typical or characteristic of situations in which the method is applicable” and explains that “[a] reliable method could yield a preponderance of false beliefs, if used predominantly under abnormal conditions” (Leplin, 2007, p. 33). As such, the new evil demon case can be understood as a case where the epistemic processes which are reliable in a demon-less world cease to be reliable in the demon world, since that world no longer complies with the ‘normal conditions’ that guarantee the functionality of said processes. While promising as an approach to address a range of challenges raised against reliabilism, there is, one must note, still work to do in terms of clearly formalising the notion of normality.

What both of these approaches have in common is that they seek to defend reliabilism against the new evil demon problem by specifying the domain or conditions in which the reliability of a process is evaluated. Instead of suggesting that, for a process to be reliable—and thus to serve as a basis for epistemic justification—it has to be universally reliable, these refinements to reliabilism seek to formalise a way of putting boundaries on the application space of a given process. As such, we can understand the new evil demon problem as an instance of a more general phenomenon: generalization and its limits. This way of describing the problem serves to clarify how the new evil demon problem relates to issues frequently discussed in the context of machine learning.

The problem of generalization in machine learning concerns the fact that machine learning, generally speaking, works by exploiting underlying patterns to approximate functions that efficiently describe the data encountered. While this approach (and others) has enabled impressive AI applications to date, it faces important limitations. In particular, this learning method rests on an assumption, commonly called the IID assumption (i.e., that data are independently and identically distributed), which implies that the data set used in training must be representative of the data encountered upon deployment for there to be a guarantee of the effectiveness or accuracy of the learned model. In other words, while we have guarantees about a model’s performance (i.e., accuracy/loss) under the IID assumption, these guarantees no longer hold when the nature of the distribution changes, i.e., when we encounter what is called a distributional shift. Under distributional shift, whatever approximation a model has learnt may no longer be effective in the new (deployment) environment. This is called a failure to generalise.
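A toy sketch of such a generalisation failure, under the assumption of a quadratic ground truth and a linear learner (both chosen purely for illustration):

```python
import random

# Sketch of a generalisation failure under distributional shift.
# The ground truth is nonlinear; a linear fit works well near the
# training distribution but degrades badly once the inputs shift.

def true_function(x):
    return x * x  # the underlying pattern in the world

def fit_linear(xs, ys):
    # Closed-form least-squares fit of y = a*x + b (no libraries needed).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def mean_abs_error(model, xs):
    return sum(abs(model(x) - true_function(x)) for x in xs) / len(xs)

random.seed(0)
train_xs = [random.uniform(0, 1) for _ in range(100)]    # training distribution
shifted_xs = [random.uniform(5, 6) for _ in range(100)]  # deployment distribution

model = fit_linear(train_xs, [true_function(x) for x in train_xs])

print(mean_abs_error(model, train_xs))    # small: in-distribution
print(mean_abs_error(model, shifted_xs))  # large: out-of-distribution
```

The fitted model is reliable within the distribution it was trained on and unreliable outside it, which mirrors the proposal above that reliability be indexed to a domain of application.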

Let us reiterate the suggested analogy between the new evil demon problem and the problem of out-of-distribution generalization failures in machine learning. I claim that the demon world WD represents an ‘out-of-distribution case’ for the epistemic processes that are reliable in our world WO. Though Anna nominally uses the same processes, she uses them in an importantly different environment, which makes it unsurprising that they turn out to be unreliable in WD. After all, the reality of WD differs in fundamental ways from WO (namely, in the existence of the evil demon). Insofar as the thought experiment is intended to suggest that the demon itself may be subject to completely different fundamental laws than the ones that govern WO, the same processes that can approximate the fundamental laws of WO are not guaranteed to approximate the fundamental laws that govern WD. As such, I have vindicated process reliabilism from the evil demon problem by squaring what earlier appeared counterintuitive: the same processes that are reliable—and thus the basis for epistemic justification—in our world (WO) can turn out to be unreliable in an environment sufficiently foreign to ours, such as the demon world WD. 

Conclusion 

In this essay, I have set out to evaluate the question of epistemic justification. Most centrally, I discussed whether the proposal of process reliabilism may serve as a basis for justification. To this effect, I raised several challenges to process reliabilism. For example, I observed that a reliable process operating on false premises (or corrupted data) may cease to systematically produce correct beliefs. I then discussed ways to refine reliabilism to accommodate said concern, and how such refinements may or may not fall prey to a problem of regress. More practically speaking, I linked this discussion to the machine case by explaining how AI systems, even if they operate on reliable processes, may become corrupted in their ability to produce epistemically justified outputs through algorithmic bias arising from training on non-representative data samples. 

The second challenge to reliabilism I discussed concerns how the reliability of a process should be evaluated. In particular, I identified a need to specify and bound a ‘domain of application’ in reference to which a process’s reliability is established. The goal of such a demarcation—which may come in the form of indexing as suggested by Comesaña, in the form of defining normal conditions as proposed by Leplin, or in some other way—is to be sensitive to (the limits of) a process’s ability to generalise. As such, over the course of this discussion, I developed a novel perspective on the new evil demon problem by casting it as an instance of a cluster of issues concerning generalisation and its limits. While the new evil demon problem is commonly raised as an objection to process reliabilism—the claim being that the reliabilist verdict on the case is counterintuitive—I was able to vindicate reliabilism from these allegations. Anna’s epistemic processes—despite being nominally the same as ours—do fail to be reliable; however, said failure need not surprise us, because the demon world represents an application domain that is sufficiently and relevantly different from our world. 

Throughout the essay, I have attempted to straddle both the classical domain of epistemological inquiry and a more novel domain, which one may call ‘machine’ epistemology. I believe this dialogue can be methodologically fruitful, and I hope to have provided evidence for that conviction by means of the preceding discussion. It may serve as a source of inspiration; it may, as discussed at the start of this essay, help us appropriately de-condition ourselves from unjustified assumptions such as forms of anthropocentrism; and it may serve as a practical testing ground and source of empirical evidence for assessing the plausibility of different epistemological theories. Unlike humans or mental processes, machines provide us with a larger possibility space and more nimbleness in implementing and testing our theoretical proposals. This is not to say that there aren’t disanalogies between artificially intelligent machines and humans, and as such, any work that seeks to reap said benefits is also required to adopt the relevant levels of care and philosophical rigor. 

As a last, brief and evocative thought before closing, let us return to a question raised at the very beginning of this essay. When comparing the two cases Rome and Rome’, we asked whether we should conclude, by analogy between these two cases, that insofar as Alice is deemed justified in believing the capital of Italy is Rome, so must ChatAI be. First, we must recognise that the only way to take this analogy seriously is to adopt an externalist perspective on the issue—at least unless we are happy to get sucked into discussions of the possibility of machine mentality and of machines’ reflective awareness of their own reasons. While some may take issue with this on the basis of favouring internalism over externalism, others—including me—may endorse this direction of travel for metaphysical reasons (see, e.g., Ladyman & Ross, 2007). After all—and most scientific realists would agree on this—whatever processes give rise to human life and cognition, they must in some fundamental sense be mechanistic and materialistic (i.e., non-magical) in just the way machine processes are. As the field of AI continues to develop ever more complex processes, it would not be reasonable to exclude the possibility that they will, at some point—and in isolated cases already today—resemble human epistemic processes sufficiently that any basis of epistemic justification must either stand or fall for both types of processes simultaneously. This perspective can be seen as unraveling further depth in the analogy between classical and machine epistemology, and as such, as providing support for the validity of said comparison in philosophical and scientific thought.  

Resources

  • Cohen, Stewart (1984). “Justification and Truth”, Philosophical Studies, 46(3): 279–295. doi:10.1007/BF00372907

  • Comesaña, Juan (2002). “The Diagonal and the Demon”, Philosophical Studies, 110(3): 249–266. doi:10.1023/A:1020656411534

  • Conee, Earl and Richard Feldman (1998). “The Generality Problem for Reliabilism”, Philosophical Studies, 89(1): 1–29. doi:10.1023/A:1004243308503

  • Feldman, Richard (1985). “Reliability and Justification”, The Monist, 68(2): 159–174. doi:10.5840/monist198568226

  • Goldman, Alvin (1979). “What is Justified Belief?” In George Pappas (ed.), Justification and Knowledge. Boston: D. Reidel. pp. 1-25.

  • Goldman, Alvin (1986). Epistemology and Cognition, Cambridge, MA: Harvard University Press.

  • Goldman, Alvin (2008). “Immediate Justification and Process Reliabilism”, in Quentin Smith (ed.), Epistemology: New Essays, New York: Oxford University Press, pp. 63–82.

  • Goldman, Alvin (2009). “Internalism, Externalism, and the Architecture of Justification”, Journal of Philosophy, 106(6): 309–338. doi:10.5840/jphil2009106611

  • Goldman, Alvin (2011). “Toward a Synthesis of Reliabilism and Evidentialism”, in Trent Dougherty (ed.), Evidentialism and Its Discontents, New York: Oxford University Press, pp. 254–290.

  • Janiesch, C., Zschech, P., & Heinrich, K. (2021). “Machine learning and deep learning”, Electronic Markets, 31(3), 685-695.

  • Klein, P. (1999). “Human Knowledge and the Infinite Regress of Reasons,” in J. Tomberlin, ed. Philosophical Perspectives 13, 297-325. 

  • Ladyman, James & Ross, Don (2007). Every Thing Must Go: Metaphysics Naturalized. Oxford University Press.

  • Leplin, Jarrett (2007). “In Defense of Reliabilism”, Philosophical Studies, 134(1): 31–42. doi:10.1007/s11098-006-9018-3

  • Leplin, Jarrett (2009). A Theory of Epistemic Justification, (Philosophical Studies Series 112), Dordrecht: Springer Netherlands. doi:10.1007/978-1-4020-9567-2