<aside>

Summary

This project combines a qualitative interview study with a structured taxonomy of the field's disagreements. The work maps how consciousness researchers and field-adjacent participants reason about AI consciousness, what evidence would potentially alter their credence, and whether AI consciousness would carry moral weight.

The premise is that experts in this field often talk past each other when they state their positions. Asking "what would convince you?" surfaces the epistemic structure underneath the conclusions and separates substantive disagreement from semantic disagreement, where two researchers can say "consciousness requires X" and mean very different things by X.

The study uses semi-structured interviews with consciousness scientists, AI researchers, philosophers, and governance practitioners. The protocol is designed to probe the conditions under which participants' views would shift, and in which direction.

Analysis runs in two passes. Content analysis applies a literature-derived taxonomy, classifying claims at the conceptual, methodological, theoretical, metaphysical, or normative level, and separating debates about AI consciousness from debates about AI moral status. Thematic analysis surfaces emergent patterns the taxonomy misses and the factors that shape participants' views.

</aside>

Background

Discourse around the possibility of consciousness emerging in AI, and the moral implications that may follow, is increasing. This is partly due to phenomena such as the ‘situational awareness’ attributed to current large language models (LLMs), meaning their proposed ability to be aware of themselves and the situation they are in, and partly due to the development of human-AI personal relationships in recent years. The field sits within a complex political and ideological landscape, where claims about consciousness in these systems, and the broader narratives surrounding them, can be financially beneficial to AI companies. These incentives, coupled with the sensationalism clouding AI discourse, incite both credulity and scepticism.

Consequently, there are disagreements about AI consciousness at multiple levels. These also stem from the difficulty of defining consciousness (as opposed to other properties related to subjective experience, such as sentience, agency, relational capacity, or intelligence) and of verifying the existence of subjective experience in any system. The difficulty is especially acute for AI systems: they are trained on human-produced data, so they can imitate humans convincingly and can optimise towards life-like benchmarks set for them; they also lack the biochemical features that would typically be used as evidence of subjective experience in biological systems. These problems are most pressing when relying on behavioural indicators to attribute consciousness, so researchers advocate looking for structural indicators too (i.e. the specific architectural organisation a system possesses).

That said, using these indicators (behavioural, structural or otherwise) to attribute consciousness assumes that consciousness can be computed in digital systems (computational functionalism). Biological naturalists, by contrast, argue that the substrate of the system matters and that consciousness can only emerge in biological systems. Both positions presume that consciousness is understandable and definable in the first place, whereas others argue that its nature resists definition at a fundamental level. These are just a few of the disparate perspectives on AI consciousness, which are often predicated on loaded assumptions (acknowledged or not), leaving researchers in the field entrenched in a definitional morass.

Therefore, we categorise areas of debate across both the AI consciousness and moral status discourses. Subcategories are non-exhaustive and are derived systematically from a narrative review of the literature; they represent active debates in the field, so the criterion for inclusion is that each must contain at least two differing stances. Some of these subcategories are closely related and could be merged into broader umbrella categories, but they are kept separate here to highlight the distinctions between them.

Table 1: AI consciousness (C)

| Level | Subcategory |
| --- | --- |
| 1. Conceptual | a. Defining terms (‘consciousness’, ‘sentience’, ‘intelligence’, etc) |
| | b. The adequacy of current theories of consciousness (Do they require refinement or fundamental revision?) |
| | c. Consciousness as a gradual process or binary presence/absence distinction |
| 2. Methodological | a. The use of a single or multiple evidential indicator(s) |
| | b. The application of existing frameworks (E.g. neuroscience - should we compare computational systems to biological neural development?) |
| | c. The reliability of self-reported consciousness |
| | d. The verifiability of subjective experience in AI systems |
| 3. Theoretical | a. Structural (What internal organisation does a system need to have?) |
| | b. Behavioural (Which observable capabilities are treated as relevant?) |
| | c. Computational sufficiency (Is running the right computation enough, or is something else required?) |
| 4. Metaphysical | a. Substrate dependence (Does what it's made of matter?) |
| | b. The distinct quality and definability of subjective experience (Is it independent of functional/physical description, and can its nature even be grasped?) |
| | c. Synchronic unity of consciousness (Singular and bounded or multiple and distributed?) |
| | d. Diachronic identity of AI systems (Across context windows, models, updates, etc) |

Table 2: Moral status of AI systems (M)

| Level | Subcategory |
| --- | --- |
| 1. Conceptual | a. What grounds moral status? (sentience, agency, interests, relational capacity, etc) |
| | b. The relationship between consciousness and moral status (necessary, sufficient, not at all) |
| 2. Methodological | a. The appropriate evidential basis for AI moral status criteria (intuitions, empirical evidence, thresholds) |
| | b. The application of existing frameworks (e.g. animal welfare research, moral psychology) |
| 3. Normative | a. Specific obligations following from AI moral patienthood (welfare or rights based, scaling, legal protections, institutional responsibility) |
| | b. The interaction between AI moral status and consideration for existing moral patients |
| | c. Whether creating, modifying, or terminating potentially morally relevant AI systems obliges or denies responsibility |
| | d. The asymmetry of error (Is it worse to treat moral patients as tools or tools as moral patients?) |

Scope and Methodology

We conduct a series of qualitative interviews across the fields of consciousness science, philosophy of mind, and AI research. Our aim is to investigate how people informally reason about AI consciousness beyond published research agendas, and to test whether our findings reflect the cruxes identified in the literature or uncover new ones. We ask participants what evidence would convince them of AI consciousness, and whether AI consciousness would carry moral relevance. We take two approaches to transcript analysis: content analysis and thematic analysis. In the content analysis, we categorise participants’ answers using the classification system derived from the existing literature (Tables 1 and 2). In the thematic analysis, we capture emergent themes in the interviews that may not be reflected in the classification system, and elucidate the various factors that shape people’s views.
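As a rough illustration of how the content-analysis pass could be organised, the sketch below encodes the level structure of Tables 1 and 2 and tallies coded claims by level. The short codes (for example `C3a` for Table 1, level 3, subcategory a), the function names, and the sample claims are all hypothetical placeholders introduced here for illustration; they are not the study's own coding notation or data.

```python
from collections import Counter

# Level names and number of subcategories per level, read off Tables 1 and 2.
# "C" = AI consciousness (Table 1), "M" = moral status (Table 2).
# The short-code scheme itself (e.g. "C3a") is an assumption for illustration.
TAXONOMY = {
    "C": {1: ("Conceptual", 3), 2: ("Methodological", 4),
          3: ("Theoretical", 3), 4: ("Metaphysical", 4)},
    "M": {1: ("Conceptual", 2), 2: ("Methodological", 2),
          3: ("Normative", 4)},
}


def valid_codes():
    """Map every code such as 'C3a' to the name of its level."""
    codes = {}
    for table, levels in TAXONOMY.items():
        for number, (level_name, n_subcategories) in levels.items():
            for i in range(n_subcategories):
                letter = chr(ord("a") + i)
                codes[f"{table}{number}{letter}"] = level_name
    return codes


def tally_by_level(coded_claims):
    """Count coded claims per (table, level), rejecting unknown codes."""
    codes = valid_codes()
    counts = Counter()
    for participant, code in coded_claims:
        if code not in codes:
            raise ValueError(f"{participant}: unknown taxonomy code {code!r}")
        counts[(code[0], codes[code])] += 1
    return counts


# Invented placeholder claims, not study results.
example_claims = [("P1", "C3a"), ("P2", "C3a"), ("P1", "M1b")]
print(tally_by_level(example_claims))
```

Rejecting unknown codes is one way to keep claims that fit no subcategory visible, so they can be passed to the thematic analysis rather than forced into the taxonomy.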

Preliminary findings

Study participant views:

What would convince participants of AI consciousness

<aside>

Common themes:

Preference for structural/architectural over behavioural evidence.

Chinese room thought experiment:

Other stances:

</aside>

Likelihood of current AI systems being conscious:

Indented bullet represents the estimate 10 years from now:

<aside> 💡

Views range from 0% likelihood to "current models are conscious".

Conceptualisations/moral frameworks:

<aside>