A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.



Self-serving utilitarian arguments

Utilitarianism can give rise to self-serving arguments. These are arguments in which people justify prioritizing themselves over others based on the amount of good they will do. I argue that although self-serving arguments can be misused, they are sometimes morally justified. We need ways of distinguishing good-faith from bad-faith self-serving arguments, and I suggest a few ways we might do this.

In AI ethics, “bad” isn’t good enough

Lately I’ve been thinking about AI ethics and the norms we should want the field to adopt. It’s fairly common for AI ethicists to focus on harmful consequences of AI systems. While this is useful, we shouldn’t conflate arguments that AI systems have harmful consequences with arguments about what we should do. Arguments about what we should do have to consider far more factors than arguments focused solely on harmful consequences.

Price gouging: are we shooting the messenger of inequality?

People price gouge when they buy goods during an emergency in order to re-sell them for a higher price. Why does price gouging feel wrong to us? In this post I consider a couple of possible reasons and argue that price gouging feels wrong because when people price gouge others, only two kinds of people can buy a scarce good: the rich and the desperate. So it makes prior inequalities between people more salient. I call this “shooting the messenger of inequality” and argue that doing this is often counterproductive.

Fairness, evidence, and predictive equality

Sometimes information that makes a prediction more accurate can make that prediction feel less fair. In this post, I explore some possible causal principles that could lie beneath this kind of intuition, but argue that these principles are inconsistent with our intuitions in other cases. I then argue that our intuitions may reflect a desire to move towards more “predictive equality” in order to mitigate some of the negative social effects that come from making predictions based on properties generally correlated with worse outcomes.

AI bias and the problems of ethical locality

In this post I argue that attempts to reduce bias in AI decision-making face the problem of practical locality—we are limited in what we can do because the actions available to us depend on the society we find ourselves in—and the problem of epistemic locality—we are limited in what we can do because ethical views evolve over time and vary across regions. Both problems have consequences for work on AI bias, and the epistemic locality problem highlights the important links between AI bias and the alignment problem.

When robustly tolerable beats precariously optimal

Something is “robustly tolerable” if it performs adequately under a wide range of circumstances, including unexpectedly bad circumstances. In this post, I argue that when the costs of failure are high, it’s better for something to be robustly tolerable even if this means taking a hit on performance or agility.

The virtues and vices of shark curiosity

Embracing the kind of aggressive curiosity of sharks seems to be a good way of getting better at arguing. But it can have a chilling effect on discourse and friendships. In this post, I explain what I mean by shark curiosity, and how we can strike the right balance between nurturing and testing new ideas.

The optimal rate of failure

We sometimes assume that seeing someone fail implies that they are doing something wrong, but I argue that the ideal rate at which our plans should fail is often quite high. I note that this has consequences in politics and ethics that are often underappreciated.

Does deliberation limit prediction?

There is a longstanding debate about whether deliberation prevents us from making any predictions about actions. In this post I will argue for a weaker thesis, namely that deliberation limits our ability to predict actions.

Disagreeing with content and disagreeing with connotations

It’s possible to agree with the content of a piece of writing but to think that the conclusions that many readers might draw from it are wrong. I think it’s useful to distinguish between these before criticizing the writing of others.

Impossibility reasoning

It’s typical to teach and use sequential reasoning, but all sequential arguments can be reformulated as impossibility results. I argue that thinking about and presenting arguments as impossibility results can be more fruitful than sequential reasoning.

Keep others’ identities small

I really like Paul Graham’s advice to “keep your identity small” - to avoid making groups or positions part of your identity if you want to remain unbiased. But I often want to add to it “and keep other people’s identities small too”.

Transmitting credences and transmitting evidence


Against jargon

It’s sometimes useful to introduce new terms into discourse: new terms can increase communication efficiency, but at the cost of accessibility and sometimes precision. In this post I outline the pros and cons of introducing new, domain-specific terms.

Some noise on signaling

I ask what signaling is and argue that it’s a bad idea to simply accuse people of “signaling” because signaling can mean a lot of things. I also argue that not all signaling is bad.

Vegetarianism, abortion, and moral empathy

When people disagree about moral issues, they often fail to treat the moral beliefs of those that they disagree with as genuine moral beliefs. Instead, they treat them like mere whims or mild preferences. This shows a lack of what I call moral empathy. I argue that lacking moral empathy can be harmful and can prevent fruitful discussion on divisive topics.

Can we offset immorality?

People offset bad actions in various ways. The most salient example of this is probably carbon offsetting, where we pay a company to reduce the carbon in the atmosphere by roughly the same amount that we put in. But there are arguably more mundane examples of acts that are intended to offset immoral behavior. In this post I ask what moral offsetting is and whether it is something we should be in favor of.

Prison is no more humane than flogging

Many people believe that corporal punishment has no place in a modern criminal justice system. Imprisonment is seen as a more humane form of punishment, and it is the form employed in most modern criminal justice systems. In this post I ask why we think that imprisonment is humane while corporal punishment is not, and I suggest that the answer should cause us to question the ethics of imprisoning people.

Is the born this way message homophobic?

The message of “born this way” is that your sexual orientation is something you’re born with rather than something you choose. This is considered an important point in the justification of gay rights. I’m a strong supporter of gay rights, but I realised just over a year ago that something about this slogan didn’t sit right with me. I’m now pretty confident that basing gay rights on the “born this way” message can be pretty harmful to LGBT people and other oppressed groups.




Objective Epistemic Consequentialism

June 01, 2011

In this thesis I construct and defend a position that I call objective epistemic consequentialism. Objective epistemic consequentialism states that we ought to believe a proposition P to a given degree if and only if doing so produces the most epistemic value. I argue that this offers a viable response not only to the question of what we should believe and why, but also to which decision procedures we should commit ourselves to, what is of final epistemic value, and what is the nature of epistemic oughts.

Recommended citation: Askell, Amanda. ‘Objective Epistemic Consequentialism.’ BPhil thesis, University of Oxford (2011).

Epistemic Consequentialism and Epistemic Enkrasia

Published in Epistemic Consequentialism, 2018

In this chapter I investigate what the epistemic consequentialist will say about epistemic enkrasia principles: principles that instruct one not to adopt a belief state that one takes to be irrational. I argue that a certain epistemic enkrasia principle for degrees of belief can be shown to maximize expected accuracy, and thus that a certain kind of epistemic consequentialist is committed to such a principle. But this is bad news for such an epistemic consequentialist because epistemic enkrasia principles are problematic.

Recommended citation: Askell, Amanda. ‘Epistemic Consequentialism and Epistemic Enkrasia’. In Epistemic Consequentialism, edited by Kristoffer Ahlstrom-Vij and Jeff Dunn. Oxford: Oxford University Press, 2018

Pareto Principles in Infinite Ethics

May 01, 2018

In this thesis I argue that ethical rankings of worlds that contain infinite levels of wellbeing ought to be consistent with the Pareto principle, which says that if two worlds contain the same agents, some agents are better off in the first world than in the second, and no agents are worse off in the first world than in the second, then the first world is better than the second. I show that if we accept four axioms – the Pareto principle, transitivity, an axiom stating that populations of worlds can be permuted, and the claim that if the ‘at least as good as’ relation holds between two worlds then it holds between qualitative duplicates of this world pair – then we must conclude that there is ubiquitous incomparability between infinite worlds.
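The Pareto principle stated in the abstract can be sketched formally; the notation below (a shared agent set and a wellbeing function) is my own shorthand, not the thesis’s:

```latex
% Sketch of the Pareto principle (notation assumed, not from the thesis):
% worlds w_1, w_2 share the agent set A, and u_i(w) denotes agent i's
% wellbeing in world w.
\[
\bigl(\forall i \in A:\ u_i(w_1) \ge u_i(w_2)\bigr)
\ \wedge\
\bigl(\exists i \in A:\ u_i(w_1) > u_i(w_2)\bigr)
\ \Longrightarrow\
w_1 \succ w_2
\]
```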

Recommended citation: Askell, Amanda. ‘Pareto Principles in Infinite Ethics.’ PhD thesis, New York University (2018).

AI Safety Needs Social Scientists

Published in Distill, February, 2019

Properly aligning advanced AI systems with human values will require resolving many uncertainties related to the psychology of human rationality, emotion, and biases. These can only be resolved empirically through experimentation — if we want to train AI to do what humans want, we need to study humans.

Recommended citation: Irving, Geoffrey, and Amanda Askell. ‘AI Safety Needs Social Scientists.’ Distill, 2019.

Prudential Objections to Atheism

Published in A Companion to Atheism and Philosophy, 2019

In order for prudential objections to atheism to get off the ground, we must believe that we can have prudential reasons for and against believing things. In this chapter, I argue that a modest version of this view is more plausible than it may initially seem. I then explore two kinds of prudential reasons for belief: personal benefits like consolation, health, and community; and Pascal’s contention that we are more likely to experience an infinitely good afterlife if we believe in God.

Recommended citation: Askell, Amanda. ‘Prudential Objections to Atheism’. In A Companion to Atheism and Philosophy, edited by Graham Oppy. Wiley-Blackwell, 2019.

The Role of Cooperation in Responsible AI Development

July 10, 2019

In this paper, we argue that competitive pressures could incentivize AI companies to underinvest in ensuring their systems are safe, secure, and have a positive social impact. Ensuring that AI systems are developed responsibly may therefore require preventing and solving collective action problems between companies. We note that there are several key factors that improve the prospects for cooperation in collective action problems. We use this to identify strategies to improve the prospects for industry cooperation on the responsible development of AI.

Recommended citation: Askell, Amanda, Miles Brundage, and Gillian Hadfield. ‘The Role of Cooperation in Responsible AI Development.’ arXiv preprint arXiv:1907.04534 (2019).

Release Strategies and the Social Impacts of Language Models

August 24, 2019

This report discusses OpenAI’s work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increased. It also discusses ongoing partnership-based research and provides recommendations for better coordination and responsible publication in AI.

Recommended citation: Solaiman, Irene, et al. ‘Release strategies and the social impacts of language models.’ arXiv preprint arXiv:1908.09203 (2019).

Evidence Neutrality and the Moral Value of Information

Published in Effective Altruism: Philosophical Issues, 2019

In this chapter, I consider whether there is a case for favoring interventions whose effectiveness has stronger evidential support, when expected effectiveness is equal. I argue that in fact the reverse is true: when expected value is equal one should prefer to invest in interventions that have less evidential support, on the grounds that by doing so one can acquire evidence of their effectiveness (or ineffectiveness) that may then be valuable for future investment decisions.

Recommended citation: Askell, Amanda. ‘Evidence Neutrality and the Moral Value of Information’. In Effective Altruism: Philosophical Issues, edited by Hilary Greaves and Theron Pummer. Oxford: Oxford University Press, 2019.

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims

April 20, 2020

This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose (spanning institutions, software, and hardware) and make recommendations aimed at implementing, exploring, or improving those mechanisms.

Recommended citation: Miles Brundage, Shahar Avin, Jasmine Wang, and Haydn Belfield, et al. ‘Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims’. arXiv preprint arXiv:2004.07213 (2020).

Language Models are Few-Shot Learners

May 28, 2020

Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.

Recommended citation: Brown, Tom; Mann, Ben; Ryder, Nick; Subbiah, Melanie et al. ‘Language Models are Few-Shot Learners.’ arXiv preprint arXiv:2005.14165 (2020).

Learning Transferable Visual Models From Natural Language Supervision

January 05, 2021

State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.

Recommended citation: Radford, Alec, and Jong Wook Kim, et al. ‘Learning Transferable Visual Models From Natural Language Supervision.’ (2021).


The Moral Value of Information

When faced with ethical decisions, we generally prefer to act on more evidence rather than less. If the expected values of two options available to us are similar but one option’s expected value is based on more evidence than the other’s, then we will generally prefer the option that has more evidential support. In this talk, I argue that although we are intuitively disinclined to favor interventions with poor evidential support, there are reasons for thinking that these are sometimes better than interventions with a proven track record.

Pascal’s Wager and other low risks with high stakes

In this episode of Rationally Speaking I argue that it’s much trickier to rebut Pascal’s Wager than most people think. Julia and I also discuss how to handle other decisions where a risk has very low probability but would matter a lot if it came true – should you round them down to zero? Does it matter how measurable the risk is? And should you take into account the chance you’re being scammed?

Moral Offsetting

With Tyler John (co-author and co-presenter). Many people try to offset the harm their behaviors do to the world. People offset carbon, river and air pollution, risky clinical trials, and consuming animal products. Yet it is not clear whether and when such offsetting is permissible. We give a preliminary conceptual analysis of moral offsetting. We then show that every comprehensive moral theory faces a trilemma: (i) any bad action can in principle be permissibly offset; (ii) no actions can in principle be permissibly offset; or (iii) there is a bad act that can be permissibly offset such that another act worse by an arbitrarily small degree cannot be offset, no matter how great the offset.

Moral empathy, the value of information & the ethics of infinity

In this episode of The 80,000 Hours podcast we cover a range of topics, including the problem of ‘moral cluelessness’, whether there is an ethical difference between prison and corporal punishment, how to resolve ‘infinitarian paralysis’, how we should think about jargon, and having moral empathy for intellectual adversaries.

AI safety needs social scientists

When an AI wins a game against a human, that AI has usually trained by playing that game against itself millions of times. When an AI recognizes that an image contains a cat, it’s probably been trained on thousands of cat photos. So if we want to teach an AI about human preferences, we’ll probably need lots of data to train it. In this talk, I explore ways that social science might help us steer advanced AI in the right direction.

OpenAI’s GPT-2 Language Model

TWIML hosted a live-streamed panel discussion exploring the issues surrounding OpenAI’s GPT-2 announcement. The discussion explores issues like where GPT-2 fits in the broader NLP landscape, why OpenAI didn’t release the full model, and the best practices in honest reporting of new ML results.

Publication norms, malicious uses of AI, and general-purpose learning algorithms

In this episode of The 80,000 Hours podcast, Miles Brundage, Jack Clark, and I discuss a range of topics in AI policy, including the most significant changes in the AI policy world over the last year or two, how much the field is still in the phase of just doing research versus taking concrete actions, how should we approach possible malicious uses of AI, and publication norms for AI research.

Responsible AI development as a collective action problem

It has been argued that competitive pressures could cause AI developers to cut corners on the safety of their systems. If this is true, however, why don’t we see this dynamic play out more often in other private markets? In this talk I outline the standard incentives to produce safe products: market incentives, liability law, and regulation. I argue that if these incentives are too weak because of information asymmetries or other factors, competitive pressure could cause firms to invest in safety below a level that is socially optimal. I argue that, in such circumstances, responsible AI development is a kind of collective action problem. I then develop a conceptual framework to help us identify levers to improve the prospects for cooperation in this kind of collective action problem.

Girl Geek X: What is AI Policy?

AI systems can fail unexpectedly, be used in ways their creators didn’t anticipate, or have unforeseen social consequences. But the same can be said of cars and pharmaceuticals, so why should we think there are any unique policy challenges posed by AI? In this talk, I point out that the primary mechanisms for preventing these sorts of failures in other industries are not currently well-suited to AI systems. I then discuss the ways that engineers can help meet these policy challenges.