
Recent Discussion

Book review: Human Compatible, by Stuart Russell.

Human Compatible provides an analysis of the long-term risks from artificial intelligence, by someone with a good deal more relevant prestige than any prior author on this subject.

What should I make of Russell? I skimmed his best-known book, Artificial Intelligence: A Modern Approach, and got the impression that it taught a bunch of ideas that were popular among academics, but which weren't the focus of the people who were getting interesting AI results. So I guessed that people would be better off reading Deep Learning by Goodfellow, Bengio, and Courville

... (Read more)

Thanks for this perspective! I really should get around to reading this book...

Have you ever played the game Hanabi? Some of the statements you make imply, in a "why would he say them otherwise?" style, that your error bars aren't big enough.

So, depending on how you feel about statements like, e.g., "Human Compatible neither confirms nor dispels the impression that Russell is a bit too academic", I think you should either widen your error bars, or do a better job of communicating wide error bars.

So, a living being is composed of multiple parts that act pretty much in tandem except in extreme situations like cancer. How does that work?

Answer by romeostevensit (3h, 2 points): The Republic is about this. As is Moby-Dick, though the metaphor is explicitly declared in the former and left implicit in the latter. Plato's stuff actually makes even more sense if you append the death-of-Socrates cycle to the end of the Republic. First you instantiate the philosopher king, who puts the house in order; then the philosopher king commits suicide as a logical result of the rules set up by that very same philosopher king.
Raemon (1h, 3 points): I'd be interested in hearing more thoughts about this.

Imagine that Herman Melville was doing IFS and that the book is his notes. There are different ways to think about how he splits things up into different characters (just as everyone's IFS process is idiosyncratic but has recurring patterns), but the overall frame winds up feeling like it just fits. And I don't mean this in the vacuous 'everything could be an IFS manual if you think about it' way. I'm actually not familiar with any others besides those two that are central examples of the thing. Thinking for a bit I'd venture ... (read more)

I am a bit confused and thought I'd rather ask and discuss here before thinking about it for long. As usual I am trying to compartmentalize, structure, make distinctions.

My confusion was triggered by thinking about the evaluation function (a heuristic to rate the certainty of a win/loss) in chess. Clearly everything it takes is already there on the board; in fact, the game is already decided by that state, assuming both players play to force the best possible outcome.

Why do we need to process data when the information is obviously already in the input? (Yes, I know one can make wordy the distinct... (Read more)

MoritzG (3h, 1 point): Ok, let's go with chess. For that game there is an optimal balance between the tree search and the evaluation function. The search is exploratory; the evaluation is a score. The evaluation can obviously predict the search to some degree. Humans are very bad at searching, yet some can still win against computers. The search decompresses the information into something more easily evaluated by a computer; a human can do it with much less expansion. Is that just a matter of hardware, or is it because the information was there all along and just needed a "smarter" analysis?

I may not have been clear enough. The evaluation _IS_ a search. The value of a position is exactly the value of a min-max adversarial search to a leaf (game end).

Compression and caching and prediction are ways to work around the fact that we don't actually have the lookup table available.
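To make the "evaluation is a min-max search" point concrete, here is a minimal runnable sketch (my illustration, not the commenter's code). Chess's tree is far too large for this, so it uses a toy Nim-like game: players alternately take 1 or 2 stones, and whoever takes the last stone wins.

```python
# Minimal sketch: the exact "evaluation" of a position is the value of a
# full min-max search down to the game's end (the leaves).

from functools import lru_cache

@lru_cache(maxsize=None)
def value(stones: int, maximizing: bool) -> int:
    """Exact game value: +1 if the maximizing player wins, -1 if they lose."""
    if stones == 0:
        # The previous player took the last stone and won.
        return -1 if maximizing else 1
    children = [value(stones - take, not maximizing)
                for take in (1, 2) if take <= stones]
    return max(children) if maximizing else min(children)

# The lru_cache is the "lookup table" mentioned above: caching collapses
# the exponential game tree into one entry per position.
print(value(9, True))   # -1: multiples of 3 are lost for the player to move
print(value(10, True))  # +1: take 1 stone, leaving your opponent with 9
```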

Answer by Jader Martins (11h, 1 point): "Why can I have little information but still have to search a huge state space? Why can't I go straight to the conclusion/action?" There's no answer to this question; if you find one, claim your prize of one million dollars: https://en.wikipedia.org/wiki/P_versus_NP_problem. Here is a video explaining it better: https://www.youtube.com/watch?v=YX40hbAHx3s
Go F*** Someone

As always, cross-posted from Putanumonit.


From Tokyo to TriBeCa, people are increasingly alone. People go on fewer dates, marry less and later, have smaller families if at all. People are having less sex, especially young people. The common complaint: it’s just too hard. Dating is hard, intimacy is hard, relationships are hard. I’m not ready to play on hard mode yet, I’ll do the relationship thing when I level up.

And simultaneously, a cottage industry sprung up extolling the virtue of loneliness. Self-care, self-development, self-love. Travel solo, live solo, you do you. Wait, doesn’t that la

... (Read more)

I really enjoyed this post. The analogy of capital vs. labor really hit home in particular, I realized that’s exactly how I’ve been implicitly treating dating, so I think this post is likely to change my behavior in the future. Thanks for writing it.

Mary Chernyshenko (10h, 3 points): Also, when a person who has been building his or her career becomes "staying at home", the person doesn't just lose standing among peers and colleagues. He or she loses the peers and colleagues altogether. It is one thing to be known as "someone who is no longer staying at work until nine", but it's quite a different thing to just not be known anymore. It makes you... lonely.
The Road to Mazedom

Previous post: How to Escape From Immoral Mazes

Sequence begins here: Moloch Hasn’t Won

The previous posts mostly took mazes as given. 

As an individual, one’s ability to fight any large system is limited. 

That does not mean our individual decisions do not matter. They do matter. They add up. 

Mostly our choice is a basic one. Lend our strength to that which we wish to be free from. Or not do so. 

Even that is difficult. The methods of doing so are unclear. Mazes are ubiquitous. Not lending our strength to mazes, together with the goal of keeping one’s metaphorical soul intact and still putting food o... (Read more)

Zvi (2h, 2 points): Without anchoring anyone too much on my question elsewhere in the thread: I would say that this is certainly a central case of maze behavior and points in the correct direction, but as a definition of all maze behavior it is importantly too small a class of things. There is something more fundamental going on, and it is a Fnord. (I have Fnord at the top of my future post pile, where a Fnord is a thing that makes you not want to look at it or notice it.)
Zvi (2h, 5 points): Yes, that is what it is intended to mean, while noting that 'acting like a maze' or 'doing what it takes to get ahead in a maze' is in general a maze-creating and maze-supporting behavior. Agree with Raemon that I haven't done the best job of summarizing exactly what maze behaviors actually are. I attempted with this post to summarize my model of how mazes come to be and become powerful, but that is a different question. I will consider writing an explicit post to cover this, since it isn't in any of the scheduled posts either, but it seems like an important thing to have. Thank you for pointing this out.

I would like to take this opportunity to ask others, without anchoring them with an answer: if you had to give a short summary answer to "What exactly are maze behaviors?", what would you say? I want to know what is being communicated, and also people might have their own insights/perspectives/behaviors.
lionhearted (16h, 12 points): Two thoughts. First, small technical feedback: do you think there's some classification of these factors, however narrow or broad, that could be sub-headlines? For instance, #24 and #29 seem to be similar things:

#24: As the overall maze level rises, mazes gain a competitive advantage over non-mazes.

#29: As maze levels rise, mazes take control of more and more of an economy and people's lives.

As do #27 and #28:

#27: Mazes have reason to and do obscure that they are mazes, and to obscure the nature of mazes and maze behaviors. This allows them to avoid being attacked or shunned by those who retain enough conventional not-reversed values that they would recoil in horror from such behaviors if they understood them, and potentially fight back against mazes or to lower maze levels. The maze-embracing individuals also take advantage of those who do not know of the maze nature. It is easy to see why the organizations described in Moral Mazes would prefer people not read the book Moral Mazes.

#28: Simultaneously with pretending to the outside not to be mazes, those within them will claim if challenged that everybody knows they are mazes and how mazes work.

While it's hard to pin down exactly what the categories would be, it seems that the first cluster is about something like feedback loops and the second cluster is about something like deceit, self-deceit, etc. The categories could even be very broad, like "Inherent Biases", "Incentives and Rewards", "Feedback Loops", etc. Or they could be narrower. But it's difficult to follow a list of 37 propositions, some of which are relatively simple and self-contained and others of which are synthesis, conclusion, and extrapolation of previous points.

Ok, second thought: this is all largely written from the point of view of how bad these things are for a participant. I bet it'd be interesting to flip the viewpoint and analysis and explore it from the view of a leader/executive/etc. who was trying to forestall these effects. For inst

On the editing note: I think subheaders require that things happen in header order, but I want to go in timeline order, and I don't think you can make clean breaks given that restriction. I'm presuming you could group them into types of steps in useful ways if you were so inclined and had a reason to go in that direction.

On the second note, I do worry that people will think that #4 is both more endogenous and does more work than I see it as being and doing, and use that as a reason to think of this as a localized and conditional problem. But in terms... (read more)

Moloch Hasn’t Won

This post begins the Immoral Mazes sequence. See introduction for an overview of the plan. Before we get to the mazes, we need some background first.

Meditations on Moloch

Consider Scott Alexander’s Meditations on Moloch. I will summarize here. 

Therein lie fourteen scenarios where participants can be caught in bad equilibria.

  1. In an iterated prisoner’s dilemma, two players keep playing defect.
  2. In a dollar auction, participants massively overpay.
  3. A group of fishermen fail to coordinate on using filters that efficiently benefit the group, because they can't punish those who profit by not using them.
... (Read more)

I did a little more work to make it flow better in OP, and I'm going to let it drop there unless a bunch of other people confirm they had this same issue and it actually mattered (and with the new version).

Clarifying "AI Alignment"Ω

When I say an AI A is aligned with an operator H, I mean:

A is trying to do what H wants it to do.

The “alignment problem” is the problem of building powerful AI systems that are aligned with their operators.

This is significantly narrower than some other definitions of the alignment problem, so it seems important to clarify what I mean.

In particular, this is the problem of getting your AI to try to do the right thing, not the problem of figuring out which thing is right. An aligned AI would try to figure out which thing is right, and like a human it may or may not succeed.

Analogy

Consider a human... (Read more)

Vanessa Kosoy (9h, 2 points): The acausal attack is an example of how it can happen for systematic reasons. As for the other part, that seems like conceding that intent-alignment is insufficient and you need "corrigibility" as another condition (also, it is not so clear to me what this condition means).

It is possible that Alpha cannot predict it, because in Beta-simulation-world the user would confirm the irreversible action. It is also possible that the user would confirm the irreversible action in the real world because the user is being manipulated, and whatever defenses we put in place against manipulation are thrown off by the simulation hypothesis. Now, I do believe that if you set up the prior correctly then it won't happen, thanks to a mechanism like: Alpha knows that in case of dangerous uncertainty it is safe to fall back on some "neutral" course of action plus query the user (in specific, safe ways). But this exactly shows that intent-alignment is not enough and you need further assumptions.

Besides the fact that ascription universality is not formalized, why is it equivalent to intent-alignment? Maybe I'm missing something. I am curious whether you can specify, as concretely as possible, what type of mathematical result you would have to see in order to significantly update away from this opinion.

No, I make no such assumption. A bound on subjective regret ensures that running the AI is a nearly-optimal strategy from the user's subjective perspective. It is neither needed nor possible to prove that the AI can never enter a trap. For example, the AI is immune to acausal attacks to the extent that the user believes that the AI is not inside Beta's simulation. On the other hand, if the user believes that the simulation hypothesis needs to be taken into account, then the scenario amounts to legitimate acausal bargaining (which has its own complications to do with decision/game theory, but that's mostly a separate concern).
A bound on subjective regret ensures that running the AI is a nearly-optimal strategy from the user's subjective perspective.

Sorry, that's right. Fwiw, I do think subjective regret bounds are significantly better than the thing I meant by definition-optimization.
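For readers new to the term, here is one hedged way to write down the shape of such a bound; this is an illustrative formalization of the sentence quoted above, not necessarily Kosoy's precise definition. Take $\xi$ as the user's own prior over environments, $U$ as the user's utility, and $\pi^{\mathrm{AI}}$ as the policy of deferring to the AI:

```latex
% Illustrative shape of a subjective regret bound (a sketch, not the
% thread's formal definition): under the user's subjective prior \xi,
% running the AI loses at most \epsilon expected utility relative to
% the best policy the user could have followed.
\mathrm{Reg}_{\xi}\left(\pi^{\mathrm{AI}}\right)
  = \sup_{\pi} \mathbb{E}_{\xi}\left[U(\pi)\right]
    - \mathbb{E}_{\xi}\left[U\left(\pi^{\mathrm{AI}}\right)\right]
  \le \epsilon
```

Note the quantification: the benchmark policy $\pi$ is judged under the user's beliefs $\xi$, not under the true environment, which is why such a bound says nothing about traps the user does not believe in.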

It is possible that Alpha cannot predict it, because in Beta-simulation-world the user would confirm the irreversible action. It is also possible that the user would confirm the irreversible action in the real world because the user is being manipulated, and whatever defenses we put in pl
... (read more)
How to Escape From Immoral Mazes

Previously in sequence and most on point: What is Success in an Immoral Maze? and How to Identify an Immoral Maze

This post deals with the goal of avoiding or escaping being trapped in an immoral maze, accepting that for now we are trapped in a society that contains powerful mazes. 

We will not discuss methods of improving conditions (or preventing the worsening of conditions) within a maze, beyond a brief note on what a CEO might do. For a middle manager anything beyond not making the problem worse is exceedingly difficult. Even for the CEO this is an extraordinarily difficult task.   

To rescue so... (Read more)

My explicit advice above was that if you find yourself in that situation and down-scaling your lifestyle is prohibitive (e.g. it would break up your family), then you should seek to become a Loser in the Rao sense. E.g., don't quit or outright rebel, but stop trying to advance further, do the minimum to not have anything disastrous happen, and make this clear to all parties at work, while trying to save as much as possible and planning a second act if you want one after that eventually fails to hold up.

If it's just 'you have comparative advantage do... (read more)

Zvi (2h, 2 points): Agree that it could be +EV to sign on where you would learn specific skills - e.g. I am very confident that Year 1 at my firm is a very good school they pay you to go to! The question is whether you can trust yourself to execute on the exit strategy in light of what will happen to you and the choices you will be presented with. I'd be pretty scared of this failing.

In imperative programming languages, the main purpose of a program is to specify a computation, which we then run. But it seems a rather... unimaginative use of a computation, simply to run it.

Having specified a computation, what else might one want to do with it?

Some examples:

  • Differentiate numerical computations (i.e. backprop)
  • Ask whether any possible inputs could yield a particular output (i.e. NP problems)
  • Search for data within the computation's output space (i.e. grep)
  • Given output from the computation, find a data structure which traces its execution (i.e. parsing with context-free grammars)
... (Read more)
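As a minimal sketch of the theme (my illustration, not the post's code): once a computation is represented as data, "running it" and "differentiating it" are just two different interpretations of the same structure. Here forward-mode dual numbers stand in for the backprop bullet above.

```python
# Minimal sketch: one computation, two uses. The same function f can be
# *run* (on floats) or *differentiated* (on dual numbers) without change.

from dataclasses import dataclass

@dataclass
class Dual:
    val: float  # value at the input point
    dot: float  # derivative with respect to the input

    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # Product rule, carried along with the value.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def f(x):
    # The "program": f(x) = x*x + x, written once.
    return x * x + x

print(f(3.0))                  # 12.0 -- simply running the computation
y = f(Dual(3.0, 1.0))          # seed dx/dx = 1
print(y.val, y.dot)            # 12.0 7.0 -- f(3) and f'(3) = 2*3 + 1
```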

Overview

We’re often forced to make decisions under conditions of uncertainty. This may be empirical uncertainty (e.g., what is the likelihood that nuclear war would cause human extinction?), moral uncertainty (e.g., does the wellbeing of future generations matter morally?), or one of a number of other types of uncertainty.

But what do we really mean by “uncertainty”?

According to [one] view, certainty has two opposites: risk and uncertainty. In the case of risk, we lack certainty but we have probabilities. In the case of uncertainty, we do not even have probabilities. (Dominic Roser [who argu

... (Read more)

In other words, I think it's more useful to think of those definitions as an algorithm (perhaps ML): certainty ~ f(risk, uncertainty), with the provided definitions of the driving factors as initial values. Users can then refine their threshold to improve the model's predictive capability over time, and also as a function of the class of problems (e.g. climate vs. software).
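Read as a toy sketch (every name and threshold here is hypothetical, just to make the commenter's proposal concrete):

```python
# Toy sketch of "certainty ~ f(risk, uncertainty)". Initial values follow
# the quoted definitions: risk = we lack certainty but have probabilities;
# uncertainty = we do not even have probabilities. The threshold is the
# user-tunable part that gets refined over time.

def classify(has_probabilities: bool, confidence: float,
             threshold: float = 0.9) -> str:
    if not has_probabilities:
        return "uncertainty"   # no probabilities at all
    return "certainty" if confidence >= threshold else "risk"

print(classify(True, 0.95))    # certainty
print(classify(True, 0.50))    # risk
print(classify(False, 0.50))   # uncertainty
```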

romeostevensit (3h, 2 points): I found Marr's levels highly helpful when trying to think about this area. YMMV. Marr's levels also correspond to Aristotle's four causes if we do as Marr does and split the algorithmic level into 'representation' and 'traversal.'

Princeton neuroscientist Michael Graziano wrote the book Rethinking Consciousness (2019) to explain his "Attention Schema" theory of consciousness (endorsed by Dan Dennett![1]). If you don't want to read the whole book, you can get the short version in this 2015 article.

I'm particularly interested in this topic because, if we build AGIs, we ought to figure out whether they are conscious, and/or whether that question matters morally. (As if we didn't already have our hands full thinking about the human impacts of AGI!) This book is nice and concrete and computational, and I think it at least of

... (Read more)

Unfortunately I don't know of a good overview. Chalmers might have one. Lukeprog's post on consciousness has some pointers.

steve2152 (9h, 1 point): I commented here (https://www.lesswrong.com/posts/biKchmLrkatdBbiH8/book-review-rethinking-consciousness#MRcbNYFbJ7at7XrSz) on why I think that it shouldn't be possible to fully explain reports of consciousness without also fully explaining the hard problem of consciousness in the process of doing so. I take it you disagree (correct?), but do you see where I'm coming from? Can you be more specific about how you think about that?
shminux (15h, 2 points): Actually, superdeterminism models allow for both to be true. There is a different assumption that breaks.


[Event][Houston] Meetup @ Empire Cafe Sun 1-19-20 2pm-5pm
Jan 19th · 1732 Westheimer Road, Houston

Hi y'all! Our next meetup is from 2:00pm to 5:00pm at Empire Cafe on Sunday, January 19, 2020.

As Scott says in his meetup posts: "Who: Anyone who wants. Please feel free to come even if you feel awkward about it, even if you’re not “the typical SSC reader”, even if you’re worried people won’t like you, etc."

We have a discussion topic! Scott's post "What Intellectual Progress did I make in the 2010s?" (https://slatestarcodex.com/2020/01/08/what-intellectual-progress-did-i-make-in-the-2010s/) serves as the discussion topic's reference. The topic is: generally speaking, what kind of progress, inte

... (Read more)

We're here! Come to the side patio outside, you'll see me in a blue and black plaid shirt and we have a sign.

[Link] Realism about rationality Ω

Epistemic status: trying to vaguely gesture at vague intuitions. A similar idea was explored here under the heading "the intelligibility of intelligence", although I hadn't seen it before writing this post.

There’s a mindset which is common in the rationalist community, which I call “realism about rationality” (the name being intended as a parallel to moral realism). I feel like my skepticism about agent foundations research is closely tied to my skepticism about this mindset, and so in this essay I try to articulate what it is.

Humans ascribe properties to e... (Read more)

So, yeah, one thing that's going on here is that I have recently been explicitly going in the other direction with partial agency, so obviously I somewhat agree. (Both with the object-level anti-realism about the limit of perfect rationality, and with the meta-level claim that agent foundations research may have a mistaken emphasis on this limit.)

But I also strongly disagree in another way. For example, you lump logical induction into the camp of considering the limit of perfect rationality. And I can definitely see the reason. But from my perspective... (read more)

Open & Welcome Thread - January 2020
  • If it’s worth saying, but not worth its own post, here's a place to put it.
  • And, if you are new to LessWrong, here's the place to introduce yourself.
    • Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ.

The Open Thread sequence is here.

Wei_Dai (4h, 9 points): Anyone else kept all photos of themselves off the public net because they saw something like this (https://www.nytimes.com/2020/01/18/technology/clearview-privacy-facial-recognition.html) coming?

I think I was more resigned to it.

Adjusting Outdoor Reset

Our house has forced hot water heat with three loops, two of baseboards and one of radiators. The first floor unit has a baseboard loop and on cold winter days can't keep up. The tenants have been using electric heat [1] to supplement, but that's annoying for them and resistive electric is much more expensive than gas.

The boiler is a modern efficient one with an outdoor reset. You have a temperature sensor outside, and on warm days the system won't heat the water it's circulating to as high a temperature as on cold days. For example, if it's 55F outside it's wasteful to be circulat... (Read more)
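For readers who haven't met reset curves: the controller maps outdoor temperature to a circulating-water setpoint, typically by linear interpolation between two design points. A minimal sketch of the idea (the setpoints below are made-up illustrations, not the author's actual boiler settings):

```python
# Minimal sketch of an outdoor reset curve: linearly interpolate the
# circulating-water setpoint between two design points, clamped at the
# ends. All temperatures are in Fahrenheit and are hypothetical examples.

def water_setpoint_f(outdoor_f: float,
                     design=(0.0, 180.0),   # 0F outside -> 180F water
                     mild=(60.0, 120.0)     # 60F outside -> 120F water
                     ) -> float:
    (t_cold, w_hot), (t_warm, w_cool) = design, mild
    frac = (outdoor_f - t_cold) / (t_warm - t_cold)
    frac = min(max(frac, 0.0), 1.0)         # clamp outside the design range
    return w_hot + frac * (w_cool - w_hot)

print(water_setpoint_f(55.0))  # 125.0: mild day, cooler water suffices
print(water_setpoint_f(10.0))  # 170.0: cold day, run the loop hotter
```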

Bay Solstice 2019 Retrospective

I was the Creative Director for last year’s Winter Solstice in the Bay Area. I worked with Nat Kozak and Chelsea Voss, who were both focused more on logistics. Chelsea was also the official leader who oversaw both me and Nat and had final say on disputes. (However, I was granted dictatorial control over the Solstice arc and had final say in that arena.) I legit have no idea how any one of us would have pulled this off without the others; love to both of them and also massive respect to Cody Wild, who somehow ran the entire thing herself in 2018.

While I worked with a bunch of other people on So

... (Read more)

Having had a couple days to sit with this thread, I think it's worth adding that I'm willing to participate in addressing this issue in future Solstice celebrations (so for 2020 at least). I think I'm a poor choice for lots of things related to Solstice organizing because I'm not close enough to the core of rationalist culture to reliably drive things in ways most rationalists would like, but within the context of a team that is doing that I think I could probably have a positive impact on the Solstice experience by pushing it to better... (read more)

ESRogs (15h, 4 points): But the actual day of solstice is the first day of winter...

SSC readers, EAs, Rationalists or curious friends, join us for vegan soup and any other food and drink you bring, and conversation about the recent adversarial collaborations.

Why has there never been a "political Roko's basilisk", i.e. a bill or law that promises to punish any member of parliament who voted against it (or more generally any individual with government power, e.g. judge or bureaucrat, who did not do everything in their capacity to make it law)?

Even if unconstitutionality is an issue, it seems like the "more general" condition would prevent judges from overturning it, etc. And surely there are countries with all-powerful parliaments.

avturchin (14h, 3 points): In early Soviet history they actually checked whether a person really supported the winning party by looking at what they had done 10-20 years before. If a person had been a member of the wrong party in 1917, they could be prosecuted in the 1930s.

Interesting. Did they promise to do so beforehand?

In any case, I'm not surprised the Soviets did something like this, but I guess the point is really "Why isn't this more widespread?" And also: "Why does this not happen with goals other than staying in power?" E.g., why has no one tried to pass a bill that says "Roko condition AND we implement this-and-this policy"? Because otherwise it seems that the stuff the Soviets did was motivated by something other than Roko's basilisk.

There is some similarity, but there are also major differences. They don't even have the same type signature. The dangerousness bound is a desideratum that any given algorithm can either satisfy or not. On the other hand, AUP is a specific heuristic for tweaking Q-learning. I guess you can consider some kind of regret bound w.r.t. the AUP reward function, but they will still be very different conditions.
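For context, the kind of Q-learning tweak AUP makes looks roughly like the following (a paraphrase of the published penalty term with hypothetical names; see Turner et al.'s "Conservative Agency" paper for the real formulation):

```python
# Rough sketch of the AUP reward tweak: penalize the change in attainable
# utility for auxiliary goals, measured against a no-op action. Names and
# numbers here are hypothetical illustrations, not the actual algorithm.

def aup_reward(r, q_aux, state, action, noop, lam=0.1):
    """q_aux: list of Q-functions q(state, action) for auxiliary goals."""
    penalty = sum(abs(q(state, action) - q(state, noop)) for q in q_aux)
    return r - lam * penalty / max(len(q_aux), 1)

# Toy usage: an action that shifts attainable utility gets docked reward.
q_aux = [lambda s, a: 1.0 if a == "smash" else 0.0,
         lambda s, a: 0.5 if a != "noop" else 0.0]
print(aup_reward(1.0, q_aux, None, "smash", "noop"))  # 0.925
print(aup_reward(1.0, q_aux, None, "noop", "noop"))   # 1.0 (no penalty)
```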

The reason I pointed out the relation to corrigibility is not because I think that's the main justification for the dangerousness bound. The motivation for

... (read more)