I’m unclear on whether the ‘dimensionality’ (complexity) component being minimized needs revision beyond the naive ‘number of nonzeros’ (or continuous priors that similarly reward zeros in the parameters).
Either:
the optimization method finds the simplest (by naive score) ‘dimensionality’ parameters among the functionally equivalent ones, in which case what’s the problem?
Or it doesn’t, in which case either there’s a canonicalization mapping equivalent parameters onto a single representative that can be applied at each step, or an adjustment to the complexity score that does a good job of the same thing, or we can’t figure one out and we risk our optimization methods getting stuck in bad local grooves because of this.
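A minimal sketch of what I mean, on a toy overparameterized linear model (the setup and names are mine, not from any particular system): functionally equivalent parameters can carry different naive L0 scores, and a canonicalization collapses each equivalence class to one representative whose score is invariant.

```python
import numpy as np

def naive_score(params, tol=1e-12):
    """Naive 'dimensionality' score: number of nonzero parameters (L0)."""
    return int(np.sum(np.abs(params) > tol))

def f(u, v, x):
    """Overparameterized linear model: f(x) = sum_i u_i * v_i * x_i."""
    return (u * v) @ x

def canonicalize(u, v):
    """Map (u, v) onto a canonical representative of its equivalence class.

    Every (u, v) with the same elementwise product w = u * v computes the
    same function, so we canonicalize onto the balanced factorization
    u_i = v_i = sqrt(|w_i|), with the sign carried by u.
    """
    w = u * v
    mag = np.sqrt(np.abs(w))
    return np.sign(w) * mag, mag

# Two functionally identical parameterizations of w = (3, 0, 0).
u1, v1 = np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
u2, v2 = np.array([1.5, 7.0, -2.0]), np.array([2.0, 0.0, 0.0])

x = np.array([1.0, 2.0, 3.0])
assert np.isclose(f(u1, v1, x), f(u2, v2, x))  # same function...

print(naive_score(np.concatenate([u1, v1])))   # 2  ...different naive scores
print(naive_score(np.concatenate([u2, v2])))   # 4

cu, cv = canonicalize(u2, v2)
print(naive_score(np.concatenate([cu, cv])))   # 2: invariant after canonicalization
```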
Does this seem fair?
Looking forward to Elon’s upcoming book, “IF I did it: confessions of a system prompter”
Elon is right about South Africa but foolish to patch it in the prompt. Instead, think training-data updates that change the weights.
This nano-scandal is about as embarrassing as the fake Path of Exile 2 account fiasco (which he did eventually cop to). Elon is doing such great works; why must he also micro-sin?