I’m unclear on whether the ‘dimensionality’ (complexity) component being minimized needs revision beyond the naive ‘number of nonzeros’ (or continuous priors that similarly reward zeros in the parameters).
Either:
the optimization method finds the simplest (by naive score) ‘dimensionality’ parameters among the functionally equivalent ones, in which case what’s the problem?
Or it doesn’t, in which case either there’s a canonicalization mapping equivalent parameters onto a single representative that can be applied at each step, or an adjustment to the complexity score that does a good job of the same thing, or we can’t figure one out and we risk our optimization methods getting stuck in bad local grooves because of this.
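A minimal sketch of what I mean, on a toy overparameterized linear model (the setup and names are mine, not from any particular system): functionally equivalent parameters can carry different naive L0 scores, and a canonicalization collapses each equivalence class to one representative whose score is invariant.

```python
import numpy as np

def naive_score(params, tol=1e-12):
    """Naive 'dimensionality' score: number of nonzero parameters (L0)."""
    return int(np.sum(np.abs(params) > tol))

def f(u, v, x):
    """Overparameterized linear model: f(x) = sum_i u_i * v_i * x_i."""
    return (u * v) @ x

def canonicalize(u, v):
    """Map (u, v) onto a canonical representative of its equivalence class.

    Every (u, v) with the same elementwise product w = u * v computes the
    same function, so we canonicalize onto the balanced factorization
    u_i = v_i = sqrt(|w_i|), with the sign carried by u.
    """
    w = u * v
    mag = np.sqrt(np.abs(w))
    return np.sign(w) * mag, mag

# Two functionally identical parameterizations of w = (3, 0, 0).
u1, v1 = np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
u2, v2 = np.array([1.5, 7.0, -2.0]), np.array([2.0, 0.0, 0.0])

x = np.array([1.0, 2.0, 3.0])
assert np.isclose(f(u1, v1, x), f(u2, v2, x))  # same function...

print(naive_score(np.concatenate([u1, v1])))   # 2  ...different naive scores
print(naive_score(np.concatenate([u2, v2])))   # 4

cu, cv = canonicalize(u2, v2)
print(naive_score(np.concatenate([cu, cv])))   # 2: invariant after canonicalization
```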
Does this seem fair?
Looking forward to Elon’s upcoming book, “IF I did it: confessions of a system prompter”
Elon is right about South Africa but foolish to patch it in the prompt. Instead, think training-data updates that change the weights.
This nano-scandal is about as embarrassing as the fake Path of Exile 2 account fiasco (which he did eventually cop to). Elon is doing such great works; why must he also micro-sin?