# 5x5x5 centres with random-move scrambles



## xyzzy (Mar 25, 2017)

*tl;dr*: centres are significantly easier with 60-random-move scrambles, so stop using that.

Update (2023-01-10): This is largely wrong due to an off-by-one error in the code I used to crunch the numbers.

In 2013, @cuBerBruce generated the distance distribution for solving a single face's centres in multiple metrics. I have confirmed the results for OBTM (since my code supports only OBTM).

```
total states: 112911876 (112911876 legal)
distance 0: 1 nodes
distance 1: 12 nodes
distance 2: 190 nodes
distance 3: 3210 nodes
distance 4: 47444 nodes
distance 5: 635642 nodes
distance 6: 6870327 nodes
distance 7: 41671278 nodes
distance 8: 59387592 nodes
distance 9: 4296076 nodes
distance 10: 104 nodes
average distance: 7.528574
```

I generated a million random states and measured the following statistics: mean move count for optimal first centre with fixed colour (6 million samples), for optimal first centre with dual CN (3 million samples), and for optimal first centre with full CN (1 million samples). The first one agrees with the theoretical distribution, of course, but we reproduce it as a sanity check.

These stats have also been measured with one million random-move scrambles of lengths from 50 to 100 (inclusive), in increments of 5.

(Note that as only 1000000 scrambles were used for each category, the results should be treated as having only three decimal places of accuracy.)


scramble typefixeddual CNfull CN

[tr1][td]random state[/td][td]7.529103[/td][td]7.177083[/td][td]6.662767[/td][/tr1]
[tr2][td]50 moves[/td][td]7.213838[/td][td]6.820150[/td][td]6.210132[/td][/tr2]
[tr1][td]55 moves[/td][td]7.293460[/td][td]6.908271[/td][td]6.326780[/td][/tr1]
[tr2][td]60 moves[/td][td]7.351421[/td][td]6.972744[/td][td]6.410915[/td][/tr2]
[tr1][td]65 moves[/td][td]7.392481[/td][td]7.019370[/td][td]6.472139[/td][/tr1]
[tr2][td]70 moves[/td][td]7.424484[/td][td]7.055387[/td][td]6.516806[/td][/tr2]
[tr1][td]75 moves[/td][td]7.447765[/td][td]7.082416[/td][td]6.551436[/td][/tr1]
[tr2][td]80 moves[/td][td]7.465664[/td][td]7.103329[/td][td]6.576018[/td][/tr2]
[tr1][td]85 moves[/td][td]7.478492[/td][td]7.117689[/td][td]6.593856[/td][/tr1]
[tr2][td]90 moves[/td][td]7.489543[/td][td]7.130386[/td][td]6.608734[/td][/tr2]
[tr1][td]95 moves[/td][td]7.497782[/td][td]7.140359[/td][td]6.620608[/td][/tr1]
[tr2][td]100 moves[/td][td]7.503685[/td][td]7.147343[/td][td]6.629239[/td][/tr2]

TNoodle and most timing apps generate scramble sequences with 60 moves by default, but this is clearly inadequate—the mean move count for full CN is significantly lower than with random-state scrambles.

Even with 100 random moves, it is significantly more likely to get a scramble with a first centre solvable within 4 moves compared to a random-state scramble (0.3294% vs 0.2709%), although this is still a lot less bad than just 60 random moves (1.5542%). This discrepancy in the distribution persists until around 170 random moves, at which point the difference is drowned out by sampling error.

Using 80 moves for random-move scrambles seems like a reasonable compromise between the scrambling not taking forever and having a smaller deviation from random-state scrambles, in lieu of the existence of an actual 5×5×5 random-state scrambler. (A random-state scrambler would generate scramble sequences of around 85 moves anyway—using a random-move scramble longer than that would be pointless.)


----------



## xyzzy (Apr 5, 2017)

"But," you retort, "I don't want to do longer scrambles. 60 moves already provides plenty of room for misscrambles to happen, which totally suck. Increasing scrambles to 80 moves would only make this much worse, even if this is a 'compromise'."

Here's some good news! If you apply a 3×3×3 random-state scramble with wide moves first, the centres converge to a uniform distribution _much more quickly_. (Actually, the rate of convergence is about the same, but it starts off much closer to uniform.)


scramble typefixeddual CNfull CN

[tr1][td]random state[/td][td]7.5291[/td][td]7.1771[/td][td]6.6628[/td][/tr1]
[tr2][td]wide random[/td][td]7.3502[/td][td]6.9938[/td][td]6.4905[/td][/tr2]
[tr1][td]wide random + 5 moves[/td][td]7.4538[/td][td]7.0991[/td][td]6.5855[/td][/tr1]
[tr2][td]wide random + 10 moves[/td][td]7.4906[/td][td]7.1374[/td][td]6.6221[/td][/tr2]
[tr1][td]wide random + 15 moves[/td][td]7.5089[/td][td]7.1564[/td][td]6.6409[/td][/tr1]
[tr2][td]wide random + 20 moves[/td][td]7.5182[/td][td]7.1658[/td][td]6.6502[/td][/tr2]
[tr1][td]wide random + 25 moves[/td][td]7.5231[/td][td]7.1707[/td][td]6.6557[/td][/tr1]
[tr2][td]wide random + 30 moves[/td][td]7.5259[/td][td]7.1739[/td][td]6.6593[/td][/tr2]
[tr1][td]wide random + 35 moves[/td][td]7.5270[/td][td]7.1749[/td][td]6.6603[/td][/tr1]
[tr2][td]wide random + 40 moves[/td][td]7.5270[/td][td]7.1746[/td][td]6.6601[/td][/tr2]
[tr1][td]wide random + 45 moves[/td][td]7.5274[/td][td]7.1751[/td][td]6.6610[/td][/tr1]
[tr2][td]wide random + 50 moves[/td][td]7.5284[/td][td]7.1758[/td][td]6.6609[/td][/tr2]
[tr1][td]wide random + 55 moves[/td][td]7.5283[/td][td]7.1760[/td][td]6.6622[/td][/tr1]
[tr2][td]wide random + 60 moves[/td][td]7.5283[/td][td]7.1759[/td][td]6.6619[/td][/tr2]
[tr1][td]wide random + 65 moves[/td][td]7.5287[/td][td]7.1763[/td][td]6.6623[/td][/tr1]
[tr2][td]wide random + 70 moves[/td][td]7.5283[/td][td]7.1763[/td][td]6.6622[/td][/tr2]
[tr1][td]wide random + 75 moves[/td][td]7.5288[/td][td]7.1764[/td][td]6.6624[/td][/tr1]
[tr2][td]wide random + 80 moves[/td][td]7.5283[/td][td]7.1758[/td][td]6.6618[/td][/tr2]

As can be inferred from the above table, for all practical intents and purposes, the centres are fully randomised after 40 random moves… which corresponds to a ~60-move scrambling sequence, since the 3×3×3 random-state scramble would take around 20 moves. As an added bonus, the midges and corners are also (exactly) uniformly distributed.

I've also generated statistics for random-move scrambles biased towards wide moves, but that turned out to be much less of an improvement than I had expected. I'll edit this post to include the table later.

"But," you retort again, "why should anyone care about a difference that is literally just a quarter of a move on average?" The answer to this is that you're asking the wrong question—if it's easy to ensure that the difference is less than a thousandth of a move on average, why wouldn't you just do that?


----------



## AlphaSheep (Apr 5, 2017)

That's pretty cool. I love how the 3x3 stage guarantees no bias for corners and midges. My concern is the wings. What's the distribution like for them? (probably not bad - I think you'd have to look at moves required to pair all wings with thier matching partner ignoring midges to get a meaningful measure) 

I also wonder how well this can be extended to higher order cubes... For example on 7x7, if you do a 3x3 scramble with triple layer turns, then another 3x3 scramble with double layer turns, then 60 random moves, would that be better than the current 100 random move scrambles? I feel it probably would be...


----------



## xyzzy (Apr 5, 2017)

It's not immediately clear how best to analyse the wings, which is part of why I didn't bother. On one hand, it's not really relevant, because most people use redux and even if there was some pattern to the wings, it would get mostly destroyed after you finish the centres.

On the other hand, 5BLD. (And also Yau5, Hoya5, cage, etc.)

Hmm. There are a few things we could look at. One is, like you suggest, to look at how many moves are needed to pair the wings (ignoring midges and centres), but I'm not sure this is useful or representative. Optimal edge pairing on 444 takes like 15 moves and is nothing like what human solvers do. I think it might be more worthwhile to look at cycle structures, but I expect random-move scrambles to converge to a uniform distribution fairly quickly anyway.

For 777, I already do something like what you say. I set csTimer to generate a six-cube relay, and treat them as triple, double, single, triple, double, triple layer scrambles. I have no idea whether this is actually good, but I'm moderately certain it's better than just 100 random moves. Centre blocks seem(?) to be rarer, but that might just be placebo.


----------



## jfly (Jun 7, 2017)

I could see us switching to random state 5x5 scrambles specifically for 5x5 blindfolded. It would still be a pain to scramble, but the solves are also way slower than a 5x5 speedsolve. Granted, I have no idea what's feasible in terms of 5x5 random state, but if 85 move random scrambles can be generated in a reasonable amount of time, then I'd love to see that added to TNoodle.


----------



## xyzzy (Jun 7, 2017)

jfly said:


> I could see us switching to random state 5x5 scrambles specifically for 5x5 blindfolded. It would still be a pain to scramble, but the solves are also way slower than a 5x5 speedsolve. Granted, I have no idea what's feasible in terms of 5x5 random state, but if 85 move random scrambles can be generated in a reasonable amount of time, then I'd love to see that added to TNoodle.



I do have a solver lying around on my hard disk, but the code is Kind of Awful and I've been putting off a rewrite for quite some time. It takes anywhere from two seconds to a minute per solve (median is around 15 seconds iirc), but it's also very memory heavy (4 GB of RAM required).


----------



## mark49152 (Jun 7, 2017)

3x3 random state scramble followed by ~40 random moves sounds like a good compromise. Easy to implement and scramble length wouldn't increase by much.


----------



## IamSpeedcubing (Jun 7, 2017)

What about 4x4?


----------



## jfly (Jun 14, 2017)

xyzzy said:


> I do have a solver lying around on my hard disk, but the code is Kind of Awful and I've been putting off a rewrite for quite some time. It takes anywhere from two seconds to a minute per solve (median is around 15 seconds iirc), but it's also very memory heavy (4 GB of RAM required).



Ok. That timing sounds bearable, but the memory usage would be a problem. Another thing to consider is how long it takes to generate any pruning tables/whatever you need.



IamSpeedcubing said:


> What about 4x4?



We already do random state 4x4 in TNoodle, so AFAIK there's nothing to worry about here. Did you have something in particular in mind you were asking about?


----------



## xyzzy (Jun 15, 2017)

jfly said:


> Ok. That timing sounds bearable, but the memory usage would be a problem. Another thing to consider is how long it takes to generate any pruning tables/whatever you need.



I haven't run the table generation code in forever, but @dwalton76 mentioned that it took over half an hour on his computer. There's a slightly out of date compiled version on his GitHub account if you want to try it out. (And lookup tables in case you don't want to wait for them to be generated.)

I have ideas on a low-resource variant of the algorithm, and I just need to get less lazy and actually implement it.


----------



## xyzzy (Monday at 6:49 PM)

In the middle of researching something else, I found a bug in my original code that *completely invalidates* all of my results from this thread.

How I noticed this: the spreadsheet from the initial post states that the probability of getting a centre skip should be around 1 in 90000 for standard random-move scrambles, but in reality it's more like around 1 in 10^7. (Edit: I initially wrote 1 in 10^8, but I miscounted the number of zeros… This is still roughly twice as likely compared to random-state, which is about \( 6/\binom{24}{4}^2\approx1/(1.9\cdot10^7) \).)

What went wrong: I used `random.nextInt(3)` instead of `random.nextInt(3)+1` to decide how much to turn each layer, so there was a 1/3 chance of doing nothing (!!), 1/3 chance of clockwise, 1/3 chance of half turn.

The code takes a fair amount of time (and manual babysitting) to run, so I probably won't be generating a fixed version super-soon. I'll _definitely_ release the source code so it can be audited the next time around.

Is there still some kind of measurable bias to 60-move scrambles? Yes, but it's much smaller than purported in my original announcement—easy scrambles are about 10% too likely (compared to random-state), rather than 10 times too likely.


----------



## Thom S. (Monday at 7:54 PM)

Would you say (personally) that the remaining 10% increased likelyness still validate using 60 random move scrambles or is a push towards random state scrambles on the clock?

If my question is based on a misunderstanding of the matter, please excuse.


----------



## xyzzy (Monday at 8:54 PM)

Thom S. said:


> Would you say (personally) that the remaining 10% increased likelyness still validate using 60 random move scrambles or is a push towards random state scrambles on the clock?


The chance of getting at least 6 centre pairs is 2.50% (60 moves) versus 2.36% (random state); the relative difference is around 5%, but the absolute difference is only 0.14%pt. That's still pretty close. If you look at 7+ centre pairs, the relative difference increases, but the absolute difference also decreases much more.

(Note: I'm making a bit of an apples-to-oranges comparison here. The new thing I'm doing can't (yet) measure optimal move count, but it can count centre pairs. A centre skip counts as 12 centre pairs; a free 3×2 counts as 7 pairs; three premade bars on different faces count as 6 pairs. Centre pair quantity (inversely) correlates with the optimal move count, but not perfectly; perhaps it correlates better with _human_ solving move count?)

Random-state scrambles are still rather impractical, in terms of both scramble length (~70-80 moves) and RAM/CPU usage (slow, can't run on phones or weaker computers). If there's a change to big cube scrambles, the most likely direction would be to [thing I am working on, will reveal later] or to wide-first scrambles (see post #2), both of which retain the current scramble length and low resource usage, rather than moving straight to random-state.

Personally, I'll continue using 75-move scrambles whenever I practise 5×5×5, at least until [thing I am working on] is in a usable state. That's still a reasonable length, while having probably less than half of the bias compared to 60-move scrambles.


----------

