# Square-1 shape probability distribution



## Lucas Garron (Sep 3, 2008)

In the WCA forum (here and at http://www.worldcubeassociation.org/forum/viewtopic.php?f=4&t=449), Prusak's lucky scramble raised the issue of fair scrambling.

(_Skip to the big headline below for the more interesting stuff._) 

In the same way that it makes the most sense to solve by first getting into cube shape and then solving within it, it's most logical to consider scrambling as choosing a random shape and then a random permutation of the pieces within that shape.

Permutation of the pieces, WLOG in cube shape, is a relatively simple problem. 
However, the shapes are a trickier issue. There are two ways to define "random cube shape," if you consider only the piece arrangements on top and bottom. (AUF, ADF, and the middle pieces don't really matter, and can at worst just be dealt with at the end of a scramble.)

So, you can consider a "random shape" to be a random choice of one of the 170 configurations. The problem is: why not instead choose among the 90 shapes you get if you count shapes that are z2 flips of each other as the same? That gives a different distribution.

I think the "best" way to do it is to improve our current idea of scrambling: do random moves for a really long while (rather, do something that produces equivalent scrambles). The current scrambler doesn't come close to an even distribution, though.

So, I took the following definition: In any state, the top layer can be turned a certain number of ways to realign a twistable crack, and you rotate to a random one of these. Same for the bottom. Then you do a /, and continue. Over time, the probabilities even out. Permutation of pieces within the shapes becomes random (I'm pretty sure, but I don't want to think of an argument, and I don't even want to think about what happens if it isn't).
The shapes eventually reach a distribution that doesn't change.

Some of you will probably recognize this as a random walk of the Square-1 shape graph. Each shape has one edge connecting to another for each way to get from it to the other (you can consider it undirected).
So, I have a 170x170 transition matrix. Each row corresponds to a shape and sums to 1, and each entry is weighted by the number of ways it's possible to get to the column's shape. So, pretend you know how Markov chains work, raise the matrix to a sufficiently high power, and you get a stable distribution.
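As a quick illustration of that Markov step, here's a Python sketch on a toy 2-state chain (not the real 170x170 matrix — the numbers and states here are made up purely for illustration):

```python
# Toy 2-state version of the shape walk: from state 0 there are 3 equally
# likely "ways" (1 back to itself, 2 to state 1); from state 1 there are 2
# (1 to each). Rows sum to 1, entries weighted by the number of ways.
P = [[1/3, 2/3],
     [1/2, 1/2]]

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Raise P to a high power by repeated multiplication; every row of the
# result converges to the stationary distribution.
M = P
for _ in range(60):
    M = matmul(M, P)

print(M[0])  # both rows approach [3/7, 4/7]
```

You can check by hand that pi = (3/7, 4/7) satisfies pi.P = pi, which is exactly what "raise the matrix to a sufficiently high power" finds.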

**More important stuff**
So, after that long introduction: As far as I know, no one has ever actually calculated the probability distributions of the nodes in a random walk of the Square-1 shape graph. So I thought I'd do it. 

I tried to take the most justifiably natural way to calculate the distributions, and computed everything in Mathematica. The results are at http://cube.garron.us/sq1/sq1graph.csv. If you want to look at the shapes, I've rendered them and put them in a zip.

So, if we're ever going to do random-state scrambling for Square-1, shapes should probably be chosen by weighting according to this distribution. (Or we could do something like generating 1000 random moves and sending the result to Jaap's solver to be optimized.)

So, I've posted here to see what other mathematically inclined people think about it. Are there good arguments for another scheme? Does anyone actually know about how random the pieces should eventually be within the shapes? Does anyone really care about the eventual shape distribution? 

I can also give you the Mathematica source code and a more explicit method definition if you want.


----------



## jazzthief81 (Sep 3, 2008)

This is very interesting.

I wrote something a while ago that builds up the Square-1 shape graph with all the probabilities of each transition and renders them graphically. I was planning on doing something similar to what you did, by empirically working out the probability distribution from many random paths in the graph. I never got round to doing it.

So yes, I was certainly interested to find this out, and I'm glad you did.


----------



## JBCM627 (Sep 3, 2008)

4 things:

1) Nicely done. 
2a) http://www.speedsolving.com/forum/showthread.php?t=5860
2b) Did you do this for non-backtracking moves, or backtracking (or both)?
3) Do you have exact values for the shapes (fractions)?
4) Notebook, please


----------



## qqwref (Sep 4, 2008)

Hi, as for fractions, here are the decimals found in the csv file and their fractional equivalents:

0.00108754758020641 = 4/3678 = (2/3)/613
0.00135943447525802 = 5/3678 = (5/6)/613
0.00217509516041282 = 8/3678 = (4/3)/613
0.00326264274061924 = 12/3678 = 2/613
0.00435019032082564 = 16/3678 = (8/3)/613
0.00652528548123848 = 24/3678 = 4/613
0.00870038064165130 = 32/3678 = (16/3)/613
0.00978792822185770 = 36/3678 = 6/613
0.01305057096247696 = 48/3678 = 8/613

The interesting thing here is the 613. These are all nice small fractions divided by 613 and I really wonder where that number comes from, since it is a rather large prime.
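For anyone who wants to double-check these identifications, Python's `fractions` module recovers them straight from the decimals (note that `Fraction` reduces, so 4/3678 prints as 2/1839; 3678 = 6 * 613 is the common denominator):

```python
from fractions import Fraction

# Recover the exact fractions from the printed decimals. Since the true
# denominators all divide 3678, limit_denominator(3678) pins them down.
for d in [0.00108754758020641, 0.00326264274061924, 0.01305057096247696]:
    print(Fraction(d).limit_denominator(3678))
# 2/1839 (= 4/3678), 2/613 (= 12/3678), 8/613 (= 48/3678)
```

Any two distinct fractions with denominator at most 3678 differ by at least 1/3678^2, far more than the rounding error in the 15-digit decimals, so the recovery is unambiguous.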


Oh yeah, and very nice work, Lucas.


----------



## Lucas Garron (Sep 4, 2008)

JBCM627 said:


> 1) Nicely done.


Thank you, although I think it remains to be seen how nicely. 



JBCM627 said:


> 2a) http://www.speedsolving.com/forum/showthread.php?t=5860


Yeah, I have no idea. I think it's true, because I think of each backtracking as going down another random walk that still evens out, but that's so non-rigorous, it just might not be true. 



JBCM627 said:


> 2b) Did you do this for non-backtracking moves, or backtracking (or both)?


Thing is, I'm turning both the top and bottom randomly, which allows (0,0). I consider disallowing that somewhat artificial, because we should allow any random turn, even the null turn (compare: a random state could be the solved state).
How about (6,6)? Do we throw that out, too? Sometimes you get yet other ways that bring you back to the previous shape without backtracking, or even to the same shape.
I decided not to go for the headache of deciding (and later defending) subtle issues, so I just ran it straight, with each way to get from one shape to another (or the same one) counting equally.



JBCM627 said:


> 4) Notebook, please


http://cube.garron.us/sq1/square1graph.nb 
(First time I tried adding sections, etc. It makes the notebook look so much more finished. )

Also, sq1graph.csv now lists distances, for convenience.


----------



## cuBerBruce (Sep 4, 2008)

qqwref said:


> The interesting thing here is the 613. These are all nice small fractions divided by 613 and I really wonder where that number comes from, since it is a rather large prime.


From Jaap's puzzle page, the number 3678 (the actual lowest common denominator, of which 613 is a prime factor) comes from counting the shapes in a way that also takes into account the number of ways a shape can be split into two halves. This gives a count of 37*1 + 62*12 + 46*46 + 12*62 + 1*37 = 3678. (Whereas the normal way to count the shapes is 5*1 + 10*3 + 10*10 + 3*10 + 1*5 = 170.)
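The arithmetic (and the appearance of the prime 613) checks out; a trivial sanity check in Python:

```python
# Jaap's weighted count of shapes, each shape counted once per way it can be
# split into two twistable halves, versus the plain count of the 170 shapes.
weighted = 37*1 + 62*12 + 46*46 + 12*62 + 1*37
plain = 5*1 + 10*3 + 10*10 + 3*10 + 1*5
print(weighted, plain)  # 3678 170

# The "mysterious" 613 is just the largest prime factor of 3678.
assert weighted == 2 * 3 * 613
```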


----------



## JBCM627 (Sep 5, 2008)

Lucas Garron said:


> JBCM627 said:
> 
> 
> > 4) Notebook, please
> ...



Yeah, sections do 

The notebook does look good. I couldn't eval everything due to compatibility issues; I'll have to look at it again later, when I have more time and a different computer. The 6.whatever I have can't import Combinatorica, it seems. The fun stuff at the bottom is especially nice, and useful too.

I guess I'll agree with allowing "null" turns too. Since the square shape is a first-order node, I'm not sure if you could even disallow backtracking.

The only potential improvement that stands out to me is the use of "MatrixPower[N[adj], 10000]". This is what I meant by exact vs. approximate values earlier... iterating 10000 times sort of defeats the point of using Markov chains. Instead, the matrix should be diagonalized and the diagonal matrix taken in the limit as the power goes to infinity... all eigenvalues less than 1 will disappear, hopefully leaving you with the limiting case.
In pseudo-mathematica notation:
P = Transpose[Eigenvectors[N[adj]]];
Pinv = Inverse[P];
Diag = Pinv.N[adj].P;
Dinf = Limit[Diag^n, n->Infinity]; (yes, this is done correctly... don't use matrixpower)
Leaving you with,
Probabilities = P.Dinf.Pinv


----------



## blah (Sep 5, 2008)

Cool, the geek thread  

PS: No offense, I'm a geek myself


----------



## Lucas Garron (Sep 5, 2008)

JBCM627 said:


> P = Transpose[Eigenvectors[N[adj]]];
> Pinv = Inverse[P];
> Diag = Pinv.N[adj].P;
> Dinf = Limit[Diag^n, n->Infinity]; (yes, this is done correctly... don't use matrixpower)
> ...


Thing is, matrix^n raises the entries elementwise, while MatrixPower uses the dot product (repeated matrix multiplication), which I think is what we want.
The limit fails on your Dinf (I end up with infinities all over, which makes Probabilities a 170x170 matrix with every entry infinite). But if I raise it to some sufficiently high power instead of taking the limit, the diagonal of Probabilities is correct. The same probabilities, and no more effective a computation.
And the limit of MatrixPower[adj, n] doesn't work so well, either (I think it'll eventually give the correct answer, but it's really slow).

Basically, if we're going to N[], like you did, MatrixPower to the 1000th works well enough. But do tell me if you find code that will produce the exact fractions from a reasonably computable limit. 
Right now, I'm going to try to diagonalize better, but since we already have the exact fractions, I don't think it's such a big issue.

P.S.: I really need to get more eigencomfy with linear algebra.


----------



## JBCM627 (Sep 5, 2008)

What version of Mathematica are you using, Lucas? I have access to 5 on the physics computers here, if you are using that... and will try this tomorrow. Otherwise, I'll have to find a way to get an earlier version...

Edit: You can use Diag^n, as it is a diagonal matrix; it is a special case, which is why it is so useful.
Short example:
M^x = (P.D.P^-1)^x = P.D.P^-1.P.D.P^-1 ... = P.D^x.P^-1 (the inner P^-1.P pairs cancel)


Try Eigenvalues[N[adj]], and see if any of those are over 1... they should all be less than or equal to 1.


----------



## Lucas Garron (Sep 5, 2008)

JBCM627 said:


> What version of mathematica are you using, Lucas? I have access to 5 on the physics computers here, if you are using that... and will try this tomorrow. Otherwise, I'll have to find a way to get an earlier version...
> 
> Edit: You can use Diag^n, as it is a diagonal matrix; it is a special case, which is why it is so useful.
> Short example:
> ...


Yeah, I know about diagonalizing, but maybe I don't know enough about other eigenstuff. Anyhow, I think the issue is with Mathematica handling stuff, I'll try it again when I get version 6.

So, through dinner, I set my computer to compute the explicit, exact eigenvalues of the transition matrix. The results were so surprising, I actually laughed out loud at their beauty:

Of the 170 eigenvalues, exactly 90 are non-zero. That is certainly not a coincidence; there are 90 shapes, and hence 90 unique rows. (From minors and determinants, I can imagine that you get just the right number of zeros, but it's still pretty, and some time I'd like to fully understand why.)

Anyhow, the 90 non-zero eigenvalues are of three forms:
1 (The 1 root of 1-x)
The 64 roots of  - 4991442219 + 386948115252*x + 138762924658242*x^2 - 3103222624916364*x^3 - 877640411309612746*x^4 + 6767666047147902260*x^5 + 2012337578608379698422*x^6 - 8958764174373753804464*x^7 - 2105450969798079776924812*x^8 + 11046600222902203991043312*x^9 + 1123406093781088176157756896*x^10 - 8114779285022492769604795424*x^11 - 325472049644439264188809738432*x^12 + 2962546562996861776481954741760*x^13 + 56287388574936353280108941504768*x^14 - 629949208560303852773245517141504*x^15 - 6105099366584694343157749747552256*x^16 + 86874786230163907381580006692296704*x^17 + 406640998126424132369607623831846912*x^18 - 8273097635823613171991383746279940096*x^19 - 12847215092645959516368232004260052992*x^20 + 563542116540418623958067303673961512960*x^21 - 380249727800812172360337111385392087040*x^22 - 27876394040223235000410822110796320866304*x^23 + 70200985683043152366609324564260935958528*x^24 + 995061174735599595418926716008043747737600*x^25 - 4377190735910344574289281403129530741161984*x^26 - 24483749753436884219275245277429039028502528*x^27 + 172693480479595766346198778149152449118601216*x^28 + 344268199358502700321288117865500491022073856*x^29 - 4765625577836958680705390357889539456775487488*x^30 + 772858710665023800732181764002139803146518528*x^31 + 93400462329952268786337346115591965649619910656*x^32 - 173918077700241084337687739506139579202876211200*x^33 - 1244927272170148943873781598261313236445492674560*x^34 + 4715876999443357796546772004551277481760550551552*x^35 + 9204457647103604719171194562707870353618733367296*x^36 - 74472880131178952274776792185662826546928673619968*x^37 + 17443350194669512509483766280210174159010360459264*x^38 + 751913910049235990870614175455332842992117179482112*x^39 - 1392779124922777607824049090376214683191153824628736*x^40 - 4328092041600080883386980019497279165272616983330816*x^41 + 18110807337525247976881146547787829180504891067465728*x^42 + 2644680623656263536542849631837447260694572425019392*x^43 - 
124957758646564498523954850327436181770010773552103424*x^44 + 181485548821389832364276808943176572906247959082958848*x^45 + 406331124721305918838666283081528680333446852510744576*x^46 - 1548126628345090784471770960023044689718520670775672832*x^47 + 549713325037757110937937164284329784848484888731451392*x^48 + 5552249583340000587865342011734478531902726023208239104*x^49 - 10580714260899344013132135083176015760588301879893557248*x^50 - 2510540104530940516126750428486043682581121309979181056*x^51 + 35729814707763782643150469325553615343782749037989462016*x^52 - 47192725438731643771622400943584186952215159805015228416*x^53 - 14671327554181472737539318334576376054183452695928504320*x^54 + 121677626730594369246671048452427099031448501592531140608*x^55 - 153528427155205489249202015529459568317970698252614369280*x^56 + 31072468697838310383691091990364643611438025213753688064*x^57 + 166184701748611768310167010069620415713246037605028462592*x^58 - 278996494557677233506937637885590804628898472557432274944*x^59 + 244873692347800039598793398534385002894376872578960588800*x^60 - 136778495367914907214791777149024478324242970827210883072*x^61 + 49139518222010279228071785788096857437263861359797338112*x^62 - 10462695095295681452359646651435512209985692421953945600*x^63 + 1011159794444683308147509475038062924991905844806287360*x^64
The 25 roots of -41864 + 3136179*x + 386613206*x^2 - 16868236200*x^3 - 402432346544*x^4 + 21212854848768*x^5 - 58429160850880*x^6 - 5476338956938432*x^7 + 47709591633416832*x^8 + 458983451386437120*x^9 - 6981888874709569536*x^10 - 1163160407514488832*x^11 + 392768286252547276800*x^12 - 1520451906669906886656*x^13 - 7565285380895954436096*x^14 + 68553662577750099099648*x^15 - 78233310032907038883840*x^16 - 913347325546567116521472*x^17 + 4118199373346723755720704*x^18 - 3435110820984016104062976*x^19 - 24441765912940143332818944*x^20 + 99523299112333703179665408*x^21 - 183592589424988418224422912*x^22 + 191248395143239778961457152*x^23 - 108940123945387453071753216*x^24 + 26498949067796948044480512*x^25
Okay, someone explain that. 
(Note: This makes 90 roots: 1^2+5^2+8^2. Why squares?)

There is a lot of fascinating math about Square-1.
There are those equations, and I have no idea about the coefficients yet.
And I still don't know why the 613 applies to the eventual distribution. Maybe it's something to do with shape symmetries?


----------



## Lucas Garron (Sep 5, 2008)

Okay, I think I can explain the probability distribution. Apparently it's mostly what I suspected.
Each shape's probability, as a fraction over 3678, is its degree in the full graph divided by its number of AUF/ADF symmetries.

So, it perfectly corresponds to the 3678, and Bruce's explanation is half of it.
Consider each shape the Square-1 can have, such that it's currently /-twistable (there's a crack going around the entire middle, and the 16 edge and corner pieces are essentially four half-sides). Each of these 3678 shapes has exactly the same probability. (If you want to consider the middle two pieces, divide the probability in half for its two possible states. But really, we can ignore the middle for most of this.)

This distribution is stable. Consider 3678 Square-1s, each in one of those shapes. The distribution of the "170" shapes is given by the probabilities I found.
Now, imagine giving each of these a / twist. The 3678 shapes will be permuted among themselves (in 2-cycles), but each of them is still represented, and each of the "170" shapes still appears with the same probability. Performing a random AUF&ADF on each does not change the distribution, since all the AUF&ADF alignments of a shape are already represented equally, and they map to each other with equal probability.

With a little insight, it's intuitive that the distribution is stable, and it's not too hard to believe that it's unique and that all random walks will approach it.
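This is an instance of the standard fact that a random walk on an undirected (multi)graph has stationary probabilities proportional to vertex degree. A toy check in Python (a made-up 3-node multigraph standing in for the shape graph, with a self-loop so the walk is aperiodic):

```python
# Symmetric "ways" matrix: A[i][j] = number of ways to go from i to j.
# The self-loop at node 0 keeps the chain aperiodic.
A = [[1, 2, 1],
     [2, 0, 1],
     [1, 1, 0]]
deg = [sum(row) for row in A]  # degrees 4, 3, 2
n = len(A)

# Row-normalized transition matrix, as in the 170x170 case.
P = [[A[i][j] / deg[i] for j in range(n)] for i in range(n)]

# Power-iterate an arbitrary starting distribution...
dist = [1.0 / n] * n
for _ in range(200):
    dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# ...and it converges to degree / (total degree).
expected = [d / sum(deg) for d in deg]  # [4/9, 3/9, 2/9]
print(dist)
```

The one-line proof that pi_i = deg_i / (total degree) is stationary: sum_i (deg_i / D) * A[i][j] / deg_i = (sum_i A[i][j]) / D = deg_j / D, using the symmetry of A.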

So, I think the "natural" shape distribution is resolved. Thanks to cuBerBruce and JBCM, without whom I probably wouldn't have found it, and qqwref, who specifically helped me think through the final "solution."


So all that only (yeah right, "only") leaves me to wonder why the degree-90 characteristic polynomial factors into 3 polynomials with 1, 25, and 64 roots. (Which root corresponds to which shape? Does that even make sense to ask?)


----------



## JBCM627 (Sep 5, 2008)

I'm having trouble getting it to diagonalize as well. It looks like it's just due to small decimal errors in Mathematica, though; P.Pinv isn't even giving me I, but some other matrix filled with small floating-point errors. The code I posted earlier at least works for the distribution problem Stefan posted in the WCA forum, so I don't think diagonalizing like this is a problem; either something in the initialization is causing it to calculate approximately instead of exactly, or Mathematica just doesn't want to do it (I'm on a cruddy computer right now... it took a few minutes just to open Mathematica :/). I wish the other lab were open now, but I give up and will accept approximations till I can get in there.

The probability distribution is quite interesting. I was able to generate that polynomial (well, more or less... it's giving me an imaginary part right now, too, and won't let me Re[] it). It therefore also has trouble factoring. As to why it ends up as 3 polynomials like that, I have no idea. It is certainly possible to match the shapes up with eigenvalues/roots, but as to whether certain shapes fall into a certain polynomial, I don't know if a pattern will emerge.

Edit:
Cool visual implementation I found (not necessarily what we are looking for, though); but something like this could be done for probabilities of shape nodes: http://demonstrations.wolfram.com/TheEigenvectorsOfARandomGraph/
Also cool (of course!): http://demonstrations.wolfram.com/RubiksCube/

And again:
Got it working in 6; it was a stupid bug... Rotate[] is already reserved (defined for some graphics thing) and I hadn't replaced all instances of it.
The package to import is just <<'Combinatorica'... I'm not sure if the correction thing replaced something else too, though. Anyhow, it looks like it's working.

Yet again:
There we go. I like my computer  Duh, N[]. I'll let my computer run this for a while now...

Again:
After about an hour, I'm pausing this thing. I really wish there were some way to monitor the evaluation's progress, but it appears not. I almost wonder if I could do this by hand faster, since I already know the eigenvalues. Will let it keep running when I go to bed.

And finally:
After what looks like 11 hours, I'm killing this. I don't know why it's taking so long... all I did was Eigenvectors[adj].


----------



## smskill12 (Sep 6, 2008)

Lucas, you should start speedcubing with the Square-1 again
unless you have been


----------



## JBCM627 (Nov 18, 2008)

So, we finally covered Markov chains in class, after which I hit my head and went back to Lucas's notebook.

Finding exact values is a bit easier than I initially thought - you only need to consider the eigenvalue 1 (which is now obvious to me, since 1 is the only eigenvalue that remains nonzero in the limit as you raise the eigenvalues to an arbitrarily high power). This also serves to further confirm that the approximate numbers are correct:

```
p = NullSpace[IdentityMatrix[170] - Transpose[adj]]
```
This gives you relative values. So to find the exact probabilities, you will need to multiply by 1/S, where S is the sum of the elements. Not being too familiar with Mathematica, I did this in a sort of roundabout way:

```
f = 1/Tr[p*IdentityMatrix[170]]
```

So, your probabilities are simply f*p.
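The same null-space idea works with exact rationals in any language. Here's a minimal Python sketch (on a toy 2-state chain, not the 170x170 matrix, and not JBCM's actual code): solve pi.P = pi together with sum(pi) = 1 by Gauss-Jordan elimination over `Fraction`s.

```python
from fractions import Fraction

def stationary_exact(P):
    """Exact stationary distribution of a row-stochastic matrix P:
    solve pi.P = pi with sum(pi) = 1, using rational arithmetic."""
    n = len(P)
    # Equations: sum_i pi_i * (I - P)[i][j] = 0, one per column j...
    A = [[(Fraction(1) if i == j else Fraction(0)) - Fraction(P[i][j])
          for i in range(n)] for j in range(n)]
    b = [Fraction(0)] * n
    # ...which sum to zero, so one is redundant; replace it by sum(pi) = 1.
    A[-1] = [Fraction(1)] * n
    b[-1] = Fraction(1)
    # Gauss-Jordan elimination; everything stays an exact Fraction.
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        b[col] /= pv
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return b

# Toy 2-state chain; the exact answer is [3/7, 4/7].
P = [[Fraction(1, 3), Fraction(2, 3)],
     [Fraction(1, 2), Fraction(1, 2)]]
print(stationary_exact(P))
```

This is exactly NullSpace-plus-normalization, just spelled out by hand.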


```
{4/1839, 4/1839, 4/1839, 4/1839, 5/3678, 6/613, 6/613, 6/613, 8/613, \
8/613, 8/613, 6/613, 6/613, 6/613, 2/613, 4/613, 4/613, 4/613, \
16/1839, 16/1839, 16/1839, 4/613, 4/613, 4/613, 4/1839, 2/613, 2/613, \
2/613, 8/1839, 8/1839, 8/1839, 2/613, 2/613, 2/613, 2/1839, 6/613, \
4/613, 4/613, 4/613, 6/613, 6/613, 4/613, 4/613, 6/613, 2/613, 4/613, \
8/1839, 8/1839, 8/1839, 4/613, 4/613, 8/1839, 8/1839, 4/613, 4/1839, \
4/613, 8/1839, 8/1839, 8/1839, 4/613, 4/613, 8/1839, 8/1839, 4/613, \
4/1839, 4/613, 8/1839, 8/1839, 8/1839, 4/613, 4/613, 8/1839, 8/1839, \
4/613, 4/1839, 6/613, 4/613, 4/613, 4/613, 6/613, 6/613, 4/613, \
4/613, 6/613, 2/613, 6/613, 4/613, 4/613, 4/613, 6/613, 6/613, 4/613, \
4/613, 6/613, 2/613, 4/613, 8/1839, 8/1839, 8/1839, 4/613, 4/613, \
8/1839, 8/1839, 4/613, 4/1839, 4/613, 8/1839, 8/1839, 8/1839, 4/613, \
4/613, 8/1839, 8/1839, 4/613, 4/1839, 6/613, 4/613, 4/613, 4/613, \
6/613, 6/613, 4/613, 4/613, 6/613, 2/613, 2/613, 4/1839, 4/1839, \
4/1839, 2/613, 2/613, 4/1839, 4/1839, 2/613, 2/1839, 6/613, 4/613, \
2/613, 6/613, 4/613, 2/613, 6/613, 4/613, 2/613, 8/613, 16/1839, \
8/1839, 8/613, 16/1839, 8/1839, 8/613, 16/1839, 8/1839, 6/613, 4/613, \
2/613, 6/613, 4/613, 2/613, 6/613, 4/613, 2/613, 2/613, 4/1839, \
2/1839, 4/1839, 4/1839, 4/1839, 4/1839, 5/3678}
```


----------



## Stefan (Sep 20, 2010)

Digging this out for clarification requests...



Lucas Garron said:


> I think the "best" way to do it is to improve our current idea of scrambling: Do random moves for a really long while (rather, do something that produces equivalent scrambles).



Do I understand this thread correctly, that this is achieved by randomly picking a position from all positions with equal probability? So could one, for example, randomly permute the 16 pieces and then put them in that order clockwise from the top front gap, then clockwise from the bottom front gap (just making sure it's actually two halves, and starting over if not)? Would that represent the many-random-moves idea?



Lucas Garron said:


> So, I took the following definition: In any state, the top position can be turned a certain number of ways to realign a twistable crack, and you rotate to a random one of each. Same for the bottom. Then you do a /, and continue.



Do the results change if we use a more 3x3x3-like definition like this?

_Random sequence of moves of U, R and D without trivial cancellations._

In other words, don't treat U and D differently from R in the definition. Does this end up being the same definition as yours or having the same result as yours? And if not, which definition is "better"?



WCA/Michael said:


> WCA scrambler



How does the current WCA scrambler work? Looks like random moves, but I don't understand the code and how it chooses the turns (one of our two definitions?).

Also: The WCA-scrambled state is always /-turnable. But should it be? I find that slightly unnatural.



http://www.jaapsch.net/puzzles/square1.htm said:


> the number of shapes if we consider rotations different (e.g. a square counts as 3 because it has three possible orientations)


Why 3? Shouldn't that be 2?


----------



## Mike Hughey (Sep 20, 2010)

Ugh. I should have searched for a thread like this when I was working on my square-1 BLD method. I was thinking it would be nice to have probabilities of shapes so I could learn them in order of likelihood of getting them, so I could be more likely to get the ones I had already memorized. But I guess I'm glad now that I didn't, since otherwise I might not have been as motivated to learn them all. I guess I didn't remember this thread because it wasn't something that interested me that much back then.



StefanPochmann said:


> http://www.jaapsch.net/puzzles/square1.htm said:
> 
> 
> > the number of shapes if we consider rotations different (e.g. a square counts as 3 because it has three possible orientations)
> ...


Why not 4? For my memorization purposes, a square has 4 possible orientations. It makes getting to square consistently a little tricky because I need to make sure I always orient it the same from the previous position. (I solved it by always using minimal movement - zero or 30 degrees.) Maybe I don't understand what "orientation" means in this context.

(And by the way, I hate the lack of ability to automatically get quote trees with the new site for cases like this. Stefan's "Why 3? Shouldn't that be 2?" quote is completely meaningless without the preceding quote.)


----------



## qqwref (Sep 20, 2010)

StefanPochmann said:


> Do I understand this thread correctly, that this is achieved by randomly picking a position from all with equal probability? So could one for example randomly permute the 16 pieces and then put them in that order clockwise from the top front gap, then clockwise from the bottom front gap (just making sure it's actually two halves and starting over if not)? Would that represent the many-random-moves idea?


I would like to do this, but I don't know a reasonably efficient solving algorithm, except for Jaap's program which uses many MB of tables and doesn't have any graphical interface. Is there one? Could someone make one? Even a program which uses many MB of tables could work fine if it could be prettied up in CubeExplorer fashion, because we already use one external program to make scrambles and it wouldn't really be too much to ask to have another.



StefanPochmann said:


> Do the results change if we [...] don't treat U and D differently from R in the definition[?] Does this end up being the same definition as yours or having the same result as yours? And if not, which definition is "better"?


I don't know about results, but there is a fundamental difference between U/D and R in the Square-1 (as it is a dihedral puzzle, not a symmetrical one where all layers work equally) so it would be inappropriate to treat the layers equally when scrambling. As for "better", well, see above - we should be trying to move towards a random-state scrambler.



StefanPochmann said:


> Also: The WCA-scrambled state is always /-turnable. But should it be? I find that slightly unnatural.


I don't find this unnatural. You wouldn't provide a 3x3 scramble with the top layer turned 45 degrees and say that it's fine since two layers can be turned. Even though the Square-1 is bandaged, I think the same reasoning makes sense: we can always use positions that allow all layers to be turned, so we should, because other positions are between the kind of real states that we would be able to see during a non-canceling move sequence.


----------



## Stefan (Sep 20, 2010)

qqwref said:


> I would like to do this, but I don't know a reasonably efficient solving algorithm, except for Jaap's program which uses many MB of tables and doesn't have any graphical interface. Is there one?



There's one here now:
http://83.169.19.211/sq1/
http://www.speedcubers.de/forum/showthread.php?tid=4901

Don't know how it works and how much resources it takes, though.



qqwref said:


> but there is a fundamental difference between U/D and R in the Square-1 (as it is a dihedral puzzle, not a symmetrical one where all layers work equally) so *it would be inappropriate to treat the layers equally when scrambling*.



Why? For example, what is wrong with my above definition which does treat them equally?



StefanPochmann said:


> Does this end up being the same definition as yours or having the same result as yours?


 
I now think it does differ, at least compared to the WCA scrambler. I just tested the WCA scrambler a bit, did this three times:
- Generate 10 scrambles of 999 moves to get a lot of data
- Remove (0,0) from the ends cause I don't want to count that
- Count (0,y), (x,0) and (x,y)

The results:
First attempt: 1211, 1207, 4138
Second attempt: 1184, 1159, 4113
Third attempt: 1212, 1166, 4124

So (U,D) turns with U or D being 0 consistently occurred significantly more than half the time.

Now consider my scramble definition: random non-canceling sequence in <U,R,D>. It can still be written in the current WCA notation, of course. Imagine we just had an R turn. Next must be a U or D turn, let's say it's D. Then next must be a U or R turn. I see two options to weigh them:

- Don't weigh them, they have equal probability. Half the time it's the R and we get a (0,D) or (U,0) turn. So the (U,D) turns with U or D being 0 should occur half the time.
- Weigh them according to how many possibilities each layer has. R only has one, U probably has several. So the (U,D) turns with U or D being 0 should occur significantly *less* than half the time.

Since the WCA scrambler appears to have them significantly *more* than half the time, not half the time or significantly *less* than half the time, we seem to differ.



qqwref said:


> As for "better", well, see above - we should be trying to move towards a random-state scrambler.


 
Absolutely. But my question is whether my scrambling-by-turning definition results in a different probability distribution, affecting how the random-state scrambler picks the random state.



qqwref said:


> I don't find this unnatural. You wouldn't provide a 3x3 scramble with the top layer turned 45 degrees and say that it's fine since two layers can be turned.



Imagine someone who's not already biased by knowing how to use the puzzle. In other words, a non-cuber. I'm quite sure they align the 3x3x3 properly, but not necessarily the square-1! They might very well end up with a U turn only resulting in a UF gap but not a UB gap.


----------



## qqwref (Sep 20, 2010)

StefanPochmann said:


> There's one here now:
> http://83.169.19.211/sq1/
> http://www.speedcubers.de/forum/showthread.php?tid=4901
> 
> Don't know how it works and how much resources it takes, though.


Neat. But it's PHP-based, so there's no way to check whether it works (or how well it works) without hacking. Maybe he has some random-state code feeding into Jaap's solver, and it's all run on a server somewhere.



StefanPochmann said:


> I now think it does differ, at least compared to the WCA scrambler. I just tested the WCA scrambler a bit, did this three times:
> - Generate 10 scrambles of 999 moves to get a lot of data
> - Remove (0,0) from the ends cause I don't want to count that
> - Count (0,y), (x,0) and (x,y)
> ...


I see, the third number is the first two PLUS the (x,y) with nonzero x and y.

I looked at the code, and it uses Jaap's scrambler, which I do not think is the best one. It has some weird biases, and its functioning is pretty much opaque. Of course, it is pretty much impossible to get any new scrambler past the WCA, so that will have to do. The scrambler in qqtimer is better and functions as follows:
- Choose a random move of the form (x,y), with x and y chosen completely randomly. If x=y=0 (unless it's the first move) or the move is impossible, try again. If we don't have enough remaining moves to do this, try again.
- After performing it, if we have at least one remaining move, do a /.
- Repeat.
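For concreteness, the loop described above might be sketched like this in Python. Note that `layers_align` is a placeholder for the real shape test (whether turning the top by x and the bottom by y leaves a / possible); that check, and the state tracking behind it, are not reproduced here, so this is only the rejection-sampling skeleton, not qqtimer's actual code:

```python
import random

def qqtimer_style_scramble(n_moves, layers_align=lambda state, x, y: True):
    """Sketch of the described loop. layers_align is a stub standing in for
    the real Square-1 shape check; a real implementation tracks the shape
    and rejects blocked turns."""
    scramble = []
    state = None  # placeholder; the actual scrambler updates the shape here
    for i in range(n_moves):
        while True:
            # Turn amounts written -5..6, following common Square-1 notation.
            x, y = random.randrange(-5, 7), random.randrange(-5, 7)
            if (x, y) == (0, 0) and i > 0:
                continue  # (0,0) is allowed only as the very first move
            if layers_align(state, x, y):
                break
        scramble.append((x, y))
        if i < n_moves - 1:
            scramble.append('/')  # a slash between consecutive (x,y) turns
    return scramble

print(qqtimer_style_scramble(4))
```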



StefanPochmann said:


> Now consider my scramble definition: random non-canceling sequence in <U,R,D>. It can still be written in the current WCA notation, of course. Imagine we just had an R turn. Next must be a U or D turn, let's say it's D. Then next must be a U or R turn. I see two options to weigh them:
> 
> - Don't weigh them, they have equal probability. Half the time it's the R and we get a (0,D) or (U,0) turn. So the (U,D) turns with U or D being 0 should occur half the time.
> - Weigh them according to how many possibilities each layer has. R only has one, U probably has several. So the (U,D) turns with U or D being 0 should occur significantly *less* than half the time.
> ...


We differ because I initially thought the official scrambler used the good scrambling algorithm, which it doesn't. I also assumed that when you said the U/D/R layers would be treated equally you were essentially forcing the first weighing. I think the second weighing is a good way to go, if we must generate the moves randomly.



StefanPochmann said:


> Imagine someone who's not already biased by knowing how to use the puzzle. In other words, a non-cuber. I'm quite sure they align the 3x3x3 properly, but not necessarily the square-1! They might very well end up with a U turn only resulting in a UF gap but not a UB gap.


I don't think we should ever go by what people who don't "know[...] how to use the puzzle" think. If you think we should, why not just disassemble the puzzles and put them back together? Or, even worse, why not take off the stickers and reapply them for our scrambles? I prefer to go by the logical route of mixing up a puzzle as close to uniformly randomly as possible, without getting in the way of its proper functioning. Having a Square-1 in a position where / is impossible does not seem any more reasonable than having it halfway through a / move, or having a 3x3 off by 45 degrees, or having a 5x5 halfway through a V-lockup.


----------



## Stefan (Sep 20, 2010)

qqwref said:


> But it's PHP-based, so there's no way to check whether it works (or how well it works) without *hacking*.


Or asking (I just did)



qqwref said:


> I see, the third number is the first two PLUS the (x,y) with nonzero x and y.


That's another way to look at it, I guess. I never said x and y are nonzero, so with that I just meant *all* U+D turns, and the first two numbers were those where only one layer got turned. I should've written that in words to make it clearer, sorry.



qqwref said:


> Jaap's scrambler [...] has some weird biases and the functioning is pretty much opaque.


Yeah, some documentation and descriptive variable names would be good. Biases... well, my two definitions (weighing or not weighing the turns) are also biased. And yours might still differ, I'm not quite sure. Question still is whether in the long run, they all end up with the same distribution, and if not, which is "best".



qqwref said:


> I don't think we should ever go by what people who don't "know[...] how to use the puzzle" think.


But they truly are the most unbiased 



qqwref said:


> Square-1 in a position [...] halfway through a / move


Ooh, YEAH, my favorite shape. Let's go for this! Not sure whether I'm kidding, to be honest. Though I guess I'll just have to wait until some inventor mixes a 2-layer Square-1 with a 2x2x2... (or does that exist already?).


----------



## qqwref (Sep 21, 2010)

StefanPochmann said:


> Or asking (I just did)


Saw your post on the forum. Unfortunately I don't speak German.



StefanPochmann said:


> That's another way to look at it, I guess. I never said x and y are nonzero, so with that I just meant *all* U+D turns, and the first two numbers were those where only one layer got turned. I should've written that in words to make it clearer, sorry.


Sure. But the natural interpretation to me of "most turns have at least one zero" is "there are more turns with at least one zero than with no zeroes", so I was initially very confused when I saw that the sum of the two numbers was less than the third.



StefanPochmann said:


> Yeah, some documentation and descriptive variable names would be good. Biases... well, my two definitions (weighing or not weighing the turns) are also biased. And yours might still differ, I'm not quite sure. Question still is whether in the long run, they all end up with the same distribution, and if not, which is "best".


I mean biased as in the code has some weird weightings, which are basically just mystery numbers. I don't remember what the code does now, but I figured it out once, and in the part where it decides what turn to do next I basically said to myself "why would anyone set it up like this?". Oh well. In the long run they might not end up random, but who knows about 40 moves. That's not very much. Run a million scrambles with Jaap's scrambler and my idea (take the code from qqtimer), and yours too if you feel like coding it, and see how closely they match the distribution Lucas gave.




StefanPochmann said:


> Though I guess I'll just have to wait until some inventor mixes a 2-layer Square-1 with a 2x2x2... (or does that exist already?).


 http://twistypuzzles.com/forum/viewtopic.php?f=15&t=14733


----------



## Stefan (Sep 21, 2010)

qqwref said:


> Run a million scrambles with Jaap's scrambler and my idea (take the code from qqtimer), and yours too if you feel like coding it, and see how closely they match the distribution Lucas gave.



Maybe. I must admit I don't care all that much about Square-1 yet and this sounds like more work...



qqwref said:


> http://twistypuzzles.com/forum/viewtopic.php?f=15&t=14733


 
omg WANT!!! I should really at least keep an eye on their "new puzzles" section...


----------



## Mike Hughey (Oct 13, 2010)

For what it's worth, I incorporated the probabilities of the various cubeshapes into my square-1 BLD page. Since I really just took the chart from Christian Eggermont (with his permission), it actually makes a somewhat reasonable chart for learning to get to square optimally. Well, except for the fact that I mostly just used his paths to square, which are very old and therefore pretty bad in some cases (optimal, but very much not speed-friendly). I'm realizing now that maybe I should have looked around for better ones before I did this; the algorithm for 7 moves to square is particularly bad.

You can sort that list by letter code, distance to square, or probability, in case anyone finds that useful.


----------



## Lucas Garron (Oct 16, 2010)

Stefan said:


> Do the results change if we use a more 3x3x3-like definition like this?


No.


----------



## Lucas Garron (Oct 16, 2010)

For clarification: I've held back here until I could rewrite my code to check the distribution of all 3678 shapes with one-layer turns. You'll have to trust me for now because the code is ugly.
If anyone wants to double-check, please go ahead.

I believe the same argument proves this (intuitively), but I didn't want to say that until I verified it with calculation.

Now that we know the definitions are equivalent, I think the proper distribution we should use is pretty clear.


----------



## Lucas Garron (Oct 16, 2010)

Stefan said:


> Do I understand this thread correctly, that this is achieved by randomly picking a position from all with equal probability?


Yes, if you use the right definition of "position."
If you use "one of the 3678" (e.g. AUF matters, but we only allow /-twistable shapes) then it turns out to be the same as the limit of doing a lot of random moves (for either the one-slice or the two-slice definition; can we call them "Stefan-style" and "Lucas-style?" ). I consider it by far the most natural definition from a math perspective.



Stefan said:


> So could one for example randomly permute the 16 pieces and then put them in that order clockwise from the top front gap, then clockwise from the bottom front gap (just making sure it's actually two halves and starting over if not)? Would that represent the many-random-moves idea?


Yes, but *the "starting over" is important*. Backtracking changes the distribution.

I looked into this, and there seem to be a few ways to place the pieces randomly.

*1) Select random permutations until the gaps are okay*
This will occur with probability 3678/12870, so on average it will take about 3.5 tries to find a valid permutation.

*2) Repeatedly: Choose with probability 1/2 from edge vs. corner (if both are possible, else 1 with whichever is). Then randomly choose a remaining piece of that type*
Bad.

*3) Keep picking randomly among all remaining pieces, selecting an edge if you only have 1 space left until a gap*
Bad. 


I mention this because I can imagine an amateur programmer doing 2) or 3) without thinking about it. 2) forces everything to align to negative powers of 2, and 3) isn't any better. It's a cool realization that the greedy algorithm works, but too much effort to make it equal-distribution.
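Method 1 can be sketched directly, and the 3678/12870 figure can be verified by brute force: of the C(16,8) = 12870 ways to arrange 8 edges (width 1) and 8 corners (width 2) around the 24 half-hour marks, exactly those where the cut points at cumulative widths 6, 12 and 18 fall between pieces are /-able. A hedged Python sketch (the piece labels are arbitrary, not official notation):

```python
import random
from itertools import combinations

def gaps_ok(widths):
    """Valid iff the cut points (cumulative widths 6, 12, 18) fall between pieces."""
    cum, sums = 0, set()
    for w in widths:
        cum += w
        sums.add(cum)
    return {6, 12, 18} <= sums

# Brute-force the acceptance rate: choose which 8 of the 16 positions are edges.
total = ok = 0
for edge_slots in combinations(range(16), 8):
    es = set(edge_slots)
    total += 1
    ok += gaps_ok([1 if i in es else 2 for i in range(16)])
# Expect total == 12870 and ok == 3678, matching the probability above.

# Method 1 itself: shuffle the 16 labeled pieces until the gaps are okay.
PIECES = list("12345678ABCDEFGH")  # 8 edges (digits), 8 corners (letters)
WIDTH = {p: 1 if p.isdigit() else 2 for p in PIECES}

def random_arrangement(rng=random):
    while True:  # about 3.5 shuffles on average
        order = PIECES[:]
        rng.shuffle(order)
        if gaps_ok([WIDTH[p] for p in order]):
            return order
```

Because every shuffle is uniform over all 16! orders and rejection doesn't favor any particular valid order, each valid arrangement comes out equally likely.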

The best is to select a truly random state of the 3678. If you're worried that the program might get unlucky and try forever (practically impossible with a decent PRNG), it is possible to do this deterministically: Select a shape using the proper distribution, then randomly assign the pieces into the shape.
(Worried about the middle layer? Just flip the right side with probability 1/2.)

There is actually a lot of math going on behind the scenes, but it works out pretty simply.


----------



## Lucas Garron (Oct 16, 2010)

Sorry for the quadruple post, but I want to keep my posts separate to delimit separate points.

Here's an image of how bad the current scrambler is:

(image: chart comparing the current scrambler's shape probabilities with the MRSS distribution)
The graph is sorted by MRSS probability. Yellow means the current scrambler is that much more likely than it should be, red means less likely.
For example, I compute that you're about 1.5 times as likely to get one of the 10 star shapes from the current scrambler as you should be (not experimentally confirmed).

I made a few slight assumptions, but the general nature of the result should be correct.


EDIT: The original post said "3 times as likely" instead of "1.5 times as likely." I had meant to re-do the computation since it didn't agree with the chart, but accidentally just quoted the quick computation. Whatever the actual statistics are, we still should go Markov-Random-State.

EDIT 2: An interesting statistic comes from asking how often scrambles from one distribution would need to be changed to match the correct distribution. I compute about 25.4% (\( 1 - \sum_{s\in \text{shapes}} \min(\text{randommoves}_s, \text{mrss}_s) \)).

EDIT 3: I'm not even taking into account piece permutations. I suspect the issue is even worse there.


----------



## Lucas Garron (Oct 18, 2010)

First one ever, I believe: sq1_mrss.zip


```
$ wget http://archive.garron.us/code/2010/sq1_mrss.zip
$ unzip sq1_mrss.zip
$ g++ -O5 -o sq1_mrss *.cpp
$ ./sq1_mrss
Generating 12 Square-1 Markov-Random-State Scrambles...
1) 7FC8AG36B4E2H1D5/    ( 1, 6)/( 5,-4)/( 1,-2)/( 3,-1)/(-3, 0)/( 5,-3)/(-5,-2)/( 0,-4)/( 0,-3)/(-1, 0)                           [9|25]  (511ms)
2) BAEHD25173G846FC-    ( 4, 0)/( 6,-3)/(-4,-1)/(-5,-3)/( 3, 0)/( 2,-5)/( 6,-2)/(-3, 0)/( 2,-4)/( 6,-4)/                          [10|27]  (358ms)
3) 5EH7FDAC63G81B24-    ( 1, 3)/(-1, 5)/( 4,-5)/( 0,-3)/(-4, 0)/( 3, 0)/( 6,-3)/( 3,-5)/( 0,-2)/( 0,-4)/( 0,-2)                   [10|26]  (695ms)
4) 4E2G1B3C7D658AHF-    ( 6, 5)/(-5,-2)/( 2,-1)/( 0,-5)/( 0,-3)/( 3,-1)/(-5,-2)/( 0,-2)/( 2,-1)/( 3,-4)/(-5,-2)                   [10|29]  (559ms)
5) HDE364G28CA1FB57/    ( 0, 5)/(-3, 0)/(-3, 0)/( 6,-3)/( 3,-5)/(-3,-3)/( 3, 0)/( 1,-5)/( 0,-2)/( 2, 0)/(-1, 0)/( 6, 0)           [11|27]  (1263ms)
6) 785H6GFBE13CD42A-    ( 4, 6)/( 5,-4)/( 3,-3)/( 4,-2)/( 3,-4)/(-3, 0)/(-3, 0)/( 2, 0)/( 2,-5)/( 2,-1)/( 6,-2)/(-4, 0)/          [12|32]  (2363ms)
7) 5CF3ADE78HBG4621/    ( 4, 3)/(-4,-1)/(-5,-2)/(-1,-3)/(-3, 0)/( 3,-1)/(-2,-5)/(-3, 0)/( 2, 0)/( 3,-4)/( 0,-4)/( 0,-2)           [11|30]  (926ms)
8) GAH7E3618F5B24DC-    (-2, 3)/(-1, 5)/( 3, 0)/( 3,-3)/(-5, 0)/( 6,-3)/( 1,-3)/( 5,-1)/( 1,-3)/( 6, 0)/( 4,-5)                   [10|29]  (352ms)
9) D3H172FCEGB864A5-    ( 3,-4)/( 1, 4)/(-3,-3)/( 2,-4)/( 6,-3)/( 1,-3)/( 3, 0)/( 0,-3)/( 6,-3)/(-5,-4)/( 6,-4)                   [10|30]  (118ms)
10) 57E32FGH64D81CAB/    ( 3,-4)/(-5, 4)/( 0,-3)/( 0,-4)/(-3,-3)/( 3, 0)/(-4,-4)/( 2,-4)/(-2,-4)/(-4,-4)                           [9|26]  (1196ms)
11) 5GD68E124HFB73CA-    ( 6, 2)/( 1, 4)/(-1,-4)/(-3,-3)/( 3,-2)/(-3,-3)/( 3, 0)/( 1,-1)/( 0,-2)/( 0,-4)/(-1, 0)                   [10|28]  (497ms)
12) H6728CEF41BG5A3D-    ( 1, 3)/(-1, 5)/(-5,-2)/( 2, 0)/( 3, 0)/( 1,-4)/( 0,-4)/(-4,-5)/( 2, 0)/(-2, 0)/( 6, 0)                   [10|26]  (514ms)
```

EDIT: Also, live scripting demo.

EDIT 2: For the lazy Unix-lovers who trust me: *wget http://archive.garron.us/code/2010/sq1_mrss.zip ; unzip sq1_mrss.zip; g++ -O5 -o sq1_mrss *.cpp; ./sq1_mrss*


----------



## qqwref (Oct 18, 2010)

Lucas Garron said:


> *1) Select random permutations until the gaps are okay*
> This will occur with probability 3678/12870, so on average it will take about 3.5 tries to find a valid permutation.



Not for you, Lucas, but for anyone else reading: the general form of this is a really great trick to know if you want to ensure a uniform choice but don't have a native "randomly choose an element from this set" function (or if you don't want to have to actually generate that set, which is the other way to do it). Basically, if you want to uniformly randomly select from the set of elements of a bigger set A which satisfy some property p, a simple way is to repeatedly select random elements from A until you find one which satisfies p. The only real downside is that it often takes a few tries to get one of the right elements (and may in theory take infinitely many if you are infinitely unlucky): on average, one over the probability that an element of A satisfies p. Fortunately, generating random numbers is generally pretty fast.

This trick is useful for a lot of calculations involving randomness that, on first glance, seem quite tricky to program. You can use it to select valid positions on a Square-1 or similar bandaged dihedral puzzle; you can use it to select random numbers from 1 through N if all you can generate is random bits (e.g. with a PRNG like the Mersenne Twister); you can use it to choose random cube positions with some weird criterion, like having no 1x1x2 blocks.
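One of those examples, drawing a uniform number in [0, n) from raw random bits, works like this (a minimal sketch; `getrandbits` stands in for whatever bit source you have, e.g. a Mersenne Twister):

```python
import random

def randint_below(n, getrandbits=random.getrandbits):
    """Uniform integer in [0, n), using only uniform random bits."""
    k = n.bit_length()      # smallest k with 2**k >= n
    while True:
        x = getrandbits(k)  # uniform on [0, 2**k)
        if x < n:
            return x        # accept; otherwise retry (expected < 2 tries)
```

Each accepted value is uniform because every residue below n survives the rejection with the same probability, and since 2^k < 2n, the expected number of tries is below 2.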


----------



## cuBerBruce (Oct 19, 2010)

I somewhat dislike the use of the "starting over if an invalid scramble state is generated" technique. For instance, if from a given state of the PRNG, it takes 10 tries to get a valid scramble, then this implies there are at least 10 different initial states of the PRNG that will end up returning the same scramble position. This may mean that this particular scramble position may have a higher probability than others of being the first scramble position generated. In turn, this could mean a higher than expected probability of two different competitions having the same scrambles.


----------



## Lucas Garron (Oct 19, 2010)

cuBerBruce said:


> I somewhat dislike the use of the "starting over if an invalid scramble state is generated" technique. For instance, if from a given state of the PRNG, it takes 10 tries to get a valid scramble, then this implies there are at least 10 different initial states of the PRNG that will end up returning the same scramble position. This may mean that this particular scramble position may have a higher probability than others of being the first scramble position generated. In turn, this could mean a higher than expected probability of two different competitions having the same scrambles.



1) In my implementation here, I use a Mersenne Twister library (seeded by the current time by default), which should stay perfectly random even when 10 or more values are thrown away. I'm not sure it's even a big problem for smaller PRNGs.
2) The expected number of attempts is only 3.5, in exchange for much simpler code. As an alternative, I also described a deterministic algorithm that always takes 1 try (but requires longer initialization or precomputed probability tables).


----------



## qqwref (Oct 20, 2010)

cuBerBruce said:


> I somewhat dislike the use of the "starting over if an invalid scramble state is generated" technique. For instance, if from a given state of the PRNG, it takes 10 tries to get a valid scramble, then this implies there are at least 10 different initial states of the PRNG that will end up returning the same scramble position.


Mm, yes. But we should already be using a PRNG big enough that there are many instances of each scramble in the period, so that the number of states "before" each individual instance of the scramble-number ends up getting lost in the huge number of possibilities.

Mersenne Twister, for instance, has a period of about 2^19937 ~= 4 * 10^6001. Each element in the period is a 32-bit number; we can generate a potential Square-1 situation with two of these numbers, giving us 2 * 10^6001 possible situations. Dividing by 3.5 and then by the number of Square-1 positions gives that each position occurs roughly 1.3 * 10^5989 times. I think this is enough that the distance between each one and the next is irrelevant.


PS: If you're generating, say, a random permutation of 16 elements, you know that there are 16! possibilities. Now, 16! * 107 is about 2^(50.9916), i.e. it takes up just under 51 bits. This means it covers 99.42% or so of the 51-bit values. So one way to generate random piece permutations without wasting too many bits is:
- Generate two 32-bit numbers and drop all but 52 bits (51 + one for the middle slice).
- Check to see that the first 51 bits form a binary number strictly less than 16! * 107. About 0.58% of the time, this won't happen, so start again. If it works, take that number mod 16! and then convert the result (an integer from 0 to 16!-1) into a random permutation on the 16 pieces. You can now check if that is a valid permutation of the Square-1.

EDIT: And if you are really insane, you can use 16! * 440828, which spans the 63 bits (the 64th is for the middle slice) 99.9997786% of the way, so you'll only have to discard about one number in every 451,769 attempts.
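The "convert the result into a random permutation" step can be done with the factorial number system (a Lehmer-code style decoding); here's a short sketch of one standard way:

```python
from math import factorial

def index_to_perm(idx, items):
    """Bijectively map idx in [0, len(items)!) to an ordering of items."""
    pool = list(items)
    out = []
    for i in range(len(pool), 0, -1):
        # Peel off the next factorial-base digit; it lies in [0, i).
        digit, idx = divmod(idx, factorial(i - 1))
        out.append(pool.pop(digit))
    return out
```

Since the map is a bijection from [0, 16!) onto the permutations, feeding it a uniform index gives a uniform permutation.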


----------



## Stefan (Nov 17, 2011)

Lucas Garron said:


> *1) Select random permutations until the gaps are okay*
> This will occur with probability 3678/12870, so on average it will take about 3.5 tries to find a valid permutation.
> 
> *3) Keep picking randomly among all remaining pieces, selecting an edge if you only have 1 space left until a gap*
> Bad.



I still can't really pinpoint "why" 3) is bad, but I finally managed to convince myself *that* it is bad (that its result does differ from that of 1).

Imagine we're only doing one layer and have edges a/b/c/d and corners E/F/G/H. One possible order is abcdEFGH. That would be chosen by first picking 'a' with probability 1/8, then 'b' with probability 1/7, and so on until 'G' with 1/2 and 'H' with 1/1. The probability of choosing that order is thus 1/8!. However, because we want a gap in the middle, fewer than all 8! orders are acceptable. And since we want the acceptable ones to occur with equal probability, their probability should be higher than 1/8!.

I hope that's right, and I'd still like a better understanding of "why" it goes wrong and how to do it right...
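This one-layer model is small enough to check exhaustively. The sketch below enumerates every order of edges a/b/c/d (width 1) and corners E/F/G/H (width 2) that leaves a gap at cumulative width 6, and computes the exact probability method 3 assigns to each order; forcing an edge whenever the cumulative width reaches 5 is my assumption of how the greedy rule resolves:

```python
from itertools import permutations
from fractions import Fraction

EDGES, CORNERS = "abcd", "EFGH"  # edges have width 1, corners width 2
WIDTH = dict.fromkeys(EDGES, 1)
WIDTH.update(dict.fromkeys(CORNERS, 2))

def has_gap(order):
    """Acceptable iff some prefix of widths sums to exactly 6 (the middle gap)."""
    cum = 0
    for p in order:
        cum += WIDTH[p]
        if cum == 6:
            return True
        if cum > 6:
            return False

VALID = [o for o in permutations(EDGES + CORNERS) if has_gap(o)]
# Method 1 (restart on failure) gives every valid order probability 1/len(VALID).

def greedy_probs():
    """Method 3: pick uniformly among remaining pieces, but force an edge
    when only one unit of space is left before the gap (cumulative width 5)."""
    probs = {}
    def go(order, remaining, cum, p):
        if not remaining:
            probs[order] = probs.get(order, Fraction(0)) + p
            return
        choices = [x for x in remaining if cum != 5 or WIDTH[x] == 1]
        for x in choices:
            go(order + (x,), remaining - {x}, cum + WIDTH[x], p / len(choices))
    go((), frozenset(EDGES + CORNERS), 0, Fraction(1))
    return probs
```

The order abcdEFGH comes out at exactly 1/40320 = 1/8! under the greedy method, while the uniform target over the 26496 valid orders is 1/26496, so the two distributions do differ, just as the argument above concludes.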


----------



## Lucas Garron (Nov 17, 2011)

Stefan said:


> I hope that's right, and I'd still like a better understanding of "why" it goes wrong and how to do it right...



That's exactly right. A position like that one is less likely than it should be. Positions with edges before the gap are made more likely because we sometimes keep trying until we get one of them. The easiest way to fix it, of course, is to start over completely, so that all *possible* arrangements are equally likely to get picked in the end.


----------



## Dene (Nov 17, 2011)

Oh I get it now, that makes sense. Yay for dumbed down information.


----------



## Stefan (Nov 17, 2011)

Lucas Garron said:


> A position like that one is less likely than it should be.



Yeah, it's like these positions that got through creation unobstructed haven't heard the good news that other positions died along the way so there's now more cake for everybody still alive.

I kinda envision this like a lot of settlers on their way to the west coast. Initially they're all given a piece of cake (of the same size), but some die on the trip and others near them pick up their cake. In the end, at the west coast, some settlers will have more cake than others.



Lucas Garron said:


> Positions with edges before the gap are made more likely because we sometimes keep trying until we get one of them.



Nice, I really like that explanation. Although the "keep trying" isn't directly from the description, which said to just select an edge. For that perspective, I have another way to express it now: when creating a position and we so far have, let's say, abcE (three edges and a corner), nothing has interfered with the (1/8)*(1/7)*(1/6)*(1/5) probability of getting to that point. Without the gap, this probability would be spread evenly among the continuations abcEd, abcEF, abcEG and abcEH. But since the latter three aren't possible and we *are* going to continue, continuation abcEd absorbs the probability of the other three. Or in other words, it eats the cake that they just dropped.



Lucas Garron said:


> The easiest way to fix it, of course, is to start over completely, so that all *possible* arrangements are equally likely to get picked in the end.



I know, I just wanted to understand the failure better (I now do) and perhaps fix it because your _"It's a cool realization that the greedy algorithm works, but too much effort to make it equal-distribution"_ gave me *some* hope. But I wrote a small program that told me there are 26496 valid positions, which is 24*24*46, and I suspect there's no way to explain the prime factor 23 without really thinking about possible shapes, which I don't want to.


----------

